A Massive Intel Hardware Bug May Be on the Horizon

SighTurtle · Jan 2, 2018

The blog post you linked to made an update. Apparently we might not have to wait long before the embargo on this ends; January 4 is the suspected public date.

Zarathustra[H] · Jan 2, 2018

Strangely enough, it doesn't look like the great selloff has started yet:

arnemetis · Jan 2, 2018

Makaveli@BETA said:
That chip doesn't exist.

However, what they can do is send you a chip with more cores that will give you 30% more multithread performance.

I specifically went with this chip because I need the higher single threaded performance for my work, which doesn't care about multiple threads/cores. I put a Ryzen 1700x in my file server where it makes sense, but my main machine is all single thread. If Intel had a 6ghz quad core I would have gotten that instead.

Deleted member 93354 · Jan 2, 2018

naib said:
This isn't to do with swap; this is to do with how memory is mapped.
Your RAM is part of a larger concept called memory, which is divided into pages and easily exceeds the amount of RAM and addressable RAM your system has or could have.

This I realize. But I thought to go beyond the number of pages you have, you have to have virtualization on. You can still randomize address loads. I thought virtualization was only activated with the virtual swap file on. (I could be wrong about this as this delves into a layer of x86 programming that is deeper than my x86 assembly programming days)

naib · Jan 2, 2018

Zarathustra[H] said:
Strangely enough, it doesn't look like the great selloff has started yet:

View attachment 48511

Probably because techheads haven't even figured this out yet to determine the severity, let alone traders.

The fact the CEO sold a tonne of his stock when aspects of this were 1st tested stinks. The fact the open-source patches were finally made available before windows update all aligns with a co-ordinated patch sequence to mitigate this -> quite bad

Deleted member 93354 · Jan 2, 2018

naib said:
Probably because techheads haven't even figured this out yet to determine the severity, let alone traders.

The fact the CEO sold a tonne of his stock when aspects of this were 1st tested stinks. The fact the open-source patches were finally made available before windows update all aligns with a co-ordinated patch sequence to mitigate this -> quite bad

If true this is bigger than a micro-code update fix. It will be a floating point bug all over again. And Intel will give us "$30 off our next processor" as a sorry by the time the class action settles.

Zarathustra[H] · Jan 2, 2018

naib said:
The fact the open-source patches were finally made available before windows update all aligns with a co-ordinated patch sequence to mitigate this -> quite bad

Yeah, the fact that the open source community and Microsoft are coordinating (and that the authors of Linux kernel commits could be convinced to redact their comments) points to the fact that it's a pretty bad security vulnerability.

The question is, is it a bad vulnerability that is fine once patched, or are we guaranteed to see the kind of performance hit that's been talked about. Everything has vulnerabilities that need patching over time. The question is, are there other downsides?

The blog post suggests there might be, but he also claims not to be an expert. So let's wait and see.

DeathFromBelow · Jan 2, 2018

Where's juanrga? I want to watch him spin like Taz.

OutOfPhase · Jan 2, 2018

DigitalGriffin said:
This I realize. But I thought to go beyond the number of pages you have, you have to have virtualization on. You can still randomize address loads. I thought that was only activated with the virtual swap file on. (I could be wrong about this as this delves into a layer of x86 programming that is deeper than my x86 assembly programming days)

Virtualization just ensures that processes get a simplified and abstract view of memory which looks like their own clean machine. You can run 13 instances of an application, each compiled to write something at address 0x00004452. The virtual page tables will map each to its own physical page, luckily, instead of everyone truly getting that address. In the simplest view, that's what virtual memory is about.

And yes, beyond that you can do cool stuff like not actually have the page in RAM at all until used (lazy commit), or page things in and out to secondary storage.
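
The per-process mapping and lazy commit described above can be sketched with a toy model. This is only an illustration, not how any real kernel or CPU implements it: `ToyMMU`, the 4 KiB page size, and the frame-numbering scheme are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, purely for illustration


class ToyMMU:
    """Hypothetical model: one page table per process, frames handed out on demand."""

    def __init__(self):
        self.next_free_frame = 0
        self.page_tables = {}  # pid -> {virtual page number: physical frame number}

    def translate(self, pid, vaddr):
        """Map a process-local virtual address to a physical address."""
        table = self.page_tables.setdefault(pid, {})
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in table:
            # Lazy commit: a physical frame is only assigned on first touch.
            table[vpn] = self.next_free_frame
            self.next_free_frame += 1
        return table[vpn] * PAGE_SIZE + offset


mmu = ToyMMU()
# 13 instances of the same program all write to virtual address 0x00004452.
physical = [mmu.translate(pid, 0x00004452) for pid in range(13)]
```

Every instance uses the same virtual address, yet each ends up in its own physical frame, and a second access by the same process returns the same translation.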

thesmokingman · Jan 2, 2018

Yay for Intel?

DeathFromBelow said:
Where's juanrga? I want to watch him spin like Taz.

He's still figuring out how to spin this one?

Ultima99 · Jan 2, 2018

Is this how they'll make Icelake 40% faster than 8th gen i7?

BinarySynapse · Jan 2, 2018

DigitalGriffin said:
This I realize. But I thought to go beyond the number of pages you have, you have to have virtualization on. You can still randomize address loads. I thought virtualization was only activated with the virtual swap file on. (I could be wrong about this as this delves into a layer of x86 programming that is deeper than my x86 assembly programming days)

Virtual memory is on regardless of swap or page file settings. It's how applications get to put data wherever they want in the address space without stepping all over another application's data. The OS and CPU work together to translate the application's "virtual" address into a real physical address that is readable in RAM.

The page file comes in when there aren't enough physical addresses to match all of the applications' virtual addresses. The OS will just map that virtual address to a part of the page file and flag it so that when the application that owns that page accesses it, the OS knows it's not in RAM and will swap things around to make it look like it always was.
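
A rough sketch of that page-file behaviour, under stated assumptions: `ToyPager` is a hypothetical class, eviction is simple least-recently-used, and the strings stand in for page contents. Real operating systems use far more elaborate policies.

```python
from collections import OrderedDict


class ToyPager:
    """Hypothetical model of RAM backed by a page file, with LRU eviction."""

    def __init__(self, ram_frames):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()  # page -> contents, least recently used first
        self.page_file = {}       # page -> contents that were paged out
        self.faults = 0

    def access(self, page):
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as recently used
            return self.ram[page]
        # Page fault: the page is not resident in RAM.
        self.faults += 1
        # Bring it back from the page file, or create it on first touch.
        contents = self.page_file.pop(page, f"data-{page}")
        if len(self.ram) >= self.ram_frames:
            # RAM is full: write the least recently used page out.
            victim, victim_contents = self.ram.popitem(last=False)
            self.page_file[victim] = victim_contents
        self.ram[page] = contents
        return contents


pager = ToyPager(ram_frames=2)
results = [pager.access(p) for p in ("A", "B", "C", "A")]
```

The application sees the same contents for page "A" on both accesses even though, in between, the page was pushed out to the page file and pulled back in.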

GNUse_the_force · Jan 2, 2018

This must be a big one. Fairly sure I saw another very big tech forum with this story 15 mins ago and now it's been pulled.

Deleted member 93354 · Jan 2, 2018

ryan_975 said:
Virtual memory is on regardless of swap or page file settings. It's how applications get to put data wherever they want in the address space without stepping all over another application's data. The OS and CPU work together to translate the application's "virtual" address into a real physical address that is readable in RAM.

The page file comes in when there aren't enough physical addresses to match all of the applications' virtual addresses. The OS will just map that virtual address to a part of the page file and flag it so that when the application that owns that page accesses it, the OS knows it's not in RAM and will swap things around to make it look like it always was.

I remember in the old days addresses on 32 bit were segment plus offset to form 40 bits and limit you to ~3.5GB. But the page determined where in that memory that segment offset sits. I didn't know you could address outside the physical # of pages. That just seems... well, stupid.

Good info. Thanks to all.

naib · Jan 2, 2018

Zarathustra[H] said:
Yeah, the fact that the open source community and Microsoft are coordinating (and that the authors of Linux kernel commits could be convinced to redact their comments) points to the fact that it's a pretty bad security vulnerability.

The question is, is it a bad vulnerability that is fine once patched, or are we guaranteed to see the kind of performance hit that's been talked about. Everything has vulnerabilities that need patching over time. The question is, are there other downsides?

The blog post suggests there might be, but he also claims not to be an expert. So let's wait and see.

Well... the fact that the Linux patch is being backported to older kernels speaks volumes, so much so that they are porting it at the expense of virtualization hosts until a better patch exists.

A ycombinator post summarizes the type of attack: https://news.ycombinator.com/item?id=16001476

Attackers: We can inject machine code into the stack and execute it [1].

Developers: Not anymore; stack pages aren't executable [2].

Attackers: We can turn your own code against you by manipulating the stack to "return" to an arbitrary sequence of pre-existing functions with arguments of our choosing [3].

Developers: Not anymore; we randomize the address space every time we load the code, so now you don't know where those functions are. You could guess, but you're going to be wrong and just crash the system almost all of the time [4].

Attackers: For the kernel on x86/amd64 systems, we have a way of reading the page table from an exploited user process, which we can then parse to discover where you located those functions. <---- you are here

[1] https://en.wikipedia.org/wiki/Stack_buffer_overflow#Exploiting_stack_buffer_overflows
[2] https://en.wikipedia.org/wiki/NX_bit
[3] https://en.wikipedia.org/wiki/Return-oriented_programming
[4] https://en.wikipedia.org/wiki/Address_space_layout_randomization

I guess the severity will be clearer once we see whether a means to disable this is exposed to end users (Linux has a new kernel option). Equally, it depends on how fast the black hats start dissecting the Windows update to determine where the hole is in Windows.
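
To put rough numbers on the "you could guess, but you're going to be wrong" step in the quoted exchange, here is a minimal sketch. The 9 bits of entropy are an assumed figure chosen only for illustration, and `guess_success_probability` is a hypothetical helper, not part of any real tool.

```python
def guess_success_probability(entropy_bits, attempts=1, layout_leaked=False):
    """Chance that an attacker lands on the right address.

    Assumes every failed guess crashes the target and the layout is
    re-randomized, so attempts are independent of each other.
    """
    if layout_leaked:
        # If the page tables can be read, no guessing is needed.
        return 1.0
    slots = 2 ** entropy_bits
    return 1.0 - (1.0 - 1.0 / slots) ** attempts


ASSUMED_ENTROPY_BITS = 9  # illustrative assumption, not a measured value

blind = guess_success_probability(ASSUMED_ENTROPY_BITS)
leaked = guess_success_probability(ASSUMED_ENTROPY_BITS, layout_leaked=True)
print(f"blind guess: {blind:.4%}, with leaked layout: {leaked:.0%}")
```

The point is the gap between the two cases: randomization only helps as long as the layout stays secret.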

///AMG · Jan 2, 2018

M76 said:
Well it says typical workloads, until I know what workloads are those, it is possible we won't even be affected at all. But I feel the problem, as someone who runs processes that run for days.

Yea, but if it does affect me and there is an alternative I will be switching.

naib said:
This flaw appears to be associated with virtual memory, so if your application has to page out a lot or uses a lot of virtual paging you are probably going to be affected more... this is a royal PITA... I use Matlab daily and 24 GB isn't enough for what I simulate... a quad i7 slowing down to mitigate this flaw could potentially hit my productivity.

Yea, I run a lot of simulation stuff (Arena, Simio, and Matlab) and most of my runs easily take half a day, sometimes more. If the slowdown is greater than 5%, let alone close to 30%, I am going to lose a lot of productivity.

naib · Jan 2, 2018

///AMG said:
Yea, I run a lot of simulation stuff (Arena, Simio, and Matlab) and most of my runs easily take half a day, sometimes more. If the slowdown is greater than 5%, let alone close to 30%, I am going to lose a lot of productivity.

I'm tempted to benchmark a model I have right now (approx. 4 hours for one operating point), then cross-reference the KB this fix is associated with and run it again after...

If this really is a 5-30% regression (which should be believed), the financial markets could be in for a shock... Windows 7 might hang on for a lot longer as people hold off on this patch UNTIL an exploit is in the wild.
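
For a sense of scale, the 4-hour run mentioned above can be pushed through the rumoured 5-30% range. This is back-of-the-envelope arithmetic only; it assumes the percentage is a loss of throughput, so the runtime grows by a factor of 1 / (1 - loss), and `runtime_after_throughput_loss` is a hypothetical helper.

```python
def runtime_after_throughput_loss(baseline_hours, loss_fraction):
    """Runtime once the machine does (1 - loss_fraction) as much work per hour."""
    if not 0.0 <= loss_fraction < 1.0:
        raise ValueError("loss_fraction must be in [0, 1)")
    return baseline_hours / (1.0 - loss_fraction)


BASELINE_HOURS = 4.0  # the "approx. 4 hours for one operating point" run

for loss in (0.05, 0.30):
    slowed = runtime_after_throughput_loss(BASELINE_HOURS, loss)
    extra_minutes = (slowed - BASELINE_HOURS) * 60
    print(f"{loss:.0%} loss: {slowed:.2f} h total, {extra_minutes:.0f} extra minutes")
```

If the figure instead means "runs take that much longer", the multiplier is simply 1 + loss, which gives slightly smaller numbers.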

Jim Kim · Jan 2, 2018

arnemetis said:
So will Intel send me a 5.35ghz to 6.63ghz 8700k to offset the 5-30% performance loss on my brand new setup?

Check your mailbox, it's probably already there.

Jim Kim · Jan 2, 2018

naib said:
I'm tempted to benchmark a model I have right now (approx. 4 hours for one operating point), then cross-reference the KB this fix is associated with and run it again after...

That would be awesome, please do.

almalino · Jan 2, 2018

I have i7-8700k CPU + parts lying around to assemble a gaming box. If this is true and 5% to 30% speed reduction will be a reality in games I will regret my decision of buying intel as I wasted 700 EUR on CPU and mobo then.

Deleted member 93354 · Jan 2, 2018

Curious. The fix seems to force a reload of all the virtual tables into cache for page lookups. That sounds like it would cause the expected slowdown. So it sounds like cache poisoning. Maybe rowhammer on the cache??

Edit: As I suspected: Ivy Bridge fixed rowhammer by causing two row refreshes on the cell contents. So only cache could be affected.

Since the release of Ivy Bridge microarchitecture, Intel Xeon processors support the so-called pseudo target row refresh (pTRR) that can be used in combination with pTRR-compliant DDR3 dual in-line memory modules (DIMMs) to mitigate the row hammer effect by automatically refreshing possible victim rows, with no negative impacts on performance or power consumption. When used with DIMMs that are not pTRR-compliant, these Xeon processors by default fall back on performing DRAM refreshes at twice the usual frequency, which results in slightly higher memory access latency and may reduce the memory bandwidth by up to 2–4%.[6]

dgz · Jan 2, 2018

this is HUGE. It's not like AMD is in a position to capitalize on this, though. Such a shame

GNUse_the_force · Jan 2, 2018

dgz said:
this is HUGE. It's not like AMD is in a position to capitalize on this, though. Such a shame

At least they're not cozily wedging their GPUs inside Intel CPUs.

Khahhblaab · Jan 2, 2018

///AMG said:
Wow, if I actually see even a 10% slowdown I'm going to build a new rig. 5-10% for me means 45 mins-1.5 hrs.

..from ///AMG to AMD I bet

naib · Jan 2, 2018

dgz said:
this is HUGE. It's not like AMD is in a position to capitalize on this, though. Such a shame

Depends... In response to Intel's ME farce, AMD are offering end-users the option to disable their equiv on Ryzen while Intel doubled-down and disabled the ability to downgrade.

AMD shares are up, Threadripper and co got good reviews, mobile CPUs are being offered by Dell... A bit of marketing, say benchmarks in a couple of weeks, could do wonders.

GNUse_the_force · Jan 2, 2018

naib said:
AMD shares are up, Threadripper and co got good reviews, mobile CPUs are being offered by Dell... A bit of marketing, say benchmarks in a couple of weeks, could do wonders.

Would a price drop now be a good time for Ryzen line, say something along the lines of 5% - 10% and also announce Ryzen 2 ?

///AMG · Jan 2, 2018

Khahhblaab said:
..from ///AMG to AMD I bet

Yea, already looking at the Epyc processors. I was planning on getting the e5-2699 V5 this year in a 2P config, but it looks like I am going AMD if they truly aren't affected.

naib · Jan 2, 2018

Jim Kim said:
That would be awesome, please do.

Ok what I'll do is try on my Linux setup. This way I am in control of updating the kernel

This will run faster than on my work laptop as it is a ryzen 1600 so with the parallel pool enabled it will have 2 more cores to crunch with.
One problem is this patch breaks on Gentoo Linux due to the hardened options enabled.

I'll try to run the Sims tomorrow, just need to get the libraries

OutOfPhase · Jan 2, 2018

(To no one person specifically)

I'd recommend some patience and a little less breathless hysteria for the time being. There have been many errata like this you never really knew about but ended up getting a pretty invisible and unobtrusive fix.

///AMG · Jan 2, 2018

naib said:
Ok what I'll do is try on my Linux setup. This way I am in control of updating the kernel

This will run faster than on my work laptop as it is a ryzen 1600 so with the parallel pool enabled it will have 2 more cores to crunch with.
One problem is this patch breaks on Gentoo Linux due to the hardened options enabled.

I'll try to run the Sims tomorrow, just need to get the libraries

Yea, I'm going to see if I have some smaller models to work with and bench those.

naib · Jan 2, 2018

Phoronix have run some benchmarks

https://www.phoronix.com/scan.php?page=article&item=linux-415-x86pti&num=2

Nice hit

1_rick · Jan 2, 2018

Is this something that affects all instructions, slowing down everything? Or is it, for example, just page table lookups?

DeathFromBelow · Jan 2, 2018

PhaseNoise said:
(To no one person specifically)

I'd recommend some patience and a little less breathless hysteria for the time being. There have been many errata like this...

naib · Jan 2, 2018

1_rick said:
Is this something that affects all instructions, slowing down everything? Or is it, for example, just page table lookups?

Just page tables. So some applications won't be affected, but others might be.

Phoronix's initial tests show a nice drop for compiler-based operations.
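
One hedged way to see why some workloads would notice and others would not is to compare a loop that enters the kernel on every iteration with one that stays in user space. The sketch below is not a proper benchmark: Python's interpreter overhead dominates both loops, `time_per_call` is a hypothetical helper, and it assumes `os.getppid()` reaches the kernel on each call, which should hold on Linux.

```python
import os
import time


def time_per_call(fn, iterations=200_000):
    """Average wall-clock seconds per call of fn over a simple loop."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) / iterations


def kernel_entry():
    # Assumed to cross into the kernel on each call (a cheap system call).
    os.getppid()


def user_space_only():
    # Pure computation that never needs the kernel.
    sum((1, 2, 3, 4))


syscall_cost = time_per_call(kernel_entry)
compute_cost = time_per_call(user_space_only)
print(f"kernel entry: {syscall_cost * 1e9:.0f} ns/call")
print(f"user space:   {compute_cost * 1e9:.0f} ns/call")
```

Running this before and after a kernel update would, in principle, show the first number move while the second stays roughly flat, which is the pattern people are describing for I/O-heavy and compiler workloads.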

///AMG · Jan 2, 2018

naib said:
Phoronix have run some benchmarks

https://www.phoronix.com/scan.php?page=article&item=linux-415-x86pti&num=2

Nice hit

Wow, that's nuts.

1_rick · Jan 2, 2018

naib said:
Just page tables. So some applications won't be affected, but others might be.

Phoronix's initial tests show a nice drop for compiler-based operations.

I saw that after I made my comment. Looks ugly, nonetheless.

kirbyrj · Jan 2, 2018

Zarathustra[H] said:
They'll likely just give you a $5 off coupon for the next hardware revision CPU that incorporates the fix

And make you buy a new motherboard to use it knowing Intel.

OutOfPhase · Jan 2, 2018

DeathFromBelow said:

OR - pretend anyone suggesting a sober response is defending Intel.

longblock454 · Jan 2, 2018

naib said:
Phoronix's initial tests show a nice drop for compiler-based operations.

And a huge I/O hit.

atarione · Jan 2, 2018

Could be worse... news could have broken AFTER... I had pulled the trigger on an upgrade from my i7-4770K =)

but yeah..... this does sort of suck... AMD may however be slightly pleased currently.. I had been a pretty loyal team Red guy for many years, finally relenting to Intel's pretty clear advantage back in the C2D days...

next upgrade could be to AMD however... waiting on ryzen+
