Skylake and Kaby Lake Processors Allegedly Have Broken Hyper-Threading

Megalith

A warning advisory has gone out on Debian mailing lists claiming that Skylake and Kaby Lake processors may “dangerously misbehave” when Hyper-Threading is enabled: the writer advises that users should disable the feature immediately in their BIOS/UEFI to work around the problem. The defect can potentially affect any operating system (it is not restricted to Debian and other Linux-based systems).

This advisory is about a processor/microcode defect recently identified on Intel Skylake and Intel Kaby Lake processors with hyper-threading enabled. This defect can, when triggered, cause unpredictable system behavior: it could cause spurious errors, such as application and system misbehavior, data corruption, and data loss. It was brought to the attention of the Debian project that this defect is known to directly affect some Debian stable users (refer to the end of this advisory for details), thus this advisory.
 
wonder if this is actually firmware fixable or requires a hardware fix
 
It is in the article. The fix is a combination of microcode updates and disabling Hyper-Threading, depending on processor architecture and stepping. The steppings listed are for the Skylake parts with the issue.
 
To run these programs, disable Hyper-Threading through your computer's Basic Input/Output System (BIOS).

  • Restart your computer. ...
  • Press the key that the instructions mention. ...
  • Press the down-arrow key to select the "Hyper-Threading Technology" option.
  • Press "Enter." ...
  • Press the down-arrow until you have selected "Disabled."
  • Press the "Enter" key.
  • Press the "F10" key to save your changes, exit the BIOS and load your operating system.

I never knew you could do that, actually. I thought it was always enabled no matter what.
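For anyone who wants to confirm from the OS that the BIOS switch actually took effect, here is a minimal Python sketch. It assumes a Linux system whose /proc/cpuinfo exposes the `siblings` and `cpu cores` fields (nothing in the thread says so), and the function names are made up for illustration. The idea is simply that Hyper-Threading is active when a package reports more logical siblings than physical cores.

```python
def parse_first_cpu(cpuinfo_text):
    """Return the key/value fields of the first processor entry."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor entry
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def hyperthreading_active(cpuinfo_text):
    """True if logical siblings outnumber physical cores, None if unknown."""
    fields = parse_first_cpu(cpuinfo_text)
    try:
        siblings = int(fields["siblings"])
        cores = int(fields["cpu cores"])
    except (KeyError, ValueError):
        return None  # fields missing or unparsable (assumed layout not present)
    return siblings > cores


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("Hyper-Threading active:", hyperthreading_active(handle.read()))
    except OSError:
        print("No readable /proc/cpuinfo on this system")
```

Run it once before and once after changing the BIOS setting; if the answer does not flip, the setting probably did not stick.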
 
Yep. It's not as necessary now, but back when Intel first introduced Hyper-Threading you'd see disabling it recommended as a way to push a couple of extra MHz out of your CPU overclock.
 
That's a pretty severe "solution". Hyperthreading is why you get an i7 in the first place. And Skylake has been out a long time, how is this just now being noticed? Something doesn't add up.
 
But Intel never!!! Ever!! Right?
CPU bugs happen (remember the TSX story?). On top of that, this one was already fixed two months ago.
how is this just now being noticed?
Because it takes pretty specific conditions to occur. For one, it involves using 64-bit registers and their 8-bit subregisters at the same time... That, as you can guess, is not too common in code.
 
That's a pretty severe "solution". Hyperthreading is why you get an i7 in the first place. And Skylake has been out a long time, how is this just now being noticed? Something doesn't add up.

You should probably read the article ya know?

The issue was being investigated by the OCaml community since
2017-01-06, with reports of malfunctions going at least as far back as
Q2 2016. It was narrowed down to Skylake with hyper-threading, which is
a strong indicative of a processor defect. Intel was contacted about
it, but did not provide further feedback as far as we know.
...

Apparently, Intel had indeed found the issue, *documented it* (see
below) and *fixed it*. There was no direct feedback to the OCaml
people, so they only found about it later.

The defect is described by the SKZ7/SKW144/SKL150/SKX150/KBL095/KBW095
Intel processor errata. As described in official public Intel
documentation (processor specification updates):
 
Yep. It's not as necessary now, but back when Intel first introduced Hyper-Threading you'd see disabling it recommended as a way to push a couple of extra MHz out of your CPU overclock.
When it first came out, shutting off HT was the first reply on every gaming forum out there when people reported issues, overclocking or not.
 
Well sh*t, sticking with Haswell and Win7 are looking like better decisions by the day ;)
Hopefully Intel doesn't FUBAR the Covfefelake / Z370 release.

EDIT : Read the posts more thoroughly. Pretty unlikely for an end-user to actually run into the flaw, but nevertheless something I'd want patched.
Will see if Dell releases an updated BIOS for my OptiPlex 5040 (i7 6700) at work.
 
Where do microcode updates come from? The OS vendor? Intel directly?
Intel directly. Windows can then bundle it into an update, and mobo makers can bundle it into a new BIOS. Linux can apply it at boot time.
 
How are we supposed to get a fix? I'm on an i7-6600U (model 78, stepping 3). AMD thanks you, Intel!!

You can't. End users are screwed until the fix is pushed out to the consumer. The only thing we can do is disable HT. That said, Windoze has a lot of room for errors unlike *nix, crosses fingers that Windoze is that dumb that it might not matter.
 
You can't. End users are screwed until the fix is pushed out to the consumer
Are you really screwed? That shit is pretty damn hard to trigger (there's a reason it took almost a year for the first reports to come in, let alone for the bug to be discovered).
 
Bigger question is can it be fixed. Also where is Juanrga, he loves to report supposed errata on the AMD side but not Intel? Funny how certain people become MIA around here on certain news.
 
OMG
It's the P3 scandal all over again.
http://windowsitpro.com/windows-server/cpu-embarrassment-intel-recalls-pentium-iii-113ghz

http://www.tomshardware.com/reviews/intel-admits-problems-pentium-iii-1,235.html
Kyle Bennett Informs Me About Seeing The Same Problems
After I had posted my article I soon received an email from Kyle Bennett of [H]ard|OCP, stating that he had seen the very same kind of problems with his test sample as well, although he had actually received the special Intel motherboard that was supposed to be supplied with the Pentium III 1.13 GHz processor. He had seen the failures on this board also.

"Horrible Instability Without The Latest Micro Code Update"
In fact, we found the CPU to not be close to 100% stable on the i820 board supplied even with their own Rambus.

Just FYI.

...
Kyle Bennett
WebMonger @ Hard|OCP
Purveyor of Smoothness @ Ratpadz
Hosting Ho @ Gamers|Hardware
 
I've had my 6700K for about 1.5 years and never experienced any issues with it. o_O News to me.
 
wonder if this is actually firmware fixable or requires a hardware fix
Where do microcode updates come from? The OS vendor? Intel directly?
Well sh*t, sticking with Haswell and Win7 are looking like better decisions by the day ;)
How are we supposed to get a fix? I'm on an i7-6600U (model 78, stepping 3). AMD thanks you, Intel!!

First up, it can be fixed with a microcode update, and Intel has already created an update that has been distributed to mobo manufacturers. This would be packaged in a BIOS update; I expect any board with a 200-series chipset will get the update, and it's likely that 100-series boards will as well.
If you're on Linux, you eventually won't need the BIOS microcode update, as the Linux kernel can load microcode on boot, and Linux distributions tend to package microcode updates. (This isn't new. It's always been this way, and processor errata are actually quite common.)

This affects all SMP-aware OSs, Linux and Windows alike, so Windows 7 won't save you.

Finally, if you haven't been affected by this yet, you're not likely to be affected going forward. This affects both Skylake and Kaby Lake, so any new system from the past 2 years. If it were an issue that ordinary people would be hit by, people would have been reporting problems since the i7-6700s and i3-6X00s came out.

How to tell if you're affected: when doing extreme mathy things (compiling software, or running very custom, very low-level, highly threaded math applications, or designing rockets or brain surgery), does your computer always crash at the same spot, and does that workflow succeed on an i7-5770 or below? Well, then maybe you're affected.
 
^ No need to get your knickers in a twist. I had already updated my post to reflect the low likelihood of everyday users
running into the bug and that a microcode patch was already issued ;)

And I stand by my choice of Win7, what with the clusterf*ck that is the Win10 platform (vsync problems with WDDM2.2 in CU, etc.)
That said, I do have Win10 CU x64 installed on its own dinky little 128GB mSATA SSD, strictly for the few titles that I'd want to run in DX12.
 
Wait there, wasn't that fixed in recent microcode already?

Yep, already fixed.

Where do microcode updates come from? The OS vendor? Intel directly?

There are 2 distribution methods.

One is via a BIOS update. The other is via an OS update. Windows and *nix, for example, run a microcode check at boot; if the OS has a newer revision than the one already loaded, it applies it.
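To see which microcode revision is actually running, whichever route delivered it, a rough Python sketch like the one below may help. It assumes Linux on x86, where /proc/cpuinfo normally carries `cpu family`, `model`, `stepping` and `microcode` fields; the function name and the sample values in the usage note are hypothetical.

```python
def cpu_signature(cpuinfo_text):
    """Extract family, model, stepping and microcode revision of the first CPU."""
    wanted = {
        "cpu family": "family",
        "model": "model",          # exact match, so "model name" is ignored
        "stepping": "stepping",
        "microcode": "microcode",
    }
    info = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted and wanted[key] not in info:
            # int(x, 0) accepts both decimal ("94") and hex ("0xba") notation
            info[wanted[key]] = int(value.strip(), 0)
    return info


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            sig = cpu_signature(handle.read())
        print({name: hex(val) if name == "microcode" else val
               for name, val in sig.items()})
    except OSError:
        print("No readable /proc/cpuinfo on this system")
```

Comparing the printed revision before and after a BIOS or OS update shows whether a newer microcode was really applied. Fields that are absent from the input are simply left out of the result.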
 
Microsoft can "patch" the issue by including a new microcode package that's loaded at boot time. It's happened before (example).
 
First up, it can be fixed with a microcode update, and Intel has already created an update that has been distributed to mobo manufacturers. This would be packaged in a BIOS update; I expect any board with a 200-series chipset will get the update, and it's likely that 100-series boards will as well.
If you're on Linux, you eventually won't need the BIOS microcode update, as the Linux kernel can load microcode on boot, and Linux distributions tend to package microcode updates. (This isn't new. It's always been this way, and processor errata are actually quite common.)

This affects all SMP-aware OSs, Linux and Windows alike, so Windows 7 won't save you.

Finally, if you haven't been affected by this yet, you're not likely to be affected going forward. This affects both Skylake and Kaby Lake, so any new system from the past 2 years. If it were an issue that ordinary people would be hit by, people would have been reporting problems since the i7-6700s and i3-6X00s came out.

How to tell if you're affected: when doing extreme mathy things (compiling software, or running very custom, very low-level, highly threaded math applications, or designing rockets or brain surgery), does your computer always crash at the same spot, and does that workflow succeed on an i7-5770 or below? Well, then maybe you're affected.


Part of my reason for wondering if this could be fixed by a microcode update is that they kept calling this a processor defect or a microcode defect. To me it came off as though they were still questioning whether this was a code defect or a physical design defect, as I would consider a processor defect a physical design defect more than a software bug. If Ford were to release a car where the GPS software would reboot, that would be a code bug that could be fixed with a patch. If they designed a car in a way that the steering column had a tendency to break after about 20,000 turns, that would be a car defect. Or, looking at technology, the red ring of death on the Xbox 360 was a console defect, while something that caused the UI to run slow would have been a code defect. You couldn't fix the red ring of death with a software upgrade. You could maybe add ways in code to make it less likely, but you still couldn't prevent it with a code update.
 
I found a set of conditions under which my machine locks up every time.
I just tried without HT enabled and it works fine.
So looks like this bug affects me with my 6700K.

The bizarre condition is:
Win7-64
Running torrent client in a sandbox.
Sandbox and torrent client have limited system access via Comodo Firewall HIPS (not sure if this is relevant but it may help trigger it).

If I am downloading a torrent and try to play a card game (windows built in games) to pass the time, my machine will hard lock. Every time.
If I open the card game first or stop the torrent or let the torrent finish it works without issue.
I thought it might be related to locked-down security settings, but as it's something basic I can cope with, I didn't pursue a solution.

I tested opening a card game with a torrent downloading with HT disabled and it works fine.
So that explains that.
 
Quoting from a comment on /. (no idea as to the veracity - don't have time to check up on it @ work)
>>>
Apparently, the fix works only for some models of Skylake (models 78 and 94, stepping 3).
On any other Skylakes and all Kaby Lakes there's no way other than disabling hyperthreading entirely.
A fix might or might not be released in the future, Intel doesn't say a word about the issue.
<<<
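Taking that comment at face value, the decision it describes could be sketched in Python roughly as below. The Skylake model numbers (78 and 94, stepping 3) come straight from the quoted comment; the Kaby Lake model numbers (142 and 158), the family value 6 and the function name are assumptions added for illustration, and none of it has been checked against Intel's documentation.

```python
# Model numbers from the quoted comment: Skylake 78 and 94.
SKYLAKE_MODELS = {78, 94}
# Assumption for illustration only: Kaby Lake reports model 142 or 158.
KABY_LAKE_MODELS = {142, 158}


def suggested_action(family, model, stepping):
    """Map a CPU signature to the workaround described in the comment."""
    if family != 6:  # assumption: the affected Core parts are family 6
        return "not covered by this advisory"
    if model in SKYLAKE_MODELS and stepping == 3:
        return "update microcode"
    if model in SKYLAKE_MODELS or model in KABY_LAKE_MODELS:
        return "disable hyper-threading"
    return "not covered by this advisory"


if __name__ == "__main__":
    # Decimal family, model and stepping as printed in /proc/cpuinfo on Linux.
    print(suggested_action(6, 94, 3))
```

The values to pass in are the decimal `cpu family`, `model` and `stepping` numbers that Linux prints in /proc/cpuinfo. If the comment turns out to be wrong about which parts have a fix, only the two sets at the top need changing.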
 
I like how most of Asus's BIOS updates just say "Improve system stability". Vague enough?

The only one I see that even mentions microcode is from Feb of 2016. Assuming it's not this issue, but who knows?

Z170 PRO GAMING BIOS 1204
1.Update CPU Microcode
2.Improve system stability
2016/02/19
 
If the BIOS update is from May or newer it may contain the fix.
 
Reading can actually answer that question but apparently that's not your strong suit in this situation.

Doesn't look like it's available for everyone that has the issue. Some of the older processors (and steppings) are not included in that supposed fix, which may or may not be in a BIOS update. It's pretty damn vague, and my question is still valid for the older chips: can they fix it, and will they fix it? So try not to assume things, as it can make you look bad.
 
The microcode update for this is already available for Linux. If anyone really wants the update to avoid Linux problems, they can install it (in Debian) and follow the steps:

https://wiki.debian.org/Microcode

If you aren't using Debian, I presume the other distros have their own fixes.
 
Doesn't look like it's available for everyone that has the issue. Some of the older processors (and steppings) are not included in that supposed fix, which may or may not be in a BIOS update. It's pretty damn vague, and my question is still valid for the older chips: can they fix it, and will they fix it? So try not to assume things, as it can make you look bad.

Tell me what retail Skylake CPUs are stepping 2. My 6700K from release day is stepping 3.
 