intel 13th and 14th gen xx900 and xx700 may have defective cores causing crashes in gaming

d3athf1sh

So, this all started with a guy in the Steam forums who claims he has tracked down the cause of "Out of video memory trying to allocate a rendering resource" errors and crashes to desktop while gaming: defective cores that seem to have been fine out of the box but then seem to degrade over time until they start throwing WHEA errors. To quote him: "Inside those entries, look for 'Translation Look Aside Buffer' and 'Internal Parity Error'. There will be lots. That's a bad CPU."

he says he's been through 3 or 4 processors of both 13900k and 14900k varieties and it's always the same: fine for ~3-4 months, then they all end up the same way, throwing errors and crashing. meanwhile a lot of people on different forums have just been chalking it up to bad drivers or the gpu. one of the main temporary fixes has been to undervolt and underclock the cpu. i know this sounds fishy at first but i'm trying to summarize a very long post and probably not doing a good job. but Tom's Hardware has now picked up the story and has reached out to intel for a response.

if this interests you, here's a link to his original post, which is long and detailed and itself has a lot of links, plus a link to the Tom's article while we await a response from intel

https://steamcommunity.com/app/315210/discussions/0/4204741842669464852/?ctp=2

https://www.tomshardware.com/pc-com...blame-other-high-end-intel-cpus-also-affected
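
For anyone who wants to check their own machine for the two strings he mentions, here is a minimal sketch in Python. It assumes you have already exported the relevant Windows event log entries to a plain text file yourself; the function name, the `MARKERS` table, and the loose whitespace matching are my own illustration, not anything from his post. It only counts phrase occurrences and doesn't diagnose anything.

```python
import re

# Patterns for the two phrases quoted in the post. Spacing varies between
# sources ("Look Aside" vs "Lookaside"), so whitespace is matched loosely.
# These names are hypothetical, chosen only for this sketch.
MARKERS = {
    "tlb": re.compile(r"translation\s+look\s*aside\s+buffer", re.IGNORECASE),
    "parity": re.compile(r"internal\s+parity\s+error", re.IGNORECASE),
}


def count_whea_markers(text: str) -> dict[str, int]:
    """Count how often each marker phrase appears in exported log text."""
    return {name: len(pattern.findall(text)) for name, pattern in MARKERS.items()}


# Hypothetical usage, assuming the export was saved as exported_events.txt:
#   with open("exported_events.txt", encoding="utf-8", errors="replace") as fh:
#       print(count_whea_markers(fh.read()))
```

A nonzero count only tells you the phrases are in the log; whether that means a bad CPU, a bad BIOS default, or something else is exactly what this thread is arguing about.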
 
Not sure if this answers any questions. I haven't seen many overclockers on the forums complain about the CPU degrading so quickly, but maybe they are too embarrassed to admit that they fried their chip a little bit. I tend to tune the voltages well below 1.4, even below 1.3 if I can with a high quality chip, so I've never experienced this. I have noticed, though, that if I let the BIOS run wild it can push well past 1.4 volts, so that might have something to do with it, especially under extended heavy loads.
 
I'm running a $154 fire sale 12600K on the Z790 Nova. The T-Create RAM's 7200 XMP profile shoves 1.4v into the memory controller. So yeah, I'm set at 1.25 and 6000 memory because seeing 1.4 all day every day seemed like a bad idea.

I'm willing to bet it's from overclocking or even just memory OC.
 
Which some memory profiles on some boards do by default.
 
Well, I can't say what he's seeing is wrong/impossible, but without more info I wonder if it has a different cause, as others here are speculating. We have a LOT of 13th gen systems at work. At least 100. We have seen zero failures. My personal rig at home is a 13900k, I built it like 10 months ago, and it has no issues, including in Hogwarts Legacy, which is what I've been playing lately.

So I'm not buying his "Your CPU is faulty," as though this is just Intel fucking everything up. I do wonder if power/cooling is an issue, as I do run mine at the Intel default 253 watt limit because it runs plenty fast, no need to push it harder.
 
The boards themselves do a really good job at power smoothing and general rectification, so unless you have a woefully inadequate power supply it shouldn't be an issue.
 
We have a LOT of 13th gen systems at work. At least 100. We have seen zero failures.
most of them probably never use the p-cores and if they do i'm guessing they don't stress them. not that i know what you do for work, i'm just guessing that's how they're used in 95% of office use cases

they've also said it's only on the high end chips that are already basically max oc'd from the factory. the lower clocked chips haven't seen the problems
So I'm not buying his "Your CPU is faulty," as though this is just Intel fucking everything up.
well, the thing is, i think his results are repeatable. that's why they accepted his cpu returns, and they were also repeatable for the guys over at Tom's. He did say the last 14900k he got has been running good and he hasn't had any problems with it (yet). so there's no telling how widespread this is right now or if it's already been fixed or what. but there have been a lot of people experiencing the same problems, and up until now they've been blaming it on game developers and gpu drivers or hardware. then they started to notice the one thing everyone reporting these issues has in common: they're all running the same intel cpu's.
 
No issues with my 13700K so far, I enabled CPU Lite Load (9) in the bios though, everything else is at default. Played all the games Tomshardware mentioned. TLOU2 shader compilation ran at least 5-7 times due to driver updates and it was never an issue. My core clock in games is always at 5327Mhz.
 
I had a real bad problem with the game Wartales crashing and resetting on Steam. So I uninstalled Razer Synapse, then used a real Xbox controller cord instead of a knockoff that was more flexible. That seemed to fix my problem. I own a 13700K and am waiting on Arrow Lake.
 
I'm running a $154 fire sale 12600K on the Z790 Nova. The T-Create RAM's 7200 XMP profile shoves 1.4v into the memory controller. So yeah, I'm set at 1.25 and 6000 memory because seeing 1.4 all day every day seemed like a bad idea.

I'm willing to bet it's from overclocking or even just memory OC.
Pretty much this. I would generally blame the CPU last (behind memory and mobo) in cases like this unless Intel directly acknowledged it. They should be putting everything at stock settings and clocks (even disabling XMP profiles), and replacing memory and/or mobos before the CPU as well. Maybe they've done this, but I didn't read it all as my work blocks the links.
 
I would imagine that the FrogMaster himself may have some info on this if it is indeed a thing.
 
I feel like this is more along the lines of these companies pushing the chips to the edge of the safety margin by default, like you used to have 50% headroom back in the day and now you have 5%. Any flaw that isn't caught when the silicon is made, which would have gotten the chip binned to a lower tier, can show up as a fault later on.
 
most of them probably never use the p-cores and if they do i'm guessing they don't stress them. not that i know what you do for work, i'm just guessing that's how they're used in 95% of office use cases
Plenty of office use for sure, but also simulation software. Ansys (Structures, Fluids, and EM), Solidworks, Comsol, and so on. They can hit pretty hard.
 
" Inside those entries, look for 'Translation Look Aside Buffer' and 'Internal Parity Error'. There will be lots.

That is pretty undeniable proof that a core is malfunctioning. Whether it is Intel's fault or the BIOS' is a different matter.


I also note the irony that people are now relying on error reporting from parity while at the same time (semi-)voluntarily running gigabytes of RAM without error reporting (ECC).
 
Yeah this is a wait and see. Way too many variables with no hard evidence as to what the fault is. I'm glad a hardware site is getting into this, but I do wish it was someone other than Tom's assware.

If there really is a hardware fault....Intel isn't going to want to talk about it. Imagine having to recall all those CPUs and the fallout from that one.
 

Nightmare fuel for sure. It took them 1-2 years to admit the puma6 modem chips were bad.
 
This aint gonna keep me from upgrading to a 14900KF ;)
 
It's weird that there are literally millions of people all over the globe running the same hardware for the same use without any problems, and yet somehow this one person is able to buy 4 "defective" chips in a row from multiple generations of chips with individual purchases that were spaced months apart.

If this were a design defect, this would be hitting the millions of people who it currently is not. If this were a manufacturing defect, then we're looking at odds that make Powerball feel like a coin flip.
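
To put rough numbers on the manufacturing-defect argument: if each chip independently has some defect rate, the chance of one buyer getting several bad ones in a row is that rate raised to the number of chips. The sketch below assumes independence, and the 1% rate is a made-up illustrative figure, not a real Intel failure rate.

```python
def prob_all_defective(defect_rate: float, chips: int) -> float:
    """Probability that every one of `chips` independently bought CPUs is defective.

    Assumes each purchase is an independent draw with the same defect rate.
    """
    if not 0.0 <= defect_rate <= 1.0:
        raise ValueError("defect_rate must be between 0 and 1")
    if chips < 0:
        raise ValueError("chips must be non-negative")
    return defect_rate ** chips


# Hypothetical example: a 1% defect rate and 4 chips in a row.
example = prob_all_defective(0.01, 4)
print(f"chance of 4 bad chips in a row: about 1 in {1 / example:,.0f}")
```

The independence assumption is the weak point. If something on the user's side (board power settings, voltage, cooling) is what degrades the chips, then every replacement dropped into the same system shares that cause, and repeated failures stop being surprising.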
 
My thinking exactly. Some random guy pushing his chips beyond the max and blaming it on the manufacturer.
 
I feel like this is more along the lines of these companies pushing the chips to the edge of the safety margin by default, like you used to have 50% headroom back in the day and now you have 5%. Any flaw that isn't caught when the silicon is made, which would have gotten the chip binned to a lower tier, can show up as a fault later on.
Sounds like you are closest to being right. Intel pushed these too close to the limit. Instead of +5% headroom, you really have -5% headroom.
Tom's Hardware said:
With past Intel CPUs, setting a TDP or amperage absurdly high generally didn't make a difference, as complex frequency boosting rules determine how much actual power and current are used. There are protection mechanisms to prevent a CPU from damaging itself. However, the 13900K in particular can be negatively impacted by these high or unlimited power settings...
...we were routinely hitting up to 100C when we saw the crashing problems...
If you're using a 13th or 14th Gen Core i9 or i7 and you're experiencing these crashes with certain games, you might want to downclock, undervolt, and/or set a power/current limit on your CPU. There's a good chance one or more of those will fix the problem.
TH has always been an intel apologist. 100°C is where CPUs are going to crash - that's a chip/cooling problem (from 21 years of overclocking experience and moderating an OC forum).
The solution is the same - undervolt and/or downclock your CPU, until/if there is a BIOS fix to nerf the CPU.
 
Or just keep the vcore fixed... out of the box my bios was liberally applying vcore at load, my CPU would hit 100c instantly and throttle... locking vcore at 1.3v keeps the temps under 90c thankfully.. this 13700kf still runs pretty hot while stress testing.. I am thinking about a new 280mm AIO in push-pull when I go 14900kf.
 
At least we know it's not electromigration, cause that would be a stupid question.
 
Well looks like my board defaults to it also. I just loaded defaults and ..

1708729406310.png

Now I never even come close to thermal throttling playing games and so on but I'm going to force the stock limits anyway.

One tick on the ez mode bios in the middle of the screen to Enable Intel Limits:

1708729945095.png


The default setting should be the CPU spec and everything else is tuning/overclocking and should be manual. Pretty silly to have this be the default behavior even on mid range and higher tier boards.
 
Nightmare fuel for sure. It took them 1-2 years to admit the puma6 modem chips were bad.
Well look at the security issues we had multiple bios updates to patch. And those patches don't really solve the problem!

I did read an article about that... back then Intel was making AMD look like garbage. Someone got the old hardware, installed all the patches, and all of a sudden Intel wasn't really much faster than the old Phenoms. Theory was that they intentionally bypassed some security protocols to get speed. Nobody can prove that though.

And did they ever do any kind of recall? Nope. We got patches that slow your system down instead.

So yeah I can't see anyone getting to the bottom of this one any time soon. Maybe some more crippling bios updates.
 

Recent high-end Intel CPUs are crashing Unreal Engine games

When CPUs get too hot and power-hungry

By Rob Thubron February 22, 2024 at 5:20 AM
In a nutshell: Are you using one of Intel's top-end 13th-gen or 14th-gen processors and have noticed your games are crashing a lot? It's a problem that primarily affects Unreal Engine titles, and a division of Epic Games, along with Nvidia and gaming studios, are pointing the finger squarely at Team Blue's hot and power-hungry hardware.

There have been several reports of Core i9-13900K and Core i9-14900K processor users experiencing crashes in games that show an 'out of video memory' error. The issue is also being experienced by those using the Core i7-13700 and Core i7-14700.
 

Hopefully these turn out to be isolated incidents. I have found out over many years of PC gaming that I no longer need the top tier chips to enjoy games. I was always an i7 or i9 guy, but now I'm happy with an i5. Runs so much cooler and is less power hungry.
 
so this is back in the news (techspot). i guess last week Nvidia had a note in their driver release notes directing people with problems towards intel.

moore's law is dead, in his "Intel 14th Gen Mass Failures Leak" video yesterday (grain of salt), says he spoke with a friend at a retailer, who said:
mlid_ntel.jpg


also Jayztwocents and a couple other youtubers are releasing videos on the topic


View: https://youtu.be/HIubZYwBfPc?si=ZfpWNxkHqfPzJuQD

nvid_ntel2.jpg
 


Interesting.

I wonder what Steve has to say
 
It's funny this is gaining traction now, but Tomshardware covered this just as thoroughly back in February, based on the Oodle complaints in December, and reached these same conclusions.
I suppose though now that the YouTubers are jumping on board maybe a fix will get delivered faster...?

https://www.tomshardware.com/pc-com...blame-other-high-end-intel-cpus-also-affected

Oh apparently Intel issued a BIOS update to their partners 4 days ago that adds a "Baseline" Profile to "fix" this issue.
https://www.tomshardware.com/pc-com...-on-raptor-lake-and-raptor-lake-refresh-chips

Now who's going to pay for the damages, and who is going to lead the class action against the MoFo MoBo manufacturers?
 