5900X hard crashes at 4.8GHz all-core no matter the voltage, thoughts?

eViL_M@LuM

Crosshair's Nephew
Joined
Aug 28, 2002
Messages
3,688
Hey all, it's been some time since my last post here. I come seeking some wisdom that I can't seem to find elsewhere. I'm running into an issue where anything at 4.8GHz or above hard-shuts down the system when the CPU is under load, like Cinebench R23 or Prime95. Normally an unstable overclock shows up as a WHEA error; however, I only see a Kernel-Power error 41 in the Event Viewer after restart. When I say a hard shutdown, I mean it's as if you turned the power supply switch off while running. I'm wondering if I'm reaching the limits of the VRMs and it's thermally shutting down. I have tried multiple RAM types, different motherboards, and different PSUs.

Current setup
CPU: Ryzen 9 5900X @ 4.75GHz 1.3V - IHS lapped to a mirror finish, TG liquid metal, NZXT Kraken X73 - pulls 248-260W running Prime95/R23
Mobo: Asus ROG Strix X570-E Gaming - BIOS 4021
RAM: 4 x 8GB G.Skill Trident Z Neo 3600MHz @ 3733MHz 14-14-14-28 2T 1.55V
SSD: WD Black SN850
GPU: PowerColor Red Devil Ultimate
PSU: EVGA SuperNOVA 1000W
Case: Asus TUF GT501

With this current setup I can run R23 for a full 30 min and it levels off around 71°C, so it doesn't appear to be CPU thermal throttling. This is a 24hr-stable setup. I know there is not a lot of headroom on these in general; I just don't feel like this is a CPU-related issue. It shuts down like OCP tripping in the PSU would. I'm just curious if anyone else has had this issue or if I'm overlooking something. The attached image was taken before I finished tweaking the RAM. Which is another thing: no matter the voltage, timings, or whether I use 1, 2, or 4 sticks, it will not boot at 3800MHz.
 

Attachment: benchmarks.jpg
Use HWiNFO to check SOC volts and temps, as you're cranking that RAM as hard as anyone I've seen.
 
https://www.guru3d.com/articles_pages/powercolor_radeon_rx_6900_xt_red_devil_review,6.html

https://www.evga.com/products/product.aspx?pn=120-GP-1000-X1

The PowerColor uses around 350W max according to guru3d. Combined with the 260W from your CPU, that's about 610W. That leaves 390W on the 12V rail, but the max power for the whole PSU is limited to 1000W, so any power drawn on the 3.3V, 5V, etc. rails will reduce that number. Still, it seems like you're fairly safely in spec for the PSU, at least before factoring in fans, pumps, etc.
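For anyone who wants to redo that math with their own numbers, here is a rough sketch. The 350W and 260W figures are the ones quoted in this post; the function name and the optional allowance for the minor rails are illustrative assumptions, not anything taken from the PSU spec sheet.

```python
def psu_headroom_w(psu_rated_w, loads_w, minor_rail_allowance_w=0):
    """Return the watts left after subtracting the listed loads.

    minor_rail_allowance_w is a hypothetical placeholder for 3.3V/5V draw,
    fans, pumps, etc. It defaults to 0 because no figure is given for it.
    """
    return psu_rated_w - sum(loads_w) - minor_rail_allowance_w


gpu_w = 350  # approximate max board power quoted from the guru3d review
cpu_w = 260  # upper end of the 248-260W reported under Prime95/R23

total_w = gpu_w + cpu_w
headroom_w = psu_headroom_w(1000, [gpu_w, cpu_w])
print(f"combined load: {total_w}W, headroom: {headroom_w}W")
```

This only covers steady-state draw as reported by reviews and software; swap in your own measured numbers if you have them.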
 
I have this card modded; under full load it reports 448W. Even so, I agree the PSU should be within limits. It'd make sense if it was a WHEA error, but this is just different.
You may be onto something with the SOC check - the temps hover between 95-105°C
 
Are you running it on a UPS? You may be having some issues on the socket run you are using for your computer. I know I've had a bunch of power issues in the past, until I figured out the one socket I had been using for my computer was shared with 3 rooms.
 
I hadn't thought of dirty power or lack of power at the plug. Whoever wired this house definitely was a special person. I'll test that theory, need to find a UPS to test.
 
I don't think your system is stable. Are you sure 4750 is good with 1.3V? That seems pretty low. You are running flat 14s with 2T; is GDM off? Also, your RAM is running at 1833, not 1866.
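On the 1833 vs. 1866 point: DDR memory transfers twice per clock, so a 3733 rating corresponds to a real memory clock of about 1866MHz, and a 1:1 setup wants the fabric clock at that same number. A quick sketch of that check follows; the function names and the 1MHz tolerance are made up for illustration.

```python
def memclk_mhz(ddr_rate):
    """Real memory clock in MHz for a given DDR data rate (e.g. 3733)."""
    # DDR moves data on both clock edges, so the clock is half the rating.
    return ddr_rate / 2


def is_matched(ddr_rate, observed_clock_mhz, tolerance_mhz=1.0):
    """True if an observed clock lines up 1:1 with the DDR rating.

    tolerance_mhz is an assumed allowance for rounding in monitoring tools.
    """
    return abs(memclk_mhz(ddr_rate) - observed_clock_mhz) <= tolerance_mhz


print(memclk_mhz(3733))
print(is_matched(3733, 1866))
print(is_matched(3733, 1833))
```

A reading of 1833 against a 3733 setting fails this check, which is what the comment about 1833 vs. 1866 is getting at.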
 
What's the exact part number of that RAM? The stuff with the Samsung ICs can do 3733MHz and beyond. The Hynix stuff really can't. Also, 1.55V on that RAM is way beyond what I'd be using on it. If that was the Samsung-based RAM, you shouldn't need that much voltage. With AM4 systems, memory issues and poor cooling account for the vast majority of issues.
 
To be fair, to run 1866 14-14-14-34 1T with GDM on I need 1.5V, but only 1.475V for 15-15-15-36 2T GDM off with a tight tRFC @ 1900. But I am using a mixed set... my Royals are held back by my cheap Black and Whites :D
 
And there is your problem. The fact is, those are probably different ICs across all four DIMMs since you aren't using a matched set of 4. You are using two different matched sets of 2. Furthermore, AMD does not support speeds over DDR4 2933MHz using 4 single-ranked modules or over DDR4 2667MHz using 4 dual-ranked modules. This is a screenshot from the reviewer's guide we got before the Ryzen 5000 series launched:

1639711511463.png


Pull out the mismatched pair, return your settings to 1.4V, and clock the Trident Z Royals to their maximum rated speed (and timings), then see if the machine can handle a 4.8GHz all-core overclock. If it can, you know your memory settings are the problem. I'd almost bet money that's where your instability comes from, so long as your temps are good and you aren't seeing thermal throttling.
 
Maybe undo your OC and run at stock? It's not worth the cost to your nerves. Also, the CPU is fast at stock. :)
 
It runs great, they are all B-Die. 2 kits of 3200C14..

hwinfo.PNG


toight.PNG
 
Even if they are the same, running four modules over DDR4 2933MHz or DDR4 2667MHz (single vs. dual ranked) is not officially supported by AMD and could, in fact, be your problem. As I said, pull two modules and see if it works at 4.8GHz.
I think you have me confused with the OP.. I don't have a problem..
 
Yes I did.

To be clear, while AMD doesn't support memory speeds over DDR4 2933MHz using 4x DIMMs, it is still possible. I was able to achieve DDR4 3600MHz clocks on 4x8GB modules on an MSI MEG X570 GODLIKE. However, there is no guarantee it will work, and that's why AMD doesn't support the configuration.
 
I saw the FCLK wasn't matched and changed it. It's stable with Prime95 small FFTs running for 24hrs, followed by MemTest86. It's not a normal unstable shutdown. It's almost as if a thermal limit is triggering a soft shutdown. And yes, GDM is disabled.

Part number is F4-3600C14-8GTZNB




lol you probably don't want to know my voltages... but after 2 decades of overclocking I've learned what "longevity" really means. I have a set of DDR4 3200 Corsair that ran at 1.65V for 3 years and is still working to this day. I have this set at 1.575V.


FOUND THE ISSUE.....

As I suspected, it was something reaching a thermal shutdown point. Turns out I had forgotten to reduce the SOC amperage for my 6900 XT, and it was reaching its thermal max and shutting down. I reduced it back to stock and no longer have stability issues.
 