Unusual results, stock 2600x prime95 errors

risc

Been out of the game a long time, I'd appreciate input on interpreting what's happening here.

Assembled a new system and have been validating it with my usual routine: assemble, load all BIOS defaults, install OS/drivers, update, then stress test for a few hours at a time over several days. If everything is OK, adjust basic BIOS preferences and hardware specs (in my case only setting DDR4 3000 on auto), and retest.

Everything was fine for a week with no concerns: all prime95 tests and prime95+furmark for many hours, superpi to 32M many times with all 12 threads' results matching.

Now, in the second week, I'm getting consistent prime95 errors. Backed RAM off to 2933, 2600, then reset to optimized defaults (2133) and still get errors.

Ran memtest86 for 2 hours and no errors.
Just ran prime95 and a worker failed around 20 minutes.
 
You need to understand why it's failing. Whatever your memory is rated for should be absolutely fine. You may have to increase the voltage ever so slightly, as some motherboards may not have the most consistent memory voltage. The memory subsystem has its own VRMs of varying quality, output, etc., so you need to find out what your motherboard likes. Some RAM needs slightly higher voltage in some cases. Corsair RAM typically likes to be overvolted by .25v to .50v. You should use something to monitor your CPU temperatures and see if your CPU is getting the proper cooling. If your temps are too high, heat can cause cores to throttle and worker threads to fail. You may also need to adjust voltage up or down slightly to try to improve stability or reduce temps. Motherboard VRMs may also need load-line calibration set and the power phase duty cycle settings altered.

Also try running other stability tests. I've seen some tests that simply do not work well on some systems. Some systems may be stable with absolutely every other stability test and application that you can throw at them and fail on one specific one. Sometimes, it's the benchmark itself as it relates to your specific hardware configuration. I saw this on all my reference systems in Shadow of the Tomb Raider's built-in benchmark. I downgraded to a different version of the game and the problem was solved. Applications like Prime95 are no different. Make sure you have the latest version of Prime95. Older versions have issues with some newer processors.
 
Understanding why this is happening is my goal. I've been racking my brain, and the first possibility is CPU memory controller failure, but that seems crazy to me.
For the record, I treat all my systems the same, assemble, test, adjust, test, retest, test, test, confirmed good.

The only change I've made this last week is replacing my old PCP&C 750 with a Seasonic SSR-650GD and replacing all my system fans with Arctic 120's and 140's.


This is my first new build in a while, specs:
asus prime x470-pro
corsair 2x16 CMK32GX4M2B3000C15R
amd 2600x
arctic freezer 34 dual fan
2x nvme (samsung 970 evo and crucial p1), 1 sata dvdr
evga 960 2gb recycled

I'm completely stumped on why I'd be stable, and now unstable. I am not overclocking at all, everything is auto/default except ram frequency, but same result with base defaults.
 
Like Dan said, sometimes the BIOS defaults are marginal for your parts. My experience with my Asus Ryzen boards is that the default stock settings tend to be a little more aggressive than stock on other boards. Additionally, since you are running a Ryzen setup, I have to ask...

1) Are you on the latest firmware for your mainboard? If so, did you update it between your initial tests and the current ones? Firmware can improve memory compatibility on Ryzen, but can also rarely make formerly stable RAM a little less so.

2) I see you have a 2600X CPU there... Try changing PBO from "Auto" to "Disabled" and see if it makes a difference. PBO is automatic all-core overclocking, and on Asus boards, it is set to AUTO by default. You ARE overclocking using the defaults.

3) Like Dan said, try upping the CPU LLC setting from Auto to Normal or High. This helps with sagging voltages on the CPU, if that is the problem.

4) Do you still have the old Power Supply? If so, try temporarily reconnecting it and re-testing to see if the problem remains.
 
I'm testing with your suggestions, disabled Power Boost and upped max SOC power delivery to 110%.
Just passed an hour of prime95 Large FFT test where I've been failing.


1. Board came with latest firmware.

2. My CPU is not an interesting sample: it momentarily hits 4.1 GHz in single-core tests and hovers around 4 GHz on all cores.
Frequencies, temps, and voltage are fairly consistent.
 
Never mind, an error occurred at 1 hour 5 minutes as I was replying.

[May 24 22:09] Worker starting
[May 24 22:09] Beginning a continuous self-test on your computer.
[May 24 22:09] Please read stress.txt. Choose Test/Stop to end this test.
[May 24 22:09] Test 1, 68000 Lucas-Lehmer iterations of M4818591 using FMA3 FFT length 240K, Pass1=768, Pass2=320, clm=2.
[May 24 22:14] Test 2, 68000 Lucas-Lehmer in-place iterations of M4718593 using FMA3 FFT length 240K, Pass1=768, Pass2=320, clm=2.
[May 24 22:18] Self-test 240K passed!
[May 24 22:18] Test 1, 68000 Lucas-Lehmer iterations of M5120737 using FMA3 FFT length 256K, Pass1=1K, Pass2=256, clm=2.
[May 24 22:22] Test 2, 68000 Lucas-Lehmer in-place iterations of M5030735 using FMA3 FFT length 256K, Pass1=1K, Pass2=256, clm=2.
[May 24 22:26] Self-test 256K passed!
[May 24 22:26] Test 1, 52000 Lucas-Lehmer iterations of M5605023 using FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2.
[May 24 22:30] Test 2, 52000 Lucas-Lehmer in-place iterations of M5505025 using FMA3 FFT length 280K, Pass1=896, Pass2=320, clm=2.
[May 24 22:34] Self-test 280K passed!
[May 24 22:34] Test 1, 52000 Lucas-Lehmer iterations of M5705025 using FMA3 FFT length 288K, Pass1=384, Pass2=768, clm=2.
[May 24 22:39] Test 2, 52000 Lucas-Lehmer in-place iterations of M5605023 using FMA3 FFT length 288K, Pass1=384, Pass2=768, clm=2.
[May 24 22:42] Self-test 288K passed!
[May 24 22:42] Test 1, 52000 Lucas-Lehmer iterations of M6225921 using FMA3 FFT length 320K, Pass1=320, Pass2=1K, clm=1.
[May 24 22:46] Test 2, 52000 Lucas-Lehmer in-place iterations of M6225919 using FMA3 FFT length 320K, Pass1=320, Pass2=1K, clm=1.
[May 24 22:50] Self-test 320K passed!
[May 24 22:50] Test 1, 44000 Lucas-Lehmer iterations of M6684673 using FMA3 FFT length 336K, Pass1=448, Pass2=768, clm=1.
[May 24 22:55] Test 2, 44000 Lucas-Lehmer in-place iterations of M6684671 using FMA3 FFT length 336K, Pass1=448, Pass2=768, clm=1.
[May 24 22:59] Self-test 336K passed!
[May 24 22:59] Test 1, 44000 Lucas-Lehmer iterations of M7471105 using FMA3 FFT length 384K, Pass1=384, Pass2=1K, clm=1.
[May 24 23:03] Test 2, 44000 Lucas-Lehmer in-place iterations of M7471103 using FMA3 FFT length 384K, Pass1=384, Pass2=1K, clm=1.
[May 24 23:07] Self-test 384K passed!
[May 24 23:07] Test 1, 36000 Lucas-Lehmer iterations of M7998783 using FMA3 FFT length 400K, Pass1=320, Pass2=1280, clm=1.
[May 24 23:11] Test 2, 36000 Lucas-Lehmer in-place iterations of M7798785 using FMA3 FFT length 400K, Pass1=320, Pass2=1280, clm=1.
[May 24 23:14] Self-test 400K passed!
[May 24 23:14] Test 1, 36000 Lucas-Lehmer iterations of M8716289 using FMA3 FFT length 448K, Pass1=448, Pass2=1K, clm=2.
[May 24 23:15] FATAL ERROR: Rounding was 0.5, expected less than 0.4
[May 24 23:15] Hardware failure detected, consult stress.txt file.
[May 24 23:15] Torture Test completed 16 tests in 1 hour, 5 minutes - 1 errors, 0 warnings.
[May 24 23:15] Worker stopped.
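
For anyone comparing several runs, here is a rough Python sketch that pulls the passed FFT sizes and the FFT size active at the first fatal error out of a worker log like the one above. It assumes the line layout shown in this post (`[Mon DD HH:MM] message`); the function name `summarize_worker_log` and the placeholder year are my own inventions, not anything Prime95 provides.

```python
import re
from datetime import datetime

# Matches lines such as "[May 24 22:09] Worker starting".
LINE_RE = re.compile(r"^\[(?P<stamp>\w{3} \d{1,2} \d{2}:\d{2})\] (?P<msg>.*)$")
FFT_RE = re.compile(r"FFT length (\d+K)")

# The log carries no year, so a placeholder is assumed purely for time arithmetic.
ASSUMED_YEAR = "2000"


def summarize_worker_log(text):
    """Return the FFT sizes that passed and details of the first fatal error, if any."""
    start = None
    current_fft = None
    passed = []
    failure = None
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        stamp = datetime.strptime(
            ASSUMED_YEAR + " " + match["stamp"], "%Y %b %d %H:%M"
        )
        msg = match["msg"]
        if start is None:
            start = stamp
        fft = FFT_RE.search(msg)
        if fft:
            current_fft = fft.group(1)  # remember which FFT size is being tested
        elif msg.startswith("Self-test") and "passed" in msg:
            passed.append(msg.split()[1])
        elif msg.startswith("FATAL ERROR") and failure is None:
            failure = {
                "fft": current_fft,
                "minutes_in": int((stamp - start).total_seconds() // 60),
                "message": msg,
            }
    return {"passed": passed, "failure": failure}
```

Feeding it the text of the log and printing the result shows at a glance whether repeated failures cluster at one FFT size or move around. The timestamps only have minute resolution, so `minutes_in` can differ by a minute from the total Prime95 reports.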
 
My 2600X does the same thing with prime95. I thought I had it fixed a couple of times with load-line calibration changes and CPU or RAM voltages, but it always eventually gave rounding errors in prime95. This is water cooled, with max temps of 58 CPU and 70 core, running stock on a B350-F Asus board. If you tell it to quit prime95 server/official testing...whatever it's called, and then reboot, and load it back up, does it work for a while again? That was the only consistency I got, and I haven't run the software since.
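
For context on what a "rounding error" means here: Prime95 multiplies huge numbers with floating-point FFTs, and every result should land very close to a whole number. The toy Python sketch below is not Prime95's code (its transform is far more optimized); it only illustrates the idea of measuring how far each FFT output is from the nearest integer. The helper names `fft` and `multiply_with_roundoff` are made up for this illustration.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def multiply_with_roundoff(a_digits, b_digits):
    """Multiply two little-endian digit lists via FFT.

    Returns (coefficients, worst), where worst is the largest distance
    between a raw floating-point result and its nearest integer.
    """
    size = 1
    while size < len(a_digits) + len(b_digits):
        size *= 2
    fa = fft([complex(d) for d in a_digits] + [0j] * (size - len(a_digits)))
    fb = fft([complex(d) for d in b_digits] + [0j] * (size - len(b_digits)))
    raw = fft([x * y for x, y in zip(fa, fb)], invert=True)
    coeffs = []
    worst = 0.0
    for value in raw:
        real = value.real / size  # undo the scaling of the inverse transform
        nearest = round(real)
        worst = max(worst, abs(real - nearest))
        coeffs.append(nearest)
    return coeffs, worst


if __name__ == "__main__":
    # 321 * 54, digits stored least significant first.
    coeffs, worst = multiply_with_roundoff([1, 2, 3], [4, 5])
    product = sum(c * 10 ** i for i, c in enumerate(coeffs))
    print(product, worst)
```

On healthy hardware the worst distance in a small case like this is tiny. A value of 0.5, as in the log above, means a result sat exactly between two integers, so the output can no longer be trusted; that is why the torture test treats it as a hardware fault.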
 
Make sure your Windows install image is new enough for Ryzen. I was using a store-bought Win10 USB stick, but the install version was older than the Ryzen release, so it barely ran once it booted up. Everything pointed to memory or memory controller issues. Once I downloaded the newest USB image from Microsoft and reinstalled, everything was fine.
 
Well, it would be helpful to isolate what specific function in Prime95 is failing here. Large FFT does a lot of power draw and moderate RAM testing.

Try small FFT and see if it fails. If THAT fails then the issue might be with thermals, and if it does not, the issue is likely with power delivery.
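
Written out as a tiny Python sketch, the rule of thumb above looks like this. It is only my heuristic turned into code, not a diagnostic tool, and the function name `triage_prime95` is made up for illustration.

```python
def triage_prime95(small_fft_failed, large_fft_failed):
    """Map torture-test outcomes to the first area to investigate.

    Encodes the rule of thumb from this post: a small FFT failure points
    at thermals; a large FFT failure with a clean small FFT run points at
    power delivery.
    """
    if small_fft_failed:
        return "thermals"
    if large_fft_failed:
        return "power delivery"
    return "no failure reproduced"


if __name__ == "__main__":
    # The situation so far in this thread: Large FFT fails, small FFT untested.
    print(triage_prime95(small_fft_failed=False, large_fft_failed=True))
```

Treat the answer as where to look first, not as a verdict; RAM and firmware can still be involved either way.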

You never did say whether the original power supply is still available for testing. Basic troubleshooting procedure would suggest trying that, as you say the system was rock solid before this component was replaced. Seasonic makes great power supplies, but even they sometimes have issues; nobody's perfect. I've read on these forums about issues with certain runs of Seasonic power supplies having trouble with Vega video cards, for example, where the user has had to replace the power supply with a different model to resolve the issue (in this case, overcurrent protection tripping despite being under the max draw of the power supply).

Edit: Also check Asus regularly for firmware updates. Ryzen firmware still changes frequently. Asus is in the process of rolling out AGESA 0.0.7.2a, and that has only been out for about a week for my Asus boards (give or take a few days depending on model).
 