Build Issues

Percy (Gawd; joined Sep 27, 2002; 853 messages)
I built my PC in early 2021, nearly two years ago, and started having issues with it roughly 4 months ago. I am having a tough time troubleshooting random reboots, hard freezes without reboots, and DPC Watchdog BSODs. Specs are below.

I replaced my CPU with a brand new one because I was getting WHEA-Logger Cache Hierarchy Errors, but that did not solve anything, so I returned the CPU I bought to test. I ran Memtest for 8 passes and all of them passed. I also RMA'd the RAM, and they gave me sticks with sequential serial numbers as replacements. I RMA'd my motherboard; after I put the new board in, I had no issues for almost 2 weeks, and then suddenly the random reboots, DPC Watchdog BSODs, WHEA-Logger errors, and freezing came back. I have also tried reinstalling Windows several times, thinking it might be a corrupted install, but that did not solve anything either.

One strange thing is that when I put an old build's RAM in, I start having strange shit, and after a few restarts I have no issues at all. It's been 4 days so far and all is well. Although the RAM I have is widely used in Ryzen builds, it is not on the QVL. Does that really make that much of a difference? I had no issues for 2 years and out of nowhere this RAM is problematic, even though I just got 2 new sets from RMA and an RMA board? Could the PSU or GPU possibly be causing issues with the RAM?

Motherboard: ASUS Crosshair VIII Dark Hero
CPU: AMD Ryzen 9 5950X
RAM: G.Skill Trident Z 3600MHz, 4x 16GB sticks (2x F4-3600C16D-32GTZR)
Sabrent Rocket 4.0 1TB NVME SSD
Sabrent Rocket 4.0 2TB NVME SSD
EVGA RTX 3090 Ultra FTW
EVGA Supernova 1200P2
Sound Blaster Z

The only three parts I haven't replaced yet are the GPU, PSU, and sound card.

Has anyone encountered issues like this before? Here are links to the dump files.
https://drive.google.com/drive/folders/1YGPKIs3WEmkP_ZiWgeNgibjrJTT_Ax4t?usp=share_link
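For anyone who wants to pull the WHEA-Logger entries mentioned above out of the System event log without clicking through Event Viewer, here is a minimal sketch. It assumes Windows with the stock `wevtutil` tool on the PATH; the helper names `build_whea_query` and `recent_whea_events` are made up for illustration, and this is only one way to do it.

```python
import subprocess

# Provider name used by Windows for hardware error (WHEA) events.
WHEA_PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_whea_query(max_events=20):
    """Build a wevtutil command line (hypothetical helper) that lists the
    newest WHEA-Logger events from the System log as plain text."""
    xpath = f"*[System[Provider[@Name='{WHEA_PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event provider
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # read newest events first
        "/f:text",            # human-readable text output
    ]


def recent_whea_events(max_events=20):
    """Run the query and return wevtutil's text output (Windows only)."""
    result = subprocess.run(
        build_whea_query(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(recent_whea_events())
```

Comparing the timestamps in that output against when RAM kits or BIOS settings were changed is one way to see whether the errors follow the memory configuration.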
 
It could be the 4 memory sticks. Have you tried firing up the computer with only 2 of the 16 gig sticks?
 
Strip it down completely, clean everything with compressed air, and rebuild. If you still have issues then, consider troubleshooting in depth.
 
It could be the 4 memory sticks. Have you tried firing up the computer with only 2 of the 16 gig sticks?
Tried that. Didn’t work. I’m currently using my old build's RAM with 4 sticks in there and all seems to be OK.
Strip it down completely, clean everything with compressed air, and rebuild. If you still have issues then, consider troubleshooting in depth.
Tried this as well.
 
One troubleshooting technique is to simplify down to bare essentials and add pieces back one at a time until you find the issue. Reset the BIOS to factory defaults and make minimal changes. This can be more challenging if you aren't sure you can trust your boot drive or memory.
If you're able to get Windows stable, start stress testing with tools like OCCT and FurMark; HWiNFO is good for looking at voltage and temperature sensors. If you can't get to Windows reliably, there are other tools like 'Ultimate Boot CD' or a USB-bootable Linux distro (I like Linux Mint). I haven't had to do anything like this in a decade though, so YMMV.
 
Here's the thing: the issues happen very randomly. Sometimes at idle, sometimes when watching a video. Sometimes it just freezes and BSODs even at the login screen, and sometimes it crashes right after I enter my password. Very strange shit. What's even more weird is that my old build's RAM is what makes it stable. However, with the 64GB of RAM I had when I initially built the PC, everything worked fine for almost 2 years before these issues started happening.
 
Everything seems to point to a RAM issue. When you did memtest, did you run it with only one stick installed at a time? If you run it with all sticks installed, it could give false positives or false negatives. Did you test the memory on the JEDEC profile instead of EXPO/XMP?

What is the make and model of the RAM from your old build?
 
The RAM from my current build has been RMA'd and replaced with 2 new sets, and I still had issues after putting them in.

The old build's RAM currently installed is “G.SKILL Ripjaws 4 Series 32GB (4 x 8GB) DDR4 2666 (PC4 21300) Memory Kit Model F4-2666C15Q-32GRKB”.

I ran memtest with 4 sticks. Also, IIRC, it was with DOCP enabled. This was before the RMA.
 
There was a BIOS update released last week. Maybe try updating your BIOS?

I will say on the QVL point that it can come into play when you're talking about overclocked memory. Having all 4 slots occupied can also increase the voltage needed to keep the system stable. One thing I would note is that the 4x8GB kit you're using right now is on the QVL while the two 2x16GB kits are not. The 4x8GB kit on the QVL is just the red version of the same memory while yours is the black.
 
BIOS is already on the latest version.
 
So a little update. EVGA sent me a brand new PSU. All was well for about 3 weeks to a month before my PC stopped posting again with 0d and 1F codes. Had to revert back to my OG 4x8gb kit from my past build.

Could it be the CPU? That is the only thing I never fully replaced.
 
Everything you've said points to memory and I have a couple of thoughts on that.

The first is that anything over JEDEC specs is technically an overclock and not guaranteed, so you might need to lower the speed, loosen the timings, or tweak the voltage. Sometimes XMP (or DOCP in your case) doesn't work right and you just need to enter the settings manually.

The other thing is, like others have mentioned, running 4 sticks is much less likely to work properly than 2. The fact that your other memory works with 4 doesn't mean that the new memory will. It might have less margin, and there's a much higher chance that the new memory is dual rank, which can also be harder to run.

I would try entering the specs manually first, and try with 2 sticks next if that doesn't fix the problem. You could try running a lower speed or looser timings too.
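To put rough numbers on "lower speed or looser timings", here is a small sketch that converts a kit's rated CAS latency into nanoseconds. The 3600 CL16 and 2666 CL15 figures are read off the part numbers in this thread (F4-3600C16D, 2666C15Q); the DDR4-2133 CL15 line is only an assumed JEDEC-style baseline for comparison, and `cas_latency_ns` is a made-up helper name.

```python
def cas_latency_ns(data_rate_mts, cas_cycles):
    """Convert CAS latency from clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the clock in MHz is half the
    data rate in MT/s, and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


if __name__ == "__main__":
    kits = [
        ("Trident Z 3600 CL16 (problem kit)", 3600, 16),
        ("Ripjaws 4 2666 CL15 (old, stable kit)", 2666, 15),
        ("DDR4-2133 CL15 (assumed JEDEC-style baseline)", 2133, 15),
    ]
    for name, rate, cl in kits:
        print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

The 3600 kit asks the memory controller for both a higher clock and a shorter absolute latency than the old kit, so dropping the speed or raising CL a step or two gives back some margin.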
 
Eber is dead on; an Intel engineer told me the same about the XMP issues with my X99 board. Four sticks seem to work best with quad channel platforms.
 
So a little update. EVGA sent me a brand new PSU. All was well for about 3 weeks to a month before my PC stopped posting again with 0d and 1F codes. Had to revert back to my OG 4x8gb kit from my past build.

Could it be the CPU? That is the only thing I never fully replaced.
I think you are answering your own question: it is RAM related.

G.Skill didn't play nice with my 12700K rig; Corsair RAM was flawless with XMP.
 
Looks like it's time to shop for some ram I guess. Should I just snag something off of the QVL list?
 
Reset to BIOS defaults, reboot into BIOS and start over if you haven't. Try bumping the Boot Voltage for the DIMM. My CH7 with 4x8GB wouldn't boot properly all the time until I upped that by .015v or something.
 
I've tried this already.
 
Could be the GPU. I just went through this with my first 7900XTX. Constant crashing, mainly memory related. I went around in circles with RAM, motherboard, CPU, power supply, etc. I ended up moving the card to my spare rig and lo and behold, instant crashing on that one too. I returned the card and ordered a different brand and have had no issues since then.
 
I just got my replacement RMA GPU and the same issues are happening. I've also been encountering more issues recently, both before and after the GPU replacement. All of my monitors go black, and sometimes I eventually see the desktop. Sometimes I just see white borders and have to force a shutdown with the power button.

I've tried removing the old drivers with DDU and installing the drivers again. I've reseated the GPU. Nothing seemed to fix it.

This has been so frustrating.

1690064405128.png

1690064724714.png

1690064714502.png
 
Man, I had been doing some research and I saw that ASUS-branded AM4 and AM5 boards have been having a ton of problems with nvlddmkm crashes, and it ended up being the Armoury Crate/RGB control. Just something to look at.

EDIT: after thinking some more, I remembered that some 30-series GPUs also needed a firmware update if you have Resizable BAR enabled. I don't know if you turned that on, but it is worth a look.
 
The GPUs needed a VBIOS flash to support Resizable BAR and reveal the option in the drivers. Not doing the flash shouldn’t otherwise cause problems.

The first Ampere card to ship with ReBAR support was the 3060. All 3090, 3080, 3070, and 3060 Ti cards sold up to that point need a VBIOS flash to use ReBAR.

After the 3060 release, the other cards started shipping with an updated VBIOS.
 
It has been noted as a problem once or twice. Normally it isn't, but there are always things that go wrong. I was just spitballing ideas.
 
Since the system is stable with your old memory, that almost guarantees that the new memory is the issue. It doesn't matter that you RMA'd it, because it's likely not defective but just not playing nice with your mobo/IMC.

Try manually setting your speed, timings, and voltage. Giving the SOC voltage a small bump (assuming you have headroom) like chameleoneel suggested is another good thing to try. If neither of those works, try running the memory at a slower speed or looser timings. There are utilities like Ryzen DRAM Calculator that can help figure out the best settings, but you probably don't need to go to that length unless you're trying to wring every last bit of performance out of it or are completely lost on timings.
 
I agree with the few that have told you to check your DIMM voltage. I'm also using Gskill Trident modules (though mine are Neos) and enabling XMP does not set the correct voltage for the timings. The default for the modules is 1.2V which is what the BIOS will go with and it causes all kinds of stability issues. If you go to their site and look up your sticks they'll likely say something like 'XMP tested at 1.35V'; set your DIMM voltage to that and see if it helps.
 
I will give manually setting the speed, timings, and voltage a try, as well as the SOC voltage. I do remember something I never mentioned before. When I had my 64GB set of RAM in and had issues with booting, my motherboard wouldn't POST and the only codes that would pop up were 0d or 1F. In both of those states I couldn't enter the BIOS because nothing would happen.
 
If it's still unstable, you can bump up the vSOC to around 1.075-1.100V. 4 DIMMs at 3600MHz is a tall order.
 
Is VDDSOC not the one I was supposed to change?
That's the term that ASUS uses for SOC, so yes, that's the correct voltage setting. That said, 1.2 is a bit high from what I understand; 1.1 or 1.15 would be a good setting to try.
 
1.2 is the high end of safe. But it IS fine as a daily setting.

If it's stable at 1.2, try it at 1.1. If it's not stable, then set it back to 1.2. Or you can try somewhere in between, if you really want to fine-tune it.
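The "try somewhere in between" part can be sketched as a simple bisection over candidate SOC voltages. This is only an illustration under stated assumptions: `is_stable` is a hypothetical callback standing in for "I set this value in the BIOS, ran my stress tests, and nothing crashed", the 25 mV step is arbitrary, and it assumes that if a voltage is stable then every higher voltage up to the cap is stable too, which real hardware does not promise.

```python
def lowest_stable_voltage(low_mv, high_mv, step_mv, is_stable):
    """Bisect for the lowest stable setting between low_mv and high_mv.

    Voltages are in millivolts to avoid floating point drift. Returns None
    if even high_mv is not stable. Assumes stability is monotonic in voltage.
    """
    candidates = list(range(low_mv, high_mv + 1, step_mv))
    if candidates[-1] != high_mv:
        candidates.append(high_mv)  # always include the upper cap itself

    if not is_stable(high_mv):
        return None

    lo, hi = 0, len(candidates) - 1  # candidates[hi] is known stable
    while lo < hi:
        mid = (lo + hi) // 2
        if is_stable(candidates[mid]):
            hi = mid        # stable: look for something lower
        else:
            lo = mid + 1    # unstable: need more voltage
    return candidates[lo]


if __name__ == "__main__":
    # Pretend (for illustration only) the system needs at least 1.14 V.
    print(lowest_stable_voltage(1100, 1200, 25, lambda mv: mv >= 1140))
```

In practice each `is_stable` call is hours of testing, so the point of bisecting is just to keep the number of BIOS trips small.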
 
I had heard that it was pushing it a bit, but I'm far from an expert and didn't want to confirm the other part without at least mentioning that. It looks like it was running slightly under 1V on auto, so I think that 1.1-1.15 still might stabilize it, but good to know that 1.2 is OK.
 
So since Tuesday afternoon all was fine up until about an hour ago, when I started blue screening again. It was non-stop BSOD after BSOD. I swapped back to the working RAM I was using and, sure enough, got a 0D code: all monitors were in standby and there was no POST. I cleared CMOS, got into the BIOS, and set DOCP for the RAM that was working fine. I got strange BSODs I've never seen before; unfortunately there is no dump for those. But I do have the dump from the BSOD that happened when I had the 64GB kit in.

1692335413597.png


And here are some photos after swapping back to the working kit. I managed to get it to stop spitting out BSODs after swapping sticks around, disabling DOCP, and running on auto.

IMG_4938.JPG
IMG_4939.JPG
IMG_4940.JPG
IMG_4941.JPG
IMG_4942.JPG
IMG_4943.JPG
IMG_4944.JPG
 
Have you tried running your memory at JEDEC standard speed of like 2133MHz or 2400MHz? Whatever the BIOS defaults at.

Are these sticks single rank or dual rank? Trying to stabilize four dual rank sticks at 3600MHz will most likely be difficult. With Zen 3 you ideally want to run 4 single rank sticks or 2 dual rank sticks. Just because memtest passes doesn't mean the CPU's IMC can handle it. My gut says this is your issue.
 
Seems like you have RMA'd all the major components at this point. Have you done any testing to see if it might be your NVMe drive?
A bad drive can cause random and varied blue screens like that, but IME when a drive starts going you get a bunch of system freezes before it gets to that point.

Edit: Since the issue went away after swapping back to the known working memory that should rule it out as an issue.
 