5900x, running hot

Ryzen Master doesn't show the full CPU power usage. The full CPU power usage here is 38 W, but Ryzen Master only shows CPU core power and SoC power. There are other parts to it, and they all contribute to the total power usage. The temp in Ryzen Master is the die average, so you can have hotter parts as well. I usually go by the hottest parts of the CPU and total CPU power consumption, as I feel that is more relevant. Die average will be quite a bit lower than the hottest parts during low power usage. The difference in CPU die average between Ryzen Master and HWiNFO is because of different polling snapshots.

1616787991631.png
 
The difference is in which number is used. The hottest part of the CPU will generally be either Tctl or CCD1, which are usually about 3-5 degrees above the die average (the number Ryzen Master uses) during light desktop usage (browsing, watching YouTube, etc.). For example, CCD1 shows an average of 35 while the die average shows 30. Fully idle is of course lower; that is the power draw when the computer is not being used at all and just sits at the desktop with only background services.
 
Kind of interesting that you get full usage in Prime95. In all-core, mine goes to just below 130 W with 24-thread small FFT; only with blend do I get close to 142 W. With small FFT I can get 142 W with 5 and 6 threads. Using 5 threads with small FFT is the absolute hottest I can get my CPU (high 70s or low 80s depending on cooling settings), but once I run something like 12 or more threads of small FFT it detects it and "throttles" to around 130 W and temps go way down. Not sure if there are some BIOS features that detect the load and adjust CPU settings or similar.

My memory is on the QVL, so both G.Skill and Asus say it is tested to be OK at rated speeds (outside of my tRC edit). I did check the QVL, as I know that the Ryzen CPUs don't have the best memory controller.

Is geardown on? I have to run with geardown due to having dual rank; not sure if you can get away with it off with 3200 MHz memory. Most likely your memory will run at 3200 MHz with relaxed timings, e.g. by setting them manually. The 1.2.0.2 BIOS is afaik on its way, but is probably a few more weeks out. 1.2.0.1 patch A (a modification to 1.2.0.1) was rushed out to fix/mitigate the USB issues some people were having.
 
There's nothing wrong with my RAM itself. It's passed multiple nights running memtest86 at its rated XMP speed (3200) on 4 different CPUs on 2 different motherboards. This started immediately after installing the latest BIOS that uses AGESA 1.2.0.0 (and also on .1). I ran optimized defaults before and after, so whatever that is. My RAM is dual rank Samsung b-die, which is unbelievably still for sale today. I bought it in 2017 for $350 USD- https://www.newegg.com/g-skill-32gb-288-pin-ddr4-sdram/p/N82E16820232218?Item=N82E16820232218 I originally built my old 1800X with a single rank b-die kit (https://www.newegg.com/g-skill-16gb-288-pin-ddr4-sdram/p/N82E16820232530?Item=N82E16820232530) but I wanted 2x16GB.

My screenshot above is P95 torture test blend, to be clear. That's all I ever run with P95. I was happy to see it actually hit 142W though. That proves there's no leftover Zen(+) cruft preventing my system from hitting its true power limit. While I may have a less-than-optimized boost going on, I'm glad to know there's not a hard cap on my system as it is. Boost under CBR20 may improve with future AGESA releases I hope. If not, oh well. I'm punished for not buying a new motherboard I suppose. It's a great victory for AMD and Asus!

I just checked geardown mode, and it's set to auto, which for Asus usually means off. I'll do some testing on this after work today. This is why I came back here, I noticed you had your head better in this stuff than I do. :) Intel was always more forgiving on things like this, the memory controllers just worked.

edit- couldn't resist testing this.. no change, still crashes immediately with geardown enabled. On AGESA 1.1.8.0, I had the exact same settings and pretty sure I even ran P95 torture test blend with CBR20 at the same time.
Didn't mean that there is anything wrong with the RAM, just that Ryzen has a reputation for not being the best at memory compatibility. Options that may be worth trying are setting more relaxed timings, e.g. 16-16-16-36, or upping the voltage a tiny bit to 1.4 V. Personally I would probably relax the timings rather than up the voltage, but 1.4 V should be fine for the CPU and B-die is known for its ability to handle high voltage.

At least nice to see your system hitting full wattage, even if it sucks that you get instability at the same time.
 
XMP should set speed and voltage along with CAS, tRP, tRCD and tRAS; the rest it leaves on auto. It is possible that AMD/ASUS have changed some of the auto settings and that might be what causes the issues. There are afaik also some settings that you might not be able to get to, which are set automatically. If you can't get it stable with a reset, then try setting D.O.C.P. and 16-16-16-36 afterwards to see if that is more stable.
 
What is your SoC voltage in HWiNFO? For reference, mine is between 1.070 V and 1.082 V on auto. Afaik that is the voltage that one increases, along with DRAM voltage, to get memory stable. It should not exceed 1.100 V though.
 
Afaik it should change when you set XMP/D.O.C.P. If it doesn't, then that could be what causes the instability. It should need higher voltage with the higher FCLK etc., along with an increase of VDDG as well. It is possible to set it higher than 1.1 V, but it shouldn't need that much. My guess is that it should be around 1.050 V or so on auto with D.O.C.P.
 
9 Asus boards now have AGESA 1.2.0.2 beta releases, including two lowly B550 boards. Only vendor with 1.2.0.2 releases out right now. Other than 20+ years of good luck with them, this is why I always run Asus. I'm expecting 1.2.0.2 on my X470 board before any other vendor has them out for 400 series. edit- MSI has some 1.2.0.2 betas out too. I support that! 👍
Let's hope it solves your issues :)
 
The 11900k may be faster in a few benchmarks, but for my usage it would be quite a bit slower. Geekbench is not a benchmark I have a lot of faith in, but I did a short run with stock settings just for comparison. The place where Intel has the advantage is in crypto (AES), which is most likely run with hardware acceleration on, and the Intel score is also influenced by short-term boost (similar to PBO, but it runs only for a minute or so if in spec). If you turn that off, then you get results that are in line with long-term load. When within spec, Intel performance drops after the initial boost is over, so turning it off is the correct way to review the Intel CPUs, as it shows what the CPU will do over extended load instead of just in the short term.

I do understand that you got frustrated with the instability issues you had. Personally I am happy with my AMD systems (2400G+x470, 5800x+x570 and 5900x+x570), with no stability issues at all. Tbh. I do think you damaged something with PBO, most likely something related to the VRMs, as you weren't complaining about instability before you had the first unexpected shutdown with PBO. Your 5900x setup was also underperforming quite a bit in multicore, not sure why.

I often swap between GPU vendors, depending on which of them gives me the most bang for the buck. Ever since the 10xx series that has been Nvidia, and it will continue to be until AMD gets their raytracing performance in order. In my experience there have been just as many issues with Nvidia as there have been with AMD. Lots of hotfix drivers have been installed for my Nvidia cards to be able to play the newest games, and sometimes they broke old ones in the process. I do get that the internet "consensus" is that AMD drivers are bad, but in my experience Nvidia is just as bad if not worse.

Anyways happy that you got a working system, even if it is unfortunate that you had to swap a lot of parts to get it.

Geekbench scores for a 5900x on an x570 system on defaults+D.O.C.P. (application of XMP settings). As you can see, the multicore is a blowout vs Intel, and the single-core advantage for Intel mainly comes from AES-XTS.
1621854782160.png

1621855362572.png

1621855418028.png
 

By damage I meant that you may have overheated your VRMs to the point of damage, and that less stable power delivery could be the cause of your instability. The only times I have had such drastic changes in memory stability have been due to pushing the limits with overclocking on a Core 2 Duo that ran at 35% above spec for many years, and pushing it too far a few times too many.

The scores for multicore are a blowout in the sense that my 5900x scores almost 32% higher. Comparing per core is kind of irrelevant in all-core testing, otherwise we would still be on 4-core CPUs boosting to max clock. The scaling has partially to do with power, as the 5900x is distributing 142 W over 12 cores, while your Intel system with extra boost is probably using 200 W or more for 8 cores (similar to running PBO on AMD).
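
To put rough numbers on the power-per-core point, here is a minimal sketch in Python. It only divides package power by core count, so it ignores the share that goes to SoC/uncore. The 142 W figure is the limit mentioned in this thread, and the 200 W figure is only the guess above, not a measurement. The helper name `watts_per_core` is made up for illustration.

```python
def watts_per_core(package_watts: float, cores: int) -> float:
    """Average package power per core (ignores the SoC/uncore share)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return package_watts / cores


# 5900X: 142 W package limit spread over 12 cores (figure from this thread).
amd = watts_per_core(142, 12)

# 11900K: 200 W is an assumed estimate for boosted all-core load, 8 cores.
intel = watts_per_core(200, 8)

print(f"5900X:  {amd:.1f} W per core")
print(f"11900K: {intel:.1f} W per core")
```

With those assumed inputs the Intel chip gets roughly twice the power budget per core, which is part of why a per-core comparison in all-core loads is not apples to apples.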

The part that seems to benefit the Intel CPU is the AES test in Geekbench. Remove that and the gap would change quite a bit. Personally I don't do a lot of AES encryption in my normal usage, so it wouldn't have much of an effect. I don't really need AVX512 either, as the stuff I do has no use for it.

Personally I haven't had the USB issues, even though I run PCI-e 4 GPU and 2 PCI-e 4 m.2 drives, but it seems some have had it. Afaik it was only on systems running PCI-e 4 though.
 
I had the USB issues on my X470 on PCIE3. PCIE4 supposedly can exacerbate the issue, but so can heavy CPU usage. Not everyone hits those conditions. I was able to duplicate it, not consistently as it seems to hinge on some sort of unique order-of-events, but I did get a video of it here.


When that happened, if I unplugged all my devices and plugged them back in, it would seem to go away for a while. How to get there, I'm not sure. But it's definitely an AMD design flaw.

I'm not seeing AES skewing the results; it looks like in single core the 11900K is faster in nearly everything by some margin or another. It doesn't really matter though. Just because I use something you don't, doesn't mean that one CPU isn't better on the whole. I like what I see here.. but if Geekbench doesn't represent your needs then that's just GB for you. Kind of like the USB issues, the fact that you haven't run into them doesn't mean they don't exist or that your hardware doesn't have the issue. I was eyeballs deep on that one. No one is safe from it, disabling PCIE4 doesn't resolve it, and neither does disabling C-States (a "fix" for various issues on all Ryzen CPUs that I've ever owned, and supposedly, I always had bad steppings.. an awful lot of bad steppings by AMD! But it was not required on my 5900X, finally).

I ran PBO2 for a total of one day, and the issue revealed itself then.. I never overclocked the 2700X that was in this board prior nor even enabled PBO once. If the VRMs are damaged, it was done very quickly, and without any custom changes to PBO2, which I ran at its default values. I don't believe that at all, not given the plethora of other issues plaguing the Zen3 launch for me and many others. I was trusting AMD to have PBO2 working correctly at its default values and it didn't, while a similar feature, MCE, just works flawlessly on my Z590.

I feel like I've noticed the AMD guys always want to blame broken/failed hardware. No need for all the excuses and "blame the victim" finger pointing... by Jove if that were the issue I'd just replace it! There's nothing wrong with that board. It's AGESA, it's AMD's failure. If I trusted them to sort it out and deliver a fix, I'd just wait it out. Who knows, but I have work to do and a life to live. Jesus H Christ.

As if I didn't or don't want my 5900X to work. It actually made me sad to tear it down. No one's gleefully spending $1000US on a pricey mITX Z590 board and 11900K. I don't trust any board vendor, given my research, to guarantee I won't run into the same USB/memory controller issues on another board. I'm not going to mess with AMD's stuff again, not today at least. I would say maybe again in the future, but that would be a lie, this Z590 with ALL other existing underlying hardware worked perfectly. That's it for me, I'm done, riding with Intel from here on out unless they drop their quality from their very clear high standard. I don't have time for all that pissing around, too many odd things happened over the last 4 years with my 4 Ryzen CPUs and 2 motherboards for me to cook up excuses for AMD's engineering department. AMD doesn't pay my bills, in fact, they've hurt my ability to pay mine. I suspect the very large number of these Ryzen rigs out there couldn't pass CTR2.0's stress test without crashing. Just a hunch though.

Mysteriously I buy my first Intel board and CPU in 13 years (I bought a Q9450 in early 2008) and voilà, life is perfect. I used that Q9450 for about 9 years and refused to upgrade even with Sandy Bridge.. glitch free, just worked. I've had enough good experiences that at this point I'm saying "Intel, take my money!!". I also have an Intel X25-M 160GB SSD that I used the heck out of daily for 7 years, and it STILL RUNS, hooked up to my old Wii U via USB. Flawless drive that was worth every cent of the $460 that I paid for it in 2010. It's pretty stupid of me after experiences like that to piss in Intel's face, and never again. I'm not the broke college student that I was in 2000 when I built my Athlon Thunderbird 700MHz. Which was also problematic thanks to the VIA KT266A chipset (AMD's chipsets were just trash).. but things got pretty decent on Nforce2 and later. Nvidia, building the platform that saved AMD's reputation in that era..

My point on breaking down the per-core performance is that it's a 32% higher performance outcome, with 50% more cores. While I agree with your analysis that mine is consuming a lot of power to do it, the end result performance gap just isn't there in any unexpected way; it may even be a little on the low side. I will say my current settings are fully within Intel's warranty requirements. PBO2 voids warranty... but what I have here with my 11900K does not. That's worth something. And, a lot of people would rather see the sort of minimum frame and average frame uplifts that you get from Rocket Lake in Cyberpunk. https://www.eurogamer.net/articles/digitalfoundry-2021-intel-core-i9-11900k-i5-11600k-review?page=4 Being able to extract this level of performance out of an 8 core CPU is really the ideal, and it's the best gaming CPU you can get, unless you're only playing CS:GO.
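
The per-core arithmetic behind that point can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation using the two figures quoted in this thread (about 32% higher multicore score, 12 cores vs 8), and the function name `per_core_ratio` is made up for illustration. It assumes the score scales evenly across cores, which real workloads do not strictly do.

```python
def per_core_ratio(multicore_gain: float, core_gain: float) -> float:
    """Relative per-core throughput implied by a multicore score gain.

    multicore_gain: fractional score advantage (0.32 means 32% higher).
    core_gain: fractional core-count advantage (0.50 means 50% more cores).
    """
    return (1 + multicore_gain) / (1 + core_gain)


# 5900X vs 11900K, using the numbers quoted in the thread.
ratio = per_core_ratio(0.32, 0.50)

print(f"5900X per-core throughput relative to 11900K: {ratio:.2f}")
```

In other words, under that even-scaling assumption each 5900X core delivers somewhat less than each 11900K core in this all-core run, while the chip as a whole still comes out ahead.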

Why are you running P95 full tilt and expecting not to get some lag? Why are you also running all these game launchers? When I run nicehash I take it for granted my computer is going to lag if I try to do other tasks.
 
You seem quite anti-AMD to me. I will use whatever meets my needs the best at a given time. Currently it is an AMD CPU and an Nvidia GPU. In a few years it might be an Intel CPU and an AMD GPU; it all depends on what I believe is better.

You can blame the AGESA all you want, but the facts are that ASUS is known to under-report the VRM temps on several boards (an ASUS issue, not AMD) and you did not monitor your VRM temps while running PBO. Since your failures came after crashing with PBO, it is not unlikely that you ran the VRMs so hot that they took permanent damage before the system shut itself down. People act like PBO cannot damage their hardware, however there is a reason why it is not on by default, and it is the user's responsibility to ensure that the hardware is operating within safe tolerances when running PBO.
 
That doesn't make any sense.. the system crashed at the Zen3 memory spec of 3200MHz, long before PBO2 was ever enabled. I know I can blame AGESA, because I isolated it. I didn't do enough A/B testing with PBO2 to care to sort that out, and I didn't think the performance uplift made enough sense to leave it enabled, so there's a remote chance what you're saying is correct, but I think you're seriously grasping at straws. I've never met an AMD diehard that didn't lay the blame for all issues at the feet of the user or broken hardware. Objectively and widely known to be the worst hardware vendor available for decades, but they made no mistakes. Got it. I will accept blame, I suck because I keep buying into their stuff every few years.

It's experience, like owning Chrysler products... it's not childish fanboyism. I'm too open minded for my own good and I'm way overdue on closing the book on AMD. Flatly put, my Intel systems have consistently been rock solid, for 40 years now. You may be able to write that off easily with some narrative about my sample size being too small to matter, yadda yadda yadda, but I can't and won't. Neither will a lot of people. Unfortunately for those with different views from mine, I've owned recent hardware from both vendors. There's no cheap talk here, I have a sample size of one with both a 5900X and 11900K and it's an absolute slam dunk in favor of the 11900K. Tough to talk trash on that, most open-minded, willing "anti-AMD" guy around. Cost me good money to be able to have the right to my viewpoint as well. Didn't just buy one, and sit around defending my purchase.
AMD is objectively and widely known to be the worst hardware vendor for decades? I had no idea...
 
You were bragging about how your system was stable and enjoying the extra performance with PBO on all the time before you even mentioned a crash. Now you are claiming you were crashing before PBO was turned on and that the system was unstable.

You try to claim you are open minded, but based on what you are typing you seem more closed minded and bitter. Basically mad that a lot of us are enjoying our working systems from the horrible, evil AMD and don't give any traction to your attempts at making the 11900k seem far superior. This will be my last post on this topic so you can finally go back to your corner and pout.
 
So I checked out of this thread, but I figured I'd come back to mention that about a month ago I traded out the 100i for a 150i, and it made a noticeable difference. I had games that would run in the high 70s to mid 80s, but after switching to the 150i the same titles stayed in the 70s. That additional air flow, cooling surface area, and transfer time provides positive results.
 