Ryzen 3000 hype and review oddities

Discussion in 'AMD Processors' started by Morkai, Jul 12, 2019.

  1. Algrim

    Algrim [H]ard|Gawd

    Messages:
    1,429
    Joined:
    Jun 1, 2016
    Clever algorithms that utilize cache to mask the increased latency between dies still don't negate the fact that the latency is there. Again, this is basic physics. (Does anyone remember Cray supercomputers? Does anyone remember why they were built in cylindrical fashion? Interconnect latency, is why. This isn't theoretical shit at all. This is fucking reality.)
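    If you want to see the latency for yourself, the usual trick is a ping-pong microbenchmark: two threads bounce a value through one cache line, pinned first to two cores in the same CCX, then to cores on different chiplets. A rough sketch for Windows follows; the CPU indices (0, 2, 12) are only guesses for a 3900X with SMT siblings enumerated adjacently, so check them against your actual topology before trusting the numbers.

```cpp
#include <windows.h>
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

// Pin the calling thread to one logical CPU (first processor group only).
static void pin_to_cpu(unsigned cpu) {
    SetThreadAffinityMask(GetCurrentThread(), DWORD_PTR(1) << cpu);
}

// Round-trip "ping-pong" between two pinned threads through one cache line.
static double round_trip_ns(unsigned cpu_a, unsigned cpu_b, int iters = 200000) {
    std::atomic<int> flag{0};

    std::thread responder([&] {
        pin_to_cpu(cpu_b);
        for (int i = 0; i < iters; ++i) {
            while (flag.load(std::memory_order_acquire) != 1) {}
            flag.store(2, std::memory_order_release);
            while (flag.load(std::memory_order_acquire) != 0) {}
        }
    });

    pin_to_cpu(cpu_a);
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        flag.store(1, std::memory_order_release);
        while (flag.load(std::memory_order_acquire) != 2) {}
        flag.store(0, std::memory_order_release);
    }
    auto t1 = std::chrono::steady_clock::now();
    responder.join();

    return std::chrono::duration<double, std::nano>(t1 - t0).count() / iters;
}

int main() {
    // Assumed layout for a 3900X (SMT siblings adjacent): logical CPUs 0 and 2
    // share a CCX, while 0 and 12 sit on different chiplets. Verify first.
    printf("same CCX      : %.0f ns round trip\n", round_trip_ns(0, 2));
    printf("cross chiplet : %.0f ns round trip\n", round_trip_ns(0, 12));
    return 0;
}
```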

    For the vast majority of workloads I'd bet that there is no detectable difference. There's absolutely no difference if you can keep the workload within 8c/16t (with SMT enabled) or 8c/8t (with SMT disabled) on a single chiplet. Most 'normal' workloads fit within these parameters. Again, no difference should be detectable if the entire workload can be contained on a single chiplet.

    Talking about HT and SMT is just a distraction to this very basic problem.

    Enthusiasts are very appreciative that AMD has met or exceeded Intel on almost every front. I certainly would like to build a Team Red system, but financial priorities intervene (not to mention that with two bad eyes I don't game much at all). Acknowledging weaknesses in a design is not the antithesis of being an enthusiast; it is intrinsic to the basic nature of an enthusiast.
     
    Last edited: Jul 12, 2019
    Trimlock likes this.
  2. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    Yes, there is latency going from one die to another. There are always design choices. I'm not arguing physics. The reality is that the AMD fix is not JUST a "clever algorithm": they have up to 4x the L3 cache of the competing Intel parts. So yes, Intel has a monolithic design where the cores have access to roughly 1/4 as much L3. AMD has a chiplet design where each chiplet has two 4-core complexes, each with its own share of the chiplet's 32MB of L3 cache. So yes, Intel's interconnect is faster... AMD is simply using their interconnect a lot less. Yes, that will vary by workload... and yes, when the cores get up over roughly 70% or so utilization, those "clever algorithms" will need to understand how to best split the work between complexes to reduce cross-talk.

    So ya latency with chiplets is an issue... the fix is to introduce larger and more "clever" caching systems to reduce the need for data to travel.

    Intel is going the same way... their next arch will use chiplets and they have their own version of Infinity Fabric. It's a solution that everyone is moving to, including Intel and Nvidia.

    Think of the way DDR works. Higher clocks come with looser (higher) CAS timings in cycles, so the timings are nominally slower than they have ever been... but the higher clock compensates, so absolute latency stays about the same. With core counts, the future answer to the rising cost of massive single pieces of silicon is to split them up, accept the latency that creates, and offset it with cache. One advantage of having a lot less on each piece of silicon is being able to include a lot more cache for the same cost. Really, chiplets are the only way forward for everyone.
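    To put rough numbers on that DDR point (my own arithmetic, with example kits picked for illustration): absolute CAS latency is just the CAS cycle count times the cycle time, and it barely moves as kits get "faster".

```cpp
#include <cstdio>

// Absolute CAS latency in nanoseconds: CAS cycles * cycle time.
// DDR transfers twice per clock, so the real clock is half the MT/s rating.
static double cas_ns(int cas_cycles, int mt_per_s) {
    double clock_mhz = mt_per_s / 2.0;       // e.g. DDR4-3200 -> 1600 MHz
    return cas_cycles / clock_mhz * 1000.0;  // cycles * ns per cycle
}

int main() {
    printf("DDR4-2400 CL12: %.1f ns\n", cas_ns(12, 2400)); // 10.0 ns
    printf("DDR4-3200 CL16: %.1f ns\n", cas_ns(16, 3200)); // 10.0 ns
    printf("DDR4-3600 CL16: %.1f ns\n", cas_ns(16, 3600)); //  8.9 ns
    return 0;
}
```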

    And you're right, SMT is a bit of an aside. I only mentioned it because it has come up a few times in other places recently with all the talk of improved single-thread performance. Part of the issue there, for monolithic and chiplet designs alike, has been cache overflow and schedulers (in both hardware and software) feeding cores in a less than optimal way, creating extra latency. That is true of all designs... so the cost savings of chiplets leading to massive caches for each complex and thread is a huge overall win, I'm sure.
     
    blkt likes this.
  3. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    This is a common myth perpetuated by ultra benchmarks. If you play at 4k 60Hz, press the ultra preset, and never tweak anything - yes, it is true, and it is your full right to play that way. But it is only true for this specific scenario.
    If so, you need to rephrase it to "doesn't matter for me".



    If you don't, it has never been true. The CPU workload is just much, much higher in 4k (obviously). If you just play on ultra, untweaked, you are hard-bottlenecking your GPU, essentially throwing your CPU performance in the garbage and sacrificing fluidity for a marginal increase in stills/enjoying the scenery.

    In Destiny 2 at 4k, 144Hz, HDR, with a 2080 Ti OC and a 6700K@4.5GHz, using tweaked settings of "high" and a couple of CPU-intensive settings at medium, averaging 120+ fps, the visual quality is still stunning, not a big difference from ultra at all. For stills. For motion, the tweaked, almost-as-good 120fps settings will offer superior motion resolution (less blur) to ultra in any scenario.

    The resource usage looks like this in an action situation (escalation protocol);



    CPU and GPU used in perfect balance, both screaming for mercy and getting none.

    For any situation where you tweak your settings, you can make use of most of your CPU, though many games specifically bottleneck harder on one single core - hence the interest in Intel's single-core performance, etc.

    720p benchmarks are meant to "synthetically" push the CPU in a game and represent this situation. Usually it is a good representation of CPU power, but sometimes the engine just isn't optimized for high fps at 720p and the result is skewed.
    For CS:GO, 720p is even a real-world scenario, though at extremely high fps it starts to become a RAM benchmark in addition to a CPU one.
     
    Last edited: Jul 13, 2019 at 1:26 PM
  4. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    I cross-posted this to a few forums I frequently read, and amd_robert commented on this:

    "There's nothing to optimize against for the cluster or chiplet design. The scheduler is already optimized for the CCXes, and the chiplets introduce no additional complexity from the perspective of a game or scheduler. The topology appears monolithic d/t its implementation. "

    As this, to my mind, does not explain the issues LTT experienced and mitigated by manually assigning cores (but rather could be the cause? "The topology appears monolithic"), I asked for a follow-up and will add any reply.
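    For what it's worth, the CCX boundaries are not completely hidden from software: Windows reports which logical processors share an L3, and on Zen 2 each shared L3 corresponds to one CCX. A rough sketch of dumping that (my own, not anything from AMD; assumes a recent Windows 10 build):

```cpp
#include <windows.h>
#include <cstdio>
#include <vector>

// List L3 cache relationships; on Zen 2 each shared L3 maps to one CCX,
// so the group masks show which logical CPUs sit together.
int main() {
    DWORD len = 0;
    GetLogicalProcessorInformationEx(RelationCache, nullptr, &len); // size query
    std::vector<char> buf(len);
    auto* base = reinterpret_cast<SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*>(buf.data());
    if (!GetLogicalProcessorInformationEx(RelationCache, base, &len)) return 1;

    for (char* p = buf.data(); p < buf.data() + len;) {
        auto* info = reinterpret_cast<SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX*>(p);
        if (info->Relationship == RelationCache && info->Cache.Level == 3) {
            printf("L3 %lu KiB shared by group %u, mask 0x%llx\n",
                   static_cast<unsigned long>(info->Cache.CacheSize / 1024),
                   static_cast<unsigned>(info->Cache.GroupMask.Group),
                   static_cast<unsigned long long>(info->Cache.GroupMask.Mask));
        }
        p += info->Size;  // records are variable-length; walk by Size
    }
    return 0;
}
```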
     
  5. RamonGTP

    RamonGTP [H]ardness Supreme

    Messages:
    7,574
    Joined:
    Nov 9, 2005
    The topology “appearing” monolithic is the issue since it is in fact NOT monolithic.
     
    Algrim likes this.
  6. n=1

    n=1 2[H]4U

    Messages:
    2,387
    Joined:
    Sep 2, 2014
    Just buy a damn 9900K, or wait for the 9900KS (assuming it's not vaporware), or wait for Comet Lake if you have the patience, and be done with it. You're looking for reasons to NOT buy AMD, and I strongly suspect even if the community convinced you, you'd still find something to nitpick over/be unhappy about and return it anyway. So just do yourself (and us) a huge favor and stick to Intel plz kthxbai.
     
  7. Keljian

    Keljian Gawd

    Messages:
    578
    Joined:
    Nov 7, 2006
    Oh man.

    There is nothing wrong with an Intel 9900. There is nothing wrong with wanting a 3xxx-series Ryzen. Get whatever floats your boat.
     
    nEo717 likes this.
  8. EmualDave2k12

    EmualDave2k12 n00b

    Messages:
    15
    Joined:
    Jan 19, 2016
    Reading all those reviews of the new Ryzen CPUs, it seems AMD overclocked them to the extreme and called it a day with the die shrink... that's why you can't overclock them any further.
    You can tell simply by looking at the voltage the CPU needs to reach the advertised speed.

    It's nicely and well played by AMD this time around.

    At least AMD did it: they delivered 7nm CPUs with no compromise on speed.


    On the other hand, Intel is struggling, and it seems Intel will not be able to deliver similar performance (speed) with any die shrink...

    For me, I will jump to Ryzen 3000 when they release a passively cooled X570 board.

    I would expect Ryzen 3000 to be more mature by then and to consume less power = less heat.

    Intel needs a miracle to come back (maybe 2 to 3 generations) (this is my expectation).
     
  9. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    If you want passive cooling, just buy one of the X470 boards with a confirmed PCIe 4.0 BIOS update. Many of the B450 boards have also received PCIe 4.0 bumps.
     
  10. Trimlock

    Trimlock [H]ardForum Junkie

    Messages:
    15,103
    Joined:
    Sep 23, 2005
    While it could be the issue, I don't think allowing the OS to see the individual CCXs will solve the problem either. The latency will be there in some form regardless of whether software can directly target a CCX or the hardware has to do the selection. At least this way, development is simplified and no additional drivers need to be implemented to use a chiplet design.
     
  11. Hakaba

    Hakaba Gawd

    Messages:
    616
    Joined:
    Jul 22, 2013
    You are going to end up with buyer's remorse if you let someone talk you into an upgrade. Either grab a B450/X470 board + 3600 and ride it out for a few years, or wait till late 2020/2021. That is when AM4 is supposedly departing and the new socket should be here.

    Also note that Intel is already talking about their 10-core parts coming to market in the near future.
     
  12. ManofGod

    ManofGod [H]ardForum Junkie

    Messages:
    10,353
    Joined:
    Oct 4, 2007
    If you do not want to spend the money, then do not spend it. (I mean that in the sense of: if you do not feel you need an upgrade, then don't.) Whether you go the Intel or AMD route, I am not sure you would be happy, considering that, at least for now, the 6700K when overclocked is still a pretty good chip. What monitor and video card do you have? Are you like me, and enjoy the build process itself more?
     
  13. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    IF this is the issue, then it actually could solve any (potential) issues in many cases. As long as, say, a game only needs 4-6 physical threads in sync (very few need more), then assign them to the same chiplet, done. The rest of the cores can do asynchronous workloads.

    If 7 or more need to be in sync, then yes, it can't be helped.
     
  14. Trimlock

    Trimlock [H]ardForum Junkie

    Messages:
    15,103
    Joined:
    Sep 23, 2005
    That introduces a lot of complications and additional coding required for specific hardware. It doesn't fully fix the issue and you'll still incur latencies within the software. It'll be better, but only in the cases where the developer chooses to code this way. The Windows scheduler needed some tweaking when Zen first hit the market, but it has needed very little work since then, and I think this is most of the reason why.

    I think AMD bet on the developer-friendly side and it's paying off for them. I wouldn't hold out hope for a system that offers developers more options for coding specifically to AMD's chiplet design.

    edit: actually, if it only needed one chiplet, or maxed out on one chiplet, you probably wouldn't need any additional coding. But this is still really limiting for AMD, and they wanted easy integration across the board. Software matters, maybe sometime in the future?
     
  15. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    My purchasing considerations are more of a side thought (I mean, I only wrote a few lines about that), but I just wanted to see what opinions and knowledge people had about the possible issues - especially the chiplet latency issue. The rest were just oddities - I don't care whatsoever about power efficiency personally, and I do not care about temperatures as long as the performance is there (though they do seem to run very hot, which probably affects the boost/OC).

    With that said, I have a 2080 Ti and a PG27UQ and a more or less unlimited budget (within reason), but I do not throw money away.
    So far the Intel 7000 series has been a waste vs. the 6700K, the 8000 series as well; the 9000 series is interesting as it's double the core count AND a minor per-core upgrade. Ryzen 1000-2000 simply had too low performance.
    Ryzen 3000, apart from the 3900X: nope, nope, nope. The 3900X, on the other hand, seems at least as interesting as the 9900K, but the potential chiplet issues must be cleared up.

    Again, not really looking for purchasing advice, just technical discussion/info. I am leaning towards waiting, but not sure yet.
     
    Last edited: Jul 14, 2019 at 10:32 AM
    ManofGod likes this.
  16. OnceSetThisCannotChange

    OnceSetThisCannotChange Limp Gawd

    Messages:
    135
    Joined:
    Sep 15, 2017
    Or wait for the 3950X in autumn: four more cores and 100 MHz higher boost clocks.
     
    Master_shake_ and ManofGod like this.
  17. RamonGTP

    RamonGTP [H]ardness Supreme

    Messages:
    7,574
    Joined:
    Nov 9, 2005
    Yes the latency would be there IF you’re traversing Infinity Fabric but if the OS is aware of the CCXs it may be able to avoid doing that. If it sees the CPU as monolithic and say you’re running a game that only uses 4 cores, there’s nothing to keep it from processing 2 threads on one CCX and 2 on the other, causing a performance hit. If it does know about the CCX design, it can schedule all 4 threads on the same chiplet.
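    Until the scheduler does that on its own, you can approximate it by hand, which is essentially what the Task Manager affinity trick (and the LTT manual assignment) amounts to. A rough sketch of the same thing in code - my own, with the caveat that the mask assumes logical CPUs 0-11 map to the first chiplet of a 3900X with SMT siblings enumerated adjacently:

```cpp
#include <windows.h>
#include <cstdio>
#include <cstdlib>

// Confine a running process (e.g. a game) to the first chiplet of a 3900X:
// 6 cores / 12 logical CPUs. Same effect as setting affinity in Task Manager.
// Usage (hypothetical tool name): pin_to_chiplet.exe <pid>
int main(int argc, char** argv) {
    if (argc < 2) { printf("usage: %s <pid>\n", argv[0]); return 1; }
    DWORD pid = static_cast<DWORD>(std::strtoul(argv[1], nullptr, 10));

    HANDLE proc = OpenProcess(PROCESS_SET_INFORMATION | PROCESS_QUERY_INFORMATION,
                              FALSE, pid);
    if (!proc) { printf("OpenProcess failed: %lu\n", GetLastError()); return 1; }

    const DWORD_PTR chiplet0 = 0x0FFF;  // logical CPUs 0..11 (assumed chiplet 0)
    if (!SetProcessAffinityMask(proc, chiplet0))
        printf("SetProcessAffinityMask failed: %lu\n", GetLastError());

    CloseHandle(proc);
    return 0;
}
```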
     
  18. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    Exactly. And the LTT video above seems to suggest that it is an issue, while AMD claims it does not need fixing and is not an issue. I want to find out which is true.
     
  19. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    As many are probably too lazy to check the video: the difference when assigning cores manually to the same cluster or not was not some minor nitpicking. It was 51 vs 92 fps for the 99th percentile minimum fps, and 151 vs 161 average.
     
  20. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    Ok first... what are you actually DOING that requires more than 8 cores? It doesn't sound like anything. Is your current setup letting you down? Doesn't sound like it.

    Worrying about latency between chiplets is silly. Yes, there is latency any time data is transmitted anywhere, in any way - through traces on silicon, through an interconnect like Infinity Fabric, or Skylake-X's mesh... latency is a reality of physics.

    The history of multi-core chips basically starts with companies gluing 2 cores together that were basically independent. Then they found ways to interconnect them in a more meaningful way and implemented L3 cache systems so the cores could work on the same workloads. Then they increased core counts again and found that one of the biggest issues with early designs was differing latencies... core 1 would have low latency to core 2 right beside it but worse latency to core 3 over longer traces. So they implemented ring bus and core-complex designs. This worked well until we got past 4 cores, when it made sense to have multiple complexes. Intel's 8-core chips and a fully loaded 8-core AMD chiplet are both still split into 2 complexes. The latency differences are frankly minor. The catch is that Infinity Fabric (and the Intel version used in Xeon Golds etc.) requires more power. A high-powered (relatively) interconnect like Infinity Fabric or Intel's mesh can actually be faster than low-power silicon-etched connections over short traces. Part of the reason Intel is able to clock higher has to do with Infinity Fabric drawing more power and creating more heat on AMD's side. Intel is going to run into the same issues with clock rate when they move to chiplet designs (which they are).

    You are focusing way, way too much on the interconnect technology at play here. Core clustering is a reality on both AMD and Intel parts. L1 and L2 are not shared across complexes in either company's designs. L3 is used in much the same way by both Intel and AMD. The difference is that Intel's current monolithic chips have more packed into one large piece of silicon, so they have much less room for actual cache memory. They also have to deal with far lower yields. For all the talk of Infinity Fabric vs. on-chip traces, AMD's design will lead to real-world gains even when using cores on 2 separate chiplets, because the core complexes have access to much more cache.

    The results are easy to find... in real-world performance, AMD has shown that their current version of Infinity Fabric in Zen 2 is fast enough to connect a chiplet to an on-package scheduler and memory controller chip with no real-world loss (comparing against their own 2000-series chips). In fact, the opposite is true: they are showing pretty massive gains, which at the end of the day, despite all the talk of this or that new cool thing, probably has mostly to do with the massive amount of cache the chips now have.

    There are ZERO potential issues to clear up. The chiplet design is sound, and anyone who can read a simple bar graph can see that on multiple review sites. The reality is... AMD will not be releasing non-chiplet parts ever again. Intel is also very likely to introduce their own chiplet design sooner rather than later. Intel's Xe GPU is basically confirmed to be a chiplet design. Intel has been talking up their Embedded Multi-die Interconnect Bridge (EMIB) for a few months now, in relation to Agilex and Xe server parts. It will be their version of Infinity Fabric for their next-generation parts (assuming they don't try to squeeze one more revision series out). Anyone who has an Intel-G part with Radeon graphics is already using a part with EMIB; Intel uses it to connect the GPU and CPU. In the non-consumer space they also use it in Agilex.

    So your options really are: stand pat... grab the last and perhaps best example of the single-silicon chips in the 9900K... step up to more cores with an AMD 3900 or 3950 chip... or wait for Sunny Cove (which, although not confirmed, based on what Intel has said will either be a chiplet design like AMD's, a 3D-stacked chiplet design, or a complete bust, in which case 2020 is going to suck for Intel). At a recent Intel architecture day, Intel's chief engineer Dr. Murthy Renduchintala said, "We have humble pie to eat right now, and we're eating it." He went on to basically say they have thrown out a good bit of their 10nm work and have redesigned for Sunny Cove. Part of what they did confirm is a massive bump in L2 and L3 cache vs. their older chips (that, and talk of non-x86 bits included in the design, heavily points to a chiplet design). It sounds like they may actually get their 3D chiplet-stacking tech into the consumer space. Now that is going to be a real crap shoot as far as how well it works. The end of 2019 / early 2020 might be interesting if Intel doesn't pull a knee-jerk junk CPU out of the hat to try to fight Ryzen 2. If they let their engineers get things right, as Dr. Renduchintala says, Ryzen 2 vs. the first actually new Intel design in a long time might be interesting. It will probably be flat chiplet vs. 3D chiplet, and Infinity Fabric vs. EMIB... and if they can't get it out and shipping by spring, they will be dealing with Zen 2+, not Zen 2.
     
    Last edited: Jul 14, 2019 at 10:32 PM
    TurboGLH, Keljian and blkt like this.
  21. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    You can set it up to fail; that's all that proves. The real test is: does letting it go full auto and assign workloads itself over all cores, vs. forcing it to only use one complex, look that drastic? The answer is no. Yes, you can deliberately pick 4 cores spread one to each cluster and make it run like ass. If a 12-core Intel server part was forced to use 4 cores, one from each of its core clusters, it too would run like ass.

    They haven't exposed some weakness in AMD's interconnect... but a weakness of multi-core chips >.< CPU onboard schedulers are NOT stupid, and would not take a bunch of work, split it into 4 bits, and feed it to the 4 worst choices on the package. AMD gives you some software to force that behaviour... cool, I guess.
     
    blkt likes this.
  22. tangoseal

    tangoseal [H]ardness Supreme

    Messages:
    7,305
    Joined:
    Dec 18, 2010
    Wow, that was a HUGE wall of emotions. Just get whatever the heck makes you happy. Don't like it? Sell it.

    Latency problems are fully resolved on Zen 2. There's no excuse to invoke that conversation any longer.
     
  23. Keljian

    Keljian Gawd

    Messages:
    578
    Joined:
    Nov 7, 2006
    While memory latency is a bigger factor on Zen 2, realistically it doesn't matter... get RAM that is at least 3200 CL16 and you'll be within about 1-3% of faster RAM, and it'll do the job reasonably well.
     
  24. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    My wording wasn't 100% precise, so I understand that it could be interpreted this way - but it was the auto-assigned cores that, according to LTT, gave the poor results, and manually assigned affinity that provided the good result.

    "The real test is does letting it go full auto and assign work loads itself over all cores... VS forcing it to only use one complex look that drastic?" The answer is yes in the LTT test case. (And you STILL evidently didn't watch the ~30 sec in the video on this but seem to comment a whole lot on it ;) ).

    What AMD themselves answered about this really could be the cause of the problems: "The topology appears monolithic d/t its implementation" - as in, they slapped on a huge L3 cache, present the architecture as monolithic, and hope it is good enough (the latencies are really good internally within a cluster - better than Intel's from what I've seen; across chiplets they are much, much higher, around 3x the in-cluster latency and nearly double Intel's).
    It could also be that LTT did something wrong, had something misconfigured, or that the motherboard has issues (there seem to be many issues going around with various motherboards) or is physically broken, but overall they gave Ryzen 3000 a glowing review with excellent results.

    I also asked LTT to make sure their test setup was correct, and:
    "All of our testing was done with the most up to date patches and BIOS revision available at the time, including the CCX-aware optimizations in 1903."
    They also say: "I mean, maybe it is our setup that's the problem, but to date nobody from AMD has approached me to talk about our results."

    You can read LTT's full answer here: https://linustechtips.com/main/topic/1079529-i-had-given-up-on-amd…-until-today-ryzen-9-3900x-ryzen-7-3700x-review/?page=3

    I do not have much more to add about it until AMD answers again, or LTT possibly does a follow-up on it.
    But I checked a couple of other sites that publish 99th percentile minimum fps, and they did not experience the same, so who knows (either it's just an LTT setup issue, or others ignored outlier results and published the median).
     
    Last edited: Jul 15, 2019 at 3:42 AM
  25. Trimlock

    Trimlock [H]ardForum Junkie

    Messages:
    15,103
    Joined:
    Sep 23, 2005
    The only way for this to happen is if the scheduler were super smart (and thus more overhead) or the developer made it work that way with a profile.

    Think of how games work with profiles in GPU drivers: without them, the games are unable to use certain features. This would work the same way.

    I'm personally OK with how AMD does it. I'd rather it be intuitive to the developer and create ease of development.
     
  26. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    In the future, when everyone has probably left the monolithic design, there could also be a simple, completely optional negotiation protocol with the scheduler during process startup, something like:

    1. I'm a game, and if game mode is on, I want priority.
    2. OK.
    3. I want xx cores / yy SMT dedicated.
    4. OK.
    5. The process presents simple metadata for how it wants its main threads grouped and kept together: (prio 0-9 reserved for Windows); prio 10: threads 1, 2, 3 on the same cluster; prio 20: threads 4, 5, 6 on the same cluster but not necessarily in sync with 1, 2, 3; prio 30: thread 10 alone on a physical core.
    6. OK.

    That should completely mitigate any possible issues.
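    Nothing like this exists in Windows today, so purely as an illustration of the idea (all the names and fields below are made up), the metadata from step 5 might look something like this, with the scheduler remaining free to honor or ignore it:

```cpp
#include <cstdint>
#include <vector>

enum class Placement : uint8_t {
    SameCluster,      // keep these threads on one CCX/chiplet
    PhysicalCoreOnly  // give this thread a physical core to itself
};

struct ThreadGroupHint {
    uint8_t priority;               // 10 and up; 0-9 reserved for the OS
    std::vector<uint32_t> threads;  // application-defined thread IDs
    Placement placement;
};

struct SchedulingHints {
    bool wants_game_priority;             // step 1: "I'm a game, give me priority"
    uint8_t cores_requested;              // step 3: dedicated physical cores
    bool smt_requested;                   //         and whether SMT siblings are wanted
    std::vector<ThreadGroupHint> groups;  // step 5: grouping metadata
};

// The example from the post: threads 1,2,3 together at prio 10; 4,5,6 together
// (but independent of the first group) at prio 20; thread 10 alone on a
// physical core at prio 30.
SchedulingHints example_hints() {
    return SchedulingHints{
        true,  // wants_game_priority
        8,     // cores_requested
        true,  // smt_requested
        {
            {10, {1, 2, 3}, Placement::SameCluster},
            {20, {4, 5, 6}, Placement::SameCluster},
            {30, {10}, Placement::PhysicalCoreOnly},
        },
    };
}

int main() {
    SchedulingHints hints = example_hints();
    return hints.groups.size() == 3 ? 0 : 1;  // nothing to run; just shows the shape
}
```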
     
    Trimlock likes this.
  27. Trimlock

    Trimlock [H]ardForum Junkie

    Messages:
    15,103
    Joined:
    Sep 23, 2005
    I've wanted something simple like this for some time!
     
  28. RamonGTP

    RamonGTP [H]ardness Supreme

    Messages:
    7,574
    Joined:
    Nov 9, 2005
    I think you're overstating how much smarter the scheduler would need to be and how much overhead it would add. Back in the P4 days it initially didn't know how to deal with Hyper-Threading, and not long after it knew the difference between physical and virtual cores. It didn't seem very hard to do and didn't seem to add any appreciable overhead, so I'm not entirely sure what you're basing your opinion on.
     
  29. Trimlock

    Trimlock [H]ardForum Junkie

    Messages:
    15,103
    Joined:
    Sep 23, 2005
    HT isn't SMT, but in the end what you are talking about is threads living in the same core and using the same resources - not switching chiplets, with different resources and dynamically changing resource allocation.

    In the early HT days, it wasn't bad at all. Only certain programs had issues using the secondary thread; the common fix was to block off HT for those programs.
     
    Algrim likes this.
  30. RamonGTP

    RamonGTP [H]ardness Supreme

    Messages:
    7,574
    Joined:
    Nov 9, 2005
    I know it's not the same thing, but the idea of making the scheduler aware doesn't seem as complex as you're making it out to be. Like, at all.
     
  31. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    Well, the bottom line on turning cores off and testing is this: they went from 142 FPS up to 148 by forcing the 3900X to use only one chiplet while streaming. That is a 4% difference. Now, I get that streaming is probably the only background thing someone will be running while gaming, but... I would be willing to bet the difference would drop even more if there was yet more going on. Even without OBS running, the difference according to Linus (and his use of the 'word' closlier makes it hard for me to watch his video lol) is 6%. I'm sorry, but that isn't really anything to write home about and is very much in line with how much performance you gain turning off SMT/HT on AMD or Intel chips IF the game you're running is poorly threaded or just flat out isn't taxing the cores for real.

    For Linus's test... I find it interesting that in order to show a 4-6% performance loss with all cores vs. forced assignment, they had to use 1080p. My guess would be that at 1440p that number would shrink to a couple percent at most, and at 4k we would probably start seeing a reversal where disabling half the CPU would cost you some frames.

    Games are a hard thing to schedule for... prediction doesn't work as well, etc. IMO I would rather have my CPU lighting up all the cores so that if something heavy happens in the game, like a change of location or the introduction of a taxing AI, I don't get stutter when all of a sudden the 6 or 8 cores I'm forcing get nailed for 100% usage. Will I potentially give up 4-6% when the CPU isn't crying... perhaps. Still seems preferable to me. Reminds me a lot of the previous AMD generation, where in benchmarks the 2700X looked a lot worse for gaming, yet most people would report that the gameplay experience was superior. Benchmarks aren't everything; actual gameplay will vary. ;)

    Oh, and Linus and his crew didn't just give AMD's Ryzen 2 a good review... they have basically said there is zero reason to buy an Intel part right now.
     
    Last edited: Jul 15, 2019 at 3:20 PM
    blkt likes this.
  32. Algrim

    Algrim [H]ard|Gawd

    Messages:
    1,429
    Joined:
    Jun 1, 2016
    When the resolution goes up, you're more GPU-bound than CPU-bound.

    The reason why schedulers back then were so confused is that the OS was programmed to assume it was assigning jobs to a second core, not a pseudo-core whose capabilities changed depending on what job(s) was/were being performed. Note that operating systems still have a problem correctly handling some HT/SMT operations; otherwise, disabling HT/SMT would never, ever increase performance, yet here we are.
     
    blkt and ChadD like this.
  33. RamonGTP

    RamonGTP [H]ardness Supreme

    Messages:
    7,574
    Joined:
    Nov 9, 2005
    Right I get it. So advance it further to recognize a chiplet design.
     
    n=1 likes this.
  34. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    53,810
    Joined:
    Feb 9, 2002
    I'll take a stab at this.

    No, it won't. The AGESA code may provide some slight performance benefits here and there, but I wouldn't bet on it changing how and what we think of Ryzen 3000 series CPUs today. Windows 10 build 1903 already makes scheduler adjustments that make Windows aware of the topology of Ryzen CPUs - that is, their CCD/CCX layout - and, to reduce core-to-core and CCX-to-CCX latencies, it keeps threads confined within a single CCX whenever possible. Even this hasn't had a profound effect on Ryzen, and I don't expect that to change. Even the CPPC2 change doesn't have a dramatic impact, by AMD's own admission. This was in our reviewer's guides.

    Well, in simple terms that's pretty much what happened. On an IPC level, the two are nearly equal when factoring in the most recent builds and most of the mitigations on the Intel side. Intel still commands a single-threaded performance advantage, but this is mostly due to clock speed. When the clocks are equal, they are pretty even and trade blows, with the edge going to Ryzen in most of the tests I ran.


    This is, in fact, simply untrue. Everyone who reviewed Ryzen 3000 series CPUs actually tested overclocked performance whether they intended to or not, because Ryzen automatically overclocks itself. Games tend to benefit more from single-threaded performance, and as a result you get more performance out of PB2 and PBO than you would from manual overclocking. However, I provided results for PB2 and a manual all-core overclock. Many other sites did as well.

    It's not a fact, because you are incorrect.

    This is a fair point. Two things: 1) This is conjecture. The real reason for the inconsistencies isn't actually known outside of AMD. I saw pretty consistent results in benchmarks, and when you're at settings and resolutions that make you more GPU-limited, this doesn't really matter. 2) I wouldn't expect miracles here. I don't think this will close the gap between Intel and AMD. AMD has had, and still has, game mode on these CPUs, which eliminates the latency issue as much as possible with the current design. This does have benefits on some of the processors with higher CCX counts, but it really doesn't matter a whole lot unless you're at 1080p and lower settings.

    On this we agree, and I said as much above. The whole CCD/CCX layout will always create latency issues on some level. All we can do is minimize the impact. Threadripper is the worst-case scenario for AMD and gaming, and I've never seen any difference at 4K from putting it in game mode. The only CPU where it might even be worth doing is the Threadripper 2990WX, which takes the CCX latency to its absolute worst case.

    OK.

    A couple of things on this. Most of the time when you do these types of articles, you don't have time to explore every scenario you can think of; we had these CPUs about a week ahead of the embargo date. Secondly, it really makes sense to show people a "worst case scenario" when it comes to all things hardware, and Blender is a very good example of just that. If you get into "real world" examples of specific applications, people will hound you endlessly about how it performs with X application, and that isn't a good metric either, as it isn't comparable to other sites' data in any way. I used Cinebench R20 for power testing as it's what AMD actually used, the idea being that I would try to see how valid AMD's provided data was. I actually shared AMD's metrics via its slide deck in the article I wrote. At the wall, my Ryzen test system consumed more power than the Intel one, but it also provided more performance - in some tests, like Cinebench, a great deal more.

    I'll agree that the difference in power consumption that you see at the wall is not massively different between the two setups. AMD reports it as such, but they look at it as "performance per watt." One thing you have to understand is that it isn't that Ryzen isn't efficient at 7nm; it's that AMD took the gains from the process node shrink and reinvested them in performance. The die shrink gave them a greater transistor budget. It gave them the ability to provide more CPU within the same TDPs as the older CPUs.

    As someone who read AMD's reviewer's guide and had access to it the entire time, I can tell you this is patently false. I can't speak for other reviews, but I did include temperature information. I also covered this in our update to the article. Essentially, mine stayed at around 62-65C during most testing and went up to 78C during an all-core overclock. Actual operating temperatures on the CPU really aren't that bad.

    The main reason why Ryzen doesn't overclock better comes down to the fact that the architecture simply requires far more power to clock high, and thus generates more heat than can be dissipated via heat sinks and fans, AIOs, or even custom loops. We know this because we've seen it break 5.0GHz on liquid nitrogen. What it takes to get there is voltage, and with voltage comes heat. So in that sense, you aren't wrong. I will also concede that I didn't include an Intel reference for temperature, just power; that was an oversight on my part. However, my Intel sample was basically at 85-90C all the time. The only time it wasn't was at stock settings. I don't recall the actual temperatures at stock values.

    I think you're splitting hairs on the Ryzen 9 3900X. The fact is, the thing is more than capable of gaming, and the higher the resolution, the less its deficit vs. Intel shows. In virtually every other metric, the Ryzen 9 3900X is better than Intel's Core i9 9900K at everything. This is why reviewers are so taken with it. It's also why they are selling like mad.

    Source Me:
    https://www.thefpsreview.com/2019/07/07/amd-ryzen-9-3900x-cpu-review/
    https://www.thefpsreview.com/2019/07/15/amd-ryzen-9-3900x-cpu-review-new-bios-performance-tested/
     
    Mav451, blkt, RamonGTP and 2 others like this.
  35. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    53,810
    Joined:
    Feb 9, 2002
    I'd be interested to know what your tweaks sacrifice. I play Destiny 2 a lot and I don't get 120FPS. I'm an image quality or bust kind of guy so I'm curious how your settings look.
     
    N4CR likes this.
  36. ChadD

    ChadD 2[H]4U

    Messages:
    3,942
    Joined:
    Feb 8, 2016
    Good post, Dan, and a great review.

    I think it was easy for reviewers to get excited about Ryzen 2 and keep the hype going for AMD. After years of not much to write about, followed by a few years of "this one wins here and there, that one wins there and there"... and if you do X or Y you want blue, if you do Z you want red. AMD released a product that is actually new, and not only is it new, for the most part you can now say that whether you do X, Y, or Z, Red is not only the best bang for your buck, it's the best bang.

    Low single-digit % wins in gaming at low resolution and/or IQ settings are not wins. Not when your product costs more, offers less, and loses at everything else.
     
    blkt likes this.
  37. Morkai

    Morkai Limp Gawd

    Messages:
    348
    Joined:
    May 10, 2011
    I read your review, and I think it is well written. (As feedback, I think you should change the 1-12 navigation buttons to the page titles and/or add a drop-down nav.) Also, sorry for the lazy formatting here.
    I mean, sure, it is probably technically true, but when the "industry standard" is that Intel's consumer CPUs can stay at max boost clocks forever with a half-decent cooler (you might need to lift power limits, but I don't think even that is generally needed anymore), and Ryzen 3000 can not...
    ...it's far-fetched to call not even reaching advertised boost clocks "overclocking". Not all-core, not even single-core. To quote yourself: "Indeed, our review sample did not achieve the advertised 4.6GHz boost clocks during our review, although they were generally close." "PBO+Offset has no guarantee of granting you the extra clock speed and in my case it never did. However, I saw PBO boost into the 4.5GHz range fairly often in single threaded tasks."

    You are technically correct on some level, I guess, but it is arguing semantics. I dare say there is a case to be made that you could even RMA that sample if you had theoretically paid money for it? AMD's tech spec page says maximum boost 4.6GHz; it does not give conditions like "exotic cooling required" or say "up to", so I think it is fair to assume it should at least briefly reach that on a single core with the stock cooler?

    So again, what you say is probably technically true, but to be practical and less theoretical you could as well have said: "Everyone who reviewed Ryzen 3000 series CPUs actually tested underclocked performance whether they intended to or not, as the stated boost clocks were not even reached."



    "It's not a fact, because you are incorrect." - I'm unsure what this refers to, as it was quoted out of context. The bit above about LTT having Ryzen 3000 5% lower plus potential Intel overclocking that it actually referred to? (That was one of the best-case scenarios; your own figures show a much larger gap in game performance - most things I quoted were actually best case, to account for early issues.) Or the following statement that "Many people only do low-intensity tasks or work on their computers, and the only demanding task they do is play games"? (Which is clearly an indisputable statement? I could have said "most", which is probably even true, but I didn't.)



    I did point out above that it only affected LTT and not others who published 99th percentile min fps, so I hope it is a non-issue. But if, in addition to the slightly worse performance, random games here and there will have 50% lower 99th percentile min fps... that's bad.
    If AMD had given some standard answer like "we are looking into it" or "please retest with xx yy", that would sound fine. But when the answer was essentially "this is not an issue", it looks off.
    CPU performance at 4k, as shown above, is critical when aiming for higher fps, but irrelevant at capped low fps. My CPU is constantly a bottleneck for single-core performance, and this has been the case over a year of tweaking game settings for high fps with the PG27UQ. Perhaps people like me who prioritize higher fps (motion resolution over stills/enjoying the scenery) are a minority, but we can't be dismissed across the board.



    I know that many reviewers have a standard choice for efficiency testing. I do not believe anyone ran a test suite, cherry-picked Blender as pro-AMD, and then ran efficiency testing on it.
    In retrospect, it does look like it favors AMD, though. But we both agree: "I'll agree that the difference in power consumption that you see at the wall is not massively different between the two setups." I do not personally care about this; I just pointed out that this point seems overhyped. Also, as mentioned, it seems that the CPU itself might indeed be efficient and the X570 power-hungry enough to eat that advantage - Ryzen 3000 on older motherboards might be the efficiency choice.

    Just saying that this is exactly what I mean - they run hot, are hard to cool, and that is probably the limiting factor:
    "The main reason why Ryzen doesn't overclock better comes down to the fact that the architecture simply requires far more power to clock high and thus generates more heat than can be dissipated via heat sinks and fans"

    From what I've read, the more powerful VRMs on X570 boards are complete overkill because of this, and performance on older boards is identical. When one core can't boost higher due to temps, it won't show on the whole-package temp. I mean, I haven't tested it, but it looks likely.

    I think the 3900X seems like a great product, and it is amazing that AMD caught up. (Again, higher resolutions require more CPU power, not less (obviously), when reaching for higher fps at 4k.) So if the LTT best case stays true - 5% less than the 9900K - that's great, and it is highly likely I'll buy one, but the chiplet latency issue can't just be swept under the carpet.
    I am gaming-biased, so maybe I put too much weight on that part, but when you remove the hyped gaming performance ("it doesn't matter!!"), the hyped power efficiency that seems small or non-existent, and the price change Intel has claimed is incoming (the 9900K is already a fair bit cheaper than the 3900X here), what's left? Mainly compute stuff that I think accounts for a very, very low percentage of uptime and market share on consumer parts.

    I think it's a great product line, but overhyped.
     
  38. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    274
    Joined:
    Oct 23, 2010
    Morkai, I think BIOS updates will correct the boost clocks, as the release-day BIOS releases have already helped. So let's wait and see before we assume it's the CPU.

    Heck, even with my C7H and 2700X, clock speeds have changed with each BIOS, sometimes for the better and sometimes for the worse. However, RAM speed also affects boost clocks on the 2700X. If I leave my RAM at 2133 (default) I get single-core boosts of 4.4 to 4.5 (up to 3 cores), with spikes hitting 4.6. With the memory at 2933 (4x8GB for 32GB) I get 4.266 to 4.3 (up to 3 cores) with spikes of 4.35. (Water cooling, no manual overclocking.)
     
    Last edited: Jul 15, 2019 at 8:49 PM
  39. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    53,810
    Joined:
    Feb 9, 2002
    Boost clocks are governed by an algorithm. Sometimes under that algorithm it can achieve the clocks and sometimes it can't. Furthermore, I wrote an update on this issue. I and other reviewers did that because we received BIOS updates for our review boards with different AGESA code, which did improve boost clocks. Spoiler alert: it didn't change much.

    I don't know that this is what everyone experienced. Again, I got close to the advertised boost clocks the first time and hit them the second time around. Others felt the need to update their articles and show the additional data set, so that information is out there.

    I was referring to the statement you made where you said that no one tested overclocked performance, which is untrue. All the reviews I've seen, including the one I wrote, had overclocked values. I had a 4.3GHz all-core overclock in the data set. Sure, you could try manually clocking a single core up to the boost clocks, but really there isn't a need for this. So the figures where we saw 4.4GHz, and later 4.6GHz, boost clocks in the update pretty well have this covered on both ends of the spectrum. The data is there. When you look at the Intel data it's presented the same way: stock speeds, which include boost clocks, and all-core overclocks. That's exactly what I (and others) did.

    I don't know what his issues were. I can't speak to that. All I can say is that the reviewers all had one of several different motherboard options with a couple of different possible BIOS versions, and I think two different AGESA code versions.

    No, and I certainly didn't choose Blender for that. It is part of the standard testing I have chosen to use in our reviews; actually, it's a carry-over from HardOCP's reviews. I used what we did there as a base and added to it while changing a few things. But picking it because it seems pro-AMD, as something to highlight efficiency, was never part of my thought process, and I doubt it was something anyone else did specifically. I actually chose Cinebench for power testing because that's what AMD did, and I wanted to see if their data was bullshit or not. More than that, I think you are right that X570 pulls a ton of power and eats up what little savings you might have on Ryzen 3000 series CPUs.

    Yes, the VRMs are overkill. Some of them can deliver far more power than the Ryzen CPUs could ever possibly use. The design isn't specifically about overkill, though. It's about creating a VRM that's robust, capable of dealing with whatever demands the customer might have, while maintaining a certain amount of efficiency and operating in certain temperature ranges. As for the temps, you are partially correct. Ryzen Master and most applications may not show specific core temps and instead only show the entire package. Core Temp does show the individual core and package temps on Intel CPUs. I saw a case where my Core i9 9900K wasn't behaving right because some of its cores were hitting temperatures that were way too high and throttling; I contacted the motherboard manufacturer and they sent me a replacement board. Last time I looked, Core Temp only ran on Intel CPUs, and I do not know if anything can show the same data for AMD CPUs.

    I think you are overthinking things and getting caught up way too much in the numbers. I've been pushing 4K for years, and more than that before I had a 4K display. I've used a wide range of processors, and while I agree that more is better, you're still primarily GPU-bound. Tests have been done in the past showing virtually no difference between many CPUs using a high-end graphics card at higher resolutions. This is basically well known and generally accepted. Case in point: a friend of mine has the same display as I do and plays many of the same games I do. He gets the same performance in Destiny 2 as I do, if not better, using a pair of 1080 Tis and a Core i7 6700K. Another friend of mine has a very similar display, plays Destiny 2, and gets slightly worse performance than I do using a Ryzen 7 2700X and a reference RTX 2080 Ti. His CPU is better than my Threadripper at gaming, but it doesn't really seem to make any difference. Admittedly, I haven't done a bunch of exhaustive testing on these three configurations, but that's kind of my point: this is what you'd see when you sit down and play on them. You don't exactly get a feel for 99th percentile minimum FPS or anything like that. Was the gameplay smooth, yes or no? While I think many of the things we do can improve performance, both in what we feel and what we test, I think some of it is straight-up academic. You don't really notice it in the real world.

    I've participated in blind studies between AMD and NVIDIA hardware, and despite the latter being faster, at lower resolutions where the extra power of the NVIDIA card can't be realized, it's a moot point. Both are often so smooth you can't tell which is which. If I sit here and look at bar graphs and numbers all day, I'll go with the numbers. Actually playing games on these things? It's much harder to tell the difference. I've got a slow-ass Threadripper, a 9900K, and a 3900X right here, and I can't really tell the difference between them as long as I'm running the same video card in all three. And that's the crux of the argument that the gaming difference between AMD and Intel doesn't really matter. Unless you're benchmarking and highlighting the numbers on a graph, the difference is harder to spot than you might think.
     
    otg, N4CR and IdiotInCharge like this.
  40. Dan_D

    Dan_D [H]ard as it Gets

    Messages:
    53,810
    Joined:
    Feb 9, 2002
    The boost clocks are something that has been addressed in AGESA Combo Pi 1.0.0.3 patch AB. I don't know about all the other manufacturers, but MSI has not released a UEFI BIOS update to the public with this newer AGESA code in it at this time. It's coming though. I'd expect to see it very soon.
     
    Mav451 likes this.