AMD Ryzen 9 3000 is a 16-core Socket AM4 Beast

Discussion in 'HardForum Tech News' started by sknight, May 10, 2019.

  1. N4CR

    N4CR 2[H]4U

    Messages:
    3,864
    Joined:
    Oct 17, 2011
Yes, but you make it out like it's a massive problem when it's not for most users here - the latency is as low as or lower than Intel's where it counts, at under 8 threads. This is the same discussion as the 16-core Zen thread, but reversed.
It has more threads than most software can make use of, let alone beyond 8 threads for some software cases, Amdahl's law, etc. But here, in this thread, latency somehow matters more than anything beyond 8 threads all of a sudden?
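Amdahl's law, mentioned above, is easy to make concrete; here's a minimal sketch (the 70% parallel fraction is a hypothetical figure, not a measurement of any real workload):

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction`
    of the work can actually run in parallel (Amdahl's law)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

# A workload that is 70% parallelizable barely benefits past 8 cores:
for cores in (4, 8, 16, 32):
    print(cores, round(amdahl_speedup(0.70, cores), 2))
# speedup never exceeds 1 / 0.30 ~= 3.33x, no matter the core count
```

Which is the point about thread counts beyond what software can use: the serial fraction, not the core count, becomes the ceiling.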

I did look at the chart, and it clearly showed AMD has the lowest latency for a majority of desktop workloads under 8 threads, even beating the 7700K - dark blue and grey at the bottom of the latency pile. I share your thoughts on latency regarding Zen 2: I would expect slightly higher latency, with vastly lower chiplet-to-chiplet latency than in existing Epyc/TR arrangements (e.g. the other half of the chart, closer to current intra-CCX latency, which sits at the top of the chart). Even with a doubling, they'd still be around a 6950X and much faster than the ring-bus 7900X latency, which no one bitches about or notices. I would also expect they have a trick to minimise this issue. The IO controller is off-die; don't forget that moving all that stuff off the chiplet made more room for cache - so expect more there, 32MB of L3 per chiplet if the SiSoft leak isn't fudged. So it will pick up some steam in areas other than just clock speed.
The screenshot, as much as I'm not a fan of them, is from PCPer.

Edit to add: Zen+ latency improved over Zen, and memory speed can also impact this, so an improved memory controller and an efficiency bump might negate most of the latency and make it a wash...
     
    Legendary Gamer likes this.
  2. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    11,705
    Joined:
    Jun 13, 2003
    It matters or it doesn't; as games get more thread-aware, which is happening but slowly, it can tank IPC. Benchmarks on Threadripper at >16 cores show this vividly, and it's not just gaming.

The challenge is that when games 'break apart' processes to take advantage of more hardware thread resources, they still need to maintain concurrency for much of that work; thus, if the OS puts a thread on a core with significantly worse latency, the whole thing can slow down. With Threadripper, AMD's provided solution was literally to turn cores off in the UEFI.
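The concurrency cost being described can be sketched with a toy ping-pong microbenchmark; this is a rough stand-in using Python threads and events, not a real core-to-core latency test (which would need pinned threads and shared-memory flags):

```python
import threading
import time

def pingpong_roundtrips(n: int = 10_000) -> float:
    """Average seconds per round trip between two threads handing
    control back and forth. Each handoff pays a wakeup cost, which
    is exactly what grows when the OS parks the partner thread on
    a 'far' core (other CCX / other die)."""
    ping, pong = threading.Event(), threading.Event()

    def partner():
        for _ in range(n):
            ping.wait()
            ping.clear()
            pong.set()

    t = threading.Thread(target=partner)
    t.start()
    start = time.perf_counter()
    for _ in range(n):
        ping.set()      # hand work to the partner thread
        pong.wait()     # ...and block until it hands it back
        pong.clear()
    t.join()
    return (time.perf_counter() - start) / n

print(f"{pingpong_roundtrips() * 1e6:.1f} us per round trip")
```

On a real system you'd pin the two threads to specific cores (e.g. `os.sched_setaffinity` on Linux) and compare same-CCX vs cross-CCX pairs; the per-handoff number is what balloons in the Threadripper worst case.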

The chart shows the AMD parts either on par or worse; as core counts rise, AMD inter-core latency gets significantly worse versus modern Intel parts. Again, we're both hoping (and more or less expecting) AMD to address this issue, in part at least by:


But it will be cache that helps, and probably a lot of it. Memory latency is going to get worse; hopefully not as bad as the orphan dies on Threadripper, but it's also going to affect every core.


The biggest issue is that while total performance is undoubtedly going to go up, AMD has made architecture decisions that will make catching up with Intel's raw per-core performance even more difficult, and that's measuring against the aging and repeatedly 'refreshed' Skylake architecture. Intel has another architecture that's been sitting around for three or four years and has likely seen some updates, and that's what AMD's Zen 2 / Ryzen 3 is really going to have to compete with.
     
    Thatguybil likes this.
  3. Flexion

    Flexion [H]ard|Gawd

    Messages:
    1,601
    Joined:
    Jul 20, 2004
    Seems like we may have to pay more to not have failing southbridge fans? XD
     
    Master_shake_ likes this.
  4. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
Yes. Latency is not a problem. That is the reason why AMD released latency-improved AGESAs, a new chipset with improved memory support to reduce latency, some 2000-series chips with reduced L2 latency, and a special BIOS mode for dual-die TR (which disables one die to eliminate die-to-die cross latency). It is also the reason why reviews test Zen with the highest possible OC memory to reduce latencies, and why users are in forums asking how to obtain the fastest stable RAM to reduce latency in their builds. :rolleyes:
     
  5. Rockenrooster

    Rockenrooster Limp Gawd

    Messages:
    389
    Joined:
    Apr 11, 2017
Windows has problems (scheduler???) with more than 16 cores / 32 threads...
Look at Linux benches vs Windows and then come back... (not gaming benches)
     
    N4CR and Lakados like this.
  6. Rockenrooster

    Rockenrooster Limp Gawd

    Messages:
    389
    Joined:
    Apr 11, 2017
    I wonder if there will be some form of L4 cache on the IO die......................
     
    Keljian likes this.
  7. Keljian

    Keljian Gawd

    Messages:
    646
    Joined:
    Nov 7, 2006

    Now that would solve a lot of issues
     
    N4CR likes this.
  8. Snowdog

    Snowdog [H]ardForum Junkie

    Messages:
    9,354
    Joined:
    Apr 22, 2006
I think it's a virtual certainty that there will be substantial cache on the I/O die. I'd be shocked if there wasn't.
     
  9. Gideon

    Gideon 2[H]4U

    Messages:
    2,307
    Joined:
    Apr 13, 2006
It's an enthusiast forum, so yeah, we're always trying to tweak the machine to get the best out of it; otherwise you would just buy a Dell or something. It's not an issue, and it makes a small difference in benchmarks if you tweak the memory speed and latencies. It's not a problem except for a few unique cases on Threadripper, and everyone knows about those if you read reviews covering the largest Threadripper. Also, I'd rather have a feature to disable a CCX I didn't need if it hurt a specific task I needed to do. All you get from Intel these days is free exploits and reduced performance. The chiplet design was always going to have pros and cons, but at least they can continue to innovate on it and move to a smaller process more easily. If Intel keeps with that monolithic design they will fall further and further behind, as the node shrinks will destroy their yields.
     
    N4CR likes this.
  10. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    11,705
    Joined:
    Jun 13, 2003
I can look at Linux benchmarks all day and not exceed a hard zero for, say, Adobe Premiere.

    Yes, Microsoft (and software vendors) need to address the issue, and yes, AMD is still responsible for releasing a product without full software support.


    Beyond that, even when the software support is there, yes latency is still going to be a factor. That's what we're really highlighting using Threadripper, though it should be reiterated that the issues seen with Threadripper represent an absolute worst case and are significantly worse than what should be expected from multi-chiplet Ryzen 3 releases.
     
  11. Rockenrooster

    Rockenrooster Limp Gawd

    Messages:
    389
    Joined:
    Apr 11, 2017
    From WCCF tech:

    graph_7.png

If this is to be believed....... then they did something right to get some more IPC... (+25% compared to the 16-core Threadripper)
16-core Intel @ 4.8 = AMD 16-core @ 4.2... Holy crap!
But then, it's WCCF Tech.............
I guess we'll see......
     
    Last edited: May 22, 2019
    Master_shake_ likes this.
  12. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
    No
     
  13. Rockenrooster

    Rockenrooster Limp Gawd

    Messages:
    389
    Joined:
    Apr 11, 2017
    I guess we'll see......
     
  14. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
Another source claims a score of ~4250. That is a "~12.5% IPC gain" over Zen+. And this is Cinebench, which favors Zen µarchs.

    This was discussed before. There is no space in the IO die for a proper L4. Moreover, software doesn't detect any L4.
     
  15. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    11,705
    Joined:
    Jun 13, 2003
    ...more IPC in Cinebench, which really means more HPC float throughput.

    Not to belittle that achievement, but it seriously doesn't mean much when you throw in stuff that is latency and concurrency dependent. If it pans out, it will be nice for video editing though!
     
  16. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    350
    Joined:
    Oct 23, 2010

Is this going to turn into a GPU-type argument, like the one Nvidia fans always use when AMD does well, where anything that favors AMD is invalid because it didn't favor Nvidia the way the majority of games do, except in this case it's a CPU instead of a GPU? The difference here is that it's Intel instead of Nvidia, and the majority of software still favors Intel's architecture. The sad part is, you are ignoring the fact that the i9-7960X is overclocked to 4.8GHz (a 14% higher clock speed), as well as having 2 extra cores... so your "it favors AMD" argument is invalid no matter how you spin it, if these numbers are accurate.

    I for one hope these are legit and accurate results. It will be fun times.
     
    Last edited: May 22, 2019
  17. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    11,705
    Joined:
    Jun 13, 2003
    ...they're both 16-core CPUs...

    They may be for this specific spin of Cinebench, but the application of these scaling results to any other application directly would be wholly incorrect.

    It's entirely possible for Ryzen 3 to be slower than Ryzen 2 with the same number of cores given the changes to the architecture. One step forward, two steps back, and all that.
     
  18. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    350
    Joined:
    Oct 23, 2010
    You are correct, my mistake, not sure where I got the 2 extra cores... what I get for posting before I have my morning coffee.. LOL!
     
    Darth Kyrie and IdiotInCharge like this.
  19. Jim Kim

    Jim Kim 2[H]4U

    Messages:
    3,512
    Joined:
    May 24, 2012
  20. byusinger84

    byusinger84 Gawd

    Messages:
    777
    Joined:
    Feb 1, 2008
Even still, Cinebench is a good indicator of some baseline performance. I knew my performance was bad on my Ryzen 2700X (and before that my 1700X) because of Cinebench. I couldn't really tell in games, per se, except that benchmarks were always slower than on other comparable systems. Turns out it was my memory. Swapped to Ryzen-compatible memory and everything worked perfectly.

I'm super excited about these rumors. I really hope they end up being true. We're close enough now, and I've seen enough smoke to believe they are. Cost will be an interesting factor as well. Anything in the $200 range that performs as well as or better than current-gen CPUs at a lower TDP will be a winner for sure, especially for those who haven't switched to AMD.

    I'm personally looking forward to the 12c/24t or 16c/32t CPUs.
     
    Last edited: May 23, 2019
    blkt likes this.
  21. Jim Kim

    Jim Kim 2[H]4U

    Messages:
    3,512
    Joined:
    May 24, 2012
    Dreaming of sugarplums and a Ryzen 7 3700X.
     
    LightsOut41 and Legendary Gamer like this.
  22. mnewxcv

    mnewxcv [H]ardness Supreme

    Messages:
    6,471
    Joined:
    Mar 4, 2007
Really getting curious about pricing. If a TR4 refresh comes some months after the Zen 2 release, will Zen 2 be in the same price bracket as TR4?
     
  23. Nobu

    Nobu 2[H]4U

    Messages:
    3,234
    Joined:
    Jun 7, 2007
    Maybe current gen TR price for matching core counts, more or less depending on how much TR is left in inventory and the relative performance. Expect Zen2 TR to be more expensive across the board.
     
  24. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
What do Nvidia fans have to do with what is being stated here about Cinebench? What is being remarked is that Cinebench doesn't represent the average performance of Zen chips, nor does it represent those tasks where AMD is much worse. Cinebench is an "outlier" for AMD Zen. The reason Cinebench is a best case for Zen doesn't have anything to do with optimizing for "Intel architecture". It has to do with Cinebench being a rendering benchmark (that is, a throughput load) and Zen being optimized for throughput rather than latency, apart from Cinebench having anomalous SMT yields (which further favors an architecture like Zen).

    And the 7960X has 16 cores, not 18.
     
  25. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    350
    Joined:
    Oct 23, 2010
You completely missed my point: it is hard to demonstrate true performance when the majority of software is coded and optimized for Intel, not AMD. The Nvidia comment was made in hopes you would not take a similar path in this comparison, but it didn't work, as your response indicates; it is just an excuse to discount the results, as expected.

It is true that you can't determine true performance from just one piece of software, but a person still has to give credit where credit is due rather than make statements trying to invalidate the achievement and/or the results, which is exactly what your statement is doing. Basically, all you are saying is that AMD is optimized for this particular application and Intel is not, so the results don't mean anything because they show Intel's weakness and not AMD's. It's the exact same argument Nvidia fans use when AMD does well in a particular game.

As for my core-count mistake, may I suggest you read the rest of the comments, as it was already covered.
     
    Last edited: May 23, 2019
  26. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
    Not true.

No. I am not saying that "AMD is optimized for this particular application". My remark was something else. And you ignored my point again: my point is that CB15 doesn't in any way represent the general behavior of applications; it is an outlier.
     
  27. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    350
    Joined:
    Oct 23, 2010
You are talking in circles, contradicting your own words. Didn't you just state above that CB15 is a rendering benchmark (that is, a throughput load) and that Zen is optimized for throughput, not latency? Doesn't that mean Zen is optimized for that particular application, since that is what CB15 is really testing?

I never said CB15 could represent behavior in other applications, especially since each application is different and the results will change with each one. In other words, an application can only show the behavior for that particular application, or for applications in that particular category. CB15 can only represent performance in the rendering category. But it seems you are trying to invalidate those results, as if rendering performance means nothing since it isn't able to show behavior in other applications/categories outside of rendering.

If CB15 is just an "outlier", why did Intel go to great lengths with their 28-core "chiller" fiasco last year to try to look competitive with AMD, using CB15?
     
    Last edited: May 23, 2019
  28. Snowdog

    Snowdog [H]ardForum Junkie

    Messages:
    9,354
    Joined:
    Apr 22, 2006
    Even WCCF tech is saying to take this one with "a grain of salt".

    If true, this is a very big performance jump.

But one thing from this latest WCCF "leak" stands out as a red flag. They have the boost clock on the 12-core at 5GHz, and only 4.3 on the 16-core. That makes no sense at all if you understand how boost clocks work.

Since boost clock usually only affects 1 or 2 cores before it starts dropping, it should be about the same on both the 12- and 16-core parts, and generally AMD sets it higher on higher-core-count chips, not the other way around.
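That sanity check is easy to express as a toy boost table; all the numbers below are made up for illustration, not AMD's actual behavior:

```python
def boost_clock_ghz(active_cores: int, base: float = 3.5,
                    max_boost: float = 5.0, step: float = 0.1) -> float:
    """Toy boost model: the first one or two active cores get the full
    advertised boost; each additional active core steps the clock down,
    never below base. Numbers are illustrative only."""
    if active_cores <= 2:
        return max_boost
    return max(base, max_boost - step * (active_cores - 2))

# Under any model like this, a 12-core and a 16-core part advertise
# the SAME 1-2 core boost; only the all-core clock differs. A leak
# showing 5.0 on the 12-core but only 4.3 on the 16-core is suspicious.
print(boost_clock_ghz(1), boost_clock_ghz(12), boost_clock_ghz(16))
```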

    It looks more like someone got sloppy making up their "leak".
     
    wolfofone and Jim Kim like this.
  29. Snowdog

    Snowdog [H]ardForum Junkie

    Messages:
    9,354
    Joined:
    Apr 22, 2006
    Based on what? Why wouldn't there be cache on the IO die?

Without cache, you are reduced to doing a crossbar from the memory controller to each chiplet (locking out the rest) for each memory access. That would be quite inefficient.

With a nice fat cache, the memory controller can work on keeping the cache filled, while the chiplets could have simultaneous access to the cache portions set up for them.
Also, look at the size of the IO die and consider that it does relatively little. I would bet most of the die is cache.
This will be extremely important for Epyc with 8 chiplets, but this kind of design should also be on Ryzen 3000 with two chiplets.

I'd be shocked if there is no cache in the IO die.
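The argument for a shared cache can be sketched as a toy LRU simulation; the sizes, hit pattern, and 90/10 split below are hypothetical, purely to show that every hit is a memory-controller round trip avoided:

```python
import random
from collections import OrderedDict

def simulate(cache_lines: int, accesses: int = 50_000,
             hot_set: int = 512, seed: int = 1) -> float:
    """Fraction of accesses served by a shared LRU cache (a stand-in
    for a hypothetical IO-die cache). Every miss would instead
    serialize the chiplets on the memory controller."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(accesses):
        # 90% of accesses touch a small hot working set; 10% stream
        addr = (rng.randrange(hot_set) if rng.random() < 0.9
                else rng.randrange(10_000_000))
        if addr in cache:
            cache.move_to_end(addr)   # refresh LRU position
            hits += 1
        else:
            cache[addr] = None
            if len(cache) > cache_lines:
                cache.popitem(last=False)   # evict least recently used
    return hits / accesses

print(f"tiny cache: {simulate(64):.2f} hit rate, "
      f"big cache: {simulate(4096):.2f} hit rate")
```

A cache large enough to hold the hot set serves almost every access locally; a too-small one degrades to near-constant memory-controller traffic, which is the contention case described above.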
     
  30. Nobu

    Nobu 2[H]4U

    Messages:
    3,234
    Joined:
    Jun 7, 2007
    relevant article on wikichip
     
  31. Snowdog

    Snowdog [H]ardForum Junkie

    Messages:
    9,354
    Joined:
    Apr 22, 2006
    While that answer is slightly better than "no", it's an Infinity Fabric article that says nothing about the new I/O controller chip in Ryzen 3000.

The only thing I could find on wikichip that mentioned either of the new I/O dies was this:
    https://en.wikichip.org/wiki/amd/microarchitectures/zen_2
    The truth is that no one knows what is in the I/O die, but given the large size vs work to be done, it seems reasonable that it would include a significant cache.
     
    N4CR likes this.
  32. Nobu

    Nobu 2[H]4U

    Messages:
    3,234
    Joined:
    Jun 7, 2007
    Wasn't really meant as a "no", just for informational purposes. I understand that, being a wiki, it is made up of bits of information pieced together by individuals with varying levels of understanding of the subject, sometimes without good sources.
     
  33. Jim Kim

    Jim Kim 2[H]4U

    Messages:
    3,512
    Joined:
    May 24, 2012
    ^This
    I seriously doubt AMD has spent all this time and money just to push out a turd.
     
  34. jeffj7

    jeffj7 [H]Lite

    Messages:
    100
    Joined:
    Jun 2, 2012
Considering the price difference between the Intel part and this chip's leaked rumored price, well, it would have to be really terrible not to be a big win. Can't wait to find out for sure :)
     
    Darth Kyrie and Jim Kim like this.
  35. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
Space. There is not enough space in the IO die.
     
  36. juanrga

    juanrga Pro-Intel / Anti-AMD Just FYI

    Messages:
    2,550
    Joined:
    Feb 22, 2017
The Zen µarch is optimized for throughput workloads, which isn't the same as saying that "AMD is optimized for this particular application". I already explained why CB15 is an outlier: it has special characteristics such as abnormally large SMT yields. The Zen architecture has not been optimized for executing CB15.

You continue ignoring the point, and this is the third time you've done so. The problem here isn't that CB15 doesn't represent non-rendering applications. Blender doesn't represent 7-zip, 7-zip doesn't represent SPECint, and SPECint doesn't represent GROMACS... The problem is that CB15 is an outlier (it doesn't even represent rendering, because Blender, Corona, etc. behave differently), and being an outlier, CB15 must be taken out of the sample of representative applications.

It doesn't matter what marketing teams do. CB15 is an outlier, as argued in a former post.
     
  37. schmide

    schmide Limp Gawd

    Messages:
    221
    Joined:
    Jul 22, 2008
Why do you keep pigeonholing Zen as throughput-optimized?

If you take a fixed set of data and measure how long it takes, that is a response metric.

If you take a fixed time period and measure how much data is processed, that is a throughput metric.

Since CB15 takes a fixed set of data and times how long it takes, it is thus the former.

By the above logic and your own framing, Zen is optimized for response, not throughput.

No processor is optimized for a certain frame of reference.

It's ironic that you pull out your GROMACS whistle while complaining about an outlier.
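The distinction is just a matter of what you hold fixed; a quick sketch of the same run measured both ways (the 'tile' work below is an arbitrary stand-in, not Cinebench):

```python
import time

def render_tiles(n_tiles: int) -> None:
    """Stand-in for a CB15-style job: a fixed amount of work."""
    for _ in range(n_tiles):
        sum(i * i for i in range(1_000))  # one 'tile' of work

# Response metric: fix the data, measure the elapsed time.
start = time.perf_counter()
render_tiles(200)
elapsed = time.perf_counter() - start

# Throughput metric: the same run expressed as work per unit time.
tiles_per_s = 200 / elapsed

print(f"response: {elapsed:.3f} s for 200 tiles")
print(f"throughput: {tiles_per_s:.0f} tiles/s")
```

The two numbers are reciprocal views of one measurement, which is why labeling a chip "throughput-optimized" based on a fixed-work benchmark alone is slippery.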
     
  38. KazeoHin

    KazeoHin [H]ardness Supreme

    Messages:
    7,862
    Joined:
    Sep 7, 2011
    Remember: when Intel dominates a benchmark, it's because it's a normal task. When AMD dominates: it's because of a freak accident.





    We're probably going to see a lot of freak accidents soon...
     
  39. NWRMidnight

    NWRMidnight Limp Gawd

    Messages:
    350
    Joined:
    Oct 23, 2010
CB15 doesn't represent non-rendering applications? I swear I saw a statement that said basically the same thing. Now where did I see that... hmm, oh wait, silly me! I said it in the very response you replied to (did you fully read what I said?). I think you just confirmed the point I was trying to make. You even went so far as to give examples, and by your logic you basically invalidated every benchmark/application used to judge performance, because no single benchmark/application can demonstrate relative performance in every situation for every workload category, be it rendering, compression, gaming, etc. So by your logic, 7-Zip benchmarks are invalid because they don't represent rendering performance. Do you see how silly your argument is now?

BTW, how is a rendering benchmark not a representation of rendering? I get that a benchmark is going to behave differently than an actual rendering application; that is a given. Just as gaming benchmarks behave differently than actual gameplay. But they are still tools that give us indicators of how a piece of hardware will perform on a particular workload, and a way to judge performance between different manufacturers/architectures, etc.
     
    Last edited: May 23, 2019
    Darth Kyrie and schmide like this.
  40. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    11,705
    Joined:
    Jun 13, 2003
If we take average performance over a range of benchmarks that, say, represent >95% of applicable workloads, and there's this one benchmark that really stands out one way or the other, we can call it an outlier. CB15 more or less is that.
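That framing can be stated mechanically: flag any benchmark whose relative score deviates from the suite mean by more than some z-score cutoff. The ratios below are invented for illustration, not real results:

```python
from statistics import mean, stdev

def flag_outliers(scores: dict, z_cut: float = 2.0) -> list:
    """Return benchmark names whose score sits more than z_cut
    standard deviations away from the suite mean."""
    vals = list(scores.values())
    mu, sigma = mean(vals), stdev(vals)
    return [name for name, v in scores.items()
            if abs(v - mu) / sigma > z_cut]

# Hypothetical relative scores (chip A / chip B) across a suite:
suite = {"7-zip": 1.02, "Blender": 0.98, "x264": 0.95, "GROMACS": 1.01,
         "SPECint": 0.97, "Corona": 1.00, "CB15": 1.35}
print(flag_outliers(suite))  # -> ['CB15']
```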

    If we could take a CB15 score and apply that scaling across the board, that would be awesome- but we really, really can't, particularly with AMD changing the layout with Ryzen 3. Ryzen to Ryzen 2 would be more reliable and yet we'd still be a bit skeptical.

To state my concern: I see the possibility that Ryzen 3 might be faster than expected in some workloads and slower than expected in others - as in, we very likely won't see a linear shift in performance from Ryzen 2. These quite nice CB15 results point to pure float throughput being up per clock; however, AMD's architectural changes point to lower IPC for anything that requires memory access or thread coherency across dies, simply because they've split processing between two dies and put the memory controller on a third.

    And to be clear, just like the references used in this thread, we're speculating. For me, it isn't anti-AMD- I'd love to see Skylake or better performance across sixteen cores in something accessible to consumers! I just see some very real speedbumps potentially impeding Ryzen 3 from getting there.
     
    juanrga and KazeoHin like this.