AMD Ryzen 16 Core “Whitehaven” Enthusiast CPUs Leaked – 3.6GHz Clock Speed, Boatloads of Cache & Quad Channel DDR4

Well, you and others objected to my pre-launch claim that, due to higher performance per core, the 10C Skylake "would be faster" than the 16C TR on workloads with 24 threads. I am talking about the 1920X for two reasons: (i) we don't have workloads with exactly 24 threads running on the 1950X, and (ii) the equation I used for my claim applies to both the 1950X and the 1920X, because it is a basic equation of computing.

Well then your equation is wrong, because it gives too much value to SMT. I'm pretty sure that a 16c/16t TR would be faster than the 1920X in all workloads, unless AMD somehow (magically) improved SMT so much that it actually gives over a 35% performance increase per core.

That is also the reason why the 6c/6t Coffee Lake will be faster than the 4c/8t Kaby. I'll go as far as to say that a 6c/6t CFL at its stock 4.3GHz all-core turbo is going to be faster than a 5GHz 4c/8t Kaby when all available threads are utilized.

But why don't you post your equation, so we can all see it and maybe understand how you arrive at these conclusions.
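For what it's worth, the "basic equation" in dispute is presumably just aggregate throughput: cores × clock × IPC, with each SMT sibling thread counted at some fractional yield. A minimal sketch of that model (the clocks, IPC figures, and smt_yield below are illustrative assumptions, not either poster's numbers):

```python
# Toy version of the "basic equation of computing" being argued about.
# Every number here is an assumption for illustration, not a measurement.

def throughput(cores, clock_ghz, ipc, threads, smt_yield=0.30):
    """Relative throughput of `threads` busy threads on a 2-way SMT chip."""
    primary = min(threads, cores)                       # first thread per core
    siblings = max(0, min(threads, 2 * cores) - cores)  # SMT sibling threads
    return clock_ghz * ipc * (primary + siblings * smt_yield)

# 24-thread workload: 10C/20T Skylake-X vs 12C/24T Threadripper 1920X
print(throughput(cores=10, clock_ghz=4.0, ipc=1.10, threads=24))  # SKL-X
print(throughput(cores=12, clock_ghz=3.7, ipc=1.00, threads=24))  # TR 1920X
```

With these inputs the two land almost on top of each other; lower the smt_yield and the 10C part wins, raise it and the 1920X wins, which is exactly the parameter the two posts above disagree about.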


Threadripper with 3200MHz RAM
When the 10C SKL beats the 12C TR, it usually does so by a larger margin than when it loses. I have not computed the average, but I am sure it will be close to the average reported in the HFR review, even though the benches above are all encoding/rendering, whereas the HFR average covers a broader range of applications.
[Image: latency-pingtimes-1950x2400.png]
[Image: latency-pingtimes-1950x3200.png]


[Image: normal-memspeed.png]


That is with unknown timings... and I know for a fact that timings (subtimings in particular) play a HUGE role when optimizing away Ryzen's latency penalties. They play such a big role that a 3GHz Ryzen with fast memory and optimized subtimings is faster than a 4GHz Ryzen with "standard" memory profiles. And the same applies to TR.

[Image: rotr3kvafh.jpg]


A 3GHz Ryzen with 3466C14 + proper subtimings beats a 4GHz Ryzen on a 3200C14 "standard" profile.

But I'll concede that it doesn't affect workloads which don't have that many thread dependencies. But then again, you've been saying for a while now that latency will be a problem for EPYC and TR... and from the tests I've read, they seem to be doing just fine with server/workstation loads.
 
AMD... "We're avoiding the supply issues we encountered with Ryzen 7 by launching Threadripper with a much higher volume."

Yeah, whatever. Amazon's estimated shipping date for people who preordered is Sept 11, 2017, a month after release. What was the effing point of preordering?
 
AMD... "We're avoiding the supply issues we encountered with Ryzen 7 by launching Threadripper with a much higher volume."

Yeah, whatever. Amazon's estimated shipping date for people who preordered is Sept 11, 2017, a month after release. What was the effing point of preordering?

Amazon won't stop preorders, so you're hosed once they sell past a certain point and you're not part of that first wave. Always go with Newegg, as they will cut off preorders. I got my Ryzen chip from Amazon no problem, but they ran out of motherboards and gave me a two-month window. I ordered from Newegg and got it within a week, even though they had no stock when I ordered. Avoid Amazon for preorders; only order if they show stock on hand.
 
[Image: latency-pingtimes-1950x2400.png]
[Image: latency-pingtimes-1950x3200.png]

[Image: normal-memspeed.png]


[...]

But I'll concede that it doesn't affect workloads which don't have that many thread dependencies. But then again, you've been saying for a while now that latency will be a problem for EPYC and TR... and from the tests I've read, they seem to be doing just fine with server/workstation loads.

Thanks for the first figure, which shows the huge latency between cores on different dies. Of course, as I have stated a dozen times, this latency penalty does not affect throughput loads such as the collection of Cinebench, Handbrake, Corona Render, V-Ray,... used in many reviews. Remember:

latency != throughput
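A toy sketch of why the distinction matters (worker and tile counts are arbitrary): in an embarrassingly parallel load, workers never talk to each other mid-task, so the cross-die hop is almost never paid, while throughput still scales with worker count.

```python
# Cinebench/Handbrake-shaped work: independent chunks, no sharing between
# workers, so inter-core latency barely matters; core count does.
from multiprocessing import Pool

def render_tile(tile_id):
    # Stand-in for one tile's worth of pure compute
    return sum(i * i for i in range(200_000))

if __name__ == "__main__":
    with Pool(16) as pool:                           # one worker per core
        results = pool.map(render_tile, range(64))   # independent tiles
```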

The second figure just confirms my point that faster memory doesn't change the situation significantly on TR. Ignore the synthetic memory bandwidth measurement, because that is obviously going to be very sensitive to RAM speed, and take the average of the remaining 22 benchmarks. If I did not make any mistake, the averages are

2400 RAM: 1.01
3200 RAM: 1.04

Therefore the faster RAM settings provided about 3% higher performance on average. The people who criticized the HFR review were wrong, and the people who made accusations about cherry-picking reviews were even more wrong.
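In sketch form, that average is just each test's 3200MHz score normalized by its 2400MHz score, then averaged (the score pairs below are placeholders, not the review's actual numbers; the bandwidth test is simply left out):

```python
from statistics import mean

# (3200MHz score, 2400MHz score) per benchmark -- placeholder values
pairs = [(155, 150), (100, 97), (201, 195)]
print(mean(s32 / s24 for s32, s24 in pairs))  # ~1.03, i.e. ~3% average uplift
```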
 
I'm hearing Amazon has stock of Threadripper now. Have to check. Posting from my phone.
Just checked. MSI board: only 8 left. 1-2 days' shipping delay on the processors. Amazon's falling behind the curve.
 
I'm hearing Amazon has stock of Threadripper now. Have to check. Posting from my phone.
Just checked. MSI board: only 8 left. 1-2 days' shipping delay on the processors. Amazon's falling behind the curve.

I get my 1950X tomorrow (Mon) from Crapizon. They still haven't shipped my board. I also got a mobo from Newegg, which will be here Wed. MSI. Gigabyte is super delayed for some reason at Amafail.
 
Well, you and others objected to my pre-launch claim that, due to higher performance per core, the 10C Skylake "would be faster" than the 16C TR on workloads with 24 threads. I am talking about the 1920X for two reasons: (i) we don't have workloads with exactly 24 threads running on the 1950X, and (ii) the equation I used for my claim applies to both the 1950X and the 1920X, because it is a basic equation of computing.



Threadripper with 3200MHz RAM

[Images: HandBrake.png, Premiere.png, Blender1.png, Blender2.png, Corona.png, POVray.png]


When the 10C SKL beats the 12C TR, it usually does so by a larger margin than when it loses. I have not computed the average, but I am sure it will be close to the average reported in the HFR review, even though the benches above are all encoding/rendering, whereas the HFR average covers a broader range of applications.

A lot comes down to turbo range; Intel's process has the clockspeed advantage, which contributes a fair amount to its core-vs-core advantage. The good thing is that AMD has caught up a lot core-vs-core, and while not yet perfect, it is compelling for a large portion of the consumer market.

AMD stirred the pot and Intel is clearly reacting to it; I mean, the 7980XE was a pipe dream 6 months ago.
 
A lot comes down to turbo range; Intel's process has the clockspeed advantage, which contributes a fair amount to its core-vs-core advantage. The good thing is that AMD has caught up a lot core-vs-core, and while not yet perfect, it is compelling for a large portion of the consumer market.

10-20% behind on IPC, about 1GHz behind, and light-years behind on AVX workloads (something like 2X or 3X slower) doesn't look like "caught up".

On stock settings, the TR 1950X is about 20% faster than the i9-7900X on workloads that scale up to 16C. And TR is slower on all workloads that cannot use those 60% extra cores.
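Back-of-envelope, that 20% is what simple core-scaling arithmetic predicts under assumed all-core clocks and an assumed IPC gap (both numbers mine, for illustration only):

```python
# relative throughput ~= cores * all-core clock * relative IPC
tr_1950x = 16 * 3.4 * 1.00   # assumed ~3.4GHz all-core, Zen IPC as baseline
i9_7900x = 10 * 4.0 * 1.10   # assumed ~4.0GHz all-core, ~10% IPC advantage
print(round(tr_1950x / i9_7900x, 2))  # ~1.24 with these assumptions
```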

And thanks to higher OC headroom, the i9 can close the gap:

[Image: srednia_app.png (application average)]



AMD stirred the pot and Intel is clearly reacting to it; I mean, the 7980XE was a pipe dream 6 months ago.

Considering that TR didn't exist on AMD's roadmaps only one year ago, I wonder who really stirred the pot.
 
10-20% behind on IPC, about 1GHz behind, and light-years behind on AVX workloads (something like 2X or 3X slower) doesn't look like "caught up".

On stock settings, the TR 1950X is about 20% faster than the i9-7900X on workloads that scale up to 16C. And TR is slower on all workloads that cannot use those 60% extra cores.
And thanks to higher OC headroom, the i9 can close the gap.
Considering that TR didn't exist on AMD's roadmaps only one year ago, I wonder who really stirred the pot.

Wow, I've never seen someone complain about Threadripper this much without any validation behind it. What's next, are you going to tell us it is slower in gaming? What about the latencies?
Luckily we're spared the "Intel will come back with 10nm or Optane".

The product is still ahead of Intel's 16C/32T because that doesn't exist for consumers, and it also beats Intel hands down on price.
Maybe you could link us some data where there is some exotic benchmark that Intel wins all the time.

You should ask the people at Intel who stirred the pot: all of their high-priced server CPUs, which cost thousands of dollars, are now used for HEDT at a fraction of what they are worth in a different market.
 
10-20% behind on IPC, about 1GHz behind, and light-years behind on AVX workloads (something like 2X or 3X slower) doesn't look like "caught up".

On stock settings, the TR 1950X is about 20% faster than the i9-7900X on workloads that scale up to 16C. And TR is slower on all workloads that cannot use those 60% extra cores.

And thanks to higher OC headroom, the i9 can close the gap:

[Image: srednia_app.png (application average)]

Considering that TR didn't exist on AMD's roadmaps only one year ago, I wonder who really stirred the pot.

As before, AVX is a niche market, one Intel had because they bought up all the enterprise players and ensured AMD would never compete in that segment. AVX, like CUDA, reinforces a single-player climate that doesn't treat other processing units equally. AMD does support AVX-256, but based on Intel's very own proviso in their AVX documentation that non-Intel CPUs get "baseline" support, to me baseline sounds like AVX-only support and nothing more, much as CUDA did nothing on Radeon cards.

I have stated this before: like many, I don't expect that situation to change overnight, but I do expect AMD to nibble away bit by bit at HPC/enterprise until developers in that segment choose not to use proprietary standards.

As to the rest, yes, AMD's strength is parallelism, and in the general computing that Ryzen and Threadripper are designed for, the degree of parallelism is impressive, far more than I was expecting. So in scaling, AMD is very much keeping Intel honest.

On the clock speed, unfortunately that is one of the consequences of using an LP node, but nodes can be changed easily and I am sure AMD will change over to the IBM process beyond Pinnacle Ridge. I won't lie, the clocks are a bit too low and give far too much advantage to the Intel competition in that price bracket.

People have told me that TR prototypes were well known during the Ryzen ES stage, disguised as Naples, so I don't think it was a surprise to AMD. I remember how vociferously you disputed its existence about a month before it was confirmed.

Anyway, I will leave you to being continually salty about AMD; too much salt is toxic and I have had my dosage.
 
Wow, I've never seen someone complain about Threadripper this much without any validation behind it. What's next, are you going to tell us it is slower in gaming? What about the latencies?
Luckily we're spared the "Intel will come back with 10nm or Optane".

The product is still ahead of Intel's 16C/32T because that doesn't exist for consumers, and it also beats Intel hands down on price.
Maybe you could link us some data where there is some exotic benchmark that Intel wins all the time.

You should ask the people at Intel who stirred the pot: all of their high-priced server CPUs, which cost thousands of dollars, are now used for HEDT at a fraction of what they are worth in a different market.

If you ever wondered what all the salt in the ocean would look like, I present you with exhibit A.
 
As before, AVX is a niche market

AVX-512 is a standard in HPC and is now broadly used in servers.

I have stated this before: like many, I don't expect that situation to change overnight, but I do expect AMD to nibble away bit by bit at HPC/enterprise until developers in that segment choose not to use proprietary standards.

Like x86-64, which is a closed standard that AMD doesn't license to anyone except Intel via the cross-licensing agreement? Like FSA, which was a proprietary standard that AMD tried, but no one wanted?

As to the rest, yes, AMD's strength is parallelism, and in the general computing that Ryzen and Threadripper are designed for, the degree of parallelism is impressive, far more than I was expecting. So in scaling, AMD is very much keeping Intel honest.

Designing a CPU optimized for latency is harder than designing one optimized for throughput. AMD is repeating the same strategy as with Bulldozer: "BULLDOZER: AN APPROACH TO MULTITHREADED COMPUTE PERFORMANCE"

We have the same strategy now with Zen: moar cores at similar price points, a more distributed microarchitecture, a memory subsystem optimized more for throughput than latency, overclocking to reduce the latency penalty,...

On the clock speed, unfortunately that is one of the consequences of using an LP node, but nodes can be changed easily and I am sure AMD will change over to the IBM process beyond Pinnacle Ridge.

I have heard that excuse before.
 
Wow, I've never seen someone complain about Threadripper this much without any validation behind it. What's next, are you going to tell us it is slower in gaming? What about the latencies?

#248

And I am anxiously awaiting the TR 1900X review, because it will have only four cores per die.

The product is still ahead of Intel's 16C/32T because that doesn't exist for consumers, and it also beats Intel hands down on price.
Maybe you could link us some data where there is some exotic benchmark that Intel wins all the time.

Dozens of ordinary benches have been posted in this thread showing that the 10C SKL (on stock settings) beats the 12C more often than not, and is just behind the 16C TR on throughput-optimized workloads (such as rendering and encoding). Averages at both stock and OC settings were also posted; the i9 OC is on par with the 1950X OC.
 
#248

And I am anxiously awaiting the TR 1900X review, because it will have only four cores per die.



Dozens of ordinary benches have been posted in this thread showing that the 10C SKL (on stock settings) beats the 12C more often than not, and is just behind the 16C TR on throughput-optimized workloads (such as rendering and encoding). Averages at both stock and OC settings were also posted; the i9 OC is on par with the 1950X OC.
You forgot to talk about PRICE again, and about the PLATFORM advantages that are GREATLY in AMD's FAVOR.

You keep forgetting to comment on those.
 
AVX-512 is a standard in HPC and is now broadly used in servers.



Like x86-64, which is a closed standard that AMD doesn't license to anyone except Intel via the cross-licensing agreement? Like FSA, which was a proprietary standard that AMD tried, but no one wanted?



Designing a CPU optimized for latency is harder than designing one optimized for throughput. AMD is repeating the same strategy as with Bulldozer: "BULLDOZER: AN APPROACH TO MULTITHREADED COMPUTE PERFORMANCE"

We have the same strategy now with Zen: moar cores at similar price points, a more distributed microarchitecture, a memory subsystem optimized more for throughput than latency, overclocking to reduce the latency penalty,...



I have heard that excuse before.

Since the market AMD designed its product for is one of multithreaded performance, they have done well; I haven't seen many people in large-scale content production complaining at all about Ryzen performance in those tasks.

So for workloads requiring large renders to be done as fast as possible, AMD did very well, and it still offers competent gaming performance at a low cost. Yeah, that's so bad for us, man.

You mean to say that a company with far more resources can dedicate them to trying to build the mecca of CPUs yet has largely been stuck in neutral for years, as opposed to a company with very little R&D designing the most effective CPU it can on a skint budget. Just seeing your reactions over the internet, AMD clearly gave a few people nosebleeds.

Stay salty, brother, stay salty.
 
#248
And I am anxiously awaiting the TR 1900X review, because it will have only four cores per die.

I'm not waiting for anything ;) What will be funny is the 16C/32T Intel CPU ;)

What I am anxious for is software written around CCX communication for larger workloads. In the end, any higher core count CPU (16+, yes, also Intel's) would benefit from having small clusters rather than large clusters for communication.
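Until software does that itself, the "small clusters" idea can be approximated by pinning workers so they stay inside one CCX and share its L3. A minimal sketch, Linux-only, and the core IDs are an assumption (check the real topology with `lscpu --extended` first):

```python
import os

CCX0 = {0, 1, 2, 3}              # assumed: logical cores 0-3 share one CCX
os.sched_setaffinity(0, CCX0)    # restrict the current process to that CCX
print(os.sched_getaffinity(0))   # confirm the new affinity mask
```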
 
I'm not waiting for anything ;) What will be funny is the 16C/32T Intel CPU ;)

What I am anxious for is software written around CCX communication for larger workloads. In the end, any higher core count CPU (16+, yes, also Intel's) would benefit from having small clusters rather than large clusters for communication.

In typical AMD fashion, there is some low-hanging fruit for the future if you consider the loads where it excels. CCX communication is something that will get ironed out over time, but what I have seen from the physical Ryzen system in front of me is that it does everything well, if not necessarily the best; the kicker is that it didn't really need to be the utter best.
 
You forgot to talk about PRICE again, and about the PLATFORM advantages that are GREATLY in AMD's FAVOR.

You keep forgetting to comment on those.

I was discussing exclusively performance on x86 workloads in that post. Pricing is similar, and I didn't discuss the platform characteristics of either, just as I didn't discuss AVX-512, the content/gaming BIOS modes, or efficiency.
 
Since the market AMD designed its product for is one of multithreaded performance, they have done well; I haven't seen many people in large-scale content production complaining at all about Ryzen performance in those tasks.

So for workloads requiring large renders to be done as fast as possible, AMD did very well, and it still offers competent gaming performance at a low cost. Yeah, that's so bad for us, man.

Multithreaded performance is given by the product of the number of cores used and the performance per core. Also, for code with lots of explicit parallelism, a SIMD approach such as AVX-512 is better, because it reduces the thread-scheduling overhead in the OS and the fetch/decode overhead in the core pipeline. This is the same reason ARM developed the SVE specification, with support for vectors up to 2048 bits.
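A rough illustration of that fetch/decode argument, with NumPy standing in for hand-written AVX-512 (the arrays and sizes are arbitrary):

```python
import numpy as np

a = np.random.rand(1 << 20)
b = np.random.rand(1 << 20)

out = a * b + 1.0   # SIMD-style: one instruction stream, parallel lanes
# The scalar/threaded shape of the same work pays per-element overhead:
# for i in range(a.size): out[i] = a[i] * b[i] + 1.0
```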

Pricing is about the same, and TR plays games worse. I gave application averages above; now take a look at the game averages:

[Image: srednia_gry.png (game average)]
 
Multithreaded performance is given by the product of the number of cores used and the performance per core. Also, for code with lots of explicit parallelism, a SIMD approach such as AVX-512 is better, because it reduces the thread-scheduling overhead in the OS and the fetch/decode overhead in the core pipeline. This is the same reason ARM developed the SVE specification, with support for vectors up to 2048 bits.

Pricing is about the same, and TR plays games worse. I gave application averages above; now take a look at the game averages:

[Image: srednia_gry.png (game average)]

oh look



Did they run out of English sites? I mean, I can't even figure out what the bench is. At least we know a non-OC 1800X and a 1950X at similar clocks perform more or less the same, showing gaming as a non-scaling situation. I also love how a 6950X sits atop a 7700K in gaming, making me question this as pure cherry-picked BS that supports your agenda.

Keep pickin'.
 