AMD Ryzen Threadripper 2990WX & 2950X CPU Review @ [H]

Windows 10 Pro 64-bit supports up to 256 cores per CPU. I hope we can put this to rest now.

That being said, the scheduler might be absolute garbage in Windows 10 Pro and only improves as you move up to Enterprise (I'm speculating here).
I don't expect that using Windows Enterprise or Server will help. Windows may scale up to 256 cores on server workloads, but possibly not on HEDT/workstation workloads.

Linus is at least surprised at how badly Windows performs in some of the benchmarks. Also, not all benchmarks where the 2990WX came out ahead are perfectly tuned for parallelism.
http://openbenchmarking.org/result/1808130-RA-CPUUSAGED10

It seems to me that the 2990WX is performing better in the Phoronix review because the suite uses many microbenchmarks and toy-like workloads that fit into cache and avoid the latency/bandwidth penalties on the compute dies.
Certainly there are some benchmarks that fit your description, but far from all of them.

In particular, 7-zip compression and a Linux kernel compile don't fit into cache. CFD depends heavily on memory bandwidth, and from looking at some of the Windows results one might think that the TR 2990WX has hit a wall here (source: https://techreport.com/review/33977/amd-ryzen-threadripper-2990wx-cpu-reviewed/7):
[Chart: CFD results under Windows, from the TechReport review (WPCcfd.png)]

When in fact it could just be a peculiarity of Windows and/or the benchmark (source: https://www.phoronix.com/scan.php?page=article&item=amd-linux-2990wx&num=4).
[Chart: Rodinia results for the 2990WX under Linux, from Phoronix (rodinia-2990wx.png)]
 
The only thing they had going for them was compiler fuckery and SSE2 video applications, a bit like how they're trying with AVX512 today, except that's hardly used at all, so they can't pull the same trick. Whoops..

But we use the crap out of SSE2 today, for everything.

These instruction additions are meant to address x86's (and mostly x87's) inherent deficiencies and are doing a damn fine job of it. AVX512 continues down that road, and for workstation tasks, would be something highly desired. Hard to imagine building a compute-heavy application today without considering its use, especially given the latency and bandwidth benefits of having it done locally versus pushing it across the PCIe bus to GPU(s).
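
To make it concrete, here's a minimal sketch of my own (nothing from the review), assuming GCC or Clang with -mavx512f on AVX512-capable hardware. It just sums an array 16 floats at a time versus plain scalar code:

[CODE]
/* Toy example (mine, not from the review): summing floats with AVX-512
 * intrinsics vs. scalar code. Build with: gcc -O2 -mavx512f avx512_sum.c */
#include <immintrin.h>
#include <stdio.h>

static float sum_scalar(const float *x, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

static float sum_avx512(const float *x, size_t n) {
    __m512 acc = _mm512_setzero_ps();               /* 16 floats per register */
    size_t i = 0;
    for (; i + 16 <= n; i += 16)
        acc = _mm512_add_ps(acc, _mm512_loadu_ps(x + i));

    float lanes[16];
    _mm512_storeu_ps(lanes, acc);                   /* spill and sum the 16 lanes */
    float s = 0.0f;
    for (int k = 0; k < 16; k++)
        s += lanes[k];
    for (; i < n; i++)                              /* scalar tail */
        s += x[i];
    return s;
}

int main(void) {
    float data[1000];
    for (int i = 0; i < 1000; i++)
        data[i] = (float)i;
    printf("scalar: %.1f  avx512: %.1f\n",
           sum_scalar(data, 1000), sum_avx512(data, 1000));
    return 0;
}
[/CODE]

The point being that the vector path is not exotic; a compiler will often auto-vectorize loops like this on its own if you let it.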

[Edit] And I'll add: I'd love to see the 28-core Intel compared to the 32-core AMD with an AVX512-optimized workload. Hopefully compiled separately for each, with optimizations for each, such that the difference between the two can be reasonably sussed.
 
Also, not all benchmarks where the 2990WX came out ahead are perfectly tuned for parallelism.

Parallelism isn't the only thing to consider here, though; it's how resources are distributed on the 29x0WX, specifically with respect to memory bandwidth. Having cores that don't have local memory access at all is definitely going to throw a wrench into things. Just being well-threaded might actually be a detriment in some cases if the process isn't also aware of how memory is distributed.
 
But they didn't say NOT to try more RAM, though ;)

If you've got some spare DIMMs, it might be worth the time to test it out.

A thought on this: the issue could be alleviated by more RAM, but more likely, performance in highly-threaded workloads is going to be constrained by having half of those threads on cores that are not directly connected to a memory controller.

But hell, try it if it can be tried.
 
It seems that part of the reason the 2990WX does poorly in games is the NVIDIA driver. The NVIDIA GeForce driver seems to have problems scaling to 64 threads.

https://www.golem.de/news/32-kern-cpu-threadripper-2990wx-laeuft-mit-radeons-besser-1808-136016.html (in German)

Having cores that don't have local memory access at all is definitely going to throw a wrench into things.
Do you have an example of a workload where this is reasonably the case? From what I have seen so far, either the operating system or the benchmark application itself appeared to not scale properly.
 
Beast Mode: ON

That 2990WX is fully unchained! I'm really, really looking forward to the coming couple/few years as professional applications and game engines are optimized to take advantage of that number of cores/threads.

My only minor nitpick: I wish AMD had enabled dual socket SMP for TR2. 64C/128T running at 3.0+ GHz for a fraction of the price of the same core/thread EPYC setup? YES PLEASE!
 
Do you have an example of a workload where this is reasonably the case? From what I have seen so far, either the operating system or the benchmark application itself appeared to not scale properly.

Uh, I'm getting at the 'not scale properly' problem; if an application attempts to use cores with remote memory access the same way it uses cores with local memory access, there are going to be performance issues. Whether that's the Windows scheduler not holding the application's hand or the application just not accounting for the difference is certainly a point of concern, but the overriding issue is that a processor with cores that are not directly connected to main memory is being used in a 'consumer' environment.
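
To put something concrete behind that: here's a rough Linux-only sketch of my own (not anything from this thread or the review) showing how an application, or whoever launches it, can at least see which cores have local memory before deciding where the memory-hungry threads land. On a 2990WX, two of the four NUMA nodes typically report no local memory at all:

[CODE]
/* Rough sketch (mine): list each NUMA node's CPUs and local memory on Linux.
 * On a 2990WX, two of the four nodes usually show 0 kB of local memory. */
#include <stdio.h>
#include <string.h>

int main(void) {
    char path[128], line[256];
    for (int node = 0; node < 16; node++) {
        /* Which CPUs belong to this node? */
        snprintf(path, sizeof path, "/sys/devices/system/node/node%d/cpulist", node);
        FILE *f = fopen(path, "r");
        if (!f)
            break;                               /* no more nodes */
        if (fgets(line, sizeof line, f))
            printf("node%d cpus: %s", node, line);
        fclose(f);

        /* How much memory is attached to this node? */
        snprintf(path, sizeof path, "/sys/devices/system/node/node%d/meminfo", node);
        if ((f = fopen(path, "r")) != NULL) {
            while (fgets(line, sizeof line, f))
                if (strstr(line, "MemTotal"))
                    printf("  %s", line);
            fclose(f);
        }
    }
    return 0;
}
[/CODE]

Whether it's then the scheduler's job or the application's job to act on that information is exactly the argument above.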
 
Beast Mode: ON

That 2990WX is fully unchained! I'm really, really looking forward to the coming couple/few years as professional applications and game engines are optimized to take advantage of that number of cores/threads.

My only minor nitpick: I wish AMD had enabled dual socket SMP for TR2. 64C/128T running at 3.0+ GHz for a fraction of the price of the same core/thread EPYC setup? YES PLEASE!

AMD does not want to give away the market that buys EPYCs to one that can buy TRs for a lot less. Thus I very much doubt you will ever see multi-socket TRs. AMD is making some massive headway in the server market with EPYC, which is pretty much trouncing Intel. They won't jeopardize the huge server market for a fringe group that wants to run multi-socket TRs.

That said, there is really nothing stopping you from running a less-than-enterprise-class server on a 32-core/64-thread TR. It would handle server duties just fine as long as you are not expecting to need ECC memory or memory-access-heavy server apps.
 
Would it be fair to just swap 'Infinity Fabric (IF)' for 'uncore'? I get that the IF is just one part of uncore, albeit probably the most prominent relative to power draw, but what we're really talking about is how non-compute power scales considerably when the number of cores scales up, right?

While related, context matters. When routing a few connections on a less complex chip, you could blend the IF and uncore. Topologically, it's hard to lump all this into one comparable unit. The longer the lines, the more power they use. Why? Combinations. If you're using a point-to-point model where each connection is distinct and routeable:

nCr (specifically, the number of pathways between n objects is nC2 = n(n-1)/2):

2 = 1
3 = 3
4 = 6
5 = 10
6 = 15
7 = 21
8 = 28

Pathways can be directional and shared; this cuts down on the complexity of things, but now they must be arbitrated. Intel's ring and mesh are a good representation of certain tradeoffs between complexity and distance.

Threadripper1 is 2 chips linked together. You have one pathway. Threadripper2 is 4 chips with 6 pathways. Thus 6x.

This can apply to almost any topology.
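
If anyone wants to sanity-check the pathway math, here's a throwaway snippet of mine that prints the same table from the nC2 formula:

[CODE]
/* Toy snippet (mine): the number of distinct point-to-point links between
 * n dies is nC2 = n*(n-1)/2. */
#include <stdio.h>

static unsigned links(unsigned n) {
    return n * (n - 1) / 2;
}

int main(void) {
    for (unsigned n = 2; n <= 8; n++)
        printf("%u dies -> %2u links\n", n, links(n));
    /* Threadripper 1: 2 dies -> 1 link.  Threadripper 2 (WX): 4 dies -> 6 links. */
    return 0;
}
[/CODE]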

Now to the crux of all this. Is "uncore" getting bigger? Relative to other things as they get smaller? Yes.

Is it fair to say Anandtech's numbers are wrong or off when they explained in great detail and context what they were representing? No.

AMD is not going to win an uncore war against a ring or mesh bus. They are just different beasts. It's a tradeoff.
 
But we use the crap out of SSE2 today, for everything.

These instruction additions are meant to address x86's (and mostly x87's) inherent deficiencies and are doing a damn fine job of it. AVX512 continues down that road, and for workstation tasks, would be something highly desired. Hard to imagine building a compute-heavy application today without considering its use, especially given the latency and bandwidth benefits of having it done locally versus pushing it across the PCIe bus to GPU(s).

[Edit] And I'll add: I'd love to see the 28-core Intel compared to the 32-core AMD with an AVX512-optimized workload. Hopefully compiled separately for each, with optimizations for each, such that the difference between the two can be reasonably sussed.

Please name three useful applications that use AVX 512 instructions.

AMD could have implemented AVX 512; however, it takes up a lot of silicon real estate. Most tasks that AVX 512 instructions can be used for can be offloaded and handled way better and faster by a GPU. Intel does not make GPUs (their integrated graphics crap doesn't count), so of course, Intel has a vested interest in offering similar functionality in their CPUs, albeit way slower and way less efficient. AMD, on the other hand, makes the Radeon Instinct line, for example. So take a wild guess why for AMD it makes no sense to implement AVX 512. If you're still in doubt, remove any AVX multiplier limit in your BIOS (in case you have a Skylake-X CPU), fire up Prime 95, and watch your CPU temperature and power consumption go through the roof.

If management at Intel has any common sense, they will let that 28-core Xeon be. That presentation at Computex was reactionary, and it was meant to upstage AMD's own 32-core Cinebench scores, which are lower than those of Intel's overcooked Xeon chip. Nothing more. Intel can either hold out and hang around the mid-range until they can manufacture at 10nm, or they can follow AMD's lead and implement MCM in their designs. What do you think would be more cost-effective and profitable for Intel?
 
Please name three useful applications that use AVX 512 instructions.

Ouch!

AMD could have implemented AVX 512

And not only should they have, they will.

Intel does not make GPUs

They don't make discrete GPUs

Intel does not make GPUs (their integrated graphics crap doesn't count)

It absolutely does. Not only can you game on them (I do, and I have a 1080Ti in my main rig!), but they're fully featured and the drivers work very well.

so of course, Intel has a vested interest in offering similar functionality in their CPUs, albeit way slower and way less efficient.

You're entirely skipping over what happened to their second try at discrete GPUs: it became the very effective Xeon Phi accelerators, which also make heavy use of AVX. And unlike GPUs, both Intel Xeon CPUs and Xeon Phi accelerators support branching code running natively on each compute core.

So take a wild guess why for AMD it makes no sense to implement AVX 512.

Honestly, and in the interest of fair evaluation, they likely just hadn't gotten around to it. I get why they likely focused on refining other parts of Zen 1, especially given just how rough the memory controller was at release.

If you're still in doubt, remove any AVX multiplier limit in your BIOS (in case you have a Skylake-X CPU), fire up Prime 95, and watch your CPU temperature and power consumption go through the roof.

I have, many times before!

And it tells me that AVX512 is rather easy to put to use, and that it's capable of doing some real work.


Now, all of this is just to say that it's something that would have been nice for AMD to have included, and it's something that they lack in the workstation space. Again, if I were building apps that needed to do compute on the CPU, I'd want the latest AVX support in my code. If I were buying a workstation, I'd have to balance AVX512 support against a few more cores (at lower clockspeeds and lower IPC).

For myself, I'd probably still choose TR2, but that's because I have no personal need for these parts. Work is another matter entirely.
 
I went almost 10 years. Sorry to take this off topic again.
Just looked, pretty easy with only 65 ;) That looked like a worthwhile post though; saved it to call out BS.

Back on topic: wish I had the need/money for one of these, maybe just to play with.
 
Sorry if I am missing it, but was there any followup on the MSI MEG X399 Creation motherboard issue (not sure if you're allowed to share what the issue was or not)? I want to pull the trigger on this thing, but that somewhat cryptic note about it breaking has me concerned.
I had corrupted the UEFI on the board. Not sure how I did it, but I did. I used the UEFI Flashback feature with a USB stick and I was back up and running. Doing 2950X OCing now.
 
I haven't used Hyper-V in years. I was planning on giving that a shot. I know VMware fairly well, so that's where my comfort zone lies.

Really like Hyper-V now due to the simplicity and features, plus better networking and less downtime. Nothing against VMware, as it served us well, except their pricing strategy. We switched from VMware to Hyper-V a couple of years ago at our business and have no regrets. I wouldn't have done that 4 years ago.
On another note, I would love to see some type of multi-application benchmark. I've never seen someone produce what I would call a real-life scenario where you could benefit from all these cores.
For home use, I like running multiple VMs while still using Windows 10 as the host. I don't think I'm out of the norm for this type of configuration either. Some encoding going on in the background while working in Photoshop, etc.
 
My overclocked i7-920 at 4.0 GHz with a 5970 needed every bit of 1000 W almost 10 years ago... not sure what all the fuss is about systems using that much power.
 
I've pretty much only used Blender on Linux until very recently. Since I'm most familiar with Linux, I'm much more comfortable with it as a platform. All of the benchmarks I've seen with Blender on the 2990WX show Linux to be the best-performing OS with that processor. Given that I'm most comfortable with various Linux distros compared to Windows, and that the 2990WX benchmarks better on Linux, I'll just stick with Linux when I build a 2990WX system. As always, and as specifically indicated in my previous post, YMMV.

https://www.phoronix.com/scan.php?page=article&item=2990wx-linux-windows&num=1
There you go. Give us some scope. Phoronix, which does GREAT Linux content, showed a 14% decrease in render time in Blender. (I will type it out here to add value to the thread with data.) I would suggest that Windows is having some big scheduler issues with this CPU, and we will likely see that gap close considerably. I hope they show us the same tests with the 2950X.
 
I realize that time is money, but all that drama over 14% on a newly-released part benchmarked on a consumer OS?

Wild.
 
Sure, but are you replacing your production machine with just-released hardware?
Businesses constantly replace machines every year; it's not like every machine is on the same timeline. And with Intel not keeping sockets around long, these companies plan on buying fully new systems because of the ecosystem that Intel has created. You can buy four 2990WX boxes for the upgrade/replacement cost of a new Xeon workstation.
 
After reading and watching people trying to overclock this 2990WX chip, as well as the theoretical bits on power draw and heat output, it feels like the 2990WX is exceeding what the current "extreme" parts have to offer.

I watched buildzoid's MSI mobo breakdown
Theoretically, the 2990WX can exceed 2x EPS 8pin power connectors... which do 480W each...

So, over 960W of power draw on the CPU alone... that's insane
The MSI board has 1160W? of power delivery, and that's potentially not enough either?... that's insane
And when you have 960W+ of heat from the CPU to dissipate, even LN2 will run into thermal issues... that's insane :eek:

So... like... On water, I'm guessing you would need a peltier and water chiller or two to keep the 2990WX from thermal throttling when overclocked

TLDR: Mind = Blown
 
After reading and watching people trying to overclock this 2990WX chip, as well as the theoretical bits on power draw and heat output, it feels like the 2990WX is exceeding what the current "extreme" parts have to offer.

I watched buildzoid's MSI mobo breakdown
Theoretically, the 2990WX can exceed 2x EPS 8pin power connectors... which do 480W each...

So, over 960W of power draw on the CPU alone... that's insane
The MSI board has 1160W? of power delivery, and that's potentially not enough either?... that's insane
And when you have 960W+ of heat from the CPU to dissipate, even LN2 will run into thermal issues... that's insane :eek:

So... like... On water, I'm guessing you would need a peltier and water chiller or two to keep the 2990WX from thermal throttling when overclocked

TLDR: Mind = Blown
Well, that's 960 watts per the spec. They can go past that; they just might melt if you aren't careful.

Though yes, crazy amounts of power if pushed, in all the best ways of course.
 
After reading and watching people trying to overclock this 2990WX chip, as well as the theoretical bits on power draw and heat output, it feels like the 2990WX is exceeding what the current "extreme" parts have to offer.

I watched buildzoid's MSI mobo breakdown
Theoretically, the 2990WX can exceed 2x EPS 8pin power connectors... which do 480W each...

So, over 960W of power draw on the CPU alone... that's insane
The MSI board has 1160W? of power delivery, and that's potentially not enough either?... that's insane
And when you have 960W+ of heat from the CPU to dissipate, even LN2 will run into thermal issues... that's insane :eek:

So... like... On water, I'm guessing you would need a peltier and water chiller or two to keep the 2990WX from thermal throttling when overclocked

TLDR: Mind = Blown

I am no expert, but I think Buildzoid is wrong on that 480W number for an 8-pin connector.

I have seen multiple places that say 336W:
https://forums.servethehome.com/ind...r-supply-with-dual-8-pin-eps-connectors.8371/
http://www.jonnyguru.com/forums/showthread.php?p=97924

4 of the pins are ground (black), and the other 4 carry power (yellow).
Multiple people are saying 7 amps is the safe level for each pin.
4 pins x 7 amps x 12 V = 336 W, so the math makes sense.
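
And for anyone who wants to check the arithmetic both ways, a trivial snippet of mine; the 480W figure only works out if you assume roughly 10 amps per pin instead of 7:

[CODE]
/* Back-of-the-envelope EPS connector math (my snippet, using the numbers
 * quoted above): wattage = live pins * amps per pin * 12 V. */
#include <stdio.h>

static double eps_watts(int live_pins, double amps_per_pin) {
    return live_pins * amps_per_pin * 12.0;
}

int main(void) {
    printf("One 8-pin at 7 A/pin:   %.0f W\n", eps_watts(4, 7.0));      /* 336 W */
    printf("One 8-pin at 10 A/pin:  %.0f W\n", eps_watts(4, 10.0));     /* 480 W */
    printf("Two 8-pins at 7 A/pin:  %.0f W\n", 2 * eps_watts(4, 7.0));  /* 672 W */
    return 0;
}
[/CODE]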
 
Multiple people are saying 7 amps is the safe level for each pin.

This is going to be relative; I would probably agree (if I knew the spec) that 7A is probably a good 'safe' level of current, but every step along the way can be overbuilt.

To that end, with a motherboard and a PSU that are both at least marketed for overclocking, 'safe' might be a good bit higher.

Of course, the only way to know is the [H] way: hook it up and see how many amps it takes to pop or ignite something :D
 
I think in an art or production department rendering scenes all day, I'm not gonna cheap out on my rendering equipment; I'd go with either (multi-socket) EPYCs or Xeons. Time is money ;)

So brand loyalty renders faster than actual price/performance?
 
I think in an art or production department rendering scenes all day, I'm not gonna cheap out on my rendering equipment; I'd go with either (multi-socket) EPYCs or Xeons. Time is money ;)
Well, I have actually talked to some guys that do this for a living lately, and you are talking about a HUGE delta in price. Using your logic sounds fine until you actually consider having to pay for it and how much those platforms are costing you per minute.
 