AMD Ryzen Threadripper 2990WX & 2950X CPU Review @ [H]

Well, that's the sort of thing I'm working through. While I have a background dealing with VMware, I haven't been using it for playing games.

GPU passthrough in ESXi for gaming is not very good, and most likely you'll run into issues. It's fine if you are passing through a Quadro or GRID GPU for compute workloads, but graphics are a whole other challenge. The VM will need to be set to hide the hypervisor so that it thinks it's running on bare-metal hardware. Even then, certain feature sets of the GPU will not function. It'll be laggy at best for anything more demanding than desktop applications.

For the CPU, you can try setting affinity or a resource reservation.
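
If it helps, here's a rough sketch of the relevant .vmx entries (the values are illustrative; check VMware's documentation for your ESXi version):

    # Hide the hypervisor from the guest, so GPU drivers that refuse to
    # load inside a VM (e.g. GeForce) think they're on bare metal
    hypervisor.cpuid.v0 = "FALSE"
    # Pin the VM's vCPUs to specific host logical CPUs (illustrative list)
    sched.cpu.affinity = "0,1,2,3"
    # Reserve CPU time for the VM, in MHz (illustrative value)
    sched.cpu.min = "8000"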

If you want some help, feel free to hit me up. I live in VMware daily and have experience doing what you're trying to accomplish.
 

Have you and Dan_D seen LTT's build?
 
Great review as always, [H]! Definitely an interesting read, and it comes at a great time. Our Additive guys are looking for a couple more simulation machines, and this 32-core beast might just be it. Man, all those simulation programs we run love cores :D
 
using unRAID. I was talking about ESXi.

Quite honestly, ESXi is the least favorable choice for GPU passthrough. Hyper-V is what I would recommend.

I'm aware that they didn't use ESXi, but hey, they did it and did it years ago :).

Also, I guess I should look into Hyper-V for GPU passthrough. I use it for VM development on my desktop and deploy to my server for testing, but I haven't tried throwing a gaming OS in a Hyper-V VM yet.
 
Yes, please! I know it's not the sort of thing [H] usually does, but I'd love to see some virtualization numbers.

Me too, but my experience is that while VMware ESXi was the big one years ago, in the last few years I've seen more and more organizations and individual users migrating to KVM. EVERYONE seems to be dumping VMware in the last couple of years.

Based on this, I'd argue that KVM-based results would probably be more relevant than ESXi (or vSphere Hypervisor, or whatever silly name they're calling it these days).
 
using unRAID. I was talking about ESXi.

Quite honestly, ESXi is the least favorable choice for GPU passthrough. Hyper-V is what I would recommend.

The problem with Hyper-V is that your host needs to be Windows. That kind of defeats the entire purpose of a GPU passthrough gaming system.

The reason you do this is that you want to run some sort of *nix system but still be able to game in a VM.

If I were doing it today, my first try would probably be KVM on an Ubuntu-based distribution.
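
For anyone curious, the usual KVM/VFIO recipe on Ubuntu boils down to something like this (the PCI IDs are illustrative; pull your own GPU's from lspci -nn):

    # /etc/default/grub -- enable the IOMMU and reserve the guest GPU for vfio-pci
    # (amd_iommu for Threadripper; intel_iommu=on on Intel platforms)
    GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt vfio-pci.ids=10de:1b81,10de:10f0"

    # /etc/modules -- load the VFIO modules early so the host driver never claims the card
    vfio
    vfio_iommu_type1
    vfio_pci

Then update-grub, update-initramfs -u, reboot, and hand the card to the VM through virt-manager or QEMU's -device vfio-pci,host=0a:00.0.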
 

Hyper-V Server is not Windows front-end based, and Core is CLI-only. But the way it handles hardware passthrough to Windows-based VMs is much better than ESXi's.
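
On Server 2016 that's Discrete Device Assignment, which is all PowerShell; a rough sketch (the location path and VM name are illustrative; get the real path from Get-PnpDevice):

    # Detach the GPU from the host, then assign it to the VM
    $gpu = "PCIROOT(40)#PCI(0100)#PCI(0000)"   # illustrative location path
    Dismount-VMHostAssignableDevice -Force -LocationPath $gpu
    Add-VMAssignableDevice -LocationPath $gpu -VMName "GamingVM"
    # Give the guest room for the GPU's MMIO BARs
    Set-VM -VMName "GamingVM" -GuestControlledCacheTypes $true `
        -LowMemoryMappedIoSpace 3GB -HighMemoryMappedIoSpace 33280MB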
 
Take their numbers with suspicion; they are reporting as "IF" power what actually corresponds to other power planes, such as the I/O or memory controllers.

That said, moving data is the big problem in computer microarchitecture today. In the past, most of the power budget went to the execution units; today, moving data to and from the execution units takes most of it. There were studies that predicted this was going to happen:

View attachment 96046

And the problem will only get worse at future nodes (5nm, 3nm, ...). Engineers will have to rethink future CPUs to work around the wire problem.
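
To put rough numbers on it (these are in the ballpark of Horowitz's ISSCC 2014 "Computing's Energy Problem" keynote, for ~45nm, so treat them as order-of-magnitude only):

    E(double-precision FLOP)  ~ 20 pJ
    E(64-bit DRAM access)     ~ 1.3-2.6 nJ
    ratio                     ~ 65-130x

In other words, fetching an operand from DRAM can cost around two orders of magnitude more energy than computing with it.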

Dude, you seem to be drawing the wrong conclusion for some purpose. FUD, maybe?

The odd, unsourced, unfinished graph shows things getting better, not worse.

I have no doubt there will be inefficiencies in an MCM, especially as more weight is placed on the subsystem, but this is never going to be a fundamental flaw.
 
May I make a suggestion for the video encoding test? By the devs' own admission, Handbrake isn't much good at utilizing many cores with x264 and x265.

Lots of people still encode using x264 and x265, but other programs run multiple instances of them to fully saturate the CPU and then combine the pieces. I know at least Ripbot264 can do this, and it can even distribute the work across networked computers. It also has most of the features Handbrake has. I know of x264 and x265 benchmark programs as well, but I'm checking now on their features.
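
The chunked approach is simple enough to sketch by hand, too; something like this (file names and chunk length are made up):

    # Split on keyframes into ~60s chunks, encode them in parallel,
    # then losslessly concatenate the results
    ffmpeg -i source.mkv -c copy -f segment -segment_time 60 chunk_%03d.mkv
    ls chunk_*.mkv | xargs -P 8 -I {} \
        ffmpeg -i {} -c:v libx265 -crf 20 -c:a copy enc_{}
    printf "file '%s'\n" enc_chunk_*.mkv > list.txt
    ffmpeg -f concat -safe 0 -i list.txt -c copy output.mkv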

Thanks for your hard work Kyle.
 
I think it is just as important, if not more important, to show what does not work as what does.
 
I'm aware that they didn't use ESXi, but hey, they did it and did it years ago :).

Also, I guess I should look into Hyper-V for GPU passthrough. I use it for VM development on my desktop and deploy to my server for testing, but I haven't tried throwing a gaming OS in a Hyper-V VM yet.

I haven't used Hyper-V in years. I was planning on giving that a shot. I know VMware fairly well, so that's where my comfort zone lies.
 

Hey, they're all outside of my comfort zone :D.

I've been using it because it's seamless between my desktop and my Server 2016 homelab server, and I like that it lets me run Windows on bare metal with the most hardware passed through.

But this is homelabbing, not production work!
 
May I make a suggestion for the video encoding test? By the devs' own admission, Handbrake isn't much good at utilizing many cores with x264 and x265.

Lots of people still encode using x264 and x265, but other programs run multiple instances of them to fully saturate the CPU and then combine the pieces. I know at least Ripbot264 can do this, and it can even distribute the work across networked computers. It also has most of the features Handbrake has. I know of x264 and x265 benchmark programs as well, but I'm checking now on their features.

Thanks for your hard work Kyle.

StaxRip can load all those cores easily (I think), but not many people know about StaxRip.
 
The benchmark numbers for the 2990WX are all over the place, which is rather disappointing, though the Linux benchmarks are rather impressive. Hopefully AMD and Microsoft will work together to optimize for Threadripper 2, because right now the 2990WX is an immature product that needs patches and optimization to reach its full potential.
Wouldn't that be Windows, specifically Windows' scheduler, that is immature?
 
GPU passthrough in ESXi for gaming is not very good, and most likely you'll run into issues. It's fine if you are passing through a Quadro or GRID GPU for compute workloads, but graphics are a whole other challenge. The VM will need to be set to hide the hypervisor so that it thinks it's running on bare-metal hardware. Even then, certain feature sets of the GPU will not function. It'll be laggy at best for anything more demanding than desktop applications.

For the CPU, you can try setting affinity or a resource reservation.

If you want some help, feel free to hit me up. I live in VMware daily and have experience doing what you're trying to accomplish.

I never used ESXi, but KVM+QEMU on Linux works excellently for GPU passthrough. I do all my gaming in a Windows VM, and there is about a 3% performance hit compared to native. I also have a macOS VM with another GPU passed through running on the same PC for certain applications I need to run. So that's three OSes running simultaneously on my 1950X.
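
For reference, the libvirt tweaks that make a Windows gaming guest behave are roughly these (the vendor_id string is arbitrary and the PCI address is illustrative):

    <features>
      <hyperv>
        <!-- any 12-char string; works around the NVIDIA driver's VM check -->
        <vendor_id state='on' value='0123456789ab'/>
      </hyperv>
      <kvm>
        <!-- hide the KVM signature from the guest -->
        <hidden state='on'/>
      </kvm>
    </features>
    <!-- under <devices>: the passed-through GPU -->
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
      </source>
    </hostdev>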
 
Not if it's optimized for monolithic designs but still needs work for AMD's newer 'distributed' architecture. AMD made the change, so AMD bears the burden :).
Poor Microsoft is not to blame, but free Linux can get it right. Sure, must be the technology's fault.
 
I never used ESXi, but KVM+QEMU on Linux works excellently for GPU passthrough. I do all my gaming in a Windows VM, and there is about a 3% performance hit compared to native. I also have a macOS VM with another GPU passed through running on the same PC for certain applications I need to run. So that's three OSes running simultaneously on my 1950X.

Thanks for that.

I actually just posted a thread in the virtualization subforum asking about this very topic!
 
Poor Microsoft is not to blame, but free Linux can get it right. Sure, must be the technology's fault.

Because the focus here is exactly the same?

This isn't MS vs. Linux. This is hardware and software integration.

It's also not what's been reviewed here.
 
There have been talks about needing to improve Windows' scheduler for the past decade or so. This is about Microsoft's failure to improve the things that matter, instead of focusing on things that are useless to consumers.

And if MS got its act together, the rest of the software industry that is still lagging would be more likely to improve, giving us better programs, especially now that CPUs are getting wider rather than faster per core.
 
Microsoft has a documented history of poor multi-core/multi-thread scaling in Windows, and it tends to address problems only when prominent users, or many users, are affected.

Chromium developers encountered such bugs when they upgraded their build systems to 24 cores:
https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and-i-cant-move-my-mouse/
https://randomascii.wordpress.com/2018/02/11/zombie-processes-are-eating-your-memory/
https://randomascii.wordpress.com/2018/02/25/compiler-bug-linker-bug-windows-kernel-bug/

I would not expect any quick fix from Microsoft until major websites start pointing out how badly Windows runs compared to Linux on the 2990WX.
 
There have been talks about needing to improve Windows' scheduler for the past decade or so. This is about Microsoft's failure to improve the things that matter, instead of focusing on things that are useless to consumers.
Considering we see great scaling up to 16C/32T on Win10 right now, I think it is sort of disingenuous to call this a failure on MS's part. There has never been a reason for its desktop OS to support 32C/64T before.
 
Great review as always.
I must say the 2990WX and 2950X are both outside of my needs and usage type, BUT that 2950X is damn sexy.
So tempted to jump onto this platform.
 
The issue really boils down to people missing the good old days before MS started "artificially segmenting" its OSes.

Whether or not that was necessary is not a can of worms I'm going to open, but there has always been a general sentiment that there should really be only one version of Windows that does everything. I remember when a simple registry hack could "convert" Windows NT 4 Workstation to Windows NT 4 Server, and I believe people still think the various Windows builds are that closely related.

They really aren't. Ever since Windows was fully modularized in Windows 8, the different editions have been VERY different, and people need to come to grips with this. It is not reasonable to have expected MS to roll full support for 32-core/64-thread processors down to the regular Windows Home and Pro versions. Didn't they make a HEDT version of Windows 10 Pro?
 
Windows 10 Pro 64-bit supports up to 256 cores per CPU. I hope that we can put this to rest now.

That being said, the scheduler might be absolute garbage in Windows 10 Pro and only improve as you move up to Enterprise (I'm speculating here). If anyone has concrete information about this, please share. As far as Windows 10 Home is concerned, I would never consider it for anything.

The Linux kernel has a far superior scheduler, and Red Hat made sure of that around 2001-2002. Red Hat wanted the Linux kernel to scale properly, so it funded the development of a new scheduler. Maybe some of you remember the compatibility issues it caused back then, or how certain distros stuck with older kernels for new releases in order to avoid them.

I can't find the 2990WX in stock anywhere. I believe AMD is limiting supply to increase demand; good for them. They should have released the 2950X now as well, instead of making people wait until the 31st of this month. Oh well, I guess they want to sell some more 1950X CPUs. People will buy the 1950X regardless, because while it's only about 6% slower than the 2950X, it can be had for $600 to $700.

Just my two cents...
 
I corrected my post above about ECC. Very nice to see support from AMD!

As for Intel not supporting it: I'm not trying to be 'fair' to AMD here; I am (or was) trying to point out a niche that AMD could fill that Intel does not. And it looks like AMD has!

[A use case I've considered recently: using Threadripper as a large NAS. All of those PCIe lanes could be used for controllers for tiered storage, that is, fast NVMe caches (perhaps including Optane) backed by high-capacity SATA SSDs and even higher-capacity NAS/enterprise spinning disks, while simultaneously supporting local application-server VMs. The major point here is that high-usage storage arrays essentially demand ECC RAM to keep the bitrot out.]



While I will agree that AMD does have 'workstation-class' processors, they don't appear to market Epyc as such, and Epyc tops out at a pretty low boost speed: only 3.2GHz. One of Threadripper's attractions is that 4.4GHz peak boost on a few models, and >4.0GHz across the whole lineup.
Understood, sorry for having a rag.
I think the 7nm Epyc is going to be far better suited to workstations, with higher clocks; however, it does have more IMCs to drive, which eats into the TDP, as Juan pointed out with the cost of moving data. A 65nm interposer carrying the IF links may help with this in the future by reducing die space and spreading out heat generation.
A TR NAS as you describe would be something quite impressive to see. And you make a great point; bitrot has been something I've worried about for years without any major run-ins, yet... touch silicon!
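
For what it's worth, a tiered ZFS layout along those lines might look like this (device names are made up):

    # Bulk tier: RAIDZ2 across six spinning disks
    zpool create tank raidz2 sda sdb sdc sdd sde sdf
    # Sync-write log (SLOG) on mirrored NVMe/Optane
    zpool add tank log mirror nvme0n1 nvme1n1
    # Read cache (L2ARC) on another fast SSD
    zpool add tank cache nvme2n1

ZFS checksums every block, so with ECC RAM on top, the bitrot gets caught end to end.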
 
Windows 10 Pro 64-bit supports up to 256 cores per CPU. I hope that we can put this to rest now.

That being said, the scheduler might be absolute garbage
If my memory serves me right, the Athlon 64 X2/Opteron 16x had exactly the same issues. I remember having to run a patch for this; they were some of the first and most popular dual-cores and got similar BS from MS.
 
...really?

That statement wasn't meant in a negative way. Look at the 1950X supply, for example: everyone still has plenty of stock. Never mind the 1900X, which no one wants. I think that this time around they are being more careful, especially with the 2990WX supply. If they limit the supply, they won't have to cut prices in the future to move the merchandise. Other companies do it as well... *cough* NVIDIA *cough*
 
If my memory serves me right, the Athlon 64 X2/Opteron 16x had exactly the same issues. I remember having to run a patch for this; they were some of the first and most popular dual-cores and got similar BS from MS.

AMD Dual Core Optimizer for XP. ;)
 
If my memory serves me right, the Athlon 64 X2/Opteron 16x had exactly the same issues. I remember having to run a patch for this; they were some of the first and most popular dual-cores and got similar BS from MS.

Your memory serves you very well. I got the Athlon 64 X2 3800+ as soon as it came out. I loved that CPU. Later on, I also ordered an Opteron from "The Tank Guys" (if anyone remembers them) and tried to get it to 3.0GHz.
 
AMD Dual Core Optimizer for XP. ;)
Boom! Thank you. Cannabis can't be too bad for your memory, then ;)

Your memory serves you very well. I got the Athlon 64 X2 3800+ as soon as it came out. I loved that CPU. Later on, I also ordered an Opteron from "The Tank Guys" (if anyone remembers them) and tried to get it to 3.0GHz.
History repeats itself, they say... Just another example of it!
Those were my favourite CPUs of all time: absolute beasts that made Intel look like a hot, underperforming POS. The only things Intel had going for it were compiler fuckery and SSE2 video applications, a bit like how they're trying with AVX-512 today, except that's hardly used at all, so they can't pull the same trick. Whoops.
I had the air-cooled world record for Opteron 165s, 2.84GHz or something at the time, with a hand-picked stepping, and it took two weeks of tweaking on a DFI NF3UD. I was running the same Scythe Ninja I have today; that thing is bulletproof.
 
Dude, you seem to be drawing the wrong conclusion for some purpose. FUD, maybe?

Cut the usual crap. I am saying that the numbers given by Anandtech aren't correct. The IF is not as power-hungry as they claim; they are reporting as IF power the power being used by other planes in the SoC.

I am pointing out a mistake in their review and saying that the IF is better than their numbers show.

The odd, unsourced, unfinished graph shows things getting better, not worse.

You don't understand the graph. It represents a well-known problem.

I have no doubt there will be inefficiencies in an MCM, especially as more weight is placed on the subsystem, but this is never going to be a fundamental flaw.

Who ever mentioned MCM? Did you even read my post?
 
Poor Microsoft is not to blame, but free Linux can get it right. Sure, must be the technology's fault.

Linus doesn't seem convinced the problem is on Microsoft's side:

https://www.realworldtech.com/forum/?threadid=179265&curpostid=179281
https://www.realworldtech.com/forum/?threadid=179265&curpostid=179333

It seems to me that the 2990WX performs better in the Phoronix review because the suite uses many microbenchmarks and toy-like workloads that fit into cache and avoid the latency/bandwidth penalties on the compute dies.
 
Cut the usual crap. I am saying that the numbers given by Anandtech aren't correct. The IF is not as power-hungry as they claim; they are reporting as IF power the power being used by other planes in the SoC.

Cut the usual crap? I'm not the one posting obscure, unreferenced graphs. Seriously, look at what you posted.

I am pointing out a mistake in their review and saying that the IF is better than their numbers show.

They know what they're doing. They provide guidance on what they did, how they interpret the data, and what inferences should be drawn from it.

Edit: I'm pulling back some of my harshness.

You don't understand the graph. It represents a well-known problem.

Now this could be true. You provided little guidance, zero references, and a bunch of conclusions without a logical path. You should post for the audience, not for what you believe is well known. If it causes confusion, that's on you.

So: the blue line is some point in the past, when this obscure graph was made. The red line is "now", which was the future back then. The deltas shrink both very close to the processor and very far from it, while staying the same for near and on-chip routing.

This is not going to change. It's a routing issue: the greater the number of connections, the more pathways there are and the more power is needed to drive them. They may get more efficient, like networking did.

Who ever mentioned MCM? Did you even read my post?

I did. This seems to be the area you're focused on: between the edges of the cores and the memory system. The Infinity Fabric!!!

The implication drawn concerns the way AMD set up their chip as a multi-chip module, and how that relates to this issue of power.
 
I did. This seems to be the area you're focused on: between the edges of the cores and the memory system. The Infinity Fabric!!!

Would it be fair to just swap "Infinity Fabric (IF)" for "uncore"? I get that the IF is just one part of the uncore, albeit probably the part most prominent in power draw, but what we're really talking about is how non-compute power scales considerably as the number of cores scales up, right?
 