To put that into perspective, that's what my ageing dual Westmere-EP system scores under Indigo bench!

In the videos there is mention that the CorePrio utility may also be beneficial to Intel CPUs.
 
Probably should have mentioned, I'm not running Windows.

Uploading a quick test video, hopefully it's not too crap. Sorry for the background noise, housemates playing COD and Geek Squad tech support lol.

I have found a second issue: if you don't restart Indigo between starting and stopping CorePrio, your results will not be consistent. It took me a few videos to figure out what was going on, and you can see it in this video, as I restart Indigo once to get it to work properly.

EDIT - For what it's worth, I wouldn't put it past this being a BIOS issue with the MSI motherboard I'm using (X399 Gaming Pro Carbon AC), as I can't disable SMT. Doing so results in no POST.

Crap video

View attachment 132934


Is it at all possible for you to install Ubuntu and run the same benchmark?
 

I'm not in a place I can watch your video, but I have the same Mobo/cpu. Very interested.
 
Can you set SMT to off and still POST?


Probably should have mentioned, I'm not running Windows.



Is it at all possible for you to install Ubuntu and run the same benchmark?

At the moment I can't do it, unfortunately, but if I find some time I might be able to.
 
It'd prove beyond all doubt as to whether your issue is the exact same issue or not. I don't think the benchmark is IO intensive so HDD speed shouldn't even matter...

[EDIT] I love your IRC client, I haven't seen that in years!
 
I used the HandBrake CLI and transcoded big_buck_bunny.mov from the Blender website to the Amazon 2160p HEVC preset in ~15 mins on Epyc (8 channels) to get a baseline. It was an appalling ~8.6ish frames per second.

Could you indicate the exact preset you used? Using one of the few 2160p60 HEVC 4K Surround presets on my ancient workstation (12c/24t) on the same movie places me in 15 minute territory as well.
 
It'd prove beyond all doubt as to whether your issue is the exact same issue or not. I don't think the benchmark is IO intensive so HDD speed shouldn't even matter...

[EDIT] I love your IRC client, I haven't seen that in years!

Had Mint kicking around on a USB thumb drive, is that good enough? Scored 1.9x vs 1.0/1.7x in Windows. Pretty sure that Windows just has a problem.

 
Not looking favorable. I'm hoping the xx00's and xx20's are not caught in this. I guess I will see when I get back to my TR system sometime next month.
 
Could you indicate the exact preset you used? Using one of the few 2160p60 HEVC 4K Surround presets on my ancient workstation (12c/24t) on the same movie places me in 15 minute territory as well.

Where is the file on the Blender site? I used a 4K Big Buck Bunny file that I downloaded a while ago and assumed was the same thing, and encoding it to H.265 2160p60 took ~8 mins on my aging dual X5675 PC running Ubuntu MATE 16.04.

Now I'm doubting whether I used the right preset?

One thing worth noting: htop showed near-perfect 100% utilization across all 12C/24T while running HandBrake from the GUI.

[EDIT] Just tried the Fire TV 2160p60 4K HEVC Surround preset, got ~the same time. Need to know the exact file/preset used.
 
I'll check when I can. What BIOS version are you using? One came out in like...November or so.

1.80 08/09/2018 AGESA SP3r2-1.1.0.1

I haven't updated in a while since this has been working well with my setup; I'd just be interested in killing SMT to test.
 
That's not a 50% performance drop, that test was highlighting the difference in kernels considering Spectre/Meltdown.
Please disregard what the Phoronix benchmark originally intended to highlight, that is inconsequential to the argument here. The performance delta between the still-somewhat-but-less-NUMA 7980XE and the very-much-so-NUMA 2990WX is the interesting part here.
 
I'm not going to disregard the Phoronix benchmarks at all, as they highlight the exact same performance issues experienced with the exact same Threadripper processors under Windows.
 
I'm not going to disregard the Phoronix benchmarks at all
Um, who told you to disregard the benchmark?

The benchmark result is what is interesting. It shows a weakness of Threadripper that cannot be addressed by switching to Linux.
The question that Phoronix originally tried to answer with the benchmark is not interesting.
 
That was probably the result of a fast read of your comment. It's ok.
 
The other thing I question regarding the video is where the presenter states that we're specifically talking about single-socket systems. Yes, it looks like one socket, but as far as I'm aware that's literally two AMD CPUs 'joined at the hip'; you can even see this by looking at the bottom of the CPU package. So, technically speaking, while the data paths between the two processors are naturally much shorter where NUMA is concerned, there are still two processor packages present, and therefore two individual sockets that simply look like one large socket?

So, Epyc, and Threadripper have 4 dies, not 2. (Only 2 active in 12/16 core Threadripper, though) The fact that the pins on the bottom of the processor package have bilateral symmetry has nothing to do with it, that's just how they designed the socket.

Would you agree that Threadripper and Epyc are essentially individual dies in the one package with dual sockets placed exceptionally close together to, in effect, appear as one socket?

So, logically, yes, but due to the physical layout the actual memory access penalty of going from socket to socket is a lot higher than just die to die in the same socket. This is explained in the video, and he is talking only about this specific scenario, not a multi-socket one which could be different.
 
And this is one of many reasons I believe that Microsoft is going to make a Windows that runs on the Linux kernel, a Windows X. Besides the whole Embrace, Extend and Extinguish thing, it'll probably be cheaper for them to use code that actually works and that isn't written by idiots.
So what OS have you made? You really think Windows could have been made by idiots? Roflmao.
 
Um, who told you to disregard the benchmark?

The benchmark result is what is interesting. It shows a weakness of Threadripper that cannot be addressed by switching to Linux.
The question that Phoronix originally tried to answer with the benchmark is not interesting.

That makes absolutely no sense whatsoever. Not only does Phoronix prove that the Threadripper issue is totally limited to the NT kernel, the video in the OP of this very thread also highlights the exact same point: that the issues related to Threadripper are totally isolated to the NT kernel and are not evident under Linux.

You told me to disregard the point of the Phoronix benchmark, I don't intend to do that as it highlights the exact same issue evidenced in the OP.

So, Epyc, and Threadripper have 4 dies, not 2. (Only 2 active in 12/16 core Threadripper, though) The fact that the pins on the bottom of the processor package have bilateral symmetry has nothing to do with it, that's just how they designed the socket.

It really doesn't look that way, that looks like two sockets joined at the hip to me to appear as one socket. I'll see if I can find some information relating to the exact pinouts. ;)
 
It's definitely 4 dies.
Check it out:
https://www.google.com/search?q=epyc+delidded&tbm=isch
https://www.google.com/search?q=threadripper+delidded&tbm=isch

The video actually talks about this, too. When he is at the whiteboard he shows the 4 dies, and how each of the 4 dies is connected to 2 memory channels on Epyc, while only 2 of the 4 dies on Threadripper have connections directly to memory (even in 32 core Threadripper with 4 active dies).
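
If it helps to picture it, here is a toy Python model of the layout described at the whiteboard. It is only a sketch: the die numbering and the dictionary names are made up for illustration, and which two Threadripper dies are the ones wired to memory is an assumption on my part (shown here as dies 0 and 2).

```python
# Toy model: number of memory channels attached directly to each die.
# Die numbering is illustrative only. Which two Threadripper dies own
# the memory channels is an assumption (dies 0 and 2 here).
EPYC = {0: 2, 1: 2, 2: 2, 3: 2}
THREADRIPPER_32C = {0: 2, 1: 0, 2: 2, 3: 0}


def dies_without_local_memory(channels_per_die):
    """Return the dies that can only reach memory through another die."""
    return sorted(die for die, ch in channels_per_die.items() if ch == 0)


def total_channels(channels_per_die):
    """Total memory channels across the whole package."""
    return sum(channels_per_die.values())


if __name__ == "__main__":
    for name, layout in (("Epyc", EPYC), ("Threadripper 32c", THREADRIPPER_32C)):
        print(name, "-", total_channels(layout), "channels,",
              "dies without local memory:", dies_without_local_memory(layout))
```

In this model every Epyc die has its own memory, while half of the dies on the 32 core Threadripper have to go through a neighbour for every memory access.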
 
Maybe I missed it in a glossing over in this discussion, but I wonder how it plays out in virtualization. I mean, if you've got 32 cores you could run 3 8-core VMs and keep 8 for the host. Yes, I know you lose some in the overhead and you can't put the power all towards a single task. But which would be faster - Having 32 cores work all on A-B-C-D or having 8 on A, 8 on B, 8 on C, and 8 on D, assuming you've got parallel tasks like many jobs to run that don't have to be done in a specific order of completion.

Or without virtualization simply running the application 4 different times on the same computer and locking specific cores to specific application instances?
 
That would depend on the workload. If your application (or the Windows thread scheduler) is going insane producing threads, then splitting the load up manually doesn't change anything, but yeah, splitting can be beneficial in some cases.
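
For anyone who wants to try the manual split on Linux, a minimal sketch in Python could look like the following. The helper names `split_cores` and `launch_pinned` are hypothetical, and it assumes that contiguous logical CPU ids sit on the same die, which is not guaranteed, so check your topology (for example with `lscpu`) first. `os.sched_setaffinity` is Linux-only.

```python
import os
import subprocess


def split_cores(cores, groups):
    """Split logical CPU ids into `groups` contiguous, equal-sized sets.

    Assumes contiguous ids share a die, which may not hold on your system.
    """
    cores = sorted(cores)
    if groups <= 0 or len(cores) % groups != 0:
        raise ValueError("core count must divide evenly into groups")
    size = len(cores) // groups
    return [set(cores[i * size:(i + 1) * size]) for i in range(groups)]


def launch_pinned(cmd, cpu_set):
    """Start `cmd` restricted to `cpu_set` (Linux only).

    The affinity is set in the child before exec, so every thread the
    application creates later inherits it.
    """
    return subprocess.Popen(
        cmd, preexec_fn=lambda: os.sched_setaffinity(0, cpu_set)
    )


if __name__ == "__main__":
    # Example: 32 logical CPUs split into four groups of 8 (A, B, C, D).
    for label, cpus in zip("ABCD", split_cores(range(32), 4)):
        print(label, sorted(cpus))
```

Each job would then be started with something like `launch_pinned(["./myjob"], group)`, where `./myjob` is a placeholder for whatever you are running. Whether four pinned instances beat one 32-thread run still depends on the workload.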
 