AMD Ryzen 7 Performance: Windows 7 vs. Windows 10

Just putting in my bit.
I do not own a Ryzen, so I can't test this (I wish I did).


It might be due to core parking. Because of Ryzen's high core count, SMT more easily hurts performance in more software.
Windows 7 with core parking sometimes parks every other logical core and thereby "disables" SMT software-wise. This gives an improvement in low-threaded software (thread count equal to or less than the physical core count).
But on Windows 10 core parking is disabled, so you run into the penalty of thread conflicts under SMT.

If this is the case, I'm working on a fix here:
https://hardforum.com/threads/amd-ryzen-game-performance-fix.1926435/page-2

If you would like to test it, BrianB, please let me know and I'll give you access to the beta and experimental builds, which also have fixes for some of the CCX switching issues.


SMT is finicky, but a rule of thumb is:
Threads <= physical cores: SMT HURTS performance
Threads > physical cores: SMT HELPS performance
That is just the nature of SMT.
 
As demonstrated by reviews, SMT-off only increases Ryzen average gaming performance by 1% and 3%. As demonstrated by reviews, 4+0 is on average only 5% faster than 2+2. As demonstrated by reviews, Ryzen is 2% faster on W10 than on W7. Meanwhile, AMD has given an official statement that (i) W10 works fine, (ii) there is no scheduler issue, and (iii) there is no SMT issue.

SMT always has an issue by its very nature. That does not mean SMT is bad, but it does mean that in some situations SMT will hurt performance.
Also, your numbers are wrong. Some of the gains from disabling SMT were above 15%, which is in line with doing the same on Intel CPUs under the right circumstances.
 

My numbers are from TechSpot's extensive testing of this issue. They tested 16 titles. Also, AMD posted an official statement admitting that, in general, there is no SMT problem with Ryzen and games:

Finally, we have investigated reports of instances where SMT is producing reduced performance in a handful of games. Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT. We see this neutral/positive behavior in a wide range of titles, including: Arma® 3, Battlefield™ 1, Mafia™ III, Watch Dogs™ 2, Sid Meier’s Civilization® VI, For Honor™, Hitman™, Mirror’s Edge™ Catalyst and The Division™. Independent 3rd-party analyses have corroborated these findings.
 

My game/Ryzen numbers are from various sites, and you would have seen that if you had cared to follow the provided link (you gotta love it when people want to counter-argue but don't read the evidence you provided), and from hands-on SMT testing on several dozen systems ranging across i3, i5, i7 and Xeons.
Also, please remember "you can't prove a negative", aka just because you find some games where it is not present does not mean it can't be present in other games.

Also, AMD saying there is no problem with SMT in general is correct, and it is probably meant in comparison to Intel's SMT implementation. They are basically saying "our SMT is not broken compared to Intel's", and they are right.
The problem is that SMT has a natural spot where it hurts, whether it be AMD or Intel. It's a simple drawback of the very nature of SMT.


It amazes me that after this many years people still talk about SMT and don't really get the basic design of it.
But let me illustrate how it goes. (Please ignore the CCX part of this, as I am simply using a Ryzen template to show it. Only the top half is important.)

This is a WORST-CASE scenario for running 4 threads on a 4-core system with SMT:
https://s11.postimg.org/jaa1jj6s3/SMT4_Tworstcase.png
Notice how the 4 threads only get access to 2 physical cores, aka a reduced set of physical cores to run on.
This is due to the Microsoft thread scheduler being round-robin based (with a heck of a lot of modifications and exceptions).
This is something that happens now and then. It is not the average or typical load distribution, but it DOES happen.


This is the optimal situation for 4 threads on a 4-core CPU with SMT (ignore the forbidden symbols for a moment):
https://s11.postimg.org/qozdbwsnn/SMT4_TAdjusted.png
Notice that now all 4 threads have a physical core each and thereby run faster.

As I wrote above, a good rule of thumb is:
Threads <= physical cores: SMT HURTS performance (because the optimum is running on all physical cores, so SMT provides nothing extra)
Threads > physical cores: SMT HELPS performance (because here we need something more than just the physical cores, and getting the leftovers from pairing 2 threads on one core gives us more resources)
That is just the nature of SMT.
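
To make the two pictures above concrete, here is a minimal Python sketch. It assumes logical cores 2k and 2k+1 are the two SMT siblings of physical core k, which is a common enumeration but not guaranteed on every system. The function names are made up for illustration and have nothing to do with the utility discussed in this thread.

```python
def place_threads(n_threads, physical_cores):
    """Pick logical cores so every physical core gets one thread
    before any core has to share its two SMT siblings.

    Assumption: logical cores 2k and 2k+1 belong to physical core k.
    """
    logical_cores = 2 * physical_cores
    if n_threads > logical_cores:
        raise ValueError("more threads than logical cores")
    first_siblings = [2 * core for core in range(physical_cores)]
    second_siblings = [2 * core + 1 for core in range(physical_cores)]
    return (first_siblings + second_siblings)[:n_threads]


def physical_cores_used(logical_ids):
    """Count the distinct physical cores behind a set of logical cores."""
    return len({logical_id // 2 for logical_id in logical_ids})


worst_case = [0, 1, 2, 3]          # 4 threads packed onto 2 physical cores
optimal = place_threads(4, 4)      # one thread per physical core
print(physical_cores_used(worst_case), physical_cores_used(optimal))
```

With 4 threads on a 4-core/8-thread CPU the packed placement touches only 2 physical cores, while the spread placement touches all 4. Once the thread count exceeds the physical core count, the second siblings start being used, which is where SMT begins to pay off.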


This is a general SMT issue in low-multithread situations on multicore CPUs, and it is not a sign of a bad SMT implementation. It's just a natural drawback no matter who the CPU designer is.
It also does not mean SMT is bad in general, but it DOES have situations where it hurts performance.

It is very simple to test if you have an SMT-based CPU.
Let me know if you want to know how.
 
I forgot, juanrga:
If you want a bit more insight into why SMT hurts Ryzen in games more than Intel, you might want to read what I wrote here:
https://hardforum.com/threads/amd-r...n-doom-deus-ex.1927200/page-2#post-1042891248

Basically, Ryzen's massive core count makes SMT hurt in games. Again, if you look at my little rule of thumb you will see why.
As we increase the number of physical cores, we need more CPU-heavy software threads to keep SMT from hurting us, and games typically don't scale their thread count well. For example, BF1 seems to have only 5 CPU-heavy threads from what I've seen.
 
Also, please remember "you can't prove a negative", aka just because you find some games where it is not present does not mean it can't be present in other games.

But that wasn't my point. TechSpot and other reviews found that a few games run faster, a few run slower, and many others run the same. The point was that the average improvement is in the low single-digit percent range.
 

My apologies then. I misunderstood your post. You are in fact totally correct on this.

My point is that we can fix the situations where SMT hurts and still get the benefits of SMT when it helps,
by more intelligently distributing the threads among the logical cores to avoid those SMT conflicts.
We might even get a boost in software that shows no difference, because it might just be getting as much benefit from SMT as penalty, and removing the penalty leaves us with just the benefit. At least in theory, as I haven't had time to test my intelligent SMT enabler/disabler yet, only the dumb one.

The same thing could be done to handle the CCX switching penalty.
 

Well yeah, you can do that if you have access to the code, but how do you propose doing that without the code? That is not easy to do, and you need to do it on an application-specific basis. There is no cure-all for the CCX issues.
 
You don't need access to the code. I already have a working program for the smart approach in alpha/beta testing as we speak.
The dumb approach to disabling SMT (if you know it is going to hurt) has been working for years in my utility.
All you need is affinity control.
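
As a rough illustration of the "dumb" approach, an affinity mask that skips every second logical core could be built like this. This is only a sketch and not the code of the utility mentioned above. It assumes logical cores 2k and 2k+1 are SMT siblings, and the function names are hypothetical.

```python
def affinity_mask(logical_ids):
    """Build a bitmask with one bit set per allowed logical core."""
    mask = 0
    for logical_id in logical_ids:
        mask |= 1 << logical_id
    return mask


def smt_avoiding_mask(logical_count):
    """Allow only the first SMT sibling of each physical core.

    Assumption: logical cores 2k and 2k+1 share physical core k.
    """
    return affinity_mask(range(0, logical_count, 2))


# 8 logical cores -> bits 0, 2, 4 and 6 set
print(hex(smt_avoiding_mask(8)))
```

On Windows a mask of this kind is what an affinity call such as SetProcessAffinityMask expects. The sketch only computes the value and does not change the affinity of any process.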


If you are interested in being part of the beta testing:
https://hardforum.com/threads/amd-ryzen-game-performance-fix.1926435/page-2

You can check out more of the utility here
www.techcenter.dk
 

SMT, yeah, I can see that happening, 'cause you are tracing specific threads and associating them with a specific physical core; where it goes doesn't matter ('cause ya know where it has to go). The CCX issue is an entirely different beast, 'cause you don't know what the program requirements are, and you don't know what the graphics needs are at that specific time. So when messing around with the threads at that point, you need to know what is high priority and what is time dependent to get proper results. Otherwise you end up with a bigger mess than what you started off with.

As long as there are no issues with time dependency or co-dependency of tasks, shifting threads around is no problem at all. Games don't work that way though, which is the reason Ryzen is showing the CCX issues to begin with.
 

Exactly, we are forcing the threads to go where they are optimally executed. We can do it with CCX switching as well, provided the software/game does not have enough threads to utilize more than one CCX to begin with.
That is kind of the little pain spot right here with the SMT and CCX issues: the software has to be multithreaded, but not multithreaded enough.

Here is an example with BF1. From what I've seen (sadly I don't have any modern games to test with, so I rely on other people getting me the info), BF1 has 5 CPU-heavy threads.

So let's toss it on a Ryzen CPU, in theory.
With Windows' usual modified round-robin distribution, the most likely result is that the 5 CPU threads are going to be split between the 2 CCXs.

The 1st thread has an 8/16, aka 50%, chance of landing on one CCX or the other.
The 2nd thread has a 7/15 chance of landing on the same CCX and an 8/15 risk of landing on the other CCX,
so most often it will be on the other CCX. Let's go with that just for this example.
The 3rd thread has a 7/14, aka 50%, chance of landing on the first CCX and a 7/14, aka 50%, chance of landing on the 2nd CCX.
The 4th thread has a 6/13 chance of landing on the CCX with the most threads and a 7/13 chance of landing on the CCX with the fewest threads.
Again, let's go with the biggest chance.
The 5th thread now again has a 50/50 chance of landing on each CCX: 6/12 vs 6/12.

So we end up with a distribution of 2 threads on one CCX and 3 threads on the other.
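
The step-by-step odds above follow one likely path. The full picture can be computed directly under the same simplified model, in which each thread lands on a free logical core picked uniformly at random. That model is an assumption for illustration and not how the Windows scheduler actually decides. The function name is made up.

```python
from fractions import Fraction
from math import comb


def ccx_split_probabilities(threads=5, logical_per_ccx=8):
    """Probability of each thread split across two CCXs.

    Assumes threads occupy distinct logical cores chosen uniformly at
    random, so the split follows a hypergeometric distribution.
    Keys are (larger_group, smaller_group).
    """
    total_ways = comb(2 * logical_per_ccx, threads)
    probabilities = {}
    for on_first in range(threads + 1):
        on_second = threads - on_first
        ways = comb(logical_per_ccx, on_first) * comb(logical_per_ccx, on_second)
        split = (max(on_first, on_second), min(on_first, on_second))
        probabilities[split] = probabilities.get(split, Fraction(0)) + Fraction(ways, total_ways)
    return probabilities


for split, probability in ccx_split_probabilities().items():
    print(split, f"{float(probability):.1%}")
```

Under this model the 3+2 split has probability 28/39 (about 72%), a 4+1 split 10/39 (about 26%), and all 5 threads on one CCX only 1/39 (under 3%). So without affinity locking, the heavy threads almost always end up spread over both CCXs.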


By using affinity locking, we again force those 5 threads onto the same CCX, and now they are all nicely together on the same 4 physical cores/8 logical cores.

Now, I have not had a chance to test this as I don't own a Ryzen system, which is why I've been trying to get forum users' help to test this.
 
Cool, ok but that is the same thing as turning off 4 cores lol. I think I misunderstood what you were stating before, it seemed like you were going to utilize all 8 cores and be able to get around the CCX problem.

Yeah, now if not enough cores are being utilized and you switch threads to the same CCX module, that is not a problem at all! Cool stuff!
 

You are correct, and no, I can't make magic and have all 8 cores working on the same process with no CCX issues... sadly.
But I can analyse the CPU load of a process and determine whether it should get the full 8/16 cores or just be locked to one CCX of 4/8 cores.
The background tasks/processes would still utilize the other 4 physical cores, as they are free to roam. Only the game/main process would be "guided" away from the CCX issue.
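
A decision of that kind might look roughly like the sketch below. It is not the utility's actual logic. The "heavy" threshold of half a core is an arbitrary placeholder, and the layout (2 CCXs of 4 cores with 2 SMT threads each, with logical cores 0-7 on the first CCX) is an assumption about how a Ryzen 7 is enumerated.

```python
def choose_affinity(thread_loads, heavy_threshold=0.5,
                    ccx_count=2, cores_per_ccx=4, smt_ways=2):
    """Return the logical cores a process should be allowed to use.

    thread_loads: per-thread load as a fraction of one logical core.
    heavy_threshold: hypothetical cut-off for counting a thread as heavy.
    Assumption: logical cores are numbered CCX by CCX, starting at 0.
    """
    logical_per_ccx = cores_per_ccx * smt_ways
    heavy_threads = sum(1 for load in thread_loads if load >= heavy_threshold)
    if heavy_threads <= logical_per_ccx:
        # Few enough heavy threads: keep them together on the first CCX.
        return list(range(logical_per_ccx))
    # Too many heavy threads for one CCX: give the process everything.
    return list(range(ccx_count * logical_per_ccx))


# A game with 5 heavy threads and a handful of light ones
game_threads = [0.9, 0.8, 0.85, 0.7, 0.75, 0.05, 0.03, 0.02]
print(choose_affinity(game_threads))
```

For the 5-heavy-thread example the process is kept on logical cores 0-7. A process with more heavy threads than one CCX can hold gets all 16 logical cores and accepts the CCX switching penalty.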


Now, I just ran through the games I actually have, and sadly Path of Exile is the only real contender for SMT issues on my i7 3770.
It has 2 CPU-heavy threads and some minor ones.
Full disclosure: I had to reduce it to 800x600 and the lowest settings to get out of GPU bottlenecks.

This is the thread load of the game (remember that with 8 logical cores, 12.5% is a full core being used):
Po_Ethreads.png

GPU load at the time was around 48-49% according to GPU-z


I found a quiet place to stand, used Fraps (1 sec interval) and just eyeballed this over a few minutes (it felt like forever) for the lowest and highest FPS values.
Using all logical cores I had 978-1009 FPS.
Adjusting affinity to avoid SMT I got 1032-1076 FPS.

Now, the increase is abysmally small, but it does show that the lowest FPS value is higher than the previous highest, so we got some gains here.
Games with a few more threads and a slightly more even load distribution would have been better candidates, but I had nothing good available.

But hey, a 6% performance boost. :D
Cinebench shows around 14%, but I'm not using it as an example because people freak out that I lowered the thread count to simulate games, and blah blah blah.
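
For anyone who wants to check the 6% figure, the arithmetic on the FPS ranges quoted above can be sketched as follows. Comparing the range midpoints is just one reasonable way to summarise two eyeballed ranges, not a rigorous benchmark method.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


all_logical_cores = (978, 1009)   # lowest and highest FPS, SMT siblings in use
smt_avoided = (1032, 1076)        # lowest and highest FPS, affinity adjusted

low_gain = percent_gain(all_logical_cores[0], smt_avoided[0])
high_gain = percent_gain(all_logical_cores[1], smt_avoided[1])
mid_gain = percent_gain(sum(all_logical_cores) / 2, sum(smt_avoided) / 2)

print(f"lowest: {low_gain:.1f}%  highest: {high_gain:.1f}%  midpoint: {mid_gain:.1f}%")
```

This gives roughly 5.5% on the lowest values, 6.6% on the highest and 6.1% on the midpoints, which matches the "about 6%" claim.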
 
Yes, clearly. Then again, I've just tested CARS, which should be faster on Win 8.1 and up, but it is slower on Ryzen:
 
Just curious, and sorry if you mentioned it somewhere: what is the power management mode on Win10 set to, and did you test it in both Balanced and High Performance?
Cheers
 
I have set it to High Performance only.
 
Thanks.
On the plus side, the difference is not great, unless you noticed a much larger gap when driving round the circuit, in different weather conditions, etc.

Cheers
 