AMD Ryzen and the Windows 10 Scheduler - No Silver Bullet

FrgMstr

So we have seen all sorts of theories batted around here for days now about why the AMD Ryzen CPUs show poor gaming performance at 1080p resolutions and below. One heavily rumored explanation puts the fault on the Windows Scheduler. PCPer has a full look, with tons of data, that basically puts this rumor to rest. In all our testing here in the last week, we have not found anything that has uncovered some huge fault either. Watch the video below, but certainly hit the page to read up on this as well.

Editor's Note: The testing you see here was a response to many days of comments and questions to our team on how and why AMD Ryzen processors are seeing performance gaps in 1080p gaming (and other scenarios) in comparison to Intel Core processors. Several outlets have posted that the culprit is the Windows 10 scheduler and its inability to properly allocate work across the logical vs. physical cores of the Zen architecture. As it turns out, we can prove that isn't the case at all. -Ryan Shrout
 
Sorry, but that's not convincing at all. Where's the 8.1/7 testing?
 
Allyn Malventano did run another test requested in the comments, and said that threads can actually spill over to the other CCX with SMT on, 16 cores enabled, and 4 workers. You can see the 4 workers get threaded across both CCXs on the 8 physical cores, instead of the scheduler trying to keep the work within one CCX for some workloads.
W48QVmP.jpg
 
I only have one question: are they sure which cores are physical and which cores are logical (looking at Task Manager)?
 
My question would be, is the performance of Ryzen poor, or is it perfectly acceptable but not as good as Intel? Maybe someone who games on a run-of-the-mill 60 Hz monitor wouldn't notice a difference? Maybe if someone has a high refresh rate panel he or she would see a difference?
 
The gaming numbers are so close that no one would notice the difference period. I'm sorry but anyone that claims they can see 5-8% difference one way or the other when running a game at 150fps is lying.
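To put numbers on that, here is a quick back-of-the-envelope sketch of what a 5-8% gap means in frame time at 150 fps. The 150 fps baseline and the two gaps come from the post above; the function name is just something made up for illustration.

```python
def frame_time_ms(fps: float) -> float:
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


baseline_fps = 150.0
for gap in (0.05, 0.08):
    # A chip that is `gap` slower than the baseline.
    slower_fps = baseline_fps * (1.0 - gap)
    delta = frame_time_ms(slower_fps) - frame_time_ms(baseline_fps)
    print(f"{gap:.0%} gap: {slower_fps:.1f} fps, +{delta:.2f} ms per frame")
```

Either way the difference works out to well under a millisecond per frame, which is the scale the argument above is about.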

As always the GPU is what matters. AMD vs Intel for gaming is within single digits at all times at all resolutions. (Even if it turns out AMD's multi-core beast does 8% better in a well-optimized game later, that still won't be an issue one way or the other.)

If you purpose-build a game test to favor Intel's single-core performance you can show AMD is a bit slower. At real actual resolutions and settings the chips are largely equal. Ryzen does however bring real improvements in things that are CPU intensive like rendering and encoding.

I am still shaking my head that some people think gaming is a massive weak spot in these chips. Most people build to a budget. I can't imagine anyone would argue that an Intel Core chip with a step-down GPU would outperform a Ryzen with a step-up GPU. Granted, some people will opt for cheaper Intel 4-core chips... but if you're comparing apples to apples, you can build a Ryzen 1700 system and slot in a 1080, whereas for the same $ with Intel you're going to have to settle for a 1070 or 1060. At that point the Ryzen system is better for games.
 
So it looks like when a task crosses from one cluster of 4 cores to the other, there's some sort of cache thrash going on. (The 144 ns latency is likely the result of cache cleanups as it crosses the fabric... which you can think of as a highway going from one large city to another.) This is similar to a scheduling problem under Vista, which they fixed in Windows 7: threads getting thrown across processors on each swap-in caused large penalties. Here, however, it is the result of trying to access the same block of memory from different cores. (Such situations include mutexes, context locks, and thread locks.)

Am I interpreting this correctly?

It's interesting they are worried about Naples. If it's a server and the data is largely static, then cache thrash from concurrent access should be a minimum. (No concurrent db access row locks in read only mode)
 
Windows 7 vs. Windows 10 comparisons prove that once Microsoft patches/optimizes Win10, that alone will be enough to be a silver bullet.
https://hardforum.com/threads/amd-ryzen-7-performance-windows-7-vs-windows-10.1926898/

Incredibly flawed logic there. Neither you nor I, nor many others, know why Ryzen is doing slightly better on Windows 7 vs Windows 10. It might be the scheduler, it might not. Windows 7 is not Windows 10. We will not know until, if and when, Microsoft updates the scheduler, if it needs it at all.

The only clear fact is that Ryzen on Windows 7 does better than Ryzen on Windows 10; nothing more, nothing less.
 
I find BrianB's videos much more useful/interesting than the PCPer video (growing a beard waiting for them to get to the point).

It is very clear if you watch the Win7 comparison videos that Windows 10 is shifting threads between cores MUCH more than Windows 7 in the games where Ryzen has a significant advantage on Windows 7.

Windows 10 seems much more aggressive in trying to "spread the load", which likely results in many extra cache misses as it moves threads between cores, whereas Windows 7 seems to keep many threads locked on cores.
 
The PCPer analysis was pretty good as far as showing Windows is 'correctly' loading cores and splitting load, assuming that loading each physical core first is the behavior you want. The penalty between CCXs is high enough that some apps may run better on a single CCX, or if certain threads are kept in the same CCX. Thread/core affinity is possibly an issue that PCPer's article didn't seem to address, and it sounds like it's worth looking into.

I don't think we can say scheduler improvements won't make a difference, I think the PCPer analysis does show that at a high level things are behaving as you probably would expect.

I don't have a Ryzen, but can you disable the second CCX so that everything runs inside a single CCX and benchmark these games?
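Short of disabling a CCX, one way to approximate that test is to pin the game to the logical CPUs of a single CCX with an affinity mask. A minimal sketch of how such a mask could be computed follows. It assumes (and this is only an assumption, not something confirmed in this thread) that the OS numbers the logical CPUs so that CCX 0 gets CPUs 0-7 and CCX 1 gets CPUs 8-15 with SMT on; `ccx_affinity_mask` is a made-up helper name.

```python
def ccx_affinity_mask(ccx_index: int,
                      cores_per_ccx: int = 4,
                      threads_per_core: int = 2) -> int:
    """Bitmask with one bit set per logical CPU in the given CCX.

    Assumes logical CPUs are numbered CCX by CCX, which may not match
    how a particular system actually enumerates them.
    """
    logical_per_ccx = cores_per_ccx * threads_per_core
    first = ccx_index * logical_per_ccx
    mask = 0
    for cpu in range(first, first + logical_per_ccx):
        mask |= 1 << cpu
    return mask


print(hex(ccx_affinity_mask(0)))  # mask covering the first CCX
print(hex(ccx_affinity_mask(1)))  # mask covering the second CCX
```

A hex mask like this is the kind of value that the Windows `start /affinity` option or Task Manager's affinity dialog works with, so the benchmark could be run once per mask and compared against the unpinned result.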
 
I find BrianB's videos much more useful/interesting than the PCPer video (growing a beard waiting for them to get to the point).

It is very clear if you watch the Win7 comparison videos that Windows 10 is shifting threads between cores MUCH more than Windows 7 in the games where Ryzen has a significant advantage on Windows 7.

Windows 10 seems much more aggressive in trying to "spread the load", which likely results in many extra cache misses as it moves threads between cores, whereas Windows 7 seems to keep many threads locked on cores.

That's all fine and dandy, and yet that doesn't guarantee the scheduler is to blame in part or entirely. Only AMD and Microsoft will/would know.

Also, interesting comment on the pcper article:

"Windows driver dev here - I can shed some light on why Windows does this and why it isn't as stupid as people think.

When a thread becomes ready for the CPU (a wait has been satisfied, it has become the highest priority thread, etc), the scheduler will look for an available logical CPU to place it on. That is - a core either idle or running a thread of lower priority than the one being scheduled.

If possible, the scheduler will try to place the thread on the logical CPU it was last running on (in the hopes the cache will still be hot), however sometimes at the very instant that it tries to do that, it finds another thread of equal or higher priority running on that CPU already.

Rather than waste time waiting for the ideal CPU to become available, the scheduler will go ahead and migrate the thread to a different CPU to allow it to execute immediately. This improves ready-to-running scheduling latency. The cache hotness cost is easily offset by the performance win of having the thread running even on a cache-cold CPU."
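The placement rule described in that comment can be sketched as a toy model. This is not actual Windows scheduler code: `pick_cpu`, `IDLE`, and the priority list are hypothetical names, and preferring an idle CPU among the fallbacks is an assumption added for illustration.

```python
from typing import Optional, Sequence

IDLE = -1  # stand-in priority for a logical CPU that is running nothing


def pick_cpu(thread_priority: int,
             last_cpu: int,
             running: Sequence[int]) -> Optional[int]:
    """Choose a logical CPU for a thread that just became ready.

    running[i] is the priority of the thread currently on CPU i, or IDLE.
    """
    # First choice: the CPU the thread last ran on (cache may still be hot),
    # but only if it is idle or running something of lower priority.
    if running[last_cpu] < thread_priority:
        return last_cpu
    # Otherwise migrate right away rather than wait for the ideal CPU.
    candidates = [cpu for cpu, prio in enumerate(running)
                  if prio < thread_priority]
    if not candidates:
        return None  # nothing available; the thread stays in the ready queue
    # Assumption: take the CPU doing the least important work, idle first.
    return min(candidates, key=lambda cpu: running[cpu])


# Thread of priority 8 last ran on CPU 0, which is now busy at equal priority.
print(pick_cpu(8, 0, [8, 5, IDLE]))
```

Note that nothing in this rule looks at which CCX the fallback CPU sits in, which is the part people in this thread are arguing could matter on Ryzen.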

So this seems to be expected behaviour on all CPU architectures, not just Ryzen. Which means this also happens on Intel, but of course since Intel chips and Ryzen are fundamentally different, they react differently.

So here's the rub, AMD must have known about this prior to release. There's no possible way they did not know of these issues when it took reviewers mere hours or days to encounter them. This really has been quite the clusterfuck of a release. I guess it's still better than Bulldozer though! :ROFLMAO:
 
Wait, so... The Windows 10 Scheduler doesn't appear to be broken at all - in fact, it's working just fine. But the Scheduler should still be "fixed" so that it knows the actual functionality of the CPU better - and with that - better performance. right?
 
We will just have to wait and see. For all anyone knows, even if a "fix" or update is applied that helps; there might only be a fraction of improvement.

All that is known right now is that there doesn't seem to be an easy answer to any of this.
 
Wait, so... The Windows 10 Scheduler doesn't appear to be broken at all - in fact, it's working just fine. But the Scheduler should still be "fixed" so that it knows the actual functionality of the CPU better - and with that - better performance. right?

Something like that. It just has to schedule a thread in the same CCX cluster.

Below is a modified graphic (Modified by me with text annotations at bottom) with the original die shot from guru 3d


 
I am still shaking my head that some people think gaming is a massive weak spot in these chips. Most people build to a budget.

Agreed. If you're spending $340 on the CPU, and your primary purpose is 1080p gaming, then you're doing it wrong.

Let's see how the $200 Ryzen 5 six-cores compare to the $200 Intel Core i5 quad-cores in terms of 1080p gaming. Edit: Actually, let's see how the $150-200 Ryzen quad-cores fare vs. the $200 Intel Core i5 quad-cores in terms of gaming. My money's on the Ryzen chips costing 25% less and delivering 95% of the performance.
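For what it's worth, the value math behind that guess is simple to write down. The 95% and 25% figures are the poster's prediction, not measured results, and `relative_value` is just an illustrative name.

```python
def relative_value(perf_ratio: float, price_ratio: float) -> float:
    """Performance per dollar relative to the baseline chip (1.0 = equal value)."""
    return perf_ratio / price_ratio


# Hypothetical inputs from the guess above:
# 95% of the performance at 75% of the price (25% cheaper).
value = relative_value(perf_ratio=0.95, price_ratio=0.75)
print(f"{value:.2f}x the performance per dollar")
```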
 
Again, I need Ryzen owners to test my Ryzen booster tool, which avoids this CCX and SMT issue.

What Ryzen booster tool? Also, how do I test with it please? I have a 1700 here in one machine and a 1700x in my other computer.
 
I can send you a link if you need it.
Basically it's a Windows CPU scheduler assistant that tries to circumvent part of the "dumb" CPU scheduling that is being done.
Instead, it is aware of issues like SMT and the CCX.

It's kind of like the "no HT conflicts" feature that I have in my Project Mercury.
In PM it just avoids SMT issues in games by setting the affinity of the foreground application (the game) to bypass SMT, and boom, no more SMT penalties for programs/games that do not utilize enough threads.

By analysing the workload of a process and distributing its threads in a better way, we can avoid some of the pitfalls of modern CPU designs.
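The affinity trick described above can be illustrated with a small sketch. This is not the tool's actual code, just one way the idea could look: build a mask that selects one logical CPU per physical core, so two busy threads never share a core. It assumes SMT siblings are numbered next to each other (0/1, 2/3, ...), which is an assumption about the system and not something stated in this thread; `one_thread_per_core_mask` is a made-up name.

```python
def one_thread_per_core_mask(physical_cores: int,
                             threads_per_core: int = 2) -> int:
    """Bitmask selecting the first logical CPU of every physical core.

    Assumes SMT siblings have adjacent logical CPU numbers.
    """
    mask = 0
    for core in range(physical_cores):
        mask |= 1 << (core * threads_per_core)
    return mask


# 8-core / 16-thread part: every other logical CPU is selected.
print(bin(one_thread_per_core_mask(8)))
```

Actually applying the mask to a running game would need an OS call (on Windows, something like `SetProcessAffinityMask`), which is left out here.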


Funnily enough, I brought this up months ago, and several HardOCP readers denied this was an issue (without even testing it), and now we have this video being posted saying exactly the same thing I said months ago.


I would need some benchmark results in either some specific games (let me find the list) or in CPU-intensive programs where you can set the thread count (Cinebench, 7-Zip).
 
My question would be, is the performance of Ryzen poor, or is it perfectly acceptable but not as good as Intel? Maybe someone who games on a run-of-the-mill 60 Hz monitor wouldn't notice a difference? Maybe if someone has a high refresh rate panel he or she would see a difference?

The gaming numbers are so close that no one would notice the difference, period. I'm sorry but anyone that claims they can see 5-8% difference one way or the other when running a game at 150fps is lying.

As always the GPU is what matters. AMD vs Intel for gaming is within single digits at all times at all resolutions. (Even if it turns out AMD's multi-core beast does 8% better in a well-optimized game later, that still won't be an issue one way or the other.)

If you purpose-build a game test to favor Intel's single-core performance you can show AMD is a bit slower. At real actual resolutions and settings the chips are largely equal. Ryzen does however bring real improvements in things that are CPU intensive like rendering and encoding.

I am still shaking my head that some people think gaming is a massive weak spot in these chips. Most people build to a budget. I can't imagine anyone would argue that an Intel Core chip with a step-down GPU would outperform a Ryzen with a step-up GPU. Granted, some people will opt for cheaper Intel 4-core chips... but if you're comparing apples to apples, you can build a Ryzen 1700 system and slot in a 1080, whereas for the same $ with Intel you're going to have to settle for a 1070 or 1060. At that point the Ryzen system is better for games.

The gaming numbers are close in most games, but several games, both new titles such as Hitman and not-so-new ones like GTA V, see double-digit decreases in performance compared to Intel's offerings. That being said, yeah, in most games it is close; the question remains whether that will be the case in most upcoming games. If you want a foolproof chip for the next few years and all you do is game, go i7 7700K. I will say this about higher refresh rates: if it jumps around a lot you notice it. I try to keep all games at 144 Hz with v-sync on.
 
I will say this about higher refresh rates: if it jumps around a lot you notice it. I try to keep all games at 144 Hz with v-sync on.

hahah, this has got to be a joke. The whole post is a bit ridiculous but this is a bit too much
 
If you are laughing because of latency: triple buffering and a low-latency panel can counteract the latency introduced by v-sync. If I had a variable refresh rate monitor I would not use it, but even G-Sync and FreeSync introduce some latency.
 
So here's the rub, AMD must have known about this prior to release. There's no possible way they did not know of these issues when it took reviewers mere hours or days to encounter them. This really has been quite the clusterfuck of a release. I guess it's still better than Bulldozer though! :ROFLMAO:

Comments like this make me think that people have forgotten how many teething issues Intel had with their last major architecture change (Nehalem).

Hyperthreading worked great in production tasks like rendering video, but in games it not only lowered framerates but caused lag and stutters because the logical cores were slower and games expected threads to be processed at the same speed.

Moving to an internal memory controller caused all sorts of issues. Some CPUs (like mine) wouldn't run higher speed memory at its rated speed even using the XMP profile. I've had 3 RAM kits and none of them have been able to run at their rated speed (1866, 2000, 1600).

Motherboards were somewhat scarce at launch and many of the early ones had some issues, memory compatibility being the biggest issue but far from the only one. Most early motherboards got a ton of BIOS updates for about a year and some were even getting new hardware revisions months after launch when they decided they couldn't be fixed with BIOS updates.

My point isn't that AMD should get a free pass but that some teething issues are to be expected with a major architecture change like this is. I'm overdue for a new PC and I'd love an affordable 6-8 core but I'm waiting to see how this all shakes out before making a decision about which direction I want to go.
 
If you are laughing because of latency: triple buffering and a low-latency panel can counteract the latency introduced by v-sync. If I had a variable refresh rate monitor I would not use it, but even G-Sync and FreeSync introduce some latency.

Any type of sync would introduce latency no matter what. Triple buffering doesn't help at all. What's the point of having high refresh rates if you're just going to shit on them with vsync?
 
Comments like this make me think that people have forgotten how many teething issues Intel had with their last major architecture change (Nehalem).

Hyperthreading worked great in production tasks like rendering video, but in games it not only lowered framerates but caused lag and stutters because the logical cores were slower and games expected threads to be processed at the same speed.

Moving to an internal memory controller caused all sorts of issues. Some CPUs (like mine) wouldn't run higher speed memory at its rated speed even using the XMP profile. I've had 3 RAM kits and none of them have been able to run at their rated speed (1866, 2000, 1600).

Motherboards were somewhat scarce at launch and many of the early ones had some issues, memory compatibility being the biggest issue but far from the only one. Most early motherboards got a ton of BIOS updates for about a year and some were even getting new hardware revisions months after launch when they decided they couldn't be fixed with BIOS updates.

My point isn't that AMD should get a free pass but that some teething issues are to be expected with a major architecture change like this is. I'm overdue for a new PC and I'd love an affordable 6-8 core but I'm waiting to see how this all shakes out before making a decision about which direction I want to go.
That was back in 2008; surely you can't be suggesting AMD learned nothing from something their competitor did 10 years ago? The issue here, I think, is pre-orders: knowing all these issues existed, AMD opened up pre-orders and locked all reviews behind an NDA until release day, ensuring no pre-order customer would know about the issues ahead of time.
 
Any type of sync would introduce latency no matter what. Triple buffering doesn't help at all. What's the point of having high refresh rates if you're just going to shit on them with vsync?
Yes, triple buffering helps: http://www.anandtech.com/show/2794/2
Also, I don't use v-sync on everything. If a game is running near or well above 200 fps you can't even see screen tearing, and any fps jumps are nearly unnoticeable. Fact is, v-sync with triple buffering will not introduce enough latency to be noticeable, as long as the game always meets or exceeds the refresh rate of the monitor, and it ensures a smooth experience.
 
That was back in 2008; surely you can't be suggesting AMD learned nothing from something their competitor did 10 years ago? The issue here, I think, is pre-orders: knowing all these issues existed, AMD opened up pre-orders and locked all reviews behind an NDA until release day, ensuring no pre-order customer would know about the issues ahead of time.

I'm suggesting that any major architecture changes are going to cause some issues and that they WON'T all be identified and fixed by launch, there's a reason why new technology is called bleeding edge tech.
 
Yes, triple buffering helps: http://www.anandtech.com/show/2794/2
Also, I don't use v-sync on everything. If a game is running near or well above 200 fps you can't even see screen tearing, and any fps jumps are nearly unnoticeable. Fact is, v-sync with triple buffering will not introduce enough latency to be noticeable, as long as the game always meets or exceeds the refresh rate of the monitor, and it ensures a smooth experience.

I was about to question your motion detection prowess and the types of games you play but I really am not in your body and I simply don't experience things the way you do.
 
I'm suggesting that any major architecture changes are going to cause some issues and that they WON'T all be identified and fixed by launch, there's a reason why new technology is called bleeding edge tech.
There is literally no way this was not identified before launch. Reviewers discovered it minutes into testing and took days to pinpoint several things that might be causing it.
 
I was about to question your motion detection prowess and the types of games you play but I really am not in your body and I simply don't experience things the way you do.
I'd prefer to keep other people out of my body so thank you for the consideration
 
If anything, the article brings good news for a 4-core Ryzen:

But there are some other important differences standing out here. Pings within the same physical core come out to 26 ns, and pings to adjacent physical cores are in the 42 ns range (lower than Intel, which is good), but that is not the whole story. Ryzen subdivides by what is called a "Core Complex", or CCX for short. Each CCX contains four physical Zen cores and they communicate through what AMD calls Infinity Fabric. That piece of information should click with the above chart, as it appears hopping across CCX's costs another 100 ns of latency, bringing the total to 142 ns for those cases.
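The figures in that excerpt can be turned into a tiny lookup model. This is only a sketch built on the three numbers quoted above (26 ns, 42 ns, and the extra 100 ns across CCXs); the logical CPU numbering it assumes (SMT siblings adjacent, four cores per CCX in order) and the function name are assumptions for illustration.

```python
SAME_CORE_NS = 26           # ping between the two threads of one physical core
SAME_CCX_NS = 42            # ping between different cores in the same CCX
CROSS_CCX_PENALTY_NS = 100  # extra cost of hopping across CCXs


def ping_latency_ns(cpu_a: int, cpu_b: int,
                    threads_per_core: int = 2,
                    cores_per_ccx: int = 4) -> int:
    """Rough ping latency between two logical CPUs, per the quoted figures."""
    core_a = cpu_a // threads_per_core
    core_b = cpu_b // threads_per_core
    if core_a == core_b:
        return SAME_CORE_NS
    if core_a // cores_per_ccx == core_b // cores_per_ccx:
        return SAME_CCX_NS
    return SAME_CCX_NS + CROSS_CCX_PENALTY_NS


for pair in ((0, 1), (0, 2), (0, 8)):
    print(pair, ping_latency_ns(*pair), "ns")
```

With `cores_per_ccx=4` and only four cores in total, every pair falls into the first two cases, which is the "good news" scenario for a single-CCX quad-core.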
 
This is why you should hope all 4-core Ryzen chips are 4 cores on one CCX and not 2 and 2.
 
There is literally no way this was not identified before launch. Reviewers discovered it minutes into testing and took days to pinpoint several things that might be causing it.

I said "identified and fixed", many of Nehalem's issues were identified immediately too. Is there anything else you'd like to twist around?
 
This is what you said
I'm suggesting that any major architecture changes are going to cause some issues and that they WON'T all be identified and fixed by launch, there's a reason why new technology is called bleeding edge tech.

Underlined and italics all mine. I was jumping on the idea that it was not identified, and I probably would not launch a major CPU that my entire business is relying on to be a success with these issues, but that is just me.
 
That seems like an unreasonable expectation to me which is why I pointed out that even Intel wasn't able to pull that off with their last major change in architecture.
 