Ryzen and the Windows Scheduler - PCPer

Itanic? Both companies have had their share of flops, Zen is not one of them.


Itanic 1 was crap; Itanic 2 was much better, but software was a problem, and no company wanted to rewrite code for Itanic's ISA when AMD already had an x64 part that did better out of the box (MS also supported AMD's architecture over Intel's Itanic). If software had been made for Itanic 2 it would have gotten better performance than AMD's parts, but it was not easy to do; lots of work had to be done to get the most out of Itanic 2.
 
That's just as much HP's fuck-up as it was Intel's; it was a joint project after all. But good try.

Oh, don't forget NetBurst. I suppose that was some other company's fault as well? Presshot, or wait, I meant Prescott. Please, it's not hard to find failures at both companies.
 
Oh, don't forget NetBurst. I suppose that was some other company's fault as well? Presshot, or wait, I meant Prescott. Please, it's not hard to find failures at both companies.

True enough, the difference is AMD is consistent at it.
 
Itanic 1 was crap; Itanic 2 was much better, but software was a problem, and no company wanted to rewrite code for Itanic's ISA when AMD already had an x64 part that did better out of the box (MS also supported AMD's architecture over Intel's Itanic). If software had been made for Itanic 2 it would have gotten better performance than AMD's parts, but it was not easy to do; lots of work had to be done to get the most out of Itanic 2.

Itanic was dead once AMD made x64. But to be honest, Itanic was already dying; it was far more work than it was worth to write programs for, and it died. It was better for us that it died anyway; otherwise it would have allowed Intel to lock up the market and force all competition out of it, the true goal of Itanic.
 
Itanic was dead once AMD made x64. But to be honest, Itanic was already dying; it was far more work than it was worth to write programs for, and it died. It was better for us that it died anyway; otherwise it would have allowed Intel to lock up the market and force all competition out of it, the true goal of Itanic.
So Itanic was the Titanic. LOL.
 
There is a problem; it's on AMD's end and in how THEIR Infinity Fabric works. Just imagine how their cut-down 8-core parts are going to perform. Are they going to be worse than Intel's 2-core performance in games? Possibly.
With fewer cores per CCX you would need less bandwidth across the fabric, so in essence a 4-core configuration would be like the 8-core configuration with double the speed on the fabric. There are other factors, such as each CCX having its own L3 (which can be shared, but probably mostly serves its own CCX) that speeds up access to the data needed. So there are other factors in the design that help it achieve better results.

Higher-speed memory speeds up the fabric, which will help. It would also help if AMD allowed speeds greater than 3200MHz without needing to raise the BCLK.

Plus, the so-called issue with games is no issue, or a minimal one, because in most cases the frame rate is way faster than what the monitor can display, or is limited by the GPU. The whole premise that it's poor at games, or let's just say not capable, is asinine. Once performance deviations are noted, the next question should be how they affect the outcome, in this case the game experience; the reality is they don't.
 
Seems like Zendozer to me....

No, it doesn't.

I present you the real Zendozer by gruffi

Zendozer.jpg
 
With fewer cores per CCX you would need less bandwidth across the fabric.

It just goes in the opposite direction: the more cores per CCX, the higher the probability that all the running threads are in a single CCX and its associated L3 slice.
 
It just goes in the opposite direction: the more cores per CCX, the higher the probability that all the running threads are in a single CCX and its associated L3 slice.
Actually he is closer to the mark than you here. His argument is based on factual inference; yours is just probability and happenstance, which is not factual at all. And given your inane probability argument, it still works in favor of the 4-core as being more consistent across more games, with the 8-core having a greater chance of hitting a worse configuration of threads.
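As a side note, the "probability" intuition can be made concrete with a toy model. This is only a sketch: it assumes the scheduler scatters threads across cores uniformly at random (real schedulers don't), and the function is made up for illustration:

```python
from math import comb

def prob_single_ccx(threads: int, ccxs: int = 2, cores_per_ccx: int = 4) -> float:
    """Chance that `threads` threads, placed uniformly at random on
    distinct cores, all land inside one CCX (and its L3 slice)."""
    if threads > cores_per_ccx:
        return 0.0
    total_cores = ccxs * cores_per_ccx
    return ccxs * comb(cores_per_ccx, threads) / comb(total_cores, threads)

# 8 cores arranged as 2x4 (Ryzen-like) vs 4x2: bigger CCXs keep threads together
for t in (2, 3, 4):
    print(t, prob_single_ccx(t), prob_single_ccx(t, ccxs=4, cores_per_ccx=2))
```

Under this naive model, two random threads share a CCX about 43% of the time with two 4-core CCXs, versus about 14% with four 2-core CCXs, so larger CCXs do keep thread groups local more often.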
 
With fewer cores per CCX you would need less bandwidth across the fabric, so in essence a 4-core configuration would be like the 8-core configuration with double the speed on the fabric. There are other factors, such as each CCX having its own L3 (which can be shared, but probably mostly serves its own CCX) that speeds up access to the data needed. So there are other factors in the design that help it achieve better results.

Higher-speed memory speeds up the fabric, which will help. It would also help if AMD allowed speeds greater than 3200MHz without needing to raise the BCLK.

Plus, the so-called issue with games is no issue, or a minimal one, because in most cases the frame rate is way faster than what the monitor can display, or is limited by the GPU. The whole premise that it's poor at games, or let's just say not capable, is asinine. Once performance deviations are noted, the next question should be how they affect the outcome, in this case the game experience; the reality is they don't.


I don't know how much less; that is why I said possibly, lol.

Also, the reason I say this is the AMD video about using faster RAM... There is absolutely no need to show that video (or even make it) unless they still have a performance defect.
 
Actually he is closer to the mark than you here. His argument is based on factual inference; yours is just probability and happenstance, which is not factual at all. And given your inane probability argument, it still works in favor of the 4-core as being more consistent across more games, with the 8-core having a greater chance of hitting a worse configuration of threads.


Actually, they are both correct, and it's application-specific on top of this too, so it's hard to know exactly what the bandwidth needs will be. It will also change based on the core configuration (disabled parts). I love how the people that cry "soothsayers" presume to know everything themselves.
 
Actually, they are both correct, and it's application-specific on top of this too, so it's hard to know exactly what the bandwidth needs will be. It will also change based on the core configuration (disabled parts). I love how the people that cry "soothsayers" presume to know everything themselves.
Too funny, that last part from you. Anyway, my point is that at this juncture 2+2 is likely to be far more constrained in results, as noko alluded. Juanrga's point is more of a random guess depending on the circumstances of CCX usage. Now, I am sure that when they are able to restrict threads to a 4+0 configuration, then yes, 2+2 may take a hit in comparison. However, when mobos finally get higher RAM support, this CCX issue will likely be less of an issue.
 
Too funny, that last part from you. Anyway, my point is that at this juncture 2+2 is likely to be far more constrained in results, as noko alluded. Juanrga's point is more of a random guess depending on the circumstances of CCX usage. Now, I am sure that when they are able to restrict threads to a 4+0 configuration, then yes, 2+2 may take a hit in comparison. However, when mobos finally get higher RAM support, this CCX issue will likely be less of an issue.


It's not that much actually; I ran through the numbers and there is only a max of 10% bandwidth difference...

That is the difference, justreason. I know the math; I just don't know how much it will be affected because of the other reasons I mentioned.
 
Holy shit, why are you people arguing so much over this? Flawed or not, Ryzen is a great CPU for the price. If the performance difference is that important to you, then go ahead and buy Intel.
 
Holy shit, why are you people arguing so much over this? Flawed or not, Ryzen is a great CPU for the price. If the performance difference is that important to you, then go ahead and buy Intel.
Buy both :D
 
Actually, they are both correct, and it's application-specific on top of this too, so it's hard to know exactly what the bandwidth needs will be. It will also change based on the core configuration (disabled parts). I love how the people that cry "soothsayers" presume to know everything themselves.
Well, with a four-core you are much more likely to have thread communication/dependencies between the two CCXs (maybe not so good) due to the more limited number of cores. I guess this just has to be tested out. Anyway, the Ryzen 4-cores are going against the i3s with basically double the cores. Which makes me think: how will the Ryzen APUs be configured? 1 CCX or 2 CCXs? Looks like 1 CCX, with the other replaced by the GPU; there you would have all the eggs in the same basket, and maybe a faster 4-core than what is coming out.
 
Well, with a four-core you are much more likely to have thread communication/dependencies between the two CCXs (maybe not so good) due to the more limited number of cores. I guess this just has to be tested out. Anyway, the Ryzen 4-cores are going against the i3s with basically double the cores. Which makes me think: how will the Ryzen APUs be configured? 1 CCX or 2 CCXs? Looks like 1 CCX, with the other replaced by the GPU; there you would have all the eggs in the same basket, and maybe a faster 4-core than what is coming out.


Yeah, 1 CCX active is the ideal, best case, but it doesn't seem that will happen. APUs are going to be 1 CCX, as they are only 4-core chips, which will be good for AMD and OEMs; no need to worry about the CCX problems. But those aren't going to be out till the end of the year.
 
Well, with a four-core you are much more likely to have thread communication/dependencies between the two CCXs (maybe not so good) due to the more limited number of cores. I guess this just has to be tested out. Anyway, the Ryzen 4-cores are going against the i3s with basically double the cores. Which makes me think: how will the Ryzen APUs be configured? 1 CCX or 2 CCXs? Looks like 1 CCX, with the other replaced by the GPU; there you would have all the eggs in the same basket, and maybe a faster 4-core than what is coming out.

Yes, the Raven Ridge APUs use a different die with one CCX replaced by the iGPU.

The APUs will have the four cores in the same CCX and will lack the interconnect latencies of the (2+2) R5 CPUs, but the APUs use a different CCX with only 4MB L3.
 
The latency the CCX adds is pretty overblown, yeah it would be nice if it was lower but it's not a huge deal.
Thought that myself when I saw one of the latency graphs. Think it was against a 6800K. The 6800K had a constant 80ns (or was it microseconds?) across all cache levels. The Ryzen had 40 or 60 through L2 and then jumped to 140 at L3. Granted, that is a huge % jump, but the question is to what degree that affects real-world performance. It just didn't seem as catastrophic as some were alluding to. Also, I don't remember the tested RAM speed, and I would love to see how that L3 cache speed changes with RAM speed.
 
Thought that myself when I saw one of the latency graphs. Think it was against a 6800K. The 6800K had a constant 80ns (or was it microseconds?) across all cache levels. The Ryzen had 40 or 60 through L2 and then jumped to 140 at L3. Granted, that is a huge % jump, but the question is to what degree that affects real-world performance. It just didn't seem as catastrophic as some were alluding to. Also, I don't remember the tested RAM speed, and I would love to see how that L3 cache speed changes with RAM speed.

Faster RAM increases performance for both AMD and Intel; I linked a few recent review benchmarks showing that, and one showed it in a direct comparison between Ryzen and Intel.
The latency sensitivity depends upon the game engine/thread and data-dependency structure, affecting some games more than others, and the easiest way to tell is to look at the 1800X/7600K/7700K/6900K.
In some games there is a notable jump from 7600K->7700K (showing which games respond well to SMT) to 6900K, while the 1800X does not follow the same trend.
But then you need at least a GTX 1080 at 1080p resolution to really pick this up, even for the 7700K to see an SMT benefit in some games (Hardware Unboxed did a video showing that a GTX 1070 bottlenecked SMT-gain testing in certain games at 1080p compared to the Pascal Titan).

Cheers
 
Thought that myself when I saw one of the latency graphs. Think it was against a 6800K. The 6800K had a constant 80ns (or was it microseconds?) across all cache levels. The Ryzen had 40 or 60 through L2 and then jumped to 140 at L3. Granted, that is a huge % jump, but the question is to what degree that affects real-world performance. It just didn't seem as catastrophic as some were alluding to. Also, I don't remember the tested RAM speed, and I would love to see how that L3 cache speed changes with RAM speed.


How many cycles will that increase in latency cost? That will be the frame-rate loss...
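To put rough numbers on that: the cycle cost is just the added latency times the core clock. Both figures below are illustrative assumptions, not measurements:

```python
def stall_cycles(extra_latency_ns: float, core_clock_ghz: float) -> float:
    """Core cycles lost per access to an added latency: ns x GHz = cycles."""
    return extra_latency_ns * core_clock_ghz

# e.g. an extra ~100 ns per cross-CCX access on a 3.6 GHz core
print(stall_cycles(100, 3.6))  # 360.0 cycles per such access
```

Whether that shows up as frame-rate loss then depends on how often such accesses happen per frame.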
 
What we need is NUCA (non-uniform cache architecture) Nodes!
Haven't read the last however many threads, please don't blast me. I'm going to read them now. ;)
 
Faster RAM increases performance for both AMD and Intel; I linked a few recent review benchmarks showing that, and one showed it in a direct comparison between Ryzen and Intel.
The latency sensitivity depends upon the game engine/thread and data-dependency structure, affecting some games more than others, and the easiest way to tell is to look at the 1800X/7600K/7700K/6900K.
In some games there is a notable jump from 7600K->7700K (showing which games respond well to SMT) to 6900K, while the 1800X does not follow the same trend.
But then you need at least a GTX 1080 at 1080p resolution to really pick this up, even for the 7700K to see an SMT benefit in some games (Hardware Unboxed did a video showing that a GTX 1070 bottlenecked SMT-gain testing in certain games at 1080p compared to the Pascal Titan).

Cheers
Come on, how do you get there from what I stated? I wasn't talking about frame rates or whatnot. I was talking about L3 cache changes and any REAL findings using different-speed RAM, and its effect on the Infinity Fabric, aka the CCX link. Frame rates and whatnot matter little to me, as the current performance from actual users seems more than sufficient. So, in conclusion: have you seen any benches of the L3 cache using different RAM speeds? That is what I am interested in.
 
Come on, how do you get there from what I stated? I wasn't talking about frame rates or whatnot. I was talking about L3 cache changes and any REAL findings using different-speed RAM, and its effect on the Infinity Fabric, aka the CCX link. Frame rates and whatnot matter little to me, as the current performance from actual users seems more than sufficient. So, in conclusion: have you seen any benches of the L3 cache using different RAM speeds? That is what I am interested in.
As I said, the trend with increasing RAM speed is comparable between Ryzen and Intel. If it also improved inter-CCX latency, the gain would be notably higher for Ryzen, but those doing these comparative tests showed it was more in line with 'game sensitivity' to the RAM speed increase, as there was a trend correlation between Ryzen and Intel.
Eurogamer is one that did this test and came to that conclusion.

Changing RAM speed is not enough to change inter-CCX latency; you get greater bandwidth, but it does not change the underlying protocols/controls/data-transmission structure (this is how AMD improves on PCIe latency, by using their own protocols/controls/data packaging over the actual PCIe physical connections).
Cheers
 
As I said, the trend with increasing RAM speed is comparable between Ryzen and Intel. If it also improved inter-CCX latency, the gain would be notably higher for Ryzen, but those doing these comparative tests showed it was more in line with 'game sensitivity' to the RAM speed increase.
Eurogamer is one that did this test and came to that conclusion.

Changing RAM speed is not enough to change inter-CCX latency; you get greater bandwidth, but it does not change the underlying protocols/controls/data-transmission structure (this is how AMD improves on PCIe latency, by using their own protocols/controls/data packaging over the actual PCIe physical connections).
Cheers


It's not going to get through to him, man. He is thinking the latency of the fabric is linked to the RAM speed... which it's not; they're two separate issues.
 
True enough, the difference is AMD is consistent at it.

AMD is fascinating in this respect. Sometimes, they are an utter failure. 1. See: Bulldozer, K5, etc... Other times they are a decent competitor in the mid-range that at least keeps pricing down. 2. See: AMD's 486 lineup, K6-2/K6-3, Early Phenoms. And then, on rare occasion, they step up to the plate and deliver *solid* competition in the high end space. 3. See: Original Athlon, Athlon 64/X2. I mean, they basically made x86-64 and cheap dual cores a thing. I'm hoping Zen makes cheap 8 cores a thing in the same way.

Zen is either a case of #2 or #3, depending on how things shake up as the uarch matures and developers play with optimizations for it. It is definitely NOT a case of #1. And, quite frankly, anyone who thinks it is, is f*cking blind. Crapdozer lost to *existing* AMD Phenoms when it was released. Crapdozer won nothing against Intel, neither in content creation nor in gaming. It even lost in power draw. In everything, Crapdozer was inferior. Zen mixes things up. It wins in some things, it loses in some things, and it's very competitive in other things. It's an interesting alternative for certain types of users and enthusiasts.
 
It's not going to get through to him, man. He is thinking the latency of the fabric is linked to the RAM speed... which it's not; they're two separate issues.

The clock speed of the fabric is linked to that of the RAM, and time latency is just the inverse of frequency.

The cycle latency of the fabric itself won't change, of course, but from a core's perspective access will be faster.
 
That is true, but to overcome the latency the frequency of the RAM would need to be quite a bit higher, on the order of 300% or so, which is an unreasonable expectation.
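To illustrate the preceding point about fabric clock and wall-clock latency: since the fabric clock follows MEMCLK (half the DDR transfer rate), a fixed cycle count shrinks in nanosecond terms as RAM speed rises. A quick sketch; the 100-cycle figure is an arbitrary placeholder, not a real Infinity Fabric spec:

```python
def fabric_latency_ns(ddr_rate_mtps: float, fabric_cycles: float = 100) -> float:
    """Wall-clock latency of a fixed fabric cycle count; the fabric clock
    (MEMCLK, in MHz) is half the DDR transfer rate (in MT/s)."""
    fabric_mhz = ddr_rate_mtps / 2
    return fabric_cycles * 1000 / fabric_mhz  # period in ns = 1000 / MHz

for rate in (2133, 2666, 3200):
    print(rate, round(fabric_latency_ns(rate), 1))
```

Going from DDR4-2133 to DDR4-3200 cuts the same 100 cycles from roughly 94 ns to 62.5 ns, a ~33% improvement, nowhere near the ~300% frequency increase the post above says would be needed to hide the latency entirely.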
 
The clock speed of the fabric is linked to that of the RAM, and time latency is just the inverse of frequency.

The cycle latency of the fabric itself won't change, of course, but from a core's perspective access will be faster.
However, Eurogamer and a few others have shown that Ryzen access being faster from a core's perspective does not improve performance, or does so only very marginally, when they compare Intel to R7 Ryzen going from 2133MHz to around 3000MHz in the games affected (relative gains are comparable on both).
Cheers
 
As I said, the trend with increasing RAM speed is comparable between Ryzen and Intel. If it also improved inter-CCX latency, the gain would be notably higher for Ryzen, but those doing these comparative tests showed it was more in line with 'game sensitivity' to the RAM speed increase, as there was a trend correlation between Ryzen and Intel.
Eurogamer is one that did this test and came to that conclusion.

Changing RAM speed is not enough to change inter-CCX latency; you get greater bandwidth, but it does not change the underlying protocols/controls/data-transmission structure (this is how AMD improves on PCIe latency, by using their own protocols/controls/data packaging over the actual PCIe physical connections).
Cheers
It can't be that hard to understand? I am interested, out of purely scientific curiosity, in how RAM speed affects L3 latency. Not at all in game benchmarks, as they reflect the total system, not the individual L3 I am interested in. I have seen the latency graphs of the caches, but not between different RAM speeds, so as to ascertain how much of an impact it has. Now do you get it?
 
It can't be that hard to understand? I am interested, out of purely scientific curiosity, in how RAM speed affects L3 latency. Not at all in game benchmarks, as they reflect the total system, not the individual L3 I am interested in. I have seen the latency graphs of the caches, but not between different RAM speeds, so as to ascertain how much of an impact it has. Now do you get it?


As I stated, you are not understanding. It doesn't affect the latency; it affects the clock cycles. The latency, in cycles, will not change...
 
It can't be that hard to understand? I am interested, out of purely scientific curiosity, in how RAM speed affects L3 latency. Not at all in game benchmarks, as they reflect the total system, not the individual L3 I am interested in. I have seen the latency graphs of the caches, but not between different RAM speeds, so as to ascertain how much of an impact it has. Now do you get it?
Well, what can you not understand...
If both Intel and Ryzen see relatively comparable gains going from 2133MHz to around 3000MHz in games, then obviously RAM speed is NOT affecting L3/inter-CCX latency, in the sense of being an isolated benefit to Ryzen.
Science relies upon results, not speculation.
 
It's not going to get through to him, man. He is thinking the latency of the fabric is linked to the RAM speed... which it's not; they're two separate issues.
If the Infinity Fabric is linked to RAM speed, and that is the speed at which the L3 runs, and the graphs peg the CCX issue to that latency, then yes, I would like to see if RAM speed makes any discernible difference. I am not making any claims here, just interested in testing with these variables.
 
Well, what can you not understand...
If both Intel and Ryzen see relatively comparable gains going from 2133MHz to around 3000MHz in games, then obviously RAM speed is NOT affecting L3/inter-CCX latency...
Again, that is an assumption. Where are the latency tests of the L3 at different RAM speeds?
 
If the Infinity Fabric is linked to RAM speed, and that is the speed at which the L3 runs, and the graphs peg the CCX issue to that latency, then yes, I would like to see if RAM speed makes any discernible difference. I am not making any claims here, just interested in testing with these variables.


Latency and clock speeds have nothing to do with each other, dude; they are != each other, not even associated. You can't even test for it, because the results will end up the same.

You need to look at it this way: there are two bottlenecks here, the RAM speed and the latency, right?

Which one is affecting what? Both are affecting the frame rates, yes, but at different times. So by removing the frequency bottleneck you will still have the L3 cache latency; that will not change. And by removing the L3 cache latency (which is not possible) while using slower RAM, you would still have the bottleneck of the slower RAM.
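That two-bottleneck framing can be sketched numerically. All the access counts and latencies below are invented for illustration, and it adopts this post's assumption that the cross-CCX L3 penalty is fixed regardless of RAM speed:

```python
def total_stall_ns(dram_accesses: int, dram_ns: float,
                   cross_ccx_accesses: int, cross_ccx_ns: float) -> float:
    """Toy model: total stall time is the sum of two independent costs."""
    return dram_accesses * dram_ns + cross_ccx_accesses * cross_ccx_ns

# Halving effective DRAM latency shrinks only the first term;
# the cross-CCX L3 penalty is still there afterwards.
print(total_stall_ns(1000, 90, 500, 140))  # 160000
print(total_stall_ns(1000, 45, 500, 140))  # 115000
```

In other words, each bottleneck caps the gain available from removing the other one.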
 