Ryzen with 3600MHz RAM Benchmark

For my 8350, core 0 is usually the main core, but about 30% of the time it can be another, again usually an odd core, since Windows treats the even cores as HT.


I am not sure I understand what you are trying to say, but if I do, I cannot confirm your findings in all of my time measuring (logical) cores and playing with affinity.
You don't have a main core and then an HT core. You have two logical cores leading to one physical core. It doesn't matter which of these two paths your thread takes; it still ends up on the same physical core.
The only issue is the load that the physical core gets from the other logical core assigned to it.

The belief that there is somehow a real core and an HT core in the list of logical cores you are seeing in Task Manager is just plain wrong.

If I misunderstood what you were trying to say, I apologize.

-- edit --
I wrote "HT core" here; replace it with "CMT extra cores" or whatever the proper name would be for it in AMD's last-gen CPUs.
 
OK, went through the article; here is some important data:

- Latency of pings WITHIN the same physical core: Intel 14 ns, AMD 26 ns
- Latency of pings to ANOTHER physical core: Intel 76 ns, AMD 42 ns (if within the same CCX)
- But here is the kicker: since AMD uses 4 cores on each CCX, the latency between Core 4 (on CCX 1) and Core 5 (on CCX 2) takes a whole 142 ns, while Intel maintains 76 ns across all physical cores.
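For anyone curious how those ping numbers are produced, a core-to-core latency test is essentially a two-thread ping-pong. Below is a minimal Python sketch of the idea; note that interpreter and `threading.Event` overhead put the result in the microsecond range, so this only illustrates the method, not the article's nanosecond figures (those come from pinned native code):

```python
import threading
import time

def pingpong_roundtrip_ns(iters=20_000):
    """Rough two-thread ping-pong, the same shape as a core-to-core
    latency test: one thread signals, the other echoes, and we divide
    total elapsed time by the number of round trips."""
    ping, pong = threading.Event(), threading.Event()

    def responder():
        for _ in range(iters):
            ping.wait()
            ping.clear()
            pong.set()  # echo back to the main thread

    t = threading.Thread(target=responder)
    t.start()
    start = time.perf_counter()
    for _ in range(iters):
        ping.set()
        pong.wait()
        pong.clear()
    elapsed = time.perf_counter() - start
    t.join()
    return elapsed / iters * 1e9  # ns per round trip

if __name__ == "__main__":
    print(f"{pingpong_roundtrip_ns():.0f} ns per round trip")
```

A real measurement would additionally pin each thread to a specific logical core so you know which pair (same core, same CCX, cross CCX) you are timing.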

So what this essentially means is 2 important points:

Point 1: if data is distributed across the first 4 cores (8 threads), AMD would be roughly twice as fast as Intel. However, if data starts moving beyond 4 cores, then Intel is roughly twice as fast as AMD. Now, is this reflected in benchmarks, or would it have ANY real-world impact? Whether it can ever be noticed in games, I am not sure.


Point 2: What it also means is that the data fabric is mostly relevant when data from one CCX (4 cores, 8 threads) is moved across to the second CCX (the other 4 cores and 8 threads). In other words, faster memory will actually ONLY benefit you if your app or game uses more than 4 cores / 8 threads. The Windows scheduler seems to try to use the first 4 cores within the first CCX, and the 4 cores of the second CCX after that, which is good. That said, lower memory latency will help your system in general, whether it's Intel or AMD. In that sense, buy the fastest memory you can afford; it will just work better with Ryzen once you go beyond 4-core utilization.

That said, 142 ns is 0.000000142 of a second, so will this actually be felt while gaming vs Intel's 76 ns (assuming the game uses more than 4 cores / 8 threads)? Hard to answer, because games are not yet optimized for AMD in the first place, so whether that latency will matter or not remains to be seen.
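To put those numbers in perspective, a quick back-of-envelope calculation using the article's figures shows how many cross-CCX round trips would fit into a single 60 FPS frame:

```python
# Illustrative arithmetic only, using the article's latency figures.
frame_budget_s = 1 / 60  # ~16.7 ms per frame at 60 FPS

amd_hops = frame_budget_s / 142e-9    # cross-CCX round trips per frame
intel_hops = frame_budget_s / 76e-9   # core-to-core round trips per frame

print(f"AMD cross-CCX: {amd_hops:,.0f} hops per frame")
print(f"Intel:         {intel_hops:,.0f} hops per frame")
```

Over a hundred thousand hops fit into one frame either way, which is part of why it is genuinely hard to say whether the difference is ever felt in games.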

AMD is doing better in various multi-core benchmarks than Intel and vice versa.

I personally think that the best test to measure all this inter-core latency between AMD and Intel would be to run a program or game that utilizes exactly 5 cores (10 threads). This will force the second CCX to be used on AMD, and it will give you an average of performance across the 5 cores on Intel vs AMD. Does anyone know how to perform such a test?
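One way to approximate such a test is to pin a process to a fixed set of logical cores and let the workload saturate them. A minimal sketch, using the Linux-only `os.sched_setaffinity` (on Windows the equivalent would be psutil's `Process.cpu_affinity()` or the Task Manager affinity dialog):

```python
# Sketch: restrict the current process to a chosen set of logical cores.
# Pinning to enough cores on an 8-core Ryzen (e.g. the first 10 logical
# cores for 5 physical cores with SMT) forces work onto the second CCX.
import os

def pin_to(cores):
    os.sched_setaffinity(0, cores)          # 0 = the calling process
    return sorted(os.sched_getaffinity(0))  # confirm what the OS accepted

if __name__ == "__main__":
    n = min(5, os.cpu_count())  # don't exceed what this machine has
    print(pin_to(range(n)))
```

Run a benchmark inside the pinned process and compare results against the same pin count on Intel.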
 
Not sure if core affinity can be manually assigned in Windows for individual cores. I know it can be done in Linux and other OSes for various appliances and usages, but I have no idea if Windows goes as far as allowing that. Nor do I know to what extent Ryzen itself manages its CCXs and core distribution. They might do it in a way that spreads the load evenly (across various cores physically) in order to distribute heat properly, but this is just a very wild, unproven assumption. I do know that the first core (core 0) gets hammered almost always, as it's the default used for most processes. But then, Ryzen could be different. There is a lot that is still unknown, or known to a few and still to be discovered.
As Svent said, they're each their own thread on a pure core, and they perform the same (except, curiously, in StatusCore, but I don't think that's reflective of real-world situations), so unless you have them both loaded there shouldn't be a problem.
As for the heat theory, that's hard for us to really speculate about. Reason being, we don't know how the threads are numbered in each CCX module.
Is it...
1 | 2
3 | 4
or
1 | 3
2 | 4
As loading 1 & 2 could theoretically spread the heat out better, since there'd be a bigger gap between the cores (all the cache lies in between). Whereas in the other layout, running on 1 & 2 would put them smack dab next to each other, and now 1 & 3 are the ones further apart.

Whether you run the High Performance profile also makes a difference. I've been running Balanced in order to leverage the full Core Boost speed, as I've not gotten into overclocking yet. In that instance, one program I have can eat up almost 100% of two threads, but will (if allowed) also spread across 4 cores at much lower loads.
Now, if I tell it via affinity to run on 0-1-2-3, it'll spread out. HOWEVER, if I set it to, say, 0-1-6-7, it'll remain on the first two threads only, and not unpark the latter two, despite being willing to use them if those cores are already unparked. In High Perf, it just picks the first 4, heh.
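As an aside, tools that take an affinity mask (such as `start /affinity` on Windows) expect a bitmask with one bit per logical core; the 0-1-6-7 selection above corresponds to hex C3. A tiny helper shows the mapping:

```python
# Build a Windows-style affinity bitmask from a list of logical cores.
def affinity_mask(cores):
    mask = 0
    for c in cores:
        mask |= 1 << c  # set one bit per logical core
    return mask

print(hex(affinity_mask([0, 1, 6, 7])))  # 0xc3
print(hex(affinity_mask([0, 1, 2, 3])))  # 0xf
```

So `start /affinity C3 app.exe` would be the command-line equivalent of the 0-1-6-7 selection.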


Check this site for info (read the conclusion if you don't wish to read the entire thing), but it does seem that, and I am quoting, "the CCX design of 8-core Ryzen CPUs appears to more closely emulate a 2-socket system". So if I understand that correctly, all cores in one CCX will be utilized before the 4 other cores from the second CCX are.

But I am also reading in that post that even the inter-core communication within the same CCX is done through the Infinity Fabric. Anyway, I will read up more and update.
To me it sounded like multiple other things use the Fabric as well which operate at different speeds (based on that post by TheStilt I had linked you before), but I thought that at the very least the memory transfers occur over the Infinity Fabric? The Infinity Fabric is a bit elusive to me ATM. lol


So what this essentially means is 2 important points:

Point 1: if data is distributed across the first 4 cores (8 threads), AMD would be roughly twice as fast as Intel. However, if data starts moving beyond 4 cores, then Intel is roughly twice as fast as AMD. Now, is this reflected in benchmarks, or would it have ANY real-world impact? Whether it can ever be noticed in games, I am not sure.
Correct me if I'm wrong, but wouldn't that not be the entire picture? Like, doesn't throughput also play a big role in who is actually faster?
Your car might have a quick 0-60MPH time, but if your top speed is only 100MPH, then a slower starting but much higher top-end vehicle will pass you before long.
Though I'll admit, how much that applies to real world situations is an unknown to me *shrug*
 
To me it sounded like multiple other things use the Fabric as well which operate at different speeds (based on that post by TheStilt I had linked you before), but I thought that at the very least the memory transfers occur over the Infinity Fabric? The Infinity Fabric is a bit elusive to me ATM. lol

From what I understand so far, Infinity Fabric is used within the SAME single CCX (in other words, connecting the 4 physical cores together) as well as connecting the two CCXs together (in the case of the Ryzen 7). That said, the difference in latency between pinging cores within the SAME CCX on AMD (42 ns) and pinging cores across the 2 different CCXs (142 ns) is attributed, IMO, to the actual physical distance between the two CCXs. But when it comes to computing and memory, each ns matters; hence memory access is improved by going from, say, CL16 to CL14 and so on. It would take an AMD engineer to explain why AMD decided to accept that 100 ns gap between one CCX and the second.

Yes, throughput of course matters; I have to look up the throughput numbers of AMD and Intel and compare them... There is inter-core throughput and core-to-DRAM throughput that have to be known and compared. But then you have to consider whether your app/game is actually even reaching the limits of that throughput... and if it does not, well, then latency becomes the most important factor.
 
I am not sure I understand what you are trying to say, but if I do, I cannot confirm your findings in all of my time measuring (logical) cores and playing with affinity.
You don't have a main core and then an HT core. You have two logical cores leading to one physical core. It doesn't matter which of these two paths your thread takes; it still ends up on the same physical core.
The only issue is the load that the physical core gets from the other logical core assigned to it.

The belief that there is somehow a real core and an HT core in the list of logical cores you are seeing in Task Manager is just plain wrong.

If I misunderstood what you were trying to say, I apologize.

-- edit --
I wrote "HT core" here; replace it with "CMT extra cores" or whatever the proper name would be for it in AMD's last-gen CPUs.
First, what I meant was that Windows now treats the construction cores as HT. It tends to use 0-2-4-6 before using 1-3-5-7, treating 1-3-5-7 as HT threads instead of actual cores (now, it TENDS to, but could as easily use 1 before 0 rather than 0 before 1, which brings up another point I will make shortly). You have to take these terms loosely.

Now, my other point is this pedantry over terminology when others comment on HT and cores. I am sure we understand what is meant; however, it does open some questions. On Win7, when gaming with the Ryzen chips, from what I have seen, only every other thread was populated, and it was always the first one of each pair. Now, arguing whether the main core is always designated as 0 or 1 is semantics and counterproductive to the point and topic of conversation. It doesn't change that once a core has a thread, the second logical core will be the HT/SMT one.
 
First, what I meant was that Windows now treats the construction cores as HT. It tends to use 0-2-4-6 before using 1-3-5-7, treating 1-3-5-7 as HT threads instead of actual cores (now, it TENDS to, but could as easily use 1 before 0 rather than 0 before 1, which brings up another point I will make shortly). You have to take these terms loosely.

Now, my other point is this pedantry over terminology when others comment on HT and cores. I am sure we understand what is meant; however, it does open some questions. On Win7, when gaming with the Ryzen chips, from what I have seen, only every other thread was populated, and it was always the first one of each pair. Now, arguing whether the main core is always designated as 0 or 1 is semantics and counterproductive to the point and topic of conversation. It doesn't change that once a core has a thread, the second logical core will be the HT/SMT one.

Thanks for explaining it; we are on the same page in regards to HT and cores then. You are talking about the pairing, and not that one is the physical core and one is the logical one. Gotcha.
However, in regards to Windows distributing across the even or odd logical cores first to avoid SMT conflicts, I can't confirm that outside of core parking in Windows 7.
Core parking in Windows 7 does work by disabling every other logical core, which naturally gives threads a more even distribution among the physical cores. However, Win10, and Win7 without core parking, do not seem to have any behavior to avoid SMT conflicts.
I've spent a fair amount of time studying thread distribution behavior under Windows 7 and Windows 10 when developing Project Mercury, and never once did such a behavior occur.
However, I am not stating this as 100% fact, because it is mostly a side observation from other stuff I've analysed.

It would be interesting if you had anything that could serve as evidence that the thread scheduler does indeed work with SMT in mind.
 
From what I understand so far, Infinity Fabric is used within the SAME single CCX (in other words, connecting the 4 physical cores together) as well as connecting the two CCXs together (in the case of the Ryzen 7). That said, the difference in latency between pinging cores within the SAME CCX on AMD (42 ns) and pinging cores across the 2 different CCXs (142 ns) is attributed, IMO, to the actual physical distance between the two CCXs. But when it comes to computing and memory, each ns matters; hence memory access is improved by going from, say, CL16 to CL14 and so on. It would take an AMD engineer to explain why AMD decided to accept that 100 ns gap between one CCX and the second.

Yes, throughput of course matters; I have to look up the throughput numbers of AMD and Intel and compare them... There is inter-core throughput and core-to-DRAM throughput that have to be known and compared. But then you have to consider whether your app/game is actually even reaching the limits of that throughput... and if it does not, well, then latency becomes the most important factor.
So I did some reading, and it appears that all Infinity Fabric is, is a remixed and broader-reaching HyperTransport under a new name. It'll be bringing in basically everything, and everything will incorporate into the Infinity Fabric in some way. Multi-chip solutions will communicate across it. GPUs (starting with Vega) will use it on their end (Cache Coherent Interconnect or whatever it was called) AND tap into it on the CPU side. As well as still being open like it always has been. I applaud the name change though, since now with Ryzen supporting "HT", we won't have "HT" and "HTT" lol (HyperTransport Technology).

Also, all current Ryzen have dual 4C modules; they're just configured (cut) to have X amount of cores. So your "in the case of Ryzen 7" applies to all: Ryzen 3, 5, and 7. I suspect the only time we're going to see a stand-alone 4C/8T Zeppelin module will be on the Raven Ridge APUs. Though I hope we'll see them in Zen+ revisions w/o any GPU, but we'll see. I'm hoping they've left those off the board right now to not waste resources and to see how well Ryzen does, but that probably won't be the case. :p


when developing project mercury.
Like the one commenter on your latest release, I too Googled it quickly and was met with the same fate of only NASA results. Your response explained what your motivation was for using Mercury. As such, would naming it Project Hermes perhaps not be just as viable and avoid confusion? Seeing as the program is still potentially in its infancy as far as being widely known, it might be the best time to consider a change. Just a thought though.

Also, documentation does seem to be missing, on all of your software. heh
As such, may I inquire into what "Phoenix" is (or does) that is on your file server?
 
Like the one commenter on your latest release, I too Googled it quickly and was met with the same fate of only NASA results. Your response explained what your motivation was for using Mercury. As such, would naming it Project Hermes perhaps not be just as viable and avoid confusion? Seeing as the program is still potentially in its infancy as far as being widely known, it might be the best time to consider a change. Just a thought though.

Also, documentation does seem to be missing, on all of your software. heh
As such, may I inquire into what "Phoenix" is (or does) that is on your file server?

I've been using the older version of Project Mercury for a few years now with great results, on my son's FX 6300 with Win7.
Granted, I have this thread bookmarked so I can add it to new rigs if needed.
 
So I did some reading, and it appears that all Infinity Fabric is, is a remixed and broader-reaching HyperTransport under a new name. It'll be bringing in basically everything, and everything will incorporate into the Infinity Fabric in some way. Multi-chip solutions will communicate across it. GPUs (starting with Vega) will use it on their end (Cache Coherent Interconnect or whatever it was called) AND tap into it on the CPU side. As well as still being open like it always has been. I applaud the name change though, since now with Ryzen supporting "HT", we won't have "HT" and "HTT" lol (HyperTransport Technology).

Also, all current Ryzen have dual 4C modules; they're just configured (cut) to have X amount of cores. So your "in the case of Ryzen 7" applies to all: Ryzen 3, 5, and 7. I suspect the only time we're going to see a stand-alone 4C/8T Zeppelin module will be on the Raven Ridge APUs. Though I hope we'll see them in Zen+ revisions w/o any GPU, but we'll see. I'm hoping they've left those off the board right now to not waste resources and to see how well Ryzen does, but that probably won't be the case. :p



Like the one commenter on your latest release, I too Googled it quickly and was met with the same fate of only NASA results. Your response explained what your motivation was for using Mercury. As such, would naming it Project Hermes perhaps not be just as viable and avoid confusion? Seeing as the program is still potentially in its infancy as far as being widely known, it might be the best time to consider a change. Just a thought though.

Also, documentation does seem to be missing, on all of your software. heh
As such, may I inquire into what "Phoenix" is (or does) that is on your file server?

Just to add.
It is probably better to think of the Infinity Fabric in both physical and logical terms, with the perspective of it being both an on-chip switching solution and an evolved HyperTransport (end-to-end communication over PCIe).
Vega, as you mention, will support a unique way to connect directly with the CPU (not technically the same, but the closest comparison may be the NVMe approach that bypasses the older and 'slower' protocols).
The HBCC is a bit different to this, and I'm not sure if that is what you are thinking of, as that is the coherent/unified memory controller for all devices; but we have only seen this in the simplest of implementations to date, and I am still not entirely convinced of its scalability and flexibility for real-world use beyond unified memory between GPU and system memory.

Cheers
 
Also, all current Ryzen have dual 4C modules; they're just configured (cut) to have X amount of cores. So your "in the case of Ryzen 7" applies to all: Ryzen 3, 5, and 7. I suspect the only time we're going to see a stand-alone 4C/8T Zeppelin module will be on the Raven Ridge APUs. Though I hope we'll see them in Zen+ revisions w/o any GPU, but we'll see. I'm hoping they've left those off the board right now to not waste resources and to see how well Ryzen does, but that probably won't be the case. :p

What I meant to say is both CCXs are fully enabled, like in the Ryzen 7 family. Yep, I read that somewhere as well; they just disable or "kill" the physical cores they don't want to use. In fact, they start with the same die, and they choose the best of them for the 1700X and 1800X family. Maybe also the 1700 :)

Say, for whatever reason, some "core" in one of the CCXs is not as "expected"; well, they use that die in the lower Ryzen 3 or Ryzen 5 family. They won't "throw it away" :) Come to think about it, it's BILLIONS of transistors wasted, but they figured it's cheaper than making chips specifically for a certain frequency, core count, and so on.
 
Like the one commenter on your latest release, I too Googled it quickly and was met with the same fate of only NASA results. Your response explained what your motivation was for using Mercury. As such, would naming it Project Hermes perhaps not be just as viable and avoid confusion? Seeing as the program is still potentially in its infancy as far as being widely known, it might be the best time to consider a change. Just a thought though.

Also, documentation does seem to be missing, on all of your software. heh
As such, may I inquire into what "Phoenix" is (or does) that is on your file server?

Yeah, I'm not the best at documentation. I much prefer analysing stuff and working things out. Documentation is just so boring.

I do regret the name; sadly, I was not aware of the NASA project when I named it.
Hermes might not be a bad name.

Project Phoenix is just a collection of stress tools with a simple GUI.
It has Prime95 and Linpack for CPU stress testing and FurMark for GPU,
as well as some memory testing.
I tuned the load based on measuring wattage draw from the wall, and it is pretty nasty.
 
The HBCC is a bit different to this and not sure if that is what you are thinking of as that is the coherent/unified memory controller for all devices, but we have only seen this in the most simplest of implementations to date and I am still not entirely convinced upon its scalability and flexibility for use in real world beyond unified memory between GPU and system memory.
Yea yea, that was what I was thinking of! Thanks. Though there IS a "Coherent Fabric" too, but I don't know quite what it is or which products it's specifically going to be part of (probably servers), as I didn't dig into it, heh.
I was a big fan of all the HSA stuff when it was getting rolled out, and I'm really bummed that it doesn't seem to be getting utilized much, or if it is, it's not getting talked about. At the same time, I won't ignore the fact that if it isn't being utilized to the extent it could be, it's purely due to AMD having been so far behind that no one really wanted to bother. :\ So I'm really hoping that now, with Ryzen actually contending with Intel, maybe it'll make folk rethink it all, as that seems to be pretty much what the High Bandwidth Cache Controller (HBCC) is trying to accomplish. I won't deny that I too am skeptical it'll play out the way AMD has touted it, but I really like what they've been trying to accomplish these past number of years, despite pretty much all of it getting ignored except for Mantle. Which, yes, Mantle was largely ignored by the industry... except I think it served its purpose by bringing us Vulkan and the underlying ambitions of DX12. Hell, maybe we'll get lucky and see actual 3D sound chips (cards) again if M$ decides to bring back DirectSound! :D


Say for whatever reason some "core" in one of the CCX is not as "expected", well they use that die in the lower Ryzen 3 or Ryzen 5 family. They wont "throw it away" :) Come to think about it, its BILLIONS of transistors wasted, but they figured its cheaper than making chips specifically for a certain frequency, core count, and so on.
It's a smart way to diversify your product stack, that's for sure. Though I'm still not quite sure why they wouldn't have decided to cut the entire second module in the 4C Ryzen, instead of the 2+2 design they went with, unless it was purely for the "16MB L3 Cache Quad Core!!" marketing angle. Thing is, I'd think there'd be more to gain performance-wise from nixing the entire module, which would result in just as good sales and better consumer benefit, versus a "MOAR CACHE!" marketing bullet point.

Though I didn't consider this until just now... there could be very real functional reasons behind their choice. Like, if they don't keep the other module active, then perhaps the chips would lose those 12 PCIe lanes, since each module has a block of PCIe circuitry positioned next to it. The world may never know. heh


Yeah, I'm not the best at documentation. I much prefer analysing stuff and working things out. Documentation is just so boring.

I do regret the name; sadly, I was not aware of the NASA project when I named it.
Hermes might not be a bad name.

Project Phoenix is just a collection of stress tools with a simple GUI.
It has Prime95 and Linpack for CPU stress testing and FurMark for GPU,
as well as some memory testing.
I tuned the load based on measuring wattage draw from the wall, and it is pretty nasty.
I don't blame you, really, lol. But at the same time, even a page for each of your programs, doing nothing more than detailing what the program is, would be a big help. For example, you could quite literally copy and paste exactly what you wrote there about Project Phoenix, and then any visitors to your page would at least know. :) It was short but to the point, and anyone who is interested in something like that won't necessarily need full documentation for it.

When I made the suggestion of Hermes originally, I didn't know that your Project Mercury had been around that long. That makes the name change a much harder decision.
However... lmao, looks like Hermes isn't even the best choice either :cry: Not only is there this:
https://en.wikipedia.org/wiki/Hermes_(missile_program)
but also various other unrelated things sharing the name, just as with the Project Mercury name.
Seeing as my Minecraft server is Greek/Roman themed (alright, Percy Jackson, but I bought into it so... don't judge me! LOL), I'll keep it in the back of my mind and maybe I'll come up with an alternative that's more viable. :smug:
 
Though I didn't consider this until now... there could be very real functional reasons behind their choice. Like, if they don't keep the other module active, then perhaps the chips would lose those 12 PCIe lanes, since each module has a block of PCIe circuitry positioned next to it. The world may never know. heh

Well, Intel got away with putting all 8 cores in one single die; think of it as one single CCX. We know that AMD can disable a core on each CCX. So that means if they did like Intel and put 8 cores in a single CCX, they could still disable 2, 3, or more cores as they wish. Cache-wise, well, they can increase the cache for the entire CCX (if it were one) as well. Look, if Intel has done it and it works very well, I don't see why AMD would not do it. It would get rid of that 100 ns delay going between one CCX and the other.

I think the reasons might be engineering related more than marketing, and I cannot answer further than that :)
 
Hey look! Another video addressing Ryzen memory speed and its effect on gaming with different levels of Nvidia GPU!!! Let's take a look.


I wonder where my overclocked 1070 would stack up...
 
So many nOObie characters popping up in this thread. And all so opinionated in one direction. Curious that.

At least that is what the forum is for, but why necro a month-old thread?

We haven't heard the latest about RAM overclocking: the beta BIOS for the Asus Crosshair VI Hero has people already posting at 4000MHz, with plenty to come before the end of this year.
 
At least that is what the forum is for, but why necro a month-old thread?

We haven't heard the latest about RAM overclocking: the beta BIOS for the Asus Crosshair VI Hero has people already posting at 4000MHz, with plenty to come before the end of this year.

Sounds awesome. The piss-poor RAM clocks are what mostly put me off Ryzen.

Gotta link?

Cheers :)
 
Not likely. Even if these results are due to increased bandwidth (and not just increased interconnect speed or decreased latency), core speeds will be lower on the higher core-count products, meaning that poorly threaded applications will still be limited and fall behind the fastest single-core product, currently the 7700K.

Supposedly the "R9" stuff will run more cores at the same clocks as the R7 stuff. Who knows.
 
https://community.amd.com/community/gaming/blog/2017/05/25/community-update-4-lets-talk-dram

I'm not too sure where I read that someone has already been posting at 4000; given the nature of these beta BIOSes, it could be a fluke. But we're going to see more results as they release new BIOSes.

Interesting read, thanks.
AGESA 1.0.0.6 update said:
"Added dividers for memory clocks up to DDR4-4000 without refclk adjustment. Please note that values greater than DDR4-2667 are overclocking. Your mileage may vary (as noted by our big overclocking warning at the end of this blog)."

So they've added dividers for DDR4-4000, and if you got a lucky IMC you can now, theoretically, POST there! This is new, right?
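For reference, those divider names map to actual clocks simply: DDR4 is double data rate, so the memory clock is half the advertised transfer rate, and on Ryzen the data fabric is widely reported to run in step with the memory clock (treat that 1:1 coupling as an assumption here):

```python
# DDR4 transfer rate -> actual memory clock (and, assuming the commonly
# reported 1:1 coupling on first-gen Ryzen, the data-fabric clock too).
def memclk_mhz(ddr_rate):
    return ddr_rate / 2

for rate in (2667, 3200, 4000):
    print(f"DDR4-{rate}: {memclk_mhz(rate):.1f} MHz memory clock")
```

So a DDR4-4000 divider means a 2000 MHz memory clock, which is why the new dividers matter beyond the headline number.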
 
Yea, I saw AMD's update posted elsewhere. Can [H] run a retest? This is a pretty substantial bump.
I would not bother ;) To be honest, the thing is so new that I would wait until the end of the year, or maybe until after the summer, to see where most of the AM4 platform performs at these speeds.
Interesting read, thanks.
So they've added dividers for DDR4-4000, and if you got a lucky IMC you can now, theoretically, POST there! This is new, right?

I usually check what Chew (xtremesystems.org) is doing, or the mega thread over on overclockers.net. What I first thought were IMC-related problems were simply BIOS limitations. The point is, without some time to test these things, it is hard to pinpoint where the bottleneck for AM4 memory is.
 
This is also the second video I've seen where a GTX1070 + High settings (vs the GTX1080 + Ultra we see most places) seems to oddly favor Ryzen over the 7700k in a lot of titles it usually doesn't. Weird.

Who cares about anything but maxed settings. Ever.

I bet your desk only costs 5K.
 
The Infinity Fabric is tied to RAM speed, but there's a cutoff around 3000-3200MHz where the gains become negligible above that.

A case in point: I recently bought G.Skill DDR4-4266 with CL19-19-19-19-38. I knew I had no prayer of getting anywhere near that speed, but I liked the tight timings and hoped I could go to 3466MHz. My 1800X has a very mediocre IMC, so the highest I could boot was 3333MHz, with 3466 a no-go. 3300MHz could not be stabilized, but at 3200MHz at 14-14-14-34 CR 1T, the same settings as on the Flare X DDR4-3200 that I sold off, I am consistently getting 129.45 FPS in Cinebench R15.038, as compared to 123.7 FPS on the Flare X. No, it is not Infinity Fabric, but less noise in the super-binned 4266MHz DIMMs when running at the same speed and settings as the Flare X. This is something Chew has spoken of: reducing electronic noise in the ecosystem of your PC can raise performance as much as memory speed and timings can.
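The "tight timings at a lower strap" tradeoff above is easy to quantify: first-word CAS latency in nanoseconds is the CL count of cycles at the memory clock, which works out to CL x 2000 / transfer rate:

```python
# First-word CAS latency: CL cycles at the memory clock, where the
# memory clock is half the DDR transfer rate (hence the factor 2000).
def cas_ns(cl, ddr_rate):
    return cl * 2000 / ddr_rate

print(f"DDR4-3200 CL14: {cas_ns(14, 3200):.2f} ns")  # 8.75 ns
print(f"DDR4-4266 CL19: {cas_ns(19, 4266):.2f} ns")  # ~8.91 ns
```

In other words, CL14 at 3200 and CL19 at 4266 land at almost the same absolute latency, which is why the binned kit running at 3200 CL14 behaves so similarly to the Flare X on paper.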
 