x58 triple channel memory performance vs modern dual channel configurations

hyper-nova

n00b
Joined
Aug 3, 2016
Messages
10
Hi all,

I'm interested to know (either with real world benchmark data or calculations) what the memory performance is like between older x58 triple channel platforms and modern dual channel configurations such as the AMD 3600 platform. (I don't know the exact chipset name/number, I believe there are a few available.)

I have a PC which is nearly 10 years old now, which serves as my workstation. I also play some games on it during lockdown.

I don't know if there is much to be gained by upgrading my system. I am somewhat tempted by the new AMD 3600. So far I've never upgraded because

1: I don't need to
2: I bought this system a couple of years back for a decent price, and although it was last gen hardware at the time it was still top class performance for its time. I never saw any information which indicated that I would gain a lot through an upgrade.

Things have moved on somewhat, and there are a few things that I now do need, or would be nice to have. The most obvious is USB 3, either in the form of the 5Gb/s or 10Gb/s version. But there are other things. I'm still using PCI-E 2.0, for example, and this board has a load of firewire, etc, interfaces which I just don't use.

Can anyone direct me to information about memory performance, either in real world measurements or calculations?

For example, this system has CL10 DDR3 running at 1333MHz. I believe that this means the latency is 10 / (1333e6 / 2) * 1e9 ns = 15.0 ns.

That is CL latency (clocks) / (speed in Hz / 2) = latency in units of seconds

Then the bandwidth is 1333e6 * 8 bytes/s ≈ 10.7 GB/s.

That is DDR frequency (and it is double because data comes out both when the clock goes high as well as low) * 8 bytes, because data comes out in widths of 64 bits.
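
The two formulas above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes a 64-bit (8-byte) channel and 1 GB = 10^9 bytes.

```python
def cas_latency_ns(cas_clocks, transfers_per_sec):
    """CAS latency in nanoseconds for a DDR module.

    The real clock is half the advertised transfer rate,
    because DDR moves data on both clock edges.
    """
    clock_hz = transfers_per_sec / 2
    return cas_clocks / clock_hz * 1e9  # seconds -> nanoseconds


def peak_bandwidth_gb_s(transfers_per_sec, bus_bytes=8):
    """Theoretical peak bandwidth of one channel in GB/s (1 GB = 1e9 bytes)."""
    return transfers_per_sec * bus_bytes / 1e9


print(round(cas_latency_ns(10, 1333e6), 2))   # DDR3-1333 CL10
print(round(peak_bandwidth_gb_s(1333e6), 2))  # one 64-bit channel
```

These are theoretical per-channel figures; measured numbers will be lower.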

These numbers don't seem that great by modern standards. Are my calculations correct? If we have things like 10Gb/s LAN and 10Gb/s USB (ok, bits per second not bytes), then I would have thought modern CPU/RAM combinations would have much higher bandwidths.

In particular, is my bandwidth calculation correct? It seems sensible that latency has a limit in that the higher the frequency, the longer the CL latency must be for the data to arrive on the memory controller bus. However, does bandwidth really scale linearly with frequency?
 
Oh boy, the rabbit hole. First, what are you trying to figure out? Just because it's called DDR3 1333 doesn't mean it runs at 1333MHz... it just means it's "effectively" 1333MHz, which already takes the clock doubling into account. Fun fact, DDR3 1333 RAM chips actually run at 166.666MHz... the IO bus runs at 666MHz, and since it's DDR (double data rate) it's effectively 1333MHz. So it's pretty simple math: 1333 * 64 / 8 = MB/s, so DDR3 1333 = 10,666MB/s or 10.6GB/s. This is bytes, not bits. In theory a single stick of DDR3 could feed about 8x 10Gb/s Ethernet cards (if nothing else was going on), since 10.6GB/s is roughly 85Gb/s. Triple channel means you can triple this value (32,000MB/s or 32GB/s). Latency is the time per clock cycle * CAS.
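
As a rough sketch of the arithmetic in this post (the function names are invented for illustration, and DDR3-1333 is taken as its nominal 1333.33 MT/s): DDR3 fetches 8 bits per pin per array cycle, so the chips run at 1/8 of the transfer rate and the IO bus at 1/2.

```python
def ddr3_clocks_mhz(transfer_rate_mt_s):
    """Return (memory array clock, I/O bus clock) in MHz for DDR3.

    DDR3 prefetches 8 bits per pin per array cycle, so the array runs at
    1/8 of the transfer rate; the I/O bus runs at 1/2 (double data rate).
    """
    return transfer_rate_mt_s / 8, transfer_rate_mt_s / 2


def peak_bandwidth_mb_s(transfer_rate_mt_s, channels=1, bus_bits=64):
    """Theoretical peak in MB/s: transfers per second times bytes per transfer."""
    return transfer_rate_mt_s * (bus_bits / 8) * channels


rate = 4000 / 3  # DDR3-1333 is nominally 1333.33 MT/s
array_mhz, io_mhz = ddr3_clocks_mhz(rate)
single = peak_bandwidth_mb_s(rate)
triple = peak_bandwidth_mb_s(rate, channels=3)
links = single * 8 / 10_000  # MB/s -> Mb/s, divided by one 10 Gb/s link
print(f"{array_mhz:.1f} MHz array, {io_mhz:.1f} MHz I/O")
print(f"{single:.0f} MB/s single, {triple:.0f} MB/s triple, {links:.1f} x 10GbE")
```

The channel multiplier is the theoretical best case; it assumes accesses are spread evenly over all channels.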

Armed with this knowledge, here is my spreadsheet when I got bored and wanted to compare latency for different memory types (SINGLE CHANNEL).

Ram Type | Effective Clock (MHz) | Ram | Clock Cycle Time (ns) | CAS | Latency (ns) | Bandwidth (MB/s) | GB/s
--- | --- | --- | --- | --- | --- | --- | ---
SDR | 100 | SDR-100 | 10.00 | 3 | 30.00 | 800 | 0.8
SDR | 133 | SDR-133 | 7.52 | 3 | 22.56 | 1064 | 1.064
DDR | 200 | DDR-200 | 10.00 | 2.5 | 25.00 | 1600 | 1.6
DDR | 266 | DDR-266 | 7.52 | 2.5 | 18.80 | 2128 | 2.128
DDR | 333 | DDR-333 | 6.01 | 2.5 | 15.02 | 2664 | 2.664
DDR | 400 | DDR-400 | 5.00 | 3 | 15.00 | 3200 | 3.2
DDR2 | 400 | DDR2-400 | 5.00 | 3 | 15.00 | 3200 | 3.2
DDR2 | 400 | DDR2-400 | 5.00 | 4 | 20.00 | 3200 | 3.2
DDR2 | 533 | DDR2-533 | 3.75 | 3 | 11.26 | 4264 | 4.264
DDR2 | 533 | DDR2-533 | 3.75 | 4 | 15.01 | 4264 | 4.264
DDR2 | 667 | DDR2-667 | 3.00 | 4 | 11.99 | 5336 | 5.336
DDR2 | 667 | DDR2-667 | 3.00 | 5 | 14.99 | 5336 | 5.336
DDR2 | 800 | DDR2-800 | 2.50 | 4 | 10.00 | 6400 | 6.4
DDR2 | 800 | DDR2-800 | 2.50 | 5 | 12.50 | 6400 | 6.4
DDR2 | 800 | DDR2-800 | 2.50 | 6 | 15.00 | 6400 | 6.4
DDR2 | 1066 | DDR2-1066 | 1.88 | 6 | 11.26 | 8528 | 8.528
DDR2 | 1066 | DDR2-1066 | 1.88 | 7 | 13.13 | 8528 | 8.528
DDR3 | 800 | DDR3-800 | 2.50 | 5 | 12.50 | 6400 | 6.4
DDR3 | 800 | DDR3-800 | 2.50 | 6 | 15.00 | 6400 | 6.4
DDR3 | 1066 | DDR3-1066 | 1.88 | 6 | 11.26 | 8528 | 8.528
DDR3 | 1066 | DDR3-1066 | 1.88 | 7 | 13.13 | 8528 | 8.528
DDR3 | 1066 | DDR3-1066 | 1.88 | 8 | 15.01 | 8528 | 8.528
DDR3 | 1333 | DDR3-1333 | 1.50 | 7 | 10.50 | 10664 | 10.664
DDR3 | 1333 | DDR3-1333 | 1.50 | 8 | 12.00 | 10664 | 10.664
DDR3 | 1333 | DDR3-1333 | 1.50 | 9 | 13.50 | 10664 | 10.664
DDR3 | 1333 | DDR3-1333 | 1.50 | 10 | 15.00 | 10664 | 10.664
DDR3 | 1600 | DDR3-1600 | 1.25 | 8 | 10.00 | 12800 | 12.8
DDR3 | 1600 | DDR3-1600 | 1.25 | 9 | 11.25 | 12800 | 12.8
DDR3 | 1600 | DDR3-1600 | 1.25 | 10 | 12.50 | 12800 | 12.8
DDR3 | 1600 | DDR3-1600 | 1.25 | 11 | 13.75 | 12800 | 12.8
DDR3 | 1866 | DDR3-1866 | 1.07 | 10 | 10.72 | 14928 | 14.928
DDR3 | 1866 | DDR3-1866 | 1.07 | 11 | 11.79 | 14928 | 14.928
DDR3 | 1866 | DDR3-1866 | 1.07 | 12 | 12.86 | 14928 | 14.928
DDR3 | 1866 | DDR3-1866 | 1.07 | 13 | 13.93 | 14928 | 14.928
DDR3 | 2133 | DDR3-2133 | 0.94 | 11 | 10.31 | 17064 | 17.064
DDR3 | 2133 | DDR3-2133 | 0.94 | 12 | 11.25 | 17064 | 17.064
DDR3 | 2133 | DDR3-2133 | 0.94 | 13 | 12.19 | 17064 | 17.064
DDR3 | 2133 | DDR3-2133 | 0.94 | 14 | 13.13 | 17064 | 17.064
DDR4 | 1600 | DDR4-1600 | 1.25 | 10 | 12.50 | 12800 | 12.8
DDR4 | 1600 | DDR4-1600 | 1.25 | 11 | 13.75 | 12800 | 12.8
DDR4 | 1600 | DDR4-1600 | 1.25 | 12 | 15.00 | 12800 | 12.8
DDR4 | 1866 | DDR4-1866 | 1.07 | 12 | 12.86 | 14928 | 14.928
DDR4 | 1866 | DDR4-1866 | 1.07 | 13 | 13.93 | 14928 | 14.928
DDR4 | 1866 | DDR4-1866 | 1.07 | 14 | 15.01 | 14928 | 14.928
DDR4 | 2133 | DDR4-2133 | 0.94 | 14 | 13.13 | 17064 | 17.064
DDR4 | 2133 | DDR4-2133 | 0.94 | 15 | 14.06 | 17064 | 17.064
DDR4 | 2133 | DDR4-2133 | 0.94 | 16 | 15.00 | 17064 | 17.064
DDR4 | 2400 | DDR4-2400 | 0.83 | 15 | 12.50 | 19200 | 19.2
DDR4 | 2400 | DDR4-2400 | 0.83 | 16 | 13.33 | 19200 | 19.2
DDR4 | 2400 | DDR4-2400 | 0.83 | 17 | 14.17 | 19200 | 19.2
DDR4 | 2400 | DDR4-2400 | 0.83 | 18 | 15.00 | 19200 | 19.2
DDR4 | 2666 | DDR4-2666 | 0.75 | 17 | 12.75 | 21328 | 21.328
DDR4 | 2666 | DDR4-2666 | 0.75 | 18 | 13.50 | 21328 | 21.328
DDR4 | 2666 | DDR4-2666 | 0.75 | 19 | 14.25 | 21328 | 21.328
DDR4 | 2666 | DDR4-2666 | 0.75 | 20 | 15.00 | 21328 | 21.328
DDR4 | 2933 | DDR4-2933 | 0.68 | 19 | 12.96 | 23464 | 23.464
DDR4 | 2933 | DDR4-2933 | 0.68 | 20 | 13.64 | 23464 | 23.464
DDR4 | 2933 | DDR4-2933 | 0.68 | 21 | 14.32 | 23464 | 23.464
DDR4 | 2933 | DDR4-2933 | 0.68 | 22 | 15.00 | 23464 | 23.464
DDR4 | 3200 | DDR4-3200 | 0.63 | 20 | 12.50 | 25600 | 25.6
DDR4 | 3200 | DDR4-3200 | 0.63 | 22 | 13.75 | 25600 | 25.6
DDR4 | 3200 | DDR4-3200 | 0.63 | 24 | 15.00 | 25600 | 25.6
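
The spreadsheet itself isn't posted, so the following is only a guess at how each row could be computed, using the relationships described earlier in the thread. `table_row` is a hypothetical helper, and it assumes a 64-bit channel and that SDR transfers once per clock while every DDR generation transfers twice.

```python
def table_row(ram_type, rate, cas):
    """Compute one table row from type, effective clock (MT/s) and CAS."""
    pumps = 1 if ram_type == "SDR" else 2      # transfers per clock cycle
    cycle_ns = 1000 * pumps / rate             # clock period in ns
    latency_ns = cycle_ns * cas                # CAS latency in ns
    bandwidth_mb_s = rate * 8                  # 64-bit channel = 8 bytes/transfer
    return (ram_type, rate, f"{ram_type}-{rate}", round(cycle_ns, 2),
            cas, round(latency_ns, 2), bandwidth_mb_s, bandwidth_mb_s / 1000)


for row in [table_row("SDR", 100, 3), table_row("DDR3", 1333, 10),
            table_row("DDR4", 3200, 20)]:
    print(" | ".join(str(v) for v in row))
```

Values can differ from the table in the last digit, depending on how the spreadsheet rounds.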
 
Ok cool, yeah looks like I missed the bit about 3 channels = 3x bandwidth. Ok interesting. Two further questions come to mind.

Firstly regarding this table. Is there any way to know how latency and bandwidth work together to influence real world performance? I guess not, right? Or at least the situation is presumably complex. I know quite a lot about programming, right down to basic assembler level, but really I have no clue how the hardware electronics works to this level of detail. Say I read a block of memory, and the block is large - presumably this is bandwidth limited, because presumably for one continuous block of memory only one request to RAM is made and the memory controller handles storing the data in cache, and it comes out as one continuous stream. But having said that I don't know if this is the case, it is just a guess. I'm not a CPU architect / engineer.

The further question then is really the following. For someone in my situation, it doesn't seem like it makes a lot of sense to upgrade. Even at stock speeds I'm getting bandwidths of ~ 30 GB/s, and modern dual channel systems are only about ~ 40 GB/s, maybe up to about 50 GB/s at the higher end, but this is a fairly expensive upgrade. Quad channel is something I could not afford. It certainly doesn't make sense on a performance/price basis. There's also the issue that high core count CPUs are likely to have lower single thread performance, and most of what I do is single thread. Then latency is only likely to increase. Sorry that's not really a question but a statement. I presume what I said is correct. I guess the caveat is that modern CPUs probably contain more complex instructions, but again I'm not a compiler engineer so I wouldn't be able to tell you if that translates into real world performance boosts for the kind of work I do.
 
RAM speed affects real world performance by improving the efficiency of the CPU on tasks that frequently miss the CPU cache. In games, overclocking my RAM can improve CPU performance by up to ~30%, which is more than overclocking the CPU itself. In other programs like Cinebench, which fit neatly into the CPU cache and don't need random data from RAM to continue processing, CPU performance improves by less than 1% from the faster RAM.
Some programs care more about latency and others care more about bandwidth.

Here is one of the better overclocked triple channel tests I could find for X58, but with unknown RAM speed.
https://www.userbenchmark.com/UserRun/17594762

MC Read 23.5
MC Write 18.1
MC Mixed 23.2
AVG 21.6 GB/s
SC Read 14
SC Write 13.6
SC Mixed 16.5
AVG 14.7 GB/s
Latency 48.4ns

Vs my dual channel System RAM at 2400c15 SPD
https://www.userbenchmark.com/UserRun/22497480

MC Read 31.7
MC Write 33
MC Mixed 25.5
AVG 30.1 GB/s
SC Read 22
SC Write 35.1
SC Mixed 26.9
AVG 28 GB/s
Latency 54.4ns

RAM at 3600c14 tweaked
https://www.userbenchmark.com/UserRun/21737762
MC Read 47.5
MC Write 48.2
MC Mixed 38.5
AVG 44.7 GB/s
SC Read 28.1
SC Write 52.4
SC Mixed 39.7
AVG 40.1 GB/s
Latency 39.6ns
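
One way to read these results is to compare the measured multi-core read figure with the theoretical peak. The sketch below assumes "2400c15" and "3600c14" mean DDR4-2400 and DDR4-3600 on 64-bit channels; the X58 run is left out because its RAM speed is unknown.

```python
def peak_gb_s(rate_mt_s, channels):
    """Theoretical peak in GB/s for 64-bit channels."""
    return rate_mt_s * 8 * channels / 1000


# (label, MT/s, channels, measured multi-core read in GB/s from the runs above)
runs = [
    ("DDR4-2400 dual channel", 2400, 2, 31.7),
    ("DDR4-3600 dual channel", 3600, 2, 47.5),
]

for label, rate, channels, measured in runs:
    peak = peak_gb_s(rate, channels)
    print(f"{label}: {measured} of {peak:.1f} GB/s = {measured / peak:.0%} of peak")
```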
 
Very complex interactions, but to simplify, latency affects things that require random access, while bandwidth is how much data will come through if it's all contiguous. So if it's a very random workload, lower speed memory with tighter timings can give better results than just buying the highest MHz. This is why the transitions from DDR to DDR2 and then DDR2 to DDR3 didn't net huge benefits in all workloads (and even caused regressions in some). Only after a few iterations, when they can crank up the frequency and tighten the timings, does the real benefit happen. Newer CPUs have a lot of cache now as well, so it masks memory speeds to an extent until cache misses. So even just comparing memory speeds only tells you so much, as the architecture and CPU designs have changed as well.

If you're memory constrained it may not do much to upgrade; it really depends on your workload, honestly. I have triple channel but only 1066... But I can tell even my Ryzen 1600 with 16GB dual channel feels much faster than my dual Xeon triple channel with 92GB... Not just due to memory, as I'm more CPU limited with a mix of heavily threaded and not very threaded things. Keep in mind, servers were meant to handle large amounts of data, so triple channel 1333 with tight timings won't be far off of (or possibly better than) DDR4 with crappy timings.

Real world performance is hard to compare directly unless the platform supports both memory types, like when AMD CPUs supported both ddr2 and ddr3 depending on the MB you bought. Nowadays you can't really do direct comparisons because a new architecture and changes in CPU cache can easily modify the results.
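
A very crude way to picture the random vs contiguous point is to treat one access as "latency, then stream the data at full bandwidth". This sketch uses made-up round numbers (50 ns, 30 GB/s) and ignores caches, prefetching and overlapping requests, so treat it as an illustration only.

```python
def transfer_time_ns(size_bytes, latency_ns, bandwidth_gb_s):
    """First-order estimate: one access latency plus streaming time.

    1 GB/s equals 1 byte per nanosecond, so size / bandwidth is already in ns.
    """
    return latency_ns + size_bytes / bandwidth_gb_s


def latency_share(size_bytes, latency_ns, bandwidth_gb_s):
    """Fraction of the estimated time spent waiting on latency."""
    return latency_ns / transfer_time_ns(size_bytes, latency_ns, bandwidth_gb_s)


# Illustrative figures only: 50 ns latency, 30 GB/s bandwidth.
for size in (64, 4096, 1_048_576):
    share = latency_share(size, 50.0, 30.0)
    print(f"{size:>8} bytes: {share:.1%} of the time is latency")
```

In this model a single 64-byte cache line is almost all latency, while a 1 MiB contiguous read is almost all bandwidth.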
 
Btw, you have the general idea about how it works. Random read vs. contiguous, depending on work load one may be more important (timing) vs the other (max bandwidth).

Sorry, another edit. Don't forget about other things like SATA3 vs SATA2 or PCIe 2.0 vs 3.0, which all make a difference in how the PC handles different things. My SSD in my server on SATA2 is a dog compared to the NVMe in my daughter's Pentium... regardless of it having 12 cores/24 threads with 96GB vs 2 cores/4 threads with 8GB.
 
That table has some seriously high latencies for DDR4. Yeah, I know they're the JEDEC standard specs, but most platforms can drive RAM much tighter and faster than JEDEC. With Zen2, you can get DDR4-3733 CL17, which would give nearly twice the bandwidth in dual channel, at nearly half the latency, compared to DDR3-1333 CL10 in triple channel.
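
For reference, the theoretical numbers behind that comparison can be worked out as below. This is a sketch assuming 64-bit channels and CAS latency only; the helper names are invented.

```python
def peak_gb_s(rate_mt_s, channels):
    """Theoretical peak in GB/s for 64-bit channels."""
    return rate_mt_s * 8 * channels / 1000


def cas_ns(cas, rate_mt_s):
    """CAS latency in ns; the clock period is 2000 / rate for DDR memory."""
    return cas * 2000 / rate_mt_s


old = (peak_gb_s(1333, 3), cas_ns(10, 1333))   # DDR3-1333 CL10, triple channel
new = (peak_gb_s(3733, 2), cas_ns(17, 3733))   # DDR4-3733 CL17, dual channel

print(f"bandwidth: {old[0]:.1f} -> {new[0]:.1f} GB/s ({new[0] / old[0]:.2f}x)")
print(f"CAS latency: {old[1]:.1f} -> {new[1]:.1f} ns ({new[1] / old[1]:.2f}x)")
```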
 
Lol, it was my homemade table that I made a while back, before they got the timings tightened up (I put this together not too long after I built my son's 6600K, which we bought the day it released, to give you an idea of RAM availability). It wasn't supposed to be all encompassing. If you want, or think it would help, I can add other timings and speeds and post it back here. It won't take much effort since it calculates everything for me. It does ignore all the other timings, which also have an effect.
 
I didn't mean to come off as critical. Just wanted to add to what you already said to be sure the OP was aware that there were options that were higher bandwidth and lower latency.
 
A couple of updated speeds for you. Basically the fastest I could find for each speed (didn't do all kits, just some common ones; obviously price goes up with speed). I was very surprised to see a 3800 kit with CL14... I wonder if that was at 3800, or whether it could just hold CL14 at regular speeds (like my RAM kit can hold better timings at 3000 than at 3200). Also, found a sweet DDR3 kit... DDR3-3200 @ CAS9... not sure there is anything that can actually run it that high without some major overclocking, but hey, it exists and has crazy low latency. If you could find a DDR3 system that can run 3 channel, that would be more bandwidth than DDR4-3800 and much better latency. Also found a DDR2 kit with CAS5, lol. Better latency than most DDR4 kits (not much for bandwidth).

Ram Type | Effective Clock (MHz) | Ram | Clock Cycle Time (ns) | CAS | Latency (ns) | Bandwidth (MB/s) | GB/s
--- | --- | --- | --- | --- | --- | --- | ---
DDR2 | 1200 | DDR2-1200 | 1.67 | 5 | 8.33 | 9600 | 9.6
DDR3 | 3200 | DDR3-3200 | 0.63 | 9 | 5.63 | 25600 | 25.6
DDR4 | 2400 | DDR4-2400 | 0.83 | 10 | 8.33 | 19200 | 19.2
DDR4 | 2666 | DDR4-2666 | 0.75 | 13 | 9.75 | 21328 | 21.328
DDR4 | 3000 | DDR4-3000 | 0.67 | 13 | 8.67 | 24000 | 24
DDR4 | 3200 | DDR4-3200 | 0.63 | 14 | 8.75 | 25600 | 25.6
DDR4 | 3600 | DDR4-3600 | 0.56 | 14 | 7.78 | 28800 | 28.8
DDR4 | 3800 | DDR4-3800 | 0.53 | 14 | 7.37 | 30400 | 30.4

Edit: One last one, his original speed I found with CAS6...
DDR3 | 1333 | DDR3-1333 | 1.50 | 6 | 9.00 | 10664 | 10.664

I didn't take it as critical, just explaining the lack of newer items in the list ;).
 
Pretty useful info actually - and yes good point about updated SATA standards. My MB of course does not have M.2.
 
Well if you learned a little bit then it was worth sharing, lol. Good luck with whatever you're trying to do.
 
Also, found a sweet DDR3 kit... DDR3-3200 @ CAS9... not sure there is anything that can actually run it that high without some major overclocking, but hey, it exists and has crazy low latency. If you could find a DDR3 system that can run 3 channel, that would be more bandwidth than DDR4-3800 and much better latency.
Even if a CPU/MB could handle that speed, memory controllers tend to become a bandwidth bottleneck, where bandwidth doesn't continue to scale with increased frequency.
Dual channel 4790K DDR3 is capable of higher bandwidth than the tri channel X58 DDR3 for this reason, but bandwidth still hits a wall around DDR3 2400-2666 on the 4790K.
 
I understand, I was just looking at the ridiculously low latency and relatively high throughput. I understand there isn't much that could take advantage of it, but it also shows that if a recent platform had DDR3 support, it would/could probably be just as fast as most DDR4 installations.
 
Yes absolutely, thanks very much for the info. These things are not just useful to know, they are also interesting just to gain a better understanding of how things work. All the best
 
I have an x58 system too, upgraded with a BIOS mod to be compatible with server CPUs, so I got myself the x5690. There is nothing better.

I also bought the latest DDR3 RAM I could buy. Some Trident X stuff.
I can't even get it to run at its claimed speed because I'm limited by the motherboard and chipset.

So with perfect RAM and the ultimate CPU, I went out to play a video game from 2018.

Ouch. It was my GPU that was under-performing. An R9 Fury X from 2015. It's "enthusiast class," so it's in the realm of the gods.

Yet it staggers.

So in other words, in the applications you're having issues with, check your bottlenecks. If your RAM is only 50% utilized and your CPU never goes above 50%, but your GPU is at a solid 100% usage, then you know what to upgrade.

Of course a more modern CPU and motherboard would be nice, but if you want bang-for-buck, just get a better GPU if that's what your statistics tell you you need.
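
The check described above boils down to "find whichever component is pinned while the problem application runs". A toy sketch, where `likely_bottleneck` and the 95% threshold are arbitrary assumptions and the usage numbers would come from whatever monitoring tool you already use:

```python
def likely_bottleneck(peak_usage_percent, threshold=95.0):
    """Return the components whose peak utilisation reached the threshold.

    peak_usage_percent maps a component name to the highest utilisation
    (0-100) seen while running the problem application.
    """
    return sorted(name for name, usage in peak_usage_percent.items()
                  if usage >= threshold)


# The example from the post: RAM and CPU around 50%, GPU pinned at 100%.
print(likely_bottleneck({"ram": 50.0, "cpu": 50.0, "gpu": 100.0}))
```

One caveat: a mostly single-threaded program can max out one core while total CPU usage still looks low, so per-core figures are more telling than the overall percentage.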
 
I'm running a Radeon Fury Nano in my desktop (ITX build) and a Radeon Fury X in my wife's desktop. We don't play very demanding games, so they still work just fine.
 