NVMe performance in 1st slot vs. 2nd slot, if both same PCIe Gen?

Toyzrme

n00b
Joined
Mar 25, 2021
Messages
5
I just built a Gigabyte Aorus Ultra X570 MoBo with 2x Samsung 980 Pro NVMe's (i.e. everything is PCIe v4.0) - getting just under 7,000 MB/s read, 5,000 MB/s write on both.

Question: has anyone ever tested the real-world performance of a CPU-connected NVMe vs. a chipset-connected one? The 2nd & 3rd M.2 slots are connected via the chipset, which is PCIe v4.0, so sequential throughput is the same in raw throughput testing.

But I'm curious about the latency, as I want to use PrimoCache to L2 cache my 36TB of RAID6 spinning drives.

I'm considering swapping my boot drive (OS & apps only, currently in M.2 slot 1) to the second M.2 slot, and putting the Cache drive in the first slot connected directly to the CPU to reduce latency.
Main uses for the system are video and photo editing. So I'm not launching lots of apps, but bouncing back and forth through footage and pics (hence the question on latency over throughput).

Thoughts pro/con/WTH are you doing? :)
 

TheSlySyl

[H]ard|Gawd
Joined
May 30, 2018
Messages
1,150
I don't think the difference between CPU and chipset is enough to matter. I haven't noticed it with my drives, but they aren't as high quality as yours.
As a fellow PrimoCache user: the difference between NVMe and RAM cache can be noticeable, especially for the OS.
 

doubletake

Gawd
Joined
Apr 27, 2013
Messages
705
Makes no sense to make that slot swap. Is there even a reason to care about possible additional latency that's going to be measured in nanoseconds for this setup? That drive, even at its slowest and attached to a PCH, is always going to be significantly faster than any operation coming off a mechanical RAID array. The only way you risk any noticeably increased delay when using that chipset-linked drive is if you are also hammering every other device attached to the chipset, meaning all the SATA ports, all the PCIe x1 slots, and any additional USB headers coming off of it.
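
To put rough numbers on that argument, here is a minimal Python sketch of the average access time for a cached array. Only the rotational-latency formula is hard fact. The 7200 rpm spindle speed, the 80 µs NVMe read latency, the 1 µs chipset penalty (deliberately far worse than "nanoseconds") and the 90% hit rate are illustrative assumptions, not measurements.

```python
# Average access time of a cached HDD array: hits are served by the NVMe
# cache, misses fall through to the spinning disks.

def avg_access_us(hit_rate: float, cache_us: float, hdd_us: float) -> float:
    """Expected access time in microseconds for a given cache hit rate."""
    return hit_rate * cache_us + (1.0 - hit_rate) * hdd_us

# Average rotational latency is half a revolution. 7200 rpm is an ASSUMPTION;
# seek time is ignored, which flatters the HDD.
HDD_US = 60 / 7200 / 2 * 1_000_000   # about 4167 microseconds

NVME_US = 80.0           # ASSUMED NVMe random-read latency
CHIPSET_PENALTY_US = 1.0 # ASSUMED, pessimistic extra hop through the chipset
HIT_RATE = 0.90          # ASSUMED cache hit rate

cpu_slot = avg_access_us(HIT_RATE, NVME_US, HDD_US)
chipset_slot = avg_access_us(HIT_RATE, NVME_US + CHIPSET_PENALTY_US, HDD_US)

delta_us = chipset_slot - cpu_slot
relative = delta_us / cpu_slot

print(f"CPU slot:     {cpu_slot:.1f} us")
print(f"Chipset slot: {chipset_slot:.1f} us")
print(f"Difference:   {delta_us:.2f} us ({relative:.2%})")
```

With these assumed inputs the chipset hop moves the average by under a microsecond, well below 1%. The misses that fall through to the spinning disks dominate the result.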
 

Happy Hopping

Supreme [H]ardness
Joined
Jul 1, 2004
Messages
6,874
I'd also like to know about the speed difference between an NVMe drive in an M.2 slot vs. an NVMe drive on an add-on PCIe x4 adapter card.
 

Dan_D

Extremely [H]
Joined
Feb 9, 2002
Messages
57,910
Fortunately, I can shed some light on this, as I have tested it. The short answer is that there is no difference between NVMe drives run from the PCH vs. the CPU's PCIe controller. The reason is simple: you don't saturate the PCH's link to the CPU enough for it to hurt performance. In fact, PCIe 3.0 on an Intel PCH vs. going direct to a Ryzen 3000 series CPU makes no difference either. You would have to saturate the link between the CPU and the PCH for there to be a measurable difference. To do that, you'd need to benchmark all the devices attached to the PCH at the exact same time. You would need to set up a two- or three-drive M.2 NVMe RAID 0 array and then benchmark it to run into the bandwidth limit of this link. To answer the last question, there is no difference between running a drive off a PCIe x4 adapter card vs. an M.2 slot mounted on the motherboard PCB.
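
The saturation argument can be checked with back-of-envelope arithmetic. The sketch below uses the published per-lane PCIe signaling rates (8 GT/s for Gen3, 16 GT/s for Gen4, both with 128b/130b encoding) and assumes an x4 chipset uplink, which is what X570 and Intel's DMI 3.0 provide. The drive speeds are rated sequential figures used for illustration (the 3,500 MB/s Gen3 drive is hypothetical), and packet/protocol overhead is ignored, so real ceilings are somewhat lower.

```python
import math

# Per-lane signaling rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0}
# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130

def link_mb_s(gen: int, lanes: int) -> float:
    """Raw one-direction link bandwidth in MB/s, ignoring protocol overhead."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8 * lanes * 1000

def drives_to_saturate(uplink_mb_s: float, drive_mb_s: float) -> int:
    """Smallest number of striped drives whose combined rate exceeds the uplink."""
    return math.floor(uplink_mb_s / drive_mb_s) + 1

# ASSUMPTION: the chipset uplink is x4 in both cases.
x570_uplink = link_mb_s(gen=4, lanes=4)   # about 7877 MB/s
dmi3_uplink = link_mb_s(gen=3, lanes=4)   # about 3938 MB/s

# 7000 MB/s is the read speed quoted in the first post; 3500 MB/s is an
# ASSUMED figure for a typical Gen3 drive.
gen4_drives = drives_to_saturate(x570_uplink, 7000)
gen3_drives = drives_to_saturate(dmi3_uplink, 3500)

print(f"X570 uplink:  {x570_uplink:.0f} MB/s, saturated by {gen4_drives} Gen4 drives")
print(f"DMI 3.0 link: {dmi3_uplink:.0f} MB/s, saturated by {gen3_drives} Gen3 drives")
```

Under these assumptions a single drive fits inside the uplink, while two drives striped together would exceed it, which lines up with the RAID 0 scenario described above.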
 

Toyzrme

n00b
Joined
Mar 25, 2021
Messages
5
FYI, where the NVMe and PCIe ports are connected depends on the CPU/PCH/MoBo. My MoBo has the first two x16 slots and the 1st M.2 off the CPU, while the other two M.2 slots (at x4 & x2) and the last PCIe x16 slot (at x4) are off the PCH. So a drive in M.2 #2 (x4) on mine gets twice the link bandwidth of M.2 #3 (x2). If I want a 3rd NVMe, I'm doing what you're saying: NVMe in a PCIe x4 AIC. Otherwise it sounds like it doesn't really matter performance-wise, barring edge cases.
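
How much of that x4 vs. x2 gap a given drive actually sees depends on whether the drive or the link is the bottleneck. A rough sketch, assuming both slots run at PCIe 4.0, using the 7,000/5,000 MB/s figures from the first post, and ignoring protocol overhead (so the real numbers would be a bit lower):

```python
# Effective sequential speed is capped by the slower of the drive and the link.

def lane_mb_s_gen4() -> float:
    """Raw per-lane PCIe 4.0 bandwidth in MB/s: 16 GT/s with 128b/130b encoding."""
    return 16.0 * (128 / 130) / 8 * 1000

def effective_mb_s(drive_mb_s: float, lanes: int) -> float:
    """Drive speed limited by the slot's link width (protocol overhead ignored)."""
    return min(drive_mb_s, lane_mb_s_gen4() * lanes)

DRIVE_READ = 7000.0   # MB/s, from the first post
DRIVE_WRITE = 5000.0  # MB/s, from the first post

read_x4 = effective_mb_s(DRIVE_READ, lanes=4)
read_x2 = effective_mb_s(DRIVE_READ, lanes=2)
write_x4 = effective_mb_s(DRIVE_WRITE, lanes=4)
write_x2 = effective_mb_s(DRIVE_WRITE, lanes=2)

read_ratio = read_x4 / read_x2
write_ratio = write_x4 / write_x2

print(f"Read:  x4 {read_x4:.0f} MB/s vs x2 {read_x2:.0f} MB/s ({read_ratio:.2f}x)")
print(f"Write: x4 {write_x4:.0f} MB/s vs x2 {write_x2:.0f} MB/s ({write_ratio:.2f}x)")
```

So the link ceiling doubles, but for this particular drive the sequential gain from x4 over x2 would be less than 2x, since the drive itself tops out below the x4 limit.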
 

Toyzrme

n00b
Joined
Mar 25, 2021
Messages
5
Thanks to all - this is what I suspected, but prefer hard evidence or expert opinion over my suspicions.

So I'll leave the NVMe's as-is, with OS near the CPU, cache off the chipset.

doubletake & Dan_D - thanks for the insight - I understand block-level stuff, but wasn't sure how many added cycles happened between the CPU and PCH-connected devices. But it also means another option I was considering might not be so wise: adding older/unused SATA SSD's as additional cache, if they might eat into the PCH bandwidth. Or at least connect only to CPU-connected SATA ports (depending whether I'd be hammering both caches at the same time - what, you mean I have to actually THINK?? lol).
 

Toyzrme

n00b
Joined
Mar 25, 2021
Messages
5
TheSlySyl, curious what your system is, its typical use(s), and your PrimoCache settings?

My current thinking is use R/O RAM cache on OS NVMe, maybe 20GB of my 128GB. Then use most of the 2TB 980 Pro (2nd NVMe) as R/O L2 cache for my 32TB of spinning rust, but probably(?) no RAM cache.

Thoughts?

NM: Just noticed your sig. Serious brain cramp - but glad my hunches were good :)
 

Happy Hopping

Supreme [H]ardness
Joined
Jul 1, 2004
Messages
6,874
What brand and model of motherboard are you using? Mine doesn't say what speed each M.2 slot runs at.
 

TheSlySyl

[H]ard|Gawd
Joined
May 30, 2018
Messages
1,150
I host a Plex and Minecraft server off my system. I also stream and do minor video/audio editing, but I'm basically transcoding something nonstop.

I have 8GB of RAM cache for OS/Plex, 2GB of RAM cache for my Minecraft server and 8GB of RAM cache for my game drive. (A little over 20GB total considering overhead, which leaves 44ish usable gigabytes. My Minecraft mapping software sometimes eats 20 gigs at a time.)

I have other small SSD caches on my media drives, mostly to help with seek times. They don't seem to matter much TBH. Hit rates are terrible.

Most of my games are installed on a 12TB spinner with a 1TB PCIe 3.0 NVMe drive as cache and then the RAM cache on top of that. The other 40+ terabytes are mostly media and a scratch drive for transcoding.

I have to have an entire separate drive (1TB QLC) for my Xbox Game Pass games. Because of the annoying fucking way they install, they don't play nice with PrimoCache, and figuring that out took me way too long and way too many crashes.
 

Dan_D

Extremely [H]
Joined
Feb 9, 2002
Messages
57,910
There are no CPU-connected SATA ports. While this is an option for motherboard manufacturers, it comes at the cost of dedicated lanes used for M.2 devices. Literally no motherboard I've ever seen sold in the U.S. uses this option. The SATA ports always go through the PCH, so you would be using bandwidth from the PCH to the CPU to do that. Caching in this manner is utterly worthless for NVMe devices, which are far faster than SATA-based SSDs. SATA-based SSD caching, or any SSD caching, will only benefit mechanical hard drive volumes and drive arrays.
 