Do NVMe SSDs today provide real-world performance advantages over SATA SSDs?

OpenSource Ghost

Back when NVMe SSDs first started coming out, there wasn't much real-world performance improvement from their use compared to SATA SSDs. Has that changed recently? DirectStorage 1.1 is supposed to make a difference, but nothing uses it yet. Is the CPU still the bottleneck that prevents NVMe SSDs from reaching their full potential?
 
NVMe drives are SSDs. You mean NVMe vs. SATA.

Real-world, there's a narrow difference between NVMe and SATA SSDs for most apps. For games, web browsing, etc., it's generally dealing with a lot of smaller files, which means drive latency is a lot more important than raw throughput. Latency is the real reason SSDs are so much faster than HDDs in everyday use.

If your current system is on a SATA SSD I wouldn't necessarily recommend upgrading to an NVMe unit unless you have some other reason to buy a new unit, such as needing more capacity. Also, older systems (>~5 years or so maybe, I don't recall the cutoff offhand) can't boot from NVMe SSDs.
 
Also, older systems (>~5 years or so maybe, I don't recall the cutoff offhand) can't boot from NVMe SSDs.

It's more like 10 years (since EFI), although some would require an unofficial firmware update.
 
There will be once DirectStorage becomes a thing. The PS5 really spoiled me with little to no loading times.
 
My main rig has a PCIe 4.0 2TB 980 Pro. One of my other rigs has three SATA 6Gb/s SSDs in RAID 0, and I can't tell the difference in day-to-day usage/gaming even though the NVMe should technically be faster.
 
I would say the real issue with storage is the trend towards microfiles. You now have software that for some reason has to store its data in tens or hundreds of thousands of KB range files. You can have 7000MBps transfer speeds but that stuff is going to copy at 10KBps all day long.

As for day to day between SATA and NVMe? I have to say NVMe was a big disappointment. Once you get above 250MBps with flash latency...you are golden. I would look for endurance over top end speed.
 
Reviewers tend not to run many real-world tests beyond copying files, and often do not include a good SATA reference even if they have one, but:

Stuff like game loading:
https://www.techspot.com/review/2116-storage-speed-game-loading/

Diminishing returns versus a good SATA SSD would be putting it mildly.

Say your data has a compression ratio of 4:1 and you want to fill an impressive 8 gig of RAM and 4 gig of VRAM with entirely new data: you will read 3 gig from the drive. SATA reads at 550 MB/s, so that takes less than 6 s. For it to be a significant bottleneck, you need near-instant loading times (a la PS5).
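For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic in Python. The 550 MB/s SATA figure and the 4:1 ratio come from the post, the 7000 MB/s figure is the headline speed mentioned earlier in the thread, and the 3500 MB/s PCIe 3.0 figure is an assumption. It only models raw read time, not decompression or any other CPU work.

```python
def load_time_seconds(uncompressed_gb, compression_ratio, read_mb_per_s):
    """Seconds spent purely on reading the compressed data from the drive."""
    compressed_mb = uncompressed_gb * 1000 / compression_ratio
    return compressed_mb / read_mb_per_s


# 8 GB of RAM plus 4 GB of VRAM to fill, at 4:1 compression, as in the post.
to_fill_gb = 8 + 4

drives = [
    ("SATA SSD", 550),
    ("PCIe 3.0 NVMe (assumed speed)", 3500),
    ("PCIe 4.0 NVMe", 7000),
]

for name, speed in drives:
    seconds = load_time_seconds(to_fill_gb, 4, speed)
    print(f"{name}: {seconds:.2f} s of pure read time")
```

Even in this best case for the faster drive, the gap is a few seconds per full memory fill, and it shrinks further once the other loading steps are added on top.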

I would say the real issue with storage is the trend towards microfiles. You now have software that for some reason has to store its data in tens or hundreds of thousands of KB range files.
Considering that games tend to be the other way around (often very few files, with some big unique files of 30 gig holding everything in a single well-compressed format), for this specific subject it could be quite a different story.

My Cyberpunk install, for example, is pretty much just 31 .archive files averaging more than 2 gig each, and many games get by with less than that.
 
So what bottlenecks NVMe's? CPU?
A CPU bottleneck is common, and SATA is already quite fast for data compressed at a good ratio:

[Image: 03k8kYxG8wHpBB9F38JTXZq-6.fit_lim.size_960x.png]


You can listen to the Forspoken presentation:


At 29:32 there is a breakdown of a scene loading. The actual reading of files quickly becomes a minority of what is going on, so very little load time is gained by going from 2.8 to 4.8 gig per second.

GPU decompression will open the door to shrinking one of those steps and make read speed increases interesting again. But for data with compression ratios sometimes close to 6:1, like textures, a solid SSD will be good enough for a while, and a PCIe 3.0 type of NVMe could certainly be more than good enough for a very long time.

PCIe 4.0 x4 bandwidth is similar to a DDR3-1066 RAM module, after all.
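A quick sanity check of that comparison, using theoretical peak rates only: PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding, and a DDR3-1066 module does roughly 1066 million transfers per second over a 64-bit bus. Real drives and real memory both land below these peaks, so treat this as a sketch.

```python
def pcie4_gb_per_s(lanes):
    """Theoretical PCIe 4.0 payload bandwidth in GB/s for a given lane count."""
    bits_per_second_per_lane = 16e9 * (128 / 130)  # 16 GT/s, 128b/130b encoding
    return bits_per_second_per_lane / 8 * lanes / 1e9


def ddr_gb_per_s(mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s of one 64-bit DDR module."""
    return mega_transfers_per_s * 1e6 * bus_bytes / 1e9


print(f"PCIe 4.0 x4: {pcie4_gb_per_s(4):.2f} GB/s")
print(f"DDR3-1066:   {ddr_gb_per_s(1066):.2f} GB/s")
```

Both come out at around 8 GB/s, which is the point being made: the link to a fast NVMe drive is already in the range of older system memory.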
 
So what bottlenecks NVMe's? CPU?
Full disclosure: I'm not an expert, but as I understand it, it's the filesystem and software. SSDs shine at high queue depths, where you have 30+ load/store requests queued up. But when the software isn't really asking for that and is instead asking for a handful of files, checking them, logging how everything went, checking whether everything is ready, then checking what it needs next, it's just sluggish and slow, and queue depths can't get long enough to really make the NVMe SSD excel.

Instead of a finish line where files are trying to get across as fast as possible, most software is more like an airport security check, where every file is checked in, verified as correct, has its passport stamped, goes through random checks, has to unpack its bags and remove its shoes, etc.
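To make the queue depth point concrete, here is a minimal sketch that issues the same random 4 KiB reads either one at a time (queue depth 1) or with many in flight at once, using a thread pool as a stand-in for a real async I/O queue. The function name and parameters are made up for illustration, and `os.pread` is Unix-only. Note the big caveat: a freshly written test file sits in the OS page cache, so to measure the drive rather than RAM you would need a file much larger than memory or a cold cache.

```python
import os
import random
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # 4 KiB, a typical small random read


def random_read_benchmark(path, n_reads, queue_depth, seed=0):
    """Issue n_reads random 4 KiB reads with up to queue_depth in flight.

    Returns (total_bytes_read, elapsed_seconds).
    """
    size = os.path.getsize(path)
    rng = random.Random(seed)
    offsets = [rng.randrange(size // BLOCK) * BLOCK for _ in range(n_reads)]
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=queue_depth) as pool:
            chunks = pool.map(lambda off: os.pread(fd, BLOCK, off), offsets)
            total = sum(len(chunk) for chunk in chunks)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed


if __name__ == "__main__":
    # Small throwaway file just to show the call; far too small (and cached)
    # to say anything about a real drive.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
    try:
        for qd in (1, 32):
            total, elapsed = random_read_benchmark(tmp.name, 2000, qd)
            print(f"QD{qd}: {total} bytes in {elapsed:.4f} s")
    finally:
        os.remove(tmp.name)
```

Software that reads a file, checks it, then decides what to read next behaves like the queue depth 1 case no matter how fast the drive is.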
 
Microfiles.
But games tend to be only a small number of extremely well-optimized giant files. The latest Uncharted game has a 40 gig and a 24 gig .psarc file for the textures, a 4.4 gig file for the sounds, 2.3 for the English voices, and so on.
 
So what bottlenecks NVMe's? CPU?

There's just some basic overhead in the OS finding where a file is located within the file system, feeding that info to the drive, the drive accessing the file's location, the OS allocating RAM to hold the file, and a bunch of other small things that add up. Throwing the fastest CPU at it will certainly help a little bit with some of that, but due to Direct Memory Access (DMA) the CPU has very little to do with the actual transfer of files from storage to RAM, or RAM to GPU.
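That per-file overhead is easy to see for yourself. The sketch below writes the same payload once as thousands of small files and once as a single big file, then times reading each back. All names and sizes here are arbitrary choices for illustration, and since the data will be in the page cache right after writing, the difference you see is mostly the OS and filesystem overhead rather than the drive itself.

```python
import os
import tempfile
import time


def write_dataset(root, n_files, file_size):
    """Write the same payload as n_files small files and as one big file."""
    small_dir = os.path.join(root, "small")
    os.makedirs(small_dir)
    payload = os.urandom(file_size)
    for i in range(n_files):
        with open(os.path.join(small_dir, f"{i:06d}.bin"), "wb") as f:
            f.write(payload)
    big_path = os.path.join(root, "big.bin")
    with open(big_path, "wb") as f:
        for _ in range(n_files):
            f.write(payload)
    return small_dir, big_path


def read_small_files(small_dir):
    """Open and read every small file; return (total_bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    for name in sorted(os.listdir(small_dir)):
        with open(os.path.join(small_dir, name), "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


def read_big_file(big_path, chunk=1 << 20):
    """Read one big file in 1 MiB chunks; return (total_bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(big_path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        small_dir, big_path = write_dataset(root, n_files=2000, file_size=4096)
        for label, (nbytes, secs) in [
            ("2000 small files", read_small_files(small_dir)),
            ("one big file", read_big_file(big_path)),
        ]:
            print(f"{label}: {nbytes} bytes in {secs:.4f} s")
```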
 
But games tend to be only a small number of extremely well-optimized giant files. The latest Uncharted game has a 40 gig and a 24 gig .psarc file for the textures, a 4.4 gig file for the sounds, 2.3 for the English voices, and so on.
I wasn't talking about games.

The question was what bottlenecks NVMe (or any storage) ...it's micro files.
 
I wasn't talking about games.

The question was what bottlenecks NVMe (or any storage) ...it's micro files.
Yes, I was making a counterpoint: I'm not sure that is the case, as NVMe does not seem much faster in scenarios without any micro files, and applications that use micro files will often have so much CPU work per file that it does not matter much (and those files will fit in RAM).

At least it does not address why, when there are just a few giant files as in video games, NVMe does not outperform SATA SSDs by much.
 
So what bottlenecks NVMe's? CPU?
Multiple things. Often it's CPU setup and processing time swamping out transfer time. Very frequently, the I/O is random, not sequential; worse, the random I/O is often issued sequentially, so queue depths are very small. NVMe drives aren't all that much better than SATA drives at handling a steady drip of random I/O requests.
But games tend to be only a small number of extremely well-optimized giant files. The latest Uncharted game has a 40 gig and a 24 gig .psarc file for the textures, a 4.4 gig file for the sounds, 2.3 for the English voices, and so on.
That's no help unless the game reads a big chunk of the file and caches it. More typically, there will be some sort of index and the game will only read the texture piece or sound/speech fragment that it needs. As far as the SSD is concerned, that's still small random transfers.

To really see the NVMe benefit, you need to do massive (gigabyte and above) sequential transfers. The transfer size has to be large enough to take a large fraction of a second, or it will be too short to subjectively notice transfer speed differences.
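A rough illustration of why the transfer has to be big before the difference is noticeable, using the 550 MB/s SATA and 7000 MB/s NVMe figures quoted earlier in the thread as idealized sequential speeds (real sustained speeds are lower, so this is a best case for NVMe):

```python
def transfer_seconds(size_mb, speed_mb_per_s):
    """Idealized time to move size_mb at a constant sequential speed."""
    return size_mb / speed_mb_per_s


SATA_MB_S = 550    # figure quoted earlier in the thread
NVME_MB_S = 7000   # headline figure quoted earlier in the thread

for size_mb in (1, 100, 1000, 10000):
    sata = transfer_seconds(size_mb, SATA_MB_S)
    nvme = transfer_seconds(size_mb, NVME_MB_S)
    print(f"{size_mb:>6} MB: SATA {sata:7.3f} s, NVMe {nvme:7.3f} s, "
          f"saved {sata - nvme:7.3f} s")
```

At 100 MB the saving is under 0.2 s, and it only reaches about 1.7 s at a full gigabyte, which matches the "gigabyte and above" rule of thumb.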
 
That's no help unless the game reads a big chunk of the file and caches it.
Which one would assume an optimized game engine does. Some games, like a COD still made to run on a regular PS4 HDD, I would imagine try to stay in sequential read mode quite a bit.
 
Which one would assume an optimized game engine does. Some games, like a COD still made to run on a regular PS4 HDD, I would imagine try to stay in sequential read mode quite a bit.
Caches it where? You mentioned 24 and 40 gigabyte files, that's not going to work with a typical 16GB RAM desktop. Even caching an entire 4+ gig phrase file might be an issue, given everything else that a game engine might want/need to keep in memory. Caching stuff you won't need is a waste of time and space, and without clairvoyance, it's pretty hard to predict what will be needed in general; even if it were known, it's not always possible to arrange that data needed sequentially by some particular gameplay runtime are stored sequentially in the file. It's going to depend on the player path through the game. I'm sure there are occasions where it's likely that if you got chunk A, you'll need B and C as well. It's unlikely that those occasions comprise enough data to make sequential read speed differentials among SSDs apparent to the user.

I'll also note that console versions of a game can probably do more aggressive caching, since they can assume that nothing else is running and that the game can use essentially all of available memory. A PC game can't make the same assumption.
 
Caches it where?
In RAM. Say 14 gig of data expected to be used goes into RAM and VRAM; a lot of it can be optimised to be a continuous sequential read.

AAA games can assume they have more RAM+VRAM available than on a console (that will almost always be the case, and they can check to be sure), and for GPU VRAM it will not be that different.

I suspect we will see quite the speed difference once GPU decompression and DirectStorage arrive, without games having to cache more stuff or get much better than now at sequentially placing large chunks of data that are frequently used at the same time.
 
In RAM. Say 14 gig of data expected to be used goes into RAM and VRAM; a lot of it can be optimised to be a continuous sequential read.
That's going to be a problem when there's 16 GB of ram in the computer, and you still need lots of room for game engine code, framebuffers, game logic data, etc etc. VRAM adds another 8 GB or so, but a lot of that is needed for GPU processing, and it's no help when you are dealing with sounds or voice that doesn't go out the video port. We're still pretty far from the point at which game engine designers can assume 32 or 64 GB of RAM.
AAA games can assume they have more RAM+VRAM available than on a console (that will almost always be the case, and they can check to be sure), and for GPU VRAM it will not be that different.

I suspect we will see quite the speed difference once GPU decompression and DirectStorage arrive, without games having to cache more stuff or get much better than now at sequentially placing large chunks of data that are frequently used at the same time.
Direct-to-GPU will probably make some difference and work to the NVMe advantage, yes. There's still going to be a fair amount of randomness given that you can't predict the player's path through the game, so it's impossible to always store data sequentially in the file in the same order as the game will demand it. Which implies that you'll end up with random reads no matter how much you try to optimize it.
 
One good way to test whether the drive is the bottleneck, and by how much, could be to install the program on a RAM drive and see how much faster it gets. For many things the gains are quite limited.
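If you want to put a number on that test, something like the following sketch works: time the same launch or load command once from the SSD install and once from the RAM drive copy, then see what fraction of the time went away. The helper names are made up for illustration, and the SSD run needs a cold cache (fresh boot or dropped caches), otherwise the OS serves the files from RAM anyway and the comparison is meaningless.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def storage_share(ssd_seconds, ramdisk_seconds):
    """Fraction of the SSD run time that disappears on the RAM drive."""
    return (ssd_seconds - ramdisk_seconds) / ssd_seconds


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere; substitute the real
    # program started from each install location.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"placeholder command took {elapsed:.3f} s")
    # Example: a load that takes 10 s from SSD and 8 s from the RAM drive
    print(f"storage accounts for {storage_share(10, 8):.0%} of the load time")
```

Whatever share comes out is the most a faster drive could ever buy you, since nothing is faster than reading from RAM.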
 
As someone that works in editing and grading films: yes.

But the question you have to ask yourself is whether your workloads would see tangible benefits or not. If you're "just a gamer" you don't "need" an NVMe SSD, but there is literally no reason NOT to buy one. An entry-level NVMe SSD is around $100 for 1TB. And even if you aren't "maxing out" its data rates or pushing its IOPS or random read/write limits, you can be "safe" and assume that it at least won't be an annoying bottleneck inside your computer. Which is the reason why most people have moved to an NVMe drive for at least the drive they have their OS and programs installed on.

And then the 5% of the time (or whatever it is) that you have to copy large amounts of files on or off the drive, you'll be happy knowing it's going as fast as it can (meaning the target drive would be the limitation).
 
For business, it's a massive improvement. Example is our ERP system on the order of 2.5 TB SQL database x 3. I used to have to run a physical server for each one, each server with a RAID 10 of SATA or SAS SSDs for each database. Additional RAID sets for the OS, log, temp, page file. Significant complexity was removed by consolidating hardware onto NVME storage. When you start moving around terabytes of data it saves days of wait time. Some jobs I'd start Friday night and pray that it finished by late Sunday night. If it didn't, I'd have to stop it before operations began and start all over again another weekend or what was most likely a long holiday weekend.

edit
For desktop use, I'd say there's almost no difference day to day for gaming and general use. My 850 Pro will still deliver similar latency and response times to a 970 evo plus and WD SN750. Another benefit for NVME is what the actual manufacturers are making in big SSDs. You now find 8TB second hand enterprise SSDs for under $500. 15 TB drives for $100 per TB. You just can't find that in a SATA SSD.
 
For gaming, probably not. I can tell you one area where it does matter/is noticeable, and that is virtual instruments. This is for music production: you stream digital samples off the drive, and you can have thousands streaming at the same time. Heavy on the random access. Also, latency is a huge factor. Pro soundcards can be set to extremely low latency, sometimes less than 1ms, meaning that if you do, the data has to get prepped, processed, and out the door in short order.

The solution to this is loading part of the samples in RAM, so you start streaming that while you wait for the disk to catch up. This has been done for over a decade, since back in the HDD days. But the slower the disk the more RAM you have to dedicate to samples, which means the less you can load on a given system. Also at a certain point the disk just maxes out and you drop samples no matter what. Well, some modern samplers will let you cut the RAM use down and up the number of samples streaming if you have faster disks. If you have a fast enough setup, like RAID-0 NVMe disks, you can actually try completely disabling RAM preload. It just streams off the disk.
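A back-of-the-envelope model of that tradeoff, in case the relationship between disk speed and preload RAM isn't obvious. Everything numeric here is an assumption for illustration: 48 kHz 24-bit stereo audio, 10,000 loaded samples, and made-up worst-case disk response times under load. Real samplers use their own fixed preload sizes, so this only shows the shape of the relationship, not what any particular product does.

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """Time one audio buffer covers, i.e. the soundcard's buffer latency."""
    return buffer_samples / sample_rate_hz * 1000


def preload_bytes_per_sample(disk_response_s, sample_rate_hz=48_000,
                             channels=2, bytes_per_value=3):
    """Audio that must already be in RAM to cover the wait for the disk."""
    return disk_response_s * sample_rate_hz * channels * bytes_per_value


def total_preload_mb(n_loaded_samples, disk_response_s):
    """RAM needed to preload the start of every loaded sample."""
    return n_loaded_samples * preload_bytes_per_sample(disk_response_s) / 1e6


print(f"32-sample buffer at 48 kHz: {buffer_latency_ms(32, 48_000):.2f} ms")

# Hypothetical worst-case response times with thousands of streams active.
assumed_response_s = {"HDD": 0.5, "SATA SSD": 0.05, "NVMe SSD": 0.005}
for drive, response in assumed_response_s.items():
    print(f"{drive}: {total_preload_mb(10_000, response):.0f} MB of preload "
          f"for 10,000 samples")
```

The preload scales linearly with how long the disk can keep you waiting, which is why a fast enough array lets you turn it down or off.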
 