Solved (approved for RMA): Unbearably Slow PCIe Gen 4 NVME Speeds

Dopamin3

This issue is solved: https://hardforum.com/threads/unbearably-slow-pcie-gen-4-nvme-speeds.2029780/post-1045710560
The issue is back after 3 months and I noticed it only happens on older data, fresh files on the drives seem unaffected. Ultimately going to try to RMA to MicroCenter.
Final update: got approved for RMA with MicroCenter on all three Inland drives, using WD SN850X as replacements. Hopefully no more issues now.

1691364466299.png


The rig in question is in my signature below, and I'm on Windows 10 Pro 22H2. The drive in question is an Inland Performance Plus 4TB NVME (supposed to be up to 7200MB/s read and 6800 MB/s write).

Screenshot is from HWInfo64. I was playing Uncharted: Legacy of Thieves and some of the cutscenes were laggy/jumping around. After Googling this it basically just appears to be slow storage, and lo and behold that's my issue. I'm currently verifying the game files as well, and it's reading at a steady 22 MB/s. The maximum it ever reached while the PC was on for a few days was 942 MB/s. I primarily use this drive just for games. It has a good bit on it, 3.01 TB used with 636 GB remaining (total capacity 3.63 TB in Windows) but is it normal to see such terrible performance at ~82.5% capacity?

This is running off a chipset fed PCIe 4.0 NVME slot on the X570 Aorus Xtreme, I understand it's not going to be as fast as a CPU provided NVME slot, but c'mon really? I might as well use a hard drive...

edit: I'm on AMD Chipset driver version 4.03.03.431 and the newest on AMD website is 5.05.16.529 so I will update that and post if it changes anything.
 
It has a good bit on it, 3.01 TB used with 636 GB remaining (total capacity 3.63 TB in Windows) but is it normal to see such terrible performance at ~82.5% capacity?
Sometimes. Get it down to about 60% and see if it shoots back up in speed.
If it's not that, hopefully the newer drivers help.
 
Why would read speeds be affected by the amount of free space available?
 
Why would read speeds be affected by the amount of free space available?
SSD manufacturers have been using part of the drive as an SLC cache for many years now. When the drive fills up, that cache is converted to the slower NAND type, drastically slowing down the drive.

Freeing up space and performing the windows drive optimization (TRIM function) usually restores performance until the drive reaches that threshold again.
 
SSD manufacturers have been using part of the drive as an SLC cache for many years now. When the drive fills up, that cache is converted to the slower NAND type, drastically slowing down the drive.

Freeing up space and performing the windows drive optimization (TRIM function) usually restores performance until the drive reaches that threshold again.

Pseudo-SLC cache has no effect on reads, only writes. Also, no SSD is setting aside over half a TB, even when empty. That's also plenty of space to perform TRIM operations. There's no way the amount of available space is causing OP's issue.

DRAM cache (or lack thereof) will affect latency a touch (e.g., finding/placing a file), but not overall read/write speeds.
 
Updating the AMD chipset drivers did not help. When Steam is verifying the game files it still caps out at about 20 MB/s. I'm going to try disabling write caching as this reddit post discusses: https://www.reddit.com/r/buildapc/comments/t0q6d0/steam_write_speed_to_ssd_very_slow_1020_mbs_how/ Maybe verification of game files has this issue even though most people are talking about Steam writing. Maybe write caching enabled in the first place was causing the slow reads? I'm not sure how but I had to have slow reads because that's where the game is installed and the only explanation I can think of that was causing the very glitchy cutscenes.

Now here is the weird thing. I ran CrystalDiskMark on this drive, and it actually seems to be performing like it should on a chipset lane:
1691406872277.png


It will probably be another couple days before I can test again but will post another update once write caching is disabled and any further input is appreciated.
 
Pseudo-SLC cache has no effect on reads, only writes. Also, no SSD is setting aside over half a TB, even when empty. That's also plenty of space to perform TRIM operations. There's no way the amount of available space is causing OP's issue.

DRAM cache (or lack thereof) will affect latency a touch (e.g., finding/placing a file), but not overall read/write speeds.
My bad on the reads, but drive testing has shown your understanding to not be the case with regards to slc cache size.
Also, no SSD is setting aside over half a TB

As to the OP's 22 MB/s reads in Steam, I'm not certain but that could be down to file type. As in, a highly compressed file that hits the CPU hard and doesn't go as fast because it's a single-threaded function. There have been game updates that go at 60 MB/s for my Micron 9300 Pro, and others that are accomplished at a much, much higher rate.
 
Yeah, Steam does weird things because it's downloading and decompressing at the same time. So never use Steam as a benchmark for download or read/write speeds unless you're transferring the file from somewhere else locally.
 
I have no idea what the issue is here, but it's not just limited to Steam and it's affecting all three drives. CrystalDiskMark really doesn't show anything abnormal on the drives, you can see tests on all three here but everything real world (opening programs, moving files in file explorer etc...) is just slow.

So far I have:
  • Verified TRIM is enabled. It already was, but I manually ran TRIM for each drive in Windows "Defragment and Optimize Drives" by clicking Optimize on each of the three.
  • Turned off write caching on each drive in device manager.
  • Updated the driver in device manager under Storage controllers for each drive, changing it from "Standard NVM Express Controller driver" to "Phison NVME 1.2 Storport Miniport" using a Phison driver I found here.
  • Updated AMD Chipset drivers to the latest available, at the time 5.05.16.529.
  • Updated to latest BIOS for my board- F37d, made sure AHCI was enabled and CSM disabled (they already were this way on f36b beforehand which I was on for a while). Basically the only changes in my BIOS are enabling XMP, running UCLK:MCLK 1:1 and slightly undervolting vSOC to 1.125v (it autos to 1.2v which I feel is a little high)
  • "Clean boot" or whatever you call it on Windows 10, disabling all startup items and non Microsoft services.
Right now I'm backing up everything, screenshot attached of how slow it's going (note the destination can do about 300MB/s write sustained EASY, it's an iSCSI share over 2.5GbE). Then I'm going to throw in the towel and physically remove and re-install the drives, secure erase them all, reinstall Windows and see how it goes from there. SMART looks good on all the drives and they all have pretty low TBW. No data seems corrupted.

1691705583916.png


I'd be interested to see any more input on this. I hope it's not the motherboard or the drives.
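
For a sense of scale on the backup above, here is a rough back-of-the-envelope sketch in Python. It assumes decimal MB/s, ignores iSCSI/TCP overhead, and borrows the 3.01 TB figure from the first post (as reported by Windows, so binary units) purely as an example size; the function names are made up for illustration.

```python
def link_mb_per_s(gigabits_per_s: float) -> float:
    """Raw line rate of a network link in decimal MB/s (no protocol overhead)."""
    return gigabits_per_s * 1e9 / 8 / 1e6


def hours_to_copy(size_bytes: float, mb_per_s: float) -> float:
    """Hours needed to move size_bytes at a steady rate of mb_per_s."""
    return size_bytes / (mb_per_s * 1e6) / 3600


# Example size only: 3.01 TB as displayed by Windows (binary units).
used_bytes = 3.01 * 1024**4

print(f"2.5GbE line rate: {link_mb_per_s(2.5):.1f} MB/s")
for rate in (20, 300):
    print(f"{rate:>3} MB/s -> {hours_to_copy(used_bytes, rate):.1f} hours")
```

2.5GbE tops out a little above 310 MB/s on paper, so the ~300 MB/s the share normally sustains is close to line rate, while a source drive stuck near 20 MB/s stretches the same copy from a few hours to roughly two days.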
 
does internal drive to drive copying slow down too?
Yes, I tried copying a decent size single file (like 5 GB or so) and it hovers around 20 MB/s. From one drive to any other, or even to itself (like from c:/ to another folder in c:/)
 
I'm not convinced there's a problem, but anyway, have you tried limiting it to PCIe3? Assuming your BIOS gives you the option...
 
This is running off a chipset fed PCIe 4.0 NVME slot on the X570 Aorus Xtreme, I understand it's not going to be as fast as a CPU provided NVME slot, but c'mon really? I might as well use a hard drive...
When I added a third NVMe drive to my RAID 0 array on X570 it actually slowed access to the array by a considerable amount for small random file reads and writes. So much so that I broke the array and rebuilt it with only two drives. The third drive via chipset just bottlenecked the crap outta the array. Even though the sequential speeds were over 10K MB/sec, it just felt slower in everyday tasks.
 
Those Crystal Disk tests at 1GiB are not enough...up it and you'll most likely see the change.
 
I'm not convinced there's a problem, but anyway, have you tried limiting it to PCIe3? Assuming your BIOS gives you the option...
I have not attempted this yet but I will try before nuking everything. Pretty sure I saw that option while navigating around in the BIOS. Another day or two until everything is copied so won't be able to test for a little bit. Even if I halve my theoretical speeds, that's fine and would be a good solution.
Those Crystal Disk tests at 1GiB are not enough...up it and you'll most likely see the change.
Here is on one of the drives not in use right now (other two are copying to network), 1 pass on the largest size (64GiB) and it looks pretty decent to me. Doesn't add up why it's so slow real world.
1691756487120.png


learners permit I can't quote your post for some reason, but I'm not doing these in RAID 0. Each one is separate. I understand why that issue would happen in your scenario but these are three separate drives. I expected the chipset provided m.2 slot to be slower, but not 20MB/s slow.
 
Have you tried disabling any and all antivirus and antimalware? The intermittent slowdown sounds to me like it could be software. You might also try booting into a USB Linux environment and then test a simple file copy. If it's still at 20 MB/s in Linux, at least you've eliminated the OS and any software standing in your way.

Also, you could start doing what you do that's limited to 20 MB/s and watch Resource Monitor > Disk while it happens. See if you can spot what's consuming resources.
 
I'm not convinced there's a problem, but anyway, have you tried limiting it to PCIe3? Assuming your BIOS gives you the option...
After trying PCIe 3 mode it limited CrystalDiskMark to around 3600MB/s, but the same issue persists in actually doing anything on the PC.
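
The ~3600MB/s figure in PCIe 3 mode lines up with what an x4 Gen 3 link can carry. A quick sketch of the arithmetic, assuming an x4 link and counting only the 128b/130b line encoding (packet and protocol overhead, which this ignores, lowers real-world numbers further); the function name is just for illustration:

```python
# Per-lane transfer rates in GT/s for the two generations discussed here.
GT_PER_S = {3: 8.0, 4: 16.0}
# PCIe Gen 3 and Gen 4 both use 128b/130b line encoding.
ENCODING = 128 / 130


def pcie_mb_per_s(gen: int, lanes: int = 4) -> float:
    """Theoretical one-direction bandwidth in decimal MB/s, before protocol overhead."""
    bits_per_s = GT_PER_S[gen] * 1e9 * ENCODING * lanes
    return bits_per_s / 8 / 1e6


for gen in (3, 4):
    print(f"PCIe {gen}.0 x4: ~{pcie_mb_per_s(gen):.0f} MB/s")
```

Either ceiling is a couple of orders of magnitude above 10 - 20 MB/s, so the link speed setting on its own can't account for the slowdown.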
Have you tried disabling any and all antivirus and antimalware? The intermittent slowdown sounds to me like it could be software. You might also try booting into a USB Linux environment and then test a simple file copy. If it's still at 20 MB/s in Linux, at least you've eliminated the OS and any software standing in your way.

Also, you could start doing what you do that's limited to 20 MB/s and watch Resource Monitor > Disk while it happens. See if you can spot what's consuming resources.
I only run Windows Defender, but yes the issue persists with it completely disabled (along with tamper protection and whatever else). Resource monitor doesn't really reveal much. Here I am attempting to copy a 90GB iso file from my lowest capacity drive to itself:

1691895165751.png



Is there potentially an issue here with the alignment of my partitions? I've never really messed with this but started reading about how bad alignments can cause weird performance.


1691895293672.png


Edit: even in an Ubuntu live CD the performance is very slow on the "Timing buffered disk reads" after running hdparm -tT on each device

1691884146834.png
 
I am out of ideas but this is an interesting thread. Now it looks like hardware issue again. Is there another PC you could test those drives in? Different mobo, CPU and all that.
 
Well, this issue is now fixed. The last two steps I did:

#1: I modified the Windows registry and power plan according to this Reddit post. After rebooting, all three drives performed abysmally slow. Whether copying to the same drive, or across drives.
#2: I then formatted / secure erased both of my non boot drives from a linux boot environment with "nvme format -s1 /dev/devicename"

I did not touch the boot drive where Windows was installed. I copied everything back from my 2.5GbE iSCSI share, and it basically ran at 300MB/s the whole time (at times during many small files, it would dip a little, as to be expected). Now Uncharted cutscenes don't skip, and I'm getting fast read and writes in real world usage (>1Gbps on a random folder copy containing various file sizes). I tested each of the three drives, whether copying from one to another or to itself. The read and write speeds are not terribly slow.

I don't exactly understand why this fixed it, but it did. I'm still puzzled why CrystalDiskMark showed fast speeds but real world usage was terrible. This motherboard runs my boot drive off the CPU, the two others are from the chipset. The issue was happening across all drives, including the boot drive being limited to 20ish MB/s. Now doing various real world tests, it's performing as I would expect. All drives are filled back to the same level they were at before (C:/ CPU fed drive: 37.6%, D:/ 88%, E:/: 15%) and they all perform great now.

For a lack of technical explanation, maybe something "screwy" was happening on one of the two drives I wiped that would cause this to happen? I hope the drives themselves aren't having an issue. SMART data is good on all three drives, I get they aren't name brand being Inland but they have the Phison E18 controller, and Micron 96L TLC (I don't have the new revision with 176L TLC) with a 3000TBW, 6 year warranty. I'd be curious to see what opinions are on what the cause of the issue would've been in the first place. When I initially built the PC I didn't have these issues, it just happened probably within the last month. The Inland drives have been in my system since late May, 2022.
 
Glad you solved it! I'm not totally surprised the CrystalDisk benchmark was a "lie" though, synthetic benchmarks are never to be trusted too much haha.
 
In hindsight, it probably would have been good to try some other benchmarks. I've not experienced an issue where real world performance was so far removed from CrystalDiskMark like this. How wiping secondary drives improved performance on your boot drive is weird. I would presume that there was some corruption or mechanical/flash issue on a secondary drive that both Windows and Ubuntu were struggling with. That caused some subsystem to struggle against it, slowing storage down across the machine.

For instance, just this weekend, I was trying to recover data from a very slow HDD that was in an HMI machine that connected to PLCs. Stuck the drive in a USB to SATA caddy and every time I would try to copy data off it the mouse would disappear for 5+ minutes while Windows struggled to access the data. After the 5 minutes the disk would "wake up" and copy at upwards of 150 MB/s. Chkdsk found major corruption after I was able to back up critical data. Something like that could be going on here, where seemingly unrelated processes struggle because of one bad device.
 
Welp, this issue is now back. CrystalDiskMark seems fine on any size test, but any actual usage results in 100% Read Activity (shown in HWInfo) at around 10 - 20 MB/s read regardless if on 3.0, 4.0 or auto. Quick test of Uncharted gave a laggy intro screen and super delayed dialogue/glitchy cutscenes at the part I left off on.

I wonder if some of this could be a physical connection issue? The top NVME looks a little suspicious with how much you can see under the motherboard's included heatspreader.... Obviously can't see the other two. Honestly I can't remember if I used an m.2 screw on the standoff, I might've just screwed down the heatspreaders at the time. I swapped GPUs last week so took a pic, and I didn't notice it at the time (and yes disregard broken PCIe retention clip... I know, I broke it off a long time ago)

1700008592357.png


Probably won't have a chance to tear it back down until next weekend, at that time I'll make sure the drives are seated good and have m.2 standoff screws. Would this potentially have anything to do with it? Seems like the same issue from before is back and is affecting all three of my NVMEs.
 
Have the NVMe and chipset temps been checked for high temps? I know when I had 3 NVMe drives the chipset got way hotter on X570. Added a ghetto ass fan to keep it cooler as the chipset was at 88 and that puny fan wasn't cutting it.
 
Have the Nvme and chipset temps been checked for high temp. I know when I had 3 Nvme the chipset got way hotter on X570. Added a ghetto ass fan to keep it cooler as the chipset was at 88 and that puny fan wasn't cutting it.
I think they've been fine. The X570 Aorus Xtreme has a chipset heatsink with a heatpipe connecting to an overkill 16 phase VRM heatsink that runs very cool. I have good airflow in the case as well. I don't have numbers off the top of my head but in HWInfo64 I think I remember seeing the NVMEs in the ~50-58C range and chipset in the ~45-55C range.

1700054122357.png


So here's the weird thing too, the last time this happened all three showed poor "Timing buffered disk reads" with a hdparm -tT test in Linux, but now only two do it.

I'll update again once I have a chance to reseat all three NVMEs and ensure they have a proper M.2 standoff screw rather than just held in place by the motherboard heatspreaders. I went Googling around a lot about an issue like this (essentially fast/normal Timing cached reads and really slow Timing buffered disk reads in hdparm) and found one user who fixed this issue by reseating the NVME (their speeds were slower than expected, but still not as bad as mine.)

Last thing I can do is try swapping in a 3950X in place of the 5950X. If the 3950X doesn't do this I'll contact AMD for RMA on the 5950X. My gut says this isn't an issue with the NVMEs or the motherboard.... Hopefully it's just some of the NVMEs aren't seated well, and removing them and reseating properly will correct it. Or if the behavior doesn't happen on a 3950X I would blame the 5950X providing PCIe lanes. My IMC is pretty decent on this chip, but curve optimizer isn't even stable at -10 which I found odd for a 2022 purchased B2 stepping.
 
I have more than a dozen of the same model Inland 4TB Perf Plus drives, they're excellent, and performance still screams even while getting hammered 24/7 and some of them at very close to full capacity, so I don't think your symptoms are inherent to the make/model drive. They're gems since Microcenter's warranty is so generous relative to rest of industry. Regardless, when I encounter any drive weirdness:

Step 1: Event Viewer -> Windows Logs -> System. Check for any warnings related to the drive (NTFS errors, WHEA-17 errors which are symptomatic of PCIe signal integrity issues, etc)
Step 2: Monitor drive temp with CrystalDiskInfo or HardDiskSentinel while slow transfer is happening.
Step 3: Open Task Manager and see what disk utilization is showing, and then see if there are any unknown/weird processes that might be hammering the drive without your knowledge.
Step 4: Test the drive in a different known good PC, and/or test a different known good NVME drive in the same slot that your problematic drive currently occupies

NVMe secure erase in PartedMagic or a linux distro will also often fix a problematic/stuttering SSD, which you're already aware of. I don't *think* your issue is related to runaway drive temp, because even if it wasn't making proper contact with the motherboard's M.2 heatsink, throttling internally still wouldn't produce throughput that low.

Finally, I used to have that same motherboard, and transfer speeds were always flawless on all M.2 slots. However, I can't remember if there's a setting in BIOS that relates to DMI link speed (the PCIe speed at which the chipset and CPU communicate), like exists on Intel motherboards, but on Intel I usually make sure it's set to PCIe4 rather than Auto. That's probably not your issue, and if you're running latest motherboard BIOS and chipset/AGESA then default settings are probably fine.
 
This is all good info/recommendations, thank you. That's reassuring that you've had good luck with 12 of these drives. Event Viewer shows no weird errors/warnings (a few random things about BITS but I expect that on any PC running Windows). Temps seem fine. No weird processes are hogging I/O in Windows and the same terrible speeds occur running hdparm -tT off of a live Linux boot flash drive. I had this issue before and secure erasing the two chipset provided drives which just store data (Windows lives on the CPU fed NVME) fixed the issue. The speed shot back up to normal on all three drives even though one was untouched. I want to avoid doing this again because ultimately I feel like this issue will crop up after a short time. I don't really have other systems I can easily test these drives in, so I'm going to reseat the NVMEs first and check behavior, if still slow then swap in a 3950X and test behavior. If it's still performing like this I will have to test them in another system which will only be PCIe 3.0 capable BUT at least this can verify they don't get stuck at 100% activity at around 20 MB/s making them practically unusable.

I don't have physical access to the machine but RDP'ed in (hooray Apache Guacamole). Just for fun I simultaneously ran CrystalDiskMark on all three drives at the same time. Temps seem fine (learners permit you asked about this too). I think it's funny you can tell the chipset provided NVMEs are bandwidth starved when both are in use simultaneously which I sort of expected.

simultaneous.PNG

1700092317024.png

1700092330910.png


Will make a final update in the next week or two after getting hands on time. I'm about exhausted trying to attack this just from a software/OS/BIOS point of view. If I've gone through all this hassle just because the drives weren't seated properly I'll have to laugh. If my 5950X (still in warranty) is messed up providing PCIe I'll get by with an RMA. If ultimately the behavior is exhibited on the 3950X with a reseat, I'll physically test them one by one in another machine and contact Gigabyte (board still in warranty). If it does come down to the NVME exhibiting the same behavior on an entire different system (I highly doubt it) this might be a weird issue to explain to Microcenter because benchmarks and SMART data all appear normal.
 
They weren't mounted on the motherboard with the M.2 standoff (just the motherboard heatspreader) but even after properly mounting them, no change. Testing the drives in another system: no change. I've found three examples with the Corsair MP600 Pro XT that have near identical issues and a negative review on the Inland drive. They are both Phison E18 drives with probably similar componentry so I think they're relevant. Here are some interesting things:

1) Two separate examples of Corsair MP600 Pro XT 2TB NVME
2) An example of Corsair MP600 Pro XT 4TB NVME
3) A review on Microcenter's site for this exact model I have:
1700486438297.png


Does anyone have input on something like this? This is one of the weirdest issues I've encountered. Is it maybe a bad batch of NAND, controller or something? I would think firmware update could potentially fix it but it didn't fix it for a Corsair user and Inland doesn't even provide firmware. I'm not going through the hassle of secure erasing again just to have this issue happen to me after a few months.
 
Interesting to me that the drives have the same problem in another system. Definitely sounds like the problem is following the drives wherever they are instead of it being other hardware at fault, though these things do get complicated to troubleshoot. Really feels like it's the drives, but for 3 of them all to have an identical problem is kinda wild to me. Bad batch was a thought I had as well, but that seems like incredibly bad luck.

If you have any other NVMe drives and they work fine in these systems, that would also point to these particular drives being at fault... the fact that they work fine for a while after an erase almost sounds like there's some firmware bug that worsens over time (or is doing something weird re: TRIM or caching once the drive has had some capacity used on it for a while...). Either way, very strange behavior. I would expect drives with high amounts of capacity usage to potentially slow down, but 20 MB/s? That seems abnormally bad even for those situations.
 
Partition alignment issue? 512 byte vs 4096 byte sector size issue? I have no idea what else could be if the problem follows the drive to another system.
 
So just another thing that I never noticed with this issue until now... It doesn't happen on "fresh" data (this is mentioned in the Reddit thread, example 2 I linked).

Copying from an old SATA SSD to one of the Inlands I'm getting the expected speed:
1700521908414.png

Copying that folder to the same Inland drive from itself:
1700521955233.png

Copying an existing folder that has existed on the Inland to itself since I secure erased it in August:
1700522053079.png


This is just so weird. Contacted Microcenter support yesterday, they are asking for my invoice number since they couldn't open the Google Drive PDF I linked.
 
What are your replacement options for drives of that capacity? Anything comparable or are going to have to purchase elsewhere?
 
What are your replacement options for drives of that capacity? Anything comparable or are going to have to purchase elsewhere?
I just ordered WD SN850X 4TB models and will use those, should come in Wednesday. I'd still like to get these drives replaced under warranty and use one to see if it exhibits the same behavior again.
 
Cloning my Windows drive took 32 hours to the SN850X (used Windows PE based EaseUS Todo Backup Home 2023.) I didn't clone the other two drives, but just copied over the folders/files to them which took a similar amount of time due to the insanely slow reads. The cloned drive just worked, didn't have to mess with MBR/bootrec or anything for my Windows 10 install.

Up and running now with no issues. Updated the firmware on all three to the latest "624361WD" in the WD Dashboard application and it only took a few seconds for each drive. Got approved for an RMA on all three Inland drives with Microcenter support, so I'll use them again in the future and hope I just received a "bad batch." In all my searching I could only find three examples of people who had an issue like this and they were all on Corsair MP600 Pro XT drives, which also use the Phison E18 controller and potentially other similar componentry. Maybe Phison had a brief bad run of controllers?

This was quite the rollercoaster and in all my years I've never seen any SATA or NVME SSD behave like that.

1701034546071.png
 