The SSD Endurance Experiment: They're All Dead

HardOCP News

[H] News
The Tech Report's SSD endurance experiment has come to an end and, as you might expect from the headline, the crew finally killed all the drives.

Today, that story draws to a close with the final chapter in the SSD Endurance Experiment. The last two survivors met their doom on the road to 2.5PB, joining four fallen comrades who expired earlier. It's time to honor the dead and reflect on what we've learned from all the carnage.
 
My main gaming PC has an Intel 335, which happens to be one of the SSDs in this test - the one that retired itself gracefully when Intel said it would. I've had it for three years now and the SMART Media Wearout Indicator is still at 100%. I have not yet used 1% of the writes this SSD is rated for, and I even have the Windows 7 swap file on it.

(copied and pasted myself from the older thread about the test)
 
Intel doesn't have confidence in the drive at that point, so the 335 Series is designed to shift into read-only mode and then to brick itself when the power is cycled. Despite suffering just one reallocated sector, our sample dutifully followed the script. Data was accessible until a reboot prompted the drive to swallow its virtual cyanide pill.

This seems like a horrible plan to me. The fact that the drive becomes a brick after a reboot pretty much eliminates easy data recovery for the average Joe. Granted, most of us will never hit this scenario through sheer write volume, but I'm sure people have run into it when a drive retired itself based on other failure parameters.
 
I'm close to tossing in the towel as well. Mine is giving me grief now after only a few months. Fuk it. I don't experience any performance gain over my Barracuda other than startup times. I'm going back to the old standard. SSD's so long...hardly knew ya.
 
Glad I sprung for the 840 Pro when I built this machine. Coming up on 2 years now and it's been rock solid.
 
I'm close to tossing in the towel as well. Mine is giving me grief now after only a few months. Fuk it. I don't experience any performance gain over my Barracuda other than startup times. I'm going back to the old standard. SSD's so long...hardly knew ya.

My 960GB M500 has been problem free. What's wrong with your Crucial ssd?
 
I'm close to tossing in the towel as well. Mine is giving me grief now after only a few months. Fuk it. I don't experience any performance gain over my Barracuda other than startup times. I'm going back to the old standard. SSD's so long...hardly knew ya.

Really?? You're making us old-timers look bad *waves fist at the kids and their fancy SSDs*. SSDs have been such a "seat of the pants" upgrade that I can't imagine going back to regular hard drives at this point.
 
Glad I sprung for the 840 Pro when I built this machine. Coming up on 2 years now and it's been rock solid.

Same here.
Only written 3TB in 2 years, so not a good metric.
It's been very quick and trouble free despite suffering a lot of abusive reboots.

I had an OCZ Vector before that which lasted a few months before it completely died without warning. I was giving OCZ another chance and regretted it.
Prior to that I had an OCZ Vertex 2 that started reallocating sectors after about a year.
I'm glad OCZ snuffed it and won't touch AMD's new "OCZ" drives until many, many years have passed and they have a good rep.

I want to see the final result of an 850 Pro endurance test.
This one is in progress, 1PB so far with only 16% wear levelling used!
http://blog.innovaengineering.co.uk/
 
I think the test would be more interesting if they wrote the drive for 24 hours, let it sit for 24 hours (off, unplugged, no power), verified the drive, and repeat. That's still way more than a typical user. The way they do it now, if I understand, is they only spot check a bit of the disk each iteration, and the drive is never off / idle.
 
I think the test would be more interesting if they wrote the drive for 24 hours, let it sit for 24 hours (off, unplugged, no power), verified the drive, and repeat. That's still way more than a typical user. The way they do it now, if I understand, is they only spot check a bit of the disk each iteration, and the drive is never off / idle.

Who has the time for that? The Techreport torture test took eighteen months to complete 2.5PB of writes. You appear to be proposing to more than double the length of the test, if I understand correctly.

The test Nenu linked to has written 1PB in the first month, and it looks like they will spend most of the year just getting to the warrantied amount of wear, so it could possibly take years to complete if it soldiers on past warranty like the 840 Pro did.
 
Who has the time for that? The Techreport torture test took eighteen months to complete 2.5PB of writes. You appear to be proposing to more than double the length of the test, if I understand correctly.

Not necessarily double. Once you add unpowered periods, the flash won't survive as many write cycles before it starts losing data, because unpowered data retention gets worse as the flash wears out. So the drives would die sooner rather than the test simply taking twice as long.

Although I think 24 hours is too short a time. I'd like to see a test where they do one week of writes followed by one week of unpowered time. Repeat. Once an SSD gets to the point where it cannot retain data when it is powered off for a week, then it is quite useless for most people (except for those who leave their computer on all the time, or use their computer every single day and never take a vacation or whatever).
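
For anyone curious what a retention pass like that might look like in practice, here's a minimal sketch I threw together (my own idea of it, not TR's actual harness - the mount point, paths, and file size are just placeholders): fill a test file with random data, stash its hash on a different drive, leave the SSD unpowered for the rest period, then read the file back and compare.

Code:
# Minimal retention-check sketch (not TR's harness): fill a test file with
# random data, record its SHA-256 on a *different* drive, power the SSD off
# for the rest period, then re-read the file and compare hashes.
import hashlib
import os
import sys

TEST_FILE = "/mnt/ssd_under_test/retention.bin"   # placeholder mount point
HASH_FILE = "/home/user/retention.sha256"         # lives on another drive
CHUNK = 1 << 20       # 1 MiB per write
TOTAL = 8 << 30       # 8 GiB test region

def write_phase():
    h = hashlib.sha256()
    with open(TEST_FILE, "wb") as f:
        for _ in range(TOTAL // CHUNK):
            block = os.urandom(CHUNK)
            h.update(block)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())                      # make sure it hits the flash
    with open(HASH_FILE, "w") as f:
        f.write(h.hexdigest())

def verify_phase():
    h = hashlib.sha256()
    with open(TEST_FILE, "rb") as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            h.update(block)
    expected = open(HASH_FILE).read().strip()
    print("data intact" if h.hexdigest() == expected else "RETENTION FAILURE")

if __name__ == "__main__":
    write_phase() if "write" in sys.argv else verify_phase()

Run it with "write" before powering down, then without arguments after the drive has sat unplugged for however long you want the retention period to be.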
 
I have this strange feeling that anyone here reading this forum is not going to experience any of the failure issues that ended up happening. We will have upgraded our main drives several times and the old SSDs will be sitting in some other less important machine (Like my parent's web surfing home computer) before any trouble might happen. :p
 
I think that they should do this test for every type of SSD which comes along. The tests don't have to be started at the same time. Just set it up with the same conditions and the same metrics and off you go until death. Then they could just update their comparisons with each death. This is great data.
 
I think the test would be more interesting if they wrote the drive for 24 hours, let it sit for 24 hours (off, unplugged, no power), verified the drive, and repeat. That's still way more than a typical user. The way they do it now, if I understand, is they only spot check a bit of the disk each iteration, and the drive is never off / idle.

There's a quote in one of the articles saying:

"Unpowered retention tests were performed after 300TB, 600TB, 1PB, 1.5PB, and 2PB of writes. The durations varied, but the drives were left unplugged for at least a week each time."
 
Interesting test. I've got 2.2TB of writes on my 840 EVO 250GB, and that's after nearly a year of use for OS and games, with no page file (24GB RAM).

With the prices of SSDs dropping all the time, I can't see how I would even get to kill this drive before upgrading to a bigger and faster one.

BTW, with how unreliable some HDDs were in the 90s and early 2000s, it was the same thing - make sure you upgrade to another drive in 3 years tops or risk bad sectors, dead controllers, etc.
 
Still using what is now a five-year-old Crucial C300 128GB SSD in my main machine (come on, NVMe M.2 drives!).

I ran an SSD health utility the other day and apparently it's still 93% healthy... whatever that means. Certainly experiencing no problems five years in.
 
I wish they had been able to test an Intel drive that used Intel's own controller rather than SandForce. It would have made for an interesting comparison.
 
I'm close to tossing in the towel as well. Mine is giving me grief now after only a few months. Fuk it. I don't experience any performance gain over my Barracuda other than startup times. I'm going back to the old standard. SSD's so long...hardly knew ya.
Wow, an [H] member actually posted that...

ICOM, I kind of feel your pa... fugdat, hand your card in, bro! ;)

Is there a database floating around tracking SSD drive health from users? It would be interesting to see real-world usage statistics from a very large group at a glance.
 
At first I had a 100% failure rate with SSDs: my M225 64GB from 2009 died after one year, and it was my only SSD. Since then I've bought 6 others and they're all fine, even the OCZ Vertex 2 (an original one with the good flash) that is now in my parents' computer.

I'm loving the combination of speed and space of my 1TB Crucial M550. All my PCs and servers run SSDs for the OS. Only my work computer doesn't, since it's provided by my company; I slipped an SSD into it, but M$ detected it and called me a pirate, so I had to put the slow drive back in.

I think that they should do this test for every type of SSD which comes along. The tests don't have to be started at the same time. Just set it up with the same conditions and the same metrics and off you go until death. Then they could just update their comparisons with each death. This is great data.

By the time the test concludes that a drive is reliable, it is no longer on the market!
 
This is my 2.5 y/o Mushkin Chronos Deluxe 240. As best I can tell there are no issues yet. Intel Sandforce based:

[attached drive status screenshot]
 
I'm close to tossing in the towel as well. Mine is giving me grief now after only a few months. Fuk it. I don't experience any performance gain over my Barracuda other than startup times. I'm going back to the old standard. SSD's so long...hardly knew ya.

You either set something up wrong, or you've had some seriously bad luck with your drive. My computer boots insanely fast, everything loads super fast. My SSD makes my old 500GB Barracuda drive look like a Snailacuda. I have 2.61 TB written on my 500 GB Samsung EVO 840, and it's rock solid stable. I'm loving it. The only thing I use spinning platters for now is external backups.

I can mention two things that will make your SSD not be your friend. The first is bad partition alignment. If the drive is misaligned it will run slow. Alignment has to be set when the drive is partitioned. Samsung's Magician software does this automatically. If you have a SanDisk drive, use something like EZGig IV (freeware) and tell it to set 4K alignment when it clones the data from your old platter drive; if your cloning software is too old to do that, get newer software that supports it. If you're doing a clean install of Windows 7, I believe it already aligns properly.
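
If you're not sure whether your partitions are aligned, it's easy enough to check before re-cloning anything. Here's a rough Python sketch for Windows (just a sketch - it shells out to wmic and assumes its usual CSV output); a partition is 4K-aligned when its starting offset is a multiple of 4096 bytes:

Code:
# Rough 4K alignment check for Windows partitions (sketch only).
# A partition is 4K-aligned when its starting byte offset is a multiple of 4096.
import subprocess

out = subprocess.check_output(
    ["wmic", "partition", "get", "Name,StartingOffset", "/format:csv"],
    text=True)

for line in out.splitlines():
    parts = [p.strip() for p in line.split(",")]
    if len(parts) < 3 or not parts[-1].isdigit():
        continue                              # skip header / blank lines
    name = ",".join(parts[1:-1])              # partition name may contain a comma
    offset = int(parts[-1])
    status = "aligned" if offset % 4096 == 0 else "MISALIGNED"
    print(f"{name}: offset {offset} bytes -> {status}")

It only reports the alignment; fixing a misaligned drive still means re-cloning or reinstalling as described above.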

The second thing that will hurt your performance - and I mean seriously hurt it - is not having the drive set to AHCI mode. If you don't have your drive set to AHCI in the BIOS (OK, it's UEFI now, but it's just "BIOS with a fancy name"), you will get very little speed benefit, especially for random reads. You can see as little as 1/10th of your potential speed in the benchmarks below if AHCI isn't enabled. Make sure you install an AHCI driver in Windows before switching, otherwise Windows will bluescreen on the way in.

I have my SSD set up under Windows XP - not nearly as fast as Windows 7 would run it, but XP is my workhorse development install, as I only use Win 7 for newer games right now. Here are my speed results as of my last test:

Samsung EVO 840 500GB:

Sequential Read: 424MB/s
Sequential Write: 511 MB/s
Random Read (IOPS): 80759
Random Write (IOPS): 32327

Seagate Barracuda 1TB:

Sequential Read: 116 MB/s
Sequential Write: 100 MB/s
Random Read (IOPS): 256
Random Write (IOPS): 438

If you're not getting that kind of performance or better, then you've got a bad drive or something set up incorrectly. Throwing in the towel on SSD's over a correctable problem is not [H]ardcore behavior.
 
[attached drive status screenshot]


Here's the info on my 840 Pro after roughly 2 years of use. Looks like I should be covered for a while. The host write total is surprisingly low on here compared to some, which is mostly because the majority of my data goes on another drive.
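
If anyone wants to pull those numbers from a script instead of a screenshot, smartmontools reads the same attributes. A rough sketch (the attribute names are vendor-specific - Wear_Leveling_Count and Total_LBAs_Written are what Samsung drives report, other brands differ - and the device path is just an example):

Code:
# Pull wear-related SMART attributes via smartmontools (run as root/admin).
# Attribute names vary by vendor; these are the ones Samsung SSDs expose.
import subprocess

DEVICE = "/dev/sda"          # example device path, adjust for your SSD
WANTED = {"Wear_Leveling_Count", "Total_LBAs_Written", "Reallocated_Sector_Ct"}

out = subprocess.check_output(["smartctl", "-A", DEVICE], text=True)

for line in out.splitlines():
    cols = line.split()
    if len(cols) >= 10 and cols[1] in WANTED:
        name, normalized, raw = cols[1], cols[3], cols[9]
        # On Samsung drives, the raw Total_LBAs_Written * 512 bytes is
        # roughly the total host writes.
        print(f"{name}: normalized {normalized}, raw {raw}")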
 
I'm glad OCZ snuffed it and won't touch AMD's new "OCZ" drives until many, many years have passed and they have a good rep.

OCZ exists in name only now; the brand is wholly owned by Toshiba, which manufactures the drives. Assuming the new drives are equivalent to the old ones is pointless.
 
SSDs are so much better than hard drives at tracking drive use.
Having metrics to gauge expected drive failure is great.
The tech is still maturing. Hopefully not at the cost of too many lost memories...
 
The tech is still maturing. Hopefully not at the cost of too many lost memories...
Yeah, this is still early in the development of solid state storage. One problem with a long term test like TR did is that firmware updates for those tested models may have allowed more graceful failure. Even the Intel drive, which was supposed to go into read-only mode when writes were unreliable, simply became inaccessible after restart. No doubt that and some other firmware-related problems in other models have already been fixed.

And the sample size in that test was too small to make even vague predictions of how each type of flash performs in the long term. The single TLC 840 non-Pro outlasted 2 supposedly more durable drives, including one of the two identical Kingston models which wound up failing before the other by a factor of 3x (~700TB vs ~2.1PB written).
 
By the time the test concludes that a drive is reliable, it is no longer on the market!

It would act as an early-warning system for bad SSDs.

After the Samsung TLC fiasco I'd also like to see some kind of long-term read testing on various drives.
 
The article makes me glad I have an 840 Pro.

I really dislike how some of the drives handle failure though. Going into read-only makes sense, but a drive bricking itself after the first reboot? How is that beneficial to anyone?
 
I really dislike how some of the drives handle failure though. Going into read-only makes sense, but a drive bricking itself after the first reboot? How is that beneficial to anyone?

I don't think the designers have much of a choice. The SSD needs to read its flash allocation table when it initializes so that it can map the LBAs to the flash pages. And the allocation table is stored in flash memory. If the flash memory is so worn out that it has become unreadable, then the SSD cannot initialize the allocation table, and it cannot go into "read-only mode".

The only improvement I can see the designers being able to make would be to add a mode where the initialization has failed but the SSD still responds to ATA commands, albeit with read or write errors whenever any actual I/O is attempted.
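
To make the mapping-table point concrete, here's a toy illustration (purely conceptual - no real controller firmware looks anything like this): the logical-to-physical map is itself stored in flash, so if it can't be read back at power-on, every LBA lookup fails before the drive could even offer read-only access.

Code:
# Toy illustration of why a worn-out mapping table bricks the whole drive:
# the LBA -> flash-page map is itself stored in flash, so if it can't be
# read back at power-on, no user data can be located at all.
class ToySSD:
    def __init__(self, stored_map_blob):
        # Real firmware rebuilds this map from dedicated flash blocks at init.
        self.lba_to_page = self.load_map(stored_map_blob)

    def load_map(self, blob):
        if blob is None:                      # the map blocks failed to read
            raise IOError("mapping table unreadable - drive cannot initialize")
        return dict(blob)                     # e.g. {lba: (block, page)}

    def read(self, lba):
        block, page = self.lba_to_page[lba]   # every read goes through the map
        return f"data at block {block}, page {page}"

# With a healthy map, reads work:
ok = ToySSD([(0, (3, 17)), (1, (3, 18))])
print(ok.read(0))

# With the map gone, the drive can't even offer read-only mode:
try:
    ToySSD(None)
except IOError as err:
    print("bricked:", err)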
 
I've bought over 100 Crucial SSDs. One has failed in 2 years, and that could have been the RAID card vomiting all over itself. These ranged from 256GB to 1TB and were in servers, desktops, and laptops running virtual machines or day-to-day tasks.

You get a bad one every so often, but I don't believe ALL Crucials are inherently bad; I've had about a 1% failure rate so far over 2 years of constant use across a myriad of devices.
 
Pretty interesting test. I have 3 SSDs in my gaming rig still plugging along: an 840 Pro, an Agility 3, and a Crucial MX500, all going for 2+ years now. I had a Vertex take a plunge about 8 months after part of its connector broke. It finally had had enough and no amount of tweaking was helping it anymore.
 
Been on a 256GB 840 Pro for a couple of years now. No errors, and read/write speeds are pretty much still around 500MB/s.
 
I want to see the final result of an 850 Pro endurance test.
This one is in progress, 1PB so far with only 16% wear levelling used!
http://blog.innovaengineering.co.uk/

We're now at 1.5PB written and still going strong with 24% wear levelling used. The SSD is still on track for 6PB - we will keep the test running until it fails, though that may be some time!

We're also going to introduce periodic 7-day power-offs in response to some of the comments in this thread, which we're grateful for - this will allow us to check that it can still retain data ok. The first of these is scheduled for the 2PB point, which should only be a couple of weeks away. I will post an update once the power-off period is complete.
 
We're now at 1.5PB written and still going strong with 24% wear levelling used. The SSD is still on track for 6PB - we will keep the test running until it fails, though that may be some time!

We're also going to introduce periodic 7-day power-offs in response to some of the comments in this thread, which we're grateful for - this will allow us to check that it can still retain data ok. The first of these is scheduled for the 2PB point, which should only be a couple of weeks away. I will post an update once the power-off period is complete.

That is good news seeing as I just picked up a Samsung 850 Pro 1TB to pair with my 840 Pro 512.
 
OK, 2PB written now and a few things to update on.

The disk is still running fine, and we've powered it off to test data retention - no problems, it successfully read back a huge file that we had stored on it.

But we've also seen the first few sectors on the disk be remapped - 7 so far. There has been no data loss, but these sectors did not erase properly, so the controller swapped them for some of its spares. We will keep a close eye on how this progresses and update here when there is more news.
 
Interesting; in TechReport's endurance test of the 840 Pro, "Reallocated sectors started appearing in volume after 600TB of writes. Through 2.4PB, the Pro racked up over 7000 reallocated sectors totaling 10.7GB of flash." (Source) You've seen the reallocated sectors just beginning at the point when the 840 Pro was almost dead.
 
You've seen the reallocated sectors just beginning at the point when the 840 Pro was almost dead.

That's a good point, though the size of the SSD is likely a factor here too; the 840 Pro that Tech Report looked at was 256GB, whereas the 850 Pro I'm testing is 1TB. That might suggest that the 840 actually did slightly better than the 850, though both did brilliantly.

The 850 Pro endurance test has reached 3PB written now, with just under 800 sectors reallocated. I'm powering it off for a week as a short retention test, and will then set the normal testing going again.
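
One rough way to compare those figures across different capacities (back-of-envelope only - it ignores write amplification and over-provisioning) is to divide total host writes by drive capacity to get full-drive fills:

Code:
# Back-of-envelope: total host writes / capacity = full-drive fills.
# Ignores write amplification and over-provisioning, so rough only.
drives = {
    "840 Pro 256GB (Tech Report)": (2.4e15, 256e9),   # ~2.4PB written
    "850 Pro 1TB (this test)":     (3.0e15, 1e12),    # 3PB written so far
}
for name, (written, capacity) in drives.items():
    print(f"{name}: ~{written / capacity:,.0f} drive fills")

By that crude measure the 256GB 840 Pro absorbed roughly three times as many drive fills before dying as the 1TB 850 Pro has so far, which lines up with the size point above.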
 