Backblaze Analyzes SSD Reliability

AlphaAtlas

Backblaze regularly posts failure rates for their substantial collection of hard drives, and according to the results they published just a month ago, they have over 100,000 of them to test. But as we've recently noted, flash memory prices are dropping like a rock, hence solid state drives are quickly becoming a somewhat economical alternative to 7200 RPM spinners.

But just how reliable are these drives? According to a recent blog post, Backblaze thinks that SSDs are "generally" more reliable than HDDs under most workloads, though the factors that affect SSD reliability are different. As their name would suggest, SSDs have no moving parts, hence they're more tolerant to shock, vibration, and temperature changes, but that also means that users get no audible indicators when they do start failing. Flash memory can eventually wear out too, and it can wear out relatively quickly in QLC SSDs, but Backblaze says "SSDs can be expected to last as long or longer than HDDs in most general applications." Unfortunately, the backup company isn't backing up their claims with hard data yet, but other publications have torture tested SSDs before, and I expect it won't be long before Backblaze starts posting SSD failure rates as well. Thanks to AceGoober for the tip.

SSDs are a different breed of animal than a HDD and they have their strengths and weaknesses relative to other storage media. The good news is that their strengths - speed, durability, size, power consumption, etc. - are backed by pretty good overall reliability. SSD users are far more likely to replace their storage drive because they're ready to upgrade to a newer technology, higher capacity, or faster drive, than having to replace the drive due to a short lifespan. Under normal use we can expect an SSD to last years. If you replace your computer every three years, as most users do, then you probably needn't worry about whether your SSD will last as long as your computer. What's important is whether the SSD will be sufficiently reliable that you won't lose your data during its lifetime.
 
I've owned probably 25 SSDs and used countless others at work (IT). I would love to see these stats when Backblaze finally releases the data.

The highest failure rates have been with first generation SSDs that ran SATA-II and more recently with Samsung 850 / 840 Pro 1TB / 512GB models. Knock on wood - I have never had a Samsung EVO or Samsung Enterprise SSD die on me.
 
Shockingly, my original 30GB OCZ Vertex 1 still works and is still in use in a buddy's rig to this day. Can't say the same for my two Vertex 2s, which both died fast.
 
Well, considering the fact that Backblaze uses consumer mechanical drives for their backup, it's clear that using SSDs is simply not cost effective for them, regardless of reliability. For backup, density is just as important as cost, if not more so, and you can get huge mechanical drives for a lot less than a big SSD, and SSDs don't get big enough yet. Sure, they are coming down in price very fast, but they're not quite there yet. They can probably replace a bunch of their failed mech drives for the price of one big SSD. As for wear, it just depends on how they are used. We have a heavy image processing machine and it can easily wear out a 256-512GB SSD within a year or so. Granted, for our application it's totally worth it with SSDs due to performance; nobody cares if we have to replace a drive, as it has served its purpose. But for Backblaze and their backups, I'm not sure how important performance is versus reliability.
 
I've owned a good dozen SSDs and the worst brand was OCZ. 5 out of the 7 OCZ SSDs failed within a year. By a stroke of luck the other 2 are still going.

I'm currently running a Crucial M4 512GB with 13,257 PoH and 6,389 PCC and it is still going great and that is, according to Crucial, only 2% Lifetime Used. I'd be hard pressed to find any mechanical drive with those reliability numbers.
 
I've owned a good dozen SSDs and the worst brand was OCZ. 5 out of the 7 OCZ SSDs failed within a year. By a stroke of luck the other 2 are still going.

I have a 1st gen Vector that's still going strong.

I also have an Agility 4, but since it's a bottom-of-the-barrel OCZ SSD, I usually try not to jinx it...
 
I've recently started monitoring all of our drives at work and find it fascinating how SSDs deteriorate no matter what, while I have some enterprise grade HDDs that are years old with almost zero errors and claim to have 100% life left. Don't get me wrong, I put SSDs wherever I can, but when I want solid data reliability over speed I'll toss in HDDs all day long.
 
My 160GB SATA II Intel SSD I paid 700 bucks for in 2009 is still going strong. It has been in more rigs than any other part except for one power supply.
 
I have a 1st gen Vector that's still going strong.

I also have an Agility 4, but since it's a bottom-of-the-barrel OCZ SSD, I usually try not to jinx it...


You must have some extreme luck.

I've owned 6-8 (can't remember) OCZ drives including warranty replacements. Every last one of them failed before the two year mark. Granted, these were designs from before Toshiba bought their assets in 2013. Maybe they've been better since?

In general - however - my conclusion is that SSD reliability is barely an issue anymore. Sure in the early days you had to worry about write endurance, but these days, as long as you buy good brands (essentially Samsung or Intel) even TLC drives are reaching the petabytes written level before they fail.

Write endurance just is not an issue on good SSD brands (which I limit to just Samsung and Intel) these days.
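
For a rough sense of why write endurance rarely bites in home use, here is a back-of-the-envelope sketch. The 1,000 TB (1 PB) endurance figure and the 50 GB/day write rate are illustrative assumptions, not measurements from any drive mentioned in this thread, and the function name is made up for the example.

```python
DAYS_PER_YEAR = 365.25


def years_to_wear_out(endurance_tb: float, daily_writes_gb: float) -> float:
    """Years until an assumed write endurance is used up at a steady daily write rate."""
    days = (endurance_tb * 1000) / daily_writes_gb  # 1 TB = 1000 GB
    return days / DAYS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical inputs: 1 PB of endurance, 50 GB written per day.
    print(f"{years_to_wear_out(1000, 50):.1f} years")
```

With those assumed inputs the answer comes out at roughly 55 years, so at that write rate the drive would be retired for other reasons long before the flash wore out. Heavier workloads shrink that number proportionally.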
 
I've had 2 or 3 OCZ SSDs die out of about 10. I have OCZ Summit, OCZ Vertex, OCZ Agility and I think 1 or 2 other models.

Only 2 that died were OCZ Vertex....
 
I've had 2 or 3 OCZ SSDs die out of about 10. I have OCZ Summit, OCZ Vertex, OCZ Agility and I think 1 or 2 other models.

Only 2 that died were OCZ Vertex....

My first several SSD's were OCZ, and I naively kept buying them for a while.

- First Gen Agility 120GB
- OCZ Agility 2 120GB (RMA replacement for first Agility)
- OCZ Agility 3 120GB (second RMA replacement for Agility)
- OCZ Octane 60GB
- OCZ Onyx 32GB
- OCZ Vertex 3 256GB
- OCZ Vertex 3 256GB (RMA replacement for first one)
- OCZ Vertex 4 256GB (second RMA for Vertex 3)
- OCZ Vector 256GB (Third RMA replacement for Vertex 3)

The 120's and 256's were all for my desktop. The smaller ones were for router and server boot drives.

All of the above are dead. All of the ones that saw regular use died within 2 years.

I actually still had the final Vector sitting in a drawer, because I didn't have confidence enough in it to use it, until about 2 years ago when I stuck it in my stepson's desktop, where it finally died a year later.

So, when it comes to OCZ SSD's I am 0/9.
Not sure why I kept using them after the first few failures. Maybe I was just convinced that it was bad luck, or that this was just part of the risk of using SSD's and worth it for the performance. Not sure.

Eventually I wised up though, and have only bought Intel and Samsung SSD's now for like 6 years or so. I've easily bought 30 of those in that time (I've lost count) both new and used since then, and not a single one has failed, even the heavy use cache drives on my server.

My conclusion is that OCZ was utter trash.

I will give them kudos on their RMA process though. It was painless. I guess practice DOES make perfect :p
 
I've recently started monitoring all of our drives at work and find it fascinating how SSDs deteriorate no matter what, while I have some enterprise grade HDDs that are years old with almost zero errors and claim to have 100% life left. Don't get me wrong, I put SSDs wherever I can, but when I want solid data reliability over speed I'll toss in HDDs all day long.

Reliable onsite storage: quality NAS + stack of enterprise HDDs in RAID5 / RAID6
 
I've had only 3 SSD deaths. Two of those are OCZ Vertex 4, the other one is a Samsung 840 Evo, but this one died because of a short circuit along with the motherboard (apparently a mouse peed inside the computer).

The most worn-out drives are various models of 120GB ADATA branded drives used for office computers. The longest running drives are a pair of 80GB Intel 320 (used in a server in RAID1 since 2011) and a 120GB OCZ Vertex II (since 2012 as boot drive in a file server).

Then there are a bunch of 840, 840 Evo, 850 Evo, some Crucial MX300 and BX100, Sandisk Ultra Plus, Ultra II, and also some Intel 710. The newer ones in use are a couple of Samsung 860, Intel 545s and HP EX920.

I may have missed one or two odd models.

EDIT: I remembered an array of 256GB 850 Pro and another one of 512GB 850 Pro that run 24/7.
 
I've owned a good dozen SSDs and the worst brand was OCZ. 5 out of the 7 OCZ SSDs failed within a year. By a stroke of luck the other 2 are still going.

I'm currently running a Crucial M4 512GB with 13,257 PoH and 6,389 PCC and it is still going great and that is, according to Crucial, only 2% Lifetime Used. I'd be hard pressed to find any mechanical drive with those reliability numbers.

My first SSD was the Vertex which I used 2 for 3 years in a workstation that was on 24/7 and never had a problem. So far none of the SSDs I have ever bought have needed to be replaced. Though admittedly I've almost exclusively stuck to Samsung's Pro and a few EVOs since the Vertex 2. I recently bought a few Crucial M500s as cache drives for my NAS, so I will see how that goes.
 
Reliable onsite storage: quality NAS + stack of enterprise HDDs in RAID5 / RAID6

RAID 5/6 is not recommended anymore, as rebuild times take ages on modern high capacity drives and pose a risk of another drive dying during the rebuild. Most storage arrays now use RAID 10.
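
To put rough numbers on the rebuild concern: a rebuild has to rewrite an entire replacement drive, so the best case is capacity divided by sustained write speed. This is only a sketch; the 12 TB capacity and 150 MB/s rate are assumptions picked for illustration, and a real array that is also serving I/O will usually rebuild slower than this floor.

```python
def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Best-case hours to rewrite a whole drive at a constant sustained rate."""
    capacity_bytes = capacity_tb * 1e12   # decimal terabytes, as drives are sold
    rate_bytes_s = rate_mb_s * 1e6        # decimal megabytes per second
    return capacity_bytes / rate_bytes_s / 3600


if __name__ == "__main__":
    # Hypothetical inputs: one 12 TB drive, 150 MB/s sustained.
    print(f"{rebuild_hours(12, 150):.1f} hours")
```

With those assumed inputs that is a bit over 22 hours of the remaining drives being read flat out, which is the window in which a second failure would hurt.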
 
Amongst the 9 personal and 200+ work SSDs I've dealt with, I have yet to experience a single failure.

Dozens of spindle drives have shit themselves, however.
 
I on the other hand have an old OCZ first gen Agility still running in my gf's laptop. I think it's over 6 years old now. No issues. It was run 5 years as my OS drive 24/7 until it got placed into my lady's laptop 2 years ago. She still uses it daily. It's not all trash. Sorry you had some bad luck there :(


 
I'm still using my first gen 80GB Intel SSD.

Those were/are completely bulletproof. Some embedded systems I worked on used those (in hostile environments!) and even now with a bazillion units in the field, the failure rate due to that part is essentially zero.

Models since then have not fared as well in the same conditions.
 
I've only had a single SSD failure in all the drives I've bought, and that was a first gen OCZ Agility 60GB.

Several Crucial drives and one Intel drive are all still running, or have been retired before failure. Most have exceeded 7 years run-time.

ALL storage drives are way more reliable than they used to be, but SSDs are in a class of their own.
 
Those were/are completely bulletproof. Some embedded systems I worked on used those (in hostile environments!) and even now with a bazillion units in the field, the failure rate due to that part is essentially zero.

There was a firmware bug that caused data loss and the drive was recognized as having 8MB capacity. I never saw a drive with that problem though.
 
There was a firmware bug that caused data loss and the drive was recognized as having 8MB capacity. I never saw a drive with that problem though.

I've seen it three times, but they were all the Gen2 drives.

I will not touch a PNY drive. 23/20 in one batch have failed. Yeah, those numbers are right, because the RMA'd units failed as well! I have the only 2 non-failed units, and they have "DO NOT USE" written in silver sharpie on them just to make sure they don't go in production devices.

Our biggest problem was the Crucials that had a firmware bug that hit after so many hours of use.
 
I've bought a lot of SSD's for work, and the only SSD that's died on me was a Sandisk 960GB model.
Out of dozens of EVOs and Crucials, I only had 2 Sandisk, so that's not a good average.

I have run into BIOS problems on a few older Crucial drives and the Samsung 840 EVOs, but after a BIOS update they worked like new again.
 
RAID 5/6 is not recommended anymore, as rebuild times take ages on modern high capacity drives and pose a risk of another drive dying during the rebuild. Most storage arrays now use RAID 10.

Yup - parity RAID is a loser. Not going to be as reliable as you need it to be, and there's a permanent write performance penalty. Just say no! RAID 10 FTW
 
I've seen it three times, but they were all the Gen2 drives.

I will not touch a PNY drive. 23/20 in one batch have failed. Yeah, those numbers are right, because the RMA'd units failed as well! I have the only 2 non-failed units, and they have "DO NOT USE" written in silver sharpie on them just to make sure they don't go in production devices.

Our biggest problem was the Crucials that had a firmware bug that hit after so many hours of use.

I bought 8 PNY Optima 240s 5 years ago for my business. Only 1 is left alive: the one in my gaming rig. Yeah, I'm worried.
 
Only one SSD failure since 2009 in my personal systems, on an OCZ Core 60GB (truly awful). Last two (which are probably still in my sig) were Corsair 240GB Force GTs that I sold on at about 45,000 power on hours.

I think outside of certain heavy use scenarios or firmware faults most home users won't have reliability issues. That said the trend towards cheap products like cacheless SSDs is rather frustrating.
 
All I've got so far is two 500GB WD Blue SSDs... the first version and the new 3D version.
I entered late in the game, but won't go back... no way.
Now if reliability was all like the OCZ described here, wow! That's terrible.
 
There was a firmware bug that caused data loss and the drive was recognized as having 8MB capacity. I never saw a drive with that problem though.

I saw it once; lost the data (unimportant in this case) but recovered the drive and continue to use it to this day. Have two of the drives still in service, actually.
 
Rolled out probably 150+ SSDs over the past 7 years or so. So far only lost two. A couple of Sandisk ones. But that's just luck of the draw IMO.

A lot of them were just the cheap A300 Kingstons. They got a lot of flak back then but they were a solid, reliable SSD. And now probably far better quality than midrange SSDs today.

Only ever bought a couple of 64GB OCZ. They got phased out and forgotten long before they failed.
 
I'm still using two Samsung Spinpoint F1 HDDs, one 750GB and one 1.5TB, as general storage drives and for games where I am not so picky about loading times. They are like 11 years old now; I bought them when I built my 2500K Sandy Bridge rig and now they sit in my Ryzen rig. No issues, no weird sounds, nothing. I am quite convinced these platters are eternal, knock on wood. In any case, a high quality HDD lasts a long, long time. I wonder if my Kingston SSD can do the same, but it has been working flawlessly for a couple of years now.
 
I'm still using two Samsung Spinpoint F1 HDDs, one 750GB and one 1.5TB, as general storage drives and for games where I am not so picky about loading times. They are like 11 years old now; I bought them when I built my 2500K Sandy Bridge rig and now they sit in my Ryzen rig. No issues, no weird sounds, nothing. I am quite convinced these platters are eternal, knock on wood. In any case, a high quality HDD lasts a long, long time. I wonder if my Kingston SSD can do the same, but it has been working flawlessly for a couple of years now.

My 1TB Spinpoint F1 died a couple years ago... I had it since around 2007. Most reliable HDD I ever had. Still, HDDs have been very unreliable for the last decade.
 
My 1TB Spinpoint F1 died a couple years ago... I had it since around 2007. Most reliable HDD I ever had. Still, HDDs have been very unreliable for the last decade.

Hmm... It could be that mine are on their last legs too, but aren't showing it. On the other hand, I hesitate to upgrade "just in case" because apparently modern HDDs are quite crappy quality.
 
My longest running daily used HDD is a WD 320GB. I use it for temp storage and BT. It has files on it from 2003.
My longest running daily used SSDs are an Intel X25-V for the OS and an OCZ Vertex for apps/games. Got those near release day.
All in all, these have all worked wonderfully for general daily PC use and far exceeded the lifespan I expected of them.
 
My Intel X25-V is still rocking on strong in my old i5-430M laptop.

 
You must have some extreme luck.

I've owned 6-8 (can't remember) OCZ drives including warranty replacements. Every last one of them failed before the two year mark. Granted, these were designs from before Toshiba bought their assets in 2013. Maybe they've been better since?

In general - however - my conclusion is that SSD reliability is barely an issue anymore. Sure in the early days you had to worry about write endurance, but these days, as long as you buy good brands (essentially Samsung or Intel) even TLC drives are reaching the petabytes written level before they fail.

Write endurance just is not an issue on good SSD brands (which I limit to just Samsung and Intel) these days.
You must be ssding wrong :-D

We purchased a bunch of workstations for work back in 2011, all fitted with OCZ Vertex3 SSDs: 8 pcs, of which two had 120GB OS drives and the rest had 60GB. Plus, the two machines with the 120GB OS drive also had a pair of 480GB Vertex3s that were abused with heavy IO workloads. But all still work today.
 
Here's my current collection:
OCZ Vertex2 120GB - 70,041 hrs (8 yrs) - 100% good health
Samsung SSD 830 256GB - 56,223 hrs (6.5 yrs) - 99% good health
Samsung SSD 850 Pro 512GB - 48,559 hrs (5.5 yrs) - 100% good health
Samsung SSD 850 Pro 1TB - 41,957 hrs (4.8 yrs) - 100% good health
Samsung SSD 850 Pro 1TB - 41,946 hrs (4.8 yrs) - 100% good health
Crucial CT250BX100SSD - Incorrect stats? - 15,576 power on count, 2435 power on hours - 100% (purchased 7/2/15)

I've not had any SSD's fail on me to date but probably just jinxed it.
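
For anyone wanting to sanity-check power-on hours against the years quoted above, a minimal sketch of the conversion follows. It assumes continuous runtime and a 365.25-day year; the function name and the dictionary are just for illustration, with the hour counts copied from the list above.

```python
HOURS_PER_YEAR = 24 * 365.25  # 8766 hours


def power_on_years(hours: float) -> float:
    """Convert a SMART power-on hours reading into years of powered-on time."""
    return hours / HOURS_PER_YEAR


# Hour counts as posted in the list above.
drives = {
    "OCZ Vertex2 120GB": 70_041,
    "Samsung SSD 830 256GB": 56_223,
    "Samsung SSD 850 Pro 512GB": 48_559,
    "Samsung SSD 850 Pro 1TB": 41_957,
    "Crucial CT250BX100SSD": 2_435,
}

if __name__ == "__main__":
    for name, hours in drives.items():
        print(f"{name}: {power_on_years(hours):.1f} yrs powered on")
```

The posted figures line up to within rounding (the 830 comes out at 6.4 rather than 6.5). The BX100's 2,435 hours is only about 0.3 years of powered-on time, which is a separate number from its calendar age since purchase.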
 
You must be ssding wrong :-D

We purchased a bunch of workstations for work back in 2011, all fitted with OCZ Vertex3 SSDs: 8 pcs, of which two had 120GB OS drives and the rest had 60GB. Plus, the two machines with the 120GB OS drive also had a pair of 480GB Vertex3s that were abused with heavy IO workloads. But all still work today.


Well, every last OCZ drive I've used has failed, and quickly.

Exactly 0 of the Intel (5x) and Samsung (12x) SSDs I've bought and used for years have had any problems whatsoever.
 
I lost an OCZ Vertex II 60 GB, but it had been my boot drive for probably 5 years before I moved it to a laptop for a couple of years. I guess I was lucky. The Crucial MX300 in my wife's PC started acting strange right as the warranty was expiring. No data loss, just weird freezes. I replaced it with my Intel drive when I upgraded and put it in a drawer. I used it for a Linux server years later and got weird errors when configuring the machine. Tossed it for good.
 