Hard Drive Reliability Stats for Q1 2015

Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.

Their stats are pretty much useless if the drives are used properly.

Yes, yes, I am familiar with Google's old data that says drives last longer when warm than when cool, but Google never ran their drives as hot as Backblaze does.

For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

(They all continue to run happily)
 
Wow, I used to avoid the Deskstar drives (formerly known as the "IBM Deathstar") like the plague back in the day. They had a very high failure rate. Good to know they have such low failure rates now.
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.

Their stats are pretty much useless if the drives are used properly.

Yes, yes, I am familiar with Google's old data that says drives last longer when warm than when cool, but Google never ran their drives as hot as Backblaze does.

For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

(They all continue to run happily)

Their sample size for 4TB drives is still fairly small, and yours is smaller still. I don't recall enough statistics to work out what the 95% confidence interval implies, but for the model in question it's a pretty wide range (0.2%-50.2%). I could be wrong, but I think that means the number could be way off. It's also notable that they all died around the same time (there were no deaths until March... most drives seemed to have a few deaths each month).

That said, I seem to recall reading elsewhere that the 4TB Red drives had issues (probably when they first came out). That, along with cost, led me to go with Seagate for the first time in at least 10 years. I've had no problems with them so far (not that that's statistically significant).
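For what it's worth, here's a rough back-of-the-envelope sketch (assuming independent failures and a single annual failure rate applied to all 12 drives, which is a simplification) of what that wide interval does to the "at least one should have failed" claim:

```python
# Rough sketch: probability of at least one failure among N drives in a year,
# assuming independent failures and a single annual failure rate (AFR) for all drives.
# The 0.2% and 50.2% values are the confidence-interval bounds quoted above;
# the 5% value is just an illustrative midpoint.

def p_at_least_one_failure(afr: float, n_drives: int) -> float:
    """P(>=1 failure) = 1 - P(no failures) = 1 - (1 - AFR)^n."""
    return 1.0 - (1.0 - afr) ** n_drives

for afr in (0.002, 0.05, 0.502):
    print(f"AFR {afr:6.1%}: P(at least 1 of 12 fails) = {p_at_least_one_failure(afr, 12):.1%}")
```

At the low end of that interval, the chance of seeing even one failure in 12 drives within a year is only a couple of percent; it's only toward the high end that "at least one should have failed" becomes a near-certainty. That's the problem with quoting a single number from such a wide interval.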
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.

Their stats are pretty much useless if the drives are used properly.

Yes, yes, I am familiar with Google's old data that says drives last longer when warm than when cool, but Google never ran their drives as hot as Backblaze does.

For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

(They all continue to run happily)

Wow, 12 drives?! You don't say...
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.
They use dampened cages these days - a big upgrade from their rubber-band dampening. However, it's definitely not on the same level as enterprise hardware.

Their stats are pretty much useless if the drives are used properly.
They aren't - Backblaze's daily workload for these drives is, as Backblaze itself states, comparable to a torture test for a consumer drive.

Yes, yes, I am familiar with Google's old data that says drives last longer when warm than when cool, but Google never ran their drives as hot as Backblaze does.
No they don't... check the numbers.

For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

(They all continue to run happily)
Your anecdotal evidence is anecdotal, and from very different circumstances. Aside from ignoring the confidence intervals they themselves published, you are forgetting that they are running hardware RAIDs with 45 disks, on top of another cabinet of the same, on top of another cabinet of the same... unlike your ZFS setup with 12 disks... Never mind the differences in workload.

These are statistics based on torture tests done to disks that are not meant to be used under these circumstances. While they don't map perfectly onto consumer use, they do offer an understanding of how well consumer disks will perform under worst-case scenarios, and their cabinets aren't that bad (better than most external docks and drive sleds, but certainly not enterprise).
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.

Their stats are pretty much useless if the drives are used properly.

Yes, yes, I am familiar with Google's old data that says drives last longer when warm than when cool, but Google never ran their drives as hot as Backblaze does.

For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

(They all continue to run happily)

Actually, no. HALT/HASS is an established method for determining when and how something fails.

They might not have known they were doing it, but that doesn't fully dismiss what they did.
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.

I looked at - or at least tried to, as best I could - the SMART data set they included. I opened the 2015-01-01.csv file from 2015_data.zip and sorted the raw values for attribute 194 (the current temperature in Celsius) for all 41,000 or so drives. They read between 16C and 44C. At the top end, only a single drive actually read 44C, and only a few dozen drives read above 40C. 35,000 of the 41,000+ drives had temps of 30C or less.

My method is admittedly very limited, and I did not even try to follow their recommendations for reading the data sets because I don't know how to do that stuff. But I'm curious what your standard is for "way over temp" when the manufacturers generally specify a max in the range of 50C to 60C, and how you've determined that your standard is better researched than Backblaze's own article on how hard drive temperature affects failure rate.
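In case anyone wants to check the same thing themselves, a minimal sketch along these lines (assuming the Backblaze daily-snapshot CSV layout, where drive temperature shows up as the "smart_194_raw" column; adjust the filename or column name if your copy differs) would be:

```python
# Minimal sketch of the temperature check described above, assuming the Backblaze
# daily-snapshot CSV layout where attribute 194 (drive temperature in C) appears
# as the "smart_194_raw" column. Drives that don't report the attribute are skipped.
import csv

temps = []
with open("2015-01-01.csv", newline="") as f:
    for row in csv.DictReader(f):
        raw = row.get("smart_194_raw", "")
        if raw.strip():
            temps.append(int(float(raw)))

temps.sort()
print(f"{len(temps)} drives reporting a temperature")
print(f"min {temps[0]}C, max {temps[-1]}C, median {temps[len(temps)//2]}C")
print(f"at or below 30C: {sum(t <= 30 for t in temps)}")
print(f"above 40C:       {sum(t > 40 for t in temps)}")
```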
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

They pack drives WAY too tightly into their boxes, and run them way over temp.
Then wouldn't this mean that if the drives were operated in a less stressful environment, they would simply last even longer? :rolleyes:
 
My 3TB Seagate 7200.14 (model number ST3000DM001) didn't even make it 3 years.
 
It was their previous reports that made me buy an HGST 4TB backup drive.
Very glad I did, and that I didn't touch Seagate (again, due to their previous reports).

Neither Seagate nor WD looks good.
It would have been nice to see the WD Black drives tested.
 
Zarathustra[H];1041662863 said:
For instance: I have had 12 4TB WD Reds in my properly cooled server for the last year. According to Backblaze's stats, with 95% certainty, at least one of them should have failed by now.

You don't understand statistics, then.

A 25% failure rate doesn't mean that one in four will absolutely fail if you were to buy them yourself. It just means a higher overall chance versus a brand that has a 2% failure rate.

Case in point: I once bought four Seagate drives to go into a NAS, and THREE of them failed prematurely. A far cry from the posted 32% failure rate.
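To put some rough numbers on that, here's a quick sketch (assuming independent failures at a fixed per-drive probability, using the 32% figure above) of how spread out the outcomes are when you only buy four drives:

```python
# Binomial illustration of the point above: with only four drives, the observed
# outcome can land far from the underlying rate. Assumes independent failures at
# a fixed per-drive failure probability (the 32% figure quoted above).
from math import comb

p = 0.32  # assumed per-drive failure probability
n = 4     # number of drives bought

for k in range(n + 1):
    prob = comb(n, k) * p**k * (1 - p)**(n - k)
    print(f"P({k} of {n} fail) = {prob:.1%}")
```

Even at a 32% rate, ending up with three of four failures happens roughly 9% of the time, and zero failures happens about 21% of the time, so a sample of four tells you very little either way.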
 
Seagate continues to earn its reputation. The last time I moved, I destroyed 5 dead drives I'd accumulated, all Seagate.
 
Zarathustra[H];1041662863 said:
Just disregard anything Backblaze says.

Yeah, we should trust N = 12 from Random Dude On The Internet With Strong Opinions (but no data) over what appears to be a solid, statistically valid sample of over forty thousand drives.

Riiiiiiiight.
 
A) Backblaze does not run drives within spec. While this might not seem like a big deal, they buy commercial drives rather than industrial drives and run them in industrial servers. A giant no-no for drive life, but cheaper than buying enterprise drives.

B) Their drives are not properly mounted, but considering they're just using commercial drives on the cheap, this is not surprising. Drives packed too tightly cause what's known as transference vibration, even across small air gaps, due to air pressure. When you pack rows and rows of drives too densely, these vibrations grow out of control.

C) The stats, while accurate for their business, are not an accurate indicator of general drive reliability. Most if not all of the high-failure-rate Seagate drives, for example, are on average 2+ years older than the WD drives, and if you compare age vs. reliability on the charts, you see that WD absolutely tanks in that respect.

D) Commercial drives are not meant to run full tilt 24/7, reading/writing the whole time. They're just not. You're simply overworking machinery that was never meant to be worked that hard.

An example would be having a 4-cylinder tow truck vs. an 8-cylinder tow truck. Both can do the job fine, but you'll wear out and break the 4-banger far before the 8-banger due to engine load and stress.

The best example of this is that the industrial drives they DO have last an exceptionally long time. HGST enterprise drives are meant for server arrays, and their reliability shows here. They are the 8-bangers of the HD world.

People have picked Backblaze's data apart before. The end result is always the same: running commercial drives in industrial servers kills them ASAP.
 
Seagate continues to earn its reputation. The last time I moved, I destroyed 5 dead drives I'd accumulated, all Seagate.

Yeah... I moved a couple of powered-off SANs about a month ago... 24 Seagate drives in total.

2 drives did not come back to life... just completely dead when I powered them back up.

In the past we had two drives die at the same time while the array was running, as well as a drive dying randomly every few months.

And these are the enterprise drives... being run at very low loads... in a pretty cold server room.

Same thing goes for SANs with Seagate drives at other locations.

I cannot recommend them at all.
 
Just had my first out of 14 WD Reds start giving me errors (running 24/7 for a bit over a year now). WD did great with the replacement and RMA, though, and I didn't have any data loss, so it worked out for me.
 
Maybe I need to let go of grudges... but HGST used to be the IBM Deathstar. I lost about 6 way back then. I'm still having PTSD issues lol
 
That's pretty amusing to me, as I have half a dozen Seagate drives that are all 8+ years old now and still running perfectly. To be fair, they are all 320GB drives, and it seems Seagate's problems all came up in the 1TB+ drives.
 
That's pretty amusing to me, as I have half a dozen Seagate drives that are all 8+ years old now and still running perfectly. To be fair, they are all 320GB drives, and it seems Seagate's problems all came up in the 1TB+ drives.

You couldn't kill a sub-1TB Seagate with anything short of molten lava. :D Love 'em.
 
I've had great luck with both Hitachi and WD enterprise-level drives in my servers.
Almost every server drive that has died over the past few years has been a Seagate. I wouldn't buy them myself, but that's what Dell usually ships in their servers. Makes me want to skip the Dell drives and put in Hitachi or WD drives.
 
I'm not sure I can take anything they say or their "stats" seriously after reading this:

http://www.tweaktown.com/articles/6...bility-myth-the-real-story-covered/index.html

That analysis is as flawed as it claims the Backblaze report to be.


A) Backblaze does not run drives within spec. While this might not seem like a big deal, they buy commercial drives rather than industrial drives and run them in industrial servers. A giant no-no for drive life, but cheaper than buying enterprise drives.

And a great indicator of the build quality.


B) Their drives are not properly mounted, but considering they're just using commercial drives on the cheap, this is not surprising. Drives packed too tightly cause what's known as transference vibration, even across small air gaps, due to air pressure. When you pack rows and rows of drives too densely, these vibrations grow out of control.

They've improved their racks over time, plus people mount and use their drives in all sorts of ways and cases, and it still makes for valuable data on the robustness of the drives.


C) The stats, while accurate for their business, are not an accurate indicator of general drive reliability. Most if not all of the high-failure-rate Seagate drives, for example, are on average 2+ years older than the WD drives, and if you compare age vs. reliability on the charts, you see that WD absolutely tanks in that respect.

Not true. Seagate drives that are younger than or as old as the other drives still have much higher failure rates, AND there is data from 2013 through 2015 for the same drives showing failure rates that are more or less consistent regardless of age.


D) Commercial drives are not meant to run full tilt 24/7, reading/writing the whole time. They're just not. You're simply overworking machinery that was never meant to be worked that hard.

If a drive isn't meant to be used 24/7, it is crap. It is only recently that companies started declaring their drives "not meant for heavy use." Seagate also pulled its MTBF rating in favor of some other crap. It shows. They are bad for any use.


An example would be having a 4-cylinder tow truck vs. an 8-cylinder tow truck. Both can do the job fine, but you'll wear out and break the 4-banger far before the 8-banger due to engine load and stress.

The best example of this is that the industrial drives they DO have last an exceptionally long time. HGST enterprise drives are meant for server arrays, and their reliability shows here. They are the 8-bangers of the HD world.

There are consumer drives from other manufacturers that aren't showing these horrible failure rates, making your point moot. Also, why would I buy a special-snowflake 4-cylinder with the same MPG and price and take extra-special care of it, when I could buy a robust 8-cylinder and get much higher reliability without any drawback?


People have picked Backblaze's data apart before. The end result is always the same: running commercial drives in industrial servers kills them ASAP.

People can pick it apart all they want; that does not make them right, nor does it make Backblaze's data useless. Even if it didn't have merit on its own, which it does, there is a HUGE number of customer reports about Seagate drives' lower quality and reliability that go hand in hand with Backblaze's observed trends, from various forum reports to retailer ratings. That is more than enough for me.
 
Maybe I need to let go of grudges... but HGST used to be the IBM Deathstar. I lost about 6 way back then. I'm still having PTSD issues lol

Yeah. I still feel the same way. One of my first submissions to this site was a Deathstar drive I took a sledgehammer to because I lost a few years' worth of backups on it WHILE I was in the process of mirroring the drive.

Now they're the most reliable.
 
I too wanted to avoid Hitachi drives after the Deathstar days, and this proved sensible for the earlier Hitachi drives.
I, my friends, and my customers had many IBM Deskstars fail.

I got over it when I was given some (then) fairly young Hitachi 72101 1TB drives that now have 28429 hours (3.2 years), 33795 hours (3.8 years) and 47964 hours (5.4 years) on the clock.
Only one of these (the 3.8-year one) developed a problem; it gets a bit warm, so I use it for offline music backup.
Lo and behold, this one is called a Hitachi Deskstar lol!
The others are great, a bit newer, and don't carry the dreaded moniker!

In that time I personally have had 2 Samsungs and 1 Seagate fail, all 1TB if I recall correctly.

Haha, I just checked a drawer of old drives and found an original year-2000 Deathstar.
 
If I remember correctly, the Deathstars flirted with some "new" technology back then, magnetoresistive heads, which were supposed to be all the rage.

One thing was certain: the Deathstars would give you no warning, just, all of a sudden, a sickening sound unlike anything I've ever heard from another brand. It was a sickening squeak/honk oscillating sound (hard to convey in text). Ugh. I need a beer now just thinking about it. I lost some priceless stuff, like pictures of a bachelor party, haha.
 
It was the new glass platters that expanded more as they warmed up.
This caused the format/data position to go out of alignment with the head position.

If you formatted when hot, they would do a death click when cold, and vice versa.
And because the drive tried to correct the errors, it would fuck itself up.
 
Running in spec or not, I think these stats are great, as they show which drives handle an environment like this better. Not everyone wants to pay a premium for "enterprise" drives/enclosures if they can get away with doing something like what Backblaze does with consumer drives and have the drives be happy with it.

In this day and age, *ALL* hard drives should be designed for 24/7 operation. The tech industry seems to like depriving home users of quality and then charging an arm and a leg for the same product with the specs the consumer one should have had in the first place. How about actually designing a product that is quality to begin with?

No matter what though, when running big data operations, RAID is a MUST, along with backups. In their case they are a backup company, so I don't imagine they do backups of backups, but their entire system is probably quite redundant.
 
No matter what though, when running big data operations, RAID is a MUST, along with backups. In their case they are a backup company, so I don't imagine they do backups of backups, but their entire system is probably quite redundant.

If they are a proper backup company, they probably have at least three copies of data at any given time.
 
It was the new glass platters that expanded more as they warmed up.
This caused the format/data position to go out of alignment with the head position.

If you formatted when hot, they would do a death click when cold, and vice versa.
And because the drive tried to correct the errors, it would fuck itself up.

OK, that makes sense. I had some die literally within hours of installing them. Piece of shit. Ugh. Having data-loss flashbacks.
 
With my experience as a systems admin at a storage company, I can see parallels. The Seagate 1TB drives we used in our older products are failing right and left these days. Sure, they weren't meant to keep running for 5-7 years, but the failure rate of the Seagates is FAR higher than that of the Hitachi and WD drives I deal with. We had roughly three times as many Seagates as WD or Hitachi drives two years ago, with all of them between 3 and 5 years old, and in that time I have replaced in excess of 144 Seagate drives, about a dozen WD RE3 drives, and one, yes ONE, Hitachi drive. Our Hitachis are in excess of 7 years old, and I have had to replace ONE.

My main backup system for my lab VMs and file storage (we have a test software archive in excess of 6TB) has 84 Hitachi 1TB drives, is 7 years old, and has yet to have a single failed drive. Meanwhile, our main test VM host has 80 Seagate 1TB drives, and I replace one about every other week on that thing. I've set it up with 3 hot spares per tray because I don't trust those drives.

(The Seagate drives, along with a few Hitachi 750GB drives, are for our Fibre Channel storage, and no non-certified drive can be used in it, so I can't just put Hitachi or WD drives into those trays. The WDs and Hitachis are in our direct-attached storage, using 3Ware SAS RAID controllers and Supermicro systems and disk trays. Our customers lease our systems, and we get them back for internal use when the customers upgrade. Many of the older systems have been repurposed as general storage, while some were reconfigured for use in our infrastructure. For example, my infrastructure VM cluster uses an FC storage tray with Hitachi 750GB drives and an expansion tray with Seagate 2TB drives.)

Of course, I can't tell you my company, as that could raise liability issues, thanks to the lawyers.

On top of all that, we had major issues with 2.5" Toshiba/Fujitsu 146GB 15k drives in one of our lines, where nearly every single drive had major failures within the first three years. We even had a few occasions where a drive failed in such a way that it damaged the backplane and caused failures in other drives. We replaced them with Seagate Savvio drives that have been totally reliable in their place.

We had an issue with Seagate's 3TB Constellation FDE line, where our initial test unit with 48 drives had 6 failures within the first week, 3 of which failed on initial spin-up. We replaced them with HGST drives and haven't had a single failure so far.

Our current products now use almost exclusively HGST and WD drives in the 3.5" size and Seagate Savvio for 2.5" drives.

In short, in five and a half years of being the systems admin in this test lab, I can say with absolute certainty that Seagate's Barracuda and Constellation lines are NOT reliable past the three-year mark, while the WD RE3 and RE4 lines are of good reliability and Hitachi's Ultrastar line is the best in reliability. Toshiba drives are of generally horrible reliability. The Seagate Savvio line is also very reliable, in total contrast to their other lines.

Of course, these are enterprise lines of drives, not the regular end user type of drives. End user drives may see different results, but I kind of doubt it. I believe it shows the general attitude of each manufacturer, and you're likely to see similar results.
 
It's a shame HGST got bought by WD. I'd take WD over a Seagate, but I hope the HGST drives don't get affected by the quality standards of the parent company.
 
The Tek Syndicate guys have already been saying HGST has the most reliable drives. They have some terrible things to say about Seagate. Backblaze just confirms what everyone knows: don't buy Seagate. I probably won't touch Western Digital either.
 
It's a shame HGST got bought by WD. I'd take WD over a Seagate, but I hope the HGST drives don't get affected by the quality standards of the parent company.

On the other hand, I hope that WD drives get affected by the quality standards of HGST - hopefully, the higher-ups at WD notice what's going right over there at the Thailand plant and bring it to their WD plants.
 