RAID 5 nearly useless for 1.5TB drives?

wardour
Joined Sep 30, 2008
I recently set up a RAID 5 array with 3 drives (planning to scale up to 8). I have been following the discussions about RAID 5/6 becoming extinct in the next few years. As a matter of fact, until a few months ago no one had ever put 1.5TB drives in a RAID array (because they didn't exist).

Has anyone out there had a drive fail in a 1.5TB array and recovered from it? Anyone out there running some mammoth 1.5TB hardware arrays? RAID 6 might be the answer for now; which cards are popular for RAID 6? A 12-port card, best bang for the buck?

What I stand to lose is time. It's digitized media (all of my important stuff is backed up in 3 places). I just don't want my media server to die and be offline for weeks over a drive failure. But RAID 5 is looking pretty shaky, and a hot spare doesn't fix this issue.

Is this just fear mongering, in your experience? Or do people stand to enter a world of hurt doing RAID 5 on 1TB+ drives?
 
Why would RAID 5/6 be extinct in a couple of years?

I understand that greater storage at the same RPM and transfer speed means a slower rebuild if a drive fails in an array, but what would replace RAID 5/6?
 
RAID 5 isn't really going anywhere, even with 1TB+ drives.

They just take an insane time to rebuild, so don't lose power, lol.

It's hard to say that RAID 5 will go away when it is the most common RAID level in use.
 
Could you explain this one, Ockie?

I'm not Ockie, but Intel's own page pretty much sums it up:

http://www.intel.com/design/chipsets/matrixstorage_sb.htm

The whole thing is that you don't need RAID 5 for all your data. Some data you don't care if you lose. Sometimes you want a small area of space that is stupidly fast. Sometimes you would prefer your data be mirrored across multiple drives. With a proper Matrix controller, you could configure your "pool of drives" in whatever way you need it to be. Put it this way: Matrix storage is the "virtualization" of the HDD world.
 
I'm not Ockie, but Intel's own page pretty much sums it up:

http://www.intel.com/design/chipsets/matrixstorage_sb.htm

The whole thing is that you don't need RAID 5 for all your data. Some data you don't care if you lose. Sometimes you want a small area of space that is stupidly fast. Sometimes you would prefer your data be mirrored across multiple drives. With a proper Matrix controller, you could configure your "pool of drives" in whatever way you need it to be. Put it this way: Matrix storage is the "virtualization" of the HDD world.

Desktop users aren't going to make RAID 5 disappear; think of the business world...

RAID 6 is nice, but it's about 30% slower than RAID 5 due to the extra parity info to be written, in exchange for surviving two drive failures instead of just one as in RAID 5.

All Intel is saying is that you could have a RAID 5, a RAID 1, and a RAID 0 using only 3 drives, as opposed to separate drives for each array. That is what Matrix RAID is; it has nothing to do with, and doesn't even comment on, RAID 5/6 becoming extinct.
 
Could you explain this one, Ockie?

+1

With most other people I'd go do some research, but from Ockie I'd like to hear his opinion too, because I know I'm going to learn half a dozen new and fascinating things.

I'm not Ockie, but Intel's own page pretty much sums it up:

http://www.intel.com/design/chipsets/matrixstorage_sb.htm

The whole thing is that you don't need RAID 5 for all your data. Some data you don't care if you lose. Sometimes you want a small area of space that is stupidly fast. Sometimes you would prefer your data be mirrored across multiple drives. With a proper Matrix controller, you could configure your "pool of drives" in whatever way you need it to be. Put it this way: Matrix storage is the "virtualization" of the HDD world.


I think Trepidati0n said it pretty well. The matrix is basically the ability to pool your system drives and replicate files across multiple arrays of drives, giving you great data redundancy and storage capacity while limiting hardware corruption. With this type of matrix concept, a user can choose which files are deemed more important and have them replicated over 2 drives or 100 drives, while files that may not be so important can be kept on one drive or replicated to two.

If one drive fails or 10 drives fail, you will still maintain data integrity, depending on how you configure your storage. Completely hardware independent and array independent, you can mix and match, adding arrays into the mix or just single drives; the idea behind it is controlled distribution. You no longer have to worry about port count, hardware capacity, or array corruption. A matrix would be more dynamic and wouldn't see any performance loss with drive failures... in fact, you would see performance increases, as data can be harvested from multiple drives simultaneously.

WHS and some SANs are only starting to scratch the surface of this concept. You get the freedom and cost of JBOD with the security of a high-end array.
 
WHS and some SANs are only starting to scratch the surface of this concept. You get the freedom and cost of JBOD with the security of a high-end array.

I lost a drive a couple days ago on my WHS box, and my important stuff (pictures, videos) I had replicated, so I didn't lose any of that when the drive failed... I just lost partial downloads on some of my torrents, because I don't have my DLs replicated. Way better than RAID 5: I had zero rebuild time, I just pulled the bad drive out, and I didn't notice any decrease in performance. All the while, my important data is still completely intact.
 
What I am talking about is that with a 1.5TB drive and the average rate for a read/write error, during a rebuild after a drive failure you have roughly a 50% chance of a failed rebuild.

This is only going to get worse.

Namely, is this FUD?
http://blogs.zdnet.com/storage/?p=162
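
For reference, here's a back-of-envelope version of where that roughly-50% figure comes from. This is a minimal sketch assuming the commonly quoted consumer-drive URE rate of one per 10^14 bits and treating each bit read as independent (a big simplification; real controllers and drives may handle errors more gracefully):

```python
# Sketch: chance of hitting at least one URE while rebuilding a RAID 5 array,
# assuming an (idealized) independent error chance of 1 in 1e14 per bit read.
URE_RATE = 1e-14  # commonly quoted consumer SATA spec: 1 error per 1e14 bits

def rebuild_failure_probability(total_drives: int, drive_tb: float) -> float:
    """P(at least one URE) when every surviving drive is read end to end."""
    surviving = total_drives - 1                   # one drive already failed
    bits_read = surviving * drive_tb * 1e12 * 8    # decimal TB -> bits
    return 1 - (1 - URE_RATE) ** bits_read

for n in (3, 8):
    p = rebuild_failure_probability(n, 1.5)
    print(f"{n} x 1.5TB RAID 5: {p:.0%} chance of a URE during rebuild")
# Prints roughly 21% for 3 drives and 57% for 8 drives; hence "about 50%"
# for the planned 8-drive array, under these assumptions.
```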
 
Wow. That article is just... dumb.
If you can't read a sector, the controller should just continue with the next sector; the entire disk isn't gone. You would lose A file. Oh, and his calculation requires 7 disks and 12TB of data. The chance of hitting a URE increases, but not the actual chance of a URE.
 
I do indeed want to believe the article is dumb. I was just curious what you all were seeing in practice. I am going to have 8 1.5TB drives in a RAID 5 (when the UPS man comes). I wouldn't be entirely surprised to see a RAID rebuild halt on a URE, but I also wouldn't be surprised to see it keep going.

So, question: will a URE cause a rebuild to fail?
 
It largely looks like this is FUD. RAID products have countermeasures for this built in. This guy is assuming a very stupid controller. Slashdot has a thread going on the same article.
 
Wow, I spent the last 2 hours reading articles about RAID 5, UREs, etc. Very interesting stuff. I don't know if a URE would cause a rebuild to fail or not; if it would, it seems mathematically that having RAID 5 with that much data is a bad idea. But there's one thing I read (here, I believe) that keeps standing out in my mind: RAID =/= backup.
 
That article only points to SATA drives in RAID 5. Anyone know what the URE rate is with SAS drives? I find it hard to believe this guy thinks RAID 5 will disappear because of one HDD interface.
 
I believe the quoted URE rate for enterprise-class drives was 10x better than regular SATA drives, so you would have 10 times less chance of a URE... I'll backtrack to try to find you a definitive answer.
 
Bottom line: RAID 5 isn't going anywhere... even if you look at the Intel Matrix article, it uses RAID technology; it's just "virtualized", if you will. Even it employs RAID technology, just over a pool of disks.
 
IDK, what's the point of having RAID 5 if you can't reassemble the array after a drive failure?
 
IDK, what's the point of having RAID 5 if you can't reassemble the array after a drive failure?

You can rebuild it, we have the technology!!!!!

Lol, sorry. Honestly, I have rebuilt a lot of SATA arrays with a failed drive; the only one that bombed was caused by a power outage at the customer site, which corrupted a RAID 10 array.
 
You can rebuild them now, but once we reach a capacity where it's a given there will be an error over the course of copying, we will have issues. I don't play the lotto with my money and won't do it with my data either.
 
It largely looks like this is FUD. RAID products have countermeasures for this built in. This guy is assuming a very stupid controller. Slashdot has a thread going on the same article.

Like the ones built into today's motherboards? ;)
 
IDK, what's the point of having RAID 5 if you can't reassemble the array after a drive failure?

The thing is, you can. Even if you couldn't, though, the machine stays up serving files with a single failed drive.

If you lose a drive, you make a backup of the data that might have changed since the last backup, then start a rebuild. Even if the rebuild fails, you had a chance to back up the data beforehand, and unless you have a drive fail and then the array fail, it shouldn't have done anything to your uptime.
 
You can, now. But what they're saying is that, given a drive's chance of hitting an error, with such large drives the chance of failure will be too high to reliably rebuild an array.

Let's say a drive hit a read/write error every 1.2 terabytes. We have a RAID 5 array of 1.5TB drives. A drive fails. You replace the broken drive. Now the chances of rebuilding that array are very slim, considering you have 1.5TB drives and an error occurs roughly every 1.2TB. See what I am saying? Statistically, you don't have a chance of rebuilding that array without an error. This is assuming no error correction, etc., on the RAID card's part.

I don't fully know the ins and outs of error correction, as far as what high-end RAID cards are able to overcome.

I think, from the reading I did, what they're basically saying is that once 1.5TB drives become the norm, the onboard RAID from motherboards and cheap cards won't cut it anymore, and we will have to go to RAID 6.
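
To put rough numbers on that hypothetical (one error per 1.2TB read, errors independent, no controller-side recovery; these are illustrative assumptions, not measured figures), here's a quick Poisson-style estimate:

```python
import math

# Hypothetical rate from the example above: one read error per 1.2TB read.
ERRORS_PER_TB = 1 / 1.2

def clean_rebuild_probability(surviving_drives: int, drive_tb: float) -> float:
    """Poisson approximation: chance of reading all surviving drives error-free."""
    expected_errors = surviving_drives * drive_tb * ERRORS_PER_TB
    return math.exp(-expected_errors)

print(f"3-drive array: {clean_rebuild_probability(2, 1.5):.2%} clean rebuild")
print(f"8-drive array: {clean_rebuild_probability(7, 1.5):.2%} clean rebuild")
# About 8.21% and 0.02% respectively: "very slim" indeed under these assumptions.
```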
 
Let's say a drive hit a read/write error every 1.2 terabytes. We have a RAID 5 array of 1.5TB drives. A drive fails. You replace the broken drive. Now the chances of rebuilding that array are very slim, considering you have 1.5TB drives and an error occurs roughly every 1.2TB. See what I am saying? Statistically, you don't have a chance of rebuilding that array without an error. This is assuming no error correction, etc., on the RAID card's part.

I haven't seen this URE statistic before -- when did the manufacturers stop using Mean Time Between Failures figures?

I always thought it was a simple case of "look for the biggest MTBF number" when after (more) reliable drives.
 
Well, apparently, from what I understand (this is all my interpretation), the drive manufacturers basically knew that with the capacity of hard drives growing so fast, you would more than likely retire a drive before a failure happened. But I guess they're saying that for the time being we won't see any huge leaps and bounds in the way drives are built (not counting things like laser-written or holographic drives). So the time before failure is still there, but since we're dealing with such large drives, the chance of a read/write failure becomes more important.

It's never been a problem before, because the amount of data you could read between errors was always far greater than the total capacity of a drive; but since our drive sizes are starting to approach that interval, we will possibly have issues. I.e., you would have roughly an error per full read of a 1.5TB drive versus one error every several complete rewrites of an 80GB drive. Now, I know my math isn't 100% on that, but it's just to put it into perspective.
 
IDK, what's the point of having RAID 5 if you can't reassemble the array after a drive failure?

You can, just the same as a regular array... matrices just make it easier, because you can run RAID 0, RAID 10, and RAID 5 all on the same group of drives simultaneously.
 
It has nothing to do with matrices, stripe size, or the ability of the array to continue working in an injured state. It is saying that, with such large drives, you won't be able to recover a RAID 5.
 
Yeah, the chances go up, but it's still 1 in 10^14, or should I say 1 in 100 trillion bits... trillion, come on, I'll take my chances... I've read what this guy has to say about a lot of storage-related things, and some of his comments are just outlandish.

This guy said last year that in 2008 holographic storage was going to take over the backup market... I don't know of any major vendor that is selling a holographic solution. The only thing I can find is from NetApp, their SnapLock product, and it is barely mentioned... Dell, HP, EMC, IBM: none of them sell holographic stuff; it's still all tapes and SANs. So I'm not saying this guy isn't smart, but his ideas and thoughts are a little out there.
 
Yeah, the chances go up, but it's still 1 in 10^14, or should I say 1 in 100 trillion bits... trillion, come on, I'll take my chances... I've read what this guy has to say about a lot of storage-related things, and some of his comments are just outlandish.

100 trillion may seem like a lot, but it's not. It's a RAID 5 array of nine 1.5TB disks: 100 trillion bits is 11.37TB of formatted capacity.

If a RAID controller will fail on a URE, then there is a 0% chance of a successful rebuild of a 9-disk 1.5TB RAID 5 array.
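
For anyone checking the arithmetic, 11.37TB is 10^14 bits expressed in binary terabytes. A quick sketch of the conversion and how it compares to a 9-disk array:

```python
# Unit check: 10^14 bits in binary terabytes (TiB), versus the usable
# capacity of a 9 x 1.5TB RAID 5 array (one disk's worth lost to parity).
bits = 1e14
print(f"10^14 bits = {bits / 8 / 2**40:.2f} TiB")       # ~11.37

usable_tb = (9 - 1) * 1.5                               # decimal TB
print(f"9 x 1.5TB RAID 5 usable: {usable_tb:.1f} TB "
      f"= {usable_tb * 1e12 / 2**40:.2f} TiB")          # ~10.91
# So a full read of the array is in the ballpark of one expected URE.
```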
 
If a RAID controller will fail on a URE, then there is a 0% chance of a successful rebuild of a 9-disk 1.5TB RAID 5 array.

Well, I'm not sure what kind of guerrilla math you are using, but I know that other guys on this forum have 22-drive 1TB RAID 5 arrays and have done successful rebuilds.
 
Well, I'm not sure what kind of guerrilla math you are using, but I know that other guys on this forum have 22-drive 1TB RAID 5 arrays and have done successful rebuilds.

What does my math have to do with anything? Are you claiming that 100 trillion bits is not 11.37TB formatted?

It's simple: their controllers do not fail on UREs. I stated "if a RAID controller will fail on a URE."
 
Any article saying RAID 5 and 6 are dead applies only to home users anyway. Massive corporations have been using MUCH larger than 12TB RAID arrays for many years now, I'm sure, built from many smaller drives, considering SCSI drives are at what, 320GB now? Or 400...

Not to mention the power backup systems and such they have, and the hardware they use...

Most home users don't even know what RAID is, so how can something die that is unheard of in 99% of home users' homes?
 
"If you can't read a sector, the controller should just continue with the next sector"

Should, but does it?

"The entire disk isn't gone. You would lose A file."

And if that file is important? What if you have a movie file that's important to you (your wedding, the birth of a kid, what have you) and it gets lost because you're using RAID 5?

"Oh, and his calculation requires 7 disks, and 12TB of data. The chances of hitting a URE increase, but not the actual chances of a URE."

If you plan to have 8 disks in a RAID array for the foreseeable future, it's entirely reasonable to assume 2TB disks by 2009. And since the chance of a URE is related to bytes read, not to the size of the disk, a disk that's twice as big has twice the chance of hitting one.
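
That "twice as big, twice the chance" rule is approximately right as long as the total probability stays small. A quick check, again assuming the 1-in-10^14-bits URE rate and independent reads:

```python
# Check that doubling the bytes read roughly doubles the URE chance
# while probabilities are small (assumed rate: 1 URE per 1e14 bits read).
URE_RATE = 1e-14

def p_ure(tb_read: float) -> float:
    bits = tb_read * 1e12 * 8   # decimal TB -> bits
    return 1 - (1 - URE_RATE) ** bits

p1, p2 = p_ure(1.0), p_ure(2.0)   # reading a 1TB vs a 2TB disk end to end
print(f"1TB: {p1:.2%}, 2TB: {p2:.2%}, ratio: {p2 / p1:.2f}")
# Roughly 7.69% vs 14.79%, ratio ~1.92: close to 2x, drifting down as p grows.
```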
 
Well, I'm not sure what kind of guerrilla math you are using, but I know that other guys on this forum have 22-drive 1TB RAID 5 arrays and have done successful rebuilds.

I believe the only people running arrays that large are using RAID 6, not 5. And if they're not, I believe they're treading really dangerous ground.
 
100 trillion may seem like a lot, but it's not. It's a RAID 5 array of nine 1.5TB disks: 100 trillion bits is 11.37TB of formatted capacity.

If a RAID controller will fail on a URE, then there is a 0% chance of a successful rebuild of a 9-disk 1.5TB RAID 5 array.

So if the chance of winning a million dollars in McDonald's Monopoly is 1 in 100 trillion, you have a 100% chance of winning if you buy 100 trillion Big Macs?

Interesting.
 
Should, but does it?

And if that file is important? What if you have a movie file that's important to you (your wedding, the birth of a kid, what have you) and it gets lost because you're using RAID 5?

You know the answer as well as I do: they're called backups. And RAID is NOT a backup.
 
So if the chance of winning a million dollars in McDonald's Monopoly is 1 in 100 trillion, you have a 100% chance of winning if you buy 100 trillion Big Macs?

Interesting.

Your first analogy, the one you ninja-edited, was actually correct. You would win the lottery if you bought 100 trillion tickets, one with each of the 100 trillion possible combinations. The practicality of it is flawed, since it would cost more than the winnings.

The Big Mac analogy is also correct. If there are 100 trillion tickets (1-in-100-trillion odds) and you buy them all, then yes, you are guaranteed to win.

In a basic odds scenario such as a raffle, every purchase increases your chance to win. Two tickets would make it 1 in 50 trillion, four 1 in 25 trillion, and so on.
 
Your first analogy, the one you ninja-edited, was actually correct. You would win the lottery if you bought 100 trillion tickets, one with each of the 100 trillion possible combinations. The practicality of it is flawed, since it would cost more than the winnings.

The Big Mac analogy is also correct. If there are 100 trillion tickets (1-in-100-trillion odds) and you buy them all, then yes, you are guaranteed to win.

In a basic odds scenario such as a raffle, every purchase increases your chance to win. Two tickets would make it 1 in 50 trillion, four 1 in 25 trillion, and so on.

However, this is not how it works. The manufacturer doesn't hide a secret prize of a bad sector in every 100 trillion sectors. The statistic doesn't guarantee that for any given 100 trillion sectors, one of them is bad.

Similarly, if you bought a lottery ticket or a Big Mac every day for 100 trillion days, you would certainly NOT be guaranteed a winner.

If the lottery had only 100 trillion possible combinations and you bought a ticket for every one, then sure, you would win. That is not how this statistic works, however. You are simply buying lottery tickets that each have a set statistical chance of winning (1 in 100 trillion), and then buying 100 trillion of them. You are not guaranteed to win.

Let me put it more simply:

If a coin has a 1 in 2 chance of coming up heads every time you toss it, are you guaranteed that it will land heads if you toss it twice?

Interesting.
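
For the record, the math under the independence assumption: P(at least one hit in n tries) is 1 - (1 - p)^n, which tends to 1 - 1/e (about 63.2%), not 100%, when n = 1/p. A quick check:

```python
import math

# P(at least one success) in n independent trials of probability p each.
def p_at_least_one(p: float, n: float) -> float:
    return 1 - (1 - p) ** n

print(f"Coin, 2 tosses:          {p_at_least_one(0.5, 2):.0%}")    # 75%, not 100%
print(f"1-in-a-million, 1e6 x:   {p_at_least_one(1e-6, 1e6):.1%}")
print(f"Limit, 1 - 1/e:          {1 - math.exp(-1):.1%}")          # ~63.2%
```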
 
However, this is not how it works. The manufacturer doesn't hide a secret prize of a bad sector in every 100 trillion sectors. The statistic doesn't guarantee that for any given 100 trillion sectors, one of them is bad.

Similarly, if you bought a lottery ticket or a Big Mac every day for 100 trillion days, you would certainly NOT be guaranteed a winner.

If the lottery had only 100 trillion possible combinations and you bought a ticket for every one, then sure, you would win. That is not how this statistic works, however. You are simply buying lottery tickets that each have a set statistical chance of winning (1 in 100 trillion), and then buying 100 trillion of them. You are not guaranteed to win.

Let me put it more simply:

If a coin has a 1 in 2 chance of coming up heads every time you toss it, are you guaranteed that it will land heads if you toss it twice?

Interesting.

Ah, you meant over time. The way you wrote it in this post makes perfect sense, and I get your point now about the lottery: the lottery numbers change every time.

You're right, it's not a situation of absolutes. I will gladly take back 0% as an absolute chance. However, the odds of a successful rebuild are extremely bad and will only get worse. The proper phrasing for me to have used would be that the chance of rebuilding is close to, or approaches, 0%.

I think that with odds that bad, it's undeniable that we're in need of another solution for storing large amounts of data.
 