RAID 3 vs. RAID 5
general
04-20-2006, 03:18 PM
Can someone put in plain English the difference between RAID levels 3 and 5? I understand that both use striping, and it seems to me that both keep parity on the last drive. I just can't get this. Can someone explain? Thanks.
djshelto
04-20-2006, 03:25 PM
As I understand it, RAID 3 has 2 disks with data and 1 with parity.
In RAID 5, all disks have data and parity. For example, if you have a 5 disk array, 4/5 of each disk contains data, and 1/5 contains parity information for data *not stored on that disk*.
While you lose 1 disk to parity in each config, in RAID 5 there is no single disk that has all that parity info.
ModBoyzz
04-20-2006, 03:32 PM
Found some decent info on Raid on this site (http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks)
Hope this helps.
Eva_Unit_0
04-20-2006, 03:57 PM
> as I understand it, Raid 3 has 2 disks with data and 1 with parity.
> in Raid 5, all disks have data and parity. for example, if you have a 5 disk array, 4/5 of each disk contains data, and 1/5 contains parity information for data *not stored on that disk*
> while you lose 1 disk to parity in each config, in Raid 5, there is no single disk that has all that parity info.
Yeah, the reason RAID 5 is so popular (and RAID 3 is pretty much unused) is that in RAID 5 ANY one disk can die and no data is lost from the array... there is no "bias" with any particular disk being more important than any other disk.
unhappy_mage
04-20-2006, 04:26 PM
Wrong. All of you. :o :p
Raid 3 is just like raid 5, except that the parity in raid 3 is all on one disk, where it's rotated across the disks in raid 5. So here's a diagram to illustrate this, with five disks in 3 versus 5:
Raid 3:
A1 B1 C1 D1 P1
A2 B2 C2 D2 P2
A3 B3 C3 D3 P3
Raid 5:
A1 B1 C1 D1 P1
P2 A2 B2 C2 D2
A3 P3 B3 C3 D3
The "height" of each block depends on the stripe size of the array in raid 5. In raid 3, it's a byte-level diagram: each block is a single byte.
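A minimal sketch that regenerates the two diagrams above. The helper name `raid_layout` is made up for illustration, and the rotation rule (parity moves one disk to the right per stripe, wrapping around) is simply read off the diagram; real controllers use several different rotation schemes.

```python
def raid_layout(num_disks, num_stripes, rotate_parity):
    """Return one row of block labels per stripe (hypothetical helper)."""
    rows = []
    for stripe in range(num_stripes):
        if rotate_parity:
            # Raid 5 as drawn above: parity starts on the last disk and
            # shifts one disk to the right per stripe, wrapping around.
            parity_disk = (num_disks - 1 + stripe) % num_disks
        else:
            # Raid 3 (and raid 4): parity always lives on the last disk.
            parity_disk = num_disks - 1
        data_labels = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe + 1}")
            else:
                row.append(f"{next(data_labels)}{stripe + 1}")
        rows.append(row)
    return rows

for name, rotate in (("Raid 3", False), ("Raid 5", True)):
    print(name)
    for row in raid_layout(5, 3, rotate):
        print(" ".join(row))
```

Either way exactly one block per stripe is parity; the only difference in the layout is which disk it lands on.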
Lastly, note that you can't really tell a raid 3 array from a raid 5 array at the byte level. If you're given, say, 8k blocks from the same offset in the middle of each disk of these two arrays, they'll be mathematically similar: the sum (XOR) of the blocks is zero. Maintaining this property costs work on every write, so when XFX advertises
"the simultaneous benefits of both RAID 0-class performance and RAID 5-class data protection."
don't buy it for a second. They have to do XOR on write like everyone else.
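A small sketch of the "sum is zero" property, assuming four data disks plus one parity block and random made-up contents. `xor_blocks` is a hypothetical helper, not part of any RAID tool.

```python
import os
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four 8k data blocks taken from the same offset on four data disks.
data = [os.urandom(8192) for _ in range(4)]

# The parity block is the XOR of the data blocks; this is the work the
# array has to do whenever any of those blocks is written.
parity = xor_blocks(data)

# XOR-ing the whole stripe, parity included, cancels everything out.
stripe = data + [parity]
print(xor_blocks(stripe) == bytes(8192))
```

The check does not care which position holds the parity block, which is why the two layouts look the same at this level.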
general
04-20-2006, 04:48 PM
Mage,
As usual, you seem to be the storage guru around here. I looked at the diagram you had in your post and I still don't see much difference. If you lost one disc in RAID-3, whether parity or data, could you not reconstruct it from the data on the other discs?
unhappy_mage
04-20-2006, 05:17 PM
Yep. The difference mostly comes in with the performance consequences of byte-level striping; if you make even a 4-byte request from the array (most things are rounded up to multiples of 4k or 8k by the filesystem, but bear with me here), every disk in the raid 3 array has to seek to accomplish that request. Compare this to the raid 5 array; with its stripe size of 32k or so, that single request goes to only one disk. Only one disk seeks to make this request.
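A rough sketch of which data disks a small read touches under each scheme, assuming four data disks numbered from 0 and a 32k chunk. Both function names are made up, and parity rotation is ignored for simplicity, so the block-striped result is a position within the stripe rather than a physical disk.

```python
def disks_touched_byte_striped(offset, length, data_disks):
    """Raid 3 style: consecutive bytes go to consecutive data disks."""
    return sorted({(offset + i) % data_disks for i in range(min(length, data_disks))})

def disks_touched_block_striped(offset, length, data_disks, chunk=32 * 1024):
    """Raid 5 style (rotation ignored): consecutive chunks go to consecutive disks."""
    first_chunk = offset // chunk
    last_chunk = (offset + length - 1) // chunk
    return sorted({c % data_disks for c in range(first_chunk, last_chunk + 1)})

# A 4-byte read at byte offset 100000.
print(disks_touched_byte_striped(100000, 4, data_disks=4))   # [0, 1, 2, 3]
print(disks_touched_block_striped(100000, 4, data_disks=4))  # [3]
```

With byte-level striping the four bytes sit on four different disks; with 32k chunks they all fall inside one chunk on one disk.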
This doesn't make much difference at a queue depth of one, but smart apps make lots of requests at once and wait on all of them. Thus, if you make 20 requests at once, the raid 3 array has to seek every disk 20 times, sequentially. The raid 5 array has to seek one disk per request; depending on how the blocks are laid out, all 20 I/Os could be done in as long as it takes to make 4 seeks (20 requests spread evenly over 5 disks is 4 per disk). This is pretty unlikely, but even in real life small parallel requests can happen faster on a raid 5 array than on a raid 3.
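A back-of-the-envelope model of the 20-request case, assuming a 10 ms seek, five disks, small reads only, and no transfer time. The function names and the two request patterns are invented for illustration.

```python
from collections import Counter

SEEK_MS = 10  # assumed average seek time for a desktop drive

def raid3_time_ms(num_requests):
    # Every disk seeks for every request, so requests are served one after another.
    return num_requests * SEEK_MS

def raid5_time_ms(request_disks):
    # Each small request occupies one disk; disks work in parallel,
    # so the busiest disk determines the total time.
    return max(Counter(request_disks).values()) * SEEK_MS

best_case = [i % 5 for i in range(20)]  # 20 requests spread evenly over 5 disks
worst_case = [0] * 20                   # all 20 requests land on the same disk

print(raid3_time_ms(20))          # 200
print(raid5_time_ms(best_case))   # 40
print(raid5_time_ms(worst_case))  # 200
```

Real workloads fall somewhere between the two raid 5 cases, which is the "pretty unlikely, but still faster" point made above.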
Raid 4 is block-based and doesn't rotate parity. Win? Not really. For a small write, you first need to read the block where the new data is going and the matching parity block, then write both the data disk (duh) and the parity disk. On raid 5, the data and parity disks for two simultaneous writes could be four different disks. So four disks seek, four disks read, four disks write, all at the same time. Compare this to raid 4 on the same disks: the parity disk is guaranteed to be the same for both writes. So two disks (P+A) seek, two disks read, two disks write, *then* two disks (P+B) seek, read, write. That takes more time.
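A sketch of the small-write path described above, assuming five disks numbered 0 to 4 and the layouts from the diagram earlier in the thread. The helper names are hypothetical; the parity update is the standard XOR read-modify-write.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def updated_parity(old_data, old_parity, new_data):
    # Read-modify-write: cancel the old data out of the parity, then add the new data in.
    return xor(xor(old_parity, old_data), new_data)

def can_overlap(write_a, write_b):
    # Two writes can run at the same time only if they share no disk.
    return not (set(write_a) & set(write_b))

a, b, c, d = b"aaaa", b"bbbb", b"cccc", b"dddd"
parity = xor(xor(a, b), xor(c, d))
new_b = b"BBBB"
# Updating parity from the old block alone matches recomputing it from the whole stripe.
assert updated_parity(b, parity, new_b) == xor(xor(a, new_b), xor(c, d))

# (data disk, parity disk) for one write to B1 and one write to C2.
raid4_writes = [(1, 4), (2, 4)]  # parity is always on disk 4
raid5_writes = [(1, 4), (3, 0)]  # parity rotated as in the diagram
print(can_overlap(*raid4_writes), can_overlap(*raid5_writes))  # False True
```

The two raid 4 writes collide on the parity disk and must be serialized; the two raid 5 writes touch four distinct disks.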
I spent a lot of time talking about seeking. That's because, for small requests, it outweighs sequential transfer rate by more than an order of magnitude. Let's compare: an average seek time is around 10 ms for a desktop drive. Call the sequential transfer rate 50 MB/s; at this rate a 32k transfer takes about 0.6 ms. So the vast majority of the time is taken moving into the right place to make the transfer, not reading bits off the disk.
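The arithmetic behind those figures, as a quick check. It assumes 50 MB/s means 50,000,000 bytes per second and 32k means 32 * 1024 bytes; the variable names are made up.

```python
seek_ms = 10.0            # assumed average seek time, desktop drive
transfer_rate = 50e6      # assumed sequential rate in bytes per second
request_bytes = 32 * 1024

# Time spent actually moving data for one 32k request.
transfer_ms = request_bytes / transfer_rate * 1000
print(round(transfer_ms, 2))      # 0.66

# Share of the total service time that is pure seeking.
seek_share = seek_ms / (seek_ms + transfer_ms)
print(round(seek_share * 100))    # 94
```

So roughly 94% of the time for a small random request goes to positioning the heads under these assumptions.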
general
04-20-2006, 05:34 PM
OK, I'm getting more confused but maybe that is good. RAID-5 only requires the request from one disc? Isn't the information stored across all of the discs in the array? If not, how does the information get balanced across all of the drives so the parity can exist? Am I making any sense?
ashmedai
04-20-2006, 09:45 PM
If the data is stored in 32k chunks, anything less than that size won't be large enough to need to take up a chunk on another disk. For small requests like this, it's better to have one drive do the work instead of making the whole array move in lock-step. And these tend to be numerous...for a large file you only have to find it the one time, but what about virtual memory or internet cache where there are a LOT of small bits? And let's not even get into servers.
Also there's the small issue of what happens if there's a failure...the work involved is higher under RAID 3, and if it's your parity drive that bit the dust forcing you to recalculate ALL of it instead of just 1/nth of it? Have fun with that downtime.
unhappy_mage
04-20-2006, 11:31 PM
> If the data is stored in 32k chunks, anything less than that size won't be large enough to need to take up a chunk on another disk. For small requests like this, it's better to have one drive do the work instead of making the whole array move in lock-step. And these tend to be numerous...for a large file you only have to find it the one time, but what about virtual memory or internet cache where there are a LOT of small bits? And let's not even get into servers.
This is exactly what I was implying.
> Also there's the small issue of what happens if there's a failure...the work involved is higher under RAID 3, and if it's your parity drive that bit the dust forcing you to recalculate ALL of it instead of just 1/nth of it? Have fun with that downtime.
When a drive fails in raid 3, you recover onto a replacement disk by calculating:
Replacement disk = d1 + d2 + ... + dn
where + is the XOR operation and d1 ... dn are the surviving disks. Compare this to the recovery algorithm for raid 5, in which you calculate:
Replacement disk = d1 + d2 + ... + dn
Hmm. Same formula, same amount of work. The way parity works is with XOR. D1 + D2 + D3 + D4 => parity; when you lose a disk from the array (let's say D3) you calculate D1 + D2 + D4 + parity => D3. HPA wrote a very interesting paper (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf) on raid-6; much of the initial math applies to raid 5 as well.
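A tiny sketch of that recovery, using made-up 4-byte "disks" and a hypothetical `xor_blocks` helper. The same function rebuilds a lost data disk or a lost parity disk, whichever layout the array uses.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

d1, d2, d3, d4 = b"RAID", b"five", b"lost", b"disk"
parity = xor_blocks([d1, d2, d3, d4])

# Lose D3: XOR the survivors, parity included, to get it back.
rebuilt_d3 = xor_blocks([d1, d2, d4, parity])
print(rebuilt_d3)  # b'lost'

# Lose the parity disk instead: the same XOR over the survivors rebuilds it.
rebuilt_parity = xor_blocks([d1, d2, d3, d4])
print(rebuilt_parity == parity)  # True
```

In both cases every surviving disk is read once in full, so losing the parity disk costs no more than losing a data disk.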
drizzt81
04-21-2006, 07:38 AM
Here's RAID3 (http://www.storagereview.com/guide2000/ref/hdd/perf/raid/levels/singleLevel3.html) and RAID5
RAID 3:
Random Read Performance: Good, but not great, due to byte-level striping.
Random Write Performance: Poor, due to byte-level striping, parity calculation overhead, and the bottleneck of the dedicated parity drive.
Sequential Read Performance: Very good.
Sequential Write Performance: Fair to good.
RAID 5:
Random Read Performance: Very good to excellent; generally better for larger stripe sizes. Can be better than RAID 0 since the data is distributed over one additional drive, and the parity information is not required during normal reads.
Random Write Performance: Only fair, due to parity overhead; this is improved over RAID 3 and RAID 4 due to eliminating the dedicated parity drive, but the overhead is still substantial.
Sequential Read Performance: Good to very good; generally better for smaller stripe sizes.
Sequential Write Performance: Fair to good.
so you can see that everything UM said is pretty much true. Have a nice day :)
oh and:
Recommended Uses: Applications working with large files that require high transfer performance with redundancy, especially serving or editing large files: multimedia, publishing and so on. RAID 3 is often used for the same sorts of applications that would typically see the use of RAID 0, where the lack of fault tolerance of RAID 0 makes it unacceptable.