
RAID6 advantages?

Innocence

2[H]4U
Joined
Mar 9, 2001
Messages
2,604
Building out a main storage server for a SAN that will be serving 4 VM host servers with about 90 virtual machines (with room to grow).

The client wants about 20TB, and they're forcing Dell hardware (fine, whatever).

I'm not leading this, my buddy just brought me in to assist - however his recommendations seem a bit off. For a core storage server in a High Availability environment, he's recommending a 12 disk RAID6 array with 2 hot spares. In my experience, I'd rather do basically anything but RAID 5 or 6 - just my preference for up-time, and I'm used to running 3 to 6 disk local storage for most builds.

I'd much rather go Mirrored ZFS, or hardware RAID 10, for speed and availability. He's insisting that RAID6 is faster and "just as safe". I can't find any evidence to back that up.

Can anyone let me know (other than cost) why RAID6 might be preferable?
 
For the same number of disks, raid6 will be safer than raid10, since you can lose any 2 disks. On the other hand, if you have a 12-disk raid10 pool and one disk fails, the odds that the next failure hits its mirror partner are 1/11, which seems pretty low. IMO of course. Raid10 will certainly outperform raid6 (raidz2 in the zfs world) for random workloads...
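
To put numbers on the 1/11 figure, here is a minimal sketch that enumerates every possible set of failed disks in a 12-disk array. The pairing of disks into mirrors (0 with 1, 2 with 3, and so on) and the function names are assumptions for illustration only, and it ignores rebuild windows and correlated failures.

```python
from itertools import combinations

def raid10_survives(failed):
    # Assumed layout: disks (0,1), (2,3), ... are mirror pairs.
    # The array survives as long as no pair loses both members.
    pairs = [d // 2 for d in failed]
    return len(set(pairs)) == len(pairs)

def raid6_survives(failed):
    # A single RAID6 set tolerates any two failed disks.
    return len(failed) <= 2

def survival_fraction(n_disks, n_failed, survives):
    # Fraction of all possible failure combinations the array survives.
    combos = list(combinations(range(n_disks), n_failed))
    return sum(1 for c in combos if survives(c)) / len(combos)

r10_two = survival_fraction(12, 2, raid10_survives)
r6_two = survival_fraction(12, 2, raid6_survives)
print(f"RAID10, 2 failed disks: survives {r10_two:.1%} of cases")
print(f"RAID6,  2 failed disks: survives {r6_two:.1%} of cases")
```

With two failures the RAID10 case works out to 10/11, which is the 1/11 loss chance mentioned above. Passing 3 as the failure count shows the other side of the trade: a single RAID6 set never survives three failures, while RAID10 still survives some of them.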
 
The Problem

On pure sequential streaming transfers a Raid-6 from 12 disks has a performance of 10 times a single disk. But whenever you have concurrent reads and writes like with a storage for VMs you must look at I/O performance - only at I/O performance.

In such a case every head of your disks must be positioned to read/write a single datablock, which means that the overall I/O performance of the array is roughly equal to that of a single disk.

This is ok with an SSD-only pool, where your I/O performance is up to 100x that of a spindle-based raid. With spindles, you usually build storage from 2-way or 3-way mirrors when using large disks. A 3-way mirror improves reads as well.

My opinion: Raid6 (ZFS Z2) or Raid-60 (ZFS with multiple Raid-Z2) is very good for a filer or a backup machine but not a good solution for a VM storage with 90 virtual machines - even with a mostly read workload on a ZFS storage with a massive read cache. Without ZFS you do not even have this superior read cache.

I would go SSD only (expensive) or 3-way mirrors with 4 TB disks, with as much RAM as possible and a ZIL log device like a ZeusRAM or Intel S3700. I would also split the load across several filers or pools. 90 VMs is too much. A single VM problem or a single semi-dead disk can take the whole thing offline.
 
What's wrong with raid 6? ZFS is probably the better route to go, but I've had many raid 6 systems going for years. If a problem comes up you pop a disk, the raid resyncs and you forget about it for a few more years.
 
With raid5/6 you can have good random read performance.

But with writes, it's going to have the I/O performance of a single disk, or worse, as it has to read the chunks from every disk, compute parity, then write it out to the disks that have changes.

If all your VMs do is idle or do light reads (10 IOPS?), that would still be 900 read IOPS. Will your raid6 handle that? A normal SATA disk is 80 IOPS, and we are giving each VM 1/8 of that for read performance.

So, you're looking at 800 to 1500 read IOPS max for that 12-disk array, with writes being much, much lower. Let's hope those VMs never write.
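
The arithmetic above can be sketched roughly as follows. The 80 IOPS per disk and 10 IOPS per VM come from this post; the write penalties (6 back-end I/Os per write for RAID6, 2 for RAID10) are the commonly quoted rule-of-thumb values and are an assumption here, not something measured in this thread. Controller cache and full-stripe writes will change real results.

```python
def effective_iops(n_disks, disk_iops, write_penalty, read_fraction):
    # Rule-of-thumb model: each read costs 1 back-end I/O,
    # each write costs `write_penalty` back-end I/Os.
    raw = n_disks * disk_iops
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * write_penalty)

DISK_IOPS = 80          # figure used in the post for a SATA disk
ACTIVE_DISKS = 10       # assumed: 12 disks minus 2 hot spares
demand = 90 * 10        # 90 VMs at 10 IOPS each

for label, penalty in (("RAID6", 6), ("RAID10", 2)):
    for read_fraction in (1.0, 0.7):
        iops = effective_iops(ACTIVE_DISKS, DISK_IOPS, penalty, read_fraction)
        print(f"{label}, {read_fraction:.0%} reads: ~{iops:.0f} IOPS "
              f"(demand {demand})")
```

Under these assumptions a pure-read load lands at the 800 IOPS low end quoted above, already short of 900, and any meaningful share of writes pushes the RAID6 figure far below it.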
 
You also have to consider the unrecoverable error rate of the disks. For large disks, only having a single parity drive (in the case of mirrors) opens you up to near certain error. There have been some pretty good articles posted on the forum here lately about it. In those cases, having an extra parity drive (or two) is the only way to prevent error. We'll probably see more 3-way mirrors as disk size increases. RAIDz3 will also be more popular with larger disk sizes.
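
As a rough illustration of the unrecoverable error point, the sketch below estimates the chance of hitting at least one unrecoverable read error while reading a given amount of data, as a rebuild has to. The rate of 1 error per 10^14 bits is an assumed spec-sheet style figure, not a number from this thread, and treating every bit as an independent trial is a simplification; real drives often do better or worse.

```python
import math

def p_at_least_one_ure(tb_read, ure_per_bit=1e-14):
    # Probability of one or more unrecoverable read errors while
    # reading `tb_read` terabytes, assuming independent bit errors.
    bits = tb_read * 1e12 * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))

for tb in (1, 4, 12):
    print(f"reading {tb:>2} TB: P(at least one URE) ~ "
          f"{p_at_least_one_ure(tb):.0%}")
```

The more data a rebuild has to read with no remaining redundancy, the higher this number climbs, which is the argument for a second or third redundant copy on large disks.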

Also, some arguments have been made against hot spares. Instead of making a hot spare, use that extra hot spare as an extra parity disk. In this case, instead of RAID6/z2, go to 3 parity disks. You still have the same number of disks in the array, but the "hot spare" is already online and doesn't need to rebuild. Then, just replace the failed disk like you normally would.

This works for ZFS, but I have no idea how it works in the traditional RAID arena.
 
I think the main issue is not so much how safe/safer RAID6 is compared to RAID10 or other RAID levels, but rather if the whole concept is well suited for the intended use case.

You've got a 4-node cluster with 90 VMs (and more to come in the future), and a single storage server with 20TB (maybe more for future growth) to host them all. So I doubt high availability is possible, because this storage box will be a single point of failure for all the VMs. If this machine dies or needs maintenance, not a single VM keeps running.

Therefore I second _Geas suggestion to strongly consider splitting the load to more storage boxes. This gives reliability and a performance increase whatever type of storage you choose. Even if these 90 VMs have a very small storage footprint (regarding I/O) during normal operation, there are times when your storage gets hit pretty hard (backup, happy patching day,...). A RAID6 with 2 hot spares built from 12 disks (spares included or not in these 12?) gives you only 8 spindles in the worst case (12 disks - 2 hot spares - 2 parity = 8 disks' worth of capacity, with I/O depending on RAID level). 20TB / 8 drives = 2.5 TB => 3 TB per disk. Even with 10 usable disks you need 2TB drives. All of them are only 7200rpm, so don't expect good performance here. This amount of VMs will devour the disks for breakfast.;)
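
The sizing arithmetic in the paragraph above can be written out as a small sketch. The list of available drive sizes is an assumption for illustration, and the function names are made up.

```python
AVAILABLE_SIZES_TB = (1, 2, 3, 4)  # assumed drive sizes on offer

def required_disk_tb(target_tb, total_disks, spares, parity):
    # Capacity each data disk must provide to reach the target.
    data_disks = total_disks - spares - parity
    return target_tb / data_disks

def smallest_fitting_size(required_tb):
    # Smallest available drive size that meets the requirement.
    return min(s for s in AVAILABLE_SIZES_TB if s >= required_tb)

# Spares counted inside the 12 disks: 8 data disks.
worst = required_disk_tb(20, total_disks=12, spares=2, parity=2)
# Spares counted outside the 12 disks: 10 data disks.
best = required_disk_tb(20, total_disks=12, spares=0, parity=2)

print(f"8 data disks:  {worst} TB each -> {smallest_fitting_size(worst)} TB drives")
print(f"10 data disks: {best} TB each -> {smallest_fitting_size(best)} TB drives")
```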

If your buddy is bound to Dell hardware and he/his customer does not need or want to go the file server like/ZFS road, they might want to consider the DELL Equallogic boxes. At my last job we used 3 of them for a 6 host cluster with about 60 VMs and a capacity of ~50TB (from 1*16 SAS, 1*16 SATA and 1*48 SATA) with pretty decent performance. AFAIK the newer models can be equipped with a mix of disks and SSDs to boost performance if needed. Management is pretty easy and prices were good too. You can choose between different controllers (1G or 10G iSCSI) and different disk types (SATA, SAS 7,2k 10k or 15k and SSD). Adding more boxes is a piece of cake.
 
Great info here, I recently brought my new file server online and transferred my 8 disk raid 5 array to it, and for the longest time I've been debating on doing raid 6 or changing to raid 10. I did do a raid 10 for fun/testing using 4 3TB drives when I initially brought the server online. This has me thinking I should stick with raid 10 for VMs and maybe do raid 5/6 with data that won't be getting as much "traffic". Good place to put backups of the VMs too.

One thing about raid 5 with 1TB disks is rebuilds take almost a day. I imagine with bigger disks that is even worse. With raid 6, how is it to rebuild from 1 disk failure?
 
One thing about raid 5 with 1TB disks is rebuilds take almost a day.

Isn't there a way to increase the rebuild rate? After some tweaking (mostly telling mdadm to rebuild as fast as possible), my mdadm raid6 has rebuild times of 8 to 10 hours for 10 x 2TB 7200 RPM SATA drives.
 
1TB drives taking a day would mean the disks are averaging ~11.5 MB/s.

During rebuilds I usually see disks at 60-70MB/sec. 1TB drives would be 4-5 hours, 4TB 4x that, so 16-20 hours.
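
The numbers above follow from simple division. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant rebuild rate, which real arrays under load will not hold:

```python
def rebuild_hours(disk_tb, rate_mb_per_s):
    # Time to read or write one whole disk at a constant rate.
    seconds = disk_tb * 1e12 / (rate_mb_per_s * 1e6)
    return seconds / 3600

def implied_rate_mb_per_s(disk_tb, hours):
    # Average rate implied by a rebuild that took `hours`.
    return disk_tb * 1e12 / (hours * 3600) / 1e6

print(f"1 TB in 24 h implies ~{implied_rate_mb_per_s(1, 24):.1f} MB/s")
for size_tb in (1, 4):
    slow = rebuild_hours(size_tb, 60)
    fast = rebuild_hours(size_tb, 70)
    print(f"{size_tb} TB at 60-70 MB/s: {fast:.1f} to {slow:.1f} hours")
```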
 
Therefore I second _Geas suggestion to strongly consider splitting the load to more storage boxes. This gives reliability and a performance increase whatever type of storage you choose. Even if these 90 VMs have a very small storage footprint (regarding I/O) during normal operation, there are times where your storage gets hit pretty hard (backup, happy patching day,...) A RAID6 with 2 hot spares built from 12 disks (spares included or not in these 12?) gives you only 8 spindles in worst case (12 disks - 2 hot spare - 2 parity = 8 disk capacity and I/O depending on RAID level). 20TB / 8 drives = 2,5 TB => 3 TB per disk. Even with 10 usable disks you need 2TB drives. All of them only 7200rpm, so no good performance expectable here. This amount of VMs will devour the disks for breakfast.;)

The "parity disks" do get used for reads and writes (spares do not). In raid 5 and 6 the parity is staggered across disks (6 just has two parity stripes). No single disk has all the parity info. So a 12 disk raid 6 with 2 of them as hot spares has 10 disks with read and write speeds peaking at 10x the single disk speed. There is of course overhead from the parity on writes, but that's going to be controller and situation dependent.
Note: https://en.wikipedia.org/wiki/Raid_6#RAID_6
Hot spares are simply disks that exist in the raid but are not used until one of the active disks fails.
IMO raid 6 is superior to 10. I have never come across a case in which I wanted to use 10 over other raid methods.
 
" I have never come across a case in which I wanted to use 10 over other raid methods. "

Not knowing anything about you or the cases you run into, it's hard to comment, but your opinion seems to be in the minority...
 
Raid 10 is probably one of the best if you can afford 4 disks.
The read/writes of raid 0 with redundancy, what is not to like?
Raid 5 reads blow, 6 is better but not by much.

https://www.icc-usa.com/raid-calculator

"RAID 10 is a striped (RAID 0) array whose segments are mirrored (RAID 1). RAID 10 is a popular configuration for environments where high performance and security are required. In terms of performance it is similar to RAID 0+1. However, it has superior fault tolerance and rebuild performance."
 
I'd much rather go Mirrored ZFS, or hardware RAID 10, for speed and availability. He's insisting that RAID6 is faster and "just as safe". I can't find any evidence to back that up.
Here is evidence that RAID5 and RAID6 are not as safe as ZFS:
http://en.wikipedia.org/wiki/ZFS#Data_integrity

Read also the small text at the top "Hard disks and error handling" and "silent data corruption"
 
All RAID configurations have their upsides and downsides. There are some that are almost obsolete at this point (ala RAID 5). But even R5 has some place. It's not a very big one, but you could justify it if you absolutely had to.

The main advantage RAID 6 holds over RAID 10 is the parity. It's actually something that people overlook when they only compare the number of disk failures each level needs before it dies.

Raid 10 in its default config of four disks requires the "stars to align" in order to achieve the same disk tolerance as Raid 6. Raid 6 on the other hand can have any combination of a two disk failure (barring nested raid) and be fine. That's the main difference aside from the performance hit from R6.
 
All RAID configurations have their upsides and downsides. There are some that are almost obsolete at this point (ala RAID 5). But even R5 has some place. It's not a very big one, but you could justify it if you absolutely had to.

The main advantage RAID 6 holds over RAID 10 is the parity. It's actually something that people overlook when they only compare the number of disk failures each level needs before it dies.

Raid 10 in its default config of four disks requires the "stars to align" in order to achieve the same disk tolerance as Raid 6. Raid 6 on the other hand can have any combination of a two disk failure (barring nested raid) and be fine. That's the main difference aside from the performance hit from R6.

In a SAS setup this isn't a big deal. I've been using SAS raid 10 on my server since 2008 with well over 18 million hits on my database a month with intense I/O. I've had only one drive fail, and the array rebuilt itself on the fly.

What's important is matching the raid setup with the type of activity it will be seeing. For intensive db systems, raid 10 is a no brainer.
 
All RAID configurations have their upsides and downsides. There are some that are almost obsolete at this point (ala RAID 5). But even R5 has some place. It's not a very big one, but you could justify it if you absolutely had to.

The main advantage RAID 6 holds over RAID 10 is the parity. It's actually something that people overlook when they only compare the number of disk failures each level needs before it dies.

Raid 10 in its default config of four disks requires the "stars to align" in order to achieve the same disk tolerance as Raid 6. Raid 6 on the other hand can have any combination of a two disk failure (barring nested raid) and be fine. That's the main difference aside from the performance hit from R6.

Hmm, I would say for 4 disks, Raid5 is the best choice.
 
In a SAS setup this isn't a big deal. I've been using SAS raid 10 on my server since 2008 with well over 18 million hits on my database a month with intense I/O. I've had only one drive fail, and the array rebuilt itself on the fly.
That's kind of apples and oranges. Yes, more robust/enterprise grade hardware is more reliable in enterprise settings. I don't disagree with that. However, I was pointing out the differences between R6 vs R10. Of course if you are running VM's, DB's or anything that needs the most performance R10 is preferred. However, if I was going for maximum space and uptime I'd go R6 over R10 any day.

What's important is matching the raid setup with the type of activity it will be seeing.

I agree completely.
 
Hmm, I would say for 4 disks, Raid5 is the best choice.

At this point R5 is hardly useful. It's more of a fallback if you prefer the extra write speed over R6, or maybe have space constraints. Other than those two things I hardly find it worth it. Especially since in some setups write holes exist because of the lack of dual parity.
 
Just an observation, not meant to distract from the points in the various posts.

For example,

1. For RAID-6, you can consider 2 sets of 6x2TB RAID-6, 2 x 8TB usable. total 12 disks.

2. For RAID-10, you can consider 3 sets of 4x2TB RAID-10, 3 x 4TB usable, total 12 disks

3. For RAID-10 and RAID-6, you can consider 1 set of RAID-10 4x2TB usable 4TB, 1 set of RAID-6 8x2TB usable 12TB, total 12 disks.

This is for a 12-disk total scenario. None of these offer the full 20TB usable; there tends to be compromise in practice.

However, as of 2014 this may not be relevant to your scenario, because I'm not sure whether 2TB 15k SAS drives are available. If you have to rely on 7.2K rpm 2TB SAS or SATA drives, then the I/O implications for 90 VMs need investigation by the vendor as well. You can contact the vendor's representative for verification.
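
For reference, the usable capacity of the three 12-disk layouts listed above can be tallied with a short sketch. The layout labels and helper name are made up for illustration, and the figures ignore filesystem overhead and TB/TiB conversion.

```python
def usable_tb(level, n_disks, disk_tb):
    # Usable capacity of one array, before filesystem overhead.
    if level == "raid10":
        return (n_disks // 2) * disk_tb
    if level == "raid6":
        return (n_disks - 2) * disk_tb
    raise ValueError(f"unsupported level: {level}")

DISK_TB = 2
layouts = {
    "1. two 6-disk RAID-6 sets": [("raid6", 6), ("raid6", 6)],
    "2. three 4-disk RAID-10 sets": [("raid10", 4)] * 3,
    "3. one 4-disk RAID-10 plus one 8-disk RAID-6": [("raid10", 4), ("raid6", 8)],
}

totals = {}
for name, arrays in layouts.items():
    assert sum(n for _, n in arrays) == 12  # every layout uses 12 disks
    totals[name] = sum(usable_tb(level, n, DISK_TB) for level, n in arrays)
    print(f"{name}: {totals[name]} TB usable")
```

None of the totals reaches the 20TB target with 2TB disks, which matches the compromise noted above.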
 
At this point R5 is hardly useful. It's more of a fallback if you prefer the extra write speed over R6, or maybe have space constraints. Other than those two things I hardly find it worth it. Especially since in some setups write holes exist because of the lack of dual parity.

While it is true that R6 is safer, "losing" half the space is a big hit, and a major cost.
 
While it is true that R6 is safer, "losing" half the space is a big hit, and a major cost.
It's not exactly losing half the space when you account for actual usable space. 3TB x 4 is 5.5TB of usable space in RAID 6. That same setup in a RAID 5 is 8.2TB.

What are the major costs if you lose 8.2TB of data when your RAID 5 array dies? I'd rather have 5.5TB of my data safer than having a complete loss of 8.2TB.
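
The 5.5TB and 8.2TB figures look like the usual decimal-TB to binary-TiB conversion that operating systems report. A minimal sketch under that assumption (the thread does not say how the numbers were obtained):

```python
TIB = 2 ** 40  # bytes in one tebibyte

def usable_tib(n_disks, disk_tb, parity_disks):
    # Usable space in TiB for drives sold in decimal terabytes.
    data_disks = n_disks - parity_disks
    return data_disks * disk_tb * 1e12 / TIB

raid6 = usable_tib(4, 3, parity_disks=2)
raid5 = usable_tib(4, 3, parity_disks=1)
print(f"4 x 3TB RAID 6: {raid6:.1f} TiB usable")
print(f"4 x 3TB RAID 5: {raid5:.1f} TiB usable")
print(f"RAID 5 gives {raid5 / raid6 - 1:.0%} more space")
```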
 
While it is true that R6 is safer, "losing" half the space is a big hit, and a major cost.

It's only half the space if you're using 4 drives. The more drives you use, the less the space penalty is as a percentage of the system.
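
A quick sketch of how the space penalty shrinks with drive count, assuming a single RAID5 or RAID6 set with no hot spares:

```python
def parity_overhead(n_disks, parity_disks):
    # Fraction of raw capacity spent on parity in one array.
    return parity_disks / n_disks

for n in (4, 6, 8, 12):
    r5 = parity_overhead(n, 1)
    r6 = parity_overhead(n, 2)
    print(f"{n:>2} disks: RAID5 loses {r5:.0%}, RAID6 loses {r6:.0%}")
```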
 
It's not exactly losing half the space when you account for actual usable space. 3TB x 4 is 5.5TB of usable space in RAID 6. That same setup in a RAID 5 is 8.2TB.

What are the major costs if you lose 8.2TB of data when your RAID 5 array dies? I'd rather have 5.5TB of my data safer than having a complete loss of 8.2TB.

You still have 50% more data for the same price

That's a false dilemma: you are neither guaranteed to lose data with r5 nor guaranteed not to lose data on r6. I personally would run backups either way, and I have never been in the situation of two drive failures within a few days of each other.
 
It's only half the space if you're using 4 drives. The more drives you use, the less the space penalty is as a percentage of the system.

Yeah, but my original claim was that R5 was the best choice for 4 disks. For, say, 8 disks I would rather choose r6.
 
If you're paranoid about raid10 having two disk failures, just use raid60: 4-disk raid6 sets, all striped together with raid0. You lose half your write IOPS compared to raid10, and have more raid card CPU usage, but retain safety.
 