Raid0 vs Raid10 vs Raid6

Hello,

I'm looking for advice on the best setup. I recently bought an Adaptec 5805 and eight 4TB drives, and now I have a bit of a dilemma about how to use them for the best result. From the beginning I planned on a simple Raid6, but now I'm not sure.

Rebuild time for an array of this size is, what, ten days? That feels a bit long. Then I thought of Raid0: with only six drives I would get the same usable capacity, and when a hard drive fails I would just restore everything from my backup. Let's say I actually filled all of the ~23TB; it would take me less than three days to restore everything, assuming around three hours per TB over a Gbit network.

Then again, I started to think about Raid10 with eight drives, which would give me just above 15TB of storage. When a drive fails here I would simply swap it and let the controller resync the lost data from its mirror partner, a maximum of 4TB and around 16 hours.

What do you think? IMHO it feels like Raid5 and Raid6 have lost their purpose now that disks are this size. Raid0 would mean the data being unavailable for the duration of the restore, but that's "only" three days compared to a week-long rebuild; with Raid10, on the other hand, the data stays available and the resync time is kept to a minimum. Still, it would be nice to have those extra terabytes.
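
To spell out the rough numbers I'm working from (estimates only):

    Raid0:  restore ~23TB over Gbit at ~3 h/TB        -> ~69 h, just under 3 days
    Raid10: resync one 4TB drive at roughly 70 MB/s   -> ~16 h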

Regards
 
What kind of data are you storing?

Scratch data that is OK to lose + max performance? = Raid0
Don't care about total space but want better IOPS & no parity calculation? = Raid10
Want max size but will pay the parity price & lower random IOPS? = Raid6
 
Mostly media of different sorts. It's quite important that I don't lose it, but that's why I have backups.
I've kind of figured out my alternatives but have no idea which one to go for... I don't like making decisions like these :)
 
Any decent/modern RAID controller won't take 10 days to rebuild an array of that size. Should be less than a day.
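Back-of-the-envelope: a rebuild only has to rewrite the one replacement disk, so finishing a 4TB drive in under 24 hours needs a sustained ~46 MB/s (4,000,000 MB / 86,400 s), and most current controllers manage well above that when the array is otherwise idle.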
 
Really? I've seen some real horror stories about rebuilds taking up to a month... Guess I'll give it a try first :)

But just to clarify, raid5 is a big no-no, right?
 
Not too versed on that specific Adaptec card, but the Arecas I own don't take too long. No reason to use RAID5 these days when disks are so cheap.
 
Mostly media of different sorts.

I would not be using raid at all for this type of data. SnapRAID + pooling (if you need everything under a single path) will work better for that.
 
Any decent/modern RAID controller won't take 10 days to rebuild an array of that size. Should be less than a day.

I agree. Maybe the rebuild rate is turned way down, or something is constantly accessing the array and stalling the rebuild.
 
Go raid6. Adaptec controllers are known to be slow; online capacity expansion and raid level migration in particular are extremely slow, but I think they are half-decent for rebuild times. On an Areca controller three generations old I can rebuild a 30x3TB array in ~24 hours if there is not a lot of activity on it.

Raid6 will be the most reliable (safe from array failures, read errors during rebuild, etc.) and give you the most space.


Raid10 is only better for lots of small random write I/O (specifically small, and specifically writes). Raid6 will give the same random read performance as raid10, so unless you are running databases or some odd workload that does a lot of small random writes, raid6 will likely be the best performing option (with redundancy) as well. Most controllers give worse sequential read speeds on raid10 than raid6, since only half of the disks are read (Areca is an exception to this), and raid10 only has half the disks contributing to sequential writes. If the controller is not the bottleneck, raid6 will have better sequential write speeds than raid10.
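
Rough numbers for eight drives, assuming (purely as an illustration) ~150 MB/s sequential per disk:

    raid6 sequential write:  6 data spindles x 150 MB/s  ≈ 900 MB/s, controller permitting
    raid10 sequential write: 4 mirrored pairs x 150 MB/s ≈ 600 MB/s
    raid10 sequential read on most controllers: ~600 MB/s, since only one side of each mirror is read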

Don't run raid5. Seriously... don't. The only time raid5 makes sense is with just 3 or 4 disks.

There's almost never a good reason to run raid10. I guess go raid0 if all you care about is performance, and always have a backup ready.
 
I like raid10 as you get good performance across many scenarios and no parity stripe issues.

It's my default raid level when I'm not space restricted.
 
RAID10 should be both safer and faster than RAID6, its downside being worse storage space efficiency.
 
RAID10 should be both safer and faster than RAID6, its downside being worse storage space efficiency.
Should? It isn't. Losing 2 disks can destroy your entire array. With RAID6, you are assured that no data is lost until the 3rd failure. Speed has already been addressed.
 
That is the worst-case scenario, true, but it can also survive many more disk failures, and rebuild times are greatly reduced.
 
Worst case? There's a more than 50% chance that the array gets nuked on the second drive failure. It's the most likely scenario.
 
How did you calculate that? The second drive has to be part of the same mirror pair for the array to fail. You could lose as many as half the drives and still be operational.

Maybe you're thinking of RAID01?
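
Working it out for eight drives: with RAID10 (four mirrored pairs), once one drive is dead only its single partner out of the seven survivors takes the array down, so the chance that the next failure is fatal is 1/7, about 14%. With RAID01 (two striped sets mirrored), one failure degrades a whole stripe set, and any of the four drives in the other set is then fatal, so it's 4/7, about 57%, which is where a "more than 50%" figure would come from.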
 
Again, for this type of need I believe SnapRAID is a very good choice. Much better than raid5/6 or raid10.

1. It stores data on individual disks, making it much harder to lose an entire array.
2. It tracks changes, so you can see which files have changed and even undo a changed file if you want.
3. It has up to 6 levels of parity. Imagine being able to lose any 6 drives before you are exposed to data loss.
4. The parity sync is not done in real time, so you can have your parity drives off / not even plugged in most of the time.
5. The sync only operates on the blocks that have changed, so it is efficient.
6. You can do partial scrubs (checks for bit rot).
7. Blocks are checksummed on both the data and parity disks, so unlike raid5/6 or raid10 the bad block can be identified with certainty. Normal raid5/6 and raid10 have no checksums on the blocks.
8. Each disk has a native filesystem on it, so it can be removed and read on any other system that supports that filesystem.
9. Removing and adding disks is pretty easy, and since only the parity gets recomputed it does not put your data at risk the way raid5/6 or raid10 do while reshuffling data blocks during an expansion or contraction.
10. Drives need not be the same size, performance level or NAS edition (TLER is not needed at all). The only requirement is that each parity disk be at least as large as the largest data filesystem.

I sync my SnapRAID array on my Linux-based HTPC once per week. The sync is usually under an hour for 200 or so GB of changes across 7 data disks and 2 parity disks.
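
For reference, a minimal snapraid.conf for a layout like that looks roughly like this (mount points, disk names and excludes are just placeholders, adjust to your own setup; older SnapRAID versions call the second parity "q-parity" instead of "2-parity"):

    # one parity file per parity disk
    parity   /mnt/parity1/snapraid.parity
    2-parity /mnt/parity2/snapraid.parity

    # keep several copies of the content file (the index/checksums)
    content /mnt/disk1/snapraid.content
    content /mnt/disk2/snapraid.content

    # the data disks
    data d1 /mnt/disk1/
    data d2 /mnt/disk2/
    data d3 /mnt/disk3/

    exclude *.tmp
    exclude /lost+found/

The weekly job is then just 'snapraid sync', and an occasional 'snapraid scrub' covers the bit rot checking mentioned above.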
 
Which filesystems do you use for your SnapRAID pool? I plan to move my media pool to SnapRAID in the next few days: the 11-disk RAIDZ3 pool is almost full, the data has become too large to handle in a single pool that cannot be extended, it consumes too much power, and the cooling is too noisy on warm days.

I planned to use single-disk ZFS pools for data and XFS for parity because ZFS is easy to handle and I like to keep the snapshot capability.
 
The data disks are a mix of ext4, btrfs and single-disk zfs pools. These are the filesystems I had before the move to SnapRAID and I did not change them. The parity disks are single-disk zfs pools on external USB3 drives.
 
I try to avoid ZFS (or CoW filesystems in general) for parity, because the in-place modifications of the parity file will significantly fragment it over time, and ZFS also does not support fallocate yet. Maybe I will try it with ZFS anyway; the file can easily be moved to a new filesystem if an empty disk is available.
 
I actually do worry that zfs may be a bad choice for the parity disk, for one reason: the parity is stored in a single file, and if one bit of that file were corrupt it might be difficult to get zfs to read past the corruption. I may change this in the future. The reason for making the parity disks zfs is more for the LVM-type features; my parity drives are currently larger than my data drives, so I make use of that by keeping extra datasets on the parity disks for data that I need but use infrequently.
 
I just tested this specific case.

After a complete 'snapraid sync' I purposely corrupted the underlying parity block device.
I used 'zpool scrub' to detect the error, 'zpool clear' to clear the error block counters.
After this 'zpool status' still reported the parity file as containing permanent errors as expected.
I then verified the readability of the parity file with ddrescue, which reported some read errors but continued to read successfully after the bad section.
A 'snapraid check' found the unreadable blocks, 'snapraid fix' corrected them.
At the end 'zpool status' no longer reported permanent errors.
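
For anyone who wants to repeat it, the sequence was roughly this (the pool, device and path names are placeholders from my test box, and the dd step destroys data, so only do this on a throwaway pool):

    snapraid sync                                  # make sure parity is fully in sync first
    dd if=/dev/urandom of=/dev/sdX bs=1M count=4 seek=100000 oflag=direct
                                                   # clobber a few MB of the raw parity device
    zpool scrub par1                               # let ZFS find the damage
    zpool status -v par1                           # parity file listed with permanent errors
    zpool clear par1                               # reset the error counters
    ddrescue /par1/snapraid.parity /scratch/parity.copy /scratch/parity.map
                                                   # read errors reported, but it reads past them
    snapraid check                                 # finds the unreadable parity blocks
    snapraid fix                                   # rewrites them from the data disks
    zpool scrub par1                               # after this, zpool status is clean again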

ZFS and SnapRAID seem to work very well together.
 
:D Thanks for the report. I try to test these types of failure scenarios but never got around to doing this one.
 