Problem with RAID5 setup?

SBB (Limp Gawd, joined May 30, 2009)
Hi guys

I just built this system

http://hardforum.com/showpost.php?p=1035595810&postcount=750

I'm happy with it, but when I showed it to my friend, he mentioned that if a drive failed and I replaced it and tried to rebuild the array, the rebuild might fail due to an "unrecoverable read error".

What are your thoughts on this? I want to know ASAP so I can change the array before I start filling it.

Thanks for the help in advance!
 
RAID is no substitute for a backup - so regardless of using RAID, make sure you have a backup if you're concerned about the safety of your data.

If you have no money for that, the only good alternative would be to run ZFS using snapshots. That would give you a "backup" of sorts, as it allows you to go back in time and recover corrupted/deleted/modified files. It is also much more resilient to bit errors (BER).

That's what your friend is telling you about. If one drive fails, and you are in the process of replacing it, there is no redundant information anymore. In essence, a RAID5 with one disk missing "degrades" to RAID0. Any error on the remaining drives during the rebuild may cause havoc. The likelihood of such bit errors is commonly expressed as the BER, or Bit Error Rate. This has become more important as disks got larger while the BER stayed about the same, causing more bit errors in actual use.
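To put rough numbers on that risk: if every remaining disk must be read in full during a rebuild, the chance of hitting at least one unrecoverable read error is 1 - (1 - BER)^bits_read. A minimal sketch of the estimate (the disk count, sizes, and the common consumer-drive spec of 1 error per 10^14 bits are assumed illustrative figures, not numbers from this thread):

```python
# Rough estimate of hitting at least one unrecoverable read error (URE)
# while rebuilding a degraded RAID5 array.
# Assumptions: a consumer-drive URE spec of 1 error per 1e14 bits read,
# and independent, uniformly distributed errors (a simplification).

def rebuild_ure_probability(disks, disk_tb, ber=1e-14):
    """Probability of >= 1 URE while reading all surviving disks in full."""
    surviving = disks - 1                        # one disk has already failed
    bits_read = surviving * disk_tb * 1e12 * 8   # TB -> bits
    return 1 - (1 - ber) ** bits_read

# Example: rebuilding a 5 x 2 TB RAID5 after one disk fails
p = rebuild_ure_probability(disks=5, disk_tb=2.0)
print(f"Chance of a URE during rebuild: {p:.0%}")
```

Under those assumptions the 5 x 2 TB example lands near a coin flip, which is why the spec sheet BER starts to matter once arrays reach multi-terabyte sizes.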

Choosing disks with 4K sectors, such as the WD EARS, might help with BER. But really, the best investment would be to look at ZFS. It would mean you retire the PCI-X controller, or only use it as a dumb HBA, and use as many onboard SATA ports as are available. If you choose ZFS, it is only logical that you also choose software RAID.
 
The data on the drives is not critical, so it wouldn't be the end of the world if I lost it (all my important work is backed up off site).

But basically what I need to know is: if one drive fails and I replace it, will rebuilding the RAID kill it?!
 
But basically what I need to know is: if one drive fails and I replace it, will rebuilding the RAID kill it?!

Of course, replacing a drive would not by itself kill the array. Well, unless there is a bug in the RAID software, or a second drive dies while you are replacing the first.
 
What's this TLER thing I keep reading about, too? How does that affect things? :(
 
Sub.mesa, you seem quite knowledgeable about ZFS; however, it seems no matter what the problem, you recommend it. Personally, I think the time to learn it and the drawbacks aren't worth it except for a few people. Are you employed by Sun ;)

I would consider doing a RAID6 array versus a RAID5 if you're that concerned.

Let's look at this scenario.
A drive fails, you replace it, and one of the following could happen:

1) Nothing; the array rebuilds and you keep going on your merry way.
2) As the array rebuilds, another drive dies too (possible, especially if all the drives are about the same age).
3) As the array rebuilds, it hits an error it cannot recover from (possible, and as drive capacities increase, more likely to happen).


With RAID5, if you are rebuilding and another drive dies/errors, then you've lost it. With RAID6, two drives would have to die/error while you're rebuilding.
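That difference can be put in numbers with a toy model: during a rebuild, give each surviving disk some chance of dying or throwing an unrecoverable error, and data is lost only when more disks fail than the remaining parity can cover. The per-disk failure probability below is an assumed illustrative figure, not something from this thread:

```python
# Toy comparison of rebuild survival: RAID5 vs RAID6 at equal capacity.
# Model: during a rebuild, each surviving disk independently "fails"
# (dies or throws an unrecoverable error) with probability p_fail.
# A degraded RAID5 has 0 parity disks left, so ANY further failure loses data;
# a degraded RAID6 has 1 parity disk left, so it takes 2+ failures.

from math import comb

def p_rebuild_loss(n_surviving, p_fail, parity_left):
    """Probability that more disks fail than the remaining parity covers."""
    loss = 0.0
    for k in range(parity_left + 1, n_surviving + 1):
        loss += comb(n_surviving, k) * p_fail**k * (1 - p_fail)**(n_surviving - k)
    return loss

p = 0.05  # assumed per-disk failure/URE chance during one rebuild window
# Same usable capacity (4 data disks): 5-disk RAID5 vs 6-disk RAID6
raid5 = p_rebuild_loss(n_surviving=4, p_fail=p, parity_left=0)
raid6 = p_rebuild_loss(n_surviving=5, p_fail=p, parity_left=1)
print(f"RAID5 loss chance: {raid5:.1%}, RAID6 loss chance: {raid6:.1%}")
```

Even with the extra disk to read, the second parity drops the loss probability by roughly an order of magnitude in this model, which is the whole argument for RAID6 on large arrays.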


Now then, since you said this is a server, I would just make sure that the really important files are backed up correctly and not stress about the rest too much.

Edit: What Sub said is right. If you are low on funds, then ZFS would be a decent way to do it, but there's a lot of caveats and learning involved in implementing a ZFS pool.
 
I can't do RAID6 as my controller doesn't support it :( Seems like I might be better off dumping the RAID and going for a Windows Home Server storage pool instead. At least then the damage would be limited. What happens if a drive dies in Windows Home Server?
 
Sub.mesa, you seem quite knowledgeable about ZFS; however, it seems no matter what the problem, you recommend it.
Thanks, and I can see your point. However, I didn't really recommend ZFS to him; I just pointed out it's the only real alternative to having a backup, meaning he either opts for no backup, invests in twice the capacity and runs a legacy backup, or runs ZFS. While neither RAID nor a filesystem functions as a backup, ZFS is different in that it can go back in time using snapshots. This is not exclusive to ZFS; other filesystems have it too. But what sets ZFS apart is how easily snapshots can be used and the fact that they don't cost additional storage space (only changed data is stored; shared data isn't duplicated).

But I consider ZFS something for a serious home NAS, and in particular for people who are worried about corruption on their filesystem. Since the OP was concerned about BER, it is worth mentioning that this is ZFS's strong point; there is virtually no substitute. So I do not recommend ZFS to everyone; only to people who have a particular interest in storage and know how to manage such a system. ZFS isn't the easy Windows point-and-click type of software, so by definition it's not for everyone.

Personally, I think the time to learn it and the drawbacks aren't worth it except for a few people. Are you employed by Sun ;)
Time to learn it; well, essentially you're saying never do anything beyond Windows, it's not worth the trouble. So Linux wouldn't be worth it either. I do not agree with such a statement. I think ZFS and other open source projects can benefit a lot of people. But as often happens, user-friendly operation is really the lowest priority in open source development. That may change when ZFS v13 gets integrated into FreeNAS; then it's sort of point-and-click.

I never worked for Sun, and even if I did, I would be kind of wasting my time trying to get more users to use it. Besides, I only really have experience with ZFS on the FreeBSD platform. I did test it extensively, though, long before it was stable, back when it was crashing with KMEM_MAP_TOO_SMALL panics every 5 minutes. :D

ZFS is just the coolest thing that happened to storage; except for the SSD of course.

3) As the array rebuilds, it hits an error it cannot recover from (possible, and as drive capacities increase, more likely to happen).
It's quite likely that multiple disks in a RAID5 will have errors. ZFS will recover from that without a problem, as the bit errors don't all affect the same file; thus all files can be repaired with zero data loss. A hardware RAID6, however, will be broken if multiple disks have bad sectors (BER) during a rebuild.

Most hardware RAID will disconnect/kick out a disk if it starts showing weakness, such as a bad sector. However, there has been discussion about controllers that fix bad sectors by writing interpolated data to them. For controllers that do this, TLER is highly recommended. I don't know of any controllers that do, however; my Areca hardware controller doesn't seem to.

Now then, since you said this is a server, I would just make sure that the really important files are backed up correctly and not stress about the rest too much.
Well, that's the point; he is running without a backup and fully relies on RAID5 or RAID6 to protect his data. Personally, I would feel safer with two RAID0 arrays where one is a full backup of the other. Though many people "see" that as less secure; somehow RAID5 enjoys much respect and trust, while I think that trust can be misplaced.
 
Sub, well said.

I actually do believe in other OSes, as I have and use FreeNAS and Linux on my own PCs. Heck, with 9 boxes and counting, I am always looking to try something new just for the sake of learning.

And to reiterate and keep this on subject, you have a few options. Do as I and many others do: run two boxes with one backing up the other. Personally, I have WHS as my main server because I am quite familiar with Windows, and should a drive fail, given how WHS pools them, I can always pull the other drives, plug them into an external enclosure, and recover the data. And to back up that Windows box, I have a Linux PC with a software RAID setup. The problem with this is it gets pricey, especially considering hard drives. You can use older equipment for most of it, but you simply can't skimp on the storage.

You could use ZFS. I actually read about it a while back and it has some outstanding features. Unfortunately, I don't have the inclination to learn it. Maybe if it were implemented in FreeNAS as hinted, I would jump all over it.


Or... simply decide what's irreplaceable and only back that up.

Lots of choices, and a range of budgets to choose from.
 
At least then the damage will be limited. What happens if a drive dies in windows home server?

Well, if the data on the dead drive has already been duplicated, you don't lose it. Just slap in a new drive and you'll be set.
 
I've actually decided to switch to unRAID, it provides the best compromise for me.

It gives me redundancy against a first drive failing, but it also means that I won't lose the whole array's data if other drives are lost, just what's on those drives. Obviously it's not as fast, and it hogs a whole system, but it's giving me 50 MB/s read/write at the build stage, which seems more than good enough!

It also means I can expand the array without having to worry about getting the same drives/controllers, etc.

Thanks for all the help everyone :)
 