RAID5 Recovery

zacdl

Seems like I'm having all sorts of server problems lately ;)

Anyways, pretty basic question...

Knock on wood, I've never had a RAID system fail on me. So, I have yet to find out first hand how big of a pain (or how easy) a restore from a RAID5 setup is.

That's basically my question:

Does it rebuild the data with the server off (I would imagine so...)?
Require any special software? Or do you just pop a new (identical) hard drive back in its place?
How long does it take? Generally speaking, is it fairly quick?
 
I've never had the joy of doing it and I'm sure I'm going to get corrected here but:

What do you mean by "with the server off"? If it's not powered on nothing will happen.
Windows and Linux can both rebuild a software array. If 1 of 3 drives fails, the server will continue to run, just in a degraded state. Replace the broken disk, then rebuild the array: on Linux that's a specific command, and on Windows you can do it from Computer Management > Disk Management.
If it's hardware RAID, I think the controller might need a reboot to rebuild the array.
It can take a couple of hours or less depending on how big the drives are (20 gig drives won't take long, but 750 gig drives are going to take a while).

I hope this helps a little, I'm just diving into the world of RAID myself.
 
I'm used to HP Proliant servers...which have superior SmartArray RAID controllers. On the "hot swap drive" models...you can yank the failed drive while the server is running, slap in the replacement drive..and it begins to rebuild right there..on the fly. No reboot required. You've done all the work required of you. It just takes a few seconds.

Server may run a "tad" slower during the rebuild
How long does it take to rebuild? Depends on a lot of things.....a few are:
How much data there was
Which RAID controller you have..they have their own processors and RAM..so they perform differently
How busy your server is doing other things
You can select different "rebuild speed" options...rebuild slowly so that the server isn't bogged down, rebuild quickly if you need it, etc.
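
To put rough numbers on the factors above, here is a back-of-envelope sketch. It assumes the controller rewrites the whole replacement drive at a steady rate. The function name and the 30 MB/s figure are made up for illustration and are not from any controller's documentation; real rates depend on the controller, the server load, and the rebuild priority setting.

```python
def estimate_rebuild_hours(drive_gb: float, rebuild_mb_per_s: float) -> float:
    """Rough rebuild time: drive capacity divided by sustained rebuild rate.

    Assumes the whole replacement drive is rewritten and that
    1 GB = 1000 MB. Both inputs are illustrative guesses.
    """
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = (drive_gb * 1000) / rebuild_mb_per_s
    return seconds / 3600


# Hypothetical 30 MB/s rebuild rate on a busy server
for size_gb in (20, 160, 750):
    print(f"{size_gb} GB: ~{estimate_rebuild_hours(size_gb, 30):.1f} h")
```

Lowering the rate in the call is a crude stand-in for picking a slower "rebuild speed" option: the server stays more responsive, but the array spends longer in a degraded state.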
 
What do you mean by "with the server off"? If it's not powered on nothing will happen.
I mean before the OS is loaded.
Because it looks as if (I haven't had a chance to take a very close look) it is all using a card.
Disk Management only sees one large drive in three partitions- which is telling me that the card probably has some sort of BIOS.

I'm used to HP Proliant servers...which have superior SmartArray RAID controllers.
It is an HP Proliant. I didn't order nor build nor configure this server, but see my above comment regarding the card. Dunno if it is shipped like that or what...
Do you happen to know?
 
A hardware RAID card will rebuild an array in the background after the OS loads. Boot it all the way and the RAID card will take care of the rebuild.

When a drive first goes bad and the card decides to stop using it, it will usually "idle" the drive (mark it as not in use or whatever), and you can hot swap it with the OS running if you know which drive it is. If you don't (e.g. if the people who built the box didn't label things properly), you have to take down the server, swap the most likely drive(s), and watch the BIOS messages to see if you removed the bad one. If you didn't, you quickly hit the off switch and try a different drive.
 
I was able to rebuild a RAID-1 array on a Dell server recently by just replacing the bad drive and entering the RAID card's setup. I may have had to make the drive available to the array or something like that and then the card took off and rebuilt the mirror. It took about an hour or so on a partially filled 160GB SATA drive.

I let the mirror repair itself and then booted up and it's like nothing ever happened. OpenManage then saw the array as "healthy" instead of "degraded".

Did the same thing on my own home server last year. Replaced a dead 120GB ATA drive and the tools in the el cheapo Highpoint RAID card setup rebuilt the array just fine.

I also repaired a few Compaq RAID-5 arrays back in the 9GB & 18GB SCSI drive days. Just pulled the alarmed drive with the server up and installed the new replacement. Saw the message that it was starting the rebuild and I was out the door. Never had a problem with the auto rebuilding.

BTW, do not pull out a drive while the array is rebuilding. Some of today's controllers may be able to handle that, but doing that could easily trash the whole array. Once it starts rebuilding, leave it alone. And of course, have the last backup available just in case.
 
We use Proliant servers. As someone stated earlier, nothing happens if the server is off. In a RAID 5 configuration (especially a hot-swappable one), you should be able to replace a hard drive on the fly without having to down the server or anything. If it's not hot-swappable, then you'll have to replace the drive while the machine is down. If you have more than one hard drive failure, you're SOL.
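
For anyone wondering why one failure is survivable and two are not, here is a minimal sketch of the XOR parity idea behind RAID 5. It is a toy with one stripe on a hypothetical 3-drive array. The helper name and the sample bytes are made up for illustration, and real controllers rotate parity across drives and work on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 3-drive array: two data blocks plus parity.
data_a = b"RAID"
data_b = b"five"
parity = xor_blocks([data_a, data_b])

# The drive holding data_b fails: rebuild it from the two survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)  # prints True
```

With two drives gone, only one block of the stripe is left and there is nothing to XOR it against, so the missing data cannot be recomputed.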

Most Proliant have RAID configuration. It all depends on how many hard drives you have and how many arrays.
 
How do you check these HP RAIDs? As I said- Disk Management only sees one physical drive, so something else is managing it.

My problem (once more) is I didn't set up this system. And rebooting the server to "just see" what comes up really isn't much of an option.

So I'm not even entirely sure it IS a RAID5. I'm just assuming it is.
 
Confirmed, HP Proliants are awesome. Speaking from a remote support perspective, you couldn't ask for more. Say you have a site that doesn't have any IT staff located there: you ship the drive and tell the personnel to yank and insert.

Dell's PowerEdges aren't too shabby. Not as great as the HPs, but still fairly decent. The hot spare automatically replaces the failed disk. When you take the bad disk out and insert the new disk, you have to manually designate the new disk as the hot spare.
 
On our Dell servers, it's all handled automagically by the RAID controller. But you can only troubleshoot the beeping and track rebuilds if you have installed the management software. There isn't anything native in Windows that will do that part for you.
 
Just checked the server- looks like HP has a software utility. So not entirely sure how it handles recovery- I just hope I never find out ;)

I guess it would be a good idea to order a couple more physical drives as spares.

Thanks for all your comments guys!
 
Just checked the server- looks like HP has a software utility.

HP's ACU...yes. The OS will not see the individual drives, it will only see what you've made for partitions from the RAID. Say you have 3x 36 gig drives set up in a single RAID 5....when you install Winders...it will see only a single roughly 70 gig drive.
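
The "roughly 70 gig" figure comes from RAID 5 giving up one drive's worth of space to parity. A quick sketch of the arithmetic (the function name is just for illustration, and it ignores formatting overhead and GB vs GiB rounding):

```python
def raid5_usable_gb(drive_count: int, drive_gb: float) -> float:
    """Usable RAID 5 capacity: one drive's worth of space goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_gb


print(raid5_usable_gb(3, 36))  # 72
```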
 
I'm no longer in an HP environment here, but I'm pretty sure the new Array controllers have 'HotSpare' status like the PERCs in the Dells I use. If you slap the extra drive in a spare cage slot and designate it as a hotspare, as soon as one drive fails the controller automatically mounts the hotspare drive and rebuilds the array on the fly, and you never have to touch it (except to remove the bad drive). Depends on how long you want to run in degraded mode. On the PERCs there's a noticeable difference in performance when running degraded.
 
I'm no longer in an HP environment here, but I'm pretty sure the new Array controllers have 'HotSpare' status like the PERCs in the Dells I use.

The Proliants have had options for "hot spare" for quite a while..even back in the Compaq days. Just depends on which controller you have in your server.
 
Just for paranoia's sake, I've always abided by my own personal Raid-5 rule: "Raid-5 isn't redundant WITHOUT a hot spare" And, I've had a case in a 3 drive system where a drive failed, the array rebuilt, and within hours of finishing the rebuild, Drive#2 dies... It took me forever to get that through my salesweasels' heads too! It's a small price to pay for paranoia insurance.
 
If you have it on a RAID 1 (I think it's 1, mirroring) or RAID 5, then as long as it's a hard drive failure, recovery is just a matter of replacing the drive itself. You don't do anything else. Hopefully you won't have a different hard drive go out on you while you're rebuilding -- that'd be disastrous.
 
It is RAID5.

I'm thinking of calling HP sometime and giving them the server's SN and telling them to ship me an extra hard drive.
 