RAID 5: HDD FAIL

OpenBook

I have a DELL 2900 win2k3 fileserver. Yesterday I noticed that one of the drive caddies was lit up yellow and the LCD on the front said "HDD FAIL". So, naturally, I panicked. I ripped it out and replaced it.

After installing the new drive I was able to relax a bit and take a look at the questionable drive. It's a Seagate 7200.10 320GB with a five-year warranty. I figured I would send it in for RMA, but Seagate wants you to run a SeaTools diagnostic before sending anything back.

No problem, I run the Seatools long test:
PASS
PASS
:confused:
Now I've got a drive that I know Seagate won't replace but I don't trust it.

Question is:
Does anyone have any experience with these DELL Perc5i RAID controllers? I can't find any documentation about what conditions cause a HDD FAIL flag. Why would it tell me that a good drive has failed?

I have the drive in question plugged in to my desktop here and it appears to be functioning.
 
Perhaps... Does that data get cleared when you unplug the drive? It passes SMART checks now in my desktop.
 
>Run Wddiag on it or hdtune pro long health test... I had bad drives passing their manufacturers tests but failing other ones...
 
No, if it were failing SMART it would also fail SeaTools. What can happen (and has been a big issue with WD drives) is this: if the drive's own error handling takes, say, 7 seconds to ECC-recover a sector or decide to remap it, but the timeout on the RAID controller is, say, 5 seconds, then the controller will drop the drive after 5 seconds even though the drive gets itself in order 2 seconds later. You can read about WD TLER to get more details about this specific issue.
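
To make the timing argument concrete, here is a minimal Python sketch of that race. It is only a toy model: the function names and the 3-second recovery limit are made up for illustration, the 7 and 5 second figures are the "say" numbers from the post above, and none of this reflects actual PERC firmware behaviour.

```python
from typing import Optional


def drive_response_time(recovery_needed_s: float,
                        recovery_limit_s: Optional[float] = None) -> float:
    """Seconds until the drive answers the controller.

    Without a recovery limit the drive keeps retrying until it is done.
    With a limit (the TLER idea) it gives up at the limit and reports an error.
    """
    if recovery_limit_s is None:
        return recovery_needed_s
    return min(recovery_needed_s, recovery_limit_s)


def controller_outcome(recovery_needed_s: float,
                       controller_timeout_s: float,
                       recovery_limit_s: Optional[float] = None) -> str:
    """What the (hypothetical) controller sees for one slow sector."""
    response = drive_response_time(recovery_needed_s, recovery_limit_s)
    if response > controller_timeout_s:
        # The drive was still busy when the controller stopped waiting.
        return "drive dropped from array"
    if recovery_limit_s is not None and recovery_needed_s > recovery_limit_s:
        # The drive answered in time, but with an error for that sector.
        return "read error reported, drive stays in array"
    return "sector recovered, drive stays in array"


if __name__ == "__main__":
    # Drive needs 7 s, controller waits 5 s, no recovery limit.
    print(controller_outcome(7, 5))
    # Same sector, but the drive gives up after a hypothetical 3 s limit.
    print(controller_outcome(7, 5, recovery_limit_s=3))
```

In the second case the drive answers inside the controller's window, so the controller can deal with one bad sector (for example from the array's redundancy) instead of failing the whole drive.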
 
Our Dells occasionally have HDD faults show up for seemingly no reason. We simply reseat the drive and all is good again. That's become our first step anytime it happens, actually.
 
Thank you all very much, this is interesting. The HD Tune error scan passed, all green. The health screen is meaningless to me: some numbers are above the threshold, and two lines are highlighted yellow:
Reallocated Sector Count and Spin Retry Count
What does the yellow indicate?
It also says "Health Status: OK"

Before the drive went down the following error was repeated continuously for 5 hours:

"The Patrol Read corrected a media error"

After that, until I replaced the drive, I was getting

"A foreign configuration has been detected."

Now all is quiet in the event log.

WTF is a Patrol Read?
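
On the yellow SMART lines: a number above its threshold is the healthy state. Each SMART attribute has a normalized value that drops toward a vendor threshold as things get worse, and the attribute counts as failed once the value is at or below that threshold. The sketch below shows one plausible way a tool could arrive at "yellow" while overall health is still OK. The "warning" rule (a nonzero raw count on a watched attribute) is an assumption, not HD Tune's documented logic, and all the sample numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value (higher is healthier)
    threshold: int  # vendor failure threshold
    raw: int        # raw counter, e.g. number of reallocated sectors


# Assumed watch list: attributes where any nonzero raw count is worth a look.
WATCHED = {"Reallocated Sector Count", "Spin Retry Count"}


def classify(attr: SmartAttribute) -> str:
    """Return 'failed', 'warning' or 'ok' for one attribute."""
    # A threshold of 0 is treated here as informational (it can never trip).
    if attr.threshold > 0 and attr.value <= attr.threshold:
        return "failed"
    # Assumed rule: still above threshold, but the raw counter has moved.
    if attr.name in WATCHED and attr.raw > 0:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Invented sample values, not readings from the drive in this thread.
    samples = [
        SmartAttribute("Reallocated Sector Count", 100, 36, 12),
        SmartAttribute("Spin Retry Count", 100, 97, 0),
        SmartAttribute("Seek Error Rate", 25, 30, 0),
    ]
    for attr in samples:
        print(f"{attr.name}: {classify(attr)}")
```

Under this reading, "Health Status: OK" only means no attribute has crossed its threshold yet. A nonzero reallocated sector count would still fit with the Patrol Read media errors in the event log.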
 
Ok, so I put the drive back in the fileserver. Now it is showing that the disk status is "foreign" meaning the drive contains data not relevant to this array. (Might be a pant load of BS)

Question is: How do I clear it and/or add it back to the array; as a hot spare for example.

The options in the DELL OpenManage are: Blink and unBlink.

I have read about people performing a clear and an import command but I can't find that in the management software or the command line interface.

Is there something I need to do to the drive to prepare it prior to joining it?

I think the answer is in here but I am not a member:
http://www.experts-exchange.com/Storage/Hard_Drives/Q_24100162.html
 
I have a friend who had a problem similar to yours.
He had an Intel RAID 5 with four Seagate 500GB drives.
He used the RAID array as his OS drive.
At first everything seemed fine, but then one of the drives would fall out of the array and it would rebuild, which made everything slow.
This started to happen very frequently, almost on a weekly basis.
All his drives passed the Seagate tool test and they work fine on their own.
In fact he gave them to me, and I have been using them in an Nvidia RAID 5 for almost a year without any failures except the ones I caused myself.
The only difference is that I don't use them as an OS drive, and there is no page file or anything similar on the array.
 
>Now I've got a drive that I know Seagate won't replace but I don't trust it.

How do you know Seagate won't replace it?
Has anybody actually had an RMA'ed drive sent back because Seagate claimed it passed the SeaTools tests, even though the drive was obviously having the issues that made you RMA it in the first place?

If Seagate is really so strict about that, I'm glad I've never owned a Seagate drive.

As long as I've had some sort of evidence that the HDD was having issues (and most likely at fault for said issues), I've never had an issue with Western Digital or Hitachi accepting problem drives for RMA, even if the drives passed their internal drive test tools and were within SMART thresholds. The internal drive test tools and SMART usually have thresholds set to the point that if either fail, you've likely already experienced data loss as the HDD is already beyond the point of no return, which usually means it is stating the obvious.
 
A few months ago a Seagate 1TB failed in my array (the third one this year). The only testing I did was to put it into another PC to see if it generally worked (it did for a bit, then threw a SMART error). ANY drive that fails out of a RAID for any reason, I replace. Cross-shipping with Seagate is $20 per drive, IIRC.

RMA the drive anyway. The worst that can happen is that Seagate charges you for the drive (and for a 320GB, that won't be much).
 
If this is a seagate drive that you purchased through dell in a dell caddy, it likely does not have the 5 year seagate warranty on it and has either a 1 year dell warranty or whatever warranty you got on the original server.
 
Interesting that you mention that, mwroobel. I did call DELL and they said the drive was outside the 3-year DELL warranty I had. When I contacted Seagate they told me that the drive will be in warranty until 2012 (another two years from this posting).

Lesson here is: don't listen to DELL, they want to sell you something. DELL won't even help you get the full manufacturer's warranty on the hardware they sell you unless you buy into some crazy DELL support plan.
 
The Seagate Dell drives we have do not allow for RMA through Seagate direct. Whenever I have put the serial numbers in, it just redirects you to speak to Dell.

Odd that the serial on that one allows for the RMA process direct through Seagate.
 
At any rate I am still having a problem and it must be an easy fix. In the DELL server management the drive is showing up as "Foreign" and I need to get it added to the array. How do I tell the controller to prepare it as a hot spare?

In other RAID software I just hit "Clear" and "Add" but this DELL software only has "Blink and unblink" which are for drive location as stated earlier.
 
FIXED
I found it. The answer was RTFM but even after reading the description it takes some looking to find. Here is what the help says:
"To locate this task in Storage Management:
Expand the Storage tree object to display the controller objects.
Select a controller object.
Select the Information/Configuration subtab."

I had been looking at the normal navigation tree on the left. The Config subtab is in the main window at the top, right underneath the tab that says "Properties". It is hyperlink text.

I wish they would put this option on the drive and not on the controller.
 