NAS drives keep dropping

DangerIsGo

2[H]4U
Joined
Apr 16, 2005
Messages
3,000
I have a NAS with a bunch of HDDs, all Western Digital, in a Norco 4020 case using two SuperMicro 8-port SATA PCI-X cards. EVERY time I restart the PC, I find 1 or 2 drives missing. I have to take them out and reseat them for them to be recognized. From then on, no problems...for the most part.

Recently, I've had two of my 750GBs (Blacks from 2007) die on me (too many bad sectors). I found this out the hard way...the drives would go missing, WHS would notify me, I'd reseat them, they'd be OK, then a little while later, they'd go missing again. This would go on for several hours before I'd take the damn thing out, put it in my main PC, run diagnostics on it and find it's going. SMART status is still OK, but has too many bad sectors so the extended test failed (on both drives) while the basic test passed. I could go ahead and ignore this, saying that these drives are 5 years old and they just went, but I don't want to take the chance anymore, especially with HDD prices as crazy as they are.

I have a mix of Blacks and Greens, 1.5TB and 1TB, moderately used at night (not during the day and morning), with temps floating around the mid to upper 40s. I took out the 5 jet engines in the case and replaced them with three 120mm fans. Any suggestions as to what could be causing this? Bad backplanes? (I know they released a later model, the 4220) Too high temps? Bad drives? I'd REALLY like to turn my NAS off during the day since I'm at work 10 hours a day to really save on my electric bill but having to screw around with my HDDs to get them to be seen upon boot up is a feat I have yet to accomplish.
 
Bad backplanes? (I know they released a later model, the 4220) Too high temps? Bad drives? I'd REALLY like to turn my NAS off during the day since I'm at work 10 hours a day to really save on my electric bill but having to screw around with my HDDs to get them to be seen upon boot up is a feat I have yet to accomplish.

All possible, but it could be a power problem as well. Is it always the same 2 drives, or is the problem hitting all the drives? If not always the same 2 drives, is it always drives on 1 particular controller? When the drive drops out, are you getting read errors ahead of the dropout? What do the SMART reads say about the drives? Spin Up Failures? Read Errors?
 
SMART status is still OK

It has taken me many years and 75+ RMAs (mostly at work) to find the first drive that had a SMART fail before it actually died completely. My point is that the overall pass/fail is not very useful. Look at the individual SMART parameters instead. If you do not know what they mean, use a program like CrystalDiskInfo to tell you whether the drive is having problems and whether those problems are a worry.
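
If you would rather script this than eyeball it, here is a minimal sketch of the "look at the individual parameters" idea. It assumes you have saved the attribute table that smartmontools prints with `smartctl -A` as text; the SAMPLE table below is made up for illustration and is not from any drive in this thread, and the WATCHED list is just my assumption about which raw values are worth a look.

```python
# Attributes whose raw value should normally stay at 0 on a healthy drive.
# This selection is an assumption for illustration, not an official list.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Spin_Retry_Count",
    "UDMA_CRC_Error_Count",
}

# Made-up sample in the column layout of `smartctl -A` output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   180   180   140    Pre-fail  Always       -       152
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       7
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from a smartctl -A style table."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have 10 columns;
        # the header row starts with "ID#" and is skipped by isdigit().
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            attrs[parts[1]] = int(parts[9])
    return attrs


def worrying(attrs):
    """Keep only watched attributes whose raw value is above zero."""
    return {name: raw for name, raw in attrs.items() if name in WATCHED and raw > 0}


if __name__ == "__main__":
    for name, raw in sorted(worrying(parse_smart_attributes(SAMPLE)).items()):
        print(f"{name}: raw value {raw}")
```

A drive can show non-zero values here while the overall SMART status still reports OK, which is the situation described above.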

but has too many bad sectors

The thing about bad sectors is that you should watch the count. If the count is growing by the day or week, this may be a sign that the problem is not an isolated media defect but a problem with the heads.
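
A rough sketch of what "watch the count" could look like if you jot down the bad sector count every so often. The function name and the readings are hypothetical; the only point is to compare the first and last reading and see whether the number is moving at all.

```python
from datetime import date


def sectors_per_day(readings):
    """Average growth in bad sectors per day between the first and last reading.

    readings: iterable of (date, bad_sector_count) pairs, in any order.
    """
    ordered = sorted(readings)
    if len(ordered) < 2:
        raise ValueError("need at least two readings")
    (first_day, first_count), (last_day, last_count) = ordered[0], ordered[-1]
    days = (last_day - first_day).days
    if days <= 0:
        raise ValueError("readings must span at least one day")
    return (last_count - first_count) / days


if __name__ == "__main__":
    # Hypothetical readings: one drive holding steady, one growing.
    steady = [(date(2012, 1, 1), 10), (date(2012, 1, 8), 10)]
    growing = [(date(2012, 1, 1), 10), (date(2012, 1, 8), 45)]
    print(sectors_per_day(steady))   # 0.0
    print(sectors_per_day(growing))  # 5.0
```

A flat count fits the isolated media defect case; a count that keeps climbing is the one to worry about.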
 
Sorry for the incredibly delayed reply. To conserve power at my new apartment, I would really like to turn my NAS off when I'm at work or when I'm sleeping (and allow the demigrator the chance to run at least once).

It's always different drives. Sometimes it's more often a few select drives, but really it's most of the drives that this happens to. The usual fix is to pop them out and plug them back in. Only on the super rare occasion do I have to do it multiple times or pop multiple drives in and out like a combination lock :p



As for CrystalDiskInfo, it only shows 3 of the 12 disks in my NAS. Any ideas why? One is the main drive with the OS on it and the other 2 are random ones that are in the pool. One of them is at Caution, which has

cdij.png
 
Western Digital drives absolutely suck ass in RAID unless you have older FALS drives flashed with TLER or have Enterprise SATA discs, which are still junk.

Once we made the move to SAS 15K Seagates, they have been running like F1 cars nonstop for months on end.

I'd look at different drives if you do not find out the immediate cause of your issue.
 