Areca 1680i: 1 drive keeps going AWOL

Jeroen1000

My previously well-behaved RAID 6 array is starting to act up.
It consists of:

- Hitachi Deskstar 5K3000 2 TB drives
- Hitachi Deskstar 7K2000 2 TB drives
- 1 Toshiba DT01ABA300 3 TB drive (same drive as the 5K3000 but Advanced Format). It replaces a Hitachi that failed with bad sectors
- One HP SAS-expander (with custom fan mounted for cooling)
- The PSU is a Silverstone Strider Plus ST60F-P (600W)

Usually, my server is fully shut down. When I boot it, the same Hitachi 5K3000 drive gets marked as missing. A few reboots usually fix this. It's always the same drive that goes missing.

When the drive is present, a full check of the array (takes about 8 hours) completes successfully. This was not the case while the drive with bad sectors was still present (the one that got replaced by the Toshiba drive). So I assume the drive that goes missing (a Hitachi 5K3000) is in good condition.

I'm not sure what I should do to troubleshoot this. Perhaps Odditory is still around with expert advice :)

Kind regards,
Jeroen
 
Have you checked the cables? Is the disk correctly inserted? I wouldn't trust Odditory's knowledge too much, judging from some earlier threads.
 
Good point, I will check whether the drive "shakes" loose, but it has always been correctly inserted. Next time it goes missing, I'll reinsert the drive, reboot, and visually confirm it is inserted deep enough.

As for the SAS and power cables: if they were faulty, that should affect more than 1 drive, correct?
 
When I boot it, the same Hitachi 5K3000 drive gets marked as missing. A few reboots usually fix this. It's always the same drive that goes missing.

Does the same drive go missing mostly when you cold boot your server, or does it also occur when you warm boot?

If it happens mostly on a cold boot, then most likely that drive's motor has trouble spinning up the platters quickly, or the drive is too slow to respond to the host during the BIOS POST, when the RAID controller firmware needs to enumerate the connected drives.

You can also try connecting your drives to different ports of the RAID controller, just to rule out bad cables or ports. Last time I checked, Areca controllers allow RAID disks to be connected in any sequence to any port once they have been configured.

My bet is that your Hitachi 5K3000 has trouble spinning up quickly at power-up (a bad motor, perhaps).

I would replace that drive ASAP.
 
Agree with this post. I would try a different port (and thus a different power cable, if it's a hot-swap chassis) and also try increasing the staggered power-up delay on the controller to see if that makes a difference. I know I had to do that with a shitty brand 850 W PSU that turned out not to be able to deliver its rated output.
 
Does the same drive go missing mostly when you cold boot your server or does it also occur when you warm boot?
I've performed quite a bit of warm boot testing today and all seems well. Also, when I FULLY shut down the server (power it off) and then boot it immediately afterwards, there is NO issue.

So, based on this and previous occurrences, my answer is: it only happens on cold boot.
And warm drives help (that is a bit puzzling, isn't it?).

You can also try connecting your drives to different ports of the RAID controller, just to rule out bad cables or ports. Last time I checked, Areca controllers allow RAID disks to be connected in any sequence to any port once they have been configured.

I'll visually confirm this when I get home, but if I recollect correctly:

- I've got 3 cables going to the HP SAS-expander card (1 cable per 4 drives; I've got 10 drives)
- I've got 1 cable going from the HP SAS-expander card to the Areca 1680i card.

Are you saying that I can disconnect one of the 3 cables (obviously the one with the oddball drive) going to the SAS-expander and connect it to the Areca card, which still has 1 free port?
Are you also saying that I can freely swap drives around in the chassis?

I always thought the correct drive order was critical so I would like to confirm this before I make things worse:)

@houkouonchi

An interesting fact is that staggered spin-up only works when NOT cold booting. I do not remember the technical details, but either the drives do not abide by the rules or the Norco backplane does not contain the necessary logic.

My array spins down after 30 minutes, and when it gets woken up, staggered spin-up works.

Based on your kind help (all of you) this is my top 3 for troubleshooting in this particular order:

1. Swap cables around. This is the easiest to do but I think this will not be the cause. More than 1 drive should be affected UNLESS a port on the backplane is faulty.

2. Bad drive (motor issue). A warm drive's motor might spin up more easily, which could explain why warmed-up drives play nice when the system has been shut down and is booted again before the drives have cooled down.

Action: replace drive.

3. Bad PSU. It is a single 12 V rail 600 W PSU, so it should easily handle 10 drives spinning up at once. But things break all the time :).

Action: replace PSU.
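
The reasoning behind this list can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, its parameters and the cause labels are made up here, and the rules merely encode the arguments made in this thread (shared parts should affect several drives; a fault that moves with the drive points at the drive). It is not an Areca or HP diagnostic.

```python
from typing import List, Optional


def likely_causes(
    drives_affected: int,
    cold_boot_only: bool,
    follows_drive_after_swap: Optional[bool] = None,
) -> List[str]:
    """Order the three suspects by how well they fit the observations.

    Hypothetical helper: it only mirrors the reasoning in this thread.
    """
    # Shared parts (PSU, expander uplink cable) should hit several drives.
    if drives_affected > 1:
        return ["psu or shared cable", "backplane", "drive"]

    # After moving the drive to another slot: if the fault moves with it,
    # the drive is the suspect; if it stays with the slot, the port is.
    if follows_drive_after_swap is True:
        return ["drive", "psu", "backplane port or cable"]
    if follows_drive_after_swap is False:
        return ["backplane port or cable", "psu", "drive"]

    # No swap result yet: a single drive failing only when cold points at
    # the drive's spin-up, with a bad backplane port still possible.
    if cold_boot_only:
        return ["drive", "backplane port or cable", "psu"]
    return ["backplane port or cable", "drive", "psu"]


# Observations reported so far: one drive, cold boots only, no swap yet.
print(likely_causes(drives_affected=1, cold_boot_only=True))
```

With the observations so far, the drive comes out first, but the swap test is what separates a bad drive from a bad backplane port.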
 
Ah, you're behind an HP SAS expander... well, that is why staggered power-up doesn't work: the HP SAS expander ignores this setting and powers on the drives all at once, so any drive behind the expander will spin up as soon as it gets power.
 
Are you saying that I can disconnect one of the 3 cables (obviously the one with the oddball drive) going to the SAS-expander and connect it to the Areca card, which still has 1 free port?
Are you also saying that I can freely swap drives around in the chassis?

I always thought the correct drive order was critical so I would like to confirm this before I make things worse

I know for certain you can swap drives around within the same SAS expander chassis, because I have done that before and the RAID controller recognized the existing RAID set without any problem. (I emailed Areca support a few years ago about this, and they said Areca controllers store the RAID configuration both in the controller's flash memory and in hidden partitions on the RAID set's member drives, so there is no specific sequence in which those drives must be connected.)

As for moving one drive out from behind a SAS expander and connecting it directly to the RAID controller, or to another SAS expander connected to the same controller, I cannot be certain whether that will work.

If you are worried about messing up your existing RAID configuration by swapping drives around, you may want to email Areca support to confirm what will and will not work.
 
Hi saiyan,

Swapping the drive around in the same chassis works fine indeed. The server went 3 days without the issue and just moments ago it happened again. I've just moved the drive to a different row in the Norco chassis.

This is my approach to swapping cables around and testing for a faulty backplane port at the same time.
Just to be clear: In the Norco chassis, each row (4 drives per row) has its own backplane and each backplane is connected to a different port on the HP expander using a different cable.

I have also measured that the 10 drives draw about 296 W at start-up, well below the PSU's rated limit of 600 W.
More news soon hopefully:)
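
As a quick sanity check on those numbers, here is the headroom worked out in Python. It is a rough sketch that uses only the two figures from this thread (296 W measured, 600 W rated); the variable and function names are made up, and total wattage says nothing about short current peaks on the 12 V rail.

```python
PSU_RATED_W = 600         # Silverstone ST60F-P rating from the first post
MEASURED_STARTUP_W = 296  # measured draw with all 10 drives spinning up


def headroom(rated_w: float, measured_w: float) -> float:
    """Return the unused capacity in watts."""
    return rated_w - measured_w


spare_w = headroom(PSU_RATED_W, MEASURED_STARTUP_W)
load_pct = 100 * MEASURED_STARTUP_W / PSU_RATED_W

print(f"spare: {spare_w} W, load: {load_pct:.1f}% of rating")
```

That leaves roughly half the rated capacity unused at start-up, which is why the PSU sits at the bottom of the suspect list.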
 
Just had an event again, and this time I have more useful info. The hard drive clicks twice, then beeps/squeaks once, then clicks twice again, and so on.

I popped it out, hot-plugged it back in, and it spun up. Array happy again. So... flaky drive?
 
Are you still using that hard drive which you reported would drop out of an array on cold boot?
And now it's making clicking noises?

I thought you would have had that drive replaced by now.
 
It is due to be replaced the next time my company orders some hardware. I thought you would like to know that you and houkouonchi were spot on.
 
Okay. It's good to know that you are going to replace that drive soon. :)

In my experience, when a hard drive starts making unusual clicking sounds, it probably will not last another six months of daily use. If it's used in a 24/7 server, it will most likely fail within a couple of months.
 