Areca Again

haileris

Limp Gawd
Joined
Jun 6, 2010
Messages
458
Had intermittent issues with my fourth volume (12 x 2 TB disks in a RAID-6 config) on my Areca 1680, and today it failed spectacularly. I have 1 disk failed, although a bunch of them decided to fail and rebuild en masse. The RAID set state shows rebuilding, with the volume state showing failed. I guess I'll email Areca support again, but has anyone got any helpful ideas? :) I had just put 17 TB of data on there as well!!

Cheers
H
 
A bit more info which I sent to Areca. I guess I'll read the "help me recover my Areca RAID Sets" post but would love some guidance - some of those things in there scare me!!!

Raid Set Status

Raid Set # 003 11/12 Rebuilding 24004.8GB

Raid Set info

Raid Set # 003 E#3Slot#13 Volume-3 (0/0/3) Failed 20004.0GB
E#3Slot#14
E#3Slot#15
E#3Slot#16
E#3Slot#17
E#3Slot#18
E#3Slot#19
E#3Slot#20
Failed
E#3Slot#22
E#3Slot#23
E#3Slot#24

The disk in slot 21 states that it is a spare.

The event log shows this

2011-02-23 21:29:50 Proxy Or Inband HTTP Log In
2011-02-23 21:25:13 H/W Monitor Raid Powered On
2011-02-23 21:22:00 H/W Monitor Raid Powered On
2011-02-23 21:20:38 H/W Monitor Raid Powered On
2011-02-23 20:44:33 Raid Set # 003 RaidSet Degraded
2011-02-23 20:44:31 Volume-3 Volume Failed
2011-02-23 20:42:40 Proxy Or Inband HTTP Log In
2011-02-23 20:41:05 Enc#3 Slot#23 Device Inserted
2011-02-23 20:41:03 Enc#3 Slot#24 Device Inserted
2011-02-23 20:40:59 Enc#3 Slot#22 Device Inserted
2011-02-23 20:40:56 Enc#3 Slot#23 Device Removed
2011-02-23 20:40:54 Enc#3 Slot#21 Device Inserted
2011-02-23 20:40:53 Enc#3 Slot#24 Device Removed
2011-02-23 20:40:51 Enc#3 Slot#22 Device Removed
2011-02-23 20:40:49 Enc#3 Slot#21 Device Removed
2011-02-23 20:37:19 Incomplete RAID Discovered
2011-02-23 20:37:18 H/W Monitor Raid Powered On
2011-02-23 20:33:52 Incomplete RAID Discovered
2011-02-23 20:33:51 H/W Monitor Raid Powered On
2011-02-23 20:30:16 Incomplete RAID Discovered
2011-02-23 20:30:15 H/W Monitor Raid Powered On
2011-02-23 20:25:57 Proxy Or Inband HTTP Log In
2011-02-23 20:25:44 Enc#3 Slot#21 Device Inserted
2011-02-23 20:25:40 Raid Set # 003 Rebuild RaidSet
2011-02-23 20:25:40 Enc#3 Slot#24 Device Inserted
2011-02-23 20:25:39 Enc#3 Slot#21 Time Out Error
2011-02-23 20:25:34 Enc#3 Slot#24 Device Removed
2011-02-23 20:25:33 Raid Set # 003 RaidSet Degraded
2011-02-23 20:25:31 Raid Set # 003 Rebuild RaidSet
2011-02-23 20:25:31 Volume-3 Volume Failed
2011-02-23 20:25:31 Enc#3 Slot#24 Device Inserted
2011-02-23 20:25:04 Enc#3 Slot#24 Device Removed
2011-02-23 20:25:04 Raid Set # 003 RaidSet Degraded
2011-02-23 20:25:02 Volume-3 Stop Rebuilding 000:19:00
2011-02-23 20:25:02 Volume-3 Volume Failed
2011-02-23 20:24:59 Enc#3 Slot#21 Device Removed
2011-02-23 20:24:59 Raid Set # 003 RaidSet Degraded
2011-02-23 20:24:57 Volume-3 Volume Degraded
2011-02-23 20:06:02 Volume-3 Start Rebuilding
2011-02-23 20:06:00 Raid Set # 003 Rebuild RaidSet
2011-02-23 20:05:59 Enc#3 Slot#22 Device Inserted
2011-02-23 20:05:55 Enc#3 Slot#22 Device Removed
2011-02-23 20:05:54 Raid Set # 003 RaidSet Degraded
2011-02-23 20:05:52 Volume-3 Volume Degraded
2011-02-23 19:02:10 H/W Monitor Raid Powered On
 
you've posted nothing about what make/model expander if any, what exact model areca 1680 (is it really the original 1680 - those are rare), what make/model drives, what firmware on areca, what chassis make/model, what kind of cables, how your devices connect to one another, etc.
 
Uuugh ok

Chassis is a Dell PowerEdge 2900 - Model III connected to 2 x Norco 4224. Each Norco holds an HP SAS Expander (2.06 firmware) powered by a basic motherboard, itself powered by a Coolermaster 1000 power supply.

4 volumes running off the Areca, each 12 x Hitachi 2TB in a RAID-6 configuration
Volumes 0, 1 and 2 are fine.
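
As a sanity check on the sizes from the status page quoted earlier (24004.8GB for the raid set, 20004.0GB for the volume), here is a minimal sketch of the RAID-6 capacity arithmetic. The per-disk size is not stated anywhere in the thread; it is derived here by dividing the raid set size by 12, and the function name is just for illustration.

```python
def raid6_usable_gb(disk_count, disk_gb):
    """Usable capacity of a RAID-6 set: two disks' worth goes to parity."""
    if disk_count < 4:
        raise ValueError("RAID-6 needs at least 4 disks")
    return (disk_count - 2) * disk_gb


# Figures copied from the Areca status page posted above.
RAID_SET_GB = 24004.8
VOLUME_GB = 20004.0
DISKS = 12

# Assumption: every member contributes the same capacity.
per_disk_gb = RAID_SET_GB / DISKS

print(f"per disk: {per_disk_gb:.1f} GB")
print(f"expected RAID-6 volume: {raid6_usable_gb(DISKS, per_disk_gb):.1f} GB")
print(f"reported volume: {VOLUME_GB:.1f} GB")
```

The two numbers agree, so the volume spans the whole raid set and leaves exactly two disks of redundancy.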

On volume 3 I have had intermittent drive dropouts - obviously today was worse.

The Areca info page gives this:

Controller Name ARC-1680
Firmware Version V1.49 2010-12-02
BOOT ROM Version V1.49 2010-12-02
SAS Firmware Version 4.9.2.0
Serial Number Y939CABLAR200429
Unit Serial #
Main Processor 1.2GHz IOP348 C1
CPU ICache Size 32KBytes
CPU DCache Size 32KBytes/Write Back
CPU SCache Size 512KBytes/Write Back
System Memory 512MB/533MHz/ECC
Current IP Address xxx.xxx.xxx.xxx

All drives are Hitachi. Hitachi HDS722020ALA330 Firmware JKAOA3EA

To be honest I think the RAID controller is failing - as I read this off the status page the 1st volume (volume 0) has just gone to degraded with the volume state saying requires rebuild.
 
Sorry, the HP SAS Expander firmware is v2.01, not 2.06. From memory, some disks in the 1st volume said that they had experienced a timeout. There is a red mark in the Windows event viewer stating that a disk reset had occurred. The drives are hardly breaking a sweat, well cooled and all.
 
Ok, latest: the card seemed to lock up - dull beeping even with the server powered off and unplugged, so I disconnected the BBU. After connecting it again I could boot past the detecting-firmware screen. Disk 1 of raid set 0 was missing - the drive itself was marked as free. So I could activate the incomplete array and assign the drive back to it to start the rebuild. Volume 3 still fubar (sorry if this is brief, using my iPhone!)
 
Just ordered 24 Samsungs to back up some of my data and see how it goes migrating over to flex raid. Anyone have any good ideas about how to restore the data? The thought of ripping all those DVDs again is gonna make me cry :)
 
Doesn't flexraid have a 20 disk limit?
 
Quite honestly, in my experience 4 consecutive drives dropping out and then immediately reinserting is a backplane and/or expander problem and not a controller problem.
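
To check that pattern against the event log posted earlier in the thread, here is a minimal sketch that tallies "Device Removed" events per slot and finds the largest group of slots that dropped within a few seconds of each other. The log lines are copied from that post; the function names and the 10-second window are assumptions chosen for illustration, not anything Areca provides.

```python
from collections import Counter
from datetime import datetime, timedelta
import re

# Removal and timeout lines copied from the event log posted above.
EVENT_LOG = """\
2011-02-23 20:40:56 Enc#3 Slot#23 Device Removed
2011-02-23 20:40:53 Enc#3 Slot#24 Device Removed
2011-02-23 20:40:51 Enc#3 Slot#22 Device Removed
2011-02-23 20:40:49 Enc#3 Slot#21 Device Removed
2011-02-23 20:25:39 Enc#3 Slot#21 Time Out Error
2011-02-23 20:25:34 Enc#3 Slot#24 Device Removed
2011-02-23 20:25:04 Enc#3 Slot#24 Device Removed
2011-02-23 20:24:59 Enc#3 Slot#21 Device Removed
2011-02-23 20:05:55 Enc#3 Slot#22 Device Removed
"""

LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Enc#(\d+) Slot#(\d+) (.+)$"
)


def parse_events(text):
    """Return (timestamp, enclosure, slot, message) tuples, oldest first."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append(
                (when, int(match.group(2)), int(match.group(3)), match.group(4))
            )
    return sorted(events)


def removals_per_slot(events):
    """Count 'Device Removed' events for each slot."""
    return Counter(slot for _, _, slot, what in events if what == "Device Removed")


def burst_slots(events, window_seconds=10):
    """Largest set of distinct slots removed within one time window."""
    removed = [(when, slot) for when, _, slot, what in events
               if what == "Device Removed"]
    window = timedelta(seconds=window_seconds)
    best = set()
    for i, (start, _) in enumerate(removed):
        slots = {slot for when, slot in removed[i:] if when - start <= window}
        if len(slots) > len(best):
            best = slots
    return best


events = parse_events(EVENT_LOG)
print("removals per slot:", dict(removals_per_slot(events)))
print("largest burst:", sorted(burst_slots(events)))
```

On this log, every removal is in slots 21-24, and at 20:40:49-20:40:56 all four dropped within seven seconds. RAID-6 only tolerates two missing members at once, which would explain the volume going to failed rather than just degraded.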
 
That's interesting - I can check the backplane and maybe change the cable. In general, it is always disks 21-24 that drop in and out. Not sure how much in the way of logs I can get from the expander but I'll look at it - thanks!

On the other hand, a few weeks ago volume-0 disappeared entirely and I had to use the recover command to get it back. I guess I should have paid more attention then huh?! :)
 
Yeah, the fact that it's disks all on the same backplane area makes me suspect the cable or backplane as well, and I doubt it's the controller.
 
Thanks guys - will check that out once I stop swearing! The big question though is if I can get my data back? I'm probably going to image all 12 disks before I start any disk recovery unless there is something quick and painless!
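
For the imaging step, a rough sketch of the idea: copy each disk chunk by chunk into an image file, zero-fill any chunk that cannot be read, and keep a hash of what was written so the image can be verified later. This is only an illustration, assuming a source whose size can be found by seeking to its end (a regular file or a Linux block device); raw disk access on Windows works differently, and a dedicated imaging tool would handle retries and bad sectors far better. The function name is made up for this example.

```python
import hashlib
import os


def image_disk(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst in chunks.

    Returns (sha256 hex digest of the image as written, list of offsets
    whose chunk could not be read and was zero-filled instead).
    """
    bad_offsets = []
    digest = hashlib.sha256()
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want)
            except OSError:
                data = None
            if not data:
                # Unreadable chunk: record where it was and keep the
                # image the same length as the source by zero-filling.
                bad_offsets.append(offset)
                data = bytes(want)
            dst.write(data)
            digest.update(data)
            # A raw read may return fewer bytes than requested.
            offset += len(data)
    return digest.hexdigest(), bad_offsets
```

It would be called once per member disk, for example `image_disk("/dev/sdX", "slot13.img")`, where both paths are placeholders. Any recovery attempt would then run against the copies and leave the original 12 disks untouched.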
 