Any mdadm gurus here?

dulcificum

A few days ago my QNAP (a RAID-6 array) randomly stopped responding, although the samba share still seemed to be up. After about 5-10 minutes of this it rebooted itself and asked me to run a disk check (e2fsck) when it came back up. I did this; it reported complete after about an hour and everything looked fine in the admin interface. However, the services had not restarted at all, so I rebooted through the admin panel.

Now I can't access the NAS at all through the admin panel or even ping it, and the LCD display says "Config. Disks? >RAID 6". This means the RAID is no longer recognised, and in fact all my system settings seem to be gone. When I SSH in, all the shares have disappeared :/

Code:
[/] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
unused devices: <none>

I've run some commands to look at the array:

Code:
[/] # mdadm --examine --scan
ARRAY /dev/md9 level=raid1 num-devices=6 UUID=2cadcb78:68f1db30:09651c03:5ac6be59
ARRAY /dev/md6 level=raid1 num-devices=2 UUID=6300b0f8:b6e2c998:3628a2c8:465c1cc5
   spares=4
ARRAY /dev/md0 level=raid6 num-devices=6 UUID=22fd322d:10048ef5:8f36ad6a:b0ea696b
ARRAY /dev/md13 level=raid1 num-devices=6 UUID=04d49597:cfd44b87:f4ae2d46:f222307c

[/] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
unused devices: <none>

Now someone is suggesting my best hope is to run
Code:
mdadm --assemble /dev/md0 --verbose
to try to reassemble the RAID-6.

Is this wise? Could it harm my chances of recovering the data? Is there anything else I should try first? My first priority is to not perform any destructive commands.

Any help or advice greatly appreciated!
 
My recollection is that you won't go wrong by trying to assemble things in the wrong order with mdadm. You'll want to try mounting the array read-only first, though: "mount -o ro /dev/md0 /mnt/foo"
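I.e. roughly this sequence, where /mnt/foo is just a placeholder mount point (use whatever path you like):
Code:
# mdadm --assemble /dev/md0 --verbose    (the assemble command suggested to you)
# mkdir -p /mnt/foo
# mount -o ro /dev/md0 /mnt/foo
The -o ro keeps the filesystem read-only while you check whether the data is intact.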
 
The mount stage comes after assembling with mdadm? Why might they be in the wrong order when the drives haven't moved? Sorry but I'm pretty clueless about this...

The normal mount command would be # mount -t ext4 /dev/md0 /share/MD0_DATA. Is it just a case of adding -o ro to this?
 
If
Code:
mdadm --assemble --scan
does not work you can try
Code:
mdadm --assemble /dev/mdX insert_all_member_devices_here
You can find the right members by executing
Code:
mdadm --examine apparent_member_device
and looking at the raid label or UUID.
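For example, something like this (the device name is just a guess, substitute whatever partitions your box actually has):
Code:
# mdadm --examine /dev/sda3 | grep -E 'UUID|Raid Level'
All members of the same array should report the same UUID.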

EDIT: I'm not sure that this will always be read-only. I assume that --force is required if the operation can be unsafe.
 
omniscience - thanks so much for your response. I'm afraid I need a little more handholding here even after poring over man pages. I've posted my results from mdadm --examine --scan above. Is it safe to just try running mdadm --assemble --scan and then mounting with something like mount -t ext4 -o ro /dev/md0 /share/MD0_DATA?

How do I find the member devices? This is a 6-drive RAID-6, but I don't know how to find details about the individual drives. What would the member devices look like? Are we talking /dev/sda[0-5]? Or [1-6]? Or something else entirely? And is it worth tacking on a --verbose?

What can I safely try with mdadm --examine? And how do I use the UUIDs?

Thanks so much.
 
What you posted was '--examine --scan', not '--assemble --scan'. I assume that it may not help if it did not assemble automatically in the first place. --examine used with member devices should output some metadata (or was it --detail ?). In any case --detail and --examine are read-only. Which devices you have to use can vary. I would assume something like /dev/sd[abcdef]. How many drives are in there? The first output looks awfully inconsistent if there are only 6 drives.
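If you want a quick overview you could loop over the likely members, something like this (partition number 3 assumed from your --examine --scan output above):
Code:
# for d in /dev/sd[abcdef]3; do echo $d; mdadm --examine $d | grep -E 'UUID|Events|State'; done
That only reads the superblocks, so it won't change anything on the disks.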
 
Also, what's the point of "-o ro"? The mdadm man page says: -o, --readonly  mark array as readonly.

It doesn't mention the ro part.

edit - I see, that's for mount only.
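So if I understand it right, these are two different things (guessing at the exact forms here):
Code:
# mount -o ro -t ext4 /dev/md0 /share/MD0_DATA   (read-only mount of the filesystem)
# mdadm --readonly /dev/md0                      (marks the md array itself read-only)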
 
Sorry, my bad. There are six drives in there. IIRC, QNAP uses all the drives to make several other RAID arrays for firmware and settings; my data array is a RAID-6 across all six drives.

So I can safely just run mdadm --assemble --scan and see what it does?


edit, also got this:
Code:
# mdadm --examine --scan --verbose
ARRAY /dev/md9 level=raid1 num-devices=6 UUID=2cadcb78:68f1db30:09651c03:5ac6be59
   devices=/dev/sdf1,/dev/sde1,/dev/sdd1,/dev/sdc1,/dev/sdb1,/dev/sda1
ARRAY /dev/md6 level=raid1 num-devices=2 UUID=6300b0f8:b6e2c998:3628a2c8:465c1cc5
   spares=4   devices=/dev/sdf2,/dev/sde2,/dev/sdd2,/dev/sdc2,/dev/sdb2,/dev/sda2
ARRAY /dev/md0 level=raid6 num-devices=6 UUID=22fd322d:10048ef5:8f36ad6a:b0ea696b
   devices=/dev/sdf3,/dev/sde3,/dev/sdd3,/dev/sdc3,/dev/sdb3,/dev/sda3
ARRAY /dev/md13 level=raid1 num-devices=6 UUID=04d49597:cfd44b87:f4ae2d46:f222307c
   devices=/dev/sdf4,/dev/sde4,/dev/sdd4,/dev/sdc4,/dev/sdb4,/dev/sdareal4
 
I can't guarantee anything, but --assemble --scan should be safe in my opinion. If you want to do a proper, safe recovery you have to make block-level backups first.
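Something along these lines for each drive, assuming you have somewhere big enough to put the images (the backup path here is made up):
Code:
# dd if=/dev/sda of=/mnt/backup/sda.img bs=1M conv=noerror,sync
ddrescue is the nicer tool for this if it's available, since it copes better with read errors.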
 
Just tried it:

Code:
# mdadm --assemble --scan --verbose
mdadm: No arrays found in config file

Also tried this:

Code:
 # mdadm --assemble /dev/md0 --verbose
mdadm: /dev/md0 not identified in config file.

But --examine clearly shows /dev/md0????
 
More info, in case it helps figure out what to do next :/

Code:
# mdadm --examine --verbose /dev/md0
mdadm: No md superblock detected on /dev/md0.

# mdadm --examine --verbose /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd683 - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3

# mdadm --examine --verbose /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd695 - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       19        1      active sync   /dev/sdb3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3

# mdadm --examine --verbose /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd6a7 - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       35        2      active sync   /dev/sdc3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3

# mdadm --examine --verbose /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd6b9 - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       51        3      active sync   /dev/sdd3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3

# mdadm --examine --verbose /dev/sde3
/dev/sde3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd6cb - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       67        4      active sync   /dev/sde3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3
   
# mdadm --examine --verbose /dev/sdf3
/dev/sdf3:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 22fd322d:10048ef5:8f36ad6a:b0ea696b
  Creation Time : Fri Feb  1 18:11:32 2013
     Raid Level : raid6
  Used Dev Size : 2928697600 (2793.02 GiB 2998.99 GB)
     Array Size : 11714790400 (11172.09 GiB 11995.95 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0

    Update Time : Tue Sep  3 14:05:55 2013
          State : clean
Internal Bitmap : present
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6ecdd6dd - correct
         Events : 0.6173635

     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       83        5      active sync   /dev/sdf3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3
   5     5       8       83        5      active sync   /dev/sdf3

So can I try a command like mdadm --assemble /dev/mdX insert_all_member_devices_here?

What should it be? mdadm --assemble /dev/md0 sd[a-f]3? mdadm --assemble /dev/md0 sd[f-a]3? Something else entirely?
 
--examine just reads some metadata blocks.

You can do a
Code:
mdadm --assemble /dev/md0 /dev/sd[abcdef]3
You can add --force if this does not work.

But I just have to add the usual disclaimer and recommend making backups before you start playing around.

EDIT: You are too fast :)
The --examine output looks good; the event count is consistent across all members. I can't really explain what went wrong.
 
Code:
# mdadm --assemble /dev/md0 /dev/sd[abcdef]3
mdadm: /dev/md0 has been started with 6 drives.

Now I try

mount -o ro /dev/md0 /share/MD0_DATA?

Or do I need to sort the other arrays (md9, md6, md13) first?
 
Thanks for your patience so far, this is what I'm thinking now:

Code:
# mdadm --assemble /dev/md9 /dev/sd[abcdef]1 (should I try mdadm --assemble /dev/md9 --verbose instead???)
# mdadm --assemble /dev/md6 /dev/sd[abcdef]2
# mdadm --assemble /dev/md13 /dev/sdareal4,/dev/sd[bcdef]4 (is this the correct order at all??????)

then

Code:
# mount -o ro -t ext4 /dev/md0 /share/MD0_DATA
# mount -o ro -t ext4 /dev/md9 /something? (anyone know where these should be mounted????)
# mount -o ro -t ext4 /dev/md6 /something?
# mount -o ro -t ext4 /dev/md13 /something?
 
I have no idea what sdareal4 is, maybe some implementation detail of the QNAP. You have to separate the devices with spaces instead of commas. The three assemble commands look okay. Please post a 'cat /proc/mdstat' and the end of 'dmesg' so we can see what the md driver did.
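In other words, something like this for the last one (keeping your odd sdareal4 device name as-is):
Code:
# mdadm --assemble /dev/md13 /dev/sdareal4 /dev/sd[bcdef]4
# cat /proc/mdstat
# dmesg | tail -n 50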
 
Code:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : active raid6 sda3[0] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1]
      11714790400 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/175 pages [0KB], 8192KB chunk

unused devices: <none>

Code:
md: md0 stopped.
md: bind<sdb3>
md: bind<sdc3>
md: bind<sdd3>
md: bind<sde3>
md: bind<sdf3>
md: bind<sda3>
raid5: device sda3 operational as raid disk 0
raid5: device sdf3 operational as raid disk 5
raid5: device sde3 operational as raid disk 4
raid5: device sdd3 operational as raid disk 3
raid5: device sdc3 operational as raid disk 2
raid5: device sdb3 operational as raid disk 1
raid5: allocated 100928kB for md0
0: w=1 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
5: w=2 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
4: w=3 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
3: w=4 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
2: w=5 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
1: w=6 pa=0 pr=6 m=2 a=2 r=6 op1=0 op2=0
raid5: raid level 6 set md0 active with 6 out of 6 devices, algorithm 2
RAID5 conf printout:
 --- rd:6 wd:6
 disk 0, o:1, dev:sda3
 disk 1, o:1, dev:sdb3
 disk 2, o:1, dev:sdc3
 disk 3, o:1, dev:sdd3
 disk 4, o:1, dev:sde3
 disk 5, o:1, dev:sdf3
md0: bitmap initialized from disk: read 11/11 pages, set 0 bits
created bitmap (175 pages) for device md0
md0: detected capacity change from 0 to 11995945369600

Why is it saying raid5 so much? I'm going to go ahead with assembling the other arrays.

Code:
 # mdadm --assemble /dev/md9 /dev/sd[abcdef]1
mdadm: /dev/md9 has been started with 6 drives.
# mdadm --assemble /dev/md6 /dev/sd[abcdef]2
mdadm: /dev/md6 has been started with 2 drives and 4 spares.

# mdadm --assemble /dev/md13 /dev/sdareal4 /dev/sd[bcdef]4
mdadm: /dev/md13 has been started with 6 drives.

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md13 : active raid1 sda4[0] sdf4[5] sde4[4] sdd4[3] sdc4[2] sdb4[1]
      458880 blocks [6/6] [UUUUUU]
      bitmap: 0/57 pages [0KB], 4KB chunk

md6 : active raid1 sda2[0] sdc2[5](S) sdd2[4](S) sde2[3](S) sdf2[2](S) sdb2[1]
      530048 blocks [2/2] [UU]

md0 : active raid6 sda3[0] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1]
      11714790400 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/175 pages [0KB], 8192KB chunk

md9 : active raid1 sda1[0] sde1[5] sdf1[4] sdd1[3] sdc1[2] sdb1[1]
      530048 blocks [6/6] [UUUUUU]
      bitmap: 0/65 pages [0KB], 4KB chunk

unused devices: <none>

One problem though. Just got this:

Code:
# mount -o ro -t ext4 /dev/md0 /share/MD0_DATA
mount: mount point /share/MD0_DATA does not exist

Cheers,
 
"raid5" is just the driver name. I see nothing here that points to a problem. It is possible that /share/MD0_DATA is part of another not yet mounted filesystem. Please post your /etc/fstab.
 
Code:
 # cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount pt>     <type>   <options>         <dump> <pass>
/dev/ram       /              ext2      defaults         1      1
proc            /proc          proc     defaults          0      0
none            /dev/pts        devpts  gid=5,mode=620  0       0

Cool. Now what then?

For reference, this is what someone else gets for # mount | grep -v qpkg and # cat /proc/mdstat

Code:
/proc on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
sysfs on /sys type sysfs (rw)
tmpfs on /tmp type tmpfs (rw,size=64M)
none on /proc/bus/usb type usbfs (rw)
/dev/sda4 on /mnt/ext type ext3 (rw)
/dev/md9 on /mnt/HDA_ROOT type ext3 (rw,data=ordered)
/dev/md0 on /share/MD0_DATA type ext4 (rw,usrjquota=aquota.user,jqfmt=vfsv0,user_xattr,data=ordered,delalloc,acl)
nfsd on /proc/fs/nfsd type nfsd (rw)
tmpfs on /.eaccelerator.tmp type tmpfs (rw,size=32M)
tmpfs on /var/syslog_maildir type tmpfs (rw,size=8M)


Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md0 : active raid5 sda3[0] sdd3[3] sdc3[2] sdb3[1]
      5855836608 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 5/233 pages [20KB], 4096KB chunk

md4 : active raid1 sdd2[2](S) sdc2[3](S) sdb2[1] sda2[0]
      530048 blocks [2/2] [UU]

md13 : active raid1 sda4[0] sdb4[3] sdd4[2] sdc4[1]
      458880 blocks [4/4] [UUUU]
      bitmap: 0/57 pages [0KB], 4KB chunk

md9 : active raid1 sda1[0] sdd1[3] sdb1[2] sdc1[1]
      530048 blocks [4/4] [UUUU]
      bitmap: 0/65 pages [0KB], 4KB chunk

unused devices: <none>

Not sure if that helps...
 
Code:
# mkdir -p /share/MD0_DATA 

# mount -o ro -t ext4 /dev/md0 /share/MD0_DATA

BOOM!

Code:
 # cd MD0_DATA/
[/share/MD0_DATA] # ls
Network Recycle Bin/ Qusb/                install/
Public/              Qweb/                lost+found/
Qdownload/           aquota.user          sys/
Qmultimedia/         audio/               video/
Qrecordings/         files/

You guys are heroes!
 
Okay, so what about the other mdX arrays? And getting the whole system back online? Should I reboot or something?

I have no idea how to get services running again and get back into the admin panel...
 
You can check whether the mdadm.conf still has the right entries. The worst that could happen is that you have to do everything again. I don't know enough about the internals of the QNAP to be of any help here; it seems that some other script/config file apart from fstab is responsible for mounting the filesystems.
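On a stock Linux box you would compare whatever is in the config against what the superblocks say, something like this (I don't know where the QNAP keeps its mdadm.conf; /etc/mdadm.conf or /etc/config/mdadm.conf would be my guesses):
Code:
# mdadm --examine --scan       (shows what the ARRAY lines should look like)
# cat /etc/config/mdadm.conf
Both commands are read-only, so they can't make anything worse.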
 
Erm....

This is what I get:

Code:
 # cat /etc/config/mdadm.conf
cat: /etc/config/mdadm.conf: No such file or directory
 
Just wanted to come back to thank those who helped - reinstalling the firmware got me exactly back to where I was. Yay! :)
 