x86 ZFS partition table on OmniOS/Solaris RAIDZ clonable?

Discussion in 'SSDs & Data Storage' started by KPE, Sep 15, 2018.

  1. KPE

    KPE n00bie

    Jul 15, 2018
    Hey everyone,

    I have a question about ZFS disks, as I am not savvy in Solaris partitioning / slices and would like some input from you guys

    On a Solaris/OmniOS Intel x86 system with a RAIDZ array:
    Are the first 512 bytes, which contain the MBR and GPT partition table, on any one disk in a RAIDZ array significantly different from those on the other (same-sized) disks in that array?

    In other words, could I "clone" the first 512 bytes from one RAIDZ disk to another in the array, and it would make little difference to ZFS, because ZFS doesn't use the information at that level?
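
    The premise can be checked directly by checksumming the first 512 bytes of two array members and comparing the results. The following is a minimal sketch, assuming a POSIX shell with dd and cksum available; the function names are hypothetical and the device paths in the usage comment are placeholders.

```shell
# Hypothetical helpers: checksum sector 0 of a disk or image file.
sector0_sum() {
    # Read only the first 512-byte sector and checksum it.
    dd if="$1" bs=512 count=1 2>/dev/null | cksum
}

same_sector0() {
    # Succeeds (exit status 0) when both arguments share identical first 512 bytes.
    [ "$(sector0_sum "$1")" = "$(sector0_sum "$2")" ]
}

# Example (placeholder device names, read-only):
#   same_sector0 /dev/sdi /dev/sdj && echo "sector 0 identical"
```

    Both helpers only read from their arguments, so they are safe to run against the original drives or against ddrescue images.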

    I am asking because I have a RAIDZ2 array of eight 3TB drives, where 3 of the drives got their sector 0 corrupted in less than 24 hours, leading to loss of the pool.

    When I subsequently imaged the bad drives using ddrescue I could see a small scattering of other bad sectors on the drives, but I don't think these bad sectors are what made me lose the drives from the ZFS pool.

    During the 24-hour crash window in which the drives began to drop out, the final nail in the coffin, the point at which the failing drives were no longer considered for the ZFS pool, was sector 0 becoming unreadable, taking the MBR/partition table with it.

    I have imaged all the original drives to a replacement set of 8 drives. 5 of those are perfect images, and 3 have a blank sector 0 (because it was unreadable on the 3 failing originals) and a small scattering of blank blocks, which I am counting on being repairable if I can get OmniOS to recognize the drives again.
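
    Which of the imaged drives actually have an empty sector 0 can be confirmed before changing anything. This is a minimal sketch, assuming that "blank" means zero-filled and that a POSIX shell with dd and cksum is available; the function name is hypothetical.

```shell
# Hypothetical helper: succeeds when the first 512 bytes are all zero.
sector0_is_blank() {
    # Checksum of one sector of zeros, used as the reference value.
    want=$(dd if=/dev/zero bs=512 count=1 2>/dev/null | cksum)
    # Checksum of the first sector of the disk or image under test.
    got=$(dd if="$1" bs=512 count=1 2>/dev/null | cksum)
    [ "$got" = "$want" ]
}

# Example (placeholder device names, read-only):
#   for d in /dev/sda /dev/sdb /dev/sdc; do
#       sector0_is_blank "$d" && echo "$d: sector 0 is blank"
#   done
```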

    I am hoping that cloning the first 512 bytes (sector 0) from one of the good drives to the 3 drives with a blank sector 0 will make those drives visible to OmniOS, so that they are passed on to the ZFS layer, which will then look at the ZFS data structures located inside the Solaris partition.

    Running fdisk on my Linux recovery system shows me the drives are identical except for the GUID identifier. I am not 100% sure whether ZFS uses the GUID in any form or function, but I imagine the OS might, and it could be a problem if there are multiple disks with identical GUIDs. On the other hand, it has never been an issue for me when working with multiple cloned drives on Linux before.
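
    For context, on a GPT disk the "Disk identifier" that fdisk prints is stored in the GPT header at LBA 1 (byte offset 56 within that header), while sector 0 holds only the protective MBR. The sketch below dumps the header signature and the raw GUID bytes. It assumes 512-byte logical sectors, as fdisk reports for these drives, plus a POSIX shell with dd, od and tr; the function names are hypothetical. The raw bytes are stored mixed-endian, so the first three groups appear byte-reversed compared with the text form fdisk shows.

```shell
# Hypothetical helpers: inspect the primary GPT header at LBA 1.
gpt_signature() {
    # The 8-byte signature at the start of LBA 1 should read "EFI PART".
    dd if="$1" bs=1 skip=512 count=8 2>/dev/null
}

gpt_disk_guid_raw() {
    # The disk GUID is 16 raw bytes at offset 56 of the header; print them as hex.
    dd if="$1" bs=1 skip=$((512 + 56)) count=16 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
    echo
}

# Example (placeholder device name, read-only):
#   gpt_signature /dev/sdi; echo
#   gpt_disk_guid_raw /dev/sdi
```

    If the signature is intact on a drive whose sector 0 is blank, the GUID on that drive was not lost along with the protective MBR.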

    Running prtvtoc on the drives on my OmniOS box also shows that the Solaris labels are identical across the various disks.

    Any input / suggestions are much appreciated.

    My original post about my pool failure is here:

    If/when I eventually get the pool recovered, I will update that with the outcome, but thought it best to create a new thread with this partition/MBR specific question



    root@res:~# fdisk -l /dev/sdi
    Disk /dev/sdi: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disklabel type: gpt
    Disk identifier: 82C93EAB-8949-BECF-DC7D-9069F18EC046

    Device          Start        End    Sectors  Size Type
    /dev/sdi1         256 5860516750 5860516495  2.7T Solaris /usr & Apple ZFS
    /dev/sdi9  5860516751 5860533134      16384    8M Solaris reserved 1

    Partition 9 does not start on physical sector boundary.


    root@zfs:/dev/rdsk# prtvtoc /dev/rdsk/c3t0d0s2
    * /dev/rdsk/c3t0d0s2 EFI partition map
    * Dimensions:
    * 512 bytes/sector
    * 195371568 sectors
    * 195371501 accessible sectors
    * Flags:
    * 1: unmountable
    * 10: read-only
    * Unallocated space:
    *       First     Sector      Last
    *      Sector      Count    Sector
    *          34        222       255
    *                          First     Sector      Last
    * Partition  Tag  Flags   Sector      Count    Sector  Mount Directory
          0       4    00        256  195354895 195355150
          8      11    00  195355151      16384 195371534
  2. KPE

    KPE n00bie

    Jul 15, 2018
    Alright - looks like this story is going to have a happy ending.

    On my Linux rescue system I used "dd if=/dev/sdi of=/rescued/avm5.mbr bs=512 count=1" to grab the MBR from one of the 5 good disks.

    I then wrote it to each of the 3 partially recovered drives that had blank MBRs ("dd if=/rescued/avm5.mbr of=/dev/sdX", run once each for sda, sdb and sdc).
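
    A slightly more defensive version of that write step might look like the sketch below. It assumes a POSIX shell with dd and cksum; the function name and the backup path in the usage comment are hypothetical. It saves the target's current sector 0, writes exactly one sector, and reads it back to verify.

```shell
# Hypothetical helper: copy sector 0 from src to dst, keeping a backup of dst's old sector 0.
clone_sector0() {
    src=$1; dst=$2; backup=$3
    # Keep a copy of the target's current sector 0 before overwriting it.
    dd if="$dst" of="$backup" bs=512 count=1 2>/dev/null || return 1
    # Write exactly one sector; conv=notrunc stops an image file from being truncated.
    dd if="$src" of="$dst" bs=512 count=1 conv=notrunc 2>/dev/null || return 1
    # Read the sector back and compare it with the source.
    [ "$(dd if="$src" bs=512 count=1 2>/dev/null | cksum)" = \
      "$(dd if="$dst" bs=512 count=1 2>/dev/null | cksum)" ]
}

# Example (this WRITES to the target devices; the backup path is a placeholder):
#   for d in a b c; do
#       clone_sector0 /rescued/avm5.mbr /dev/sd$d /rescued/sd$d.sector0.bak
#   done
```

    Limiting the write to bs=512 count=1 means a larger source file cannot spill past sector 0 on the target.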

    I inserted the 5 good imaged drives, and then the 3 drives with the cloned MBRs, into my OmniOS box.

    The drives were detected, but I received warnings that the primary disk label was corrupt and that the backup would be used. I don't know if it was necessary, but I fixed this with the format command (ran format, selected each of the 3 disks, and issued the "backup" command to restore the backup disk label).

    I then ran zpool import (my heart skipped a beat when I saw my failed pool), followed by zpool import avm, and after about 60 seconds my pool was back online.

    Busy transferring my data to secondary storage, and this should be the culmination of 2 months of patience and not panicking

    Have a nice weekend everyone!

    And a big cheers!

    root@zfs:/dev/rdsk# zpool import
    pool: avm
    id: 4688197856225759405
    state: DEGRADED
    status: One or more devices contains corrupted data.
    action: The pool can be imported despite missing or damaged devices. The
    fault tolerance of the pool may be compromised if imported.
    see: http://illumos.org/msg/ZFS-8000-4J

    avm                        DEGRADED
      raidz2-0                 DEGRADED
        c0t5000CCA228C0AB28d0  ONLINE
        c0t5000CCA228C34A64d0  ONLINE
        c0t5000CCA228C32757d0  ONLINE
        c0t5000CCA228C17999d0  ONLINE
        c0t5000CCA228C2F141d0  ONLINE
        c0t5000CCA228C0A65Cd0  ONLINE
        c0t5000CCA228C0B028d0  FAULTED  corrupted data
        c0t5000CCA228C0A7C5d0  ONLINE