ZFS pool version dilemma with napp-it upgrade, data lost?

j1mmy

n00b
Joined
Jan 12, 2012
Messages
21
Hi,

I have a weird problem with my pools, and my data seems to be inaccessible. Let me describe it step by step:

My old zpool version was v28, with zfs version 5. I created it "ages" ago and ran it on Solaris 11 for about a year - no upgrade was done in that time. What I did do during that year was create a new zpool/zfs (apparently zpool v28 with zfs version 6). Then I wanted to move away from Solaris (mostly due to the missing power management for my drives) and switched to napp-it-14a, which runs OmniOS as its base system. I guess I was a bit quick typing the upgrade commands, and my pools were upgraded to a newer zpool version ("-", i.e. 5000?). Good news: I can access all data that was previously on v28/v5.
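
For reference, the version checks look something like this (pool and dataset names are examples, not my actual ones):

  zpool get version tank        # OpenZFS feature-flag pools report "-" (internally 5000)
  zfs get version tank/data     # the per-dataset zfs version (5 or 6 in my case)
  zpool upgrade -v              # pool versions/features this OS supports
  zfs upgrade -v                # zfs filesystem versions this OS supports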

What I cannot access is data from the aforementioned zfs v6 datasets, since OmniOS only understands zfs v5 data. When I boot Solaris, I cannot read any data at all (the "-"/5000 pool is not compatible with v31/v28). I used to have monthly/weekly/daily zfs snapshots; all of them are inaccessible now.

I know that a pool downgrade to v28 is not possible. Is there any way to access my data again?
 
I had something similar happen on Solaris. At some point, new volumes in Solaris stopped inheriting the ZFS version from their parent and started getting created at the most recent ZFS version. Luckily I noticed before upgrading the pool version, so I could copy the data out.

Unfortunately, I can't think of a way for you to access your data now. Hopefully you have a backup; if not, you now know why they say "RAID is not backup".
 
apnar, you nailed it. That's exactly what happened.

I have limited knowledge of the details, but these are the options I can put together:

Option A: Wait until OmniOS can read zfs v6. Probably not going to happen, since it's not open source :-( Thank you, Oracle.

Option B: Wait until Oracle can read v5000 - not going to happen either.

Option C: Find out what steps OmniOS took to upgrade from v28 to 5000. The zpool upgrade took only a couple of seconds, so I guess there is just a version number change somewhere on the HDD (see the zdb sketch after this list). The data itself doesn't seem to be touched; rewriting 4TB drives would take hours, if not days.

Option D: Write a tool (basically a dd command on the disk) and try to reconstruct the data. Most of the data was uncompressed in ZFS, so it is probably raw on the hard drive.
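
For Option C, the pool version is stored in the vdev labels, and zdb can dump them without writing anything (the device path is just an example):

  # print the four vdev labels of a pool member; the output includes the "version" field
  zdb -l /dev/dsk/c2t0d0s0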

I can't imagine, though, that I'm the very first one who needs to downgrade a ZFS version. The data has been untouched since the upgrade. Yes, shame on me that my "backups" were snapshots - I expected the worst case to be a failed drive.
 
ZFS from the OpenZFS alliance and Oracle ZFS are nowadays incompatible, due to Oracle's closed-source strategy, even at the level of the base ZFS file structure, so you (mostly) need to copy all files over on a platform change.

I would not expect a solution, but you may ask on the illumos IRC channel and the illumos mailing list:
http://echelog.com/logs/browse/illumos/
 
Not sure if I understood your problem correctly, but as far as I understand:
1) You can access your ZFS v5000 pool under OmniOS perfectly fine.
2) You can access your ZFS v30 pool under Oracle perfectly fine.

Now, I'm not sure if you can export your ZFS snapshots to an older version of the pool (v28). Based on your comment about neither OS understanding the other's pool format, it seemed that it should be possible.

j1mmy said:
When I start with Solaris, I cannot read any data ("-"/5000 not compatible with v31/v28). I used to have monthly/weekly/daily zfs snapshots, all inaccessible now.
I thought you wanted to move away from Solaris. I say this because it sounds as if you want to read data from OmniOS on Solaris. Where are your snapshots? On v30/Solaris or on v5000/OmniOS?

If that's the case, I'd simply create a new v28 pool on a fresh drive/vdev (you didn't mention how big your pool is), copy your snapshots over from Oracle v30, then go into OmniOS and import & upgrade your snapshots to v5000. Some people on the net have reported successfully sending v5000 snapshots to v28, so maybe it also works from v30 to v28.
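
Creating the pool pinned at the old version should be as simple as this (pool name and device are examples):

  # create a pool at legacy version 28 instead of the OS default
  zpool create -o version=28 rescue c3t0d0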

You just need to get a new HDD (or a set of them :confused:) and try this out, if your snapshots are valuable enough to you.
Hope this helps.
 
luckylinux, that is almost correct:
1) I can access my upgraded v5000 pools under OmniOS perfectly fine, but I cannot read the zfs v6 datasets.
2) I cannot mount the pools under Oracle Solaris 11.1 at all.

BUT: based on your post, I tried the following. Note: everything runs under ESXi, so quickly setting up VMs is not an issue, and neither are new hard drives - I have some old ones lying around.

1) created a Solaris 11.1 VM and attached a spare M1015 controller to it for a new set of 8 hard drives.
2) created a new pool there (v34; it doesn't matter, by the way).
3) created a new dataset: zfs v6 (the default).
4) logged into napp-it on OmniOS,
5) imported the upgraded v5000 pool read-only,
6) confirmed the zfs v6 dataset there is not readable (directory empty), and then
7) started the zfs send ds@lasthourlysnapshot | ssh oraclehost zfs recv pipeline (full sketch below).
8) after a couple of hours the first dataset was moved over, and it is there on Oracle!!!
9) I'm trying the mirrored-4TB pool right now; this will take some time, but I can say that it looks very promising so far. Will report back.
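
For the record, the commands in steps 5 and 7 looked roughly like this (pool, dataset, snapshot, and host names are examples):

  # on OmniOS: import the upgraded pool read-only so nothing more gets written
  zpool import -o readonly=on tank
  # push the last snapshot of the unreadable v6 dataset over to the Solaris VM
  zfs send tank/data@lasthourly | ssh root@oraclehost zfs recv -F newpool/data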

It's almost unbelievable that zfs send / receive doesn't check more and just blindly works :)

The plan for the last step is to simply rsync the data from Oracle to the desired place on OmniOS.
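
Something along these lines should do for that last step (paths and host are examples):

  # pull the recovered files from the Solaris VM into their final place on OmniOS
  rsync -aH --progress root@oraclehost:/newpool/data/ /tank/data/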
 
Congratulations!
And thanks for the valuable report.
 
The 4TB recovery is still under way, but the other affected datasets were recovered successfully.

Funny that I still had a month-old snapshot of the apparently lost data, taken before I created the zfs v6 dataset and moved everything to the new set of drives. So, in the worst case, I would not have lost that much data...

I don't know if anybody else does this, but I configured napp-it's autoservice for yearly snapshots, monthly snapshots (keep 12), weekly snapshots (keep 4), daily snapshots (keep 7), and hourly snapshots (keep 24). Just a set-and-forget configuration. I've often needed the hourly snapshots after accidentally deleting files.
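
For anyone without napp-it: a rough hand-rolled sketch of the hourly tier could look like this (dataset name and retention count are examples; this is not napp-it's actual mechanism):

  #!/bin/sh
  # take an hourly snapshot and keep only the newest 24
  DS=tank/data
  KEEP=24
  zfs snapshot "$DS@hourly-$(date +%Y%m%d-%H%M)"
  # list hourly snapshots oldest-first and destroy everything beyond $KEEP
  COUNT=$(zfs list -H -t snapshot -o name -s creation -d 1 "$DS" | grep -c "@hourly-")
  EXTRA=$((COUNT - KEEP))
  if [ "$EXTRA" -gt 0 ]; then
      zfs list -H -t snapshot -o name -s creation -d 1 "$DS" \
          | grep "@hourly-" \
          | head -n "$EXTRA" \
          | while read SNAP; do zfs destroy "$SNAP"; done
  fi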

The new plan is to build a second backup server and send incremental snapshots to it. The machine will be in another room. I wish I could put it in a remote location, but it's nearly impossible to back up tens of TB online.
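
The incremental replication would go roughly like this (snapshot and host names are examples):

  # one-time full send to seed the backup pool
  zfs send tank/data@base | ssh backuphost zfs recv backup/data
  # afterwards, only send the delta between the last two snapshots
  zfs send -i @base tank/data@daily-20120601 | ssh backuphost zfs recv backup/data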

I'm not very active at posting in this forum, but I read it on a daily basis. I'm so thankful that you guys helped me solve the problem!
 
j1mmy said:
1) I can access my upgraded v5000 pools under OmniOS perfectly fine, but I cannot read the zfs v6 datasets.
2) I cannot mount the pools under Oracle Solaris 11.1 at all.
[...]

Sorry that I couldn't reply before.

Now you've lost me there. My plan was for you to transfer the data by creating a ZFS pool that both systems can understand, saving the snapshots from Oracle Solaris there, and then loading them into OmniOS.
What you did by creating a bleeding-edge ZFS v34 pool makes it even harder for OmniOS to understand.

In fact: are you transferring snapshots from Oracle to OmniOS, or from OmniOS to Oracle? :confused:

I'm glad I helped you solve your problem, although I think you did it in a whole other way than what I thought of :D
 