Questions on moving data from one pool to another

raiderj

I'm currently in the process of creating a new zpool (RAIDz1) and want to copy my current pool's data over as simply as possible. I'm not worried about downtime, so during the copy process I'm fine with any services being offline. I'd appreciate any input or suggestions on how best to do this.

My planned process:

1) Create my new zpool using 4x 3TB Advanced Format drives (512-byte logical sectors, 4096-byte physical) and /dev/disk/by-id names:
Code:
sudo zpool create -o ashift=12 newpool raidz1 ata-TOSHIBA_SN1 ata-TOSHIBA_SN2 ata-TOSHIBA_SN3 ata-TOSHIBA_SN4
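After creating it, I plan to double-check that the vdev actually ended up with ashift=12 (zdb reads the pool config; on some ZFS-on-Linux setups it may need to be pointed at the cache file with -U /etc/zfs/zpool.cache):
Code:
zpool status newpool
sudo zdb -C newpool | grep ashift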

2) Set compression on the pool - worthwhile even if the majority of my data is MKV movies?
EDIT: Switching to lz4 compression instead of the default lzjb
Code:
zfs set compression=on newpool
[B]EDIT: [/B]zfs set compression=lz4 newpool
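To confirm the property took (and to see later how much, if anything, is actually compressing), a quick check:
Code:
zfs get compression,compressratio newpool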

3) Set SMB and NFS sharing on the pool - sharing all downstream filesystems is desired
Code:
zfs set sharesmb=on newpool
zfs set sharenfs=on newpool
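The filesystems created in the next step should inherit these share settings, which can be double-checked recursively once they exist:
Code:
zfs get -r sharesmb,sharenfs newpool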

4) Now create my filesystems for media files, user data, and virtual machines that will inherit the above set properties:
EDIT: Instead of creating the filesystems, just take snapshots - the filesystems get created by the receive in the next step anyway
Code:
zfs snapshot oldpool/media@snap1
zfs snapshot oldpool/usr@snap1
zfs snapshot oldpool/vm@snap1

[B]EDIT - For a recursive snapshot: [/B]zfs snapshot -r oldpool/usr@snap1
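To make sure the recursive snapshot caught all the child filesystems:
Code:
zfs list -r -t snapshot -o name,creation oldpool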

5) Finally, send my ZFS data from my old pool to my new pool.
EDIT: Added an mbuffer to the large media filesystem transfer. The 1G buffer seems to help performance a little, though newpool can write data just about as fast as I can send it. I do see the buffer filling up at times, which I assume happens with smaller files or with files being compressed as they're written to newpool:

Code:
zfs send oldpool/media@snap1 | zfs receive newpool/media
[B]EDIT: [/B]zfs send -v oldpool/media@snap1 | mbuffer -s 128k -m 1G -o - | zfs receive -v newpool/media

zfs send oldpool/usr@snap1 | zfs receive newpool/usr
[B]EDIT: [/B]zfs send -R oldpool/usr@snap1 | zfs receive -e newpool

zfs send oldpool/vm@snap1 | zfs receive newpool/vm
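One note on mbuffer: it isn't installed by default on Ubuntu 12.04, but it should just be an apt-get away if anyone else wants to try this:
Code:
sudo apt-get install mbuffer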

EDIT: 6) After the large media snapshot move, I'll take another snapshot and do an incremental send/receive to update anything that's changed in the ~8 hours it takes to move the snapshot over:
Code:
# stop services first
service sabnzbdplus stop
service couchpotato stop
service sickbeard stop
service smbd stop

zfs unmount oldpool/media
zfs snapshot oldpool/media@snap2
zfs send -i oldpool/media@snap1 oldpool/media@snap2 | zfs receive newpool/media
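An alternative that should also work here (untested on my setup): set the old filesystem read-only instead of unmounting it, and add -F on the receive in case the destination got touched after the first transfer - without it an incremental receive refuses to apply if the destination has changed:
Code:
zfs set readonly=on oldpool/media
zfs snapshot oldpool/media@snap2
zfs send -i oldpool/media@snap1 oldpool/media@snap2 | zfs receive -F newpool/media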

7) Once the data transfers are complete, I would just need to update my services/applications to point to the new pool? Or could I rename the pool (via export/import) after unmounting/destroying my old pool and removing the drives?
EDIT: Plan to switch this to just swapping mountpoints between pools.
Code:
zpool destroy -f oldpool
zpool export newpool
zpool import newpool oldpool

[B]EDIT: [/B] Changing mountpoints instead of export/import, then deleting the snapshots:
zfs set mountpoint=/poolToRemove oldpool
zfs set mountpoint=/oldpool newpool

zfs destroy -r oldpool@snap1
zfs destroy -r oldpool@snap2

Verify all snapshots are deleted: zfs list -r -t snapshot -o name,creation oldpool
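And one last sanity check that the mountpoint swap took effect before restarting anything:
Code:
zfs list -r -o name,mountpoint oldpool newpool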

8) Now I just restart my services and all is good to go!



If it makes a difference, I'm running Ubuntu 12.04.
 
You said you wanted to create a RAIDZ2 but then your create command is a RAIDZ1. Also advanced sector drives are 4k physical.

You can use compression=lz4 without downsides really. MKVs won't compress, but LZ4 is so fast and compression is skipped anyway if a minimum ratio is not achieved that it doesn't matter.
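If you want to sanity-check that on your own data, something like this should show the ratio (dataset and file paths are just placeholders):
Code:
zfs create -o compression=lz4 newpool/comptest
cp /path/to/sample.mkv /newpool/comptest/
zfs get compressratio newpool/comptest
zfs destroy newpool/comptest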

When sending full snapshots, you don't have to create the destination dataset beforehand.
The send command should be
Code:
zfs send oldpool/media@snap1 | zfs receive newpool/media
 
Agreed on all points.
 
You said you wanted to create a RAIDZ2 but then your create command is a RAIDZ1. Also advanced sector drives are 4k physical.

I'm making a RAIDz1 to start, then making a new RAIDz2 for the final switch and redoing all the above steps. I don't have enough SATA ports to do the switch in a single step, but I didn't make that clear above.

A SMART scan on the drives shows 512b logical, 4096b physical - so these are definitely AF drives. But I have to use the ashift=12 option to force the pool to be created with the right sector alignment. At least that's my understanding.
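For anyone curious, a smartctl query will show it (device name is just an example, and the exact output wording depends on the smartctl version):
Code:
sudo smartctl -i /dev/sda | grep -i "sector size"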

You can use compression=lz4 without downsides really. MKVs won't compress, but LZ4 is so fast and compression is skipped anyway if a minimum ratio is not achieved that it doesn't matter.

It looks like lzjb is the default compression algorithm, so I would have to manually specify "zfs set compression=lz4 newpool", correct?

When sending full snapshots, you don't have to create the destination dataset beforehand.
The send command should be
Code:
zfs send oldpool/media@snap1 | zfs receive newpool/media

I also found that I had to create the snapshot first with "zfs snapshot oldpool/usr@snap1" - I had assumed zfs send would create one on the fly if it didn't already exist. If I wanted to, I could just take another snapshot and do an incremental send after the long initial copy.
 
Question - I can create a recursive snapshot, but there doesn't seem to be a simple way to receive snapshots recursively. Is there a flag I can set to do that properly? -R or -r on the receive side don't work.

Example:
1) Create recursive snapshots for my usr folder:
Code:
zfs snapshot -r oldpool/usr@snap1

2) Output shows all my snapshots:
Code:
zfs list -r -t snapshot -o name,creation oldpool
output:
oldpool/usr@snap1
oldpool/usr/username1@snap1
oldpool/usr/username2@snap1

3) But now I have to send each snapshot over individually. Fine for a few snapshots, but it would get tiresome with lots of them. Seems like there should be a better way?
Code:
zfs send oldpool/usr@snap1 | zfs receive newpool/usr
zfs send oldpool/usr/username1@snap1 | zfs receive newpool/usr/username1
zfs send oldpool/usr/username2@snap1 | zfs receive newpool/usr/username2
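It looks like pairing a recursive send (-R) with a receive that uses -e (or -d) should recreate the whole tree under the target in one command:
Code:
zfs send -R oldpool/usr@snap1 | zfs receive -e newpool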

However, I imagine there are some automation tools/scripts out there that are made to address this exact type of thing. Any suggestions? I've previously used napp-it, but not since I moved to Ubuntu. Haven't tried the Linux build of it though.
 
It's only one snapshot per dataset though. Honestly, are you really going to be moving data pool->pool with many datasets that often?
 
It's only one snapshot per dataset though. Honestly, are you really going to be moving data pool->pool with many datasets that often?

Nope, definitely don't plan to do that. Just trying to streamline things where I can.
 
6) Once the data transfer(s) are complete, I would just need to update my services/applications to point to the new pool? Or, could I do a ZFS rename function after unmounting/destroying my old pool and removing the drives?
Code:
zpool destroy -f oldpool
zpool export newpool
zpool import newpool oldpool

If it makes a difference, I'm running Ubuntu 12.04.

That is what I did when I moved drives/pools, and it made things simple. I didn't need to change my applications or the servers that accessed the pool, and it all worked well.
 
I would not rename the pools, I would just change the mountpoint.

The reason being that if you ever want to connect your old disks again you don't have a name clash.

You can set compression per directory, effectively, by creating more ZFS filesystems inside the pool. That also enables separate snapshotting. I would not rely on the compression being free in your specific case without testing it, and that test should include RAM usage. ZFS uses RAM very freely, and I suspect the trial compression pass goes through extra buffers.
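To illustrate the per-directory idea (names purely illustrative): a child filesystem gets its own compression setting and can be snapshotted separately:
Code:
zfs create -o compression=gzip newpool/usr/docs
zfs snapshot newpool/usr/docs@before-cleanup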
 
Well, except for the mental habit of using the wrong name. If he ever does need to access the old data, he can import that pool under a different temporary name.
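Something along these lines, with an arbitrary temporary name (add -f if the pool wasn't cleanly exported):
Code:
zpool import oldpool oldpool-temp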
 
Good points, I was thinking about what would happen if I needed to switch back over to my old disks for any reason.

Would it make sense to first change my old disks to a different mountpoint, e.g. /oldpoolbackup, then change my newpool to /oldpool? Then that way if I needed to switch back I could just reverse that process?
 
That is what I did when I moved drives/pools, and it made things simple. I didn't need to change my applications or the servers that accessed the pool, and it all worked well.

That's what I'd like - it saves me having to reconfigure all the apps. I think I could do the same just by switching mountpoints. That actually might be easier, since I could keep my old drives around for a bit just in case.
 
Good points, I was thinking about what would happen if I needed to switch back over to my old disks for any reason.

Would it make sense to first change my old disks to a different mountpoint, e.g. /oldpoolbackup, then change my newpool to /oldpool? Then that way if I needed to switch back I could just reverse that process?

Yeah, if you still have the old disks around so that you can make the change right now, that is how I would do it.

Personally I have no problem with a name discrepancy between pool and mountpoint, in fact I would move away from pool names that indicate a specific mountpoint or purpose. I would give the pools names that indicate the disk type or some other mental model that you use to say "this set of disks".
 
Yeah, if you still have the old disks around so that you can make the change right now, that is how I would do it.

Personally I have no problem with a name discrepancy between pool and mountpoint, in fact I would move away from pool names that indicate a specific mountpoint or purpose. I would give the pools names that indicate the disk type or some other mental model that you use to say "this set of disks".

That's a good point. I can see how having them different would make a lot of sense if you had a larger ZFS setup than what I have.
 
That's a good point. I can see how having them different would make a lot of sense if you had a larger ZFS setup than what I have.

Yeah, I do the same thing with hostnames.

Separate out the purpose and the hardware. For example, when I got my first 8-core in 2005 it got two hostnames: one as an alias for "this is the highest core count machine around" and one for "this particular hardware". The former would move to new hardware; the latter stays.

With disk arrays I do the same thing now, and it maps really well onto ZFS since you can identify the pool with the hardware and the mountpoint with the purpose.
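In ZFS terms that might look like this (names purely illustrative) - name the pool after the disks and point its mountpoint at the role:
Code:
sudo zpool create -o ashift=12 toshiba3tb raidz1 ata-TOSHIBA_SN1 ata-TOSHIBA_SN2 ata-TOSHIBA_SN3 ata-TOSHIBA_SN4
zfs set mountpoint=/tank toshiba3tb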
 
Finished moving all my data around! Everything is up and running on my new drives. I've updated my first post with the commands I used for reference in case anyone finds it useful (and for me to reference later if needed).
 