How do I properly GNOP my advanced format drives for ZFS?

Antoneo

Hey guys,

FreeBSD 8.2 and ZFS v15 newbie here with 6x 2TB Seagate 5900RPM drives for the zpool. This system is for home use and not yet in production.

  1. How can I tell whether a drive uses 512 byte or 4096 byte sectors from within the operating system? Or is this from datasheet information only?
  2. What is the correct sequence of commands to create a 6 drive raidz2 pool with a 4KiB aligned file system?

Here's what I did and where I wound up:
# wrote GEOM labels to disks (da0 is where FreeBSD is installed)
glabel label zdisk1 /dev/da1
glabel label zdisk2 /dev/da2
glabel label zdisk3 /dev/da3
glabel label zdisk4 /dev/da4
glabel label zdisk5 /dev/da5
glabel label zdisk6 /dev/da6

# create raidz2 zpool
zpool create tank raidz2 label/zdisk1 label/zdisk2 label/zdisk3 label/zdisk4 label/zdisk5 label/zdisk6

# check ashift, confirmed it to be 9
zdb | grep ashift
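# (if more than one pool ever shows up, limiting zdb to this pool's cached
#  config should also work - assuming this zdb build accepts -C <poolname>)
zdb -C tank | grep ashift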

# I guess start from scratch
zpool destroy tank

# use 4KiB sectors... somehow
gnop create -S 4096 /dev/label/zdisk1
gnop create -S 4096 /dev/label/zdisk2
gnop create -S 4096 /dev/label/zdisk3
gnop create -S 4096 /dev/label/zdisk4
gnop create -S 4096 /dev/label/zdisk5
gnop create -S 4096 /dev/label/zdisk6

# create new tank
zpool create tank raidz2 label/zdisk1.nop label/zdisk2.nop label/zdisk3.nop label/zdisk4.nop label/zdisk5.nop label/zdisk6.nop

# check ashift, confirmed to be 12
zdb | grep ashift

# run dd to get an idea of performance; throughput was ~40% lower than with ashift=9
# note that I ran this while in directory /tank
dd if=/dev/zero of=zerofile.000 bs=2M count=10000
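# (optional) rough sequential read check of the same file afterwards; also note
# that zeroes aren't great test data if compression is ever turned on
dd if=/tank/zerofile.000 of=/dev/null bs=2M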

# reboot machine
shutdown -r now

# once logged in, confirmed that ashift=12 stuck
# zpool status listed tank raidz2
# /dev/label no longer held *.nop entries
# for some reason /tank/zerofile.000 is now missing - I expected it to survive a reboot?

# ran dd again while in /tank, and it stopped prematurely with a "filesystem full" error
dd if=/dev/zero of=zerofile.000 bs=2M count=10000

# checked space using df, and filesystem /dev/da0s1a is at 108% capacity, no typo.
# rm /tank/zerofile.000 cleared space

Something went wrong here... The first time this happened, I went and destroyed the tank zpool and noticed that /tank still existed. I deleted the directory manually using rmdir. That is strange to me.
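A quick way to check whether /tank is really a mounted ZFS dataset or just a leftover directory sitting on the root filesystem (a small sketch; the exact mount output will vary):

# if nothing ZFS-related shows up here, /tank is only a plain directory
mount | grep tank
df /tank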

Just looking for some guidance. I'm probably missing a few major concepts as I don't completely understand what's going on here. Feel free to throw in keywords I should Google, but I'd also like to get this setup working. :)
 
Ok, answered my first question:
# you'll see the number of bytes per sector on each "da" device
dmesg | grep da

# another way, substitute appropriate disk number
diskinfo -v da1
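# in the diskinfo -v output, the "sectorsize" line is the one to check; note that
# many Advanced Format drives emulate 512-byte sectors and still report 512 here,
# so the datasheet/model number remains the more reliable reference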

Will play around with gnop and zpools and see if I can get it working without problems.

Edit:

Ok, normally I chase things until resolution, but in the interest of time, I give up. I don't think I'll be giving up too much performance, and I don't think my limited dd tests mean much beyond sequential IO. I tried the following, but the same behavior as above showed up after a reboot. I will note that this is an ESXi 4.1 VM with an Intel SASUC8I (LSI 1068e) HBA passed through.

# start with raw disks again, gnop a disk
gnop create -S 4096 /dev/da1

#create zpool
zpool create tank raidz2 da1.nop label/zdisk2 label/zdisk3 label/zdisk4 label/zdisk5 label/zdisk6

#confirmed ashift=12
zdb | grep ashift

# export the pool - this unmounts it and releases the underlying devices
zpool export tank

#get rid of the gnop device
gnop destroy da1.nop

# re-import the pool on the plain device; ashift was recorded at creation, so it is kept
zpool import tank

#confirmed ashift=12
zdb | grep ashift

# reboot
shutdown -r now

# and from here I saw the same behavior where the files created by dd are gone, and new dd writes to /tank error with "filesystem full"
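For reference, after the export / gnop destroy / import dance it's worth confirming that the pool picked the plain devices back up and kept its ashift (a quick sketch):

# members should now show up as label/zdisk* (or da1), not *.nop
zpool status tank

# ashift is set per-vdev at pool creation time, so it should remain 12
zdb | grep ashift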

I might regret this later as I learn more, but I'm just going to run with the defaults and let the 4KiB-sector drives present themselves as 512B-sector drives. Maybe the FreeBSD ZFS team will fix this in a later version.
 
Ok, just saw the same thing with a regular zpool created without the gnop step (ashift=9).

I checked the mount points ("cat /etc/fstab" and "zfs get mountpoint tank1") and didn't see anything obviously wrong.

Just did a "zfs mount -a" and dd now passes, though at a much lower rate than I previously recorded (~370MB/sec vs ~570MB/sec).
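For anyone following along, this is roughly how I checked what was actually mounted afterwards:

# list datasets with their mountpoints, and confirm the kernel sees the mount
zfs list
mount | grep tank1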

Edit: Hmm, something gets lost upon reboot - the problem is back. Here's a copy & paste since SSH is now up:

Code:
[root@zfs1ny1us1 ~]# zpool status
  pool: tank1
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        tank1             ONLINE       0     0     0
          raidz2          ONLINE       0     0     0
            label/zdisk1  ONLINE       0     0     0
            label/zdisk2  ONLINE       0     0     0
            label/zdisk3  ONLINE       0     0     0
            label/zdisk4  ONLINE       0     0     0
            label/zdisk5  ONLINE       0     0     0
            label/zdisk6  ONLINE       0     0     0

errors: No known data errors
[root@zfs1ny1us1 ~]# !dd
dd if=/dev/zero of=/tank1/zerofile.001 bs=2M count=10000

/: write failed, filesystem is full
dd: /tank1/zerofile.001: No space left on device
183+0 records in
182+0 records out
381681664 bytes transferred in 2.290498 secs (166636976 bytes/sec)
[root@zfs1ny1us1 ~]# df
Filesystem  1K-blocks    Used   Avail Capacity  Mounted on
/dev/da0s1a    659438  652260  -45576   108%    /
devfs               1       1       0   100%    /dev
/dev/da0s1e    392558   35186  325968    10%    /tmp
/dev/da0s1f   3742698 1423186 2020098    41%    /usr
/dev/da0s1d   1571726   92436 1353552     6%    /var
 
Your pool isn't mounted, as shown by df.

So when you're doing dd, you're writing to the mount directory, but nothing is mounted there, so you're filling the root partition up instead of the pool.
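A quick sketch of how to confirm it before writing anything (if these show nothing ZFS-related for that path, you're writing to the root partition):

mount | grep tank1
df /tank1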

What's the output of zfs get mountpoint?
 
Also, make sure that you have this in your /etc/rc.conf:

zfs_enable="YES"
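For example (roughly - the rc.d script name below is from a stock 8.x install):

# add the knob and mount the pool's filesystems without a full reboot
echo 'zfs_enable="YES"' >> /etc/rc.conf
/etc/rc.d/zfs start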
 
I thought that ZFS still has issues dealing with 4k drives properly, and that current workarounds are probably not that stable?
 
boydjd: here's what I saw after a reboot:

Code:
[root@zfs1ny1us1 ~]# df
Filesystem  1K-blocks    Used   Avail Capacity  Mounted on
/dev/da0s1a    659438  652262  -45578   108%    /
devfs               1       1       0   100%    /dev
/dev/da0s1e    392558   35186  325968    10%    /tmp
/dev/da0s1f   3742698 1423190 2020094    41%    /usr
/dev/da0s1d   1571726   92430 1353558     6%    /var
[root@zfs1ny1us1 ~]# zfs get mountpoint
NAME   PROPERTY    VALUE       SOURCE
tank1  mountpoint  /tank1      default

Embarrassing, but after adding boydjd's suggestion of zfs_enable="YES" into /etc/rc.conf, my pool came back online properly mounted after a reboot. Thank you very much, I now go back to benchmarking! :)
 