OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Not to be pedantic, but ZFS doesn't serve NFS itself; the underlying (host) OS does. So it depends on whether your ZFS platform is OpenSolaris, FreeBSD, or whatever. I believe the OpenSolaris variants can change the version offered (v4 vs v3 etc.).
 
I believed I finally had it running stable... new mainboard, new LSI controller... everything went fine, until yesterday.

Came home from work, wanted to watch a movie. Started my HTPC, browsed my NAS and started watching something. Changed my mind after 5 minutes and wanted to see something else, so I closed the currently playing film and chose another one, which didn't start playing. Closed my player and tried again, to no avail. Pinged the NAS, and it responded fine. Connected to napp-it, and it couldn't show me the output of 'zpool status' entirely - it showed the pools only, but no details. Wanted to reboot the NAS, and it hung again on shutdown/restart (see my previous posts about this) with no obvious error message. Had to cycle the server power. Everything showed OK again on reboot, and the system went back to normal. No errors at all on disks or pools!

Going nuts here. Can anyone please guide me on how to troubleshoot this? Which logs should I check? I am on DESCON (Desperation Condition) 1!
 
You could start by having a look in /var/adm/messages for any clues - see if anything of any interest was logged there around the time you had your issues!
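If it happens again, the usual places to look are roughly these (standard tools, nothing specific to your box):
Code:
# recent kernel/driver messages
tail -100 /var/adm/messages

# anything the fault manager has flagged
fmadm faulty
fmdump -e

# pools with known problems, and per-device error counters
zpool status -x
iostat -En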
 
Can anybody give me any advice on syncing UID and GID between Solaris Express and Ubuntu/Debian? (re: NFS)

I'm not running NIS or anything fancy, so I can sync the UID manually but what about the GID?

Ubuntu/Debian creates a separate group (GID) for every user, but Solaris uses the "staff" group (GID 10) for everybody.

How can I sync these between Solaris and Ubuntu/Debian?

EDIT: It looks like changing the UIDs manually (i.e. usermod -u ### <username>) breaks things quite badly... When I try to add members to the SMB administrators group (smbadm add-member -m John administrators) it says unable to find SID until I put the UID back to the original value...
 
Liam,
Only problem I see at the moment is performance, which does not go over 35-40 MB/s when I copy data onto it from my Windows 7 machine. I know there is a lot of information in this thread related to Windows/CIFS performance, so I will read through it when I have more time and try out what's suggested there.

I get that transfer rate when moving a lot of small files. Moving large files I can get upwards of 90MB/s over SMB. Moving large files within the pool I now see ~140MB/s write.

I believed I finally had it running stable... new mainboard, new LSI controller... everything went fine, until yesterday.

Came home from work, wanted to watch a movie. Started my HTPC, browsed my NAS and started watching something. Changed my mind after 5 minutes and wanted to see something else, so I closed the currently playing film and chose another one, which didn't start playing. Closed my player and tried again, to no avail. Pinged the NAS, and it responded fine. Connected to napp-it, and it couldn't show me the output of 'zpool status' entirely - it showed the pools only, but no details. Wanted to reboot the NAS, and it hung again on shutdown/restart (see my previous posts about this) with no obvious error message. Had to cycle the server power. Everything showed OK again on reboot, and the system went back to normal. No errors at all on disks or pools!

Going nuts here. Can anyone please guide me on how to troubleshoot this? Which logs should I check? I am on DESCON (Desperation Condition) 1!

I hate to say it, but hanging when viewing zpool status and not responding to shutdown/reboot is what I would experience when a pool was dealing with a faulted drive and was in the midst of activating the spare. As Billy_nnn suggested, information will be found in /var/adm/messages or /var/adm/messages.0

So far I've only had errors on my hot spare, which is silly as it just sits idle. I've taken it out and am terrorising it as we speak to see if there are any problems. It shows a bunch of write errors but nothing else, and it passes all the tests.

I've come across two bits of information: 1) an intermittently bad drive can cause cascading errors, making other drives go offline. 2) I caught a thread about Nexenta, and one of the devs reported there was a bug in the mpt_sas driver that may be causing what we have been experiencing. I don't know if the patch will make it upstream to Oracle or if Oracle has already fixed it.

Let us know what your logs show.

Oh, as for the multipathing we were talking about, I disabled it because I would see entries about multipath status degraded, which is expected as there's no multipath to the drives. Plus, the leading part of the drive name wouldn't all be c0; each would have a controller number instead, making it easier for me to identify a drive when using tools like cfgadm. I've never noticed any performance issues either way.
 
New OI user here with 151a4 and napp-it 0.8h installed. Everything appears to be fine: 11 x 1.5 TB drives in RAIDZ2 with 1 spare drive. However, I would like to fix the IP address of the box and also set the MTU to 9000 to enable jumbo frames.

Can anyone explain how to go about this please.

Any help gratefully appreciated.

Doug

(Total newbie when it comes to OS's other than Windows).
 
New OI user here with 151a4 and napp-it 0.8h installed. Everything appears to be fine: 11 x 1.5 TB drives in RAIDZ2 with 1 spare drive. However, I would like to fix the IP address of the box and also set the MTU to 9000 to enable jumbo frames.

Can anyone explain how to go about this please.

Any help gratefully appreciated.

Doug

(Total newbie when it comes to OS's other than Windows).

Check my video regarding static IP:
http://www.youtube.com/watch?v=yOz4-ORawl0

Jumbo frames I haven't enabled because it doesn't seem to do much good.

Edit:

Guys, my bro is trying to access my server @ \\OIZFS01 - he has a near identical setup to me except he has a lot of music production software on his 7x64 install. Previously on WHSv1 he could access shares great.

Now what's happening is when I map the drives or navigate manually, it is SLOW - like a spinning loading circle for a good 2-3 mins, and then it says that W: or Y: or w/e the mapped drive is, is no longer available.

It is hit and miss whether it comes up with a green icon with space available.

Any ideas? Like I said, same setup as me, gigabit connection, static IP, blah... gotta be a software issue as it works great everywhere else.
 
Check my video regarding static IP:
http://www.youtube.com/watch?v=yOz4-ORawl0

Jumbo frames I haven't enabled because it doesn't seem to do much good.

DIStreamnet, thank you very much. So much easier than I thought it would be.

I would still like to enable jumbo frames though, if anyone can help. The reason is that I use the box for streaming large files across the network and would just like to reduce the overhead in the network traffic; files are anything up to around 45 GB in size (Blu-ray ISO backups).

It has just taken several days to FTP my collection from one NAS Server to this new one.

Regards

Doug
 
The VMware guy on the virtualization board says jumbo frames are a total waste of time unless you are running 10gig.
 
Can anybody give me any advice on syncing UID and GID between Solaris Express and Ubuntu/Debian? (re: NFS)

I'm not running NIS or anything fancy, so I can sync the UID manually but what about the GID?

Ubuntu/Debian creates a separate group (GID) for every user, but Solaris uses the "staff" group (GID 10) for everybody.

How can I sync these between Solaris and Ubuntu/Debian?

EDIT: It looks like changing the UIDs manually (i.e. usermod -u ### <username>) breaks things quite badly... When I try to add members to the SMB administrators group (smbadm add-member -m John administrators) it says unable to find SID until I put the UID back to the original value...


I suppose these are two different problems
You can manually create new users with different GIDs. I would also not expect problems with changing GIDs. If you have assigned a Unix user or group to an SMB group this may differ. I have also had the problem of a missing administrators group, which should not happen. In such a case it is good to have a current system snapshot to go back to a working state.
 
The VMware guy on the virtualization board says jumbo frames are a total waste of time unless you are running 10gig.

The gains with gigabit are only 2-3%. Not worth it in a business environment where one has to modify the MTU on several servers, switches, etc. However, I run it at home since all I have is a single switch and a few servers.
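If you do want to try it on the OI box anyway, the MTU is a link property set with dladm; roughly like this (e1000g0 is just an example link name, check yours with dladm show-link, and the switch and all clients have to be set for jumbo frames too):
Code:
# show links and the current mtu property
dladm show-link
dladm show-linkprop -p mtu e1000g0

# set a 9000 byte MTU (the IP interface may need to be unplumbed first
# and re-created afterwards)
dladm set-linkprop -p mtu=9000 e1000g0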
 
I suppose these are two different problems
You can manually create new users with different GIDs. I would also not expect problems with changing GIDs. If you have assigned a Unix user or group to an SMB group this may differ. I have also had the problem of a missing administrators group, which should not happen. In such a case it is good to have a current system snapshot to go back to a working state.
Thanks, Gea.

I found the problem.

When you update UIDs, it does NOT update /var/adm/smbpasswd. This file stores the old UID and you must manually update it. Even if you delete a user (sudo userdel) it doesn't erase it from the smbpasswd file.
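Roughly, the workaround looks like this (the username john and UID 1001 are just placeholders):
Code:
# change the Unix UID
usermod -u 1001 john

# the CIFS service still has the old UID recorded here
grep john /var/adm/smbpasswd

# fix the UID field in that entry by hand, or re-set the password
# (passwd john), which should write a fresh entry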

Took me a while to figure that one out... Hope it helps somebody :)
 
OK, just had it again...

Here's the output of /var/adm/messages since yesterday, when it was running normally:

Code:
May 27 17:20:38 NAS1 console-kit-daemon[484]: [ID 702911 daemon.warning] GLib-GObject-WARNING: g_object_set_property: construct property "seat-id" for object `CkSession' can't be set after construction
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: AT_SPI_REGISTRY was not started at session startup.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: IOR not set.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: Could not locate registry
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: AT_SPI_REGISTRY was not started at session startup.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: IOR not set.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: Could not locate registry
May 27 17:20:42 NAS1 syslog[1385]: [ID 702911 daemon.warning] Gtk-WARNING: gtkwidget.c:5628: widget not within a GtkWindow
May 27 17:21:07 NAS1 sendmail[659]: [ID 702911 mail.alert] unable to qualify my own domain name (NAS1) -- using short name
May 28 03:23:48 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:48 NAS1 	Disconnected command timeout for Target 11
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31140000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_task_mgt: IOCStatus=0x4a
May 28 03:23:52 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_check_task_mgt: Task 0x3 failed. Target=11
May 28 03:23:52 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 	mptsas_ioc_task_management failed try to reset ioc to recovery!
May 28 03:23:53 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:53 NAS1 	mptsas19 Firmware version v12.0.0.0 (?)
May 28 03:23:53 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:53 NAS1 	mptsas19: IOC Operational.
May 28 03:24:40 NAS1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
May 28 03:24:40 NAS1 	/scsi_vhci/disk@g50024e9206338c78 (sd2): Command Timeout on path mpt_sas20/disk@w50024e9206338c78,0
May 28 03:24:40 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:24:40 NAS1 	mptsas_access_config_page: IOCStatus=0x22 IOCLogInfo=0x30030116
May 28 03:24:41 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:24:41 NAS1 	Target 11 reset for command timeout recovery failed!

How can I find out which disk is target 11...? Is that a disk, or a vdev anyway?

Thanks guys for your suggestions...

Best regards,
Cap'
 
It should be a physical disk - however there's not really enough info there to pin this down yet.
The disk at Target 11 could be the victim of the issue rather than the culprit - it could just be coincidence that it was this disk being accessed at the time.

Have a look for similar errors throughout all the messages files (located in /var/adm).

try
# grep "reset for command timeout" /var/adm/message*

If there are any other similar messages, do they also implicate Target 11?
If so, it may well be the drive or the cable - if not it may be a controller/firmware/driver issue.


It looks like Target 11 is a multipathed SAS drive with the WWN "50024e9206338c78" - this should be printed on the drive itself, but might also be found by looking at the output of the format command, listing the /dev/(r)dsk directories, the output of stmsboot -l or -L .... and so on!
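For example (nothing here is specific to your system):
Code:
# list every disk with its c#t<WWN>d# name
format < /dev/null

# mapping between non-multipath and multipath device names
stmsboot -L

# per-device vendor/product/serial and error counters
iostat -En

The WWN from the log should appear embedded in one of the c#t...d# names.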


Disabling multipathing might be a worthwhile step in attempting to pin this down - have a look at the stmsboot man page for details of how to do that! It wouldn't be the first time there have been issues with multipathing and certain drivers/HBAs etc.
 
Hi Billy_nnn
and thanks a lot for your suggestions.

I looked for more messages, and found 2. It looks like it could be 11, but there's also 10 once.
Code:
root@NAS1:~# grep "reset for command timeout" /var/adm/message*
/var/adm/messages:May 28 03:24:41 NAS1  Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 22 08:29:51 NAS1        Target 10 reset for command timeout recovery failed!
/var/adm/messages.0:May 23 23:28:05 NAS1        Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 25 09:47:03 NAS1        Target 11 reset for command timeout recovery failed!
root@NAS1:~#

How can I find out which disk is target 10 or 11? Couldn't figure that out yet...

If target 11 was the disk that was also offline, then it was SATA only, so no multipathing involved. I just want to wait until the resilvering is done before I do such a change.
I may be buying SAS multipath disks at a later stage (=if I ever get this setup to run stable). Is it possible to disable multipathing only for one controller, and leave it enabled on another? I have an LSI SAS2008 Controller onboard where the SATA-Disks are connected, and will be connecting (more) SAS disks on my LSI 9211-8i Controller later.

For a better understanding, here are my disks:
Code:
id 	 cap 	 identify 	 error 	 vendor 	 product 	 sn
 c0t50000F000B073158d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E9001AD44FFd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E920061BE8Cd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E9206338B84d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E9206338C78d0 	 1000.20 GB 	 via dd 	 Error: S:3 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E920636D9D9d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E920636DA1Fd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c12t5000C50000A9B02Dd0 	 36.42 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DG036A8B53 	 -
 c13t500000E0161A9EA2d0 	 36.42 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DG036A9BB6 	 -
 c14t5000C50007536E2Dd0 	 73.41 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DH072ABAA6 	 -
 c15t5000C5000EC7ED89d0 	 73.41 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DH072BB978 	 -

The 36GB HP are SAS, and they run the Solaris 11 OS (mirrored).

The 72GB HP are SAS, for testing a VMware pool. They run flawlessly up to now.
 
Just disabled multipath now and rebooted. Hope it helps :)

I'll keep you posted, thanks guys for your help!

Cheers,
Cap'
 
I'm moving data off my ZFS folder onto another ZFS folder on the same pool, and it appears to be doubling the space used (like it's still keeping a copy of the old one even though I moved it). I want to move all of it to the new folder so I can delete the old one, but I don't have room for twice the space.
 
In current releases, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the “zpool remove” command.

Does this mean a non-mirrored ZIL is not that bad an idea? What would have to happen to lose data with a non-mirrored ZIL? The log device would have to fail at the same exact time the system lost power? I'll risk that; I mean, honestly, I would almost be okay with disabling sync writes altogether.
 
I'm moving data off my ZFS folder onto another ZFS folder on the same pool, and it appears to be doubling the space used (like it's still keeping a copy of the old one even though I moved it). I want to move all of it to the new folder so I can delete the old one, but I don't have room for twice the space.

ZFS folders are independent file systems (like partitions on other filesystems).
A move between them is always a copy; the moving program deletes the source only after the copy has finished.
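If the data you want to move is a whole ZFS folder of its own, you can rename the filesystem instead; that is instant and copies nothing (pool/folder names are only examples):
Code:
# rename/move a whole filesystem within the pool: no data is copied
zfs rename tank/oldfolder tank/newfolder

Moving individual files between two folders, on the other hand, always needs enough free space for the copy until the source is deleted.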
 
Does this mean a non-mirrored ZIL is not that bad an idea? What would have to happen to lose data with a non-mirrored ZIL? The log device would have to fail at the same exact time the system lost power? I'll risk that; I mean, honestly, I would almost be okay with disabling sync writes altogether.

The content of a log device is only needed (read back) after a power failure or crash.
A failing log device during normal use is not critical. Writes are done normally and the on-disk log inside the pool is used, just like with no extra log device.

If you have a working SSD log device and it fails together with a power loss, up to a few seconds of the last writes are lost (the same situation as with disabling sync).
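For reference, the related commands (pool and device names are examples only):
Code:
# add a single SSD as log device
zpool add tank log c2t0d0

# remove it again; the pool falls back to the in-pool ZIL
zpool remove tank c2t0d0

# per filesystem, if you really want to give up sync writes completely
zfs set sync=disabled tank/nfsshare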
 
Can anybody help me with this issue?

Guys, my bro is trying to access my server @ \\OIZFS01 - he has a near identical setup to me except he has a lot of music production software on his 7x64 install. Previously on WHSv1 he could access shares great. I can also access the shares great (100MB/s sustained and great reads)

Now what's happening is when I map the drives or navigate manually, it is SLOW - like a spinning loading circle for a good 2-3 mins, and then it says that W: or Y: or w/e the mapped drive is, is no longer available.

It is hit and miss whether it comes up with a green icon with space available.

Any ideas? Like I said, same setup as me, gigabit connection, static IP, blah... gotta be a software issue as it works great everywhere else.
 
Hi Billy_nnn
and thanks a lot for your suggestions.

I looked for more messages, and found 2. It looks like it could be 11, but there's also 10 once.
Code:
root@NAS1:~# grep "reset for command timeout" /var/adm/message*
/var/adm/messages:May 28 03:24:41 NAS1  Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 22 08:29:51 NAS1        Target 10 reset for command timeout recovery failed!
/var/adm/messages.0:May 23 23:28:05 NAS1        Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 25 09:47:03 NAS1        Target 11 reset for command timeout recovery failed!
root@NAS1:~#

How can I find out which disk is target 10 or 11? Couldn't figure that out yet...

If target 11 was the disk that was also offline, then it was SATA only, so no multipathing involved. I just want to wait until the resilvering is done before I do such a change.
I may be buying SAS multipath disks at a later stage (=if I ever get this setup to run stable). Is it possible to disable multipathing only for one controller, and leave it enabled on another? I have an LSI SAS2008 Controller onboard where the SATA-Disks are connected, and will be connecting (more) SAS disks on my LSI 9211-8i Controller later.

For a better understanding, here are my disks:
Code:
id 	 cap 	 identify 	 error 	 vendor 	 product 	 sn
 c0t50000F000B073158d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E9001AD44FFd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E920061BE8Cd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103UJ 	 -
 c0t50024E9206338B84d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E9206338C78d0 	 1000.20 GB 	 via dd 	 Error: S:3 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E920636D9D9d0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c0t50024E920636DA1Fd0 	 1000.20 GB 	 via dd 	 Error: S:2 H:0 T:0 	 ATA 	 SAMSUNG HD103SJ 	 -
 c12t5000C50000A9B02Dd0 	 36.42 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DG036A8B53 	 -
 c13t500000E0161A9EA2d0 	 36.42 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DG036A9BB6 	 -
 c14t5000C50007536E2Dd0 	 73.41 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DH072ABAA6 	 -
 c15t5000C5000EC7ED89d0 	 73.41 GB 	 via dd 	 Error: S:0 H:0 T:0 	 HP 	 DH072BB978 	 -

The 36GB HP are SAS, and they run the Solaris 11 OS (mirrored).

The 72GB HP are SAS, for testing a VMware pool. They run flawlessly up to now.

I see the problems have returned for you, too.

As for locating which drive is which, match the WWN against what is shown in 'cfgadm -al'. They should logically follow the port numbers related to the card, but sometimes they're off. When I connect new drives or pools, I connect each drive one by one, see the identifier printed on the console, and note it on the corresponding drive so I can find the offending one if need be.

'stmsboot -d' will disable multipath on all controllers. You can then enable it per controller at a later date. There's some file editing, but it's doable.
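Roughly (double-check the stmsboot and mpt_sas man pages before changing anything):
Code:
# disable MPxIO for all supported controllers (asks for a reboot)
stmsboot -d

# or only for HBAs handled by a particular driver
stmsboot -D mpt_sas -d

# finer per-controller control is an mpxio-disable="yes"/"no" entry
# in the HBA's driver .conf, e.g. /kernel/drv/mpt_sas.conf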

I will be sourcing different drives in the coming weeks. The samsungs are going bye-bye.
 
So I wrote a shell script to run via cron that handles creating/destroying snapshots. Somehow everything I found via Google was too complex or had ugly formatting.

Anyways, ZFS has no I/O priority levels, correct? There's no way for me to run zfs destroy at a low I/O priority, is there? E.g. ionice on Linux.
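For the curious, the general shape of such a script is roughly this (dataset name, snapshot prefix and keep count are placeholders):
Code:
#!/bin/sh
# snapshot one dataset and keep only the newest $KEEP snapshots
DATASET=tank/data
PREFIX=auto
KEEP=20

zfs snapshot "$DATASET@$PREFIX-`date +%Y%m%d-%H%M`"

# matching snapshots, oldest first
SNAPS=`zfs list -H -t snapshot -o name -s creation -r $DATASET | grep "^$DATASET@$PREFIX-"`
TOTAL=`echo "$SNAPS" | wc -l`
EXCESS=`expr $TOTAL - $KEEP`

if [ "$EXCESS" -gt 0 ]; then
    echo "$SNAPS" | head -$EXCESS | while read snap; do
        zfs destroy "$snap"
    done
fi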
 
No, no ionice in Solaris. Add more RAM, then add even more RAM. If it is still too slow, build another filer.
 
So I wrote a shell script to run via cron that handles creating/destroying snapshots. Somehow everything I found via Google was too complex or had ugly formatting.

Anyways, ZFS has no I/O priority levels, correct? There's no way for me to run zfs destroy at a low I/O priority, is there? E.g. ionice on Linux.

You could nice the script:
http://docs.oracle.com/cd/E19112-01/ctr.mgr11/816-7751/6mdo2snvk/index.html

but I'm not sure if it helps with zfs destroy.
I have never had performance problems with zfs destroy without deduplication enabled, though.

PS:
I added snapshot management to napp-it (how often and when to create a snap and how many to keep;
for example, do a snap every two hours during the daytime and keep 20).
 
I see the problems have returned for you, too.

As for locating which drive is which, match the WWN against what is shown in 'cfgadm -al'. They should logically follow the port numbers related to the card, but sometimes they're off. When I connect new drives or pools, I connect each drive one by one, see the identifier printed on the console, and note it on the corresponding drive so I can find the offending one if need be.

'stmsboot -d' will disable multipath on all controllers. You can then enable it per controller at a later date. There's some file editing, but it's doable.

I will be sourcing different drives in the coming weeks. The samsungs are going bye-bye.

Liam,

Ahm... it might be that turning off multipath really did my system good. Since then, it is very responsive (i.e. shutdown/restarts without problems, zpool status is displayed in a snap, etc.). I haven't had a single error since. Shut down yesterday to change some disks, which worked flawlessly. I want to let it run for a few more days before getting too excited... but it looks really good at the moment!

Keep you posted.

Thanks,
Cap'
 
To improve my data redundancy, which do you good folks think is better:

2x zpools of 3TB each in RaidZ

or 1x zpool of 6TB in RaidZ2?
 
I'm confused. I thought you had 6 drives? Hence, two 3-disk raidz for plan #1? Anyway, the performance should be fine unless you have some IOP requirements you haven't mentioned.
 
I'm confused. I thought you had 6 drives? Hence, two 3-disk raidz for plan #1? Anyway, the performance should be fine unless you have some IOP requirements you haven't mentioned.

I have 5, but I was planning on buying a sixth if 2xraidz1's performed significantly better than 1xraidz2.

No IOP requirement, its a basic fileserver, but quicker transfers are always nice.
 
Well, even with gigabit, it's unlikely the disks will be the limiting factor (the ethernet almost certainly will be). If you don't mind losing half your storage, and random read performance is important, try 3 mirrors striped together. The read performance should be better than raidz*. I don't generally like multiple raidz1 pools, since you have to divvy stuff up between them. If there is something fundamentally different about the data/clients for each pool, sure, but otherwise I wouldn't bother.
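i.e. something along these lines (device names are placeholders):
Code:
# three 2-way mirrors in one pool; reads get spread across all six disks
zpool create tank \
    mirror c2t0d0 c2t1d0 \
    mirror c2t2d0 c2t3d0 \
    mirror c2t4d0 c2t5d0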
 
Liam,

Ahm... it might be that turning off multipath really did my system good. Since then, it is very responsive (i.e. shutdown/restarts without problems, zpool status is displayed in a snap, etc.). I haven't had a single error since. Shut down yesterday to change some disks, which worked flawlessly. I want to let it run for a few more days before getting too excited... but it looks really good at the moment!

Keep you posted.

Thanks,
Cap'

Well, Liam, got excited too early.
Came home from work and one mirror of my main datapool was gone, both disks "too many errors". I was able to just clear the error and then it started resilvering back the spare, but... this is just NOT RELIABLE enough! Found a firmware update for my Samsung F3 drives... I guess I'll give it a shot. Let me know how it works with your drive replacement (which, honestly, I cannot afford in the next 6 months or so, with HDD prices so high...).

Cheers, Cap'
 
We have an all-in-one at our DR site with the following configuration:
HP ProLiant DL360 G6
32GB ECC memory
Xeon E5530 quad-core CPU
4 Broadcom NetXtreme II BCM5709 gigabit NIC cards
LSI 9211-8i SAS controller (for passthru)
72GB mirrored boot drive on motherboard HW RAID controller
6x1TB WDC disks
ESXi 5.0

Virtual hosts:
1. OpenIndiana 151a3 for virtual SAN ...
2. Oracle linux 5.7 Oracle server
3. Centos 6.2 server for firewall, mail, DNS, OpenVPN

Each machine has 2 [virtual] nics:
1. vmxnet3 10g for intra-host networking
2. e1000 gigabit for any outside networking

host 3 is the default route for the other two. It also has a site-to-site OpenVPN tunnel to our office.

We encountered a really vexing problem: network speed was fantastic to/from the SAN host (which contains the virtual storage for the other hosts) from the virtual hosts. Network speeds from our office to any of the three hosts were very good (~20mb/s). Network speed from 2 of the 3 hosts back to our office was also around 20mb/s. All good so far. Network speed from the Oracle host to our office, however, was ABYSMAL! Around 65kb/sec. I have been doing all kinds of experimenting and testing for weeks to try and track this problem down, all to no avail. Changing MTU, vSwitches, etc.

I decided the problem might be the kernel on the Oracle host and its implementation of the vmxnet3 driver. So I came up with the idea of adding a third NIC (e1000) to that host and setting a route to our office subnet to use that interface. After setting that up, I still had terrible throughput in the one direction from that host. After using Wireshark to monitor the traffic, I decided to try turning off checksum offloading on that NIC. BINGO!! After that the throughput went up to the expected 20mb/s. Actually, after further testing I found that RX and TX checksum offloading were fine. It seemed to be TCP checksum offload that really killed things. Why this occurs I have no idea ... but because turning it off solves our problem I am not going to look into it any more.

Hopefully this can help someone else encountering the same problem. If it is just the Oracle Linux kernel (2.6.32-300.25.1.el5uek) that is the source of the problem, then it likely won't affect many other people.
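In case anyone wants to try the same workaround, the offload settings are toggled with ethtool inside the guest. Roughly (eth1 is only a placeholder for whichever NIC you added, and which particular offload matters may differ, so check the -k output first):
Code:
# show what the NIC is currently offloading
ethtool -k eth1

# turn checksum and TCP segmentation offload off on that NIC
ethtool -K eth1 rx off tx off tso off

# this does not survive a reboot; add it to the interface's ifcfg
# script or rc.local to make it stick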
 
Well, Liam, got excited too early.
Came home from work and one mirror of my main datapool was gone, both disks "too many errors". I was able to just clear the error and then it started resilvering back the spare, but... this is just NOT RELIABLE enough! Found a firmware update for my Samsung F3 drives... I guess I'll give it a shot. Let me know how it works with your drive replacement (which, honestly, I cannot afford in the next 6 months or so, with HDD prices so high...).

Cheers, Cap'

Captain,

Whereabouts did you find the firmware update? I haven't found anything newer than what my set had.

How long are your cables to the samsungs? Are you using a backplane or a forward break-out?

Also, when your drives are faulted, what kind of errors does it show?

Cheers,
Liam
 
Hey,

Found an update for my Samsung F3 HD103SJ drives here. I know it is meant to fix a different problem, but my desperation level is incredibly high, so I'll try it anyway. I was too upset yesterday to try this, so I'll give it a shot this weekend. Also, I have not yet checked whether my drives already have this FW level.

However, for the life of me I could not find an update for my Samsung F1 HD103UJ drives. So this whole thing might not really do anything, but it calms my mind knowing that I tried everything possible.

Unfortunately, I never paid attention to which of the drives is failing when it does, the F1s or the F3s. I will note that in future events, which I'm sure by now will reoccur.
What also strikes me by now is that I never had *any* problems whatsoever on my rpool (both SATA Seagate and now SAS Fujitsu drives) or the VM pool (also both SATA Seagate and now SAS Fujitsu drives). It's always my datapool with the Samsung drives.

So how are my drives connected? At the moment, they're connected onboard (LSI SAS2008 controller) with forward breakout cables, 50cm long, SFF-8087 to 4 x SATA. It's the second set I am trying; they're from Supermicro and should therefore fit the Supermicro mainboard with the onboard LSI controller perfectly. I have the same cables, but 70cm in length, to the SAS drives on my 9211-8i controller, and they work perfectly there, even though I use two Raidsonic backplanes for SAS.

If I see an error (meaning if I can still connect to the console), it's simply "too many errors" on one or several disks. In the past, I often could not connect at all anymore, which has not happened since I turned off multipath. When I hit "clear errors", it resets everything, so it will show me S:0 H:0 T:0 afterwards... I checked the log yesterday, and it began the same as last time, but this time for target 13, so I guess it's not really following a pattern. Then it later said "too many errors" on a device and took it offline, putting the spare into duty.

Code:
May 31 17:17:00 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:00 NAS1 	Disconnected command timeout for Target 13
May 31 17:17:04 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 	mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31140000
May 31 17:17:04 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 	mptsas_check_task_mgt: IOCStatus=0x4a
May 31 17:17:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 	mptsas_check_task_mgt: Task 0x3 failed. Target=13
May 31 17:17:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 	mptsas_ioc_task_management failed try to reset ioc to recovery!
May 31 17:17:06 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:06 NAS1 	mptsas19 Firmware version v12.0.0.0 (?)
May 31 17:17:06 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:06 NAS1 	mptsas19: IOC Operational.
May 31 17:17:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:52 NAS1 	mptsas_access_config_page: IOCStatus=0x22 IOCLogInfo=0x30030116
May 31 17:17:54 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:54 NAS1 	Target 13 reset for command timeout recovery failed!
May 31 18:00:35 NAS1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
May 31 18:00:35 NAS1 EVENT-TIME: Thu May 31 18:00:34 CEST 2012
May 31 18:00:35 NAS1 PLATFORM: X8SIE, CSN: 0123456789, HOSTNAME: NAS1
May 31 18:00:35 NAS1 SOURCE: zfs-diagnosis, REV: 1.0
May 31 18:00:35 NAS1 EVENT-ID: 9648e099-ba39-68d5-f432-8ef5ffc08cbd
May 31 18:00:35 NAS1 DESC: The number of I/O errors associated with a ZFS device exceeded
May 31 18:00:35 NAS1 	     acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
May 31 18:00:35 NAS1 AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
May 31 18:00:35 NAS1 	     will be made to activate a hot spare if available. 
May 31 18:00:35 NAS1 IMPACT: Fault tolerance of the pool may be compromised.
May 31 18:00:35 NAS1 REC-ACTION: Run 'zpool status -x' and replace the bad device.

and then later also target 10:

Code:
May 31 18:18:00 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0/iport@20/disk@w50024e920636d9d9,0 (sd19):
May 31 18:18:00 NAS1 	drive offline
May 31 18:19:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:04 NAS1 	Disconnected command timeout for Target 10
May 31 18:19:08 NAS1 	mptsas_check_task_mgt: IOCStatus=0x4a
May 31 18:19:08 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:08 NAS1 	mptsas_check_task_mgt: Task 0x3 failed. Target=10
May 31 18:19:08 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:08 NAS1 	mptsas_ioc_task_management failed try to reset ioc to recovery!
May 31 18:19:57 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:57 NAS1 	Target 10 reset for command timeout recovery failed!
May 31 18:20:12 NAS1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
May 31 18:20:12 NAS1 EVENT-TIME: Thu May 31 18:20:12 CEST 2012
May 31 18:20:12 NAS1 PLATFORM: X8SIE, CSN: 0123456789, HOSTNAME: NAS1
May 31 18:20:12 NAS1 SOURCE: zfs-diagnosis, REV: 1.0
May 31 18:20:12 NAS1 EVENT-ID: fd8205fe-389a-cbfe-89f0-c77794381ff7
May 31 18:20:12 NAS1 DESC: The number of I/O errors associated with a ZFS device exceeded
May 31 18:20:12 NAS1 	     acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
May 31 18:20:12 NAS1 AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
May 31 18:20:12 NAS1 	     will be made to activate a hot spare if available. 
May 31 18:20:12 NAS1 IMPACT: Fault tolerance of the pool may be compromised.
May 31 18:20:12 NAS1 REC-ACTION: Run 'zpool status -x' and replace the bad device.

Best regards,

Cap'
 
How many drives do you have altogether, what type are they, how are they all connected and how are your zpools configured?
 
Huh. Very interesting... thanks!

One more question, does ZFS (again, Solaris 11 Express) only support NFSv4 or does it support earlier versions like 2 and 3? (I'm trying to learn more about NFS...)

I've read that NFSv4 servers are incompatible with earlier versions, yet it seems like my Ubuntu box can connect to ZFS (using NFS) with either NFSv4 or NFSv3...

It supports all versions.

For example, if you don't want to support NFSv3 you can:
sharectl set -p server_versmin=4 nfs

Or if you don't want to support NFSv4 you can:
sharectl set -p server_versmax=3 nfs

You can change settings for:
server_versmin
server_versmax
client_versmin
client_versmax
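
And to check what is currently in effect:
Code:
# show all NFS properties
sharectl get nfs

# or just one of them
sharectl get -p server_versmax nfs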
 