Liam,
Only problem I see at the moment is performance, which does not go over 35-40 MB/s when I copy data onto it from my Windows 7 machine. I know there is a lot of information in this thread related to Windows/CIFS performance, so I will read through it when I have more time and try out what's suggested there.
I believed I finally had it running stable... new mainboard, new LSI controller... everything went fine, until yesterday.
Came home from work, wanted to watch a movie. Started my HTPC, browsed my NAS and started watching something. Changed my mind after 5 minutes and wanted to see something else, so I closed the currently playing film and chose another one, which didn't start playing. Closed my player and tried again, to no avail. Pinged the NAS, and it responded fine. Connected to napp-it, and it couldn't show the output of 'zpool status' entirely: it showed the pools only, but no details. Wanted to reboot the NAS, and it hung again on shutdown/restart (see my previous posts about this) with no obvious error message. Had to power-cycle the server. Everything showed OK again on reboot, and the system went back to normal. No errors at all on disks or pools!
Going nuts here. Can anyone please guide me on how to troubleshoot this? Which logs should I check? I am on DESCON (Desperation Condition) 1!
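In case it helps anyone in the same spot: on a Solaris-family box, the usual first stops are the system log and the fault manager. A sketch of the standard commands (nothing here is specific to this particular system):

Code:
# Recent kernel/driver messages; disk and controller timeouts show up here
tail -100 /var/adm/messages
# Fault Management Architecture: list diagnosed faults, then the raw error events
fmadm faulty
fmdump -eV | tail -50
# Per-device error counters at the pool level
zpool status -v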
New OI user here, with 151a4 and napp-it 0.8h installed. Everything appears to be fine: 11 x 1.5 TB drives in RAIDZ2 with 1 spare drive. However, I would like to fix the IP address of the box and also set the MTU to 9000 to enable jumbo frames.
Can anyone explain how to go about this, please?
Any help gratefully appreciated.
Doug
(Total newbie when it comes to OSes other than Windows.)
Check my video regarding static IP:
http://www.youtube.com/watch?v=yOz4-ORawl0
Jumbo frames I haven't enabled, because they don't seem to do much good.
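For anyone who prefers the command line to the video, the static-IP setup on OpenIndiana 151a looks roughly like this (the interface name e1000g0 and all addresses are placeholders, and it assumes you switch from NWAM to the static network profile):

Code:
# Use the non-automatic network profile
svcadm disable svc:/network/physical:nwam
svcadm enable svc:/network/physical:default
# Assign a static address to the link
ipadm create-addr -T static -a 192.168.1.10/24 e1000g0/v4
# Persistent default route and DNS lookup
route -p add default 192.168.1.1
echo "nameserver 192.168.1.1" >> /etc/resolv.conf
cp /etc/nsswitch.dns /etc/nsswitch.conf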
DIStreamnet, thank you very much. So much easier than I thought it would be.
I would still like to enable jumbo frames though, if anyone can help. The reason is that I use the box for streaming large files across the network and would just like to reduce the overhead in the network traffic; files are anything up to around 45 GB in size (Blu-ray ISO backups).
It has just taken several days to FTP my collection from one NAS Server to this new one.
Regards
Doug
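On the jumbo-frame side: the MTU is a link property set with dladm, so it is typically a one-liner (link name e1000g0 is again a placeholder; note that the switch and every client NIC must also be set to 9000, or throughput will get worse rather than better):

Code:
# Show current and allowed MTU values for the link
dladm show-linkprop -p mtu e1000g0
# Raise it to 9000; some drivers require the IP interface to be unplumbed first
dladm set-linkprop -p mtu=9000 e1000g0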
Can anybody give me any advice on syncing UID and GID between Solaris Express and Ubuntu/Debian? (re: NFS)
I'm not running NIS or anything fancy, so I can sync the UIDs manually, but what about the GIDs?
Ubuntu/Debian creates a separate group (GID) for every user, but Solaris uses "staff" (GID 10) for everybody.
How can I sync these between Solaris and Ubuntu/Debian?
EDIT: It looks like changing the UIDs manually (i.e. usermod -u ### <username>) breaks things quite badly... When I try to add members to the SMB administrators group (smbadm add-member -m John administrators) it says "unable to find SID" until I put the UID back to its original value...
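For illustration, one way to line the two systems up is to give the account the same UID and one common, dedicated GID on both boxes rather than reusing Debian's per-user groups or Solaris' staff (the user name and the IDs 1001/1005/1006 below are made-up examples), then fix up file ownership afterwards:

Code:
# On both systems, while the user is logged out:
groupmod -g 1001 john           # Ubuntu/Debian; on Solaris: groupadd -g 1001 john
usermod -u 1001 -g 1001 john
# Files keep the old numeric IDs, so re-own them (1005/1006 = old UID/GID)
find /export/home/john \( -user 1005 -o -group 1006 \) -exec chown 1001:1001 {} +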
The VMware guy on the virtualization board says jumbo frames are a total waste of time unless you are running 10 gig.
Thanks, Gea. I suppose these are two different problems.
You can manually create new users with different GIDs. I would also not expect problems when changing GIDs. If you have assigned a Unix user or group to an SMB group, things may differ. I have also had the problem of a missing administrators group, which should not happen. In such a case it is good to have a current system snapshot to go back to a working state.
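A sketch of that workflow on the Solaris side, including the snapshot Gea recommends taking first (names and IDs are the same placeholders as above):

Code:
# Boot-environment snapshot to roll back to if the SMB group mappings break
beadm create pre-idmap-change
# Create the group and user with the chosen IDs
groupadd -g 1001 john
useradd -u 1001 -g 1001 -d /export/home/john -s /bin/bash john
# Re-add the account to the SMB administrators group afterwards
smbadm add-member -m john administrators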
May 27 17:20:38 NAS1 console-kit-daemon[484]: [ID 702911 daemon.warning] GLib-GObject-WARNING: g_object_set_property: construct property "seat-id" for object `CkSession' can't be set after construction
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: AT_SPI_REGISTRY was not started at session startup.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: IOR not set.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: Could not locate registry
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: AT_SPI_REGISTRY was not started at session startup.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: IOR not set.
May 27 17:20:40 NAS1 gnome-session[1325]: [ID 702911 daemon.warning] atk-bridge-WARNING: Could not locate registry
May 27 17:20:42 NAS1 syslog[1385]: [ID 702911 daemon.warning] Gtk-WARNING: gtkwidget.c:5628: widget not within a GtkWindow
May 27 17:21:07 NAS1 sendmail[659]: [ID 702911 mail.alert] unable to qualify my own domain name (NAS1) -- using short name
May 28 03:23:48 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:48 NAS1 Disconnected command timeout for Target 11
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31130000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31140000
May 28 03:23:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_task_mgt: IOCStatus=0x4a
May 28 03:23:52 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_check_task_mgt: Task 0x3 failed. Target=11
May 28 03:23:52 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:52 NAS1 mptsas_ioc_task_management failed try to reset ioc to recovery!
May 28 03:23:53 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:53 NAS1 mptsas19 Firmware version v12.0.0.0 (?)
May 28 03:23:53 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:23:53 NAS1 mptsas19: IOC Operational.
May 28 03:24:40 NAS1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
May 28 03:24:40 NAS1 /scsi_vhci/disk@g50024e9206338c78 (sd2): Command Timeout on path mpt_sas20/disk@w50024e9206338c78,0
May 28 03:24:40 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:24:40 NAS1 mptsas_access_config_page: IOCStatus=0x22 IOCLogInfo=0x30030116
May 28 03:24:41 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 28 03:24:41 NAS1 Target 11 reset for command timeout recovery failed!
In current releases, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the zpool remove command.
I'm moving data off one ZFS folder onto another ZFS folder on the same pool, and it appears to be doubling the space used (as if it's still keeping a copy of the old one even though I moved it). I want to move all of this to the new folder so I can delete the old one, but I don't have room for twice the space.
Does this mean a non-mirrored ZIL is not that bad an idea? What would have to happen to lose data with a non-mirrored ZIL? The log device would have to fail at the exact same time the system lost power? I'll risk that; honestly, I would almost be okay with disabling sync writes altogether.
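For what it's worth, both knobs discussed here are single commands (pool, dataset and device names are placeholders; disabling sync trades the last few seconds of acknowledged writes on a crash for speed, so it only makes sense for data you can re-copy):

Code:
# Remove a dedicated log device from the pool (supported on current pool versions)
zpool remove tank c4t1d0
# Disable synchronous write semantics on one dataset only
zfs set sync=disabled tank/scratch
# Revert to the default behaviour later
zfs set sync=standard tank/scratch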
Hi Billy_nnn
and thanks a lot for your suggestions.
I looked for more messages, and found two. It looks like it could be Target 11, but Target 10 also appears once.
Code:
root@NAS1:~# grep "reset for command timeout" /var/adm/message*
/var/adm/messages:May 28 03:24:41 NAS1 Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 22 08:29:51 NAS1 Target 10 reset for command timeout recovery failed!
/var/adm/messages.0:May 23 23:28:05 NAS1 Target 11 reset for command timeout recovery failed!
/var/adm/messages.0:May 25 09:47:03 NAS1 Target 11 reset for command timeout recovery failed!
root@NAS1:~#
How can I find out which disk is Target 10 or 11? I couldn't figure that out yet...
If Target 11 was the disk that was also offline, then it was SATA only, so no multipathing was involved. I just want to wait until the resilvering is done before I make such a change.
I may be buying SAS multipath disks at a later stage (i.e. if I ever get this setup to run stable). Is it possible to disable multipathing for only one controller, and leave it enabled on another? I have an LSI SAS2008 controller onboard where the SATA disks are connected, and will be connecting (more) SAS disks to my LSI 9211-8i controller later.
For a better understanding, here are my disks:
Code:
id                      cap         identify  error               vendor  product          sn
c0t50000F000B073158d0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103UJ  -
c0t50024E9001AD44FFd0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103UJ  -
c0t50024E920061BE8Cd0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103UJ  -
c0t50024E9206338B84d0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103SJ  -
c0t50024E9206338C78d0   1000.20 GB  via dd    Error: S:3 H:0 T:0  ATA     SAMSUNG HD103SJ  -
c0t50024E920636D9D9d0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103SJ  -
c0t50024E920636DA1Fd0   1000.20 GB  via dd    Error: S:2 H:0 T:0  ATA     SAMSUNG HD103SJ  -
c12t5000C50000A9B02Dd0  36.42 GB    via dd    Error: S:0 H:0 T:0  HP      DG036A8B53       -
c13t500000E0161A9EA2d0  36.42 GB    via dd    Error: S:0 H:0 T:0  HP      DG036A9BB6       -
c14t5000C50007536E2Dd0  73.41 GB    via dd    Error: S:0 H:0 T:0  HP      DH072ABAA6       -
c15t5000C5000EC7ED89d0  73.41 GB    via dd    Error: S:0 H:0 T:0  HP      DH072BB978       -
The 36 GB HP drives are SAS, and they run the Solaris 11 OS (mirrored).
The 72 GB HP drives are SAS, for testing a VMware pool. They have run flawlessly up to now.
So I wrote a shell script to run via cron that handles creating and destroying snapshots. Somehow everything I found via Google was either too complex or had ugly formatting.
Anyway, ZFS has no I/O priority levels, correct? There's no way for me to run zfs destroy at a low I/O priority, is there? E.g. like ionice on Linux.
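For reference, a minimal sketch of that kind of cron-driven rotation (dataset name, prefix and retention count are placeholders; swap 'zfs destroy' for 'echo zfs destroy' on the first run to see what it would remove):

Code:
#!/bin/sh
# Create a timestamped snapshot, then prune the oldest beyond a retention limit.
DATASET="tank/data"   # placeholder dataset
PREFIX="auto"
KEEP=14               # number of snapshots to retain

zfs snapshot "${DATASET}@${PREFIX}-$(date +%Y%m%d-%H%M)"

# Snapshots sorted oldest-first; destroy everything beyond the newest $KEEP
SNAPS=$(zfs list -H -t snapshot -o name -s creation | grep "^${DATASET}@${PREFIX}-")
COUNT=$(echo "$SNAPS" | wc -l)
N=$((COUNT - KEEP))
if [ "$N" -gt 0 ]; then
    echo "$SNAPS" | head -n "$N" | while read snap; do
        zfs destroy "$snap"
    done
fi

A crontab line like '0 3 * * * /root/bin/snaprotate.sh' would then run it nightly.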
I see the problems have returned for you, too.
As for locating which drive is which, match the WWN against what is shown in 'cfgadm -al'. They should logically follow the port numbers on the card, but sometimes they're off. When I connect new drives or pools, I connect each drive one by one, watch for the identifier printed on the console, and note it on the corresponding drive so I can find the offending one if need be.
'stmsboot -d' will disable multipath on all controllers. You can then re-enable it per controller at a later date. There's some file editing involved, but it's doable.
I will be sourcing different drives in the coming weeks. The Samsungs are going bye-bye.
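To make the WWN matching concrete, a few standard commands help tie a c#t<WWN>d0 name to a physical disk, and stmsboot can also limit its scope to one driver (the device name below is just one from the earlier list, used as an example):

Code:
# Map attachment points / WWNs to controller ports
cfgadm -al
# Vendor, model and serial per device; match the serial against the drive label
iostat -En
# Old trick: read from one disk so its activity LED blinks
dd if=/dev/rdsk/c0t50024E9206338C78d0s0 of=/dev/null bs=1024k count=1000
# Disable MPxIO globally, or only for HBAs bound to one driver (reboot required)
stmsboot -d
stmsboot -D mpt_sas -d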
one raidz2.
I'm confused. I thought you had 6 drives? Hence, two 3-disk raidz vdevs for plan #1? Anyway, the performance should be fine unless you have some IOPS requirements you haven't mentioned.
Liam,
Ahem... it might be that turning off multipath really did my system some good. Since then, it is very responsive (i.e. it shuts down/restarts without problems, zpool status is displayed in a snap, etc.). I haven't had a single error since. Shut down yesterday to change some disks, which worked flawlessly. I want to let it run for a few more days before getting too excited... but it looks really good at the moment!
Keep you posted.
Thanks,
Cap'
Well, Liam, got excited too early.
Came home from work and one mirror of my main data pool was gone, both disks marked "too many errors". I was able to just clear the error, and then it started resilvering the spare back in, but... this is just NOT RELIABLE enough! Found a firmware update for my Samsung F3 drives... I guess I'll give it a shot. Let me know how your drive replacement works out (which, honestly, I cannot afford in the next 6 months or so, with HDD prices so high...).
Cheers, Cap'
May 31 17:17:00 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:00 NAS1 Disconnected command timeout for Target 13
May 31 17:17:04 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 mptsas_check_scsi_io: IOCStatus=0x48 IOCLogInfo=0x31140000
May 31 17:17:04 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 mptsas_check_task_mgt: IOCStatus=0x4a
May 31 17:17:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 mptsas_check_task_mgt: Task 0x3 failed. Target=13
May 31 17:17:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:04 NAS1 mptsas_ioc_task_management failed try to reset ioc to recovery!
May 31 17:17:06 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:06 NAS1 mptsas19 Firmware version v12.0.0.0 (?)
May 31 17:17:06 NAS1 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:06 NAS1 mptsas19: IOC Operational.
May 31 17:17:52 NAS1 scsi: [ID 243001 kern.info] /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:52 NAS1 mptsas_access_config_page: IOCStatus=0x22 IOCLogInfo=0x30030116
May 31 17:17:54 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 17:17:54 NAS1 Target 13 reset for command timeout recovery failed!
May 31 18:00:35 NAS1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
May 31 18:00:35 NAS1 EVENT-TIME: Thu May 31 18:00:34 CEST 2012
May 31 18:00:35 NAS1 PLATFORM: X8SIE, CSN: 0123456789, HOSTNAME: NAS1
May 31 18:00:35 NAS1 SOURCE: zfs-diagnosis, REV: 1.0
May 31 18:00:35 NAS1 EVENT-ID: 9648e099-ba39-68d5-f432-8ef5ffc08cbd
May 31 18:00:35 NAS1 DESC: The number of I/O errors associated with a ZFS device exceeded
May 31 18:00:35 NAS1 acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
May 31 18:00:35 NAS1 AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
May 31 18:00:35 NAS1 will be made to activate a hot spare if available.
May 31 18:00:35 NAS1 IMPACT: Fault tolerance of the pool may be compromised.
May 31 18:00:35 NAS1 REC-ACTION: Run 'zpool status -x' and replace the bad device.
May 31 18:18:00 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0/iport@20/disk@w50024e920636d9d9,0 (sd19):
May 31 18:18:00 NAS1 drive offline
May 31 18:19:04 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:04 NAS1 Disconnected command timeout for Target 10
May 31 18:19:08 NAS1 mptsas_check_task_mgt: IOCStatus=0x4a
May 31 18:19:08 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:08 NAS1 mptsas_check_task_mgt: Task 0x3 failed. Target=10
May 31 18:19:08 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:08 NAS1 mptsas_ioc_task_management failed try to reset ioc to recovery!
May 31 18:19:57 NAS1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,d13a@5/pci15d9,400@0 (mpt_sas19):
May 31 18:19:57 NAS1 Target 10 reset for command timeout recovery failed!
May 31 18:20:12 NAS1 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
May 31 18:20:12 NAS1 EVENT-TIME: Thu May 31 18:20:12 CEST 2012
May 31 18:20:12 NAS1 PLATFORM: X8SIE, CSN: 0123456789, HOSTNAME: NAS1
May 31 18:20:12 NAS1 SOURCE: zfs-diagnosis, REV: 1.0
May 31 18:20:12 NAS1 EVENT-ID: fd8205fe-389a-cbfe-89f0-c77794381ff7
May 31 18:20:12 NAS1 DESC: The number of I/O errors associated with a ZFS device exceeded
May 31 18:20:12 NAS1 acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information.
May 31 18:20:12 NAS1 AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt
May 31 18:20:12 NAS1 will be made to activate a hot spare if available.
May 31 18:20:12 NAS1 IMPACT: Fault tolerance of the pool may be compromised.
May 31 18:20:12 NAS1 REC-ACTION: Run 'zpool status -x' and replace the bad device.
Huh. Very interesting... thanks!
One more question: does NFS sharing (again, on Solaris 11 Express) only support NFSv4, or does it also support earlier versions like 2 and 3? (I'm trying to learn more about NFS...)
I've read that NFSv4 servers are incompatible with earlier versions, yet it seems my Ubuntu box can connect to the ZFS share over either NFSv4 or NFSv3...
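You can check both ends empirically: the Solaris server reports the NFS version range it will negotiate (v2 through v4 by default, which is why the v3 mount works), and the Linux client can be pinned to a specific version (the hostname and share path below are placeholders):

Code:
# On the Solaris 11 Express box: look at server_versmin / server_versmax
sharectl get nfs
# On the Ubuntu box, force v3 and then v4 to see what mounts
sudo mount -t nfs -o vers=3 nas1:/tank/data /mnt/test
sudo mount -t nfs4 nas1:/tank/data /mnt/test
# Verify what was actually negotiated
nfsstat -m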