OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Did anyone try CIFS guest access in this scenario:

1. Share zfs dataset, guest is allowed: zfs set sharesmb="name=public,guestok=true" poolX/share
2. Create a Unix group: groupadd grpB
3. Create a Unix user in that group: useradd -G grpB usrA
4. Set password for that user: passwd usrA
5. Create SMB group: smbadm create grpB
6. Add user to SMB group: smbadm add-member -m usrA grpB
7. Add smb mapping: idmap add -d wingroup:grpB unixgroup:grpB
8. Also map guest account to that user: idmap add -d winname:Guest unixuser:usrA
9. Share ACL is everyone@:full_set

If folder ACL of poolX/share is everyone@:full_set, guest can access successfully. ("/bin/chmod -R A=everyone@:full_set:fd:allow poolX/share")
If folder ACL of poolX/share is usrA:full_set, guest cannot access. ("/bin/chmod -R A=user:usrA:full_set:fd:allow poolX/share")

Why can't the guest account (mapped to usrA) access the folder on which usrA has full permissions?
Any help is appreciated.
Thanks.

I am not sure whether there is a Windows user name at all when you enable anonymous guest access,
so you may need at least everyone@=read.

Problem solved:

1. idmap settings only take effect after an OS reboot; svcadm restart has no effect.

2. Guest mapping:
With the default idmap config: Guest account => idmap => ephemeral Unix ID. Only the everyone@ ZFS ACL entry applies to that ephemeral account.
With an explicit idmap rule: Guest account => idmap => Unix account. The ZFS ACL entries for that Unix account are applied.

3. Guest login:
If the guest is mapped to a Unix account, that account must be set to no-password. No-password is not the same as an empty password.
Empty password:
passwd userxyz
press enter twice
No-password:
passwd -N userxyz OR passwd -d userxyz

4. The SMB account password must be synchronized with the Unix password, otherwise you can't log in.
For a no-password Unix account, use this command to create a no-password SMB account:
smbadm enable-user unixacc
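
To sum it up, this is roughly the full working sequence, condensed from the steps and fixes above (pool, user and group names are just the examples from this post; I have not re-tested this exact ordering):

zfs set sharesmb="name=public,guestok=true" poolX/share
groupadd grpB
useradd -G grpB usrA
passwd -N usrA                               # no-password, which is not the same as an empty password
smbadm create grpB
smbadm add-member -m usrA grpB
idmap add -d wingroup:grpB unixgroup:grpB
idmap add -d winname:Guest unixuser:usrA
smbadm enable-user usrA                      # creates the matching no-password SMB account
/bin/chmod -R A=user:usrA:full_set:fd:allow poolX/share
# reboot afterwards - the idmap changes only took effect after an OS reboot here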
 
I am building an OmniOS server with 40 3TB drives. I also have another 12-15 SSDs to put in the server, along with a ZeusRAM. Should I create one big pool and use the SSDs for L2ARC, or keep those separate? The server is a dual quad-core with 280GB of RAM. It will be used for VMware iSCSI plus backups and replication of the primary system, and also for non-production and test servers. Controllers are all LSI 9211, with QDR InfiniBand and 10G Ethernet.

Let me know what you think would be best. Originally I was just going to separate the pools; I just want to make sure that is the right idea. I have 6 x 500GB and 6 x 250GB SSDs right now, which would cover all of my hot server data. What do you recommend?
 
I would use the slow 3TB disks for backup and replication in a 4 x 10-disk RAID-Z2, and build a second high-speed pool from SSDs for the ESXi VMs in a 2 x 6-SSD RAID-Z2 config, with the ZeusRAM as ZIL to keep small sync writes off the SSDs. You can ignore the unbalanced config.

With 280GB RAM there is mostly no need for an L2ARC (check your ARC statistics).
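
As a very rough sketch of that layout (the cXtYdZ device names below are placeholders, not your real disks):

# VM pool: 2 x 6-SSD RAID-Z2 with the ZeusRAM as dedicated log (ZIL) device
zpool create vmpool \
  raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 \
  raidz2 c3t6d0 c3t7d0 c3t8d0 c3t9d0 c3t10d0 c3t11d0 \
  log c3t12d0
# backup pool: same idea with the 3TB disks, a "raidz2 <10 disks>" group repeated four times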
 
Do those SSDs cover all your needs presuming every disk is utilized for data? Meaning, have you accounted for loss due to parity or mirroring?

What are your IO needs? It might be doable to do a single 6-disk RAID-Z2 with the 500GB SSDs, giving you 2TB of usable space and somewhere around 50K random read IOPS. If you use the ZeusRAM here your write performance will be stellar too.

You could then have a pool of, say, 20 of those 3TB drives, mirrored or 2 x 10-disk RAID-Z2, and front them with 4 of the 250GB SSDs as cache, using the other two for ZIL drives.

This leaves you 20 3TB drives to create an all-spinner pool for backups etc., with sync disabled on this pool.

In a config like that I would put databases and any other high-IO workload on the 500GB SSDs, OS VMDKs and other lower-IO stuff on the mirror pool with read cache, and use the spinner pool for backup.
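
In command form that is only a few lines; the pool and device names here are made up for illustration:

# front the hybrid spinner pool with four 250GB SSDs as L2ARC and two as mirrored ZIL
zpool add hybrid cache c4t0d0 c4t1d0 c4t2d0 c4t3d0
zpool add hybrid log mirror c4t4d0 c4t5d0
# the all-spinner backup pool deliberately skips sync-write guarantees
zfs set sync=disabled backup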
 
I tried an all-SSD pool and there was no significant performance difference between the all-SSD pool and a hybrid pool. There are too many overheads here and there; by the time your VM's IO hits the pool, all-SSD is not much faster than hybrid if you have enough SSD cache.

If your dataset fits into the SSD cache, you are not going to see a difference.
 
It depends on the load and the tunables you've implemented. All-SSD pools can safely get away with some of the more unsafe tunables. Write throttling, as an example, should never be disabled (in prod) unless you have an all-SSD pool, since SSDs can keep up even under heavy load.

The catch is that if you have multiple pools in the system and one of those pools has spinners in it, you have to balance the tunables accordingly.
 
I like the idea of an all-in-one pool and just adding more and more cache as needed. Also, that way I can have the whole pool utilize the ZeusRAM.

Does anyone else think this is a bad idea? I would do 4 x 10-disk Z2 of 7200rpm 3TB disks, and then 6 x 500GB plus another 6 x 250GB SSDs for cache.
 
You can only use more cache up to a point. That point is defined by arc_meta_limit, and typically 50% of RAM is allocated for metadata. To find out how much L2ARC you can then address, the math looks like this.

50% of RAM minus 1GB is available for metadata; let's just use 128GB as an example:

(128GB / 2) - 1GB = 63GB for metadata

63GB converted to bytes is 67645734912 bytes

67645734912 / 180 (the size of an L2ARC pointer is 180 bytes) = 375809638 total pointers

375809638 pointers * the record size of your pool = amount of usable L2ARC. If we use 8K as the dataset record size, then 63GB of RAM can hold enough L2ARC pointers to address roughly 2.8TB of L2ARC using 8K records.
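
If it helps, here is the same arithmetic as a quick shell calculation (the 128GB RAM figure, the 50%-of-RAM metadata assumption and the 180-byte pointer size are just the example values from above):

ram_gb=128                                 # total RAM in the box
recsize=8192                               # dataset recordsize in bytes
gib=$((1024 * 1024 * 1024))
meta_bytes=$(( ram_gb * gib / 2 - gib ))   # 50% of RAM minus 1GB for ARC metadata
pointers=$(( meta_bytes / 180 ))           # one L2ARC pointer is ~180 bytes
echo "$(( pointers * recsize / gib )) GiB of addressable L2ARC at ${recsize}-byte records"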
 
Also, to better answer your second point: I would still create a mirror pool from those 500GB SSDs and use the 250s to front a hybrid pool. I would use mirrors for the spinners as well, although in your case, since you seem to have a LOT more capacity than you need, I would set up a pool using triple mirrors.

So create a pool with 13 triple-mirror vdevs, the six 250s for cache giving you 1.5TB of L2ARC, and the ZeusRAM for SLOG. This leaves you one spare drive and will give you great performance, with usable space of 39TB in the hybrid pool plus 1.5TB in the SSD pool.

Unless you have a very well-defined reason or need, always use mirrors (just my opinion).

*edit* One thing to note with mirror configs is the write ballooning that happens on the back end, and the importance of proper cabling. For example, if you have 100MB of writes coming into the box, you have 200MB of writes going to disk with traditional mirroring and 300MB with triple mirrors. Each SAS cable is good for about 2GB/s. Your environment doesn't sound super heavy on IO, but if at some point the requirements grow, having 2 cables into each JBOD FRU/controller is NEVER a bad thing.
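
For illustration only, a triple-mirror pool would be built something like this (device names are placeholders, and you would extend the list of mirror triples out to all 13 vdevs):

zpool create tank \
  mirror c1t0d0 c1t1d0 c1t2d0 \
  mirror c1t3d0 c1t4d0 c1t5d0 \
  cache c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
  log c2t6d0
# ...add the remaining triple-mirror vdevs the same way, 13 in total, with the ZeusRAM as the log device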
 
You can check the system log, but tell us more about your OS and whether you are in
Workgroup or Domain mode. Does an SMB service stop/start solve the problem?

Sorry for the late reply...

Domain mode, I actually have not tried to stop/start the SMB service. When it happens again (since my last post it has not) I will definitely try that and report back.
 
I did experience the same issue.
I used one vmkernel interface for both management and storage. OI/Omni had only one e1000 NIC.

During high load, the vSphere host/storage showed a "disconnected" state.

My fix:
For the ESXi host: I created a separate management vmkernel in one VLAN and another vmkernel for storage in a different VLAN, and I set a different physical NIC teaming policy for each VLAN.
For the storage VM: one vNIC for storage, one vNIC for management, one vNIC for ZFS replication.

Use vmxnet3 if you can; the E1000 vNIC will drop frames under heavy load. SSH to the ESXi host, run esxtop, press 'N', and read the DROP column.

HTH.

So, I got around to testing some more. I tried Intel NICs and separated the management and NFS networks, making sure the traffic goes through the appropriate vmkernels. Same NFS (and even iSCSI) disconnect issue with OmniOS/napp-it and ESXi 5.1 when replicating via Veeam.

I gave up and went back to FreeNAS 9.1 beta. I've been replicating the same stuff and have absolutely NO NFS disconnect issues using vmxnet3 on the Veeam VM. The bandwidth is slower, but I can live with that as it's stable.

Too bad, because the OmniOS ZFS system was saturating my 1Gb NICs easily while the FreeNAS box wasn't. I'm still scratching my head about what's going on. I don't know jack about OmniOS, so I can't even try to troubleshoot.
 
There is an ESXi 5.1 Update 1 out that should resolve most of the dropouts with vmxnet3 on OpenSolaris builds.
Have you at least applied this patch to ESXi?
ESXi 5.1 update patch ESXi510-201212001
Build Number: 914609, Release Date: 20.12.2012
Source: https://hostupdate.vmware.com/software/VUM/OFFLINE/release-368-20121217-718319/ESXi510-201212001.zip

Thanks for the info, but I'm already two patches above this. I'm at build 1065491.

Also, I've been testing FreeNAS 9.1 beta and have no NFS disconnect whatsoever, after replicating 100's of GB. So, this is telling me that there is something up with OmniOS's NFS or Network stack not playing nice with ESXi 5.1
 
I have been running a couple of NFS servers (mainly OmniOS but also some OI) for years with only one problem: I had offline NFS datastores, with messages in the ESXi logs that the datastore was set offline due to not responding within about two minutes.

In the Omni system log and fault management log I found error entries as well, but without a pool or disk failure. It seems that Solaris ZFS waits longer on errors than ESXi does.
For other problems I would suspect a driver issue if it works with BSD.
 
1. What brand is your server NIC?

2. What switch are you using?

3. Did you enable storm control on your switch port? (I did make that mistake.)

4. Did you enable MTU 9000 anywhere?

I built a small ZFS cluster for my company. Every 15 minutes there is about 20GB of ZFS replication between two OmniOS bloody nodes. Those ZFS boxes have 10GbE Intel 520 NICs connected to an IBM r8052 switch. They have been running just fine for 8 months already.
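
The replication itself is nothing exotic; conceptually it is just an incremental zfs send/receive on a 15-minute cron job, roughly like the sketch below (dataset, host and snapshot names are invented for illustration, and a real script - or napp-it's replication extension - also needs error handling and snapshot pruning):

# /root/replicate.sh - minimal incremental replication sketch, run from cron every 15 minutes
DS=tank/vmdata                          # dataset to replicate (placeholder)
DEST=node2                              # replication target host (placeholder)
PREV=$(cat /root/.last-repl-snap)       # snapshot name from the previous run
NEW=repl-$(date +%Y%m%d%H%M)
zfs snapshot -r "$DS@$NEW"
zfs send -R -i "@$PREV" "$DS@$NEW" | ssh "$DEST" zfs receive -Fu "$DS"
echo "$NEW" > /root/.last-repl-snap
# prune old repl-* snapshots on both sides as needed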
 
1. At first, I was using the built-in NetXtreme II that came with the HP ML370 G5. I added two Intel Gb NICs, the standard/popular ones. It didn't matter which NICs I used, still NFS disconnects.

2. Cisco 3750 gigabit (no special config). Also tried a cheap dumb green gigabit switch.

3. See #2, no special config on the Cisco switch.

4. No MTU 9000 anywhere.

Like I mentioned, I used the same hardware (HP ML370 G5) with FreeNAS 9.1 beta. I've been replicating close to 1TB with no NFS disconnects at all. So this tells me that, like Gea said, there might be a driver issue with OmniOS and the HP ML370 G5 storage box.

I can try the Bloody version of OmniOS and see. I'll also try it on a different server, an HP DL380 G5. We'll see...
 
Is it possible to obtain the vmxnet3 fix only, from the ESXi510-201212001 patch mentioned earlier? I am on ESXi 5.0U1 and I don't want to go to 5.1. Is the vmxnet3 fix in the ESXi driver, is it a new OpenSolaris driver, or both?

Thanks.
 
Has anyone had luck getting the new Intel i210 NICs that Supermicro is putting on their Haswell boards working in OmniOS or OpenIndiana? I haven't been able to find any drivers or tweaks to get the existing igb drivers to work.
 
I've been running a 6 SATA disk ZFS system off a Jetway Atom board... and it's time to graduate to real gear now

I'm looking to base a system off the SC836A-R1200B chassis and X9SCL+-F motherboard.
The whole HBA/SAS/Expander thing is still confusing to me, so I want to ask

With this case's backplane, if I get a LSI SAS 9207-8i all I need is 2 SFF-8087 cables and I'll be set for 8 hotswap drives?

When looking for a LSI SAS 9207-8i vendor, sometimes it's listed as (KIT) and sometimes as (SGL). Is the kit just 2 cables? (Is this even the right LSI part?)

Can I fit one or two (2.5") SATA drives somewhere internal in the case for the OS drive? I guess I could also use USB sticks.

I was going to use 8 bays for a 6 drive raidz-2 vdev + 1 hotspare + a l2arc SSD and leave the other 8 open for now.

Planning on using OmniOS. Any other glaring issues?
 
Regarding your NFS disconnects: do you have a distributed switch license?

1. If yes, try moving the OmniOS vNIC and the storage vmkernel to a port group that has the load-balancing policy "Route based on physical NIC load".

2. Otherwise, you could try putting the vmkernel, the OmniOS storage vNIC, and the OmniOS replication vNIC on different port groups. Each port group should have a different physical uplink adapter.

Hope that helps.
 
Slightly off topic, but can anyone point me in the right direction for installing Grsync or another GUI-based front end on OpenIndiana?

Thanks.
 
My napp-it works just fine with the latest kernel on OI. This is also shown by it working on OmniOS, which uses the same newer kernel.
 
Strange. I can no longer log in via the web interface (no errors that I can find) and I cannot reinstall the latest release:
step 1a: check os, setup napp-it or updating to the most current version now
please wait uname -a: SunOS hostname 5.11 illumos-7256a34 i86pc i386 i86pc Solaris

ok, your OS is not supported

Are you sure you're running off the hipster repo?
 
Hello,

is there any simple way (simple, meaning other than sendmail) to configure OmniOS to send e-mails from the command line, for example with the mailx command? I used msmtp on Nexenta; is there anything similar on OmniOS?

I need that for some script error log notification.

Thank you all.
 
It seems that napp-it does not support OpenIndiana once it has been migrated to the 'hipster' repo: http://openindiana.org/pipermail/oi-dev/2013-May/002109.html. I would assume everything should work as-is and the kernel version simply doesn't match. Has anyone run into this yet?

On Hipster, Perl + CGI seems broken
(admin.pl does not receive any values)

ex
http://172.16.11.4:81/cgi-bin/admin.pl?name=xx
should give a value for name in hash %in

if someone wants to search the reason in Hipster:
edit admin.pl line 147 and look for line
print "Content-type: text/html\n\n";

add after this line a command to print given parameters
&print_hash(%in);


ps
napp-it now installs via wget on Hipster.
Unless this CGI problem is fixed, Hipster will not work with napp-it.
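
(For reference, the online installer is just a wget one-liner piped into perl; from memory it is the line below, but check napp-it.org for the current form.)

wget -O - www.napp-it.org/nappit | perl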
 
Is there a way to set an "insecure" option for my NFS share?

I setup a new server running OmniOS and installed napp-it to configure all my ZFS settings and shares.

I successfully mounted my NFS share on my Mac, but I'm unable to copy existing files/folders on my Mac to the NFS share. I can create new folders and files on the share from my mac, but existing ones can not be moved.

Interestingly enough, I can copy files/folders via the terminal, but not with the Finder.

My research has led me to a setting in NFS called "insecure" that must be enabled on the service side of things. This is usually enabled with a setting like:

*(rw,insecure)

The insecure option is a requirement in order to use NFS shares with Macs.

napp-it stores that setting a bit differently though. It uses something like:

[email protected]

The thing is, I don't see how to add my option "insecure" to that string so that my Mac can properly access my NFS shares. Is there a way to get this working?

By the way, I've tried:

rw,[email protected]
insecure,[email protected]
[email protected],[email protected]
[email protected],[email protected]
[email protected] insecure

The first three settings threw errors in napp-it for invalid options. The last two setting saved fine but still didn't work.
 
In case anyone else comes across this issue, the fix was a setting within napp-it. On the ZFS share permissions page, change the aclinherit and aclmode to passthrough.
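
For anyone who prefers the shell over the napp-it GUI, the same change should be possible with zfs set (the dataset name below is a placeholder):

zfs set aclinherit=passthrough tank/share
zfs set aclmode=passthrough tank/share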
 
I'm a bit confused about how to get permissions working correctly between my Mac and my NFS share.

On my Ubuntu machine, when I create a new file on my NFS share, or I copy an existing file, the permissions are always 777 nobody:nogroup

However, on my Mac things are a bit different. If I create a new file, the permissions are 777 nobody:nogroup, but if I copy an existing file using the Finder, it's copying my permissions and ownership with the file. For example, a file on my mac may have 644 kris:staff permissions/ownership. If I copy that file to my NFS share it retains those permissions/ownership.

On the napp-it side of things, it sees that file as being 644 501:games. I added a new user with my name and assigned it to UID 501, and now napp-it reports the file as being 644 kris:games.

games is a pre-defined unix group within napp-it that is taking the staff ID (20) from my Mac.

All that said, is there a way to get my Mac to write permissions like my Ubuntu machine does? Everything is 777 and the user:group is always nobody:nogroup? I've tried different mount options on my Mac, and I tried setting the acl mode to passthrough, discard, and groupmask, but none of those made a difference.

 
I cannot tell you if there is an option to modify the behaviour of your Mac,
but I would try:

- set the ACL of the NFS-shared folder to everyone@=modify with file and folder inheritance enabled
- set aclmode to restricted to block chmod (a chmod to Unix permissions deletes ACL inheritance)
- enable the sharing option root=@ips or subnet to allow access without regard to UID

The only other options may be:
- check if Ubuntu can use the same UID as your Mac
- reset the ACL to everyone@ recursively

Another option:
- use NFSv4 or CIFS (with user authentication)
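
Roughly, as commands (path, dataset and subnet below are placeholders, adjust to your setup):

/bin/chmod -R A=everyone@:modify_set:fd:allow /tank/share    # everyone@=modify with file+folder inheritance
zfs set aclmode=restricted tank/share                        # ignore chmod so ACL inheritance survives
zfs set sharenfs="rw,root=@192.168.1.0/24" tank/share        # root= access for the trusted subnet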
 
Thanks Gea. That didn't end up working, but it looks like it's really just my misunderstanding of how NFS works. NFS lets the clients manage the permissions. So unlike Samba, where I can control the permissions and ownership through the smb.conf file, with NFS the clients control that. If I want my group permissions to be rw, I need to update the umask on the Mac side of things instead of trying to get NFS to manage that data.

Another question though. It sounds like with NFS shares the "modify" ACL is the correct default setting to use? Also, with regards to aclinherit and aclmode, are those better off being set to something specific when using NFS only? I know aclinherit doesn't work properly on my Mac when set to restricted. But when I try playing around with some of the other options I'm not seeing any differences with permissions or ownership.
 
NFSv3 depends only on the Unix UID and client IP, without authentication or verification. So it is used mainly when you need a high-performance protocol with anonymous access, like video editing or ESXi datastores in a secure network environment.

As Solaris is an ACL OS (with Windows-like NFSv4 ACLs), I would always set permissions based on ACLs, where you can use the basic ACL entries everyone@, group@ and owner@ similar to old-style Unix permissions like 644, with everyone@=modify as a permission to start with.

If you create a new file or folder, you want to inherit permissions from the parent folder, so aclinherit=passthrough is usually what you want.

Aclmode controls the behaviour when a service tries to modify permissions to Unix style like 644. Such a command deletes ACLs and the advanced ACL inheritance settings. If you use NFSv3 only, you can set it to passthrough. If you want to keep ACLs intact (needed e.g. for Solaris CIFS), restricted (ignore chmod - only available on OmniOS) is what you mostly want.

ps
Best of all protocols regarding cross-platform use and features is Solaris CIFS.
Sadly it is the slowest on Macs (hopes for 10.9, where Apple seems to be moving to CIFS as the default instead of the current AFP).
 
Thanks for the detailed reply Gea.

I'm hoping that Apple can get SMB2 right with Mavericks. I'm surprised they're dropping AFP in favor of something non-Apple, but I see it as a better move for everyone (as long as performance is on par with Windows machines).

I've used Samba since the early 2000s. My home was always Windows PCs, servers, and media centers. Samba just made sense and worked. But around 2009 I switched to *nix machines; the servers/media clients are Linux-based, while our workstations are Macs. Samba has always worked with Macs and my Linux machines, but the performance is never that great, usually half the speed Windows machines would get.

I currently do video editing, as well as music recording, on my MacBook. I have a second 1TB spinning drive in there, but that's filled up and I'm tired of archiving data to my NAS to make room for new data. I'm also planning on going to a Retina MacBook this year, which means no second hard drive in my laptop. This led me to storing my audio/video editing files on a NAS.

I've played around with all the protocols. SMB, AFP, NFS, and iSCSI. I won't use AFP even though it's speedy on the Mac. I've had too many networking issues with corruption and transfer errors. SMB always works, but it's too darn slow. That leaves me with NFS and iSCSI.

iSCSI provides the best read/write performance; I consistently get 105 MB/s. NFS provides the same read performance, but write performance drops down to 70 MB/s. However, since this is a laptop, iSCSI is a bit of a pain to use. I'd ideally like to disconnect my laptop from my network, work remotely, then reconnect and have access to my share without even noticing or doing any extra work. With iSCSI, I would need to manually unmount the share and remount it whenever I disconnect or reconnect to my network.

With NFS however (using the soft option), My Mac will either have access to the data or not. If I'm on my network and I try to read/write to that mount point, it will work. If I'm out of my network it won't. In both cases, no extra work is needed on my part.

So I'm going to try using NFS for now (and we'll see how SMB 2 works with Mavericks). If I'm having issues with video editing or music production, I can use iSCSI and just deal with the manual headache of having to disconnect and re-connect when leaving my network. If only iSCSI was smart enough to safely unmount and remount whenever I tried to access that share. iSCSI would be the go-to option for a desktop needing the best performance, but for a laptop that's not always on my network, I think NFS will be the best option.
 
So my system has become unusable (very, very slow) today. Pool status shows the following:
pool: pool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: none requested
config:

NAME                        STATE     READ WRITE CKSUM
pool                        DEGRADED     0     0     0
  raidz1-0                  DEGRADED     0     0     0
    c2t5000C5003E17A51Ad0   DEGRADED     0     0     0  too many errors
    c2t5000C50045C136ACd0   ONLINE       0     0     0
    c2t50014EE2593BD11Fd0   FAULTED      0     0     0  too many errors
    c2t50024E920667E9B5d0   ONLINE       0     0     0
    c2t50024E920667E9C6d0   ONLINE       0     0     0


Here are the hard drives:

c2t5000C5003E17A51Ad0 2.00 TB pool raidz DEGRADED Error: S:0 H:0 T:0 ST2000DL003-9VT166 sat,12 PASSED 5YD4WR7B 56 °C
c2t5000C50045C136ACd0 2.00 TB pool raidz ONLINE Error: S:0 H:0 T:0 ST2000DL003-9VT166 sat,12 PASSED 6YD18L9A 55 °C
c2t50014EE2593BD11Fd0 2.00 TB pool raidz FAULTED Error: S:0 H:30 T:10 WDC WD20EADS-00S2B0 sat,12 PASSED WDWCAVY2040954 66 °C
c2t50024E920667E9B5d0 2.00 TB pool raidz ONLINE Error: S:0 H:0 T:0 SAMSUNG HD204UI sat,12 PASSED S2HGJ9JBA02015 60 °C
c2t50024E920667E9C6d0 2.00 TB pool raidz ONLINE Error: S:0 H:0 T:0 SAMSUNG HD204UI sat,12 PASSED S2HGJ9JBA02016 56 °C
c4t0d0 21.47 GB rpool basic ONLINE Error: S:0 H:0 T:0 VMware disk n.a. 6000c293b3bbe29afe7ea89453c477dd -


Should I go ahead and replace the WDC drive? How can the state be faulted, my pool be degraded to the point of unusability, but all hard drives pass smart testing?

Here is smart info for WDC drive:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 199 198 051 Pre-fail Always - 4806
3 Spin_Up_Time 0x0027 152 140 021 Pre-fail Always - 9375
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 96
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 090 089 000 Old_age Always - 7762
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 80
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 55
193 Load_Cycle_Count 0x0032 057 057 000 Old_age Always - 429312
194 Temperature_Celsius 0x0022 087 079 000 Old_age Always - 65
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 196 196 000 Old_age Always - 1465
198 Offline_Uncorrectable 0x0030 196 196 000 Old_age Offline - 1463
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 001 001 000 Old_age Offline - 429648
 
Sheekamoo, thanks!
Your post was the final straw that made me abandon AFP [netatalk v3] due to copying issues: transfer errors when dealing with large amounts of data [500GB+]. Now I'm happy with NFS :). Same speed, without stoppages/errors.
 
Should I go ahead and replace the WDC drive? How can the state be faulted, my pool be degraded to the point of unusability, but all hard drives pass smart testing?

Replace the faulted disk immediately and do a low-level test on the degraded disk afterwards!

ZFS faults are real faults.
SMART values are expectations, no more. Very often you have faulted disks without any SMART warnings.
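
The replacement itself is a single command once the new disk is in; the new device name below is a placeholder:

zpool replace pool c2t50014EE2593BD11Fd0 c2tNEWDISKd0
zpool status -v pool        # watch the resilver, then scrub and re-check the degraded Seagate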
 
It just took me 6 hours to back up 1000 pictures off the pool. Horribly slow. The media I don't care about backing up that much; it's all replaceable. However, while backing up the photos it generated 1.18k read errors on the degraded hard drive, so I'm swapping that one out within a couple of hours. I'll rebuild the pool and hope for the best. I'll order another 2TB drive today to replace the faulted drive, then send them both out for new ones.

Pain in the ass LOL. Hopefully I don't lose too much media.
 