OpenSolaris derived ZFS NAS/ SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Xenner

n00b
Joined
Dec 23, 2011
Messages
12
Now I booted back up and it is showing my one good drive in that array as faulted and "corrupt". Is that because its mirror disappeared?
 

Xenner

n00b
Joined
Dec 23, 2011
Messages
12
Crashing and burning now -- getting write failures on boot (no space left on device). I have changed nothing on the boot drive, which is an SSD...
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
3,958
I suppose your rpool was nearly full and the logged errors were too much for it.
I would first look at the system snapshots (beadm list) and destroy the unneeded ones (beadm destroy bename)

About your pool:
If you remove a disk from a mirror and the second disk fails, your pool is offline and lost.
Shut down, power off/on, and hope that the disk comes back
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
3,958
seems fine. the actual data is on 15 750GB ultra stars. a dd bench had something like ~350MB/s write and ~750MB/s read for a 14 disk raid 10 (7 mirror sets with a hotspare)

can you expand further on why you dislike DOMs? apart from this particular DOM which is small and a bit old i would think 16GB SATA DOMs for the root pool would be fine. If it is a log write issue can't local logging be disabled entirely in favor of a centralized system like say splunk?

also, and i apologize as I haven't spent more than 30 minutes with napp-it at this point, but how do i disable napp-it logging to the local console? super annoying when trying to work in the console and the gui at the same time :).

Besides the size, DOMs are mostly too slow to have fun with Solaris (if you have a fast one, OK).
I would prefer a modern small 40 GB SATA SSD, mostly cheaper and faster.

About console: use a remote console via PuTTY
 

Xenner

n00b
Joined
Dec 23, 2011
Messages
12
I'm somewhat back up and running but this new disk is a Seagate ST2000DL003 4k sector disk. My other 2TB drives are 512b ST32000542AS.

I cannot add it back in because of sector alignment...

cannot attach c2t5000C500370C382Bd0 to c2t5000C50029ECDE52d0: devices have different sector alignment

Help!!
 

Latent

n00b
Joined
Jan 16, 2012
Messages
58
Well, you're out of luck replacing with that drive. That drive reports 4k sectors and it seems you can't mix sector sizes in the same vdev. This makes a lot of sense when you think about it, because each drive has to mirror the same data and trying to do this when the base unit of storage is different would be very hard! You will have to either copy all the data off your pool and set up a new pool, or source a drive that reports 512-byte sectors. Note that most drives of this size are now 4k-sector drives that report 512-byte sectors for compatibility, which makes them very bad for ZFS as it doesn't handle the alignment without hacks right now.

The great news is that the new Seagate drive is a TRUE 4k-sector drive, so it is a great drive for ZFS, but you need more than one of them so you can combine them into a 4k-sector vdev. As I touched on before, you can't transition to a 4k-sector vdev from a 512-byte one as they are not compatible, so you have to create a brand new pool, manually copy your data over, and then blow away your old pool; after that you can reuse the old 512-byte drives.

I'm not 100% sure if the 4k/512-byte matching just relates to individual vdevs in a pool or the whole pool. You may be able to have one vdev of each type in the same pool to reuse the old drives.

For new builds these 4k drives may be a good option, but you would have to keep a hot/cold spare drive, as you won't be able to use any other brand of 2TB drive since they all report 512-byte sectors.

Edit: online I found others having problems with this drive reporting 512-byte physical sectors, so unless they have changed the firmware it should not be complaining like it did for you.
 
Joined
Jan 25, 2012
Messages
21
That this could happen is very disconcerting, as the main benefit of RAID is that you can replace a drive if it fails. I realize the whole 512-byte/4k thing is mainly the drive makers' fault, but there's not much we can do about that other than work around it. I'm currently evaluating ZFS / OpenSolaris and it seems to me that 4k is inevitable; that's the direction things are going, so I might as well embrace it from the beginning. It wouldn't hurt anything to force 4k sectors right from the start when I set up my pool, even if all my drives have 512-byte compatibility, right? That way, when the drive manufacturers are only making 4k drives, they can be replaced without a problem. What is the best way to do this?
 

Xenner

n00b
Joined
Dec 23, 2011
Messages
12
Ok, acquired two new drives, going to re-do this all with RaidZ1

Is this kosher -- only drives I have right now:

3x2tb Seagate 5900rpm Green 512b
1x2tb Seagate 5900rpm Green 4k
1x3tb Seagate 5900rpm Green 4k

Have them in a RaidZ1, actually showing about 9TB usable space...it is using the whole 3tb drive instead of just 2tb of it as I would have expected. Pool has ashift=12, so I am good to go with replacing 512b drives with 4k drives if they fail.

Is this "OK" for data integrity, say the 3tb drive fails?
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,654
That doesn't sound right. I think it can only use 2TB of the 3TB drive. 5x2TB is 10TB. Also, I seem to recall the 'zpool' command shows total space, not usable space. What does 'zfs list' show?
 

Xenner

n00b
Joined
Dec 23, 2011
Messages
12
Napp-it showing 11tb total space, 9tb useable.

ZFS list shows 7.84TB if you add used and available.
 

Latent

n00b
Joined
Jan 16, 2012
Messages
58
Xenner,

Watch out, as RaidZ1 is not recommended with drives > 500GB or so. Large drives like this take a long time to rebuild, and with RaidZ1 you have no redundancy left during the rebuild, so the chance of hitting a bad sector in that time is quite high. If your data is important then keep a good backup and/or go to RaidZ2 or higher. Note that RaidZ2 performs best with 6 drives (4 x data, 2 x parity). Also note you can't convert from Z1 -> Z2 without rebuilding your pool and copying the data back on.
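
To put a rough number on the rebuild risk, here is a back-of-the-envelope sketch. It assumes an unrecoverable read error rate of 1 per 1e14 bits (a figure often quoted on consumer drive spec sheets, not something measured on these disks) and a completely full vdev, so every surviving disk has to be read end to end. Real rebuilds of a partly filled pool read less, so treat it as a pessimistic estimate, not a prediction.

```python
import math


def rebuild_read_error_probability(surviving_disks: int, disk_tb: float,
                                   ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error while reading
    every surviving disk in full (assumed worst case: a full vdev)."""
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # P(no error) = (1 - p) ** bits_read; log1p/expm1 keep this numerically stable.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Hypothetical case: 5-disk RaidZ1 of 2TB drives, one disk failed, four left to read.
p = rebuild_read_error_probability(surviving_disks=4, disk_tb=2.0)
print(f"{p:.0%}")
```

With a second parity disk (RaidZ2) a single read error during the rebuild can still be corrected, which is the point of the recommendation above.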

Also, if you have a 3TB drive in your vdev then only 2TB of it can be used for data, but the extra 1TB may show up in the raw total pool size. This raw size includes the space set aside for redundancy as well. If you were to fail and replace the 2TB drives with 3TB ones over time, then when the last drive was upgraded to 3TB you would get a sudden jump in usable storage. It always uses the smallest drive in the vdev to decide the disk usage, I think.
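
The smallest-drive rule can be sketched as simple arithmetic. This is only an approximation under stated assumptions: it ignores metadata and allocation overhead, so it will not exactly match what napp-it or 'zfs list' report, and the function name is made up for illustration.

```python
def raidz_capacity_tb(disk_sizes_tb, parity=1):
    """Approximate raw and usable capacity of one RAIDZ vdev.

    Every member is treated as if it were the size of the smallest disk.
    """
    smallest = min(disk_sizes_tb)
    count = len(disk_sizes_tb)
    raw = smallest * count
    usable = smallest * (count - parity)
    return raw, usable


# Drive makers count in TB (10**12 bytes); many tools display TiB (2**40 bytes).
TB_TO_TIB = 1e12 / 2**40

# The mixed vdev from this thread: four 2TB drives and one 3TB drive, RaidZ1.
raw, usable = raidz_capacity_tb([2, 2, 2, 2, 3], parity=1)
print(raw, usable, round(usable * TB_TO_TIB, 2))

# After every 2TB drive has been replaced by a 3TB one.
raw_after, usable_after = raidz_capacity_tb([3, 3, 3, 3, 3], parity=1)
print(raw_after, usable_after)
```

The first case gives 10 TB raw and 8 TB usable (about 7.28 TiB); once all five drives are 3TB the usable figure jumps to 12 TB.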
 

nicka

n00b
Joined
Dec 30, 2010
Messages
27
4x500GB 7200rpm, and 2x1.5TB 5900 rpm. Go two pools or one with a vdev for each set of drives?
 

msitpro

Weaksauce
Joined
Oct 10, 2011
Messages
106
Just a quick heads up to log device users.

I think I have the answer to something I've been unable to establish using my Google-Fu.

ZIL log devices seem to be striped when there is more than one log vdev.
I have 2 separate log devices defined in a VM and according to iostat ZFS is splitting writes to them 50/50...

I wonder if anybody is able to test the following for me:

1. Write speeds with 2x physical SSD in log mirror

2. Write speed with 2x physical SSD assigned as individual log devices (unsafe but just for testing)
(alternatively 4x SSDs split into 2 log mirrors)



If it works as I expect (striping over vdevs as the actual zpool does) then that would be a viable solution to getting decent write speeds..... Using some very cheap, not-so-great-performing old enterprise SSD for example but get 4 of them for 2x mirror vdevs.

It would be an alternative to ZeusRAM and similar ££££ devices as a log device.
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
depends. if you're striping acards or ddrdrives, both of which have battery backup and flash, there really is nothing to be terribly concerned with.

if you're using pool version 28 (or w/e was the first to allow detaching zil) with SSDs (or non super cap SSDs) your only real risk is losing in flight data. honestly mirroring SSDs imo is pointless. you gain nothing. i have never heard of SSDs randomly just failing before the NAND wears out. how could they? there are no moving parts. point is if you're mirroring then you're writing the exact same data at the exact same frequency. if/when one SSD wears out the other mirrored SSD (presuming same make and both were new) is going to wear out at the exact same time.

i suppose you could use a staggered strategy. go with a single or striped set of SSD ZILs then after a few months add a mirror. at this point you have redundancy that will fail at offset intervals.
 
Joined
Feb 19, 2012
Messages
4
The main advantage for mirroring ZIL with current zpool versions seems to be that when one of the mirrors fails you don't suffer the performance degradation that would otherwise occur when the ZIL fails (switching from an SSD-based fast ZIL to a pool-based slow ZIL.)

I don't know if it's the case that if you have two ZIL devices in a mirror configuration that they are going to both fail at the same time because they have the same rate of wear.
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
why wouldn't it be the case? NAND has a limited amount of write cycles. the ZIL is write heavy. mirroring means writing to two different devices at the same time. which means the two SSDs are both failing at the same exact rate.
 
Joined
Feb 19, 2012
Messages
4
Because write cycles is an approximation? Not to mention that there are other failure conditions where having a mirrored ZIL is beneficial. You said that mirroring the ZIL was pointless and gained you nothing. I disagree.
 

msitpro

Weaksauce
Joined
Oct 10, 2011
Messages
106
I disagree too. SSDs do fail from other factors than just wear.

I had a first gen OCZ Agility that when used heavily drops off the sata controller but otherwise works fine.

I think if I was building a ZFS box (non-HA) for work tomorrow I would go for 4+ SLC drives with power protection for logs configured in mirrors.
 

Waffles730

Limp Gawd
Joined
May 31, 2005
Messages
136
... The drives are all upside down though for cabling purposes - HDD's can cope with any orientation though can't they?...

I've always heard that you can mount a hard drive in any orientation, but once you choose an orientation, it should stay that way for the rest of its life. This is because the motor bearings may wear slightly over time, settle in and stay functioning, but if you change the orientation, the bearings will shift and you could end up with crashed heads/etc.

That being said, I have 2x36GB original WD Raptors that have been mounted in every possible orientation over the course of their 8-year life, and they're still kicking. Probably not for much longer though... :)
 

jim82

Weaksauce
Joined
Nov 1, 2011
Messages
77
I'm running the all-in-one version and resized my OI system disk from 12GB -> 16GB in ESXi, but I still need to expand the system drive in OI.

Can anyone help with this? GParted does not allow me to expand the system pool/drive.

Thanks.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,654
Don't bother. Create a new virtual disk in ESXi, add that drive to the VM, reboot the VM and add the new vdev to the root pool. I'm not sure you can create/install a raid0 root pool, but I think it will run on it. Give it a try...
 

Latent

n00b
Joined
Jan 16, 2012
Messages
58
Maybe try adding a second 16GB virtual hard drive in VMware and then, in the napp-it menu disks->add, add this to the rpool, which will make it a mirror with 2 disks. Wait for it to rebuild and then remove the 12GB drive from the mirror (and from VMware later). You should then have a 16GB single-disk pool. I haven't tried this myself; this is just how I think it should work.

Note that you can do it danswartz's way as well, which is to add a second disk as a new vdev to the rpool, done under the menu pools->add instead. This method just adds the extra storage to the pool, but you have to leave both disks in your VMware config forever.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,654
Actually, I like latent's idea better. I forgot you can resize bigger by doing the mirror/resize dipsy-doodle :)
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,654
Depends on what you mean. If you mean two 2-disk mirrors striped together, yes. OTOH, zfs allows you to mirror 4 disks (incredibly redundant but wasteful.)
 

Stanza33

Gawd
Joined
Mar 31, 2010
Messages
538
Does 4 disk mirror = raid10 ?

No, Raid 10 = Raid 1 + Raid 0, hence the "10", i.e. 1+0 together.

A 4-disk mirror is still Raid 1, i.e. mirrored, but instead of being mirrored between 2 drives it is mirrored across all 4 drives, or mirrored 3 times if you like ;)
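
The trade-off between the two layouts can be sketched with a little arithmetic. This assumes four identical 2TB disks (an example size, not anyone's actual hardware) and the helper name is made up for illustration.

```python
def striped_mirrors(disk_tb, disks_per_mirror, mirrors):
    """Usable capacity and guaranteed fault tolerance of a stripe of mirrors.

    'Guaranteed' means the number of disk failures survived no matter
    which disks fail; a lucky failure pattern can survive more.
    """
    usable_tb = disk_tb * mirrors
    guaranteed_failures = disks_per_mirror - 1
    return usable_tb, guaranteed_failures


# One 4-way mirror: plain Raid 1 with three extra copies.
four_way = striped_mirrors(disk_tb=2, disks_per_mirror=4, mirrors=1)

# Two 2-way mirrors striped together: the Raid 10 layout.
raid10 = striped_mirrors(disk_tb=2, disks_per_mirror=2, mirrors=2)

print("4-way mirror:", four_way)
print("2x2 mirrors: ", raid10)
```

So the 4-way mirror gives 2TB usable and survives any three failures, while the Raid 10 layout gives 4TB usable but is only guaranteed to survive one.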
 

bleomycin

Limp Gawd
Joined
Aug 14, 2010
Messages
238
Does anyone know the two commands I need to run to map a samba user/group to root in a workgroup environment? I connect to a samba share that is also an nfs share (to a debian client) and of course permissions are getting all screwed up, but I'm content with mapping the samba users to root if that is the easiest solution. Right now any file that is created or modified by a samba user is completely broken for nfs. I've searched and tried different idmap combos without any luck. Thanks!
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
3,958
In workgroup mode, SMB users are local Unix users.
You cannot map a Unix user to a Unix-user
You can only map local SMB groups or AD users/groups

what you can do:
You can assign an NFS client to root in NFS share settings

or
You can set the ACL of your share to 777 and everyone@=modify with inheritance=on
to access all newly created files from SMB and NFS

For already created files, you must reset ACL or permissions recursively
(via the napp-it ACL extension, CLI or remotely from Windows when connected as root)
 

sjalloq

n00b
Joined
Jun 20, 2011
Messages
54
Just getting back to testing my drives after noticing SMART errors. Does anyone have any suggestions for tools to use to test HDD's? I'm using Samsung's ES-Tool to perform a low level format but it isn't finding any problems.
 

hominidae

n00b
Joined
Sep 11, 2011
Messages
51
I really like the concept of pre-clearing drives that they use over at the unRAID forum, and I use it for stress-testing new or moved-around drives as well.
Here's the link: http://lime-technology.com/forum/index.php?topic=2817
Not sure if that script will work from OI/Solaris... For using it with LSI2008-based HBAs, you'll need one of the newer unRAID beta versions, I think.
 

bleomycin

Limp Gawd
Joined
Aug 14, 2010
Messages
238
Thank you for the help. I've read this once before and unfortunately I just don't understand it. I've tried setting permissions recursively with a windows client to full permissions for everyone, but the setting doesn't seem to be persistent. I just now found the acl settings in napp-it but it appears I need to register/pay for that functionality. Is there a good guide on managing these types of acl properties from the CLI in solaris? All I'm trying to do is set it so that all existing and any new files on the nfs/samba share are available to every user via nfs and samba. I've logged into solaris and set permissions on everything with chmod -R 777, but that doesn't stick for newly created files.
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
you can use DBAN with the drives mounted in another 'pre-field' system. phoronix test suite has disk benchmarks that can be used inside solaris. good old fashioned dd if=/dev/zero works too.
 

Obscurax

n00b
Joined
Mar 19, 2011
Messages
20
2 days ago my pool got flagged degraded; one disk was faulted.
Today another is faulted, so my RAIDZ2 pool is really vulnerable now and I'm scared of data loss.
The disks are Hitachi 5K3000s and were bought in 5/2011. Thanks to napp-it for warning me about the faulted disks.
What should I do until the new drives arrive? Shut down the NAS?

Code:
pool overview:

  pool: tank
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0 in 4h19m with 0 errors on Sun Feb 26 07:20:20 2012
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      raidz2-0  DEGRADED     0     0     0
        c1t0d0  FAULTED      1   597     0  too many errors
        c1t1d0  ONLINE       0     0     0
        c1t2d0  FAULTED      0 1.65K     0  too many errors
        c1t3d0  ONLINE       0     0     0
        c1t4d0  ONLINE       0     0     0
        c1t5d0  ONLINE       0     0     0
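
For anyone who wants to pull the problem disks out of an overview like the one above with a script, here is a minimal sketch. It only parses text pasted from the config section; it does not call zpool itself, and the set of states treated as unhealthy is my assumption rather than anything napp-it does.

```python
# Config section copied from the pool overview above.
STATUS = """\
    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      raidz2-0  DEGRADED     0     0     0
        c1t0d0  FAULTED      1   597     0  too many errors
        c1t1d0  ONLINE       0     0     0
        c1t2d0  FAULTED      0 1.65K     0  too many errors
        c1t3d0  ONLINE       0     0     0
        c1t4d0  ONLINE       0     0     0
        c1t5d0  ONLINE       0     0     0
"""

# Device states treated as "needs attention" (an assumption for this sketch).
UNHEALTHY = {"FAULTED", "UNAVAIL", "REMOVED", "OFFLINE"}


def unhealthy_devices(status_text):
    """Return (name, state, read, write, cksum) for each unhealthy device.

    Counters stay as strings because zpool abbreviates them (e.g. 1.65K).
    """
    found = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[1] in UNHEALTHY:
            found.append(tuple(fields[:5]))
    return found


for name, state, read, write, cksum in unhealthy_devices(STATUS):
    print(f"{name}: {state} (read={read}, write={write}, cksum={cksum})")
```

Here both faulted disks show write errors and no checksum errors, which is worth keeping in mind when checking SMART and the cabling.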
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
drives fail on power up all the time.

i would check the smart status. there is a counter ... i forget what it is right now because it is 3am but one of the smart variables deals with how prone the drive is to errors. iirc it starts at 100 and counts down. if you have any more drives below 30 i would worry. if you dont then leave it online and backup as much as you can.
 

Stanza33

Gawd
Joined
Mar 31, 2010
Messages
538
Don't panic, check the SMART status of the drives... it may just be faulty cables giving you the errors ;)
 

jim82

Weaksauce
Joined
Nov 1, 2011
Messages
77
Thanks for your ideas, but unfortunately this does not work. Napp-it won't let me mirror the rpool for some reason. Maybe because the drive is bigger than the original. Any other ideas?
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
3,958
ACL settings for SMB (it's SMB, not SAMBA) interact with Unix permissions.
If you set the ACL everyone@ with inheritance=on, you are OK for new files.

If you do a chmod 777, it will reset all already created files to 777 but will also
delete the ACL inheritance. What you can do is reset all permissions to 777
and set the ACL afterwards.

About napp-it's ACL extension: checking ACLs and setting trivial ACLs like everyone@
or local user ACLs is free. You can use it without buying a key.
 