OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Thanks as always for your help Gea!

Will a RAID 10 of four Hitachi 5K3000 2 TB drives be faster than a single 10k Raptor?

Also, should I disable sync writes on my datastore pool ONLY or should I also disable it on my RAID-Z2 archive/media pool?
 
I can get that far ;)

I mean, if I come up with an ACL scheme from scratch (for a group projects share, for example) I will probably do some dumb stuff, or at least re-invent the wheel. I don't know why I can't find any Google results that say "set your permissions like this". And it seems like I need to do some things in a certain order to get everything propagated the right way.

If you only use allow rules, you do not need to care about ACL order.
In that case, Solaris ACLs behave quite like Windows ACLs.
Mostly you are fine with simple ACL rules, for example:

Set a rule for everyone (if you need it) and add one or more rules for users or groups, like
everyone@=read set
group sales=modify set
group admins=full set
if everyone should read, the group sales is allowed to modify and the group admins can modify rules.

If you have private folders where only the owner/creator has access, you may set something like
owner@=full set
In that case only the creator (and root) has access and the owner is able to modify rules.

...lots of other possibilities to play with everyone, owner, groups and users.
Just try to avoid deny rules if not needed. They complicate things a lot.
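For reference, roughly the same rules can be set from the console with the Solaris chmod ACL syntax (a minimal sketch, not napp-it-specific; the dataset /tank/projects and the groups sales and admins are placeholder names):

/usr/bin/chmod A=everyone@:read_set:file_inherit/dir_inherit:allow /tank/projects
/usr/bin/chmod A+group:sales:modify_set:file_inherit/dir_inherit:allow /tank/projects
/usr/bin/chmod A+group:admins:full_set:file_inherit/dir_inherit:allow /tank/projects
/usr/bin/ls -Vd /tank/projects (shows the resulting ACL)

Since all entries are allow rules, the order they end up in does not matter, as noted above.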
 

If you look at sequential performance, the Raptors are quite bad. Due to the much higher data density, the transfer rate of even "slow" green 2 TB disks is better (about double the transfer rate).

But for databases and NFS/ESXi storage, I/O and disk seek time are more important. Still, I do not expect a single Raptor to be really better than your RAID-10.

Sync writes (enable/disable) are only relevant for applications that use sync writes for higher transactional data security, like databases and ESXi over NFS. On your media pool (usual SMB access) it does not matter at all.
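As a side note, the sync setting is a per-dataset ZFS property, so you can switch it off on the datastore dataset only and leave the media pool at its default (a small sketch; tank/datastore and media/archive are placeholder dataset names):

zfs set sync=disabled tank/datastore
zfs get sync tank/datastore media/archive (the archive dataset stays at sync=standard)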
 
I set this up with 2 x 10-drive RAIDZ2 (RAID6) arrays. I powered off the VM to replace 2 drives and then booted it up, and napp-it isn't starting... how do I see which services are running and restart the http service?
 
I'm having trouble deleting files from SMB shares (from Windows) that were downloaded on my Solaris machine via SABnzbd (a Python Usenet downloader). I have all the ACLs set to full_set for everyone on the SMB share.

note: the folder is owned by the sabnzbd user on the Solaris machine; however, the permissions on the folder are 777 and I can delete it via the command line (as my user) or over SFTP via WinSCP.

When I try to delete the folder I get this error: you require permission from S-1-5-21-3711944786-4061020791-2851930098-1103.

Is there any way to fix this? Or is this just an incompatibility between Unix permissions and ACLs?

Maybe the only way is to use pylibacl to set the ACL in the Python code upon download.
 

To see which services are running and to restart napp-it:

1. show all processes via console:
ps axw

2. restart napp-it via console:
sudo /etc/init.d/napp-it restart
(allowed options: stop/start/restart)
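As an addition (standard illumos tools, not napp-it-specific), SMF can also tell you which system services are running or have failed:

svcs -a (list all services and their state)
svcs -xv (explain services that are not running correctly)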
 

To change the ACLs of already created files you must either connect via SMB as root, or log in as root and set the desired ACL recursively. Folder permissions are only inherited if you set them prior to creating the files.

There is no incompatibility between ACLs and Unix permissions; they depend on each other.
If you change Unix permissions, the ACL changes accordingly, and the same happens when you modify the ACL.
Most problems occur when Unix programs that are not ACL-aware change permissions, with the result that ACLs get modified or deleted (ACLs are much more universal than traditional Unix permissions).

Solaris ACLs are very similar to Windows ACLs. The main difference is that Solaris respects the order of ACL entries, while Windows processes all deny rules first and then all allow rules. So try to avoid deny rules, otherwise you must set them from Solaris.
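For the "set the desired ACL recursively" route, a minimal console sketch as root (the path /tank/downloads is a placeholder for the SABnzbd target folder; this resets everything to the everyone@=full_set scheme mentioned above):

/usr/bin/chmod -R A=everyone@:full_set:allow /tank/downloads (reset ACLs on existing files and folders)
/usr/bin/chmod A=everyone@:full_set:file_inherit/dir_inherit:allow /tank/downloads (so newly created files inherit it)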
 

Thanks very much for your response Gea. I was hoping for a way to avoid having to set the ACLs as root all the time. I made a shell script that sets the ACL with Solaris chmod on each newly downloaded file. It would be great if you could somehow set a default ACL on a folder for all newly created files/subfolders, something like the umask. Anyway, my hack works well enough!
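For anyone else hitting this, a sketch of what such a hack can look like as a SABnzbd post-processing script (it assumes SABnzbd hands the completed job directory to the script as the first argument, and uses the ACL-aware /usr/bin/chmod; adjust to your own ACL scheme):

#!/bin/sh
# hypothetical SABnzbd post-processing script: open up the ACL on the finished download
JOBDIR="$1"
/usr/bin/chmod -R A=everyone@:full_set:allow "$JOBDIR"
exit 0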
 
I've just finished setting up my "All-In-One" ESXi + ZFS box. Basically, I have ESXi with a virtualized ZFS server. I'm sharing an ESXi datastore using iSCSI through ZFS.

However, when I reboot ESXi, it takes about 10 minutes and stops at loading "iscsi_vmk".

After it finally boots, it doesn't automatically remount my iSCSI datastore, even after my ZFS server is booted.

What could I be doing wrong?
 

Use NFS for shared storage with an all-in-one!
ESXi does not automount iSCSI storage that is not available at ESXi boot time but only comes up with a delay after OI boots.
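A minimal console sketch of the NFS variant (dataset name and subnet are placeholders; on an all-in-one you may also want sync disabled, as discussed above):

zfs create tank/nfs_datastore
zfs set sync=disabled tank/nfs_datastore
zfs set sharenfs=rw=@192.168.1.0/24,root=@192.168.1.0/24 tank/nfs_datastore

Then add it in ESXi as an NFS datastore pointing at the OI server's IP and the path /tank/nfs_datastore.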
 
Thanks for the response as always.

Is there a performance difference between iSCSI and NFS?

I could manually mount the iSCSI datastore every boot if there is a big (any?) difference in performance...

EDIT: Also, should I have one ZFS folder that ALL VMs are stored in, or should I create separate ZFS folders for each VM?
 
Anyone had any experience with the Intel RS2WC080 on Solaris/OpenIndiana?

Struggling to find info on this LSI 2008-based controller and whether it does pass-through or is RAID only.

Cheers
Paul
 
There is a large and significant performance difference between iSCSI and NFS. NFS + SLOG isn't entirely awful, but iSCSI is still going to be faster.

It depends on your setup, though. If you only have a single NIC you can easily saturate it even with NFS. If you have multiple NICs, performance sways heavily in favor of iSCSI due to multipathing.

I haven't had a chance yet to play with 10 GbE, but I've read quite a few blog/forum posts saying NFS still sucks on 10 GbE compared to iSCSI.
 

In general, iSCSI and NFS performance is quite similar.
The main problem with NFS is the sync write behaviour of ESXi, done for security reasons.
If this is a problem, you may add an SSD write cache (SLOG) or just disable sync writes on an all-in-one, where in case of a crash the SAN and ESXi are usually affected together.

With iSCSI you should use one LUN per VM so you can restore them independently from ZFS snaps.
With NFS you can share your ZFS folder also via SMB. One share for all VMs is then OK because you can easily restore/clone/copy/backup e.g. from a Windows client.

Due to the much better handling, I would use NFS.
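For completeness, a hedged sketch of the "one LUN per VM" iSCSI route with COMSTAR (zvol name and size are placeholders, and it assumes the COMSTAR/iSCSI target services are already enabled):

zfs create -V 40G tank/vm-win2008 (one zvol per VM)
stmfadm create-lu /dev/zvol/rdsk/tank/vm-win2008
stmfadm add-view <the LU GUID reported by create-lu>
itadm create-target (only once, if no iSCSI target exists yet)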
 
My new ZFS NAS:

ML110 G6, 12 GB RAM

5x 1 TB drives - RAIDZ1

2x 250 GB drives - mirror (OS/rpool)

I have two spare 16 GB SSDs. Would I see any benefit using these for caching, or would I be better off using them for the OS partition instead of the 250 GB drives?
 
You could try adding the 16 GB SSDs as cache devices, because you can remove them from the zpool again. I think it is easy to add and remove SSDs as cache. Check this before trying.
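The add/remove check looks roughly like this (the device names are placeholders for the two SSDs):

zpool add tank cache c4t0d0 c4t1d0 (add both SSDs as L2ARC)
zpool iostat -v tank (the SSDs show up under "cache")
zpool remove tank c4t0d0 c4t1d0 (cache devices can be removed again at any time)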
 
You should check your cache hit rate first to see if there is any value. I suspect there will not be with 12 GB RAM unless you have very heavy small random read usage. Also check the read rate and IOPS, as they may be worse than the RAIDZ array if they are old SSDs.

As for use as an OS partition: nope, no value here... unless you just want to see it boot faster... and you'd have way less space for snapshots of the rpool... so that's a big negative. You should leave the 250 GB mirror as is, IMHO.

It's likely they are of no value to the server.
 
If they (the 16 GB SSDs) have decent write speed (the big gotcha in a drive that small), then a mirrored ZIL would be a reasonable use for them (assuming you have synchronous writes).
 
It's not just the hit rate; if you have a lot of ghosts in your ARC you could benefit from SSD L2ARC to grow the ARC and minimize the repeat evictions.

As for ZIL: if you are doing the all-in-one, it makes no sense to have sync enabled on performance-intensive NFS folders, IMO.
 
True about the L2ARC, but only the person with that particular system can tell if it will help. I found a program, arc_summary.pl, that prints out all this useful info. Ghosts are never an issue on my all-in-one. Agree 100% on your point about sync=disabled vs. ZIL for the all-in-one.
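If you don't want to grab arc_summary.pl, the same raw counters are available from the kernel statistics on illumos/Solaris (the egrep pattern just picks out the hit/miss and ghost-list counters):

kstat -p zfs:0:arcstats | egrep 'hits|misses|ghost'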
 

Good point - I hadn't really considered the pointlessness of sync writes for VMs hosted on the same machine. (Assuming that the ZFS host is only exposing shares to locally hosted VMs - if it has any "off-machine" shares then there might be some utility.)
 
Yeah, arc_summary.pl is how I did it too. Ben Rockwood (he wrote it) has a good YouTube series about ZFS tips.

You can make a folder dedicated to datastores and disable sync on it, and leave sync enabled on the other ZFS folders shared to hosts outside the ESX server.
 
So I have an M1015 with 8 drives attached, but I want 2 extra drives for a 10-drive RAIDZ2.

As a result I decided to use the 2x SATA3 ports on my Z68 motherboard. I looked at the passthrough list and I saw a 4-port and a 2-port Cougar Point controller. I figured the 4-port meant the SATA2 ports and the 2-port meant the SATA3 ports. I plugged my 2 drives into two of the 4 SATA2 ports and my boot drive into one of the SATA3 ports, and passed the 4-port controller through.

Apparently things aren't working. I have a Z68 Extreme4 Gen3 by ASRock. Anyway, my boot drive disappeared. I'm gonna play around more. Maybe I have to plug it into that Marvell controller? (Yuck.)

Anyway, my main issue is that I cannot REMOVE the Intel SATA controller from passthrough. I uncheck it and ESXi tells me that I have to reboot for changes to stick. I reboot, but it's still shown as passed through. WTF.

[screenshot: 3sJE5.png]


Edit: Second issue, is it even possible to pass through the SATA3 and SATA2 ports separately? That's how it looks on the device list.... but I thought it's just 1 controller anyway...
 
In case anybody is interested, I ran some benchmarks to compare NFS and iSCSI (hosted on ZFS)

Notes:
* ZFS server hosted under ESXi, running from a dedicated local SATA drive.
* NFS/iSCSI datastore contained 4x Hitachi 5k3000 2TB drives in a RAID10. (ZFS)
* NFS had sync-write disabled.
* Benchmark was run under Windows Server 2008 R2 x64. Fresh install with no patches except for VMware Tools installed.
* Benchmark software was CrystalDiskMark.
* Benchmark was run three times in a row (which is why you see three screenshots for each test)

Here are the results:
[screenshot: nfsiscolo.png - benchmark results]


It looks like the writes were very similar but the reads had a fairly significant difference. Sequential has the biggest difference (almost 2x) but the other tests also showed quite a bit of difference.

I also have benchmarks from my WD 150 GB VelociRaptor if anybody is interested.

However, these results make me think that iSCSI is quite a bit better than NFS! :(
 
What does 'hosted on ZFS' mean? The datastores were hosted on NFS and/or iSCSI? Was this using passthrough, or something with the SAN on the ESXi box (e.g. an all-in-one)? If not, how did you get more than 2 Gb/sec read throughput over the net? I'm guessing random read performance (more common with ESXi guests than long sustained reads/writes) would be more equal. Also, what I hate about iSCSI is twofold: ESXi is not as tolerant of the SAN being offline when it is starting VMs, and the iSCSI datastore is a big opaque object, so you can't see/manipulate individual objects in the datastore on the SAN itself. Just my preference...
 
You are correct. 'Hosted on ZFS' means that my datastores were stored on ZFS and shared via iSCSI and NFS (i.e. I created a Windows 2008 guest on the ZFS/NFS datastore, ran some benchmarks, then moved it to the ZFS/iSCSI datastore and ran more benchmarks).

It is an "All-In-One" box using 2x LSI 9211-8i cards with VT-d passthrough.
 
Interesting. You might want to post this on the virtualization board and see if anyone there has a thought as to why iSCSI reads are so much faster than NFS. Lopoetve over there is an actual VMware guy and is quite knowledgeable.
 
Keep in mind that iSCSI VMs will not come up after a reboot (of the host). ESX will see the iSCSI HBA as not available until after you do a rescan. So after a reboot you'd have to open the Windows-based VMware management console, rescan the HBAs, and then your VMs would re-appear. Not good if you want things to come up on their own, or if you don't have any Windows machines (aside from your VMs).
 

iSCSI reads are faster because NFS adds an additional layer of overhead. Think of iSCSI as just a local SATA bus with a block device on the other end, whereas with NFS there is another OS sitting between the block device and your requests.
 

The rescan issue may be true in an all-in-one situation, but it is NOT true for a normal setup where the SAN is a separate device.

I'm fairly certain you can solve that situation too (for an all-in-one) with restart ordering on the ESXi host.
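If you do end up rescanning by hand, it can also be done from the ESXi shell instead of the Windows client (a hedged sketch; vmhba33 is a placeholder for the software iSCSI adapter, and the exact command depends on the ESXi version):

esxcfg-rescan vmhba33 (ESX/ESXi 4.x style)
esxcli storage core adapter rescan --all (ESXi 5.x style)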
 
Quick question on Time Machine and AFP :)


So I've installed AFP and it's all working:

wget -O - www.napp-it.org/afp22p6 | perl

Then to make it work I added the following to afpd.conf:

-noddp -transall -uamlist uams_randnum.so,uams_dhx.so,uams_dhx2.so -nosavepassword -advertise_ssh -udp

Now I'm trying to set up Time Machine on Lion to use this, but I keep getting the error

"the network backup disk does not support the required afp features"

Anyone know how to solve this?

Paul

edit -> solved - the line was missing a leading "-":

- -noddp -transall -uamlist uams_randnum.so,uams_dhx.so,uams_dhx2.so -nosavepassword -advertise_ssh -udp
 
Thank you. Will I still need to add the line to afpd.conf, or is this automatic?

Thank you :)

The AFP installer does not modify the default netatalk afpd.conf.
You can modify it if you need to, but netatalk 2.2 should basically work with the defaults:
- -tcp -noddp -uamlist uams_dhx.so,uams_dhx2.so -nosavepassword
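As a hedged side note for the Time Machine case: with netatalk 2.2 the backup volume is usually flagged for Time Machine in AppleVolumes.default (path, volume name and user are placeholders):

/tank/timemachine "TimeMachine" options:tm allow:paul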
 
Thank you.

Would you know where to start looking to link this with a UPS?

Ta,
Paul
 