OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

The caching algorithms behind the ZFS ARC are among the best there are. If you want to increase the hit rate, you can add more RAM or add an L2ARC SSD (not as fast as RAM).
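As a rough sketch (pool and device names here are placeholders), adding an L2ARC device to an existing pool looks like this:

Code:
    # add an SSD as L2ARC (read cache) to pool "tank"; c2t1d0 is a placeholder device name
    zpool add tank cache c2t1d0
    # verify that the cache device shows up
    zpool status tank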

Not sure if this is good or bad. I had some huge directories from a recovered disk, and I'm not sure how having that many files on the filesystem affects the cache (whether they should be moved to an area with caching disabled, since they are really just sitting there waiting for me to sort them).

Pretty much what I am seeing:

Code:
    time  hit%  dh%  ph%  mh%  arcsz
05:59:46    40   39  100   38   5.8G
05:59:56    38   37  100   38   5.8G
06:00:06    64   64   83   55   5.8G


System Memory:
         Physical RAM:  32759 MB
         Free Memory :  25227 MB
         LotsFree:      511 MB

ZFS Tunables (/etc/system):

ARC Size:
         Current Size:             5952 MB (arcsize)
         Target Size (Adaptive):   31479 MB (c)
         Min Size (Hard Limit):    3934 MB (zfs_arc_min)
         Max Size (Hard Limit):    31479 MB (zfs_arc_max)

ARC Size Breakdown:
         Most Recently Used Cache Size:          50%    15739 MB (p)
         Most Frequently Used Cache Size:        50%    15739 MB (c-p)

ARC Efficency:
         Cache Access Total:             450271
         Cache Hit Ratio:      76%       345448         [Defined State for buffer]
         Cache Miss Ratio:     23%       104823         [Undefined State for Buffer]
         REAL Hit Ratio:       63%       286969         [MRU/MFU Hits Only]

         Data Demand   Efficiency:    92%
         Data Prefetch Efficiency:    62%

        CACHE HITS BY CACHE LIST:
          Anon:                       16%        58479                  [ New Customer, First Cache Hit ]
          Most Recently Used:         26%        90886 (mru)            [ Return Customer ]
          Most Frequently Used:       56%        196083 (mfu)           [ Frequent Customer ]
          Most Recently Used Ghost:    0%        0 (mru_ghost)  [ Return Customer Evicted, Now Back ]
          Most Frequently Used Ghost:  0%        0 (mfu_ghost)  [ Frequent Customer Evicted, Now Back ]
        CACHE HITS BY DATA TYPE:
          Demand Data:                62%        214687
          Prefetch Data:              15%        54874
          Demand Metadata:            20%        72282
          Prefetch Metadata:           1%        3605
        CACHE MISSES BY DATA TYPE:
          Demand Data:                16%        17125
          Prefetch Data:              31%        32728
          Demand Metadata:            50%        53419
          Prefetch Metadata:           1%        1551
 
Not sure if this is good or bad. I had some huge directories from a recovered disk, and I'm not sure how having that many files on the filesystem affects the cache (whether they should be moved to an area with caching disabled, since they are really just sitting there waiting for me to sort them).

The ARC caches recently and frequently accessed data blocks, plus prefetched data.
If a data block has never been accessed or prefetched, it cannot be in the ARC, so the quality of the ARC increases over time.

Your hit rate is not very high. If this is the result after a few days of uptime, your data access pattern needs a larger cache (more RAM, or an L2ARC) to improve the hit rate.

http://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/
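If you just want to watch the raw hit/miss counters over time, they can also be read straight from the ARC kstats (a minimal sketch; the 10-second interval is arbitrary):

Code:
    # cumulative ARC hit/miss counters
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
    # the whole arcstats group, re-sampled every 10 seconds
    kstat -p zfs:0:arcstats 10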
 
I know this thread is for *Solaris-based distros, but for those of us using ZFSonLinux the changelog of the new version is pretty impressive. This is the major revision; it has since been updated to 0.6.5.2.

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.6.5

Interesting! I'm planning to buy an HP ProLiant MicroServer Gen8 this week and install zfsonlinux on Debian in combination with napp-it. This way I can run plex/transmission/sab/owncloud/Syncthing without having to set up VirtualBox and lose performance. Is the zfsonlinux project production ready?
 
I'd say so. I've had no problems using it with a similar use case. I use Docker containers.
 
So I did a little experiment, as I simply don't understand how a cache that is not growing in size can improve when it is given more RAM, given that I haven't saturated the RAM I already have.

Here is what I did.

1. I created an NFS mount and copied my existing data files over to it.
2. Added a new NIC to the VM and put it on a 10G storage network.
3. Fired up the DB-like service and pointed it at the NFS mount instead of the local VMware drive.
4. Started monitoring cache hits at 1-second intervals via arcstat (see the sketch below).
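For reference, a rough sketch of steps 1 and 4 (pool/filesystem names, addresses and mount points are placeholders):

Code:
    # step 1: export a filesystem over NFS from the storage VM
    zfs set sharenfs=on tank/data
    # on the Linux client VM, mount it over the 10G storage network
    mount -t nfs -o vers=3 192.168.10.5:/tank/data /mnt/data
    # step 4: watch ARC hit statistics once per second
    arcstat.pl 1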

Cache hit ratio went up significantly. Obviously something is going on here: the cache is getting hit when the data is accessed over its own NFS mount, but not when it sits inside the VMware virtual disk.

Any ideas why this might be?
 
Hi there,

I'm using OmniOS on an AMD Kabini platform (16GB) for my home setup, serving FC to ESXi.

I have napp-it running with 10 disks on an LSI 9201-16i (firmware P19). The box boots from 2 separate SSDs on the same controller.

About once a month, the whole machine locks up and only responds to ping.

FC stops, SSH stops. Even logging in locally no longer works.

I cannot find a reason (the logs don't show anything suspicious; only the time-of-day chip reports 28 Dec 1986 when it boots up).

I have just turned off every power-saving mode I could think of.

Maybe I should move the boot disks to the onboard SATA ports and reinstall the system, in case this is caused by a driver/firmware issue (and that is why no logging remains?).


Is there anything else I could check?


Thanks in advance.


Martijn
 
Is there any general consensus or approach for how one should choose a recordsize - does it even matter?
 
Hmm. How do you all monitor HDD temps in your boxes? I'm using napp-it and check the SMART values... but I want to graph them, via Observium if possible.
 
A few pages back there was a Python script, I think written by a member. I could be wrong, but I have seen it done.
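If you only need the raw numbers to feed a grapher such as Observium, something along these lines should work (the device path, and whether you need a -d option, depend on your controller; this is only a sketch, not the script mentioned above):

Code:
    # pull the temperature attribute out of a disk's SMART data
    smartctl -A /dev/rdsk/c1t0d0 | grep -i temperature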
 
Hey Gea,

Considering going with the replication extension. Is it possible to run a pre/post script before/after the replication job does what it has to do, e.g. sending the data?
 
Hey Gea,

Is it possible to run a pre/post script before/after the replication job does what it has to do, e.g. sending the data?

You can either create an 'other' job that you run before/after a replication, or
modify the replication script /var/web-gui/data/napp-it/zfsos/_lib/scripts/job-replicate.pl

pre action: at the beginning of the script
post action: at the end of sub my_postreplication
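As an illustration of the first option, the 'other' job could simply call a small shell script; everything below (service name, paths) is a placeholder, not a napp-it default:

Code:
    #!/bin/sh
    # hypothetical pre-replication script: quiesce the application and note the time
    svcadm disable -t site/mydb
    echo "pre-replication ran at $(date)" >> /var/tmp/replication-pre.log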
 
For snaps, if the hold field is not specified, what is the default? Also, are # of snaps and hold mutually exclusive? What happens if there are more snaps than # of snaps over the hold period?
 
Thinking of going the AIO route again. I moved away for a bit: I had 2 Xeon servers with 32GB running ESXi, with storage served by ESOS (a Linux distro on a stick, which now supports ZFS) via iSCSI. I just scored a Dell PowerEdge R905 with 128GB RAM and 4 quad-core 2.4GHz Opterons for just over $300. With that much RAM, I'm thinking of throwing as much as 96GB at an OmniOS/Solaris VSA and serving that back via NFS. I'd love to upgrade the existing host, but it's a Sandy Bridge mobo limited to 32GB :( So one of the ESXi boxes would be decommissioned, as would the current storage server. The other ESXi box would do what it does now (run a backup VSA, as well as being the 'other host' in a vSphere HA cluster). Comments?
 
For snaps, if the hold field is not specified, what is the default? Also, are # of snaps and hold mutually exclusive? What happens if there are more snaps than # of snaps over the hold period?

The default for napp-it replication snaps is that the last two snaps are kept.
The important value is hold, as it specifies the number of days.
All replication snaps older than that, apart from the last two, are deleted.

The keep value can be used to keep special snaps forever (you must delete them manually when they are no longer needed), e.g. "100" will keep every snap whose number ends in 100 (_100, _1100, _2100 etc.).
 
The default for napp-it replication snaps is that the last two snaps are kept.
The important value is hold, as it specifies the number of days.
All replication snaps older than that, apart from the last two, are deleted.

The keep value can be used to keep special snaps forever (you must delete them manually when they are no longer needed), e.g. "100" will keep every snap whose number ends in 100 (_100, _1100, _2100 etc.).

Sorry Gea, but I'm a little confused. I'm trying to understand in which cases snaps would get dropped.

If the snap period is daily and snaps is blank but hold is 14, then what?
If the snap period is daily and snaps is 12 but hold is 14, then what?
If the snap period is daily and snaps is 12 but hold is blank, then what?

*I chose 12 for snaps because there is no value of 14.
 
Sorry Gea, but I'm a little confused. I'm trying to understand in which cases snaps would get dropped.

If the snap period is daily and snaps is blank but hold is 14, then what?
If the snap period is daily and snaps is 12 but hold is 14, then what?
If the snap period is daily and snaps is 12 but hold is blank, then what?

*I chose 12 for snaps because there is no value of 14.

OK, we are talking about different jobs:
I was talking about replication and you about autosnap.

With autosnap you can set both keep (number of snaps) and hold (number of days).
A snap is deleted only when both settings allow a delete (like an AND relation).

Examples:
If the snap period is daily and snaps is blank but hold is 14
= delete all snaps older than 14 days

If the snap period is daily and snaps is 12 but hold is 14
= keep a minimum of 12 and delete all snaps older than 14 days = delete all snaps older than 14 days

If the snap period is daily and snaps is 12 but hold is blank
= keep 12 snaps
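To illustrate the AND relation in shell terms (this is not napp-it's actual job code, only a sketch; the filesystem name and the keep/hold values are placeholders):

Code:
    #!/bin/sh
    # delete a snap only if it is outside the newest KEEP snaps AND older than HOLD days
    FS=tank/data
    KEEP=12
    HOLD=14
    NOW=$(date +%s)
    zfs list -H -t snapshot -o name -S creation -r "$FS" | sed -n "$((KEEP + 1)),\$p" |
    while read -r SNAP; do
        CREATED=$(zfs get -H -p -o value creation "$SNAP")
        AGE_DAYS=$(( (NOW - CREATED) / 86400 ))
        [ "$AGE_DAYS" -gt "$HOLD" ] && echo "would delete: $SNAP"   # replace echo with zfs destroy for real
    done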
 
That makes perfect sense, thank you. The replication job requires a key, doesn't it, in order to properly do zfs send/receive?
 
General question.

How safe is it to send from, say, an OmniOS ZFS volume to a Linux or FreeNAS ZFS volume?

Can this be done safely for offsite backup, or should the source and target be running the same version (e.g. OmniOS ZFS -> OmniOS ZFS)? I'm asking because GCE seems to be cheaper than EC2, but OmniOS doesn't look to be supported on the GCE virtualization platform (no virtio-scsi driver). Also, if one doesn't care about performance beyond the host being able to receive data, what do you think the minimum RAM requirements for the OS are?
 
General question.

How safe is it to send from, say, an OmniOS ZFS volume to a Linux or FreeNAS ZFS volume?

Can this be done safely for offsite backup, or should the source and target be running the same version (e.g. OmniOS ZFS -> OmniOS ZFS)? I'm asking because GCE seems to be cheaper than EC2, but OmniOS doesn't look to be supported on the GCE virtualization platform (no virtio-scsi driver). Also, if one doesn't care about performance beyond the host being able to receive data, what do you think the minimum RAM requirements for the OS are?

You should be able to send/receive filesystems between OpenZFS platforms (but not from/to Oracle Solaris). When connecting storage to the internet, only do so behind a firewall or over a VPN.

About RAM:
Oracle claims a minimum of 2 GB for Solaris 11.2, independent of pool size. I would use that minimum for OmniOS as well. Additional RAM is used as read cache for better performance. BSD or Linux may differ.

http://www.oracle.com/technetwork/s...ocumentation/solaris11-2-sys-reqs-2191085.pdf
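A minimal sketch of such an offsite send/receive (hostnames, pool and snapshot names are placeholders; as noted above, run it over a VPN rather than the open internet):

Code:
    # full initial copy to the remote OpenZFS box
    zfs snapshot tank/data@offsite-1
    zfs send tank/data@offsite-1 | ssh backuphost zfs receive -F backup/data
    # later runs only send the changes since the previous snapshot
    zfs snapshot tank/data@offsite-2
    zfs send -i tank/data@offsite-1 tank/data@offsite-2 | ssh backuphost zfs receive backup/data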
 
How many people out there are using Comstar? I ask because, after looking at the code, it seems like there have been a huge number of bug fixes and additions in Nexenta, whereas Illumos hasn't been touched much. You have to wonder if anyone will ever backport these changes from Nexenta into Illumos. It seems pretty clear why VAAI hasn't been applied, even though Nexenta has it. I don't think it would be easy to do.
 
Nexenta is very active regarding integration efforts with ESXi, like VAAI.
Sadly, none of the other major players behind Illumos care about that.

Overall, the VAAI item is similar to SMB2+. That was developed by Nexenta as well,
but it has not yet found its way into Illumos, although there were efforts and announcements
at last year's Illumos Day.

At the next Developer Summit (October 19) there are some nice presentations,
but nothing around VAAI (or SMB2).

You should ask on illumos-discuss, where all the developers are around.
Maybe you will get an answer there about the state of upstreaming such features.
http://open-zfs.org/wiki/OpenZFS_Developer_Summit_2015
 
Not sure if I should post here, but with the latest AIO appliance (napp-it_15d_ova_for_ESXi_5.5u2-6.0, Sep 18, 2015) I managed to get my IBM M1015 detected.

However, no disks were detected (tried Disks -> initialise and parted -l, but to no avail).
The controller was cross-flashed to LSI 9211-8i IT mode (P19) and is passed through to the OmniOS VM.

Any tips?


EDIT: found out it was a bad cable (SAS 8087 -> 4xSATA). Sorry for hijacking this thread.
 
For OmniOS, is the default compression LZ4? Is there any way to find out what kind of compression I'm using? zfs get just shows compression as on.
 
I would set LZ4 manually (afaik, it is not the current default for 'on').

You need pool v5000, and lz4 must be enabled as a feature:
"zpool set feature@lz4_compress=enabled <poolname>"
"zfs set compression=lz4 <filesystem>"
 
It's also multithreaded, one of the few compression algorithms that are, from what I remember. The ZFS implementation should be the threaded one, I think.

I set LZ4 on all my pools from the start.
 
If I already had things set with regular compression and want lz4 applied to all data, is the only way to do this to zfs send/receive to a new datastore and send/receive back with the lz4 option on?

Also, can a v28 pool be imported into Solaris?
 
I've been waiting almost 10 years for a small and good ZFS NAS. Has the hardware caught up yet? Are the new low-power Intel chips fast enough for a business to sell an affordable and compact 2-drive ZFS NAS?
 
If I already had things set with regular compression and want lz4 applied to all data, is the only way to do this to zfs send/receive to a new datastore and send/receive back with the lz4 option on?

Also, can a v28 pool be imported into Solaris?

Compression is only applied during a write, and a send/receive stream is always uncompressed, so the data is recompressed as it is written on the receiving side.
You need to set LZ4 on the parent filesystem so that the received filesystem inherits LZ4.

Pool v28 / ZFS v5 is interchangeable between Solaris and OpenZFS
(but does not support LZ4).
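A rough sketch of that rewrite (dataset and snapshot names are placeholders):

Code:
    # set lz4 on the parent so the new copy inherits it
    zfs set compression=lz4 tank
    # rewrite the data with a local send/receive; blocks are recompressed on write
    zfs snapshot tank/olddata@move
    zfs send tank/olddata@move | zfs receive tank/olddata_lz4
    # once verified, drop the old copy and rename the new one into place
    zfs destroy -r tank/olddata
    zfs rename tank/olddata_lz4 tank/olddata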
 
I've been waiting almost 10 years for a small and good ZFS NAS. Has the hardware caught up yet? Are the new low-power Intel chips fast enough for a business to sell an affordable and compact 2-drive ZFS NAS?

Any current Intel CPU is fast enough for a smaller ZFS NAS (at least with a 1Gb/s network).
2GB is the minimum RAM. For business use, you should use at least 4-8GB of ECC RAM
for better performance (it is used as read cache), and prefer an Intel NIC.
 
I've been waiting almost 10 years for a small and good ZFS NAS. Has the hardware caught up yet? Are the new low-power Intel chips fast enough for a business to sell an affordable and compact 2-drive ZFS NAS?

I use a Xeon D-1540. I believe the D-1541 comes out next month. Runs like a champ. A D-1520 might be a good option as well.
 