OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

With the current OmniOS stable (151020) in a VM, what's currently the best prosumer NVMe on the market to use as a pass-through Slog device?
 
You need power-loss protection for an NVMe used as an Slog.
I would check the Intel models (P750, P3700, etc.)

Alternative: use the NVMe as a pool for VMs directly (no Slog or L2ARC needed)
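The alternative Gea mentions would look roughly like this (a sketch; pool and device names are examples, not from this thread):

```shell
# Create a mirrored pool directly on two NVMe devices for VM storage.
# Sync writes then land on the fast pool itself, so no separate Slog is needed.
zpool create nvmevms mirror c1t1d0 c2t1d0

# Optionally force sync writes for VM data safety.
zfs set sync=always nvmevms
```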
 
Installed omnios-r151020-4151d05 over the weekend. Everything seems to be up and working, but I'm getting this message when I log in. I saw the notes about doing chmod 666 /dev/null as root, but that returns the same error in the console.
 

Attachment: napp-it cant change dev.null.png
I had the same problem suddenly.
The above fix worked for me.

Is napp-it working for you aside from that message on login?
/dev/null is only needed as a target for unwanted console messages.
 
Looking into providing some storage to my 2-node ESXi cluster, via either NFS/iSCSI.
I have a whitebox that i've been wondering on whether it would have the potential to do this ("home production" use.. )

SuperMicro X7DVL-3, with dual Intel L5410 cpus
24gb ram
motherboard has 8 SAS ports via a LSI 1068E (i have this flashed to I.T. mode already).
dual on-board NICs + an additional 4-port gigabit Intel (6 total)
All shoved in a Rosewill 4u case, that can hold 12 drives.

??
 
Do you have this box already?
It needs a lot of energy for less CPU power (compared to newer systems), it has only one PCIe slot, and the 1068E is limited to 3G SAS and disks up to 2TB.

If you can live with that, it should work.
 
Do you have this box already?
It needs a lot of energy for less CPU power (compared to newer systems), it has only one PCIe slot, and the 1068E is limited to 3G SAS and disks up to 2TB.

If you can live with that, it should work.

Yes, I do have this, as it is my current lab and media server :)
I have two ESXi hosts on the way and I'm on a tight budget, so I was hoping to convert it to storage use only. I'm not looking for top-of-the-line performance, and I can deal with the energy need. I'm thinking this box could still be adequate to hold me over on a tight budget.
 
I had the same problem suddenly.
The above fix worked for me.

Is napp-it working for you aside from that message on login?
/dev/null is only needed as a target for unwanted console messages.

Gea,

Yes, as far as I can tell everything otherwise works fine. I do not recall having this error immediately after install; it just crept in at some point a day or two later. The use of /dev/null will work because it's currently set to 666 (777 on the symlink, but that's always reported for symlinks, AFAIK). I see the line setting 666 on webpage load. Do you see any harm in commenting that out for now?

I've tried digging but can't find any fix for the error because I can't find any cause. I can't find any extended attributes on the file, and ownership is root:root, so I don't understand why there's a chown error. I tried touch /reconfigure and a reboot, but no difference.
 
In earlier napp-it releases I sent some error messages to /dev/null.
When I first saw this problem I removed that, so current napp-it will work despite a problem with /dev/null.

I then added the chmod command on login (hence the message).
This worked in my setups, but it is not needed for operation.
 
I fixed it, but I don't know exactly what did it.

It might have been creating /reconfigure and rebooting. I thought that hadn't worked before, but perhaps it did; if anybody else stumbles on this, I suppose that's worth a try. I know I did have to manually run `sudo devlinks` after rebooting and logging in as a normal user, because I saw an error about /dev/null and noticed it was a regular file. I want to say it was another reboot after that when the problem went away.

Unfortunately I had more than that going on at the time taking my attention so I only noticed the issue went away by accident.
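For anyone hitting the same thing, the recovery steps described above can be summed up as a sketch of illumos commands (run as root; this is a reconstruction from the posts, not an official fix):

```shell
# Check whether /dev/null is still a character device; if it has been
# replaced by a regular file, console redirects will fail.
ls -lL /dev/null          # should show crw-rw-rw- (a character device)

# If it is a regular file: remove it and rebuild the device links.
rm /dev/null
devfsadm                  # rebuild /devices and /dev links (devlinks also works)

# Force a reconfiguration boot, as tried above.
touch /reconfigure
reboot
```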
 
This is probably a crazy question, but how come there isn't much discussion of using Proxmox with napp-it for an all-in-one build?
I know ESXi has quite a bit of information posted for a napp-it all-in-one build (heck, I'm currently running one myself).
But I'm just kind of surprised, given the growing popularity of Proxmox and the (very slowly, especially in the actual business world) shrinking popularity of ESXi.
I'm not tied to ESXi and I'm kind of interested in seeing what Proxmox/Xen can offer, but at the same time the complete void of information is making me wonder if I'm missing something (like not even posts of how terrible Proxmox and napp-it might be together).
 
Dear Gea,

I hope all is well! I am still on R151014 using ESXi 5.5 and was thinking of updating to the latest ESXi 6.0.

Since there have been so many changes with OmniOS, do you think I should do a fresh VM of R151020, or should I just update R151014 following the instructions? I am aware of switching over to OpenSSH before updating, per the release notes. I'm just wondering if I am better off doing a fresh install/VM.

Thanks!
 
This is probably a crazy question, but how come there isn't much discussion of using Proxmox with napp-it for an all-in-one build?
I know ESXi has quite a bit of information posted for a napp-it all-in-one build (heck, I'm currently running one myself). But I'm just kind of surprised, given the growing popularity of Proxmox and the (very slowly, especially in the actual business world) shrinking popularity of ESXi.
I'm not tied to ESXi and I'm kind of interested in seeing what Proxmox/Xen can offer, but at the same time the complete void of information is making me wonder if I'm missing something (like not even posts of how terrible Proxmox and napp-it might be together).

The current hype is lightweight container-based virtualisation for Linux guests.
This is something that a type-1 hypervisor like ESXi or Xen does not offer; they offer full OS virtualisation (of any guest). In the case of ESXi, you additionally have the best pass-through and driver support for guest OSs, which makes it a perfect base for a storage VM.

If you use Proxmox (or the Illumos distribution SmartOS), you can use ZFS as the native filesystem, so you do not need a storage VM if you only want ZFS. This is different if you want a full-featured storage NAS/SAN appliance; in this case you need pass-through to assign the disk controller and disks to the storage VM.

There are reports (google "proxmox napp-it") from people who have done this, but I have no experience myself with pass-through on Proxmox. As an alternative, you can use ESXi as a base for full virtualisation of any guest, with a guest on top (it can even be OmniOS with LX containers, or any Linux) for container-based virtualisation. You only need around 2GB additionally for ESXi, with a very low impact on performance. The plus with ESXi is the very easy way to reinstall after a crash and to restore VMs by a simple copy on an NFS share.
 
Dear Gea,

I hope all is well! I am still on R151014 using ESXi 5.5 and was thinking of updating to the latest ESXi 6.0.

Since there have been so many changes with OmniOS, do you think I should do a fresh VM of R151020, or should I just update R151014 following the instructions? I am aware of switching over to OpenSSH before updating, per the release notes. I'm just wondering if I am better off doing a fresh install/VM.

Thanks!

You can either update (->151018, switch SSH, ->151020), or
you can use my OVA template with 151020.

You may also wait for the next 151022, which should be available in Q2.
 
I bought a couple of Samsung 960 PRO drives to use for a mirrored pool for vSphere guests. Sadly, this drive is TOO new and, being NVMe 1.2, is apparently not supported by either Illumos or Solaris 11.3 :(
 
After updating to 17.01 free, the menus are messed up in Chrome. It's OK in Edge.

Attachment: nappit.png


Does anybody know how to fix this?
TIA
 
Just reload the page; the new version has a modified CSS and you are using the old one from the browser cache.
 
The current hype is lightweight container-based virtualisation for Linux guests.
This is something that a type-1 hypervisor like ESXi or Xen does not offer; they offer full OS virtualisation (of any guest). In the case of ESXi, you additionally have the best pass-through and driver support for guest OSs, which makes it a perfect base for a storage VM.

If you use Proxmox (or the Illumos distribution SmartOS), you can use ZFS as the native filesystem, so you do not need a storage VM if you only want ZFS. This is different if you want a full-featured storage NAS/SAN appliance; in this case you need pass-through to assign the disk controller and disks to the storage VM.

There are reports (google "proxmox napp-it") from people who have done this, but I have no experience myself with pass-through on Proxmox. As an alternative, you can use ESXi as a base for full virtualisation of any guest, with a guest on top (it can even be OmniOS with LX containers, or any Linux) for container-based virtualisation. You only need around 2GB additionally for ESXi, with a very low impact on performance. The plus with ESXi is the very easy way to reinstall after a crash and to restore VMs by a simple copy on an NFS share.

Proxmox actually does both containers and real VMs. SmartOS is not assembled the same way and is more geared towards containers. Proxmox has VERY GOOD (better than ESXi) device passthrough: it can handle multiple types of passthrough (PCI, PCIe limited, and PCIe full, for video cards) and can have multiple devices on a single VM. There are some limits on the total number of passthrough devices per VM, but this is affected by the configuration and not by Proxmox itself.

The storage OSs (FreeNAS based on FreeBSD, or napp-it on some sort of Solaris) are all supported on Proxmox and have very useful config guides posted. And yes, ZFS is supported in the core of Proxmox; there are still use cases for a dedicated virtualized NAS running on top.
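As a sketch of what the poster describes, PCI passthrough in Proxmox is configured per VM. A hypothetical example entry in a VM's config file (`/etc/pve/qemu-server/<vmid>.conf`; the device address is an example, not from this thread):

```
# Pass an HBA at PCI address 01:00.0 through to this VM.
hostpci0: 01:00.0

# PCIe "full" passthrough (e.g. for video cards) adds pcie=1 and needs
# the q35 machine type:
# hostpci1: 02:00.0,pcie=1,x-vga=1
machine: q35
```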
 
I've got a file permissions issue that is driving me nuts! I can't delete a file, despite /usr/bin/ls -v showing that everyone has permissions on it. Files in the same directory that show the same results in ls -v are deletable!

Code:
root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
rm: cannot remove 'geofile.prc': Permission denied
root@filer:/aggr1/Folder1/Home/geofile# chown root:root geofile.prc
root@filer:/aggr1/Folder1/Home/geofile# ls -lh
total 16G
-rwxrwxrwx+ 1 root   root  16G Apr 16 07:13 geofile.prc
-rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:28 a.txt
-rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:32 b.txt
root@filer:/aggr1/Folder1/Home/geofile# chmod 777 geofile.prc
root@filer:/aggr1/Folder1/Home/geofile# ls -lh
total 16G
-rwxrwxrwx+ 1 root   root  16G Apr 16 07:13 geofile.prc
-rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:28 a.txt
-rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:32 b.txt
root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
rm: cannot remove 'geofile.prc': Permission denied
root@filer:/aggr1/Folder1/Home/geofile#

root@filer:/aggr1/Folder1/Home/geofile# /usr/bin/ls -v
total 33531322
-rwxrwxrwx+  1 root     root     17150952929 Apr 16 07:13 geofile.prc
     0:user:root:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:file_inherit
         /dir_inherit:allow
     1:owner@:read_data/write_data/append_data/read_xattr/write_xattr/execute
         /read_attributes/write_attributes/read_acl/write_acl/write_owner
         /synchronize:allow
     2:group@:read_data/write_data/append_data/read_xattr/execute
         /read_attributes/read_acl/synchronize:allow
     3:everyone@:read_data/write_data/append_data/read_xattr/execute
         /read_attributes/read_acl/synchronize:allow
-rwxrwxrwx+  1 andrew   staff          0 Apr 16 07:28 a.txt
     0:user:root:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:inherited:allow
     1:everyone@:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/synchronize:inherited:allow
-rwxrwxrwx+  1 andrew   staff          0 Apr 16 07:32 b.txt
     0:user:andrew:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:inherited:allow
     1:user:root:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:inherited:allow
     2:everyone@:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/synchronize:inherited:allow
root@filer:/aggr1/Folder1/Home/geofile#

What on earth is going on?

I've done the reset ACLs option in the ACL extension (napp-it Pro user), and according to the output it resets the folder and file permissions as expected. Yet Windows still says that I don't have enough permissions to see who the owner of the file is, despite being able to see that for the folder it's in.
 
root at the console can always delete, no matter the ACL/permission settings. This is different from Windows, where even admin respects ACL settings (but can always modify them). The main reasons that hinder root from deleting are a read-only filesystem or a file that is busy/opened by an application.
You can use Windows Computer Management (connect as a user that is a member of the Solarish SMB group administrators) to check for open files.

To remotely manage ACLs from Windows you need a Pro (not Home) edition; optionally, connect via SMB as user root for full permissions.
 
root at the console can always delete, no matter the ACL/permission settings. This is different from Windows, where even admin respects ACL settings (but can always modify them). The main reasons that hinder root from deleting are a read-only filesystem or a file that is busy/opened by an application.
You can use Windows Computer Management (connect as a user that is a member of the Solarish SMB group administrators) to check for open files.

To remotely manage ACLs from Windows you need a Pro (not Home) edition; optionally, connect via SMB as user root for full permissions.

But look at the output, where I show I'm logged into OmniOS as root and it's giving me "Permission denied" when I try to delete the file. Here it is again:

Code:
root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
rm: cannot remove 'geofile.prc': Permission denied

Edit: Also, to be clear, this is after the client that was touching that file was a) rebooted, and then, after that didn't work, b) shut down for a long time.

I guess it would be helpful to show what is locking a file open, if that is indeed the cause. Is there a command that will show any client protocol access to a file?
 
I am not aware of a command that shows whether the file is opened by any application or protocol.
You can try a reboot or a pool export/import, which removes all locks, to be sure about this option.
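The export/import Gea suggests would look like this (a sketch; the pool name matches the poster's pool, but substitute your own):

```shell
# Exporting and re-importing a pool drops any file locks held on it.
# The export will fail while datasets are busy, so stop services using
# the pool first.
zpool export aggr1
zpool import aggr1
```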

If you use only SMB you can check with Computer Management.
Is this a current release of OmniOS?

Beside that, locking is affected by the ZFS property nbmand (when set to on).

Another option that can solve some problems (followed by a CIFS service restart):
svccfg -s network/smb/server setprop smbd/oplock_enable=false
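The full sequence, including the CIFS service restart mentioned above, as a sketch (illumos SMF commands):

```shell
# Disable SMB oplocks, then restart the CIFS service so the
# property change takes effect.
svccfg -s network/smb/server setprop smbd/oplock_enable=false
svcadm restart network/smb/server
```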
 
Not sure why, but multipath doesn't seem to be picking up my latest addition. It configured the rest of the SAS drives on the same controllers and enclosure as the ZeusRAM, but the ZeusRAM is listed twice. Not sure how to edit the config for it to be recognized.

c22t5000A72B3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B
c26t5000A72A3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B

Curiously, Disk Location reports it as not on the LSI, even though that's all there is:
Disks available on non LSI2/3 controller
 
If you use multipath IO, disks are listed twice.
If this is unintended, you can disable MPIO (napp-it menu Disks > Details > Edit mpt_sas.conf), but I suppose you use MPIO on the ZeusRAM to double IO performance.

If disk detection on LSI controllers does not work properly, you can try a different sas2ircu or sas3ircu from LSI, as this tool is used for disk location/mapping. Prefer the tool that is offered by your controller manufacturer for your HBA. On some controllers (especially LSI 3008-based ones) you can try different/older versions from LSI. I have seen LSI 3008 HBAs that only work with a very old version of sas3ircu.
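For reference, a sketch of how the LSI tools are typically invoked to verify they can see the HBA at all (controller index 0 is an example):

```shell
# List the HBAs the utility can see; an empty list means this tool
# build does not match your controller or driver.
sas3ircu LIST

# Show controller, enclosure and slot details for the first adapter.
sas3ircu 0 DISPLAY
```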
 
I definitely want to use multipath; I want to use the ZeusRAM as a write log for a pool. Do I select both copies of the disk when I add it to the pool? In testing I only selected one, and my sync write performance seemed to hit a wall around 100MB/s, which was quite a bit lower than I expected.

What I meant about the LSI detection is that all my other SAS disks show up with enclosure and slot numbers but the ZeusRAM does not, even though it is in the same enclosure as some of the other drives that do.
 
Looks like adding it to the override fixed it.

In scsi_vhci.conf (via the edit menu) I added:

Code:
scsi-vhci-failover-override =
"STEC ZeusRAM", "f_sym";

After a reboot I see 2 paths, and proper enclosure/slot information on the LSI SAS2/3 controller. Additionally, the disk only shows up once in the add-pool section. Hopefully this will help with the performance.


And thank you Gea for making such a nice product!
 
Thank you for your feedback.
I own two ZeusRAMs but use them in different servers, so your solution is new to me.

I am glad that you like napp-it,
more or less a spinoff of my daily work, with a focus on simple property listing/editing
despite the smartphone-app-like hype in other products.
 
The next OmniOS 151022 long-term stable edition
(expected June 2017) is available as a public beta.
ReleaseNotes/r151022

For an update from a version prior to the current stable 151020 (up from the last LTS 151014),
- check Upgrade_to_r151022
- and napp-it // webbased ZFS NAS/SAN appliance for OmniOS, OpenIndiana, Solaris and Linux : Update OmniOS
- If you use napp-it, you must first update to the newest v17 free, pro or dev (April edition)

You must mainly take care about the number of BEs (<30), the switch from SunSSH to OpenSSH in 151018 stable, and a current napp-it to avoid problems with the Perl module Expect.
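The boot environment check before updating can be sketched with the standard illumos commands (the BE name below is a made-up example):

```shell
# Count existing boot environments; keep this below 30 before upgrading.
beadm list -H | wc -l

# Destroy an old, no longer needed BE to free a slot (example name).
beadm destroy -F omnios-old-backup
```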


Update from a current default stable OmniOS 151020

If you want to update from a current stable 151020 setup, you must update napp-it, switch the repository and update. The easiest way is using PuTTY as root, as you can copy/paste commands with a right mouse click.
Code:
pkg unset-publisher omnios
pkg set-publisher -P --set-property signature-policy=require-signatures -g package repository omnios
pkg update
reboot


As the setup creates a boot environment, you can go back to the former OS release when needed.
Please report any remaining problems.

napp-it

I have updated napp-it to work with this release (April 2017 edition);
earlier versions of napp-it are not compatible (tty problem of the Perl module Expect).


Problems

What I have found is a problem with the new loader under VMware,
where the installer switched to maintenance mode. You can skip that with Ctrl-D to run the setup.
 
Bad news for OmniOS users:
OmniTI stops commercial support and development after the next 151022 stable in May.
[OmniOS-discuss] The Future of OmniOS


We will see what this means for the future of OmniOS
(continuation as an open-source project, or a move/merge with another Illumos distribution like OI or SmartOS).
 
That's unfortunate. It seems most people are moving towards ZoL anyway, though.



Gea, have you ever tested the Pushover functionality on Solaris? Unfortunately it has never worked for me. The test messages work, but I never get alerts for pool degradation. For instance, my log shows:

Code:
 ok       scrub 1490197372       time: 2017.04.23.13.04.16       info: 
 ok       scrub 1490197372       time: 2017.04.23.13.04.16       info: resilver in progress since Sun Apr 23 13:04:10 2017
 ok       scrub 1490197372       time: 2017.03.24.06.10.35       info: 
 ok       scrub 1490197372       time: 2017.03.24.06.10.35       info: scrub repaired 2.16M in 1d18h with 0 errors on Fri Mar 24 06:10:30 2017

But I get no messages from Pushover. This happened on all the 16.x versions for us, Pro and Free. Just curious if you have any ideas or any tests you want me to run on Solaris 11.3 for you.
 
Alerts only trigger on a degraded pool, not on repaired checksum problems.
I have not tried Pushover on Solaris, but as it is based on a simple wget it should work there as well;
see the test script "/var/web-gui/data/napp-it/zfsos/15_Jobs and data services/05_Push/09_push-test/action.pl"
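The trigger condition Gea describes can be illustrated with a small sketch (hypothetical code, not napp-it's actual auto.pl logic): only pools whose health is in a faulted/degraded state should raise an alert, while a scrub that merely repaired errors should not.

```python
# Hypothetical sketch of the alert rule described above. The input mimics
# `zpool list -H -o name,health` output; this is not napp-it code.
ALERT_STATES = {"DEGRADED", "FAULTED", "SUSPENDED", "UNAVAIL"}

def pools_needing_alert(zpool_output: str) -> list[str]:
    """Return names of pools whose health column warrants an alert."""
    alerts = []
    for line in zpool_output.strip().splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] in ALERT_STATES:
            alerts.append(fields[0])
    return alerts

print(pools_needing_alert("test1\tFAULTED\ntest2\tSUSPENDED\nrpool\tONLINE"))
# -> ['test1', 'test2']
```

For the two failed test pools shown further down in this thread, such a check would flag both test1 (FAULTED) and test2 (SUSPENDED) while ignoring a healthy rpool.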

About OmniOS and ZFS:
Most ZFS usage will be on ZoL, but this is mainly because of market share.
I will always prefer the "it just works" ZFS solutions from Solaris and Illumos, the free Solaris fork.

There are now efforts among professional users to continue a commercial support option by collecting money,
and there are efforts to end the fragmentation between OmniOS and OpenIndiana, as they are nearly identical,
so a merge is possible.

see https://echelog.com/logs/browse/omnios/1493157600 and https://echelog.com/logs/browse/openindiana/1493157600


I have updated my HowTo for OpenIndiana and was thinking of moving to OpenIndiana after the next OmniOS stable
that is announced in a few days, see http://www.napp-it.org/doc/downloads/setup_openindiana.pdf
 
Yeah, I definitely like that ZFS works out of the box without issues on Solaris-based OSs, but I'm being more and more tempted by ZoL because of its wide hardware support.

Unfortunately, even with a pool failure my napp-it doesn't send a message. The test script that sends a test message does work, though.

I just tested in a VM. I made two pools fail by removing the storage, and no message was sent via Pushover.

Code:
test1 - -/ 0 - - - 0 [ /] - - - - - 17534799520560085192 FAULTED n.a. clear errors - 'test1': 'test1':
test2 28 1008M/ 0 461K - - 0 [976M /1.1G] 1.00x wait off off - 11282111706698191978 SUSPENDED n.a. clear errors - 'test2': 'test2':
 
Hi _Gea, I'm just trying out napp-it on OI with a Supermicro server and Seagate AP2584 JBODs connected to LSI SAS3 HBAs. The drives in the system are WD 4001FYY-01SL3 drives that have "Drive Trip Temperature" set to 40 in the firmware. So after a while I found that 48 of the 168 drives were disabled by fmd. Have you run into this before? It looks like a known issue in illumos (https://www.illumos.org/issues/7327); a patch was suggested but never merged. Any suggestions?

Two other issues:
1. I installed the sas2ircu and sas3ircu utils, but they don't seem to find the controllers. These are LSI 9300 HBAs. I downloaded sas3ircu from the Broadcom site for this HBA and tried the sas3ircu_solaris_x86_rel build. The program works (it shows usage if I run it without any parameters), but I'm not sure why it doesn't find the controllers.

2. Is there an easy way to reset the admin password from the command line?

Thanks a lot.
 
Yeah, I definitely like that ZFS works out of the box without issues on Solaris-based OSs, but I'm being more and more tempted by ZoL because of its wide hardware support.

Unfortunately, even with a pool failure my napp-it doesn't send a message. The test script that sends a test message does work, though.

I just tested in a VM. I made two pools fail by removing the storage, and no message was sent via Pushover.

Code:
test1 - -/ 0 - - - 0 [ /] - - - - - 17534799520560085192 FAULTED n.a. clear errors - 'test1': 'test1':
test2 28 1008M/ 0 461K - - 0 [976M /1.1G] 1.00x wait off off - 11282111706698191978 SUSPENDED n.a. clear errors - 'test2': 'test2':

Regarding hardware support:
Linux supports more hardware,
but a "best use" Linux ZFS hardware setup is the same as a BSD or Solarish ZFS one.

Regarding alerts:
Have you enabled the auto service at all? (On Linux you must add a cron job for auto.pl manually.)
 
Hi _Gea, I'm just trying out napp-it on OI with a Supermicro server and Seagate AP2584 JBODs connected to LSI SAS3 HBAs. The drives in the system are WD 4001FYY-01SL3 drives that have "Drive Trip Temperature" set to 40 in the firmware. So after a while I found that 48 of the 168 drives were disabled by fmd. Have you run into this before? It looks like a known issue in illumos (https://www.illumos.org/issues/7327); a patch was suggested but never merged. Any suggestions?

Two other issues:
1. I installed the sas2ircu and sas3ircu utils, but they don't seem to find the controllers. These are LSI 9300 HBAs. I downloaded sas3ircu from the Broadcom site for this HBA and tried the sas3ircu_solaris_x86_rel build. The program works (it shows usage if I run it without any parameters), but I'm not sure why it doesn't find the controllers.

2. Is there an easy way to reset the admin password from the command line?

Thanks a lot.

1. I have not seen such a problem with my disks or with the currently suggested NAS disks.
It seems mainly a disk/firmware problem of some disks. You can disable fmd, but this will disable disk hot-spare auto-replace.

2. For the napp-it admin password,
edit or delete /var/web-gui/_log/napp-it.cfg
 