OpenSolaris derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Discussion in 'SSDs & Data Storage' started by _Gea, Dec 30, 2010.

  1. jad0083

    jad0083 [H]Lite

    Messages:
    115
    Joined:
    Apr 30, 2006
    With the current OmniOS stable (151020) in a VM, what's currently the best prosumer NVMe on the market to use as a passthrough SLOG device?
     
  2. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    You need powerloss protection for the NVMe as an Slog.
    I would check the Intels (P750, P3700 etc)

    alternative: use the NVMe as a pool for VMs directly (no Slog or L2ARC needed)
     
  3. seijirou

    seijirou n00bie

    Messages:
    33
    Joined:
    Feb 28, 2012
    Installed omnios-r151020-4151d05 over the weekend. Everything seems to be up and working but I'm getting this message when I log in. I saw the notes about doing chmod 666 /dev/null as root but that returns the same error in the console.
     

  4. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    I had the same problem suddenly.
    The above fix worked for me.

    Is napp-it working for you apart from that message on login?
    /dev/null is only needed as a target for unwanted console messages.
     
  5. jlbenedict

    jlbenedict Gawd

    Messages:
    960
    Joined:
    May 22, 2005
    I'm looking into providing some storage to my 2-node ESXi cluster via either NFS or iSCSI.
    I have a whitebox and I've been wondering whether it has the potential to do this ("home production" use).

    SuperMicro X7DVL-3, with dual Intel L5410 cpus
    24gb ram
    motherboard has 8 SAS ports via a LSI 1068E (i have this flashed to I.T. mode already).
    dual on-board NICs + an additional 4-port gigabit Intel (6 total)
    All shoved in a Rosewill 4u case, that can hold 12 drives.

    ??
     
    Last edited: Mar 9, 2017
  6. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Do you already have this box?
    It needs a lot of energy for little CPU power (compared to newer systems), has only one PCIe slot, and the 1068 is limited to 3G and disks of max 2TB.

    If you can live with that, it should work.
     
  7. jlbenedict

    jlbenedict Gawd

    Messages:
    960
    Joined:
    May 22, 2005
    Yes, I do have this, as it is my current lab and media server :)
    I have two ESXi hosts on the way, and I'm on a tight budget, so I was hoping to convert it to storage use only. I'm not looking for top-of-the-line performance, and I can deal with the energy need. I'm thinking this box could still be adequate to hold me over.
     
  8. seijirou

    seijirou n00bie

    Messages:
    33
    Joined:
    Feb 28, 2012
    Gea,

    Yes as far as I can tell everything otherwise works fine. I do not recall having this error immediately after install.. it just crept in at some point a day or two later. The use of /dev/null will work b/c it's currently set 666 (777 on the symlink but that's always reported for symlinks AFAIK). I see the line setting 666 on webpage load. Do you see any harm in commenting that out for now?

    I've tried digging but can't find any fix for the error because I can't find any cause. I can't find any extended attributes on the file, and ownership is root:root, so I don't understand why there's a chown error. I tried to touch /reconfigure and reboot, but it made no difference.
     
  9. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    In earlier napp-it releases I sent some error messages to /dev/null.
    When I first saw this problem I removed that, so current napp-it will work despite a problem with /dev/null.

    I then added the chmod command on login (hence the message).
    This worked in my setups, but it is not needed for operation.
     
  10. seijirou

    seijirou n00bie

    Messages:
    33
    Joined:
    Feb 28, 2012
    I fixed it, but I don't know exactly what did it.

    It might have been creating /reconfigure and rebooting. I thought that didn't work before, but perhaps it did. If anybody else stumbles on this, I suppose that's worth a try. I know I did have to manually run `sudo devlinks` after rebooting and logging in as a normal user, because I saw an error about /dev/null and saw that it was a regular file. I want to say it was another reboot after that where the problem went away.
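
For anyone hitting the same symptom, a quick way to see whether /dev/null has turned into a regular file is a plain shell type test. This is only a minimal sketch: `check_dev_node` is a hypothetical helper name, not part of OmniOS or napp-it, and it only reports the state, it does not repair anything.

```shell
# check_dev_node is a hypothetical helper, not an OmniOS or napp-it command.
# It reports whether a path is a character device, as /dev/null should be.
check_dev_node() {
    node="$1"
    if [ -c "$node" ]; then
        # -c follows symlinks, so a /dev/null symlink to a device node passes.
        echo "ok: $node is a character device"
    elif [ -f "$node" ]; then
        echo "bad: $node is a regular file"
        return 1
    else
        echo "bad: $node is missing or of an unexpected type"
        return 1
    fi
}

check_dev_node /dev/null
```

If it reports a regular file, the steps described above (creating /reconfigure, rebooting, running devlinks) are the ones that appeared to help here.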

    Unfortunately I had more than that going on at the time taking my attention so I only noticed the issue went away by accident.
     
  11. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    It's as curious as the problem itself, which appeared suddenly...
     
  12. mmmmmcoke

    mmmmmcoke n00bie

    Messages:
    8
    Joined:
    Oct 26, 2015
    This is probably a crazy question, but how come there isn't much discussion of using Proxmox with napp-it for an all-in-one build?
    I know ESXi has quite a bit of information posted for a napp-it all-in-one build. (Heck, I'm currently running one myself.)
    But I'm kind of surprised, given the growing popularity of Proxmox and the shrinking (very slowly, especially in the actual business world) popularity of ESXi.
    I'm not tied to ESXi and I'm kind of interested in seeing what Proxmox / Xen can offer, but at the same time the complete void of information is making me wonder if I'm missing something. (There aren't even posts about how terrible Proxmox and napp-it might be together.)
     
  13. ZzBloopzZ

    ZzBloopzZ [H]ard|Gawd

    Messages:
    1,296
    Joined:
    Sep 18, 2004
    Dear Gea,

    I hope all is well! I am still on R151014 using ESXi 5.5. I was thinking of updating to the latest ESXi 6.0.

    Since there have been so many changes with OmniOS, do you think I should set up a fresh VM with R151020, or should I just update R151014 following the instructions? I am aware of the switch to OpenSSH before updating, per the release notes. Just wondering if I am better off doing a fresh install/VM.

    Thanks!
     
  14. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    The current hype is lightweight container-based virtualisation for Linux guests.
    This is something that a type-1 hypervisor like ESXi or Xen does not offer; they offer full OS virtualisation (of any guest). In the case of ESXi you additionally get the best pass-through and driver support for guest OSs, which makes it a perfect base for a storage VM.

    If you use Proxmox (or the Illumos distribution SmartOS), you can use ZFS as the native filesystem, so you do not need a storage VM if you only want ZFS. This is different if you want a full-featured storage NAS/SAN appliance. In this case you need pass-through to assign the disk controller and disks to the storage VM.

    There are reports (google "proxmox napp-it") from people who have done this, but I have no experience myself with pass-through on Proxmox. As an alternative, you can use ESXi as a base for full virtualisation of any guests and a guest (which can even be OmniOS with LX containers, or any Linux) for container-based virtualisation on top. You only need around 2GB additionally for ESXi, with a very low impact on performance. The plus with ESXi is the very easy way to reinstall after a crash and to restore VMs by a simple copy on an NFS share.
     
  15. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    You can either update (->151018, switch SSH, ->151020) or
    you can use my ova template with 151020.

    You may also wait for the next release, 151022, which should be available in Q2.
     
  16. danswartz

    danswartz 2[H]4U

    Messages:
    3,573
    Joined:
    Feb 25, 2011
    I bought a couple of Samsung 960 PRO drives to use for a mirrored pool for vSphere guests. Sadly, this drive is TOO new and, being NVMe 1.2, is apparently not supported by either Illumos or Solaris 11.3 :(
     
    Last edited: Mar 13, 2017
  17. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
  18. danswartz

    danswartz 2[H]4U

    Messages:
    3,573
    Joined:
    Feb 25, 2011
    Hmmm, that's an interesting idea. Let me check...
     
  19. sorhol

    sorhol n00bie

    Messages:
    10
    Joined:
    Jul 29, 2014
    After updating to 17.01 free, the menus are messed up in Chrome. It's OK in Edge.

    Does anybody know how to fix this?
    TIA
     
  20. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Just reload the page. The new version has a modified CSS, and your browser is still using the old one from its cache.
     
  21. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
  22. Zedicus

    Zedicus Limp Gawd

    Messages:
    287
    Joined:
    Nov 2, 2010
    Proxmox actually does both containers and real VMs. SmartOS is not assembled in the same way and is more geared towards containers. Proxmox has VERY GOOD (better than ESXi) device passthrough. It can handle multiple types of passthrough (PCI, PCIe limited, and PCIe full (for video cards)) and can have multiple devices on a single VM. There are some limits to the total number of passthrough devices per VM, but this is affected by the configuration and not by Proxmox itself.

    The OSs (FreeNAS based on FreeBSD, or napp-it on some sort of Solaris) are all supported on Proxmox and have very useful config guides posted. And yes, ZFS is supported in the core of Proxmox, but there are still use cases for a dedicated virtualized NAS running on top.
     
  23. healthent

    healthent n00bie

    Messages:
    2
    Joined:
    Apr 16, 2017
    I've got a file permissions issue that is driving me nuts!! I can't delete a file, despite /usr/bin/ls -v showing everyone having permissions to it. Files in the same directory, that show the same results in ls -v, are deletable!

    Code:
    root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
    rm: cannot remove 'geofile.prc': Permission denied
    root@filer:/aggr1/Folder1/Home/geofile# chown root:root geofile.prc
    root@filer:/aggr1/Folder1/Home/geofile# ls -lh
    total 16G
    -rwxrwxrwx+ 1 root   root  16G Apr 16 07:13 geofile.prc
    -rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:28 a.txt
    -rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:32 b.txt
    root@filer:/aggr1/Folder1/Home/geofile# chmod 777 geofile.prc
    root@filer:/aggr1/Folder1/Home/geofile# ls -lh
    total 16G
    -rwxrwxrwx+ 1 root   root  16G Apr 16 07:13 geofile.prc
    -rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:28 a.txt
    -rwxrwxrwx+ 1 andrew staff   0 Apr 16 07:32 b.txt
    root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
    rm: cannot remove 'geofile.prc': Permission denied
    root@filer:/aggr1/Folder1/Home/geofile#
    
    root@filer:/aggr1/Folder1/Home/geofile# /usr/bin/ls -v
    total 33531322
    -rwxrwxrwx+  1 root     root     17150952929 Apr 16 07:13 geofile.prc
         0:user:root:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/write_acl/write_owner/synchronize:file_inherit
             /dir_inherit:allow
         1:owner@:read_data/write_data/append_data/read_xattr/write_xattr/execute
             /read_attributes/write_attributes/read_acl/write_acl/write_owner
             /synchronize:allow
         2:group@:read_data/write_data/append_data/read_xattr/execute
             /read_attributes/read_acl/synchronize:allow
         3:everyone@:read_data/write_data/append_data/read_xattr/execute
             /read_attributes/read_acl/synchronize:allow
    -rwxrwxrwx+  1 andrew   staff          0 Apr 16 07:28 a.txt
         0:user:root:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/write_acl/write_owner/synchronize:inherited:allow
         1:everyone@:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/synchronize:inherited:allow
    -rwxrwxrwx+  1 andrew   staff          0 Apr 16 07:32 b.txt
         0:user:andrew:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/write_acl/write_owner/synchronize:inherited:allow
         1:user:root:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/write_acl/write_owner/synchronize:inherited:allow
         2:everyone@:read_data/write_data/append_data/read_xattr/write_xattr
             /execute/delete_child/read_attributes/write_attributes/delete
             /read_acl/synchronize:inherited:allow
    root@filer:/aggr1/Folder1/Home/geofile#
    What on earth is going on?

    I've done the reset ACLs option in the ACL extension (napp it Pro user) and according to the output, it resets the folder and file permissions as expected. Yet Windows still says that I don't have enough permissions to see who the owner of the file is, despite being able to see that for the folder it's in.
     
    Last edited: Apr 16, 2017
  24. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Root at the console can always delete, no matter the ACL/permission setting. This is different from Windows, where even admin respects ACL settings (but can always modify them). The main things that hinder root from deleting are a read-only filesystem or a file that is busy/opened by an application.
    You can use Windows Computer Management (connect as a user that is a member of the Solarish SMB group administrators) to check for open files.

    To remotely manage ACLs from Windows you need a Pro (not Home) edition; optionally connect via SMB as user root for full permissions.
     
  25. healthent

    healthent n00bie

    Messages:
    2
    Joined:
    Apr 16, 2017
    But look at the output, where I show I'm logged into OmniOS as root and it's giving me permission denied when I try to delete the files? Here it is again:

    Code:
    root@filer:/aggr1/Folder1/Home/geofile# rm geofile.prc
    rm: cannot remove 'geofile.prc': Permission denied
    
    Edit: Also, to be clear, this is after the client that was touching that file was a) rebooted, then after that didn't work, b) shutdown for a long time.

    I guess it would be helpful to show what is locking a file open, if that is indeed the cause. Is there a command that will show any client protocol access to a file?
     
    Last edited: Apr 17, 2017
  26. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    I am not aware of a command that shows whether the file is opened by any application or protocol.
    To rule this out, you can try a reboot or a pool export/import, which removes all locks.

    If you use only SMB, you can check with Computer Management.
    Is this a current release of OmniOS?

    Besides that, locking is affected by the ZFS property nbmand (when set to on).

    Another option that can solve some problems (follow it with a restart of the SMB/CIFS service) is:
    svccfg -s network/smb/server setprop smbd/oplock_enable=false
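
To see how nbmand is currently set on the affected filesystem, the property can be read with `zfs get`. The following is only a sketch: `report_nbmand` is a hypothetical helper name, and the dataset name in the example call is an assumption derived from the paths quoted earlier in the thread, so substitute the real filesystem.

```shell
# report_nbmand is a hypothetical helper name, not an existing tool.
report_nbmand() {
    fs="$1"
    # -H drops the header line, -o value prints only the property value.
    state=$(zfs get -H -o value nbmand "$fs") || return 1
    echo "$fs nbmand=$state"
}

# Example call (the dataset name is assumed, adjust it to your pool layout):
# report_nbmand aggr1/Folder1
```

A changed nbmand value only takes effect once the filesystem is remounted, which is one more reason why an export/import or a reboot is a sensible test.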
     
  27. zepcs

    zepcs [H]Lite

    Messages:
    89
    Joined:
    May 11, 2011
    Not sure why, but multipath doesn't seem to be picking up my latest addition. It configured the rest of the SAS drives on the same controllers and enclosure as the ZeusRAM, but the ZeusRAM is listed twice. I'm not sure how to edit the config so that it is recognized.

    c22t5000A72B3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B
    c26t5000A72A3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B

    Curiously the Disk Location reports it not on the lsi, even though that's all there is
    Disks available on non LSI2/3 controller
     
  28. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    If you use multipath io, disks are listed twice.
    If this is unintended, you can disable mpio (napp-it menu Disks > Details > Edit mpt_sas.conf) but I suppose you use mpio on the ZeusRAM to double io performance.

    If the disk detection on LSI controllers does not work properly, you can try a different sas2ircu or sas3ircu from LSI as this tool is used for disk location/map. Prefer the tool that is offered by your controller manufacturer for your HBA. On some controllers (especially LSI 3008 based ones) you can try different/older versions from LSI. I have seen LSI 3008 HBAs that only work with a very old version of sas3ircu.
     
  29. zepcs

    zepcs [H]Lite

    Messages:
    89
    Joined:
    May 11, 2011
    I definitely want to use multipath; I want to use the ZeusRAM as a write log for a pool. Do I select both copies of the disk when I add it to the pool? In testing I only selected one, and my sync write performance seemed to hit a wall around 100MB/s, which was quite a bit lower than I expected.

    What I meant about the LSI detection is that all my other sas disks show up with enclosure and slot numbers but the ZeusRAM does not, even though it is in the same enclosure as some of the other drives that do.
     
  30. zepcs

    zepcs [H]Lite

    Messages:
    89
    Joined:
    May 11, 2011
    Looks like adding it to the override fixed it.

    In the edit scsi_vhci.conf I added

    scsi-vhci-failover-override =
    "STEC ZeusRAM", "f_sym";

    After a reboot I see 2 paths, and proper enclosure/slot information on the lsi sas2/3 controller. Additionally the disk only shows once in the add pool section. Hopefully this will help with the performance.
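
One way to double-check that no disk is still listed under two controller paths is to look for repeated serial numbers in a disk listing. A minimal sketch, assuming the serial is the last whitespace-separated field, as in the two ZeusRAM lines quoted earlier; `find_dup_serials` is a hypothetical helper name, not a napp-it function.

```shell
# find_dup_serials is a hypothetical helper, not part of napp-it.
# It reads a disk listing on stdin and prints every serial number
# (assumed to be the last field) that appears on more than one line.
find_dup_serials() {
    awk 'NF { count[$NF]++ } END { for (s in count) if (count[s] > 1) print s }'
}

# The two lines from the listing before the scsi_vhci.conf change:
find_dup_serials <<'EOF'
c22t5000A72B3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B
c26t5000A72A3007B073d0 1 via dd ok 8000 MB 8 GB S:0 H:0 T:0 STEC ZeusRAM STM000173D0B
EOF
```

With a listing taken after the override and reboot, the function should print nothing for this disk.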


    And thank you Gea for making such a nice product!
     
  31. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Thank you for your feedback.
    I own two ZeusRAM but use them in different servers so your solution is new to me.

    I am glad that you like napp-it,
    which is more or less a spinoff of my daily work, with a focus on simple property listings/editing
    despite the smartphone-like app hype in other products.
     
  32. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    The next OmniOS 151022 long term stable edition
    (expected June 2017) is available as a public beta.
    ReleaseNotes/r151022

    For an update from a version prior to the current stable 151020 (from the last LTS 151014 upwards),
    - check Upgrade_to_r151022
    - and napp-it // webbased ZFS NAS/SAN appliance for OmniOS, OpenIndiana, Solaris and Linux :Update OmniOS
    - If you use napp-it, you must first update to newest v 17 free, pro or dev (April edition)

    You must mainly take care of the number of BEs (<30) and the switch from SunSSH to OpenSSH in 151018 stable, and use a current napp-it to avoid problems with the Perl module Expect.
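
The boot environment limit mentioned above can be checked before starting the update. This is a sketch only, assuming that `beadm list -H` prints one machine-parsable line per boot environment; `count_boot_envs` and `be_count_ok` are hypothetical helper names.

```shell
# Hypothetical helpers, not part of OmniOS or napp-it.
count_boot_envs() {
    # Assumption: beadm list -H emits one line per boot environment.
    beadm list -H | wc -l | tr -d ' '
}

be_count_ok() {
    # Succeeds while the number of BEs is below the limit of 30 named above.
    [ "$(count_boot_envs)" -lt 30 ]
}

# Example call:
# be_count_ok || echo "too many boot environments, destroy old ones first"
```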


    Update from a current default stable OmniOS 151020

    If you want to update from a current stable 151020 setup, you must update napp-it, switch the repository and update. The easiest way is using PuTTY as root, as you can copy/paste commands with a right mouse click.
    Code:
    pkg unset-publisher omnios
    pkg set-publisher -P --set-property signature-policy=require-signatures -g package repository omnios
    pkg update
    reboot

    As the update creates a boot environment, you can go back to the former OS release when needed.
    Please report remaining problems

    napp-it

    I have updated napp-it to work with this release (April 2017 edition)
    earlier versions of napp-it are not compatible (Tty problem of Perl module Expect)


    Problems

    What I have found is a problem with the new loader under VMware
    where the installer switched to maintenance mode. You can skip that with ctrl-d to run the setup
     
  33. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Bad news for OmniOS users
    OmniTI stops commercial support and development after the next 151022 stable in May
    [OmniOS-discuss] The Future of OmniOS


    We will see what this means for the Future of OmniOS
    (continuation as an OpenSource project, or move/merge with another Illumos distribution like OI or SmartOS)
     
  34. CopyRunStart

    CopyRunStart Limp Gawd

    Messages:
    144
    Joined:
    Apr 3, 2014
    That's unfortunate. It seems most people are moving towards ZoL anyway though.



    Gea have you ever tested the Pushover functionality on Solaris? Unfortunately it has never worked for me. The test messages work but I never get alerts for pool degradation. For instance, my log shows:

    Code:
     ok       scrub 1490197372       time: 2017.04.23.13.04.16       info: 
     ok       scrub 1490197372       time: 2017.04.23.13.04.16       info: resilver in progress since Sun Apr 23 13:04:10 2017
     ok       scrub 1490197372       time: 2017.03.24.06.10.35       info: 
     ok       scrub 1490197372       time: 2017.03.24.06.10.35       info: scrub repaired 2.16M in 1d18h with 0 errors on Fri Mar 24 06:10:30 2017
    
    But I get no messages from Pushover. This happened on all the 16.x versions for us, Pro and Free. Just curious if you have any ideas or any tests you want me to run on Solaris 11.3 for you.
     
  35. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    The alert only triggers on a degraded pool, not on repaired checksum problems.
    I have not tried Pushover on Solaris, but as it is based on a simple wget it should work there as well,
    see testscript "/var/web-gui/data/napp-it/zfsos/15_Jobs and data services/05_Push/09_push-test/action.pl"

    about OmniOS and ZFS
    Most ZFS usage will be on ZoL, but this is mainly because of market share.
    I will always prefer the "it just works" ZFS solutions from Solaris and Illumos, the free Solaris fork.

    There are now efforts among professional users to continue a commercial support option by collecting money,
    and there are efforts to end the fragmentation between OmniOS and OpenIndiana, as they are nearly identical,
    so a merge is possible.

    see https://echelog.com/logs/browse/omnios/1493157600 and https://echelog.com/logs/browse/openindiana/1493157600


    I have updated my HowTo for OpenIndiana and was thinking to move to OpenIndiana after next OmniOS stable
    that is announced in a few days, see http://www.napp-it.org/doc/downloads/setup_openindiana.pdf
     
  36. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Last edited: May 2, 2017
  37. CopyRunStart

    CopyRunStart Limp Gawd

    Messages:
    144
    Joined:
    Apr 3, 2014
    Yea I definitely like that ZFS works out of box without issues on Solaris based OSs, but I'm being more and more tempted by ZoL because of wide hardware support.


    Unfortunately even with pool failure, my Napp-it doesn't send a message. The testscript that sends a test message does work though.

    I just tested in a VM. Made two pools fail by removing the storage and no message was sent via Pushover.

    test1 - -/ 0 - - - 0 [ /] - - - - - 17534799520560085192 FAULTED n.a. clear errors - 'test1': 'test1':
    test2 28 1008M/ 0 461K - - 0 [976M /1.1G] 1.00x wait off off - 11282111706698191978 SUSPENDED n.a. clear errors - 'test2': 'test2':
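
Independent of napp-it, the pool states shown above can be cross-checked from the shell to confirm what an alert job should be seeing. This is only a sketch and not how napp-it itself decides to send an alert; `unhealthy_pools` is a hypothetical helper name.

```shell
# unhealthy_pools is a hypothetical helper, not a napp-it function.
unhealthy_pools() {
    # zpool list -H -o name,health prints "name<TAB>health" per pool, no header.
    zpool list -H -o name,health | awk '$2 != "ONLINE" { print $1, $2 }'
}

# Example call: prints nothing when every pool is ONLINE.
# unhealthy_pools
```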
     
  38. sdc

    sdc n00bie

    Messages:
    2
    Joined:
    Apr 28, 2017
    Hi _Gea, Just trying out napp-it on OI with Supermicro server and Seagate AP2584 JBOD's connected to LSI SAS3 HBAs. The drives in the system are WD 4001FYY-01SL3 drives that have "Drive Trip Temperature" set to 40 in the firmware. So after a while I found that 48 of the 168 drives were disabled by fmd. Have you run into this before? It looks like it is a known issue in illumos: https://www.illumos.org/issues/7327 and a patch was suggested but it never was merged. Any suggestions?

    Two other issues: 1. I installed the sas2ircu and sas3ircu utils but they don't seem to find the controllers. These are LSI 9300 HBAs. I downloaded sas3ircu from the Broadcom site for this HBA and tried using the sas3ircu_solaris_x86_rel. The program works (shows program usage if I just run the program without any parameters) but not sure why it doesn't find the controllers.

    2. Any easy way to reset the admin password from the command line?

    Thanks a lot.
     
  39. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    Regarding hardware support:
    Linux supports more hardware,
    but "best use" ZFS hardware for Linux is the same as for BSD or Solarish.

    Regarding alerts:
    Have you enabled auto at all? (On Linux you must add a cron job for auto.pl manually.)
     
    Last edited: Apr 28, 2017
  40. _Gea

    _Gea 2[H]4U

    Messages:
    3,649
    Joined:
    Dec 5, 2010
    1. I have not seen such a problem with my disks or with current suggested NAS disks
    It seems mainly a disk/firmware problem of some disks. You can disable fmd but this will disable disk hot spare auto replace

    2. For napp-it admin
    edit or delete /var/web-gui/_log/napp-it.cfg
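
If you go the delete route, moving the file aside keeps a copy of the old settings in case they are needed later. A minimal sketch; `set_aside_cfg` is a hypothetical helper name and the `.bak` suffix is an arbitrary choice.

```shell
# set_aside_cfg is a hypothetical helper, not a napp-it command.
# It renames the given file to <file>.bak instead of deleting it.
set_aside_cfg() {
    cfg="$1"
    if [ ! -f "$cfg" ]; then
        echo "not found: $cfg" >&2
        return 1
    fi
    mv "$cfg" "$cfg.bak" && echo "moved $cfg to $cfg.bak"
}

# Example call, run as root:
# set_aside_cfg /var/web-gui/_log/napp-it.cfg
```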