OpenSolaris derived ZFS NAS/ SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Discussion in 'SSDs & Data Storage' started by _Gea, Dec 30, 2010.

  1. danswartz

    danswartz 2[H]4U

    Messages:
    3,610
    Joined:
    Feb 25, 2011
    I currently have an NFS server running ZoL 0.7.11. It works okay, but I'd prefer an Illumos-based distro. My ESXi host is connected to the storage server using a point-to-point QSFP cable and two Mellanox ConnectX-3 EN cards. That works okay too, but I'd like to dabble with RDMA.

    The most visible products are the Mellanox ConnectX-4 Lx cards, but Mellanox seems to have given up any interest in Solaris-based products long ago. Chelsio makes a couple of interesting cards (T5 and T6 ASICs), but Chelsio seems to be riding the iWARP bandwagon, which is broken down by the side of the road (even Intel seems to have ditched it). So now Chelsio is pushing things like iSCSI offload, although I guess you might be able to do RDMA NFS from ESXi => Solarish server. But VMware has stated that from 6.7 forward they will only support RoCE (implying no support for iWARP).

    If illumos provides drivers for ESXi and Solarish, I guess I'd be fine with that, but it's hard to tell from reading the illumos HCL what is in fact supported. The Chelsio driver claims to support everything up to the T6 ASIC, but that doesn't mean it supports all the fancy offload stuff. Sorry for the TL;DR :) If anyone has any concrete recommendations, I'd much appreciate them. If I can't find out anything useful, I might need to stick with Linux and go with the recent Mellanox cards.
     
  2. _Gea

    _Gea 2[H]4U

    Messages:
    3,799
    Joined:
    Dec 5, 2010
  3. danswartz

    Thanks, but I've moved on. I RMA'ed the Chelsio cards and ordered two ConnectX-5 EN cards (50GbE, single port). Unfortunately, Solarish variants aren't supported. That's life, I guess...
     
  4. grendel19

    grendel19 Gawd

    Messages:
    579
    Joined:
    Jun 26, 2009
    Isn't basic vdev removal available on OmniOS 151025+? I have 151026 installed and accidentally added two disks as basic vdevs to an existing z2 pool. Was looking to be able to remove these without having to perform the usual copy/destroy/recreate.

    I ran zpool remove [pool] [disk] but still get the following message:

    "invalid config; all top-level vdevs must have the same sector size and not be raidz"
     
  5. _Gea

    Vdev removal is in Illumos/OpenZFS, but currently without support for pools that contain a raid-z vdev. Vdev removal for all kinds of vdevs is currently only available in Oracle Solaris 11.4.
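
    A minimal shell sketch of a pre-check for this situation, assuming zpool status lists raid-z vdevs under names such as raidz2-0. The helper pool_has_raidz is not a ZFS command, only an illustration, and the pool name tank and disk name c1t2d0 in the comments are hypothetical. It reads zpool status output and reports whether a raid-z vdev is present, in which case the removal is expected to be refused on Illumos.

```shell
# Reads `zpool status` output on stdin.
# Succeeds (exit 0) if a raid-z vdev such as raidz2-0 is listed.
pool_has_raidz() {
    grep -Eq '^[[:space:]]*raidz[123]?-[0-9]+[[:space:]]'
}

# Intended use on a live system (not run here, names are hypothetical):
#   if zpool status tank | pool_has_raidz; then
#       echo "tank contains a raid-z vdev: device removal will likely be refused"
#   else
#       zpool remove tank c1t2d0
#   fi
```

    Checking first avoids guessing whether the error comes from the command syntax or from the pool layout.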
     
  6. _Gea

    Last edited: Nov 6, 2018
  7. _Gea

    napp-it Z-Raid vCluster v.2

    I am close to finishing my Z-Raid Cluster in a Box solution.
    Traditionally, a Cluster in a Box consists of two servers with a common pool of MPIO SAS disks. One of the servers builds a pool from the disks and offers services like NFS or SMB. On a failure or for maintenance, you can switch over to the second server within a few seconds. Management is done, for example, with RSF-1 from High-Availability.com. SuperMicro offers special cases that can hold two mainboards.

    Such solutions are expensive and their management is quite complex, but they offer high performance and high availability. To reduce costs, my solution is based on ESXi (any license) to virtualize the two nodes and a control instance. It uses the shared controller/shared raw disk options of ESXi, so it does not need multipath SAS and can work with any disks.

    For setup, see http://www.napp-it.org/doc/downloads/z-raid.pdf

    If you want to try the preview state that allows a manual failover within 20s, you can update napp-it to the free 18.02 preview (or 18.12dev).


    cluster-in-a-box.png
     
  8. _Gea

    vCluster Beta2 is available (napp-it 18.12g dev)

    Current state:
    Manual failover between nodes is working for NFS and SMB.
    For SMB, failover is supported for local users and AD users connected to the share during failover.

    Todo:
    auto-failover (in testing)

    Expect vCluster in the next Pro release (Jan 2019).
     
  9. spankit

    spankit Limp Gawd

    Messages:
    262
    Joined:
    Oct 18, 2010
    I'm trying to configure TLS mail and I'm getting the following error when trying to install IO::Socket::SSL in the CPAN shell.

    Code:
    cpan[1]> notest install IO::Socket::SSL
    Reading '/root/.cpan/Metadata'
      Database was generated on Mon, 19 Nov 2018 03:29:03 GMT
    Running install for module 'IO::Socket::SSL'
    Checksum for /root/.cpan/sources/authors/id/S/SU/SULLR/IO-Socket-SSL-2.060.tar.gz ok
    Scanning cache /root/.cpan/build for sizes
    ............................................................................DONE
    'YAML' not installed, will not store persistent state
    Configuring S/SU/SULLR/IO-Socket-SSL-2.060.tar.gz with Makefile.PL
    ld.so.1: perl: fatal: relocation error: file /usr/perl5/site_perl/5.28/i86pc-solaris-thread-multi-64int/auto/Net/SSLeay/SSLeay.so: symbol CRYPTO_get_locking_callback: referenced symbol not found
    Warning: No success on command[/usr/perl5/5.28/bin/i386/perl Makefile.PL]
      SULLR/IO-Socket-SSL-2.060.tar.gz
      /usr/perl5/5.28/bin/i386/perl Makefile.PL -- NOT OK
    Failed during this command:
     SULLR/IO-Socket-SSL-2.060.tar.gz             : writemakefile NO '/usr/perl5/5.28/bin/i386/perl Makefile.PL' returned status 9
    
    I've recently upgraded from omniosce-r151026 to omniosce-r151028.
     
  10. jad0083

    jad0083 Limp Gawd

    Messages:
    134
    Joined:
    Apr 30, 2006
    Hi Gea,

    Does OmniOS support NFS 4.1/4.2 yet?
     
  11. _Gea

    I was successful with TLS (https://napp-it.org/downloads/tls_en.html):

    To enable TLS emails on OmniOS 151018 and up, use the following setup

    (use PuTTY, log in as root and copy/paste the commands with a mouse right-click; on questions, use the defaults. Thanks to Rick.)

    perl -MCPAN -e shell
    notest install Net::SSLeay
    notest install IO::Socket::SSL
    notest install Net::SMTP::TLS
    exit;

    Just confirm any questions with enter (skip tests)
     
  12. _Gea

    No, Illumos (OmniOS) supports NFS 4.0.
    Oracle Solaris supports NFS 4.1 (and SMB 3.1 with the kernel-based SMB server).
     
  13. jad0083

    OK, cool, thanks Gea! Unfortunately, Oracle Solaris' lack of VMXNET3 drivers kills it for me :/
     
  14. _Gea

  15. spankit

    That's the document I followed when I was trying to get it going. I'm having issues on this particular server when running the following command.

    notest install IO::Socket::SSL
     
  16. _Gea

    On a clean install of OmniOS 151028 it installs when just confirming the defaults.
    But I have seen several problems on updates (depending on the first version in the update chain), as 151028 finalized the move from Sun SSH to OpenSSH.

    What you can try:
    - check the SSH settings, see https://omniosce.org/info/sunssh
    - uninstall and reinstall SSH

    Maybe the fastest way is to
    install 151028 from scratch and import the data pool.
    You must save/restore /var/web-gui/_log and recreate the users with the same uid to keep napp-it and basic settings
    (napp-it Pro: run a backup job that saves this to the data pool, then restore via Users > Restore).
     
  17. _Gea

    Last edited: Dec 3, 2018
  18. _Gea

  19. Nemesis_001

    Nemesis_001 n00b

    Messages:
    53
    Joined:
    Apr 3, 2011
    Hi,

    FYI to those upgrading from R151026 to R151028, I've encountered the following errors:

    tty and PAM access error, solved by upgrading napp-it to 18.12 dev from 18.06 pro

    SSH was down due to /etc/ssh/sshd_config: line 103: Bad configuration option: MaxAuthTriesLog.
    Replaced /etc/ssh/sshd_config with /etc/ssh/sshd_config.new
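
    A hedged sketch for locating such leftovers before replacing the whole file: the helper below (find_sshd_option is a hypothetical name) prints the line numbers of uncommented uses of a given keyword. MaxAuthTriesLog is taken from the error above; other keywords that OpenSSH rejects would have to be looked up. OpenSSH's sshd also has a test mode (-t, with -f to name the file) that validates a configuration without starting the daemon.

```shell
# Print "line-number:line" for each uncommented use of a keyword in an
# sshd_config-style file. Keywords are matched case-insensitively.
find_sshd_option() {
    # $1 = config file, $2 = keyword
    grep -inE "^[[:space:]]*$2([[:space:]]|=|\$)" "$1"
}

# Intended use (not run here):
#   find_sshd_option /etc/ssh/sshd_config MaxAuthTriesLog
#   sshd -t -f /etc/ssh/sshd_config   # may need the full path to sshd
```

    Commenting out only the reported lines keeps local customizations that a wholesale replacement with sshd_config.new would drop.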
     
  20. _Gea

    OmniOS 151028 comes with a newer Perl and with OpenSSH instead of the former Sun SSH.
    Prior to updating OmniOS, you must update to the newest napp-it free/pro/dev release that supports the new OmniOS.

    For SSH, you will find an option to switch the defaults to OpenSSH in menu Services.

    A remaining problem is TLS alert mails. They stop working after an update.
    This is unsolved. The workaround is a new setup of 151028 instead of an update.
     
  21. Nemesis_001

    I was already on 18.06 pro, which was the latest pro. I still encountered all the errors until I moved to the latest dev.

    The TLS is no big deal.
    I am using the restricted gmail smtp server to send the message, seems to be working well.

    By the way, I've been meaning to ask, what is the napp-it pro agents acceleration?
     
  22. _Gea

    Without acceleration, napp-it requests OS information like snaps or disks whenever you call a menu, e.g. Disks. With a few disks there is only a short delay. With 20 disks you may wait 10s until all disks are detected with their state. On a petabyte system with 60 or 90 disks this may take a minute, on every action that manipulates an item under Disks or Pools.

    Acceleration (acc) allows napp-it to work in an async way. Disk values are continuously requested in the background by a software agent (a service or daemon), so a disk or pool menu can be processed immediately.
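
    The general idea can be illustrated with a small shell sketch. This is not napp-it's implementation: the function names, the cache path and the polling interval are all assumptions made for illustration. A background loop runs the slow query and stores its output, and the foreground reader only ever reads the stored copy.

```shell
# Hypothetical cache location; override by setting CACHE_FILE beforehand.
CACHE_FILE="${CACHE_FILE:-/tmp/disk-state.cache}"

# Run the (slow) command given as arguments and replace the cache with its
# output. Writing to a temp file first keeps readers from seeing partial data.
refresh_cache() {
    "$@" > "$CACHE_FILE.tmp" && mv "$CACHE_FILE.tmp" "$CACHE_FILE"
}

# Background agent: refresh every $1 seconds using the command that follows.
start_agent() {
    interval=$1
    shift
    while :; do
        refresh_cache "$@"
        sleep "$interval"
    done &
}

# What a menu would call: returns at once with the last known values.
read_cache() {
    cat "$CACHE_FILE"
}

# Example (not run here): start_agent 60 zpool status; read_cache
```

    The trade-off is that the displayed values can be up to one interval old, which matters little with a few disks and a lot with 60 or 90.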
     
  23. Nemesis_001

    Got it, thanks.
    That's why with 4 8TB disks there was no noticeable improvement :)
     
  24. cw823

    cw823 n00b

    Messages:
    34
    Joined:
    Mar 15, 2006
    I have two arrays that I have had for quite some time, one of which has been on both FreeNAS and OmniOS. I'm moving away from a ZFS All-In-One to bare metal, but cannot import either of my pools due to the attached error. I should be able to get OmniOS back to whatever software level supports this feature, but have not been able to figure out how to update to that level as of yet. Obviously I'm missing something simple, but can't figure out exactly what.

    EDIT: Figured it out. I had to use the napp-it OVF and deploy that way, and then my pools were recognized.
     

    Attached Files:

    Last edited: Dec 17, 2018
  25. _Gea

    If a ZFS feature is in use (active) on a pool, you can only import that pool on an operating system that supports this feature (it does not matter whether it is the same OS or cross-platform) or, for some features, in readonly mode. If you need compatibility with older OS releases, you should avoid features. When you create a pool, e.g. in napp-it, you can disable all features for compatibility reasons.

    In your case you should try OmniOS 151028 (the newest OmniOS).
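
    To find out which features stand in the way of an import on an older system, the feature@ properties of the pool can be inspected. According to the zpool-features documentation, a feature that is merely enabled does not block an import elsewhere, while an active one does unless it is read-only compatible. The sketch below filters zpool get all output for active features; active_features is a hypothetical helper, and the pool and disk names in the comments are assumptions.

```shell
# Reads `zpool get all <pool>` output on stdin (columns: NAME PROPERTY VALUE
# SOURCE) and prints the names of all features whose value is "active".
active_features() {
    while read -r _pool prop value _source; do
        case "$prop" in
            feature@*)
                if [ "$value" = "active" ]; then
                    echo "${prop#feature@}"
                fi
                ;;
        esac
    done
}

# Intended use (not run here, names are hypothetical):
#   zpool get all tank | active_features
#   zpool import -o readonly=on tank               # read-only import attempt
#   zpool create -d tank mirror c1t0d0 c1t1d0      # new pool, all features disabled
```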
     
  26. _Gea

  27. gcooper

    gcooper n00b

    Messages:
    7
    Joined:
    Feb 22, 2014
    Hello _Gea,

    I'm about to update an OmniOS/Napp-IT server and I'm curious as to your current thoughts on bonding 10GbE ports for throughput and resiliency.

    Perhaps unwisely, I have multiple ixgbe ports, but only one of them really gets used.

    I currently have 6 (XenServer) virtualization hosts, soon to be 8, using NFS for VHD storage.

    Would it be a good idea to bond a spare 10GbE (Intel X540-T2) port, or leave it fallow?

    On an older OmniOS/Napp-IT server that was used for backup, I successfully bonded 4x (Intel) 1GbE ports.

    In the past, you advised against bonding, so I didn't. Has your advice changed over the last few years?

    Thanks in advance,

    G
     
  28. _Gea

    Performance may improve but you add complexity that may affect stability....


    btw
    Thanks and a Happy New Year!
     
    Last edited: Jan 2, 2019
  29. N Bates

    N Bates [H]Lite

    Messages:
    102
    Joined:
    Jul 15, 2017
    Trust me to do things the wrong way around. I have updated OmniOS CE from r151026 to r151028 and, yes, you guessed it, napp-it is giving errors and I cannot log on. I have seen that I need to delete a line somewhere and create a new password, or reinstall OmniOS r151028 with napp-it, or go back to the previous OmniOS BE and update napp-it.

    Has anyone got a step-by-step for the easiest route without having to create a new password?
     
  30. _Gea

    Sudo/permission problems in napp-it
    If you first update napp-it and then OmniOS, napp-it takes care of the changes in OmniOS 151028.

    If you missed updating napp-it first:
    - go back to the former BE, update napp-it and then OmniOS, or
    - delete the napp-it line in /etc/user_attr and set a napp-it password via passwd napp-it

    https://napp-it.org/downloads/omnios-update.html
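
    The second option can be sketched as follows, assuming each entry in /etc/user_attr starts with the user name followed by a colon. strip_user_attr_entry is a hypothetical helper, not part of napp-it; it only prints the file without the matching entry, so the original can be backed up first.

```shell
# Print a user_attr-style file without the entry for one user.
# Comment lines and the entries of other users are kept unchanged.
strip_user_attr_entry() {
    # $1 = user name, $2 = input file
    grep -v "^$1:" "$2"
}

# Intended use at the console as root (not run here):
#   cp -p /etc/user_attr /etc/user_attr.bak
#   strip_user_attr_entry napp-it /etc/user_attr.bak > /etc/user_attr
#   passwd napp-it      # enter the new password twice
```

    Working from the backup copy means a mistake can be undone by copying user_attr.bak back.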
     
  31. N Bates

    Thank you _Gea. I have been looking around and there is not much on how to go back to a former BE. I would guess this involves booting the server and hitting the number corresponding to this option on OmniOS. I don't want to just go ahead and try it, because I have twice lost data when updating to a new OS or for some other update-related reason (yes, I am a ZFS/OmniOS newb, although I have been using ZFS since Nexenta had a supported/updated free version).

    Thanks for confirmation.
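
    For reference, a former BE can also be selected from a running system instead of the boot menu. This is a sketch only, assuming the illumos beadm tool. The BE name omnios-r151026 is hypothetical (beadm list shows the real names), and the DRY_RUN switch and both function names are additions for illustration so the commands can be printed before they are run.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mark a boot environment as the one to boot next, then reboot into it.
activate_be_and_reboot() {
    # $1 = BE name as shown by `beadm list`
    run beadm activate "$1" && run init 6
}

# Print the commands first, then run them for real (not run here):
#   DRY_RUN=1; activate_be_and_reboot omnios-r151026
#   DRY_RUN=0; activate_be_and_reboot omnios-r151026
```

    Activating a BE only changes which root file system is booted; data pools are separate from the boot environments.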
     
  32. WishYou

    WishYou n00b

    Messages:
    6
    Joined:
    Oct 19, 2016
    Hi!

    After a disk crash, replacement and then a bit of fiddling, I see a weird cache problem with my disks. The capacity and type of one of my disks show the values of a disk that was removed from the neighboring slot.
    Is there any way to fix this without a reboot? See below; the SMART disk table shows the correct values. The disk is a 3TB WD Red.

    Disk view:
    upload_2019-1-14_18-56-21.png

    Smart view
    upload_2019-1-14_18-58-18.png

    Pool:
    upload_2019-1-14_18-59-35.png

    Regards
    Wish
     
  33. _Gea

    If you use disk detection based on controller number + controller port (c3t0), you do not have hotplug at all, or not by default (you can enable disk hotplug on OmniOS, but it does not work on every board). This means you need a reboot to re-read all disks correctly.

    Another problem may be disk caching in napp-it.
    You can clear the cache in menu Disks > Delete Disk buffer.

    A third problem may be a minor bug in the last release of napp-it, where you must reload menu Disks after a modification to show new values, https://napp-it.org/downloads/changelog_en.html
     
    Last edited: Jan 14, 2019
  34. N Bates


    Hi _Gea, I have deleted the napp-it line in /etc/user_attr. How do I set a napp-it password via passwd napp-it?

    Thanks
     
  35. _Gea

    At console (as root), enter:
    passwd napp-it

    Enter any password twice
     
  36. N Bates

  37. N Bates

    How do I fix the error below in napp-it, or is this on the OmniOS side (Sun SSH vs OpenSSH)?

    Tty.c: loadable library and perl binaries are mismatched (got handshake key 9e40080, needed 9a80080)
    (this appeared after updating from OmniOS r151026 to r151028)
     
  38. _Gea

    There are a lot of new things in 151028 that napp-it must take care of (like this error, which is related to the newer Perl release).
    So you must first update napp-it to the newest release (18.12), then OmniOS.

    If napp-it is running, update now (About > Update);
    otherwise boot the last BE and update napp-it, then OmniOS.

    btw
    If you need TLS alert mail, you must do a clean install of 151028.
    TLS is not working after an update and is not installable due to incompatibilities with the former/new SSH.

    To keep settings:
    After setup, restore /var/web-gui/_log/* for napp-it settings, restore users with the same uid/gid and import the pool
    http://www.napp-it.org/doc/downloads/setup_napp-it_os.pdf
     
  39. WishYou

    Hi!

    I've got a problem with jobs on a fresh install... email-jobs are not run, I've created several but no emails are being sent.
    The test email works okay. As you can see auto is enabled but the jobs are still 'new' after a whole day.
    What am I missing?

    upload_2019-2-10_18-41-11.png

    Regards,
    Wish
     
  40. _Gea

    Are you using standard or TLS email?

    If you use Jobs > Report, you can set the email method per job (standard port 25 or TLS).
    For the basic alert and status mails, you must set the email method globally (menu Jobs > TLS email > enable/disable TLS), where the current default is TLS (e.g. Gmail).
     