OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Discussion in 'SSDs & Data Storage' started by _Gea, Dec 30, 2010.

  1. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    13,056
    Joined:
    Jun 13, 2003
    Thanks for the detailed response!
     
  2. WishYou

    WishYou n00b

    Messages:
    7
    Joined:
    Oct 19, 2016
    Hi _Gea !

    I've found and fixed a bug in your zpool capacity calculations.
    There is a rounding error that may or may not hit hard, depending on the layout and usage of the pools.
    In some locales 'zfs list' outputs numbers with a decimal _comma_, but Perl requires a decimal _point_ to handle the calculations correctly.

    I've added a quick fix to zfslib_val2kb:
    Code:
    ###############
     sub zfslib_val2kb  {  #hide:
    ###############
    
          my $w1=$_[0];
          $w1=~s/,/./;                              # added fix: replace the locale decimal comma with a point
          if ($w1=~/K/) { $w1=~s/K//;  }
          if ($w1=~/M/) { $w1=~s/M//; $w1=$w1*1000; }
          if ($w1=~/G/) { $w1=~s/G//; $w1=$w1*1000000; }
          if ($w1=~/T/) { $w1=~s/T//; $w1=$w1*1000000000; }
          if ($w1=~/P/) { $w1=~s/P//; $w1=$w1*1000000000000; }
          return ($w1);
     }
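
    To illustrate why the comma bites (a minimal sketch, the value is just an example): Perl's string-to-number conversion stops at the first non-numeric character, so "17,3" is silently treated as 17 before the multiplication.
    Code:
    my $w1 = "17,3";
    print $w1*1000000, "\n";          # prints 17000000 ("17,3" numifies to 17)
    $w1 =~ s/,/./;                    # the fix: comma -> decimal point
    print $w1*1000000, "\n";          # prints 17300000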
    

    Before:
    Code:
    NAME    USED    AVAIL   MOUNTPOINT      %
    rpool   23,2G   15,3G   /rpool  39%
    storage 5,62T   5,61T   /storage        50%
    tank    17,3T   3,95T   /tank   15%!
    vmstore 304G    107G    /vmstore        26%
    
    After, where I've also added one decimal place because it looks nicer... :)
    Code:
    NAME    USED    AVAIL   MOUNTPOINT      %
    rpool   23,2G   15,3G   /rpool  39.7%
    storage 5,62T   5,61T   /storage        50.0%
    tank    17,3T   3,95T   /tank   18.6%
    vmstore 304G    107G    /vmstore        26.0%
    

    Regards,
    Wish
     
    mikeo and _Gea like this.
  3. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    Thanks a lot !
     
  4. dedobot

    dedobot [H]Lite

    Messages:
    96
    Joined:
    Jun 19, 2012
    SMB file sharing quick tip: if you can afford it, disable SMB signing at the Windows client too, not only at the SMB server. Do it via domain policy or local GPO, depending on the situation. Leave the SMB1 restrictions untouched. Same for macOS.
     
  5. CopyRunStart

    CopyRunStart Limp Gawd

    Messages:
    153
    Joined:
    Apr 3, 2014
    Hey Gea, not sure if it is just an issue with my system but resetting ACL permissions doesn't seem to work in the Napp-it GUI on 18.12 with Solaris 11.4.

    I'm doing it manually via:

    chmod A- Folder/Name
    chmod -fvR A=everyone@:modify_set:file_inherit/dir_inherit:allow folder/name/path/

    Does this look right? Do I also need to give root full_set? How do I do both in one command?
     
  6. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    I will check that with Solaris 11.4

    You can set ACLs via A= (see https://docs.oracle.com/cd/E18752_01/html/819-5461/gbace.html), but another way is to use Windows while SMB-connected as root. You should only avoid deny rules from Windows, as Windows processes deny rules first whereas Solarish respects the order of the rules (napp-it can set the order of rules).

    If you want to have two commands on one line, use cmd1; cmd2 or cmd1 && cmd2

    update
    I have tried the ACL reset in napp-it 18.12 and 19.06 and it worked in both cases.
     
    Last edited: May 31, 2019
  7. CopyRunStart

    CopyRunStart Limp Gawd

    Messages:
    153
    Joined:
    Apr 3, 2014
    Thanks Gea. It must be something wrong with my system. I can't remember exactly what it said, but it was stuck at something about the guest user.

    I have set it with
    Code:
    chmod -fvR A=everyone@:modify_set:file_inherit/dir_inherit:allow folder/name/path/
    but I'm confused about how I can also add another ACL for root through the Solaris command line. When I tried
    Code:
    chmod -fvR A=user:root:full_set:file_inherit/dir_inherit:allow folder/name/path/
    it deleted the ACL for everyone@.
     
    Last edited: May 31, 2019
  8. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    A= sets an ACL and deletes all former settings.
    If you want to add an ACL additionally, use A+
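
    So for your case that would be something like (just a sketch, the path is a placeholder):
    chmod -fvR A=everyone@:modify_set:file_inherit/dir_inherit:allow /pool/folder && chmod -fvR A+user:root:full_set:file_inherit/dir_inherit:allow /pool/folder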
     
    IdiotInCharge likes this.
  9. CopyRunStart

    CopyRunStart Limp Gawd

    Messages:
    153
    Joined:
    Apr 3, 2014
  10. sjalloq

    sjalloq n00b

    Messages:
    54
    Joined:
    Jun 20, 2011
    Hi there,

    looking for some system advice. I'm looking at setting up some shared storage for a mini cluster and wondered if I can do that with Napp-IT. We're going to have 3-4 compute nodes running a mixture of EDA tools and numerical simulators and I'm trying to understand how to implement shared storage across these. Is NFS going to be fast enough or would we have to look at some other technology? Or perhaps we just run NAIO type servers and have local work areas on each machine.

    Anyone got any experience with this type of setup?

    Thanks.
     
  11. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    It depends on what you expect from "fast enough".

    NFS (or SMB) is the usual way to share storage between clients. The fastest option is a dedicated storage server, where you need to care about a fast enough network, latency, performance improvements from a RAM cache, and the question of whether you need secure write behaviour (can you accept the loss of writes that are still in the RAM cache on a crash?).

    If you virtualise storage and nodes, you must check whether the server can satisfy the combined needs of the storage and the compute nodes. The method of shared access is mostly also NFS/SMB. You may use local ZFS storage, ex via LX zones on OmniOS, but the usual way for shared storage access, especially with full virtualisation, is also NFS/SMB.

    For very high performance needs, a cluster filesystem or dedicated storage via FC/iSCSI is an option, optionally with a local fast cache device that syncs or moves data asynchronously to a common storage. But as you hint at a napp-in-one setup, I suppose a dedicated or virtualised NFS filer is ok, with virtualised or dedicated nodes connected via a fast network (up to 10Gb/s is achievable with a quite common setup).
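
    As an aside, the sync write behaviour can be set per ZFS filesystem (a sketch, filesystem names are placeholders):
    zfs set sync=always pool/share       # secure: every write is committed to the ZIL/Slog before it is acknowledged
    zfs set sync=disabled pool/scratch   # fast but unsafe: a few seconds of writes can be lost on a crash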
     
  12. sjalloq

    sjalloq n00b

    Messages:
    54
    Joined:
    Jun 20, 2011
    Thanks Gea,

    can anyone point me at some reading on how to benchmark storage and/or applications? I guess a good starting point would be to benchmark our current machine and workflows. We're running on CentOS if that makes a difference.
     
  13. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
  14. TCM2

    TCM2 Gawd

    Messages:
    572
    Joined:
    Oct 17, 2013
    I see that nothing much has changed in the amateurish way that this software is developed.

    zfs list explicitly has -p and -H to output parsable data, but no, let's go and parse _localized_(!) output for humans instead.
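
    A minimal sketch of what using that would look like (the column selection is just an example):
    Code:
    # -p prints exact byte values (no locale formatting), -H drops the header and tab-separates columns
    for (`zfs list -pH -o name,used,avail`) {
        chomp;
        my ($name, $used, $avail) = split /\t/;
        printf "%s\t%.1f%%\n", $name, 100 * $avail / ($used + $avail);
    }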
     
    Last edited: Jun 17, 2019
  15. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
  16. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    13,056
    Joined:
    Jun 13, 2003
    _Gea, perhaps you can answer this: are Aquantia NICs supported in Illumos yet?

    I'd like to try it on metal sometime, and that's my current roadblock.
     
  17. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    IdiotInCharge likes this.
  18. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    Current OpenIndiana 2019.05 supports native ZFS encryption after a
    pkg upgrade

    Then update your pool to support encryption
    zpool upgrade pool

    Then create a file with the key (ex 31 x 1; echo appends a newline, which gives the 32 bytes that keyformat=raw expects)
    echo 1111111111111111111111111111111 > /key.txt

    Then create an encrypted filesystem ex enc on your "pool" based on that key
    zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///key.txt pool/enc
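
    To lock and unlock it later (a minimal sketch, assuming the same key file):
    zfs unmount pool/enc; zfs unload-key pool/enc
    zfs load-key pool/enc; zfs mount pool/enc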

    Limitations:
    Do not encrypt rpool (bootloader does not support this at the moment)
    Key management options are still limited

    Documentation on Open-ZFS encryption is still quite limited
    (Oracle Solaris is better documented, but their implementation is a different, more feature-rich one).
    What I found is https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/


    update
    current napp-it 19.dev supports ZFS encryption on Illumos based on a password prompt.
     
    Last edited: Jun 27, 2019
    IdiotInCharge likes this.
  19. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    Question regarding the addition of a vdev to my config:
    I currently have a Norco 4220 / SM x9scm-f and two m1015 HBAs. I have two raidz2 vdevs each with six 2TB drives and have two 2TB spares.
    I was thinking of adding a new raidz2 vdev with six 3TB (512B) drives, replacing the current 2TB spares with a single 3TB spare, and adding one more m1015 HBA to support the additional drives.

    Is this addition technically sound?
    2019-07-02_10-11-12.png
     
  20. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    You can of course remove the hotspares and create a new vdev but

    All modern disks are 4k disks. If you force 512B/ashift=9 you will see a performance degradation
    and you are not able to replace a faulted disk with a 4k one.

    If you want to create such a pool, I would create a new pool from a vdev of the 3TB disks (ashift=12).
    Then copy the data over and add the 2TB vdevs, where you also force ashift=12 (ashift is a vdev property).
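
    Afterwards you can verify the ashift that is actually in use per vdev (a quick check, pool name is a placeholder):
    zdb -C pool | grep ashift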

    Your pool will have 28TB usable from 18 disks and > 100W power need.
    If you intend to buy the disks new, I would probably avoid that many small disks and create
    a whole new pool from larger disks (6TB, or probably 8TB) and a single raid-Z2.
     
  21. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    gea, thanks for your input. I 'wanted' to build an entirely new storage device, but funds are tight right now. My thought was to purchase 3TB drives (HUA723030ALA640) that are 512B to avoid the performance degradation. This would provide me with approximately an additional 12TB until I am able to build a new system.
     
  22. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    A performance degradation occurs with 4k disks (512e are also 4k) and ashift=9.
    There is no problem with real 512B disks and ashift=12, so try to create a new pool with all vdevs at ashift=12.

    Nowadays I would not buy real 512B disks but 512e (or 4k) ones, and I would not buy small 3TB disks.
     
  23. mikeo

    mikeo Gawd

    Messages:
    623
    Joined:
    May 17, 2006
  24. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    Due to the $$ at this time, and needing additional storage capacity with limited chassis space, I pulled the trigger on (7) 3TB 512B drives, maxing out the slots in my 4220.

    At this point what would be the benefit of creating a new pool with ashift=12, other than being able to replace the drives (one by one) at a later date?
    Wouldn't it be better to just have 3 vdevs in the same pool and build a new unit at a later time?
     
  25. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    13,056
    Joined:
    Jun 13, 2003
    I haven't, but I do want to say that upon considering it, the question is hard to answer. With spinners, the best advice was to do pools of mirrors, especially as drive capacities increase while drive speeds largely don't.

    A 4TB SSD flips the script a bit. Relative to spinners that's still small, rebuild times are going to approach drive size divided by drive speed due to lower access times, and a second failure during rebuild becomes a much smaller worry, in theory.

    The problem with the theory is that SSDs are far less predictable than spinners. The Micron enterprise drive above is a notable exception, as you should be able to take their numbers to the bank; so applying the theory, RAIDZ1 might even be overkill in the sense that the drives simply may not fail before they are operationally obsolete, which means that RAIDZ1 might be the right kind of overkill.
     
    mikeo likes this.
  26. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    OmniOS bloody 151031 now supports native ZFS encryption

    napp-it 19.dev from today (jul 04) supports encryption
    in menu ZFS filesystem (create, lock, unlock)
     
    IdiotInCharge likes this.
  27. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    I did a fresh installation of OmniOS r151030 after a successful upgrade to ESXi 6.7u2.
    After installing napp-it, I am unable to import my existing ZFS pool. What am I doing wrong?

    When I shut down the new VM and go back to my old VM running r151018, I can import the existing zfs pool.

    upload_2019-7-8_13-36-46.png
     
    Last edited: Jul 8, 2019
  28. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    Are the disks detected?
    Have you enabled pass-through in 018 and not in 030?
     
    IdiotInCharge likes this.
  29. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    OMG I feel like a total noob...it has been so long, that is exactly what I forgot to do!!
     
  30. TigerLord

    TigerLord [H]ard|Gawd

    Messages:
    1,085
    Joined:
    Mar 11, 2007
    I'm running Napp-it 18.12u on openindiana 5.11 Hipster

    I'm running an SMB share that I can connect to fine from a Windows environment.

    I am trying to mount the same SMB share on my Android TV box, specifically an Nvidia Shield.

    It's not working. The Shield reports no error, it just brings me back to the network menu and nothing is mounted or added.

    It could be a Shield problem, and I'm investigating that, but I was wondering if there is any known Samba issue between napp-it and Android that would prevent me from successfully mounting a share on my Shield?

    Thanks in advance.
     
  31. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    Are you on the newest OI?
    If not, do a pkg upgrade (you can always go back to the last BE).
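
    Going back is just (a sketch; the BE name is whatever beadm list shows):
    beadm list
    beadm activate <previous-BE>
    reboot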

    I know that there are some Android clients that do not work and some that do, but I do not use any.

    Alternative if possible: NFS3
     
  32. TigerLord

    TigerLord [H]ard|Gawd

    Messages:
    1,085
    Joined:
    Mar 11, 2007
    After further investigation it seems nVidia is aware of the issues surrounding Samba. They've been releasing fixes for the past year, but they still need to release more. As of April 2019 they were still working on an SMBv3 implementation, as the Shield is stuck on SMBv1 for now.

    NFS is not natively supported by the Shield's OS. Kodi can mount NFS shares fine, but it defeats the purpose of what I'm trying to accomplish, which is the Shield running Plex server.

    Funnily, the Shield runs on a Linux kernel but doesn't support a Linux protocol like NFS, only Windows'. Lol, makes total sense.
     
    IdiotInCharge likes this.
  33. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    13,056
    Joined:
    Jun 13, 2003
    For the use case, it makes sense...

    But it's still funny, and I'm still frustrated because even Microsoft has abandoned SMBv1. I've yet to get my Shield hooked up to my CentOS 7 server (running ZFS), but of course, Plex works great. I just use the CentOS box as the Plex server.

    Do they not have a spin that would work on OpenSolaris, or could you not, say, toss up an Ubuntu etc. VM for that purpose?
     
  34. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    I would like some direction on the next steps to fix my pool. Prior to the screenshot below, I did not have any pool errors. I had a disk (A785, see below) that was showing hard errors, so I initiated a replace with the spare. The swap did not complete and the pool went into a degraded status. A zpool clear cleared the errors, and I tried the replacement again with the same result; the pool remains in a degraded status even after a scrub. I don't know what to do next.
    What I want to accomplish is:
    1. Fix the health of the pool.
    2. Replace the disk with the spare.

    I also noticed that the zpool version is showing a value of " - "
    upload_2019-7-17_10-0-32.png

    upload_2019-7-17_9-58-18.png
     
  35. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    So basically you want to remove the hot spare from the pool and use it to replace the bad disk.

    Use Pool > Clear to clear the "too many errors"

    To remove the hot spare from the pool for good, use menu Disk > Remove.
    Then replace: use menu Disk > Replace.

    Are the problem disks on the same power, backplane or HBA?
    Maybe there is a reason

    After you have replaced the disk with the hard errors, do a low-level/intensive test on it, ex with WD Data Lifeguard.

    pool version ="-" is ok.
    This is a synonym to Open-ZFS pool v5000 with feature flags where features determine differences , no longer a pool version. Internal v5000 is mainly to be sure that Oracle will never come close to this with their genuine ZFS.
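
    You can check this yourself (pool name is a placeholder):
    zpool get version tank               # shows "-" on a feature flag pool
    zpool get all tank | grep feature@   # lists the individual feature flags and their state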
     
  36. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    "bad" disk = c0t5000CCA36AC2A785d0
    spare = c0t5000CCA36AD05EA2d0


    I want to remove the bad disk and replace it with the spare. As I mentioned earlier, I ran zpool replace c0t5000CCA36AC2A785d0 c0t5000CCA36AD05EA2d0 which 'failed'.
    So if I understand what you're saying, I should remove the spare (c0t5000CCA36AD05EA2d0) from the pool first and then replace c0t5000CCA36AC2A785d0 with c0t5000CCA36AD05EA2d0?

    I have 3 m1015 HBAs for the backplane and there does not appear to be a correlation between the failing vdev and a specific port on the backplane or HBA. However, I will power this box down and check the connections.
     
  37. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    What was the problem with a replace faulted <-> spare?

    Usually a spare can replace a faulted disk but remains a spare.
    The idea behind this is that you then replace the faulted disk (replace spare > new disk) and the spare remains a spare.

    If you want to remove the spare property, first remove the spare, then replace.
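
    On the command line that would be something like (pool name is a placeholder):
    zpool remove tank c0t5000CCA36AD05EA2d0                           # remove the disk from its spare role
    zpool replace tank c0t5000CCA36AC2A785d0 c0t5000CCA36AD05EA2d0    # then replace the bad disk with it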
     
  38. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    The drive wasn't faulted, it just had hard errors (S:0 H:136 T:0), so I wanted to replace it with the spare. The swap never completed and the vdev went into a degraded state ("too many errors"). I will attempt to clear the errors and try again.
     
  39. shanester

    shanester [H]Lite

    Messages:
    68
    Joined:
    Mar 1, 2011
    The zpool clear completed but did not clear the errors. I am stuck and open to further suggestions.
    upload_2019-7-18_9-8-58.png
     
  40. _Gea

    _Gea 2[H]4U

    Messages:
    3,895
    Joined:
    Dec 5, 2010
    If the problem had been caused by a single incident in the past and the hardware were now ok, a scrub and a pool clear would be enough.

    If you cannot identify the problem with the help of the system and fault logs, and if there is no common part like the same HBA or power cabling for the 6 disks showing problems, I would look at the disk with the hard errors.

    Maybe this disk affects the others negatively, ex by blocking something. I would offline or even physically remove this disk and retry a disk replace, or a scrub and clear.
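
    Something like (pool name is a placeholder):
    zpool offline tank c0t5000CCA36AC2A785d0     # take the suspect disk out of service
    zpool scrub tank
    zpool clear tank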