OpenSolaris derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

I have deleted and recreated with 0.7. I did it again just now on 0.7l, and I'm still getting the alert messages.

hmmm..

You may look at the jobscript

Code:
  # get needed infos in $text
  #zpool status
  $text=&exe("zpool status");

  # check disk error
  if ($text=~/DEGRADED|UNAVAIL/s) {
       $r.="1"; $sub.=" -DISK ERR-";
  } else {
       $text="-disk errors: none\n";
       if (-f $err1) {   unlink ($err1);  }                # delete disk old error
  }

  # check avail < 15%  ()
  $pools=&exe("zpool list");
  $text.="\n------------------------------\nzpool list\n$pools\n\nPool capacity from zfs list\n";

  $t=&zfslib_pool2used;                  # list of pools + % used based on zfs list
  $text.="$t";

  if ($t=~/\%\!/) {
     $sub.=" -Low CAP ALERT 15%-";
     $r.="1";
  } else {
       if (-f $err2) {   unlink ($err2);  }                # delete disk cap error
  }


  # no reason
  ##########################################################################################
  if (!($r=~/1/)) { return (); }                             # no reason

Comment out in the cap-check area
Code:
from:
   $sub.=" -Low CAP ALERT 15%-";
   $r.="1";

to:
    $sub.=" -Low CAP ALERT 15%-";
#   $r.="1";

to be sure it's not a problem with this part.
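You can also run roughly the same disk-error match from a shell against your live pool status, just to see what it would trigger on (a quick check, nothing napp-it specific):
Code:
zpool status | egrep 'DEGRADED|UNAVAIL'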
 
From your post, I have figured out it is indeed a bug in the way your job script parses the zpool status. The message about the pool version being out of date contains the word "unavailable" (some features are unavailable). The regex needs to match all instances of unavail that aren't on the same line as "features", or maybe just uppercase UNAVAIL and not lowercase (my regex-fu is weak, sadly, or I'd suggest a patch).
 
There must be another reason.

Code:
if ($text=~/DEGRADED|UNAVAIL/s)

matches uppercase only.
The /s modifier only makes . match newlines in the string (treat it as a single line); it does not affect case sensitivity.
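A quick one-liner confirms the case sensitivity (a standalone check, not part of the jobscript):
Code:
# lowercase "unavailable" does not match, so the version notice alone cannot trigger the alert
perl -e 'print "match\n" if "some features are unavailable" =~ /DEGRADED|UNAVAIL/s'
# uppercase state lines do match
perl -e 'print "match\n" if "state: UNAVAIL" =~ /DEGRADED|UNAVAIL/s'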
 
Data recovery fun on multiple raidz1 pool. The OS is oi151.

Code:
root@nozomi:~# zpool import media
cannot import 'media': invalid vdev configuration

How the shit hit the fan:
Server got flooded and 2 of the drives from the raidz1-2 vdev died (c4t1d0 and c4t2d0); the data was recovered from the broken HDDs by a recovery company and has now been copied to new drives.

Code:
root@host:~# zpool import
  pool: media
    id: 17147247216063177950
 state: UNAVAIL
status: The pool is formatted using an older on-disk version.
action: The pool cannot be imported due to damaged devices or data.
config:

        media       UNAVAIL  insufficient replicas
          raidz1-0  ONLINE
            c5t3d0  ONLINE
            c5t1d0  ONLINE
            c5t2d0  ONLINE
          raidz1-1  ONLINE
            c5t7d0  ONLINE
            c5t5d0  ONLINE
            c5t6d0  ONLINE
          raidz1-2  UNAVAIL  corrupted data
            c5t4d0  ONLINE
            c4t1d0  ONLINE
            c4t2d0  ONLINE

Also, when I ran
# zdb -l /dev/rdsk/c5t4d0s0

all of the paths were wrong. Is there any way to correct these? I think the paths are causing the problem.
 
Make sure you are referencing a drive, and not a slice of a drive (i.e. a partition):

# zdb -l /dev/rdsk/c5t4d0s0

Should be

# zdb -l /dev/rdsk/c5t4d0

Yes?

 
Right, the media pool is now imported, but it still shows the insufficient replicas and corrupted data status while all the disks are online and present.

zdb -C gives correct-looking output for the pools and the paths.

But the pool stays unavailable, and I can't find any good info on the net about how to bring it back either.
 
I am running Napp-It 0.7l on Solaris 11. I have created an NFS share on my datapool, which works fine. Today, I created a new pool and wanted to make another NFS export for my ESXi 5 machine.
When trying to connect from my ESXi host, I can connect fine to my old NFS share, but not to my new one. Every time I try, VMware tells me:

Call from "HostDatastoreSystem.CreateNasDatastore" for object "ha-datastoresystem" on ESXi "192.168.1.60" has failed.
NFS-Mount 192.168.1.224:/vmware/vm failed: The mount request was denied by the NFS-Server. Check if an export is available and if the client is allowed to mount it.

The message is translated from German, so it might be a little different on an English system.

I deleted and recreated the new NFS export (which always works successfully), to no avail. I checked the ZFS folder and noticed the following:
diskpool/data name=data path=/diskpool/data prot=smb
diskpool/vmware name=vmware path=/diskpool/vmware prot=smb
diskpool/vmware name=nfs-on path=/diskpool/vmware prot=nfs
vmware/vm name=vm path=/vmware/vm prot=smb

As you can see, for my new export "vm", there is no share with protocol "nfs" like there is for "vmware" (see the listing above)... By the way, I can connect to the SMB share from Windows with no problems and copy data to it!

Any idea how I can solve this? I would like to create a second NFS export so I can move my virtual machines there... any help appreciated. Thanks!

Cap'
 
Data recovery fun on multiple raidz1 pool. The OS is oi151.
...

It is hard to survive a real disaster without backups.
Maybe you can ask at http://echelog.matzon.dk/logs/browse/illumos/
(the illumos IRC channel on Freenode, e.g. via the Firefox ChatZilla add-on)
 
I deleted and recreated the new NFS export (which always works successfully), to no avail...

Unless you have an active NFS share, you cannot connect from ESXi.
Have you NFS-shared the dataset from the napp-it ZFS folder menu, or manually?
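You can also verify from a shell whether the export is actually active (dataset name taken from your listing above; the plain share command lists all active shares):
Code:
zfs get sharenfs vmware/vm
share | grep vm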
 
Right, the media pool is now imported, but it still shows the insufficient replicas and corrupted data status while all the disks are online and present.
...

How was the data recovered from the failed drives and transferred to the new c4t1d0 and c4t2d0?

From what you've posted, it looks like the hardware is now OK, but zfs doesn't like what's actually on the two new drives - hence it thinks the vdev "raidz1-2" is corrupted as it doesn't like the contents of two of the three drives which make up this vdev.
 
Hi _Gea,
Thanks for your reply.
I wanted to connect to both the old and the new NFS export at the same time, in order to move machines directly in VMware. I can successfully connect to the old export, but not to the new one. I created the NFS share using the napp-it ZFS folder menu. The sharenfs parameter is set to on. As far as I can interpret the message, I assume it can connect to the export but is then denied access. I gave root full access, just the same way as on the old NFS export.
Thanks for your support!
Cap'
 
You must either allow root access for the ESXi machine, or you need to set permissions to
everyone=modify. (You can use the napp-it ACL extension or do it from Windows.)
 
Hi _Gea,
I have already set this!

0 user:root rwxpdDaARWcCos full_set rd(acl,att,xatt) wr(acl,att,xatt,own) add(fi,sdir) del(yes,child) x, s file,dir allow delete
1 everyone@ rwxpdDaARWc--s modify_set rd(acl,att,xatt) wr(att,xatt) add(fi,sdir) del(yes,child) x, s file,dir allow delete

You said the export must be activated... is there any manual method to do that?

Thanks!
 
Hmmm, updated to 0.8 but I can't access powerconfig and I'm getting a red error screen:

(367 autocreate default action) write-error: /var/web-gui/data/napp-it/_my/zfsos/03_system and network/081_Power Mgmtill_/04_edit powerconf/action.pl

XAMPP has the same problem:

(367 autocreate default action) write-error: /var/web-gui/data/napp-it/_my/zfsos/02_services/15_XAMPP-nex/03_Xampp-Services/action.pl
 
It's a first preview, not yet fully tested.
(I'll fix that in the next preview 0.8b, tomorrow.)
 
...any other idea on my NFS export problem? I have set all the rights as you instructed, but still no luck...
Thanks!
 
Solaris 11 completely changed the way it handles multiple shares compared to other Solaris versions.
Each NFS/SMB share sets the basic property to on/off, but it also needs a unique share name to be defined.

This part was not handled correctly by napp-it (fixed in the next 0.8).
To fix it now, you need to modify the NFS share part of /var/web-gui/data/napp-it/zfsos/_lib/zfslib.pl
starting at line 1221, to set unique share names (the name of the dataset):

Code:
  #nfs-share
  ##########
  if ($in{'prop'} eq "sharenfs") {
        $in{'text'}=~s/[^a-zA-Z0-9\:\-\_\@,=\/\.]//g;
        $r="";
        if ($in{'text'} eq "") {  &mess('on or off?'); }
        if ($in{'text'} eq "off") {
             if ($sys{'auto_os'} eq "Solaris 11") {
                $t=$in{'zfs'}; $t=~s/.*\///; # dataset name
                $r=&exe("/usr/sbin/zfs set -c share=name=$t $in{'zfs'}");
             }
             $r=&exe("/usr/sbin/zfs set sharenfs=off $in{'zfs'}");

        } else {
             if ($sys{'auto_os'} eq "Solaris 11") {
                $t=$in{'zfs'}; $t=~s/.*\///; # dataset name
                $r=&exe("/usr/sbin/zfs set share=name=$t,path=/$in{'zfs'},prot=nfs $in{'zfs'}");
             }
             $r=&exe("/usr/sbin/zfs set sharenfs=$in{'text'} $in{'zfs'}");
        }
        if ($r ne "") {  &mess($r); }
  }
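For the dataset from the earlier post, the resulting commands would look roughly like this (a sketch, assuming the dataset is vmware/vm as in the listing above):
Code:
# set a unique share name derived from the dataset name, then enable NFS sharing
/usr/sbin/zfs set share=name=vm,path=/vmware/vm,prot=nfs vmware/vm
/usr/sbin/zfs set sharenfs=on vmware/vm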
 
Thanks for this explanation, but unfortunately, I'm still so much of a noob that I don't understand what exactly needs to be modified... can you please be more specific? Also, if I modify this manually, is there a risk that a future upgrade to newer versions will fail or behave oddly?
Thanks,
Cap'
 
If you open the file and go to line 1221, you will find nearly the same code (the only difference is that in the original file, the NFS share name is always the same).

It should be easy to replace the old code with the new one.
Keep the original file so you can go back in case of syntax errors.

On the next update, the file will be replaced anyway, so this is no problem.
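For example, a simple copy before editing (path as above; the backup name is just a suggestion):
Code:
cp /var/web-gui/data/napp-it/zfsos/_lib/zfslib.pl /var/web-gui/data/napp-it/zfsos/_lib/zfslib.pl.orig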
 
Anyone else starting scrubs in napp-it and finding that the pool status shows it running at 1/second (not even a bit/byte figure, no time until finish) for an hour or two before it really starts? The same thing happens with command-line scrubs, so it's not napp-it, but I thought there was more of a chance of someone noticing it here...
 
I have seen such behaviour together with hardware problems on a single disk (with delays of some minutes, not hours).
Check format, iostat -Enr, or the system statistics in napp-it to see whether a disk is misbehaving.
 
Hello,
I have 2 VMs, one OI and one Win2k8r2. Both are running fine, but I have one major issue: if I transfer a 20 GB file from the Windows VM to the OI VM via VMXNET3 10 Gb, it starts very fast, but after a few seconds the transfer speed drops and sometimes it loses the connection completely. VMware Tools are installed on both VMs, and VMCI is also enabled with no restrictions.

Here are some Host HW details:
CPU: Xeon E3 1225
RAM: 8 GB
Board: Asus P8 WS
HBA: IBM M1015 (IT mode)
HDD: the vSphere host and the datastore are on one 500 GB WD Caviar Black
HDDs for OI: 8x WD 2 TB

The OI VM runs a raidz2; the stats look okay to me, about 500 MB/s read and write. I know that I can't expect this kind of performance on an actual transfer, but since the transfer speed drops so fast I can't even complete some file transfers; small files are okay, but the majority are more than 5 - 10 GB.

I'm reading all kinds of forums, but it seems that I'm the only one who has this kind of problem.


My intention is/was to have the OI VM as pure storage and connect it via NFS/iSCSI to the Win2k8 server and let it handle the CIFS/SMB shares. Okay, I know napp-it can handle them, but I want to have it like that, and as I'm very ambitious with this kind of thing, I want to solve the problem. I love ZFS and I want to keep it, since we had some bad experience with a rebuild that took ages...

For a few days I have even been thinking about buying 2 InfiniBand cards, passing them through to each VM, and letting the traffic go through them, but I think that would be a waste; this must be possible.


If you guys need any additional information, please let me know :)

@Gea, I love your napp-it interface (the design not so much ;-) ), great work!


Thanks for your help
 
There are reports about problems with VMXNET3.
Try e1000 instead. With VMCI, it also gives you several Gbit/s.

The next problem may be sync writes with ESXi and NFS.
Disable sync-write on the affected datasets and compare the results.
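A sketch of how the sync setting can be toggled per dataset for such a test (dataset name is hypothetical):
Code:
zfs set sync=disabled tank/nfs_datastore    # for the test only
zfs set sync=standard tank/nfs_datastore    # revert afterwards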

@Gea, I love your napp-it interface (the design not so much ;-) ), great work!

Good "design" (not only gimmics like icons and show effects) including good usability
is as hard as writing good software. Currently functionality is my major concern on the way to a 1.0 release.
Design efforts can hinder testing new ideas, so i avoided all work into that and used a very minimal user interface.

Starting with napp-it 0.8 i implemented the base infrastructure for a more modern user interface based on jsquery and mbmenu.
I will currently not invest too much work into that but it should be a good base for everyone who wants to add his own user interface
(just for fun or to bundle napp-it with their appliances). You should try

screenshot see http://napp-it.org/doc/downloads/napp-it.pdf
 
Yes, it is hard. I'm an interface designer, so yes, I agree it's not that easy :)
Keep on creating cool SW, and once you think you've gotten the most out of it and have time to spend on being creative, just do it!

I will give the e1000 a try, but isn't it just 1 GbE?
 
The 1 Gbit limit is a limit of real network adapters on Ethernet.
In a virtualized environment, it's only limited by efficiency and CPU.
With the e1000 driver and VMCI, you can reach up to several Gbit/s
(not as much as the more efficient and modern VMXNET3).
 
Okay, I will give it a try. One more question: VMXNET3 for the OI VM, or also the e1000?

I have only heard about problems with OI and VMXNET3.
You may keep it with Windows (or compare with e1000 everywhere).
Test e1000 at least on OI (and, as a second test, disable sync on your NFS-shared ESXi datastores).
 
I tested now:

OI -> vmxnet3
Win -> vmxnet3

NFS sync disabled

Copying 10 GB from the OI NFS share to win2k8 local storage (SSD) starts at 300 MB/s and slows down to 25 MB/s.

Edit:
Win -> CPU usage extraordinarily high :-(
It turned out to be an unrelated Windows process.

The transfer is now stable at about 45 MB/s, still not enough :( I would be happy with 150 MB/s.

Edit 2:

Still high CPU usage...

I will try now with e1000.
 
Your pool is built from one RAID-Z2, which means your I/O is roughly equal to one disk.
Besides the e1000 option, you may check against a RAID-0 or mirrored config (if there is no valid data on the disks)
to find the bottleneck. I have also found that 8 GB RAM with large green pools is not enough for performance.
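For example, a throwaway test pool built as striped mirrors (device names are hypothetical; use -f only if the disks really hold no valid data):
Code:
zpool create -f testpool mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0
zpool status testpool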
 
But the benchmarks show me bigger numbers, so far I still think the problem is the Windows VM; normal SMB network transfers to my workstation run at a constant 105 MB/s!

[attached benchmark screenshot: statsCLC1Z.gif]


So far I blame the Windows NFS client -.-

EDIT:

Even if I try to copy files via CIFS/SMB, the CPU load is at 100%.

The Win machine has one CPU with 2 cores, same as the OI VM. OI runs smoothly, but the Win VM is going crazy -.-
Writes to the NFS share are fine... about 150 MB/s.
 
If I were building a machine with 10 Gbit expectations, I would use much more memory than 8 GB
and use 3 x mirrors for the best I/O rates. It's hard to reach really high numbers without the hardware.
(My high-performance pools are built from 4 x 3 x mirrored SSD pools and 32 GB+ RAM.)

And I would also expect NFS to be faster than SMB.
 
Last update for now, I need to sleep...
I tested iSCSI just for the sake of it, and it actually works pretty well; still some minor things to test, but I'll do that after work... NFS would be my favourite, but the CPU load is far too high. I might also check a Linux VM, maybe it's just a Windows problem.

Good night :)
 
Of the dozen or so ZFS boxes we run, I sat down tonight at another one to look at some performance issues my engineers reported.

I was greeted with the following:
Dual hex-core Xeon X5650 2.67 GHz
16 GB RAM
28x 1.5 TB
20x 1 TB
3x 256 GB SSD (mirror + spare)

The 1.5 TB drives are in 6-disk RAID-Z2 sets and the 1 TB drives are in 4-, 6-, and 10-disk RAID-Z2 sets. Needless to say, I have some significant changes to make. Time to tear it down and rebuild.
 
Okay, I got it working. It was the zpool setup: a single RAID-Z2 doesn't have enough power to handle it.
The 8 HDDs are now divided into 4 mirrors; performance is pretty nice, but now I want more ;-)

How much of a performance increase can I expect if I add a mirrored SSD cache?
 
For benchmarks: none (no data to cache).
If your re-read file is in the RAM ARC cache: none (as fast as before).
If your file is large: none (not cached).

If your file is in your L2ARC cache (an SSD read cache that extends the RAM cache):
slower than reading from the RAM cache but faster than reading from disk. It helps especially
with reading small files like webpages, or with disk browsing by a lot of users.
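If you do want to try an L2ARC, adding a cache device is a one-liner (pool and device names are hypothetical; note that cache devices are simply added rather than mirrored, since a failed cache device only costs performance):
Code:
zpool add tank cache c3t0d0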
 
Hi folks,

I am running a rather small "all-in-one" at home, where I spin down the disks, which in itself works great.

But some clients (mostly connecting via CIFS) have problems "seeing"/mounting the shares right away when the array is in a spun-down state.
The result is that clients will time out, and some of the embedded kind won't reconnect automatically.
I have set my controller to spin up all disks in 1 group to save time, but 80% of the time this is not fast enough.

What are my options to improve things?
The goal is to keep the spin-down feature and get rid of the connection timeout, of course.

- would adding a write- and/or read-cache SSD to the ZFS array help?
- what other options are there?

Many thanks in advance for your ideas & suggestions.

regards,
Hominidae
 