OpenSolaris derived ZFS NAS/ SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Thanks _Gea,

I didn't know about the Autostart settings, I'll have to check those out.

The problem is that if the OmniOS VM has more than one vmxnet3s NIC, setting the 9000 MTU in the vmxnet3s kernel driver will by default (I think) raise both NIC interfaces to 9000.

You can still lower the second (management) interface to 1500 MTU, but I think you have to change it after setting the 9000 MTU in the kernel driver:

## Set management interface to persistently stay at 1500 MTU
## Edit for your environment (maybe you want vmxnet3s1 as your management interface)
Code:
# ipadm set-ifprop -p mtu=1500 -m ipv4 vmxnet3s0

You may be able to set the management interface as above before the 9000 MTU kernel driver edit; maybe it only auto-adjusts MTUs higher if they are not specifically set.
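To check what each interface actually ended up with after either ordering, a minimal sketch along these lines may help. It assumes the illumos `dladm` and `ipadm` tools; the helper name `show_mtu` is made up for illustration, and the interface names are whatever your VM uses.

```shell
#!/bin/sh
# Sketch: print the link-level and IPv4-level MTU for each interface given.
# show_mtu is a hypothetical helper name, not part of any tool.
show_mtu() {
    for nic in "$@"; do
        echo "== $nic =="
        dladm show-linkprop -p mtu "$nic"        # MTU at the datalink layer
        ipadm show-ifprop -p mtu -m ipv4 "$nic"  # MTU at the IPv4 interface layer
    done
}

# Example call (run as root on the OmniOS VM, names are assumptions):
#   show_mtu vmxnet3s0 vmxnet3s1
```

If the management interface still reports 9000 here, the 1500 setting was probably applied before the driver change took effect.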
 
I suspect that one disk is blocking the bus or the controller.
On a hot-pluggable server, you can find a single bad disk like this:
- power off and remove all disks
- power on, boot Solaris
- plug in a disk and wait some seconds until it comes up in menu Disks
- repeat this with all disks. If one disk causes problems, that is the disk you need to replace

Unlike with hardware RAID, such a procedure is not a problem with ZFS.
The pool simply stays offline until enough disks come back. If a disk is missing but the pool is still within its redundancy, the pool state is degraded.
When the last disk is back, the pool state goes to online. If you have modified data in the meantime, it initiates a resilver.
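While plugging disks back in one at a time, the pool state can also be checked from the console. A small sketch, assuming only the standard `zpool status -x` command; the helper name `pools_healthy` is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only when zpool reports every imported pool as healthy.
# pools_healthy is a made-up helper name for illustration.
pools_healthy() {
    [ "$(zpool status -x)" = "all pools are healthy" ]
}

# After plugging in each disk, something like:
#   pools_healthy && echo "pools OK" || zpool status -x
```

Anything other than the healthy message (degraded, resilvering, or no pools available) makes the helper fail, so the full `zpool status -x` output is worth reading in that case.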

If this does not help, you may need another setup where you can test
whether the pool itself is the problem, or one of:
- power
- the SAS controller
- the expander
- the backplane

Yeah, at this point I'm thinking it's a combination of the SAS controller <-> backplane. Depending on how I plug things in, I can get some drives to read and nothing on others. Gonna pull my entire server upstairs in the next day or so and just sit down and play around with different connections and see what I can get. Thinking I might just break down and go with a new chassis that has actual SAS ports and a new SAS controller that has enough ports to not need an expander. Any suggestions on what's known to work well to control 24 drives?
 
This might be related to enabling 9000MTU and jumbo frames, or using Solaris 10 vs 11 vmware drivers. Did you modify the vmware tools script to try and select Solaris 11 drivers for OmniOS?

I noticed that in ESXi I had the guest OS set to solaris 11 x64. I did not modify the install script. Is this ESXi setting used by the perl script?

I tried setting it to solaris 10 x64, but after the vmware tools update the napp-it interface still wasn't working. Is there a way to see if I currently use the solaris 11 driver?

Never tried anything with jumbo frames on this machine.
 
My VM's guest operating system is set to "Oracle Solaris 10 (64-bit)". I'm not sure if it reads that during the vmware tools install though.

You should be able to figure out which one you're using by matching the file sizes:

From the vmware-tools-distrib folder, look at the file sizes
(e.g. the 10_64 version is 33888 bytes):

Code:
~/vmware-tools-distrib# find . -name 'vmxnet3s' -ls

 78270   34 -rw-r--r--   1 root     root        34072 Aug 23 00:41 ./lib/modules/binary/2009.06_64/vmxnet3s
 78280   25 -rw-r--r--   1 root     root        24568 Aug 23 00:41 ./lib/modules/binary/11/vmxnet3s
 78264   35 -rw-r--r--   1 root     root        35088 Aug 23 00:41 ./lib/modules/binary/11_64/vmxnet3s
 78256   34 -rw-r--r--   1 root     root        33888 Aug 23 00:41 ./lib/modules/binary/10_64/vmxnet3s
 78274   25 -rw-r--r--   1 root     root        24424 Aug 23 00:41 ./lib/modules/binary/2009.06/vmxnet3s
 78247   25 -rw-r--r--   1 root     root        24336 Aug 23 00:41 ./lib/modules/binary/10/vmxnet3s

Then look in your /kernel/drv/:

Code:
# find /kernel/drv/ -name 'vmxnet3s' -ls
 74473   34 -rw-r--r--   1 root     root        33888 Dec  1 19:39 /kernel/drv/amd64/vmxnet3s
 74472   25 -rw-r--r--   1 root     root        24336 Dec  1 19:39 /kernel/drv/vmxnet3s

The size of amd64 vmxnet3s in /kernel/drv/ matches the Solaris 10_64 version: 33888 bytes
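The size comparison above can be scripted. This is only a sketch: it assumes the `find -ls` column layout shown in the listings (size in the 7th field, path last), and `match_driver` is a made-up helper name. Matching by size is a heuristic, not proof that the files are identical.

```shell
#!/bin/sh
# Sketch: read `find -ls` output on stdin and print the variant directory
# (e.g. 10_64) whose vmxnet3s file has exactly the given size in bytes.
# match_driver is a hypothetical helper name.
match_driver() {
    awk -v size="$1" '$7 == size {
        n = split($NF, parts, "/")   # path components; the last one is the file name
        print parts[n - 1]           # parent directory names the Solaris variant
    }'
}

# Example usage (paths as in the post above):
#   installed=$(ls -l /kernel/drv/amd64/vmxnet3s | awk '{print $5}')
#   (cd ~/vmware-tools-distrib && find . -name vmxnet3s -ls) | match_driver "$installed"
```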
 
solaris 10 x64 driver is installed now.

After vmware tools update the solaris 11 x64 driver is installed. So that would probably explain why napp-it breaks.

I can not find a way to force version 10.
 
Hi Toelie, as a last resort you could try to do the opposite in Step 4 of the following post and edit the script to use version 10:
https://forums.servethehome.com/ind...le-and-napp-it-vmxnet3-and-jumbo-frames.2853/

(they were trying to force using version 11)


You could try modifying the vmware-config-tools.pl to override the Solaris version reported by uname:

Under vmware-tools-distrib/bin

# vi vmware-config-tools.pl

Search for solaris_os_version:
Type:
/solaris_os_version (hit enter)

Add the hack section in the following sub, which basically overrides what uname reports to the script.

Note: This will affect all the functions in the VMware tools install, so use at your own risk, e.g. don't use it unless you have nothing else to lose :)

Code:
sub solaris_os_version {
  my $solVersion = direct_command(shell_string($gHelper{'uname'}) . ' -r');
  chomp($solVersion);
  my ($major, $minor) = split /\./, $solVersion;

  # HardOCP hack - force to use Solaris 10.
  # Use at your own risk! (create a boot environment first)
  # This will force all function checks in VMware install
  #  to think you are on Solaris 10.
  $minor = 10;
  # END hack  

  return ($major, $minor);
}
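Since the hack's own comment says to create a boot environment first, here is a minimal sketch of that preparation step. It assumes the standard `beadm create` command; the helper name `prepare_hack`, the boot environment name, and the `.orig` suffix are all made up for illustration.

```shell
#!/bin/sh
# Sketch: create a boot environment to fall back to, then keep an untouched
# copy of the installer script before editing it.
# prepare_hack, the BE name and the .orig suffix are illustrative assumptions.
prepare_hack() {
    be_name=$1
    script=$2
    beadm create "$be_name" || return 1   # snapshot of the current root to roll back to
    cp -p "$script" "$script.orig"        # pristine copy of the script
}

# Example call:
#   prepare_hack pre-vmtools-hack ~/vmware-tools-distrib/bin/vmware-config-tools.pl
```

If the tools install goes wrong, the earlier boot environment can be activated again with `beadm activate` and a reboot.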
 
@J-San

Thank you very much. I could only get it to work with the perl script modification. Very strange because it should detect the version automatically by querying omnios itself. No idea why it does not work on my omnios install.

I will try your omnios tweaks (Jumbo frames, TCP and NFS) on the new test server that will arrive in the next few weeks. Looks promising...
 
Glad that helped!

My version of OmniOS is from Gea's vmware appliance.
It's on OmniOS r151010

Maybe there's an issue with a newer version?
There is an OmniOS r151012, which is the latest "stable" release, and maybe it reports as Solaris 11?

To determine the major release your system is on, look at /etc/release
 
I have uploaded a new ESXi appliance with current OmniOS 151012 and napp-it 0.9f3

The installed vmware tools are from a default 5.5U2 setup that installs the Solaris 10-64 versions.
If you want to try Solaris 11 versions of vmxnet3, the vmware tools installer is in /root
with the drivers in /root/vmware-tools-distrib/lib/modules/binary/

download: http://napp-it.org/downloads/index_en.html
 
Is there any way I can choose which NIC to use for Replication? If possible, can I use multiple NICs?

Also FYI appliance-group does not work when admin pwd starts with !.

/Paniolo
 
You replicate to another host/ip so the nic is selected based on the ip settings.
With two nics they should not be in the same subnet.

If you want to use two nics you must create a link aggregation (LACP).
A better option may be a switch to 10G if you need more performance.
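To see which NIC the IP settings select for a given replication target, the routing table can be queried. A sketch only, assuming the illumos `route get` output contains an `interface:` line; the helper name `iface_for` and the example address are placeholders.

```shell
#!/bin/sh
# Sketch: print the interface the routing table picks for a destination IP.
# iface_for is a hypothetical helper name.
iface_for() {
    route -n get "$1" | awk '$1 == "interface:" { print $2 }'
}

# Example call (address is a placeholder for the target host's IP):
#   iface_for 192.168.10.2
```

Replicating to the target's address on the faster subnet should then show the faster link here.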

About pw
napp-it parses the form values and removes some characters
Allowed characters see menu napp-it settings -> pw
allowed: [a-zA-Z0-9,.-;:_#]

Some more characters, like !, would be possible for the pw.
I will check whether I can make the parsing less restrictive.
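A password can be checked against the allowed list before it is entered in the form. This is a rough sketch, not napp-it's actual parser: `pw_ok` is a made-up name, and it assumes the `-` in the list is meant as a literal hyphen rather than a range.

```shell
#!/bin/sh
# Sketch: succeed only if the password is non-empty and consists solely of
# the characters listed as allowed (letters, digits and , . - ; : _ #).
# pw_ok is a hypothetical helper, not part of napp-it.
pw_ok() {
    [ -n "$1" ] || return 1
    # Delete every allowed character; anything left over is not allowed.
    rest=$(printf '%s' "$1" | tr -d 'a-zA-Z0-9,.;:_#-')
    [ -z "$rest" ]
}

# Example:
#   pw_ok 'my_pw-2015' && echo accepted
#   pw_ok '!secret'    || echo rejected
```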
 
I have 2x1gbit in LACP I use for management and then two Infiniband 32Gbit I have each on their own subnet.

netcat happens over the management network - each host can ping each other across all 3 ip addresses.

I would prefer if it would use at least one of the 32gbit links instead of the 1gbit link.

Paniolo
 
Auto scrub jobs are no longer working since Dec 6. I am unable to delete or manually run the jobs. I have restarted the server (OmniOS) with the same results. Any suggestions for troubleshooting/resolution?


 
confirmed, seems to be a bug in dev edition
.... fixed in 0.9f4_dev dec 23 nightly
 
For napp-it Home and Pro: menu About - Update >> download 0.9f4

I have also added 0.9f4 dev to today's free ESXi appliance v15a (2015, first edition).
It includes support for push alerts to your smartphone/ tablet and first efforts to support IB
 
Since I was already at a 0.9f4 release it wasn't clear. Appears to be working fine. Thanks for your quick responses.
 
I do not assign a new release number to small bugfixes in the developer nightly edition.
A new date must be enough, as they can be (and are) updated daily.

This is different from the stable numbers like 0.9f3 and the next 0.9f5 stable, which are installed by a default setup.
They are published only after some time of "nobody reports problems".
 
Will this increase transfer speed from external computers to napp-it in ESXi, or just VM to VM inside ESXi?

I'm moving my 5x2TB raidz1 from a FreeNAS VM to an OmniOS VM, but transfer speed from the OmniOS VM to Windows 7 ranges from 100 KB/s to 6 MB/s with large files. I did a clean install from the OmniOS.iso and followed the instructions to set up vmxnet3 only. I am running ESXi 5.5.0 U2.

By the way, I have uploaded napp-it_15a_vm_for_ESXi_5.0-5.5.zip to my server if anyone else wants it, because the transfer speed to my end was pretty slow.

Will the preconfigured version be any different?

My standalone OI 151a5 with napp-it has worked great over the years. If this doesn't work, I still have some consumer hardware I can build a standalone machine with, but it just won't have ECC memory.
 
vmxnet3 speeds up traffic from a guest VM to the ESXi virtual switch.
If you have two guests, internal transfers can go up to several GByte/s.

External transfers are limited by the physical NIC (1 Gbit/s, about 100 MByte/s).
In your case, you need to connect all VMs to the same vSwitch, or transfers go
over the slower physical NICs.

Have you enabled sync on those filesystems that are connected to ESXi over NFS?
If so, disable sync and recheck performance.
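For the sync check, a minimal sketch using the standard `zfs get` and `zfs set` commands. The helper name `sync_off_for_test` and the filesystem name in the example are assumptions. Note that with sync disabled, the last few seconds of writes can be lost on a crash or power failure, so this is meant as a temporary test setting.

```shell
#!/bin/sh
# Sketch: show the current sync setting of a filesystem, then disable sync
# for a benchmark run. sync_off_for_test is a hypothetical helper name.
sync_off_for_test() {
    fs=$1
    old=$(zfs get -H -o value sync "$fs") || return 1
    echo "previous sync setting for $fs: $old"
    zfs set sync=disabled "$fs"
}

# Example call (filesystem name is a placeholder):
#   sync_off_for_test tank/nfs
# Restore afterwards with the value printed above, e.g.:
#   zfs set sync=standard tank/nfs
```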
 
OK, tested the system again and found something interesting. New files I dump onto the ZFS drive transfer really fast, and when I transfer them back, it's really fast.

ONLY the old files that were on the drive (from FreeNAS) are super slow.

zpool version is 5000 ???

 
I'm in the process of rebuilding one of my NAS machines following issues with a ST2000DM001 (apparently there is a firmware update that I need to apply to the disks). I had been running in RaidZ mode (Raid5), but want more security of data so am going to RaidZ2 (Raid6) with 6 * ST2000DM001.

This means I'm booting off the USB, which I have been doing on a couple of the other machines. However I'm unable to get the 2nd USB device to show up. The boot device is the lower of the 4 front USB slots, and I want to mirror with a device in the slot above.

The lower slot presents as c2t0d0, and the one above seems to come up as c4t0d0. I say 'seems' because both USB devices appeared in the napp-it screen (Disks), but both had a status of 'removed', and a few minutes later the mirror device no longer appeared.

Anyone have any ideas on how to mirror the zpool when booting from USB on a N40L? There's currently nothing on the machine so I have no issues with rebuilding anything to achieve the required end state (boot off mirrored USB's).

pcd
NB I'm aware that I can periodically 'clone' the USB device - not really what I'm looking for.
 
Can you just mirror the sticks?
You can check the disk state in menu pools.
 
Hey everyone,

I'm entering the final stages of my Napp-it NAS.

What I have run into is an interesting problem.

I have a Supermicro MBD-X10SL7-F-O that has the 2308 controller on it.

I've successfully flashed the onboard to IT mode and it works fine by itself.

If I add in my 9207 card, only the bios for the onboard loads. If I disable the onboard SAS the 9207 loads.

I've tried flashing the 9207 to both IR and IT mode.

What I was trying to accomplish was to have the onboard handle my ZFS drives for Napp-it while the 9207 would be in IR mode hardware mirror on SSD for the ESXI and napp-it Install.

I have the following configuration so far

On the 9207 (IR Mode)
2x 120gig SSD, raid 1 mirror for ESXI and napp-it install

On the onboard 2308 (IT Mode)
2x 256gig SSD ZFS which will host all the other VMs
4x 4TB 7200rpm Seagates ZFS for SMB or iSCSI (fast media storage)
 
I would flash

- one card with bios extension and IR firmware (boot)
- one card with removed bios and IT firmware (data, pass-through)
 
Ah thanks! I didn't know that you could remove the bios on them.

I'm having a hard time finding out how to remove the bios - do you happen to have an article on that?

I'm quite familiar with how to update the firmware.
 
Clear the flash (as you should do anyway on an update) and just don't flash a BIOS.

Edit: Specifically, the options -o -e 6 should be used to clear everything except the manufacturer area of the flash. See the sas2flash manual.
 
Ah gotcha

sas2flash -o -e 6

to erase, then

sas2flash -o -f XXXXXXXX.bin

leaving off the -b mptsas2.rom - the bios is the .rom in this, got it
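The two steps above could be wrapped like this. It is only a sketch: `flash_it_without_bios` is a made-up name, the firmware file is whatever IT image you downloaded, and the `-c` controller selector is an assumption to verify against the sas2flash manual (it matters when two LSI controllers are in the same box). Erasing without completing the flash can leave the card unusable, so check the target first with `sas2flash -listall`.

```shell
#!/bin/sh
# Sketch: erase the flash of one controller and write firmware only,
# deliberately leaving out -b so that no boot BIOS is installed.
# flash_it_without_bios is a hypothetical helper; -c is an assumed selector.
flash_it_without_bios() {
    ctrl=$1   # controller index as reported by sas2flash -listall
    fw=$2     # path to the firmware image
    [ -f "$fw" ] || { echo "firmware file not found: $fw" >&2; return 1; }
    sas2flash -c "$ctrl" -o -e 6 || return 1   # clear everything except the manufacturer area
    sas2flash -c "$ctrl" -o -f "$fw"           # flash firmware, no -b option
}

# Example call (index and file name are placeholders):
#   flash_it_without_bios 0 XXXXXXXX.bin
```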
 

So the saga still goes on - I'm hoping this will help someone if they run into this same situation -

1. I have successfully flashed my 2308 onboard to IT without bios

Hooked up to it:
4x Seagate 7200rpm 4TB drives (for ZFS)
2x Samsung 840 EVOs 256 (for ZFS) which will hold the VMs

2. I have successfully flashed my 9207-8i to IR mode (making it a 9217-8i) with bios

This has fixed my bios issues - the mpt/avago bios manager correctly loads which recognizes both LSI cards

However - I found out a major issue with this. I had purchased Kingston 120gig SSDNOW v300's for a raid 1 mirror off the 9207 in IR mode.

On boot it either holds at initialization indefinitely or bugs out after a while. One time it made it into the BIOS manager, but it failed out almost immediately on the RAID 1 creation.

If I disconnect the SAS cable on the 9207 this issue goes immediately away. If I plug any other SATA drive (I have a 30gig SSD as well as a SATA DVDR on the same system) it loads perfectly. If I move the samsung 840 EVOs to the controller all the issues go away.

I was pretty stumped and looked up the Kingston drive (I got them off of Amazon last week) and, lo and behold, they have a SandForce controller (ironically owned by LSI). After googling a bit, there are reports of issues with TRIM and LSI controllers, SandForce in general on LSI controllers, etc.

tl;dr - the SandForce controllers in my Kingston SSDs are screwing me over, and my LSI 9207 has major issues with them. I didn't think to check for that when buying them.

I haven't checked yet whether there are any firmware updates for them that fix the issues.
 
Another follow up -

I had purchased a pair of 120gig PNY Optima SSD's from Bestbuy. After reading the tweaktown and other horror stories of how they swapped the SMI controllers for sandforce ones (bait and switch) and were dying frequently I kept them in the box and ordered the Kingston v300's from Amazon not knowing they were sandforce.

Tonight after having the Kingston issues with the LSI, I decided to open up the PNYs. They were not the sandforce ones but the SMI ones so I lucked out. They work perfect and I was able to raid 1 on the 9207 IR as well as installing ESXI. The bios on the 9207 sees all the SSDs in the system and HDs perfectly (now that i have removed the Kingstons). Booting is quick.

The Kingston ones seem to work fine booting them on non-LSI controllers (one of my other systems).
 
Which LSI firmware have you flashed?

There are some bug reports around with the current P20 firmware,
especially with faster disks and SSD.

Suggested workaround
use P19 firmware
 
I flashed the P20 firmware. Now I'm wondering if I should go back to P19

It does seem to be working fine with the Samsung EVOs, SMI controller PNY's and regular hard drives.
 
Can you just mirror the sticks?
You can check the disk state in menu pools.
Gea
I wish I could, you even provide a menu option to do this. The problem is that the boot USB device appears as 'removed' and the candidate mirror appears as removed for a short period of time then vanishes altogether.

This is with OmniOS, which I notice is missing some utilities such as 'top' and 'rmformat'. These are both present in OpenIndiana, not sure if this helps or not. Installing the package that has 'rmformat' allowed me to at least 'see' the devices.

pcd
 
Can you see the sticks when you enter format (console, end with ctrl-c or napp-it cmd form)
You can disable monitoring (top level menu mon). That may keep the removed state.

I have currently no N40 to check the base problem but the needed steps are easy, see
http://omnios.omniti.com/wiki.php/GeneralAdministration#MirroringARootPool
Format only shows the 6 main disks:
Code:
AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <ATA-ST2000DM001-9YN1-CC4H-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@0,0
       1. c1t1d0 <ATA-ST2000DM001-9YN1-CC4H-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@1,0
       2. c1t2d0 <ATA-ST2000DM001-9YN1-CC4H-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@2,0
       3. c1t3d0 <ATA-ST2000DM001-9YN1-CC4H-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@3,0
       4. c1t4d0 <ATA-ST2000DM001-1ER1-CC25-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@4,0
       5. c1t5d0 <ATA-ST2000DM001-1ER1-CC25-1.82TB>
          /pci@0,0/pci103c,1609@11/disk@5,0
Specify disk (enter its number):

rmformat shows the USB devices
Code:
     1. Logical Node: /dev/rdsk/c2t0d0p0
        Physical Node: /pci@0,0/pci103c,1609@12,2/storage@2/disk@0,0
        Connected Device: SanDisk  Ultra Fit        1.00
        Device Type: Removable
	Bus: USB
	Size: 30.6 GB
	Label: <None>
	Access permissions: Medium is not write protected.
     2. Logical Node: /dev/rdsk/c4t0d0p0
        Physical Node: /pci@0,0/pci103c,1609@12,2/storage@1/disk@0,0
        Connected Device: SanDisk  Ultra Fit        1.00
        Device Type: Removable
	Bus: USB
	Size: 30.6 GB
	Label: <Unknown>
	Access permissions: Medium is not write protected.

The disks menu only shows the boot device and indicates that it is removed:
Code:
id      part       identify  stat     diskcap  partcap  error        vendor   product           sn
c1t0d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-9YN1  S2F028NR
c1t1d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-9YN1  S2F00KND
c1t2d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-9YN1  S2F028W4
c1t3d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-9YN1  S2F0291P
c1t4d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-1ER1  Z4Z1MPCD
c1t5d0  (!parted)  via dd    ok                2 TB     S:0 H:0 T:0  ATA      ST2000DM001-1ER1  Z4Z14834
c2t0d0  -          -         removed           32 GB    -            SanDisk  Ultra Fit
 
OK, menu Disks uses format to detect disks.
You can now
- mirror manually with the disk IDs

What happens if you enable partition support in napp-it,
where parted is used instead of format?
>> Menu Disks >> Partitions >> enable Partition support
 
Followup to the post above and my initial reply. I followed the instructions, and the pool is currently resilvering. So whilst the devices are appearing for some commands and not for others, blindly following the instructions seems to work!

Resilver complete, Grub installed, system rebooted - All OK
Everything is looking great.
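For anyone landing here later, the manual mirror roughly comes down to the commands below. This is a sketch under several assumptions: the root pool is named rpool and lives on slice 0, the second stick already has a Solaris fdisk partition, and the system boots with legacy GRUB as on the OmniOS releases discussed here. `mirror_rpool` is a made-up helper name; the linked wiki page remains the reference.

```shell
#!/bin/sh
# Sketch: copy the slice table to the new stick, attach it to the root pool,
# and install the boot loader on it. mirror_rpool is a hypothetical helper;
# the s0/s2 slice layout and the pool name rpool are assumptions.
mirror_rpool() {
    src=$1   # existing boot device, e.g. c2t0d0
    dst=$2   # new mirror device, e.g. c4t0d0
    prtvtoc "/dev/rdsk/${src}s2" | fmthard -s - "/dev/rdsk/${dst}s2" || return 1
    zpool attach -f rpool "${src}s0" "${dst}s0" || return 1
    installgrub /boot/grub/stage1 /boot/grub/stage2 "/dev/rdsk/${dst}s0"
}

# Example call with the device names from this thread:
#   mirror_rpool c2t0d0 c4t0d0
# Then watch the resilver with: zpool status rpool
```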
 