OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

I've been using OpenIndiana with napp-it for 2 years and it has been running well. Today I tried OmniOS, so I exported my pools from OpenIndiana and imported them into OmniOS, but when I click on the pool it shows the error below. Do I need to do anything else so I can import them into OmniOS?

Software error:
Can't load '/var/web-gui/data/napp-it/CGI/auto/IO/Tty/Tty.so' for module IO::Tty: ld.so.1: perl: fatal: /var/web-gui/data/napp-it/CGI/auto/IO/Tty/Tty.so: wrong ELF class: ELFCLASS32 at /usr/perl5/5.16.1/lib/i86pc-solaris-thread-multi-64/DynaLoader.pm line 190.
at /var/web-gui/data/napp-it/CGI/IO/Tty.pm line 30.
Compilation failed in require at /var/web-gui/data/napp-it/CGI/IO/Pty.pm line 7.
BEGIN failed--compilation aborted at /var/web-gui/data/napp-it/CGI/IO/Pty.pm line 7.
Compilation failed in require at /var/web-gui/data/napp-it/CGI/Expect.pm line 22.
BEGIN failed--compilation aborted at /var/web-gui/data/napp-it/CGI/Expect.pm line 22.
Compilation failed in require at /var/web-gui/data/napp-it/zfsos/_lib/illumos/zfslib.pl line 2758.
BEGIN failed--compilation aborted at /var/web-gui/data/napp-it/zfsos/_lib/illumos/zfslib.pl line 2758.
For help, please send mail to this site's webmaster, giving this error message and the time and date of the error.
 
This error message is related to Expect, a software module that is needed to run interactive console commands within the GUI.

Napp-it comes with different versions of Expect (Linux, OI, different OmniOS versions).
It usually detects the needed version during a login to napp-it, so first try a napp-it logout/login.

If this does not help, please add information about your OmniOS and napp-it releases.
 
Still the same problem after logout/login.

I'm using the current stable release (r151012, omnios-10b9c79) and napp-it 0.9f4 eval.
 
Depending on the cause, you have three options:
- redo a basic napp-it setup via wget (helps if something went wrong with the setup); the usual one-liner is sketched after this list

- or copy one of the two possible OmniOS Expect modules manually (try both):
cp /var/web-gui/data/tools/omni_bloody/CGI/* /var/web-gui/data/napp-it/
cp /var/web-gui/data/tools/omni_stable/CGI/* /var/web-gui/data/napp-it/
(replaces the Expect files in CGI)

- or redo an OmniOS setup followed by a napp-it wget setup
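
For reference, the wget setup is normally this one-liner from the napp-it homepage (from memory, so double-check against napp-it.org; run it as root with internet access):
Code:
wget -O - www.napp-it.org/nappit | perl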
 
OK, I did a fresh install of OmniOS and then the napp-it setup, and the problem was gone; I was able to import my pools. Then I started to install Transmission. The installation went fine, and after a reboot Transmission is running, but the same problem is back. So I guess it's a conflict between napp-it and Transmission. These are the dependency packages that I installed together with Transmission and pkg-config:

pkg install omniti/library/libevent
pkg install library/perl-5/xml-parser

So it seems the error relates to Perl, but I'm not sure how it can be fixed so both can run together.
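
In case it helps, this is what I plan to check, guessing that the bundled Tty.so and the default Perl now disagree about 32/64 bit (the path is taken from the error above):
Code:
# which perl is found first and which architecture it was built for
which perl
perl -V:archname
# ELF class (32- or 64-bit) of the library napp-it tries to load
file /var/web-gui/data/napp-it/CGI/auto/IO/Tty/Tty.so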
 
Okay, I'm throwing in the towel.. I need help!

I'm trying to move the metadata of my Plex Media Server (PMS) to an NFS-exported ZFS filesystem. It all seems to work fine: I can mount the exported filesystem and I've copied the old PMS metadata to it.

Now PMS won't start, and I think the issue is permissions or ownership.

Code:
<username>@Plex:/var/lib/plexmediaserver/Library/Application Support$ ls -l
total 9
drwxrwxrwx 10 messagebus users 11 Mar 25 12:32 Plex Media Server

The filesystem is mounted as folder "Plex Media Server".
I've created a user and group named "plex" on the OmniOS fileserver and changed ownership of the filesystem to plex:plex, but the above is what I see on the client.
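
My guess is that this is NFSv3 uid/gid mapping: the server sends only numeric IDs, so "messagebus" is simply whichever client account happens to share the uid I gave "plex" on the server. I suppose I should compare the numeric IDs on both sides, something like this (the server-side path is a placeholder for my exported filesystem):
Code:
# on the OmniOS server
id plex
ls -ln /tank/plexmeta        # placeholder path of the exported filesystem
# on the Linux client (check the uid/gid the PMS user actually runs as)
id plex
ls -ln "/var/lib/plexmediaserver/Library/Application Support"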

Any ideas?
 
Does power management work in OmniOS? I set my disks to spin down after 900s, but it seems they keep spinning all the time.

Thanks
 
I have not heard of a general problem, unless you or a service like napp-it alerts or fmd (the Solaris fault management service, which can be disabled) hits a disk.
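
A quick way to check whether something is still touching the disks (standard illumos tools; the fmd module name is from memory):
Code:
# per-disk I/O statistics every 30 seconds; non-zero r/s or w/s keeps drives awake
iostat -xn 30
# list loaded fmd modules; disk-transport polls disks periodically
fmadm config
# if fmd turns out to be the cause, it can be disabled (losing its fault monitoring)
svcadm disable svc:/system/fmd:default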
 
I installed Logitech Media Server, so I guess it keeps checking the disks all the time, which keeps them from spinning down.
 
Hello!

We are having issues with iSCSI at work. Every now and then the iSCSI target just hangs. We are unable to kill it, restart it or do anything else to restore the service. The only option to restore the iSCSI target to a working state is to reboot the whole server and lose all the sessions (around 100 clients).

The weird thing is that only the iSCSI target hangs. I can ssh to the server and work on it without any problem; there is no load or anything else, just the iSCSI target locking up :(

The server is an IBM 3550 M4 with dual Xeon E5-2640 CPUs and 160GB of memory.
SAS HBA: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2

Anyone encounter similar troubles?

Matej
 
I had a similar problem recently on a test machine where I needed a reboot, but
I have not found the time since to dig deeper into it.

If this is on OmniOS, you should ask on the illumos or OmniOS discuss mailing lists (http://lists.omniti.com/mailman/listinfo/omnios-discuss and http://wiki.illumos.org/display/illumos/illumos+Mailing+Lists). Maybe someone at OmniTI can give an answer even without a support contract. (That can be helpful in a production environment.)

btw
every illumos/OmniOS user should join the illumos and OmniOS mailing lists, as there is more technical information there compared to a more general forum like this.
 
I am trying to configure MediaTomb and want it to scan my media directories, but it doesn't seem to pick anything up. All that shows up is some pictures from my home directory, which I don't want to share.
 
_Gea: Thanks for your input.

I will give the OmniOS mailing list a try. I already wanted to send a mail there anyway.

In case you get any more information on this issue, please post here.

Matej
 
I am having trouble with my hourly snaps not being maintained as expected: specifically, they are accumulating in the hundreds despite the jobs' keep/hold parameters.

As an example, I have an hourly job with the following properties:

Opt1 /from: "keep 50"
Opt2 /to: "hld 3"
Opt3: ""
every month
every day
every 1-BC hour
15 min

but it has accumulated hundreds of snaps. After the first time I noticed the accumulation, I deleted all but fifteen, but the number has built up again.

Does anyone know if this is a problematic configuration?
 
The above setting creates 48 snaps per day.
As both keep and hold are respected, hold 3 days is the effective setting,
which means you should have 144 snaps.

If the snap is recursive, you have 144 snaps x the number of filesystems.
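
To see how many snaps the job has really created, you can count them, for example (the filesystem name is a placeholder; drop -r if the job is not recursive):
Code:
zfs list -H -t snapshot -r -o name tank/data | wc -l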
 
With those settings I'm only seeing 12 snaps per day (at fifteen past the hour, from 06:15 to 17:15), but I have 359 of them now since 1 March, and climbing.

 
I got a pro evaluation today and have a few questions:

1. If I start a replication that takes longer than the eval, will it stop?

2. How much is it for just the async module for a non-government, non-educational US company?

3. When I add my 2nd Solaris server that I want to back up via "++add appliance", it says "wrong password". It resolved the IP to a hostname correctly, so there is definitely a connection. I'm 100% sure the password is correct because I use it to log in to the napp-it config every day.
 
If anyone encounters trouble with an iSCSI target freezing with a JBOD SAS expander box, here is a discussion:
http://lists.omniti.com/pipermail/omnios-discuss/2015-March/004593.html

It doesn't look good, though. It looks like SATA drives and SAS expanders don't go well together, especially on servers with high load (average IOPS on our server is 7k/s read and 1k/s write). There can be trouble when a drive in the array hangs and the controller sends a reset command. In some cases the whole SAS expander resets and all commands in the queue and on the bus are lost, producing a panic on the system :)

In my case, sometimes only some hosts lose the connection, sometimes all hosts drop...

Matej
 
Matej -

What expander part # are you using?

Is this occurring on all your machines or just one?
 
Backplane
- BPN-SAS2-837EL2 +
- BPN-SAS-837A

Hmmmm. Are you using both ports or just one? SATA or SAS drives, or a mix?

I have to check tomorrow at work, but I think I'm only using one port and all drives are SATA, which, as I now know, can be the cause of the problems because of SATA's poor error handling...

I get some errors in the logs, but I can't access them right now; I will have to post them tomorrow. They are messages about drive resets, but I can't figure out which drive is causing the problems.

Matej
 
1. Which disks make/manufacturer?

2. Are you passing the disks to ZFS for management or using hardware raid and passing the array through or?
 
1. Which disks make/manufacturer?

Seagate Constellation ES.3 4TB, model ST4000NM0033-9ZM170

2. Are you passing the disks to ZFS for management or using hardware raid and passing the array through or?

We are passing disks to ZFS for management. Disks -> SAS expander -> SAS HBA LSI SAS2308.

And we are only using one port on a JBOD.

Matej
 
Your options are mainly:

- check the HBA firmware (P20 has problems, use P19 instead); a way to read the installed firmware version from the OS is sketched after this list
- depending on the OmniOS release: timeout handling on OmniOS 151012 is better than on previous releases,
so update, or optionally wait a few days, as the 151014 long term stable is expected next week (with an update option from 151006 and up)

- it seems that one or more disks are causing the problem, as it was not there from the beginning.
If you have indications in the logs or SMART values, replace those disk(s), ideally with the SAS Constellation ES.3 (they cost about the same as the SATA ones)

In general:
- prefer expander-less, multiple-HBA solutions when using SATA disks
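
A possible way to read the HBA firmware version from the running system, assuming LSI's sas2flash utility is installed (it is a separate LSI/Avago download, not part of OmniOS; lsiutil works as well):
Code:
# lists every LSI SAS2 controller with its firmware and BIOS versions
sas2flash -listall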
 
I checked the firmware with the lsiutil software and it reports version 15.00.00, so I guess it is quite old:
Current active firmware version is 0f000000 (15.00.00)
Firmware image's version is MPTFW-15.00.00.00-IT
LSI Logic
Not Packaged Yet
x86 BIOS image's version is MPT2BIOS-7.29.00.00 (2012.11.12)
EFI BIOS image's version is 7.22.01.00

I will try to update it to P19.

- I heard about the 151014 release and that there should be some nice mpt_sas updates in it as well... I will probably wait another month for any issues to surface and then do the upgrade. I could create a new BE and upgrade ASAP, but I'm afraid to in case something goes wrong :)

- it looks like one or more drives are misbehaving, but I can't figure out which. I have a bunch of errors like this:
Apr 2 09:23:15 storage.host.org scsi: [ID 107833 kern.notice] /pci@0,0/pci8086,3c02@1/pci1000,3040@0 (mpt_sas0):
Apr 2 09:23:15 storage.host.org Timeout of 0 seconds expired with 1 commands on target 68 lun 0.
Apr 2 09:23:15 storage.host.org scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3c02@1/pci1000,3040@0 (mpt_sas0):
Apr 2 09:23:15 storage.host.org Disconnected command timeout for target 68 w500304800039d83d, enclosure 3
Apr 2 09:23:15 storage.host.org scsi: [ID 365881 kern.info] /pci@0,0/pci8086,3c02@1/pci1000,3040@0 (mpt_sas0):
Apr 2 09:23:15 storage.host.org Log info 0x31140000 received for target 68 w500304800039d83d.
Apr 2 09:23:15 storage.host.org scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
How do I find out which device/drive is the one causing me problems?
If I do 'zpool status' or 'cfgadm -vla', I can't see a drive with a 'w5003...' WWN; all drives have WWNs in the 'w5000....' format.
I would like to look at the SMART stats, but Seagate SMART reporting is poor, since they don't report the actual number of errors. For example:
1 Raw_Read_Error_Rate 0x000f 068 063 044 Pre-fail Always - 6424418
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 092 060 030 Pre-fail Always - 1773170504
195 Hardware_ECC_Recovered 0x001a 062 003 000 Old_age Always - 6424418

Which parameters do you usually look at?
We currently have around 140TB of storage in 2 JBODs and will attach 2 more, so I can't imagine going without expanders, although I would love to :) Also, the new drives will all be SAS, but 3 years ago, when we bought the current JBODs, SAS drives were still very expensive...

Matej
 
Your problem is target 68 on mpt_sas0.

If you are using napp-it with the monitor extension, you can find the target-to-WWN/slot translation
in menu Disks >> SAS2 extension.

(screenshot: sas2_slots.png, the Disks >> SAS2 extension slot map)
 
I'm not the admin of the iSCSI server, and the current admin prefers not to install napp-it.

Can I do it from the CLI? Which tools do you use to get the target number and the mpt_sas number? I guess I could somehow produce the same result using only the CLI.
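
I guess something like this could work, if LSI's sas2ircu utility runs against this HBA (it is a separate LSI download, so this is just an idea):
Code:
# list the LSI controllers and their index numbers
sas2ircu LIST
# show enclosure/slot, SAS address and serial number of every attached drive on controller 0;
# the SAS address can then be matched against the one from /var/adm/messages
sas2ircu 0 DISPLAY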

Matej
 
1.
Once started, it will run to the end.

2.
Replication extension; for options see
http://napp-it.org/extensions/quotation_en.html

3.
Enter the IP of the source server with its napp-it admin password (not the root password).
Some characters are not allowed in the password; optionally try a password from [A-Za-z0-9].

Thanks Gea, we just bought the Async extension. Napp-it has been great for us, we thank you a million times.

And you were right, we had a special character in the password which it didn't like.
 
Any thoughts on a 4TB difference between host and target when using the replication extension? On the server that has the original data, the snapshot REFER shows 16TB; on the backup server it's only showing 12TB. I thought maybe the backup got interrupted somehow, so I ran the job again, but there was no change.
 
Most probably:

- you have additional snaps on the source;
the replication parameter -I includes them, otherwise only the base and the last snap are used (default)

- replication decodes compress or dedup;
if the source is not compressed but the target (parent) is, this can be the result

- suboptimal ZFS layout on the source, for example a non-optimal number of 4k disks in a RAID-Z
(this alone would not give a 30% difference but can add to the other effects)
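
To see where the space actually goes on each side, a per-dataset breakdown may help; a generic sketch (replace tank with the pool name on each server):
Code:
# USEDSNAP vs. USEDDS shows how much space is held by snapshots vs. the live data
zfs list -o space -r tank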
 
:hello:

Are all multi-TB drives the exact same size?

When I built my only vdev I used all kinds of 2TB drives I had, partly because that's what I had and partly because I thought that way the size of the smallest one would be used for the others, so I could replace any drive with any 2TB drive I have in case of failure.

Was that necessary? I've received 20 HGST NAS 4TB drives and plan a vdev with them, so I want to be sure I can replace them with 4TB WDs or 4TB Seagates if the need arises.
 
Most probably:

- you have additional snaps on the source;
the replication parameter -I includes them, otherwise only the base and the last snap are used (default)

Perhaps I'm misreading it, but isn't the 15.6 the size of the full dataset, not including previous snapshots?

pool3d/U pool3d/U@1427812105_repli_zfs_xxx-frontend-02_nr_1 Tue Mar 31 10:29 2015 878M - 15.6T delete

- replication decodes compress or dedup;
if the source is not compressed but the target (parent) is, this can be the result

No dedup but compression is on both target and parent.

- suboptimal ZFS layout on the source, for example a non-optimal number of 4k disks in a RAID-Z
(this alone would not give a 30% difference but can add to the other effects)

Ashift is showing 12 for all VDEVs on both servers.

The backup server is RAID-Z2 and the source is RAID 10, if that matters.


 
It's hard to compare a snap size with a filesystem size without knowing all the details.

What should happen:
if you transfer a filesystem without snaps, compress or dedup from one server
to another with the same disks and pool layout, the size of the source and the newly created
target filesystem should be about the same.

If anything is different, the result is different.
If you are in doubt whether everything was transferred, create a file list on both and compare.
But usually a zfs send that ends without an error is very trustworthy.
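
A minimal sketch of such a comparison (paths are placeholders; run the find on each server, then diff the two lists on one machine):
Code:
# on the source server
find /tank/data -type f | sort > /tmp/source.files
# on the backup server
find /backup/data -type f | sort > /tmp/target.files
# after copying one list over
diff /tmp/source.files /tmp/target.files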
 