OpenSolaris derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

There is a bug in the comstarlib.pl file, in the ext_iscsi_target_comma_list subroutine, that breaks several of the menu options under Comstar.

To fix it, replace the regular expression on the line that reads

$t=~s/ .*//; #remove end statement

with this one

$t=~s/(\t.*)|( .*)//; #remove end statement
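
For illustration, here is how the two expressions behave (the sample line is hypothetical, not actual Comstar/itadm output; it just assumes the target list fields are tab separated, which the old space-only regex does not handle):

my $t = "iqn.2010-09.org.openindiana:02:example\tonline\t0";   # hypothetical tab-separated line
(my $old = $t) =~ s/ .*//;           # old regex: no space in the line, so nothing is removed
(my $new = $t) =~ s/(\t.*)|( .*)//;  # fixed regex: cuts at the first tab or space
print "old: $old\nnew: $new\n";      # old keeps the trailing fields, new keeps only the target name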
 
Hi _Gea,
I have a TLS alert job to email a gmail account that has been sending alert emails every day even though it says disk errors: none. This has been a problem ever since I started using the job in napp-it 0.5. I've tried deleting and re-adding the job without success. Could you look into it? I'm currently running 0.7l on Solaris 11.
 
What is the newest distro based on Illumos?

I've just watched a video posted in another thread where Bryan Cantrill explains what happened with the Oracle takeover, and I'm embarrassed to say I use Solaris 11!
 
Folks,

A couple of questions. Solaris 11 is $1k a year for support, and as much as I dislike Oracle, having support for the OS to be used for an enterprise ZFS deployment makes a lot of sense, at least right now I think it does. :)

Plus you get the latest release of ZFS, so what are the downsides other than having to give Oracle money? We've tested FreeNAS, which we love; however, its ZFS v15 is a bit behind and won't be updated anytime soon.

We've also looked at using FreeBSD 9 which uses ZFS v28, but I've read that FBSD 9.0 is not very stable at the moment, so we haven't tested it.

Is napp-it the only GUI-based solution to manage Solaris 11 and ZFS? Are there any built-in GUI management tools in Solaris 11 to handle ZFS? Is using Solaris 11 as the OS for an enterprise-class NAS/NFS deployment silly?

Do the DTrace analytics exist in Solaris 11 (web based) like they do in the Oracle 7000 series of appliances?

Thanks, this is an awesome thread btw, much has been learned from it!
 
Plus you get the latest release of ZFS, so what are the downsides other than having to give Oracle money?

You'll get the latest release of ZFS after all the core ZFS developers left Oracle. IMHO, illumos is where all the best future ZFS features will land.
 
Which Nexenta ZFS release is based on illumos? Or is that the upcoming 4.0 release?
 
Which Nexenta ZFS release is based on illumos? Or is that the upcoming 4.0 release?

Correct, version 4.0 is based on illumos (they are using the Illumian distro). As for when it's coming out, I hear varying information, but mostly "soon".
 
Folks,

A couple of questions. Solaris 11 is $1k a year for support, and as much as I dislike Oracle, having support for the OS to be used for an enterprise ZFS deployment makes a lot of sense, at least right now I think it does. :)

You could look into Nexenta if you want a paid solution that doesn't pay Larry a cent. This is economical for smaller production deployments because you pay by TB used. napp-it on top of OpenIndiana/Solaris has no storage limit per dollar, so it is a good option for some larger datasets. For backup/secondary storage needs I think it is a great option.

I'm sure Gea wouldn't mind if you donated $1k a year to him though ;)
 
Latent,

Thanks for your reply! Do you know anything about OpenSolaris and DTrace GUI analytics, or if anything like that exists with napp-it?

I can't find anything on napp-it that says one way or the other, so I assume no?

Thanks!
 
Latent,

Thanks for your reply! Do you know anything about OpenSolaris and DTrace GUI analytics, or if anything like that exists with napp-it?

I can't find anything on napp-it that says one way or the other, so I assume no?

Thanks!

Sorry, I know very little about dtrace. I think most of the dtrace functionality falls outside of the core of what napp-it does. napp-it just gives you a simple web interface to manage the ZFS storage of a machine and to share it out via NFS/iSCSI etc. A machine only being used as a storage server has little need for the advanced features of dtrace, I would think. I would look into Joyent's SmartOS, which does use dtrace.
 
Lately I'm having problems with jobs...

The job that should send me the status report hangs... When I check in napp-it, it says processing, but that can last for days, until I stop it. After that, it sends one report successfully but the next day it hangs again. If I run it manually from napp-it, I can send as many of them as I want and all are successful.

What could that be and where can I check for problems?

nappit: 0.7l
OI 151

Matej
 
I've tried to update OpenIndiana, but ran out of disk space. OpenIndiana crashed and won't boot anymore. I can only boot to a prompt.

I've given OpenIndiana more disk space in vSphere ESXi, but that didn't work. Can I delete some old snapshots from the OI command line? The ones napp-it generates when installing? Please help. Other servers are down as well because their VMs are stored on OI.

:(
 
I've tried to update OpenIndiana, but ran out of disk space. OpenIndiana crashed and won't boot anymore. I can only boot to a prompt.

I've given OpenIndiana more disk space in vSphere ESXi, but that didn't work. Can I delete some old snapshots from the OI command line? The ones napp-it generates when installing? Please help. Other servers are down as well because their VMs are stored on OI.

:(

sudo beadm destroy snapname
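
If you are not sure of the names, something like this should show what can be removed (beadm list -s lists the snapshots of your boot environments, the zfs command shows all snapshots and how much space each one holds; the actual names napp-it created will differ on your box):

beadm list -s
zfs list -t snapshot -o name,used -s used

Old boot environments themselves can be removed the same way: beadm list, then sudo beadm destroy <bename>.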
 
Lately I'm having problems with jobs...

The job that should send me the status report hangs... When I check in napp-it, it says processing, but that can last for days, until I stop it. After that, it sends one report successfully but the next day it hangs again. If I run it manually from napp-it, I can send as many of them as I want and all are successful.

What could that be and where can I check for problems?

nappit: 0.7l
OI 151

Matej

Status=processing is ok for alert and status jobs once they have been started.
There is also no difference in actions between manual and timer start.

I would try to disable auto (delete the cronjob) and re-enable (recreate the cronjob),
or delete the job and recreate it.
 
Hi _Gea,
I have a TLS alert job to email a gmail account that has been sending alert emails every day even though it says disk errors: none. This has been a problem ever since I started using the job in napp-it 0.5. I've tried deleting and re-adding the job without success. Could you look into it? I'm currently running 0.7l on Solaris 11.

Besides disk errors, you get alerts if a disk is more than 85% filled.
Could that be the reason?
 
Is napp-it the only GUI-based solution to manage Solaris 11 and ZFS? Are there any built-in GUI management tools in Solaris 11 to handle ZFS? Is using Solaris 11 as the OS for an enterprise-class NAS/NFS deployment silly?

Do the DTrace analytics exist in Solaris 11 (web based) like they do in the Oracle 7000 series of appliances?

1.
I have not seen any other GUI management tools for Solaris 11.
A basic web GUI was only available in older OpenSolaris versions.

2.
The storage boxes from Sun/Oracle have extras not included in Solaris 11.
 
There is a bug in the comstarlib.pl file, in the ext_iscsi_target_comma_list subroutine, that breaks several of the menu options under Comstar.

To fix it, replace the regular expression on the line that reads

$t=~s/ .*//; #remove end statement

with this one

$t=~s/(\t.*)|( .*)//; #remove end statement

Thanks, I will check.
 
@_Gea,

Does napp-it work on the Illumian build mentioned earlier on this page (post #2806)?
 
I've been running my Solaris NAS for a while now and everything has been great, but I just tried to connect with ssh (password or public key) and it just sits at the screen showing the last login...

I'm running napp-it 0.7l nightly Feb.23.2012 on Solaris 11

everything else works perfectly; I can access my files, run napp-it, etc...

Any ideas?

I've tried restarting the ssh service with svcadm and napp-it, but it didn't seem to do anything.

edit: I enabled allow root login and I can log in as root via ssh, but none of my other user accounts can log in.

when logged in as root I keep getting this message in the console:

cron[2599]: [ID 574227 user.alert] Solaris_audit invalid audit flag lo\:no: Invalid argument

edit2: finger as root reports my other user account is logged in, but I just don't get a prompt.

Any time I click a link in napp-it I get more audit errors in the root console.

edit3:

getting this error in /var/mail/root:

Use of uninitialized value in subroutine entry at /usr/perl5/5.12/lib/i86pc-solaris-64int/DynaLoader.pm line 223.

edit4: I renamed /etc/profile to /etc/profile.bak and now I can log in, so something is messed up in my default profile but I'm not sure what.
here is my /etc/profile: http://pastebin.com/6TSQTtSK
 
- the perl info is only a warning, should not happen with current napp-it
- the audit error is also not a problem, but I have no idea how to suppress it
- you should be able to ssh as any regular user (created during installation or via CLI)
and manage the box via sudo
 
A couple of minor problems with napp-it 0.7l Comstar so far.
1. An iSCSI target is created successfully, but when I attempt to delete or edit it, napp-it says no iSCSI target found. I managed to delete it manually using the following commands.
stmfadm offline-target iqn.2010-09.org.openindiana:02:b25cd8d6-62eb-ec5e-f753-f61034bc6cdc
itadm delete-target iqn.2010-09.org.openindiana:02:b25cd8d6-62eb-ec5e-f753-f61034bc6cdc

2. Minor GUI issue.
2a. Volumes should be under their own menu rather than classified under "POOL".
2b. The menu should be in the sequence Disks, Pools, ZFS Folders, Volumes, as this follows the workflow: I would create a ZFS folder before creating a volume.

3. I created a couple of volumes and LUs/targets. After a reboot the targets weren't there. I had to recreate them. Not sure where the problem lies...

4. If I install napp-it on an OpenIndiana server, I can see the commands napp-it passes across, which helps with troubleshooting, but with napp-it on an OpenIndiana workstation (GNOME) I can't see those commands. Is there any way to see the commands being passed across?
 
Besides disk errors, you get alerts if a disk is more than 85% filled.
Could that be the reason?

No, I'm under that threshold. It is a disk error alert. (Also, it would be awesome if you could separate the disk space alerts from the disk failure alerts.) Let me know if there are any logs I can provide that would be of help.
 
Good day,

Gea: Thank you very much for your fantastic front end and for all the work you have put into it.

I am running napp-it 0.7l nightly Feb.23.2012 on OpenIndiana 151

Still just playing with it before I move from Freenas 8 to OI.

I have the Autojob set to every 1 minute.
I have the Alert email set to "every" in all the fields. (in other words, every minute.)

If I pull a drive out of the Raidz1 pool, after a few minutes, I can see the degraded status in Napp-It / Pools.

Unfortunately it takes about 27 hours AFTER pulling it (leaving the pool in degraded status) before I get an alert email.

I have re-created the job several times before and given up after an hour waiting for the alert email to arrive. Manually executing the job works OK every time and the SMTP test works every time.

There is only a 5 min delay between sending email and receiving it from my ISP.

The Alert email only arrived after I left the pool in a degraded status and waited..... for 27 hours.

Is there anything I can do to fix this to get an email sent ASAP (1 min or even 15 mins) after failures?

Thank You


EDIT: The alert email headers show that the email was sent to my ISP at 21:31 on 11-03-2012. I failed the drive and degraded the pool at 18:15 on 10-03-2012, which promptly showed in the Pool-Status about 3~5 mins after pulling the drive.
 
No, I'm under that threshold. It is a disk error alert. (Also, it would be awesome if you could separate the disk space alerts from the disk failure alerts.) Let me know if there are any logs I can provide that would be of help.

Look at the jobfile in /var/web-gui/data/napp-it/_log/jobs/"jobid".pl
The trigger is a zpool status that returns degraded or unavail.

Before sending alerts, it checks if there is a file ../_log/auto_mail_alert1.log.
When this file is not there, a mail is sent.

After sending, it creates the logfile ../_log/auto_mail_alert1.log to prevent multiple alerts per day.
This logfile is deleted prior to all actions if it is older than one day.

Time of alerts:
If a disk is removed, this does not necessarily affect zpool status immediately.
You need a disk access to trigger a failure in zpool status.


For tests, you can comment out lines in this script.
If you need a resend of mails, you may comment out the logfile check or delete the logfiles
auto_mail_alert1.log and auto_mail_alert2.log (capacity error).


PS:
I will add the option to select alert reasons in a future version.
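
The logic of the generated jobfile boils down to something like this (a simplified sketch only, not the actual napp-it code; send_alert_mail is a placeholder):

# simplified sketch of an alert jobfile, not the real napp-it code
my $log = "/var/web-gui/data/napp-it/_log/auto_mail_alert1.log";

# the logfile is deleted prior to all actions if it is older than one day
unlink($log) if (-f $log && -M $log > 1);

# trigger: zpool status reporting DEGRADED or UNAVAIL
my $status = `zpool status`;
if ($status =~ /DEGRADED|UNAVAIL/) {
    if (! -f $log) {                        # no alert was sent within the last day
        # send_alert_mail($status);         # placeholder for the real mail routine
        open(my $fh, '>', $log) or die $!;  # create the logfile to block further alerts today
        print $fh "--alert " . time() . "\n";
        close($fh);
    }
}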
 
Look at the jobfile in /var/web-gui/data/napp-it/_log/jobs/"jobid".pl
The trigger is a zpool status that returns degraded or unavail.

Before sending alerts, it checks if there is a file ../_log/auto_mail_alert1.log.
When this file is not there, a mail is sent.

After sending, it creates the logfile ../_log/auto_mail_alert1.log to prevent multiple alerts per day.
This logfile is deleted prior to all actions if it is older than one day.

Time of alerts:
If a disk is removed, this does not necessarily affect zpool status immediately.
You need a disk access to trigger a failure in zpool status.


For tests, you can comment out lines in this script.
If you need a resend of mails, you may comment out the logfile check or delete the logfiles
auto_mail_alert1.log and auto_mail_alert2.log (capacity error).


PS:
I will add the option to select alert reasons in a future version.

Thanks for the info Gea. Here is the zpool status I have:

pool: data1
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scan: scrub repaired 0 in 9h13m with 0 errors on Mon Mar 5 15:13:06 2012

(I want to keep this pool at v28 for portability reasons).

Here is the content of the log file:
--alert 1321948776 at . :

Alert/ Error on rse-server from :

-disk errors: none

------------------------------
zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
rpool 232G 33.5G 199G 14% 1.00x ONLINE -
rpool2 55.5G 206K 55.5G 0% 1.00x ONLINE -
data1 14.5T 11.1T 3.44T 76% 1.00x ONLINE -
 
Thanks for the info Gea. Here is the zpool status I have:

pool: data1
state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on older software versions.
scan: scrub repaired 0 in 9h13m with 0 errors on Mon Mar 5 15:13:06 2012

(I want to keep this pool at v28 for portability reasons).

Here is the content of the log file:
--alert 1321948776 at . :

Alert/ Error on rse-server from :

-disk errors: none

------------------------------
zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
rpool 232G 33.5G 199G 14% 1.00x ONLINE -
rpool2 55.5G 206K 55.5G 0% 1.00x ONLINE -
data1 14.5T 11.1T 3.44T 76% 1.00x ONLINE -


Is this an old job from 0.5 or have you deleted/recreated it with 0.7?
If not, try that.
 
So I'd like to run an OI+Napp-IT box serving SMB for an Active Directory. This in itself is working fine, except OI+Napp-IT+LDAP doesn't appear to support OSX Lion (or vice versa). All the Windows machines work fine and the Linux workstations work fine, but the OSX boxes act strangely... First they prompt for a username and password as expected, they appear to authenticate fine (bringing up the list of available SMB shares), and then when you pick one to mount, it mounts but won't let you use it, as it says you don't have permission. The user(s) in question have specific "Full" permissions set via Windows Explorer from a Windows workstation with the user mapped to root.

I just tested the exact same configuration on Nexenta and it appears to work fine with OSX Lion.

Anyone seen this and have a reasonable workaround for OI+Napp-IT? I've got as mixed an environment as you can have and need everything to play happily with access restrictions in place from AD.
 
So I'd like to run an OI+Napp-IT box serving SMB for an Active Directory. This in itself is working fine, except OI+Napp-IT+LDAP doesn't appear to support OSX Lion (or vice versa). All the Windows machines work fine and the Linux workstations work fine, but the OSX boxes act strangely... First they prompt for a username and password as expected, they appear to authenticate fine (bringing up the list of available SMB shares), and then when you pick one to mount, it mounts but won't let you use it, as it says you don't have permission. The user(s) in question have specific "Full" permissions set via Windows Explorer from a Windows workstation with the user mapped to root.

I just tested the exact same configuration on Nexenta and it appears to work fine with OSX Lion.

Anyone seen this and have a reasonable workaround for OI+Napp-IT? I've got as mixed an environment as you can have and need everything to play happily with access restrictions in place from AD.

This is originally a Lion bug, but it is already fixed in Illumos.
https://www.illumos.org/issues/1718

The bugfix is included in current Nexenta, Illumian and OpenIndiana 151a1+ (and Solaris 11).
You need to update OI: http://wiki.openindiana.org/oi/oi_151a_prestable1+Release+Notes
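
If the publisher is already pointing at the right repository (see the release notes above for the exact setup), the update itself is roughly:

pfexec pkg update

pkg update normally creates and activates a new boot environment; reboot into it afterwards.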
 
Gea,

Is it possible to get Napp-It to delete the auto_mail_alert1.log and auto_mail_alert2.log files when we make changes to the email Alert/Status jobs? Or, if the alarm reason is cleared (e.g. disk replaced, pool healthy), to delete auto_mail_alert1.log in case another alarm is triggered?

That way when we make changes/test, the Alerts will arrive promptly.

I think my problem was that auto_mail_alert1.log and auto_mail_alert2.log were still there from previous testing when I made changes, so Napp-It did not send any emails when I failed my drives in the latest testing until the 24hr timer expired.

Also, what happens if a drive is restored, the drive is resilvered and the pool is healthy, and then another drive fails in the same 24 hour period?

From what I read, we won't get any alert email until the next day? Is that correct?


Thank You
 
I'm looking for a documentation manual for Illumian. Will the OI wiki apply to Illumian as well? I'm considering the old OpenSolaris Bible but I'm not sure that's sufficient for today's fork.
 
The OpenIndiana wiki links back to Oracle/Solaris documentation quite a bit. I would wager most of the OpenSolaris documentation is still 100% valid.
 
Gea,

Is it possible to get Napp-It to delete the auto_mail_alert1.log and auto_mail_alert2.log files when we make changes to the email Alert/Status jobs? Or, if the alarm reason is cleared (e.g. disk replaced, pool healthy), to delete auto_mail_alert1.log in case another alarm is triggered?

That way when we make changes/test, the Alerts will arrive promptly.

I think my problem was that auto_mail_alert1.log and auto_mail_alert2.log were still there from previous testing when I made changes, so Napp-It did not send any emails when I failed my drives in the latest testing until the 24hr timer expired.

Also, what happens if a drive is restored, the drive is resilvered and the pool is healthy, and then another drive fails in the same 24 hour period?

From what I read, we won't get any alert email until the next day? Is that correct?
Thank You

Current behaviour of new alert jobs in napp-it 0.7:
The logfiles (which prevent multiple alerts per day) are created after an alert mail has been sent.

The logfiles are deleted after one day or after a check that reports no failure.
(You get new alerts immediately if you repair a disk and another disk fails.)

If you would like a different behaviour, you can edit or duplicate the jobfiles manually with
minimal scripting knowledge.
 
Current behaviour of new alert jobs in napp-it 0.7:
The logfiles (which prevent multiple alerts per day) are created after an alert mail has been sent.

The logfiles are deleted after one day or after a check that reports no failure.
(You get new alerts immediately if you repair a disk and another disk fails.)

If you would like a different behaviour, you can edit or duplicate the jobfiles manually with
minimal scripting knowledge.

Thank you kindly for the reply and clarifying that.

The default behaviour is perfect for my needs.
 
Hello again Gea,

I'm having trouble with my OI box reconnecting to my AD server. If I remove its entry in the directory and attach it, it works, but upon reboot it never connects to the directory again and won't until I remove the entry in the directory and manually join it again. The only thing I see in the logs is this, over and over:

smbd[769]: [ID 702911 daemon.notice] smbd_dc_update: aja.com: located

However it never actually connects and AD login attempts against SMB shares fail.

Any ideas?

Thanks!
 
Hello again Gea,

I'm having trouble with my OI box reconnecting to my AD server. If I remove its entry in the directory and attach it, it works, but upon reboot it never connects to the directory again and won't until I remove the entry in the directory and manually join it again. The only thing I see in the logs is this, over and over:

smbd[769]: [ID 702911 daemon.notice] smbd_dc_update: aja.com: located

However it never actually connects and AD login attempts against SMB shares fail.

Any ideas?

Thanks!

Did you use lmauth-level 3 with OI 151a1?
On former OI versions you mostly need level 2.

Have you tried different levels?
With napp-it 0.7l, level 3 is now the default in menu Service - SMB - Active Directory,
where you can also try other levels.
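
You can also check or change the level at the CLI (this assumes the illumos/Solaris kernel SMB server, where the property is called lmauth_level):

sharectl get -p lmauth_level smb
sharectl set -p lmauth_level=3 smb
svcadm restart smb/server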
 
Hello again Gea,

I'm having trouble with my OI box reconnecting to my AD server. If I remove its entry in the directory and attach it, it works, but upon reboot it never connects to the directory again and won't until I remove the entry in the directory and manually join it again. The only thing I see in the logs is this, over and over:

smbd[769]: [ID 702911 daemon.notice] smbd_dc_update: aja.com: located

However it never actually connects and AD login attempts against SMB shares fail.

Any ideas?

Thanks!

duplicate post
 