OpenSolaris derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

mikeo

Gawd
Joined
May 17, 2006
Messages
660
On OpenIndiana 151, how do I replace disks in a pool set to ashift=9 with 4K-sector disks? Is there a way to have the 4K disks emulate 512-byte ones so I can add them to the ashift=9 pool? It seems that -o ashift=9 doesn't exist in the zpool replace command in this version.
 

markrhf

n00b
Joined
Nov 15, 2013
Messages
4
For some reason, my snapshot jobs aren't running after updating to 0.9f1 on OmniOS v11 r151008 under ESXi 5.5. If I activate 0.9e1 again, everything works fine. I get the same thing if I click "run now": with f1 the job fails to run, but with e1 it runs fine.

I've rebooted the server after each version change. Is there anything specific I can look for to see why the jobs aren't working? I tried creating new snapshot jobs and they won't run either. Also, under f1 the job log files don't update, while they do under e1.

Thanks!
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
I have checked Omni 1008 with napp-it 0.9f1 (Aug 18.)
Autosnap works without a problem on my setup

If you have an older 0.9f1 (for example a preview), update to or download the newest one.

What you can do:
Start the snap job from the CLI (for example with PuTTY, where you can copy and paste commands with a right mouse click):

perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/job-snap.pl run_jobid
Example (with the job ID from the job menu):
perl /var/web-gui/data/napp-it/zfsos/_lib/scripts/job-snap.pl run_1408958237

There is no return value on success but maybe you get an error that helps.
You can also check the snap-scripts if you have some basic scripting knowledge.
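
A minimal sketch of the manual run described above, assuming a POSIX shell with perl on the PATH. The script path is the one given in this post; the wrapper name run_snap_job and the log location under /tmp are made up for illustration. Since the script prints nothing on success, the wrapper saves stdout and stderr to a file and reports the exit status:

```shell
#!/bin/sh
# Script path as given in the post above; override JOB_SCRIPT if yours differs.
JOB_SCRIPT="${JOB_SCRIPT:-/var/web-gui/data/napp-it/zfsos/_lib/scripts/job-snap.pl}"

# run_snap_job JOBID [LOGFILE]
# Hypothetical wrapper: runs one snap job by hand and saves everything the
# script prints, so an error message is not lost when the terminal scrolls.
run_snap_job() {
    jobid="$1"
    log="${2:-/tmp/job-snap-$jobid.log}"
    if perl "$JOB_SCRIPT" "run_$jobid" >"$log" 2>&1; then
        status=0
    else
        status=$?
    fi
    echo "exit status: $status (output saved to $log)"
    return "$status"
}

# Usage, with the job ID from the example above:
#   run_snap_job 1408958237
```

Comparing the saved output of a run under 0.9e1 with one under 0.9f1 may show where the two versions diverge.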


ps
with current napp-it (0.9e1+) you do not need to reboot on update/downgrade
 

Lipe

n00b
Joined
May 4, 2010
Messages
29
Hey _Gea, can I ask for some advice?

I'm a longtime user of Nexenta Community, but my new pool exceeds their 18 TB policy, so I had to migrate to a new OS.

After reading a lot, I chose OmniOS + NappIT, which we are going to upgrade to PRO as soon as the trial expires, but my first benchmark wasn't that great. I thought Nexenta, which is known for its UI CPU usage, would have the lowest performance of all the distributions.

Nexenta Community 3.1.5 - Bonnie Benchmark:
edit


OmniOS + NappIT (both using latest versions) - Bonnie Benchmark:
edit


Any tips? Maybe some optimization parameters are needed? I also noticed a big difference in CPU usage.

Machine configuration, which I know lacks some memory:

Xeon 3440
Supermicro X8SIL-F
12 GB DDR3 ECC memory
8x4TB WD Red Disks

Best regards,
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
hello Lipe

It is hard to compare NexentaStor 3, which is based on OpenSolaris 134 with its own optimizations, with OpenSolaris 148+, which is the base of Illumos and therefore of NexentaStor 4 and OmniOS.

Mostly I would not expect a huge difference regarding performance between them - at least with SMB and NFS when settings like compress, dedup and sync are disabled. GUI performance is not relevant to storage performance.
 

markrhf

n00b
Joined
Nov 15, 2013
Messages
4
Thanks Gea,

I ran all jobs from the command line and they worked fine so I'll do some digging to see why it won't run as a job or from the web interface. It's not urgent so when I get time I might just do a fresh install and export/import the pool.

One thing I just noticed is that under 0.9f1 I am unable to delete jobs, even ones that were created under f1. Time to try a fresh install!
 

Lipe

n00b
Joined
May 4, 2010
Messages
29
_Gea said:
hello Lipe

It is hard to compare NexentaStor 3, which is based on OpenSolaris 134 with its own optimizations, with OpenSolaris 148+, which is the base of Illumos and therefore of NexentaStor 4 and OmniOS.

Mostly I would not expect a huge difference regarding performance between them - at least with SMB and NFS when settings like compress, dedup and sync are disabled. GUI performance is not relevant to storage performance.
Hi mate, thanks for the feedback.

Let me know if you have any advice. The pool was created on a NexentaStor 4 machine and imported on OmniOS+NappIT. Not sure if that can be an issue; maybe the disks were misformatted.

In the past we had a lot of trouble with WD40 disks and ashift alignment; not sure if this still applies.

Congrats on the great work with this GUI. It exceeded all my expectations.

Best regards
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
If you import the pool and use the same settings (dedup, compress)
on the same hardware the bonnie results should be comparable.

What you can do:
- check iostat to see whether one disk performs worse than the others
- disable background acceleration and realtime monitoring in napp-it for benchmarks, as they add some extra load (top menu: mon, acc)
 

enigmah

n00b
Joined
May 13, 2010
Messages
23
I have problems installing OmniOS and napp-it on the following system:
Supermicro Atom/Rangeley A1SRi-2758F
IBM 1015/LSI 9211-8i IT
SSD Bootdisk on port SATA0 of the mainboard
Installation is done by IPMI.

omnios r151010, omnios-8c08411 gets stuck while loading the installer from the ISO image:

Code:
usb_mid0 is /pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1
/pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1 (usb_mid0) online
USB 2.0 device (usbea0,1111) operating at hi speed (USB 2.x) on USB 2.0 external
 hub: storage@2, scsa2usb0 at bus address 5
scsa2usb0 is /pci@0,0/pci8086,7270@16/hub@1/hub@3/storage@2
/pci@0,0/pci8086,7270@16/hub@1/hub@3/storage@2 (scsa2usb0) online
USB 1.10 interface (usbif557,2419.config1.0) operating at low speed (USB 1.x) on
 USB 2.0 external hub: keyboard@0, hid0 at bus address 4
hid0 is /pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1/keyboard@0
sd0 at scsa2usb0: target 0 lun 0
sd0 is /pci@0,0/pci8086,7270@16/hub@1/hub@3/storage@2/disk@0,0
/pci@0,0/pci8086,7270@16/hub@1/hub@3/storage@2/disk@0,0 (sd0) online
/pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1/keyboard@0 (hid0) online
USB 1.10 interface (usbif557,2419.config1.1) operating at low speed (USB 1.x) on
 USB 2.0 external hub: mouse@1, hid1 at bus address 4
hid1 is /pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1/mouse@1
/pci@0,0/pci8086,7270@16 (ehci0): Low speed endpoint’s poll interval of 5 ms is
below threshold. Rounding up to 8 ms
/pci@0,0/pci8086,7270@16 (ehci0): Low speed endpoint’s poll interval of 5 ms is
below threshold. Rounding up to 8 ms
/pci@0,0/pci8086,7270@16 (ehci0): Low speed endpoint’s poll interval of 5 ms is
below threshold. Rounding up to 8 ms
/pci@0,0/pci8086,7270@16/hub@1/hub@3/device@1/mouse@1 (hid1) online

Something is wrong with the usb devices but I don't know what.
So I decided to try omnios r151011, omnios-8e364a8 bloody to see if the error occurs there.
It didn't occur and I could finish installation.
But afterwards during the login a serious error came up:

Code:
Loading smf(5) service descriptions: 1/1
sesf console login: root
SUNW-MSG-ID: PCIEX-8000-KP, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Sep 3 22:00:38 CEST 2014
PLATFORM: A1SAi, CSN: 123456789, HOSTNAME: sesf
SOURCE: eft, REV: 1.16
EVENT-ID: d3d18d3a-1f76-cfac-f150-f0be72e7e8d9
DESC: Too many recovered bus errors have been detected, which indicates a problem with the specified bus or with the specified transmitting device. This may degrade into an unrecoverable fault.
 Refer to http://illumos.org/msg/PCIEX-8000-KP for more information.
AUTO-RESPONSE: One or more device instances may be disabled

IMPACT: Loss of services provided by the device instances associated with this fault

REC-ACTION: If a plug-in card is involved check for badly-seated cards or bent pins. Otherwise schedule a repair procedure to replace the affected device. Use fmadm faulty to identify the device or contact Sun for support.

Sep 3 22:00:38 sesf genunix: NOTICE: Device: already retired: /pci@0,0/pci8086,1f12@3/pci1000,3020@0

Password:

Because of the U-NAS NSC-800 case I have to use a PCIe extension cable. That seems to be causing problems, right?
Anyway, I tried to install napp-it and ended up with another error during the "omni 1.6 install gcc48" part of the script:

Code:
ld.so.1: rm: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/rm)
ld.so.1: rm: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 99: 690: Killed
ld.so.1: ln: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/ln)
ld.so.1: ln: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 100: 691: Killed
ld.so.1: rm: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/rm)
ld.so.1: rm: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 102: 692: Killed
ld.so.1: ln: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/ln)
ld.so.1: ln: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 103: 693: Killed
ld.so.1: rm: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/rm)
ld.so.1: rm: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 105: 694: Killed
ld.so.1: ln: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/ln)
ld.so.1: ln: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 106: 695: Killed
ld.so.1: rm: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/rm)
ld.so.1: rm: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 108: 696: Killed
ld.so.1: ln: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/ln)
ld.so.1: ln: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 109: 697: Killed

ld.so.1: tee: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/tee)
ld.so.1: tee: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 115: 699: Killed
---------------------------------

ld.so.1: tee: fatal: libc.so.1: version 'ILLUMOS_0.8' not found (required by file /usr/gnu/bin/tee)
ld.so.1: tee: fatal: libc.so.1: open failed: No such file or directory
./nappit2.sh: line 120: 701: Killed
---------------------------------

These errors occur throughout the rest of the napp-it script.
After the script finished, none of the "basic" commands (ls, rm, cp, etc.) worked.

There are so many things going wrong that I don't know where and how to start with fixing them.
Please help!
 

parnassus

n00b
Joined
Sep 4, 2014
Messages
1
Has anyone noticed how the OmniOS installer formats the root pool (rpool)? I mean which settings it uses when it creates the pool and its datasets...

I discovered I had very little free space on an 8 GB USB3 key after OmniOS 151010s was installed: the napp-it installer completed, but it was unable to install gcc48 and various other dependencies (needed to compile smartmontools 6.3), so the system I'm testing lacked smartmontools.

The point is that during the OmniOS install, the OI installer automatically creates (1) a kernel crash dump filesystem (which is questionable) and (2) a swap filesystem (which is normal). Both have large preset sizes, so together they almost fill the 8 GB USB key.

I removed rpool/dump and recovered almost 2 GB... only then was I able to install gcc48 and compile smartmontools as per the napp-it install script.
 

calvinbui

n00b
Joined
Apr 2, 2011
Messages
31
I'm finding that working with ACLs is difficult for even the simplest tasks. It's easy when there is a blank filesystem and I begin adding files, but the problems started when I added a new user to the ACL. I am mainly using ACLs on folders.

Right now I have a file system with 10 folders in it.
I just want users to be able to read and write, and guests to only read.
Seems easy enough, except for the problems I run into:

  • Resetting ACLs is the only way to get every folder to have the same ACL settings, I can't just set it in one place for the whole file system. That's going to take forever if I have to go through every single folder to set the same ACL.
  • Users cannot delete folders created by other users, even though they have 'full_set' and are higher on the ACL list.

ACL is driving me nuts :mad:

EDIT: I've read through some earlier posts by Gea. Doesn't seem I can have credentials + guest access at the same time. There goes that I guess - sorry friends you will only have FTP and WWW read access.
 

enigmah

n00b
Joined
May 13, 2010
Messages
23
I have new insights regarding my installation problems on Supermicro A1SRi-2758F:

1. omnios r151011, omnios-8e364a8 2014-07-23 bloody (name of the ISO; after installation the version is omnios-6962d80) does not work with the current napp-it script 0.9f1. It breaks during the "omni 1.6 install gcc48" part of the script (see the description in my previous post). I verified that with a clean VMware installation.
2. The PCIe extender cable does indeed seem to be broken. I checked with the LSI 9211-8i inserted directly into the motherboard slot.
3. The boot error of omnios r151010, omnios-8c08411 remains. No idea what to do.
 

Synthetickiller

Limp Gawd
Joined
Apr 5, 2009
Messages
285
I had a mishap with my ESXi 5.1 hypervisor and lost my OpenIndiana VM.

I had napp-it running a RAID-Z. The RAID card and drives are intact. Is there a way for a new install of napp-it to read the file system on the drives (and rebuild the array), or is it a lost cause?
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,711
napp-it wouldn't do that, the OS would. re-install OI and import the pool. should 'just work'.
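
A minimal sketch of that import step, assuming the OS has been reinstalled and the disks are visible to it. `zpool import` and its `-f` flag are standard; the pool name "tank", the helper names and the ZPOOL override variable are placeholders for illustration:

```shell
#!/bin/sh
# ZPOOL can be overridden (e.g. ZPOOL=echo) to preview the commands.
ZPOOL="${ZPOOL:-zpool}"

# List pools found on the attached disks that are not imported yet.
list_importable_pools() {
    "$ZPOOL" import
}

# import_pool NAME [force]
# Import a pool by name. A pool that was never exported from the old
# system is usually reported as in use there; passing "force" adds -f.
import_pool() {
    if [ "$2" = "force" ]; then
        "$ZPOOL" import -f "$1"
    else
        "$ZPOOL" import "$1"
    fi
}

# Usage (the pool name "tank" is a placeholder):
#   list_importable_pools
#   import_pool tank force
```

After the import, `zpool status` shows whether all disks of the raidz were found.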
 

Freak1

Limp Gawd
Joined
Sep 9, 2009
Messages
191
Hi.

I need some help. My Napp-it 13B keeps crashing. I get this error:

error.jpg


It seems like some of the drives are timing out, right?


I also tried to import the pool on a new install, but then I get this error:

Vendor: ATA ,Product: Hitachi HUA72303 ,Revision: A800 ,Serial No: xxxxxxxxxxxxxxxxxxx
Size: 3000.59GB <3000592982016 bytes>
,Media Error: 0 ,Device Not Ready: 0 ,No Device: 0 ,Recoverable: 0
Illegal Request: 0 ,Predictive Failure Analysis: 0 p1main, /_lib/illumos/zfslib.pl, line 4627
exe: zpool import

When I restart, I sometimes get this at the console. Other times it starts but crashes later.

startup.jpg


The setup is like this:
MBD-X8DAH+-F -B, 2x Intel Xeon E5606, 6x Kingston ECC Reg, 4 GB (24 GB) 2x Intel 510 120 GB + 2x1TB Seagate RE2 (RAID 10 For ESXi and VMs ), Adaptec 6405, LSI SAS9201 PCIE 16 PORT INT, 48x Hitachi 3TB 24 in external enclosure
Napp-it 13B on ESXi 5.0 with OmniOS
 

Freak1

Limp Gawd
Joined
Sep 9, 2009
Messages
191
I have now moved it to a new napp-it 14b install, but I still get the same error and the server goes offline.

I tried changing all cables; still the same. I also changed the enclosure/backplane. In Disks hotswap, all disks are listed as online, but the log still says degraded.

Sep 9 11:44:31 napp-it-14b genunix: [ID 408114 kern.info] /scsi_vhci/disk@g5000cca225c60785 (sd2) online
Sep 9 11:44:31 napp-it-14b genunix: [ID 483743 kern.info] /scsi_vhci/disk@g5000cca225c60785 (sd2) multipath status: degraded: path 1

mpt_sas2/disk@w5000cca225c60785,0 is online
Sep 9 11:44:32 napp-it-14b scsi: [ID 583861 kern.info] sd3 at scsi_vhci0: unit-address g5000cca225c454b2: f_sym
Sep 9 11:44:32 napp-it-14b genunix: [ID 936769 kern.info] sd3 is /scsi_vhci/disk@g5000cca225c454b2
Sep 9 11:44:32 napp-it-14b genunix: [ID 408114 kern.info] /scsi_vhci/disk@g5000cca225c454b2 (sd3) online
Sep 9 11:44:32 napp-it-14b genunix: [ID 483743 kern.info] /scsi_vhci/disk@g5000cca225c454b2 (sd3) multipath status: degraded: path 2

mpt_sas2/disk@w5000cca225c454b2,0 is online
Sep 9 11:44:33 napp-it-14b scsi: [ID 583861 kern.info] sd4 at scsi_vhci0: unit-address g5000cca225c4570f: f_sym
Sep 9 11:44:33 napp-it-14b genunix: [ID 936769 kern.info] sd4 is /scsi_vhci/disk@g5000cca225c4570f
Sep 9 11:44:33 napp-it-14b genunix: [ID 408114 kern.info] /scsi_vhci/disk@g5000cca225c4570f (sd4) online
Sep 9 11:44:33 napp-it-14b genunix: [ID 483743 kern.info] /scsi_vhci/disk@g5000cca225c4570f (sd4) multipath status: degraded: path 3

mpt_sas2/disk@w5000cca225c4570f,0 is online
Sep 9 11:44:34 napp-it-14b scsi: [ID 583861 kern.info] sd5 at scsi_vhci0: unit-address g5000cca225c45818: f_sym
Sep 9 11:44:34 napp-it-14b genunix: [ID 936769 kern.info] sd5 is /scsi_vhci/disk@g5000cca225c45818
Sep 9 11:44:34 napp-it-14b genunix: [ID 408114 kern.info] /scsi_vhci/disk@g5000cca225c45818 (sd5) online
Sep 9 11:44:34 napp-it-14b genunix: [ID 483743 kern.info] /scsi_vhci/disk@g5000cca225c45818 (sd5) multipath status: degraded: path 4

mpt_sas2/disk@w5000cca225c45818,0 is online
Sep 9 11:44:35 napp-it-14b scsi: [ID 583861 kern.info] sd6 at scsi_vhci0: unit-address g5000cca225c4558a: f_sym
Sep 9 11:44:35 napp-it-14b genunix: [ID 936769 kern.info] sd6 is /scsi_vhci/disk@g5000cca225c4558a
Sep 9 11:44:35 napp-it-14b genunix: [ID 408114 kern.info] /scsi_vhci/disk@g5000cca225c4558a (sd6) online
Sep 9 11:44:35 napp-it-14b genunix: [ID 483743 kern.info] /scsi_vhci/disk@g5000cca225c4558a (sd6) multipath status: degraded: path 5

etc.

pool: Datapool
state: ONLINE
status: The pool is formatted using a legacy on-disk format. The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
pool will no longer be accessible on software that does not support feature
flags.
scan: none requested
config:

NAME STATE READ WRITE CKSUM CAP MODELL
Datapool ONLINE 0 0 0
raidz3-0 ONLINE 0 0 0
c7t5000CCA225C4532Fd0 ONLINE 0 0 0 3 TB Hitachi HUA72303
c7t5000CCA225C453EEd0 ONLINE 0 0 0 3 TB Hitachi HUA72303
c7t5000CCA225C454A1d0 ONLINE 0 0 0 3 TB Hitachi HUA72303
c7t5000CCA225C454B2d0 ONLINE 0 0 0 3 TB Hitachi HUA72303
etc.

errors: No known data errors

pool: rpool
state: ONLINE
scan: none requested



Some files I can copy/read fine; with others the whole thing just stops.
 

Freak1

Limp Gawd
Joined
Sep 9, 2009
Messages
191
The log keeps telling me "Sep 9 15:08:41 napp-it-14b scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@16/pci1000,30c0@0 (mpt_sas0):
Sep 9 15:08:41 napp-it-14b Disconnected command timeout for Target 52"

But what is Target 52?

Update: I found a drive with a solid light and replaced it. It's rebuilding now, so hopefully this will help.
 

Eschertias

n00b
Joined
Apr 7, 2010
Messages
52
_Gea: I have a pretty obnoxious issue I've discovered in the Disk Volume/COMSTAR volume LU creation systems.

When you create a disk volume and specify a size, it ignores "." in the input field. 13.6 TB turns into 136 TB.

EDIT:
The second issue is that the sizing of the zfs Volume and the COMSTAR volume LU are inconsistent. When you create a volume LU, it'll show as 11 TB volume size in the disk volumes page, but show as 11.7 in the COMSTAR volume LU side. When you modify the size of the LU through the napp-it menu, it will shrink the size of the disk volume as well, so you still end up with a volume about 90% the size it should be.

When I try to bitlocker or truecrypt the volume, it gets about 85-90% done then breaks. It breaks because the volume the data is stored on fills up 100%, but the COMSTAR LUN says there is still 1+ TB of data free, which is what windows thinks is happening.

I ended up having to manually modify the size of the COMSTAR LU to be 10TB, then use "zfs set volsize=10.1TB test/test_iscsi". It took me like 4 days to finally figure out what the issue was.

Can you adjust the COMSTAR volume LU create to properly detect the size of the zfs volume? And change the behavior of the LU resize script to avoid this behavior?

I now have ~20 TB of data I have to migrate off the SAN so I can resize the volumes it resided on, then bitlocker them to ensure the data volume is correctly sized, then move the data back. It will literally be a 3 week job.


On the plus side, I love the new real time monitors. They're very very nice. Can we get one for memstat as well?
 

levak

Limp Gawd
Joined
Mar 27, 2011
Messages
386
I would suspect a drive problem. The kern.warning is about the mpt_sas driver, which is a SAS HBA driver, so there has to be something wrong with either a hard drive, the cables or the HBA card.
 

Eschertias

n00b
Joined
Apr 7, 2010
Messages
52
levak said:
I would suspect a drive problem. The kern.warning is about the mpt_sas driver, which is a SAS HBA driver, so there has to be something wrong with either a hard drive, the cables or the HBA card.

I had those same issues, and it turned out to be a 5 year old flaky WD Green drive that was causing the issues. Moved the data to a new set of drives, removed the old ones and the system was just fine.
 

dedobot

Weaksauce
Joined
Jun 19, 2012
Messages
96
Hi friends.
Just to post results from my test lab about upgrading from Solaris 11.1 to Solaris 11.2.

So I prepared a PC with Solaris 11.1, napp-it 0.9f1 + AFP netatalk v3, and Sun Ray software [yes, it is better than VNC] with the Sun Ray admin web GUI.

Via the Solaris upgrade manager I upgraded the system to Solaris 11.2!

Everything still works fine: no problems connecting the Oracle virtual desktop client to the Sun Ray server, napp-it is functioning properly, AFP too. No issues so far.
Of course, if you have working Solaris 11.1 setups in use [like me], there is no benefit in upgrading, but it is good to know.

Have a nice day !
 

Nemesis_001

Weaksauce
Joined
Apr 3, 2011
Messages
69
For some reason napp-it keeps changing my hostname from san to ZFSSAN. I created /etc/hostname.nic and added the correct name. That solved the problem while I was using nwam. Now, without nwam, it has started renaming my host again.

Any idea how I can get rid of it, or why it is doing it?

Thanks

edit:
OK, I'm not sure it's napp-it doing it. There are no more new entries in the host file.
It's driving me crazy :)

edit2:
Nvm. Seems I misspelled something in the hostname.nic, and when I brought the second adapter online it went back.
 

HammerSandwich

[H]ard|Gawd
Joined
Nov 18, 2004
Messages
1,126
Gea, I may have found a weird bug in Napp-it by being dumb.

OmniOS v11 r151010 8c08411 with Napp-it free 0.9f1 Aug.14.2014

  1. I destroyed a pool that had a Comstar share (file-based LU) which I'd forgotten. (The "Info!! Pools with active shares or targets cannot be destroyed" doesn't seem accurate; it destroyed with no warnings.)
  2. I then repurposed the HDs before noticing that...
  3. Comstar status in Napp-it showed a dangling view for the missing LU.
  4. LU status also displayed goofy info on both the main Comstar page as well as in Comstar/Logical Units. All the data were there, but formatting was messed up.
  5. Napp-it would not allow me to delete the dangling view, though this worked from the CLI.

Sorry in advance - I did not run "stmfadm list-lu" manually, nor did I take a screenshot. So, I'm not sure if this is an OmniOS issue or if Napp-it burped when parsing the LU status.
 

Mastaba

Limp Gawd
Joined
Apr 2, 2011
Messages
228
Weird thing: I can't access the "Security" tab from Windows anymore.

It only occurs on the shared folders; on the other folders I can still access the Security tab, but I can't find the "Everyone" user anymore.
Instead I only have root and the mastaba user I created when I couldn't use the root account (issue resolved with another "passwd root" command).

Settings are the same as before (guest=ok abe); I tried to unshare/reshare, restart the SMB service, and reboot the server.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
Freak1 said:
The log keeps telling me "Sep 9 15:08:41 napp-it-14b scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci15ad,7a0@16/pci1000,30c0@0 (mpt_sas0):
Sep 9 15:08:41 napp-it-14b Disconnected command timeout for Target 52"

But what is Target 52?

The napp-it menu "disks-sas2 extension" translates target numbers to disk IDs.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
Mastaba said:
Weird thing: I can't access the "Security" tab from Windows anymore.

It only occurs on the shared folders; on the other folders I can still access the Security tab, but I can't find the "Everyone" user anymore.
Instead I only have root and the mastaba user I created when I couldn't use the root account (issue resolved with another "passwd root" command).

Settings are the same as before (guest=ok abe); I tried to unshare/reshare, restart the SMB service, and reboot the server.

You need full access to modify permissions
(ex: disable guest, login as root)
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
Eschertias said:
_Gea: I have a pretty obnoxious issue I've discovered in the Disk Volume/COMSTAR volume LU creation systems.

When you create a disk volume and specify a size, it ignores "." in the input field. 13.6 TB turns into 136 TB.

EDIT:
The second issue is that the sizing of the zfs Volume and the COMSTAR volume LU are inconsistent. When you create a volume LU, it'll show as 11 TB volume size in the disk volumes page, but show as 11.7 in the COMSTAR volume LU side. When you modify the size of the LU through the napp-it menu, it will shrink the size of the disk volume as well, so you still end up with a volume about 90% the size it should be.

When I try to bitlocker or truecrypt the volume, it gets about 85-90% done then breaks. It breaks because the volume the data is stored on fills up 100%, but the COMSTAR LUN says there is still 1+ TB of data free, which is what windows thinks is happening.

I ended up having to manually modify the size of the COMSTAR LU to be 10TB, then use "zfs set volsize=10.1TB test/test_iscsi". It took me like 4 days to finally figure out what the issue was.

Can you adjust the COMSTAR volume LU create to properly detect the size of the zfs volume? And change the behavior of the LU resize script to avoid this behavior?

I now have ~20 TB of data I have to migrate off the SAN so I can resize the volumes it resided on, then bitlocker them to ensure the data volume is correctly sized, then move the data back. It will literally be a 3 week job.


On the plus side, I love the new real time monitors. They're very very nice. Can we get one for memstat as well?

About the point problem:
see line 85 of the LU creation script
$size=~s/[^0-9]//g;

It allows only digits not a point
I will modify to
$size=~s/[^0-9\.]//g; to allow a point in next release

The size calculation is not that easy, as I do no more than display the output of the corresponding CLI commands.
 

TCM2

Gawd
Joined
Oct 17, 2013
Messages
572
_Gea said:
About the point problem:
see line 85 of the LU creation script
$size=~s/[^0-9]//g;

It allows only digits not a point
I will modify to
$size=~s/[^0-9\.]//g; to allow a point in next release

Uhh, blindly throwing away input is a horrible method of "validation". You should validate input by matching only and throw a message if the input deviates from what's expected.

Seriously...

And about the sizes: The zfs command has a -p switch to get exact byte values.
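
As a rough illustration of the -p switch, the sketch below reads a volume's size as an exact byte count and converts it to TiB for comparison with other tools. `zfs get -Hp -o value volsize` is standard usage; the helper names are made up here, and the dataset name is the one from Eschertias's post:

```shell
#!/bin/sh
# volsize_bytes DATASET: exact volume size in bytes (-p), without header (-H).
volsize_bytes() {
    zfs get -Hp -o value volsize "$1"
}

# bytes_to_tib BYTES: print a byte count in TiB (2^40 bytes), two decimals.
bytes_to_tib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1099511627776 }'
}

# Usage on a system with ZFS:
#   bytes_to_tib "$(volsize_bytes test/test_iscsi)"
```

A decimal terabyte (10^12 bytes) is only about 0.91 TiB, which is one reason to compare raw byte counts rather than rounded values shown in different places.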

People pay money for this?
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
TCM2 said:
People pay money for this?

You do not need to pay for it; this is free.
But honestly, I do not parse any inputs (only check if numeric data is > 0) and calculate any possible outputs.

You may use CLI commands.
 

TCM2

Gawd
Joined
Oct 17, 2013
Messages
572
Eschertias said:
When you create a disk volume and specify a size, it ignores "." in the input field. 13.6 TB turns into 136 TB.

_Gea said:
But honestly, I do not parse any inputs

Right...

This is called input validation and you're not doing it. You take the input, mangle it according to some method unknown to the user and then keep using it. This is the worst programming practice.

Edit: Check for a match with =~ m/^[0-9]+$/ and complain if it doesn't match, anything else is garbage.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
Neither is perfect.
I check for numeric data > 0 and expect an integer

Your suggestion gives a 3. invalid and a 0 valid.

(But you are right: the first part of my parsing only removes "not expected inputs", so the correct approach would be your parsing with && > 0.)
 

TCM2

Gawd
Joined
Oct 17, 2013
Messages
572
No, you are not checking. At all.

I'm currently looking at data_0.9f1/napp-it/zfsos/09_comstar iscsi=-lin/01_volumes/01_create volume/action.pl:

Relevant lines containing $size:

Code:
my ($unit,$size,$block,$vol,$s);

$size=$in{'value_size'};
$size=~s/[^0-9]//g;
if ($size eq "") { &mess("A numeric value is needed for size"); }

$r=&exe("/usr/sbin/zfs create$s -V $size$unit -b $block $vol");

You're simply not checking. You're trashing the input, then look if anything is left and feed that directly into zfs.

My suggestion, BTW, wouldn't match -3. And yes, 0 is a valid number format. You should first check for proper number format with a regex match and then check the number for validity. If you want to include decimals, something like =~ m/^[0-9]+(?:\.[0-9]+)?$/ should work. _Then_ check if the number is > 0.

Are you seriously saying that 'Yes please, make me 1 volume of 10GB size' should be a valid input and should result in a 110$unit volume? That can't be right.
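
The difference between the two approaches can be sketched in a few lines of POSIX shell (napp-it itself is Perl; this is only an illustration, and the function names are made up). strip_non_digits mimics s/[^0-9]//g, while validate_size accepts the same strings as the regex ^[0-9]+(?:\.[0-9]+)?$ and then requires a value greater than zero:

```shell
#!/bin/sh
# Stripping: delete every character that is not a digit, like s/[^0-9]//g.
strip_non_digits() {
    printf '%s' "$1" | tr -cd '0-9'
}

# Matching: accept digits with an optional decimal part, reject the rest.
# The case patterns accept the same strings as ^[0-9]+(\.[0-9]+)?$
validate_size() {
    case "$1" in
        '' | *[!0-9.]*)     # empty, or contains something other than 0-9 and "."
            echo "invalid size: '$1'" >&2; return 1 ;;
        .* | *. | *.*.*)    # leading dot, trailing dot, or more than one dot
            echo "invalid size: '$1'" >&2; return 1 ;;
    esac
    case "$1" in
        *[1-9]*) printf '%s\n' "$1" ;;      # at least one non-zero digit: > 0
        *) echo "size must be greater than 0" >&2; return 1 ;;
    esac
}

# strip_non_digits '13.6'       prints 136
# strip_non_digits 'Yes please, make me 1 volume of 10GB size'   prints 110
# validate_size '13.6'          prints 13.6
# validate_size '3.'            fails with "invalid size"
```

With stripping, a malformed input silently becomes a different number; with matching, it is rejected before anything reaches zfs create.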

Edit: If you had done it this way, entering a decimal wouldn't have worked and this bug report by Eschertias would have been a simple feature request.

Feature request beats bug report in product quality.

Edit2: I'm not trashing napp-it for some simple input validation issue. I'm worried what _other_ bugs there could be, when people are paying for this and trusting their data to it.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
Yes, the check against an empty string is senseless at this point (it is partly a relic of a former single input field like 10G); the unit is now a separate field. The final validation is done in zfs create itself.

Your regex is correct.
 

TCM2

Gawd
Joined
Oct 17, 2013
Messages
572
I would hardly call it validation if you just pass unsanitized input into an external command and rely on _its_ validation.

http://en.wikipedia.org/wiki/Robustness_principle Seems you never heard of it.

I'm giving up at this point as it looks like you're really trying to defend this coding style. Good luck to all users.

Edit: All this is not even touching the second issue Eschertias had, which I can only imagine comes from exactly this style as well. 4 hours trouble shooting, 3-week job ahead. Oh boy.
 

chune

Weaksauce
Joined
Nov 2, 2013
Messages
70
TCM2 said:
I would hardly call it validation if you just pass unsanitized input into an external command and rely on _its_ validation.

http://en.wikipedia.org/wiki/Robustness_principle Seems you never heard of it.

I'm giving up at this point as it looks like you're really trying to defend this coding style. Good luck to all users.

Edit: All this is not even touching the second issue Eschertias had, which I can only imagine comes from exactly this style as well. 4 hours trouble shooting, 3-week job ahead. Oh boy.

People aren't "trusting their data TO napp-it". They are trusting their data to the underlying enterprise-level ZFS stack. Napp-it is just a great (!!WEB-BASED!!) front end that allows Solaris noobs to never touch the CLI (among a plethora of other things). Gea isn't some big software developer, he's just an awesome guy who felt like sharing his hard work with you... FOR FREE. Save your whining for something you actually paid for.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,069
When you are logged in, you can just enter a remove all command in the cmd form - no need to insert a dangerous command into some input data. A tool like napp-it should also only be used in secure networks.
 

TCM2

Gawd
Joined
Oct 17, 2013
Messages
572
It looks like you misunderstood my point about the robustness principle. I'm not talking about network inputs. I'm talking about the "output" of your function which is the "input" of the zfs command. The robustness principle dictates that you should be sure _what_ you're sending to zfs.

chune said:
People aren't "trusting their data TO napp-it". They are trusting their data to the underlying enterprise-level ZFS stack. Napp-it is just a great (!!WEB-BASED!!) front end that allows Solaris noobs to never touch the CLI (among a plethora of other things). Gea isn't some big software developer, he's just an awesome guy who felt like sharing his hard work with you... FOR FREE. Save your whining for something you actually paid for.

Fact is, napp-it is offered with a hefty price tag in certain licensing options, which invalidates your whole point about the good guy and "FOR FREE". I can't imagine the pro version to magically have better code. And of course you trust napp-it to handle your data correctly. If it trashes your data by calling zfs with some unvalidated input, it's not ZFS's fault, is it?
 