OpenSolaris derived ZFS NAS/ SAN (OmniOS, OpenIndiana, Solaris and napp-it)

CopyRunStart

Limp Gawd
Joined
Apr 3, 2014
Messages
155
It looks like it is CPU-bound.

OK, I'll test with compression off. I thought an Intel Xeon X5675 six-core @ 3.06GHz would easily handle this. Is there any way to use LZ4 on Solaris 11.1, or is it illumos only?

EDIT: Turned compression off on the Pool, similar results.

XAoVLbu.png
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
LZ4 is only in the free ZFS (illumos-based systems such as OmniOS and OpenIndiana), not in Solaris 11.1.

Besides that, a large number of mirrored vdevs is good for IO.
Benchmarks are mostly sensitive to sequential performance. In your case the values are not really bad, but not good for that number of disks. I would suspect the expander, cabling, SATA, RAM or PCI-e slot to be the problem.

And the time of "use many spindles for best IO" is over.
Use that many disks if you need capacity; otherwise use SSDs, as a single enterprise SSD gives you more IOPS than 100 disks. I would avoid an expander whenever possible (use larger disks and several HBAs), especially with SATA disks.
 

chune

Weaksauce
Joined
Nov 2, 2013
Messages
70
Gea, I have been using your replication locally and it works great! Very reassuring that I have full nightly copies of my datastores on a backup server for all my ESXi boxes.

But I am having lots of issues with WAN replication. The office is on 20/20 Mbit fiber, the offsite location on 100/50 Mbit fiber, linked via Cisco VPN on two ASA 5505s. Some replications just keep working without issues, but four particular servers randomly fail to replicate.

They all throw this error after a very long time, and the replication job stays active until I manually cancel it:
info: initial remote replication finished (time: 86390 s, size: MB, performance: 0.0 MB/s) error

Any ideas here? It looks like it's just timing out or getting stalled for some reason.

Is it not best practice to replicate right from production to offsite via WAN? Would a better setup be to replicate to an on-site backup server nightly and then replicate that to the offsite server?

I am running napp-it 0.9d2 nightly Nov.03.2013 on all servers with your preconfigured OmniOS virtual machine.

The only difference is that the one server that has never failed to replicate is actually still running OpenIndiana, but with the same napp-it build.

If you get my offsite replications working reliably, I can justify buying another set of licenses for the new co-lo site!! =P

Any advice is greatly appreciated! Thanks for all your hard work on this. I find it amazing that you are able to keep up with a million-post thread!
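
The reported time of 86390 s is just under 24 hours (86400 s). That may point to a job time limit rather than a hard network failure, but this reading is an assumption; the log line does not say so. A rough Python sketch of how much data the link could move in that window, assuming the 20 Mbit/s office upload is the bottleneck and ignoring VPN, TCP and zfs send overhead (the 500 GB datastore size is purely hypothetical):

```python
# Rough estimate only: ignores VPN, TCP and zfs send overhead.

def max_transfer_gb(link_mbit_per_s, seconds):
    """Upper bound on the data moved over a link, in decimal GB."""
    bytes_per_s = link_mbit_per_s * 1_000_000 / 8
    return bytes_per_s * seconds / 1_000_000_000


def hours_needed(size_gb, link_mbit_per_s):
    """Hours needed to send size_gb (decimal GB) at the full link rate."""
    bytes_per_s = link_mbit_per_s * 1_000_000 / 8
    return size_gb * 1_000_000_000 / bytes_per_s / 3600


if __name__ == "__main__":
    # 20 Mbit/s is the office upload speed from the post above.
    print(f"{max_transfer_gb(20, 86390):.0f} GB fit into 86390 s at 20 Mbit/s")
    # Hypothetical initial replication size, for illustration only.
    print(f"{hours_needed(500, 20):.1f} h for a 500 GB initial replication")
```

If an initial replication is larger than what fits into that window (about 216 GB at the full 20 Mbit/s), seeding the first copy locally and only sending increments over the WAN would be one thing to consider.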
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
Newer software releases sometimes introduce new problems,
but overall 0.9e1 has improved replication stability compared to 0.9d2.

Besides that, zfs send/receive is very sensitive to the stability of the network connection.
If you can, check whether an update improves stability. Current updates do not require a reboot unless you want to go back to a previous BE.
 

chune

Weaksauce
Joined
Nov 2, 2013
Messages
70
Is the improved stability related to the sender or the receiver? I updated one of my senders to 0.9e1 and will try to run the replication again. If I am to update the receiver, I will have to do it during the afternoon so as not to interrupt replications.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
0.9e1 fixes a bug when source and target zfs names are identical
0.9f1 fixes "failed to read from stream" problems on some configs

They are (mostly) relevant on the receiver side
http://napp-it.org/downloads/changelog_en.html
 

x-cimo

Limp Gawd
Joined
Mar 26, 2012
Messages
209
Hello, I ran into a problem with OmniOS

I ran: pkg upgrade

Since then many commands are broken (and napp-it doesn't work).

When I just type:
Code:
root@Athena:~# pkg
ld.so.1: python2.6: fatal: libc.so.1: version 'ILLUMOS_0.6' not found (required by file /usr/lib/amd64/libpython2.6.so.1.0)
ld.so.1: python2.6: fatal: libc.so.1: open failed: No such file or directory
Killed

I googled the issue but found no help.

Anyone know what I can do? ZFS / CIFS still works.
 

levak

Limp Gawd
Joined
Mar 27, 2011
Messages
386
It looks like libc.so is missing; the upgrade went bad.

Revert to the previous BE (see man beadm or use Google), reboot and try the upgrade again.

Matej
 
Joined
Mar 15, 2011
Messages
5
Hello Gea and Matej,

Thank you for your help with my Debian + KVM/QEMU + libvirt/WebVirtMgr and OmniOS + napp-it NFS share rights.
I solved the problem this morning.
The issue was that the VM was created by root, with rw on the disk file for the root account only. Starting and stopping that VM is done by the libvirt-qemu account on the hypervisor. This account has no rights on the disk, which results in an error in WebVirtMgr.
What I have done to make it work:

On OmniOS/napp-it:

Create a ZFS filesystem.
Enable NFS with, for example,

rw=@192.168.1.0/24,root=@192.168.1.0/24 data01/cloud

as NFS arguments for the share.
Check whether aclmode is set to passthrough for the share; if not, set it to passthrough.

On the hypervisor:

Change qemu.conf so that KVM runs under the root account.
After that the VM starts and stops without a problem.

Best regards,

Dirk Adamsky
 

Eschertias

n00b
Joined
Apr 7, 2010
Messages
52
I'm seeing some really REALLY high wait times and 100% wait and busy times in iostat when copying stuff to or from one of my arrays. Wait times of over 33,000 ms, both in iostat and when looking at the iSCSI latency on the VM it's pointing to.

Is there a DTrace script, or something I can run, that will tell me which drive or subsystem is causing the delays? As it stands, the system is almost 100% unusable when trying to use the drives as COMSTAR-hosted iSCSI LUNs. Writeback cache is enabled on the LU and in the VM accessing it.

Also, just picked up a pro sub, thanks for the work you did to get napp-it to be as awesome as it currently is.


Edit: 2nd question: What settings do I have to toggle to get the equivalent of setting zfs_vdev_max_pending=1 in OmniOS? They deprecated the tunable in favor of a whole pile of different ones.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
100% busy and wait values for a pool are quite normal under load, as they mean "as fast as possible". Even a simple scrub can produce a 100% busy pool.

Compare the values of the single disks to check for disks that are significantly slower, and check the fault or system log for messages (iostat at the console, system statistics, or the realtime iostat/iostat2 values).
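
As a rough illustration of comparing per-disk values, the Python sketch below flags disks whose average service time is far above the median of the group. The disk names and asvc_t numbers are made-up sample values (as one might copy by hand from `iostat -xn`), and the factor of 3 is an arbitrary assumption, not a recommended threshold:

```python
from statistics import median


def slow_disks(service_ms, factor=3.0):
    """Return the disks whose service time exceeds factor x the group median."""
    typical = median(service_ms.values())
    return sorted(d for d, ms in service_ms.items() if ms > factor * typical)


# Hypothetical per-disk asvc_t values in ms; not taken from any real system.
sample_asvc_ms = {
    "c1t0d0": 12.1,
    "c1t1d0": 11.8,
    "c1t2d0": 95.4,
    "c1t3d0": 12.6,
}

if __name__ == "__main__":
    for disk in slow_disks(sample_asvc_ms):
        print(f"{disk} looks significantly slower than its neighbours")
```

A disk that stands out like this under load is a candidate for a closer look in the system and fault logs.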

Check the arcstat values (maybe more RAM can help).

Regarding settings: such tuning can improve performance on special workloads but does not turn an unusably slow system into a fast one; the defaults are mostly OK for a typical use case. Start with https://www.illumos.org/issues/4045
 

Eschertias

n00b
Joined
Apr 7, 2010
Messages
52
This is the kind of busy that causes the Windows VM to disconnect the disk because IOs hung for over 30 seconds. A month ago this didn't happen, which makes me think that it may have been caused by my last update.

I'm trying to sync files between the pool and the COMSTAR iSCSI volume, and it gets about 5 files in and basically shits itself and locks up hard. Even doing a simple folder copy/paste inside the Windows VM will cause the thing to lock up. Then even if I kill the transfer, it takes another 5-10 minutes for iostat to calm back down and for the VM to start behaving normally again.

My setup is an LSI 9211-8i attached to a Supermicro expander, attached to 8 WD RED 4 TB drives. The expander daisy-chains to another Supermicro expander, which has 20 2.5" HDDs. The other port on the HBA goes to a Chenbro 36-port expander which has another set of 10 WD RED 3 TB drives on it. The 3TB Reds work fine as a regular ZFS share: no issues with throughput, no 100% busy, and wait times are ~300ms under full load, which is more or less expected when IOs are queued 10 deep.

Anything that touches the COMSTAR volumes seems to just shit itself hard, and I'm trying to figure out why. Is there a DTrace command I can run to output pending IOs per vdev and device so I can see what's causing this? Because it almost looks like a read/modify/write lockup that I used to see on my WD Green 4k drives before the ashift=12 fix came to light.

Edit:
Is there a way to verify that the COMSTAR file volume, the Windows NTFS volume, and the ZFS file system are all aligned? A read/modify/write would explain a lot.
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
You may play with several ZFS or iSCSI block sizes, but this is only a minor tuning option for special workloads. Most important is usually poor IO on a single disk.

I saw similar behaviour two years ago when I played with an LSI expander and WD Raptors (performance sometimes went to zero, and I could not identify a single disk as the problem). I then attached the disks directly and the problem was gone.

In the end, I do not use or suggest expander-based solutions with SATA disks without intensive testing. There can be problems with a particular series of disks, or problems can result from the limited cable length of SATA (100 cm, minus some cm for backplanes and other connectors). Best is using SAS disks with SAS expanders.

This is one of the reasons commercial vendors do not support SATA + expander, or demand "certified disks" at double the price. If possible, connect the disks directly, especially those of the affected pool, or build a test pool and compare the results.
 

Eschertias

n00b
Joined
Apr 7, 2010
Messages
52
After looking at exactly how I had the iSCSI LU set up, I finally noticed that the ZFS file system that was housing the thin provisioned file had hit 0 bytes free. So yeah, that was what was causing the entire system to completely lose its mind.

I blew away the entire zpool it was stored on and copied the data back over from the cold spares.

Now I need to figure out how to get jumbo packets enabled on all my devices so I can finally get the 10GbE cards to break ~150ish MB/sec throughput.


Double Edit:
You have a bug in the COMSTAR section: the LU size uses 1 KB = 1000 bytes instead of 1 KB = 1024 bytes. This means that when I create a 24TB volume and create a volume LU, the Max Size column shows 26.3 TB instead of 24, despite the volume size listed under 'stmfadm list-lu -v' being 26388279066624 bytes, which Google agrees is 24TB.

I probably blew away my volume/LU two or three times trying to guarantee I won't hit 0 bytes free again before I noticed the error.
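
The two readings of the same byte count can be checked with a few lines of Python. The byte count is the one quoted above; the helper names are only for illustration:

```python
TIB = 1024 ** 4  # binary terabyte (TiB), 1 KB = 1024 bytes
TB = 1000 ** 4   # decimal terabyte, 1 KB = 1000 bytes

# Volume size as reported by 'stmfadm list-lu -v' in the post above.
LU_BYTES = 26388279066624


def to_tib(num_bytes):
    """Convert bytes to binary terabytes."""
    return num_bytes / TIB


def to_tb(num_bytes):
    """Convert bytes to decimal terabytes."""
    return num_bytes / TB


if __name__ == "__main__":
    print(f"binary:  {to_tib(LU_BYTES):.2f} TiB")
    print(f"decimal: {to_tb(LU_BYTES):.2f} TB")
```

The binary conversion gives exactly 24, while the decimal one gives roughly 26.39, which matches the 26.3 TB shown in the Max Size column if the display truncates.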
 

ST3F

Limp Gawd
Joined
Oct 19, 2011
Messages
181
I have some of these NICs.

Tested over CIFS with this setup:
  • Xeon X3450 4c/8t 2.66 / 3.1 GHz
  • Intel S3420GPV
  • 8 GB (4x 2 GB ECC HP 1066)
  • System: SSD OCZ Vertex Series Plus 60 GB partitioned @ 30 GB
  • IBM M1015 flashed to LSI2008
  • vdev #1: 5x 1 TB WD Black WD1001FALS 7200 rpm in Icy Dock MB455SPF-B 5-in-3
  • vdev #2: 5x 2 TB Hitachi HDS72302
  • 2x SSD STEC Mach 16 50 GB (ZIL)
  • 1x SSD OCZ Vertex 3 60 GB (L2ARC cache)
  • Sharkoon Rebel 12
  • Seasonic M2II 650 W

OpenIndiana 151a5, napp-it 0.8, pool version 28, RAID-Z, ashift=9 // SSDs included

Code:
   pool: black
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
	still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
	pool will no longer be accessible on older software versions.
  scan: scrub repaired 0 in 0h31m with 0 errors on Sat Jul 28 23:44:19 2012
config:

	NAME                       STATE     READ WRITE CKSUM     CAP            Product
	black                      ONLINE       0     0     0
	  raidz1-0                 ONLINE       0     0     0
	    c3t50014EE203B9A3E5d0  ONLINE       0     0     0     1000.20 GB     WDC WD1001FALS-0
	    c3t50014EE2AE6582FEd0  ONLINE       0     0     0     1000.20 GB     WDC WD1001FALS-0
	    c3t50014EE6000CBA31d0  ONLINE       0     0     0     1000.20 GB     WDC WD1001FALS-5
	    c3t50014EE655620F20d0  ONLINE       0     0     0     1000.20 GB     WDC WD1001FALS-5
	    c3t50014EE6AAAAB8DCd0  ONLINE       0     0     0     1000.20 GB     WDC WD1001FALS-7
	  raidz1-2                 ONLINE       0     0     0
	    c3t5000CCA369C9168Fd0  ONLINE       0     0     0     2.00 TB        Hitachi HDS72302
	    c3t5000CCA369CACE70d0  ONLINE       0     0     0     2.00 TB        Hitachi HDS72302
	    c3t5000CCA369CADD07d0  ONLINE       0     0     0     2.00 TB        Hitachi HDS72302
	    c3t5000CCA369CAE3F5d0  ONLINE       0     0     0     2.00 TB        Hitachi HDS72302
	    c3t5000CCA369CBD585d0  ONLINE       0     0     0     2.00 TB        Hitachi HDS72302
	logs
	  mirror-1                 ONLINE       0     0     0
	    c3t5000A720300547DDd0  ONLINE       0     0     0     50.02 GB       STEC MACH16 M
	    c3t5000A720300547FFd0  ONLINE       0     0     0     50.02 GB       STEC MACH16 M
	cache
	  c3t5E83A97E6268ABEDd0    ONLINE       0     0     0     60.02 GB       OCZ-VERTEX3

errors: No known data errors

  pool: rpool
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM     CAP            Product
	rpool       ONLINE       0     0     0
	  c6t0d0s0  ONLINE       0     0     0     120.03 GB      OCZ VERTEX-PLUS

errors: No known data errors

Benchmark with 10.24 GB
Code:
write 10.24 GB via dd, please wait...
time dd if=/dev/zero of=/black/dd.tst bs=1024000 count=10000

10000+0 records in
10000+0 records out

real       12.3
user        0.0
sys         5.0

10.24 GB in 12.3s = 832.52 MB/s Write

read 10.24 GB via dd, please wait...
time dd if=/black/dd.tst of=/dev/null bs=1024000

10000+0 records in
10000+0 records out

real       14.4
user        0.0
sys         3.5

10.24 GB in 14.4s = 711.11 MB/s Read

.... Write: 832.52 MB/s
.... Read: 711.11 MB/s
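
The reported rates follow directly from the dd parameters and the elapsed times. A small Python check, using decimal megabytes (1 MB = 1,000,000 bytes), which is what the figures above imply:

```python
def dd_throughput_mb_s(block_size, count, seconds):
    """Throughput in decimal MB/s for a dd run of count blocks of block_size bytes."""
    return block_size * count / seconds / 1_000_000


if __name__ == "__main__":
    # bs=1024000 count=10000 is 10.24 GB; times are the 'real' values above.
    print(f"write: {dd_throughput_mb_s(1024000, 10000, 12.3):.2f} MB/s")
    print(f"read:  {dd_throughput_mb_s(1024000, 10000, 14.4):.2f} MB/s")
```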

With the folder "video" on the NAS, mounted as drive Y:\ in Windows 7

ATTO_Benchmark_ZFS_Samba_10GbE_RaidZ_2_vDevs_ 5xWD_WD1001FALS_5xHitachi_HDS72302.JPG
CrystalDiskMark_Test__ZFS_Samba_10GbE_RaidZ_2_vDevs_ 5xWD_WD1001FALS_5xHitachi_HDS72302.JPG


148 W at idle.
 

Nemesis_001

Weaksauce
Joined
Apr 3, 2011
Messages
69
Hi,

I have built a new napp-it all in one.
For some reason I cannot set trivial ACLs on folders, which is supposed to work without a license.
It just pops up a message linking me to a purchase page.

Any idea?

Thanks,
 

Aesma

[H]ard|Gawd
Joined
Mar 24, 2010
Messages
1,854
I've been running a single vdev of 19 2TB disks in RAIDZ3 for one year now. I plan to add a second vdev of 19 4TB disks in RAIDZ3. I know that performance won't be ideal, but the current performance is enough for home use and it shouldn't go down. My backups are on Windows/NTFS, but I plan to move them to another OpenIndiana (or maybe OmniOS) box a bit later, reusing drives. I have plenty of 3TB drives, so I was wondering whether making it 2*19*3TB would yield the same pool size as 19*2TB+19*4TB; in theory it should. I also have plenty of 2TB drives left, so the alternative would be to sell the 3TBs and buy another 19 4TB drives to make the same 19*2+19*4.
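
The "in theory" part can be sketched in Python with the simple rule that a RAIDZ vdev offers roughly (disks minus parity) times the disk size. This is a simplification that ignores RAIDZ padding, metadata overhead and the TB vs TiB difference, so real pool sizes will come out lower:

```python
def raidz_usable_tb(disks, disk_tb, parity):
    """Approximate usable capacity of one RAIDZ vdev in TB (no overhead modelled)."""
    return (disks - parity) * disk_tb


# Layout A: one vdev of 19 x 2 TB plus one vdev of 19 x 4 TB, both RAIDZ3.
mixed_tb = raidz_usable_tb(19, 2, 3) + raidz_usable_tb(19, 4, 3)

# Layout B: two vdevs of 19 x 3 TB, both RAIDZ3.
uniform_tb = 2 * raidz_usable_tb(19, 3, 3)

if __name__ == "__main__":
    print(f"19x2TB + 19x4TB: {mixed_tb} TB")
    print(f"2 x 19x3TB:      {uniform_tb} TB")
```

Under this simplified rule both layouts come to the same figure, 96 TB before overhead.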
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,704
/dev/zero is not a good data source. If you are running compression (are you?), the write number will be unreasonably high. If compression is off, you should probably turn it on.
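
Why zeros inflate the numbers can be shown with any general-purpose compressor. The Python sketch below uses zlib purely as a stand-in; ZFS uses its own algorithms (such as lzjb or LZ4), so the exact ratios will differ, but the effect is the same: a stream of zeros shrinks to almost nothing, while random data does not shrink at all.

```python
import os
import zlib


def compressed_ratio(data):
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    size = 1_000_000
    zeros = bytes(size)             # what /dev/zero delivers
    random_data = os.urandom(size)  # incompressible test data
    print(f"zeros:  {compressed_ratio(zeros):.4f}")
    print(f"random: {compressed_ratio(random_data):.4f}")
```

With compression on, a dd from /dev/zero therefore mostly measures CPU and memory speed rather than the disks.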
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
@Nemesis_001: From the current 0.9f onwards, partial use of the commercial ACL extension is no longer free. One reason is that I offer a home license at about 10% of the price of a commercial licence. Another is that this is a setting you can make quite easily via /usr/bin/chmod or from Windows. A third reason is that the time needed to develop and maintain napp-it has gone beyond the "do it in your spare time" limit. Despite about 1000 downloads per month, earnings are quite low. But this does not affect napp-it itself. It concerns the extensions, which deliver some extras and management comfort and which are needed to finance the whole thing (they are intended for commercial users).

If you need to, you can also use a 2-day test key for setup in individual cases.

Other option:
Stay on an earlier release if you do not need the current features (mostly the realtime monitor extension).

Hope this is OK for you


Gea
 

bbzidane

n00b
Joined
Dec 22, 2002
Messages
40
I tried replacing a disk, and after resilvering, the faulted/replaced disk is still showing.
After the resilver, the replaced disk was still present. I powered off and removed the replaced disk. When the system started again, it showed the new disk resilvering again, but the replaced disk is still showing (as not accessible or something).

In the remove/replace menus, it doesn't show the replaced/faulted disk as removable/replaceable. I'm going to let it finish resilvering of course, but I'm not sure what I should do next.

Any advice? Thanks
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
I use four tools that each give a different view of the current storage state:

iostat
This is the fastest tool. It is like an inventory of all disks known since the last boot, including removed disks. I use it in all menus below disks.

format
Quite slow, but returns all current disks.

parted
Returns all partitions. Can hang on disks with unknown partitions.
In such a case you need to initialize the disks prior to use (menu disks-initialize).

zpool status
Returns all disks that are members of ZFS pools, even when they are removed/missing.

In your case it seems that the first resilver was not successful, as the resilver started again after the reboot. The removed disk is still a member of the pool at this point and is therefore shown as missing until a replace/resilver succeeds. Error messages that are shown but no longer valid can be removed with a clear errors (menu pools). If you switched disks between controller ports, a pool export/import can clear up the situation, especially when using port IDs like c0t1d0.
 
Joined
Dec 30, 2010
Messages
43
Hello Gea,
I think many more people would buy a license if you lowered the prices. 500 EUR for home use is too much money; I'm not even considering it.

For a company it is different, of course...
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
No problem.
Even for commercial usage the free version is quite often enough, as there is no limitation regarding functionality, performance or capacity of the underlying OS. The extensions save time and simplify daily admin or management tasks.

Besides that, there is no home use license at 500 Euro. They start at 50 Euro for two years (25 Euro per year) for a single extension, and double that for all extensions.

see http://napp-it.org/extensions/quotation_en.html
 

bbzidane

n00b
Joined
Dec 22, 2002
Messages
40
After the resilvering was complete, I did the export/import.
It still lists the missing disks as UNAVAIL and the replacing is still in progress.

How do I just complete the replace and remove the UNAVAIL disk entries?
 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
You will see the missing disks until they are successfully replaced.

If the replace fails constantly, you should check the new disk with a low-level tool from
the disk manufacturer or try another disk for the replace.

If you replace more than one disk simultaneously, you have to wait until all replacements are finished.

Edit:
It is also possible that another disk is causing problems.
Check the system log and system faults.
 

CopyRunStart

Limp Gawd
Joined
Apr 3, 2014
Messages
155
Hey Gea, I can't seem to get ProFTPD working in napp-it. I compiled it by doing "./configure", "make" and "make install". I checked to make sure the files were installed to the proper /usr/local/bin sub-directories. When I try to start the service from napp-it, it says svcadm couldn't find the service. Do I have to add ProFTPD to SMF somehow? I'm used to init, so I'm not sure how to use this. (I'm running Solaris 11.1 with the latest updates.)
 

bbzidane

n00b
Joined
Dec 22, 2002
Messages
40
Below is my current status.
So there is another disk that is exhibiting issues, and I am a bit stressed: 3 disks are exhibiting issues when the tolerance is for 2 disks.

With a third disk exhibiting issues, resilvering will complete with the third disk having status degraded. How should I proceed after the resilvering completes?

Thanks

Code:
  pool: storage
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Jun 25 16:09:06 2014
    15.7T scanned out of 23.6T at 249M/s, 9h18m to go
    3.11T resilvered, 66.34% done
config:

	NAME                         STATE     READ WRITE CKSUM
	storage                      DEGRADED     0     0     1
	  raidz2-0                   DEGRADED     0     0     2
	    replacing-0              DEGRADED     0     0     0
	      5802520008555750639    UNAVAIL      0     0     0  was /dev/dsk/c1t5000CCA228C06F43d0s0
	      c1t5000CCA22CEED14Ed0  ONLINE       0     0     0  (resilvering)
	    c1t5000CCA228C06F69d0    ONLINE       0     0     0
	    replacing-2              DEGRADED     0     0     0
	      4574634853336216011    UNAVAIL      0     0     0  was /dev/dsk/c1t5000CCA228C07074d0s0
	      c1t5000CCA22CF2279Cd0  ONLINE       0     0     0  (resilvering)
	    c1t5000CCA228C07F9Bd0    ONLINE       0     0     0
	    c1t5000CCA228C08301d0    ONLINE       0     0     0
	    c1t5000CCA228C0D0D6d0    ONLINE       0     0     0  (resilvering)
	    c1t5000CCA228C0D2C3d0    ONLINE       0     0     0
	    c1t5000CCA228C0D60Fd0    ONLINE       0     0     0
	    c1t5000CCA228C0D66Ed0    ONLINE       0     0     0
	    c1t5000CCA228C0D71Cd0    ONLINE       0     0     0
	logs
	  c1t500A07510900A12Ed0      ONLINE       0     0     0
	cache
	  c1t500A0751090091E8d0      ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list

 

_Gea

2[H]4U
Joined
Dec 5, 2010
Messages
4,051
Welcome to the reality beyond the MTBF failure rates from Seagate and Co.

See my backup system no. 4.
I suppose this is the reason I use ZFS.

Detect all errors immediately and fight them. In my case I even needed to offline a disk, otherwise the system crashes on a resilver.


Code:
config:

	NAME                         STATE     READ WRITE CKSUM      CAP            Product /napp-it   IOstat mess
	b4                           DEGRADED     0     0     0
	  raidz3-0                   DEGRADED     0     0     0
	    replacing-0              DEGRADED     0     0     0
	      4930719320381122518    UNAVAIL      0     0     0  was /dev/dsk/c6t5000C5004E94BE1Ad0s0
	      16224181184375348222   FAULTED      0     0     0  was /dev/dsk/c6t5000C50066259F7Ed0s0/old
	      c6t5000C500662511F7Ed0  ONLINE       0     0     0  (resilvering)      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C500311F4B7C1d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    replacing-2              DEGRADED     0     0     0
	      9131101971556216633    UNAVAIL      0     0     0  was /dev/dsk/c6t5000C5004E94C6B0d0s0
	      c6t5000CC112BD43247d0  ONLINE       0     0     0  (resilvering)      4 TB           Hitachi HDS72404   S:0 H:0 T:0
	    c6t5000C5004EC61102d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C5004E11DC7Ed0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C5004E9411E5d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1     S:0 H:78 T:84
	    c6t5000C5004E11E0B9d0    OFFLINE      0     0     0      3 TB           ST3000NC002-1DY1     S:0 H:175 T:191
	    c6t5000C5005021129Bd0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C500411506F4d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C5004E113E8Fd0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C50110277BACd0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C500511785D0d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C50050211828d0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C5001127A70Ed0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0
	    c6t5000C5115027AE9Dd0    ONLINE       0     0     0      3 TB           ST3000NC002-1DY1   S:0 H:0 T:0

errors: No known data errors
 

shanester

Weaksauce
Joined
Mar 1, 2011
Messages
70
The pro monitor will not load. Also sometimes it reports that the websocket server is missing.
Running OI 151a9. Any ideas?
D9uUGcM.png
 

bbzidane

n00b
Joined
Dec 22, 2002
Messages
40
So the pool seems okay for now, though degraded.
How do I get rid of the UNAVAIL disk entry?

PS
The 1 data error is from the iSCSI file I have, which I can't easily replace, and the OS I have running on it seems to be functional for the most part.

Code:
  pool: storage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 4.68T in 25h37m with 1 errors on Thu Jun 26 17:46:37 2014
config:

	NAME                         STATE     READ WRITE CKSUM
	storage                      DEGRADED     0     0     1
	  raidz2-0                   DEGRADED     0     0     2
	    replacing-0              DEGRADED     0     0     0
	      5802520008555750639    UNAVAIL      0     0     0  was /dev/dsk/c1t5000CCA228C06F43d0s0
	      c1t5000CCA22CEED14Ed0  ONLINE       0     0     0
	    c1t5000CCA228C06F69d0    ONLINE       0     0     0
	    replacing-2              DEGRADED     0     0     0
	      4574634853336216011    UNAVAIL      0     0     0  was /dev/dsk/c1t5000CCA228C07074d0s0
	      c1t5000CCA22CF2279Cd0  ONLINE       0     0     0
	    c1t5000CCA228C07F9Bd0    ONLINE       0     0     0
	    c1t5000CCA228C08301d0    ONLINE       0     0     0
	    c1t5000CCA228C0D0D6d0    ONLINE       0     0     0
	    c1t5000CCA228C0D2C3d0    ONLINE       0     0     0
	    c1t5000CCA228C0D60Fd0    ONLINE       0     0     0
	    c1t5000CCA228C0D66Ed0    ONLINE       0     0     0
	    c1t5000CCA228C0D71Cd0    ONLINE       0     0     0
	logs
	  c1t500A07510900A12Ed0      ONLINE       0     0     0
	cache
	  c1t500A0751090091E8d0      ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list

 