OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Ahhhh something I had not even thought of. Oh well, my second 1068E will be here in a few days and that will solve that. Thanks!
 
Anyone know how I can improve my read speed over SMB?


Local dd benchmarks show I can read at 100-150 MB/s, but over the network it's only 30+ MB/s.

Write speed is 50+ MB/s in the dd benchmarks; over the network I can hit 40+ MB/s, and after turning on jumbo frames I could hit 50+ MB/s, but reads got worse and dropped to 25 MB/s.


Sigh...
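
For reference, the kind of local dd run I mean (the path is only a placeholder; the file should be bigger than RAM and compression should be off, so the ARC and compression don't flatter the numbers):

Code:
# write ~16 GB of zeros to the pool (adjust path and size to your setup)
dd if=/dev/zero of=/tank/dd.tst bs=1024k count=16000
# read it back; a reboot first (or a file larger than RAM) keeps the ARC out of the read number
dd if=/tank/dd.tst of=/dev/null bs=1024k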
 
VMDirectPath doesn't work with SATA controllers that are integrated into the SB/NB, so that's a problem you cannot fix.

Not a true statement at all. DirectPath works fine with the built-in ICH10 controller on that motherboard. It is configured on the system sitting right beside me right now that I am testing with.

Unfortunately, I don't know what is causing your problem.
 
Not a true statement at all. DirectPath works fine with the built-in ICH10 controller on that motherboard. It is configured on the system sitting right beside me right now that I am testing with.

Unfortunately, I don't know what is causing your problem.

I can back this up. On my SuperMicro X9SCM-F with VMDirectPath and ESXi 5.0, I can pass through my internal SATA controllers. For me it's all or nothing, though: all SATA2 and SATA3 ports or none at all; I can't split them up.

EDIT: Yes, as danswartz pointed out below, the SATA2/3 issue is because they are run on one controller. Still, an issue for me nonetheless :p
 
Well, sure, but that's a different issue altogether. The sata ports are all associated with a single controller, so they go as one. It can be a bit of a mystery as to why some controllers can work in passthru and others not - sometimes it seems the only viable thing is to try :(
 
BUILD
MOBO..........................MSI X58M 1366
CPU..............................Intel Xeon E5606
SSD..............................Kingston SSDNow V100 64GB
HDD..............................8x Hitachi Deskstar 5K3000 2TB
PSU..............................Antec QuattroPower 850W Modular
GPU..............................EVGA 8600GT
HS.................................CORSAIR CAFA50
RAM..............................G.SKILL Ripjaws Series 8GB
CASE...........................Antec Three Hundred Mini Tower
NIC................................Intel PRO/1000 PCIe
SAS..............................LSI SASUC8I (flashed with IT firmware. See this)

OS................................OpenIndiana with Napp-it
Total space..................16TB
Usable space..............10.1TB
Parity............................RAIDZ1

[Build photos: zfs1.jpg, zfs2.jpg, zfs3.jpg, zfs4.jpg, zfs5.jpg]


And my main rig, because I never miss an excuse to showcase it...

[Photos: sp-fin6.jpg, zfs6.jpg]


Thanks to [H] for the help!!!

PS. For the purists, I'll work on wire management this week-end.
 
Something is very wrong with my fileserver... It's going so slowly that it's almost unresponsive. If I reboot, it seems OK for a few minutes or hours, but then it's almost unresponsive again.

Here are some interesting messages from the syslog:

Code:
Sep 11 23:17:26 fileserver smbsrv: [ID 421734 kern.notice] NOTICE: [NT Authority\Anonymous]: media access denied: IPC only
Sep 11 23:24:00 fileserver last message repeated 1557 times
Sep 11 23:24:04 fileserver smbsrv: [ID 421734 kern.notice] NOTICE: [NT Authority\Anonymous]: media access denied: IPC only
Sep 11 23:30:41 fileserver last message repeated 1555 times
Sep 11 23:30:45 fileserver smbsrv: [ID 421734 kern.notice] NOTICE: [NT Authority\Anonymous]: media access denied: IPC only
Sep 11 23:37:07 fileserver last message repeated 1515 times

Code:
Sep 12 07:04:07 fileserver ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 3 satapkt 0xffffff01cd800cb8 timed out
Sep 12 07:05:52 fileserver ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 3 satapkt 0xffffff01cd845e90 timed out
Sep 12 07:05:52 fileserver ahci: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 3 satapkt 0xffffff01cc9a7640 timed out
Sep 12 08:39:08 fileserver ahci: [ID 777486 kern.warning] WARNING: ahci0: ahci port 3 has interface fatal error
Sep 12 08:39:08 fileserver ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 3 is trying to do error recovery
Sep 12 08:39:08 fileserver ahci: [ID 551337 kern.warning] WARNING: ahci0:       Transient Data Integrity Error (T)
Sep 12 08:39:08 fileserver      Internal Error (E)
Sep 12 08:39:08 fileserver      CRC Error (C)
Sep 12 08:39:08 fileserver ahci: [ID 657156 kern.warning] WARNING: ahci0: error recovery for port 3 succeed
Sep 12 09:09:03 fileserver ahci: [ID 777486 kern.warning] WARNING: ahci0: ahci port 3 has interface fatal error
Sep 12 09:09:03 fileserver ahci: [ID 687168 kern.warning] WARNING: ahci0: ahci port 3 is trying to do error recovery
Sep 12 09:09:03 fileserver ahci: [ID 551337 kern.warning] WARNING: ahci0:       Transient Data Integrity Error (T)
Sep 12 09:09:03 fileserver      Internal Error (E)
Sep 12 09:09:03 fileserver      CRC Error (C)

We had a major flood in our area and I had to disconnect and move the fileserver to a safe location, and ever since I moved it back it's been acting strange. It wasn't affected by water or anything like that, but it seems like moving it around somehow broke something...
 
Well, sure, but that's a different issue altogether. The sata ports are all associated with a single controller, so they go as one. It can be a bit of a mystery as to why some controllers can work in passthru and others not - sometimes it seems the only viable thing is to try :(

I believe I figured it out and will test my theory here in a minute. What happened is that ESXi recognized two controllers (a 2-port and a 4-port) for all 6 SATA ports, and I only passed one controller through. I am going to attempt to pass both controllers through and just install OpenSolaris straight to the 640, instead of letting ESXi handle it.

I finally got everything working, but was disappointed by the speeds so far. On a 2k8r2 VM (running on the same host), CDM is only showing ~58 MB/s sequential reads and 200 MB/s sequential writes. This is using the e1000 network adapters. To my local machine (a building away, still gigabit, but through 5 switches total), I am seeing 38 MB/s read and 98 MB/s write.
However, on napp-it I am seeing the following.


[napp-it benchmark screenshot]
 
Have you checked the wiring/cables at both ends on port 3?
How can I tell which port is port 3? I have some drives connected to the motherboard and other drives connected to an LSI card. However, I checked ALL of the wiring and it looks fine.

I also ran a ZFS scrub and it didn't find any errors and the pools and drives are all showing as healthy...
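
For what it's worth, one way to see which disk sits behind ahci port 3 (output formats vary; device names below are only examples):

Code:
# SATA attachment points: ports show up as sataX/Y with the disk device behind them
cfgadm -al | grep -i sata
# model / serial / error counters per device, to match against the suspect port
iostat -En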
 
I finally got everything working, but was disappointed by the speeds so far. On a 2k8r2 VM (running on the same host), CDM is only showing ~58 MB/s sequential reads and 200 MB/s sequential writes. This is using the e1000 network adapters. To my local machine (a building away, still gigabit, but through 5 switches total), I am seeing 38 MB/s read and 98 MB/s write.

When testing SMB transfers between Solaris/OI and Windows using CDM, I always get ~40 MB/s read and ~100 MB/s write.

But when using the Windows built-in tools, robocopy or just Explorer, I can get 100-110 MB/s on both read and write between Solaris and Windows (the network utilization in the Windows Task Manager is about 95-99%).

I can get ~100 MB/s r/w without any issues when using CDM to benchmark Win-to-Win.

So is the conclusion that CDM doesn't work well with the Solaris/OI SMB server? Could someone please verify this?
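
For anyone wanting to reproduce the comparison, the robocopy side is just a plain copy along these lines (share and path names are examples):

Code:
rem pull a large folder from the OI/Solaris share to a local disk, then push it back
robocopy \\fileserver\tank C:\smbtest /E /NP
robocopy C:\smbtest \\fileserver\tank\back /E /NP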
 
Well... I think I found my problem:

Code:
HARDWARE IMPENDING FAILURE GENERAL HARD DRIVE FAILURE [asc=5d, ascq=10]
Code:
Error: S:4 H:310 T:0

That's two Hitachi drives that are dead within about two months... I'm not that thrilled so far. :(
 
Is there a reason why my mapped ZFS NAS share in Windows 7 doesn't show capacity in Windows Explorer? The usage bar is there but there's no text (XXX TB free of XXX TB). Other drives I map do show it.
 
When testing SMB transfers between Solaris/OI and Windows using CDM, I always get ~40 MB/s read and ~100 MB/s write.

But when using the Windows built-in tools, robocopy or just Explorer, I can get 100-110 MB/s on both read and write between Solaris and Windows (the network utilization in the Windows Task Manager is about 95-99%).

I can get ~100 MB/s r/w without any issues when using CDM to benchmark Win-to-Win.

So is the conclusion that CDM doesn't work well with the Solaris/OI SMB server? Could someone please verify this?

You must be right, as I am hitting over 100MB/s on 3TB of data I am moving over.
 
Is it possible for napp-it to e-mail me when SMART reports a hard drive is failing?

If you read a few posts above you'll see that ZFS reported my pool as being perfectly fine but I manually checked the SMART info and it says that a hard drive is "IMPENDING HARDWARE FAILURE".

It would have been nice to receive an e-mail notification so I knew ahead of time, etc...
 
Well... I think I found my problem:

Code:
HARDWARE IMPENDING FAILURE GENERAL HARD DRIVE FAILURE [asc=5d, ascq=10]
Code:
Error: S:4 H:310 T:0

That's two Hitachi drives that are dead within about two months... I'm not that thrilled so far. :(

Get a better power supply to start. And be sure not to get hung up on the "this is 800W!!!" wattage bandwagon.

http://www.jonnyguru.com/

I own the Seasonic Gold x650 and love it! My server ranges from ~140 watts idle to 160 watts load with a Xeon E3-1230 and 11 Seagate 2TB drives. The best part is that the power supply fan never even kicks on!
 
Is it possible for napp-it to e-mail me when SMART reports a hard drive is failing?

If you read a few posts above you'll see that ZFS reported my pool as being perfectly fine but I manually checked the SMART info and it says that a hard drive is "IMPENDING HARDWARE FAILURE".

It would have been nice to receive an e-mail notification so I knew ahead of time, etc...

You may duplicate/edit the email alert menu.
napp-it is intended to be editable/expandable by users.
You only need a little Perl/PHP scripting knowledge to understand the menu script.
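
A very rough sketch of such a check, assuming smartmontools and a working mailx are available; the device list, any -d controller flags you may need, and the address are placeholders:

Code:
#!/bin/sh
# crude SMART health check - mail an alert if any listed disk stops reporting healthy
# (placeholder device names; some controllers need smartctl -d sat or -d scsi,
#  and SAS disks report "OK" instead of "PASSED")
for d in c3t0d0 c3t1d0 c3t2d0; do
  if ! smartctl -H /dev/rdsk/${d}s0 | grep -q PASSED; then
    echo "SMART health problem on $d" | mailx -s "fileserver: SMART alert on $d" admin@example.com
  fi
done

Dropped into cron (or wired into a napp-it job), something like that would have flagged the "IMPENDING FAILURE" drive earlier.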
 
Gea

Any advice for improving read speed? My reads average 40 MB/s, whereas my writes hit 80 MB/s.


Dedup is off.
 
Hi,

On a newly upgraded ESXi 5.0 host I also tried to upgrade OpenIndiana vm to the latest 151a version. The OI vm was running Napp-it 0.500s.

During the OI update I bricked the vm, basically by running out of space on rpool.

I tried to increase the disk size available for the OI vm, hoping that this increased space might show up under OI, but this is not the case.

My question is how to do a disaster recovery using the Napp-it backup data from the bricked vm to a new vm?

1) On the bricked OI vm I got into system maintenance mode and exported 2 data pools, which were assigned through a dedicated DirectPath I/O LSI controller and a FC card to several LUNs with COMSTAR.

2) I created a new OI vm, re-assigned the DirectPath I/O LSI controller and a FC card, I installed Napp-it and I could import the 2 data pools into the new OI vm.

3) I would like to restore the saved napp-it COMSTAR configuration from the bricked vm to the new vm - if I knew where the saved COMSTAR backup data is stored on the bricked vm. I tried to import LUNs from the imported data pools with the napp-it "Import Lu" option, but it did not see the LUNs that existed on the same pool on the bricked vm (see the sketch after this post).

4) Is there any easy way to transfer other configuration items from the bricked vm?
a) OI server name
b) smb/cifs (the server was already joined to AD, with shares enabled)
c) FC card configuration (e.g. target mode, instead of initiator)

5) Is it possible to enable overflow protection (use max 90% of current space) on an existing Rpool?

Thank you for the advice.
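
Regarding 3): as far as I know COMSTAR keeps its configuration in the stmf SMF service, so one possible route (untested here) is to export that service's configuration on the old vm and import it on the new one, then re-register the logical units from their zvols. Pool/zvol names below are placeholders:

Code:
# on the bricked vm (maintenance mode), dump the COMSTAR/stmf configuration
svccfg export -a stmf > /tmp/stmf-backup.xml
# on the new vm, after the data pools are imported
svccfg import /tmp/stmf-backup.xml
svcadm restart stmf
# or re-register individual logical units straight from their zvols
sbdadm import-lu /dev/zvol/rdsk/tank/lu01
stmfadm list-lu -v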
 
Quick question that I'm sure has been asked. I'm planning on overhauling my storage server soon and was wondering if it would be able to import my pools with no problem if it's using a different controller card than before. Right now I'm using SuperMicro AOC-SAT2-MV8s and was looking at the Intel SAS one that people like.
 
Yep, I've done it multiple times. Make sure you export the pool before you switch cards, although even if you didn't it would probably still work. Better to do it right, though.
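
For reference, the export/import itself ("tank" is a placeholder pool name):

Code:
# before shutting down to swap the controller
zpool export tank
# after booting with the new card: list importable pools, then import by name
zpool import
zpool import tank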
 
OK, I need help from someone familiar with Solaris networking.
I'm trying to connect my ZFS NAS to 2 VMware hosts through 10-gig Ethernet, but without using a switch.
I have a 2-port 10-gig card in the NAS and single-port cards in each VMware server. I figured it would be simple: just create a bridge on the ZFS NAS (Solaris) and it would work. The problem is that after I use dladm to create a bridge between ixgbe0 and ixgbe1, I can't assign an IP to the bridge device (bridge0).
Any help would be appreciated.
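
For reference, the sequence described above looks roughly like this (the address is only an example):

Code:
# bridge the two 10GbE ports; a bridge named "bridge" shows up as the device bridge0
dladm create-bridge -l ixgbe0 -l ixgbe1 bridge
dladm show-bridge
# the step that fails: trying to plumb and address the bridge device itself
ipadm create-ip bridge0
ipadm create-addr -T static -a 10.0.0.1/24 bridge0/v4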
 
OK, I need help from someone familiar with Solaris networking.
I'm trying to connect my ZFS NAS to 2 VMware hosts through 10-gig Ethernet, but without using a switch.
I have a 2-port 10-gig card in the NAS and single-port cards in each VMware server. I figured it would be simple: just create a bridge on the ZFS NAS (Solaris) and it would work. The problem is that after I use dladm to create a bridge between ixgbe0 and ixgbe1, I can't assign an IP to the bridge device (bridge0).
Any help would be appreciated.

VMware does not support bridging or routing between external interfaces across a vSwitch. You'll have to set up a separate point-to-point subnet between the NAS and each client VMware host; each subnet will need to be on a separate vSwitch in the NAS (rough addressing sketch below). You won't be able to pass traffic between the two VMware hosts over the 10-gig links because VMware won't bridge or route them.

If you're really ambitious, you can set up a router application as a VM on the NAS and route traffic between the two subnets (something like pfSense, etc.). I tried this and it does work. The problem is that using a compute-plane router inside a VM creates a good deal of packet-routing latency; not a problem at 1 gig, but at 10 gig it's devastating to throughput on NFS and/or TCP.
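
To illustrate the point-to-point idea, a made-up /30 per link on the NAS side; the matching vmkernel port on each ESXi host takes the other address in its /30, each pair on its own vSwitch (all values are examples):

Code:
# NAS side: one point-to-point /30 per 10GbE port
ipadm create-ip ixgbe0
ipadm create-addr -T static -a 10.10.10.1/30 ixgbe0/v4
ipadm create-ip ixgbe1
ipadm create-addr -T static -a 10.10.20.1/30 ixgbe1/v4
# ESXi host A vmkernel: 10.10.10.2/30, ESXi host B vmkernel: 10.10.20.2/30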
 
I'm confused. I thought he was talking about the Solaris box bridging two 10gig nics - the Vmware hosts are just clients - they are not involved in bridging in any way, no?
 
I'm confused. I thought he was talking about the Solaris box bridging two 10gig nics - the Vmware hosts are just clients - they are not involved in bridging in any way, no?

Ah, got it. I assumed he was talking about an "all-in-one" ZFS NAS. I should really read more carefully before I answer!

He should certainly be able to do this on his Solaris ZFS host. Not sure why it isn't working.
 
I'm no expert, but maybe try this:

Set static IP on NIC1
Bridge NIC2 to NIC1
Plug a pc into NIC2 and see if you can ping NIC1's IP.

I could be wrong, but I don't think you configure IPs on bridges, even in pfSense. I'm going to test this in a VM as soon as I get a chance and edit this post. Please keep us updated on your progress, because I'm working on a solution for work that this option would come in handy with.

EDIT: pfSense does allow for setting an IP on a bridge.

EDIT2: What I was trying to get at is that maybe the NICs, even though they're part of a bridge, have to be configured individually.
 
Ah, got it. I assumed he was talking about an "all-in-one" ZFS NAS. I should really read more carefully before I answer!

He should certainly be able to do this on his Solaris ZFS host. Not sure why it isn't working.

Exactly, the ZFS box is a physical server, not a VM in an all-in-one config.
 
Yeah, wish I could help, but Solaris networking is, well, not that straightforward (at least not if you're a Linux/BSD guy...). If you come up dry, you might want to subscribe to the OpenIndiana mailing list; some pretty experienced folks there...
 
I'm no expert, but maybe try this:

Set static IP on NIC1
Bridge NIC2 to NIC1
Plug a pc into NIC2 and see if you can ping NIC1's IP.

I could be wrong, but I don't think you configure IPs on bridges, even in pfSense. I'm going to test this in a VM as soon as I get a chance and edit this post. Please keep us updated on your progress, because I'm working on a solution for work that this option would come in handy with.

EDIT: pfSense does allow for setting an IP on a bridge.

EDIT2: What I was trying to get at is that maybe the NICs, even though they're part of a bridge, have to be configured individually.

OK, this just got crazier. If I set the IP on ixgbe0 and connect something to ixgbe1, it doesn't work.
If I reverse it (IP on ixgbe1, connected to the host through ixgbe0), it works.
The only difference is that when I try to set the IP on ixgbe1, ipadm says it can't set a permanent IP on a temporary device, so I have to use ipadm -t (temporary setting); I don't get that error on ixgbe0.
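
That error usually means the IP interface on that link was created as temporary (for example by nwam or an earlier ipadm -t / ifconfig plumb). One possible fix, assuming nothing else is using the link (the address is an example):

Code:
# recreate the IP interface persistently, then add a persistent address
ipadm delete-ip ixgbe1
ipadm create-ip ixgbe1
ipadm create-addr -T static -a 10.0.0.1/24 ixgbe1/v4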
 
Question: is nwam enabled? I seem to recall nwam does not play well with multiple NICs. If that's the case, you probably need to disable nwam, enable the physical network service, and go from there.
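
On OpenIndiana that switch is roughly the following (double-check the service names on your build):

Code:
# switch from automatic (nwam) to manual network configuration
svcadm disable svc:/network/physical:nwam
svcadm enable svc:/network/physical:default
# then configure the links and addresses by hand with dladm and ipadm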
 
i3-2100T
Dell 5i HBA
J&W H61 Mini-ITX
dual Broadcom NIC

OI 148
napp-it 0.5

Intel G2 SSD 80GB (boot)
Seagate Constellation 1TB x 2 (mirror)
Samsung F4 2TB x 2 (mirror)


The write speed is about 80 MB/s, but the read speed is only 40+ MB/s.


I couldn't figure out what's wrong... anyone have any idea?
 
What is the client? Win7? I seem to recall multiple threads with people reporting slow reads with CIFS.
 
What is the client? Win7? I seem to recall multiple threads with people reporting slow reads with CIFS.

Yes... I googled it and realised a lot of people are complaining about it, but it seems like no one knows how to resolve this. Sigh.

And this is so depressing... I thought mirroring would give me better performance than RAIDZ, which is why I opted to go the mirroring path. HDDs are cheap anyway.
 
It does give better performance (read at least) - can't help it if CIFS is screwing you. Can you use NFS?
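
If NFS is an option, sharing a dataset from the ZFS side is a one-liner (dataset name is an example); note a Win7 client would need Services for NFS, or you could test from another Unix box:

Code:
# share an existing dataset over NFS and verify
zfs set sharenfs=on tank/media
zfs get sharenfs tank/media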
 