OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

I see from earlier that you have 8GB of memory. I recommend running tests with no less than 2x physical RAM, to ensure that at least half of the bytes actually get all the way to disk. In addition, if you see a difference between a test with 2x RAM and one with 3x RAM, then you probably need to go very large (30x RAM, say) to benchmark the actual disk system.
ATTO : limited to 2GB
CDM : limited to 4 GB

That said, you need to benchmark what you care about. CDM and ATTO are not the real applications you want to run; what do you want to do with this system? You need to profile your real use. If you watch movies from the array, it's a very different access pattern than compiling code for 50 developers on it (shudder) or running a database on it. Benchmarks only show array performance on the thing they do, not what you want to do with it.
This is only a test for shared storage in a video editing application with up to 4 users in HD (120 Mb/s per feed, up to 4 feeds per user).

This is Bonnie++ without cache (still with 2x RAIDZ vdevs: 5x 2 TB 7200 rpm Hitachi 7K3000 + 5x 1 TB 7200 rpm WD Black, ashift=9, RAM = 8 GB, ZIL = STEC Mach16 50GB)
=> bonnie++ -u root -d /dev/zvol/rdsk/black/video -s 16384M -m ZFServer

Code:
Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ZFServer        16G 73193  99 655848  69 332747  51 58376  99 860896  47  2321   7
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 31544  99 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
ZFServer,16G,73193,99,655848,69,332747,51,58376,99,860896,47,2321.4,7,16,31544,99,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

Cheers.

St3F
 
Hi - I am running OI 151a5

I am getting this error when trying to enable netatalk after upgrading to .8k and netatalk 3 per the instructions on the Napp-it website:

"sudo: /etc/init.d/netatalk: command not found"

The scripts for installing netatalk appeared to run fine and I rebooted afterwards.

Netatalk was working prior to the upgrade except for the problems with OI151a5.

Thanks,
cwagz
 
I know nothing about the netatalk program but troubleshooting that shouldn't be all that hard:

1. Does /etc/init.d/netatalk exist?
$ ls -l /etc/init.d/netatalk

2. If it does, then:

$ sh -x /etc/init.d/netatalk start

To see where it's bombing.

Also, you can see that message if the #! (shebang) directive is pointing at something that doesn't exist:

$ head -2 brokenscript
#!/sbin/program_missing

$ ./brokenscript
command not found
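
So for your case it's worth a quick check of what the first line of the script points at (the second path below is just whatever your shebang prints, not a known netatalk location):

$ head -1 /etc/init.d/netatalk
$ ls -l /path/printed/above

If that interpreter is missing, fixing the shebang or reinstalling netatalk should clear it up.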
 
Anybody have any idea when OI will support versions higher than 28? Been wanting to try it but created my pools with 31 back when I switched to zfs
 
And the chaos begins........;)

not chaos, independence from Oracle -
and the only way to add new features to a free ZFS,
since ZFS v29+ is kept closed source by Oracle.

AND
While ZFS v5000 allows closed-source and incompatible features from any developer,
I would expect a set of commonly used features distributed by Illumos
(upstream of FreeBSD, Illumian, OpenIndiana, OSX, SmartOS, ZFSonLinux and others)

more
http://wiki.illumos.org/display/illumos/Distributions
 
I would expect a set of commonly used features distributed by Illumos
I know Nexenta is working on some new stuff. Not sure about bp_rewrite, but they specifically mention encryption on their sales slides (although it keeps slipping).

I don't expect 4.0 to have much in the way of ZFS features apart from possibly a bump to zpool 5000 (almost 1000% sure it won't default to 5k), as the code base switch is fairly significant itself.
 
If I want to test controllers, is there a simple way to move a pool from one controller to another? Do I have to do anything, or will OI recognize that the drives have moved and automatically use them in their new location?

I'm attempting to compare how a 6-disk raidz2 (2TB Hitachi) performs on an M1015 or an Areca 1260, both configured in JBOD mode. I'm not booting from this pool; the OS is on its own drive.
 
ATTO : limited to 2GB
CDM : limited to 4 GB
So pick a different benchmark, as you did below.
This is only a test for shared storage in a video editing application with up to 4 users in HD (120 Mb/s per feed, up to 4 feeds per user).

This is Bonnie++ without cache (still with 2x RAIDZ vdevs: 5x 2 TB 7200 rpm Hitachi 7K3000 + 5x 1 TB 7200 rpm WD Black, ashift=9, RAM = 8 GB, ZIL = STEC Mach16 50GB)
=> bonnie++ -u root -d /dev/zvol/rdsk/black/video -s 16384M -m ZFServer

Code:
Version 1.03c       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ZFServer        16G 73193  99 655848  69 332747  51 58376  99 860896  47  2321   7
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 31544  99 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
ZFServer,16G,73193,99,655848,69,332747,51,58376,99,860896,47,2321.4,7,16,31544,99,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

Cheers.

St3F
That looks fast locally. How does it do at video editing? A single stream of writing is a different workload than five read/write streams.
 
If I want to test controllers, is there a simple way to move a pool from one controller to another? Do I have to do anything, or will OI recognize that the drives have moved and automatically use them in their new location?

I'm attempting to compare how a 6-disk raidz2 (2TB Hitachi) performs on an M1015 or an Areca 1260, both configured in JBOD mode. I'm not booting from this pool; the OS is on its own drive.

It's a good idea to export the pool if you know ahead of time that you're going to be changing controllers, so the OS doesn't expect to find the disks in the same place they were on the old controller. So run zpool export, turn the machine off and swap controllers, then run zpool import.
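
A minimal sketch of that sequence, assuming a pool named "tank" (substitute your own pool name):

Code:
# zpool export tank     # flush and detach the pool before powering down
# ...power off, swap controllers, power back on...
# zpool import          # with no argument, lists importable pools found on the attached disks
# zpool import tank     # bring it back online under its original name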
 
That looks fast locally. How does it do at video editing? A single stream of writing is a different workload than five read/write streams.
Adobe Premiere CS5.5 can't handle all my rushes ... it freezes.
I have to try Avid Media Composer, but these days I don't have much time.

I'm attempting to compare how a 6-disk raidz2 (2TB Hitachi) performs on an M1015 or an Areca 1260, both configured in JBOD mode. I'm not booting from this pool; the OS is on its own drive.
Interesting.

++
 
It's a good idea to export the pool if you know ahead of time that you're going to be changing controllers, so the OS doesn't expect to find the disks in the same place they were on the old controller. So run zpool export, turn the machine off and swap controllers, then run zpool import.

You don't need to do this, you can rescan the HBAs. Off the top of my head I don't know the CLI commands for this, but Nexenta's GUI has a rescan HBA option.

I think devfsadm is where you want to start your googling.
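
Untested from here, but the stock Solaris device-management commands should cover it:

Code:
# devfsadm -Cv     # rebuild /dev links and prune stale entries, verbosely
# cfgadm -al       # list attachment points to confirm the disks show up on the new controller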
 
You don't need to do this, you can rescan the HBAs. Off the top of my head I don't know the CLI commands for this, but Nexenta's GUI has a rescan HBA option.

I think devfsadm is where you want to start your googling.

If you're willing to replug the disks while the machine is on, yes, you can plug in both controllers and start with the disks on one, run your first test, zpool export, then use cfgadm to release the disks, then physically swap the cable to the other controller and rescan and import the pool, then run the second test.

I'd prefer to just turn the machine off in the middle, to work on it with it off.
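
For anyone who does want the hot-swap route, the rough sequence would be (the attachment point ID below is made up; read the real ones from cfgadm -al):

Code:
# zpool export tank
# cfgadm -al | grep disk           # note each disk's attachment point, e.g. sata1/3
# cfgadm -c unconfigure sata1/3    # release a disk before moving its cable (repeat per disk)
# ...move the cables to the other controller...
# devfsadm -Cv                     # rescan and rebuild device links
# zpool import tank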
 
If you're willing to replug the disks while the machine is on, yes, you can plug in both controllers and start with the disks on one, run your first test, zpool export, then use cfgadm to release the disks, then physically swap the cable to the other controller and rescan and import the pool, then run the second test.

I'd prefer to just turn the machine off in the middle, to work on it with it off.

These are SATA drives in a 24 bay chassis, with individual SATA connections. The ones on the M1015 are SAS-SATA breakouts, the ones on the Areca are straight SATA. So swapping cables would be tedious. Trickier than I'd want to risk in a running machine.

That, and I don't want to risk the drives by having to get them spun down before swapping them around. The time to reboot is trivial compared to the possibility of dealing with failed hardware. I'm fine with powering it down and physically moving the drives to bays connected to the appropriate controller.

I just want to avoid screwing anything up by moving it around without having taken proper steps. Color me cautious for having been burned in the past by various devices and schemes for setting up arrays and their particulars regarding drive ordering and locating.

A bit of preliminary testing in Windows (fwiw) shows the M1015 being 5-10% faster than the Areca when being used as pass-through drives, with the exact same model devices. And this is just with one drive attached. I've not yet done any testing involving multiple devices active on one controller.
 
why the reservation? SATA/SAS is hotplug by default.
Sticking my tender conductive hands in a box with a thousand watt power supply seems like something to avoid ;)
These are SATA drives in a 24 bay chassis, with individual SATA connections. The ones on the M1015 are SAS-SATA breakouts, the ones on the Areca are straight SATA. So swapping cables would be tedious. Trickier than I'd want to risk in a running machine.
Agreed. Especially if you have to swap both ends of the connections.
A bit of preliminary testing in Windows (fwiw) shows the M1015 being 5-10% faster than the Areca when being used as pass-through drives, with the exact same model devices. And this is just with one drive attached. I've not yet done any testing involving multiple devices active on one controller.
Interesting that one drive would show such a difference. I guess if the drive is doing readahead and buffering data, it could conceivably return data to the host at a lower rate than the media transfer rate. What kind of testing are you doing? Sequential transfers? Random reads?
 
Interesting that one drive would show such a difference. I guess if the drive is doing readahead and buffering data, it could conceivably return data to the host at a lower rate than the media transfer rate. What kind of testing are you doing? Sequential transfers? Random reads?

I've got 2 sets of drives, one seagate 1.5tb and another 2tb wd. I ran HD tune and Crystal Disk Mark against each of them individually and noticed the same drives returning slower results when connected to the areca. No other drive activity and the OS was booted from an SSD on motherboard SATA. Both the M1015 and the 1260 were in x8 PCI-e slots.

When running passmark there was a *considerable* performance difference, with the 1260 returning single digit MBps figures. Why, I have no idea.

I'm in the process of rearranging the box to load OI and see what bonnie reports. I'm thinking of setting up just one pool at a time, on just one drive each time, and rebooting in between. Once that's done I intend on setting up a 6-drive raidz2 and repeating the tests. Measuring the performance of this raidz2 pool is what I'm after; testing individual drives is just an exercise. Might as well do it before I start using the box.
 
not chaos, independence from Oracle -
and the only way to add new features to a free ZFS,
since ZFS v29+ is kept closed source by Oracle.

AND
While ZFS v5000 allows closed-source and incompatible features from any developer,
I would expect a set of commonly used features distributed by Illumos
(upstream of FreeBSD, Illumian, OpenIndiana, OSX, SmartOS, ZFSonLinux and others)

more
http://wiki.illumos.org/display/illumos/Distributions


Fair enough - I suppose only time will tell!

The comment was meant to be a bit tongue-in-cheek (hence the smilie) - but I really hope we don't end up with numerous incompatible versions......



PS - I suppose OI is the safest bet at the moment then - at least there's the prospect of support from other OSes (I believe FreeBSD is following suit on zpool version 5000).

Personally though, I'd still be wary of creating/upgrading any pool to version 5000 unless you actually need the features v5000 provides, at least for now!
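
For anyone checking where their pools stand before deciding, these show it (substitute your own pool name):

Code:
$ zpool get version tank     # current pool version (stock OI 151a5 pools are v28)
$ zpool upgrade -v           # list the pool versions this build supports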
 
Fair enough - I suppose only time will tell!

The comment was meant to be a bit tongue-in-cheek (hence the smilie) - but I really hope we don't end up with numerous incompatible versions......



PS - I suppose OI is the safest bet at the moment then - at least there's the prospect of support from other OSes (I believe FreeBSD is following suit on zpool version 5000).

Personally though, I'd still be wary of creating/upgrading any pool to version 5000 unless you actually need the features v5000 provides, at least for now!

The whole point of this is to avoid incompatibilities. Now that Oracle has closed-sourced Solaris, the numbering is useless, since we have no way to be sure (unless they condescend to tell us) how a particular feature for version N works. IMO, it's far worse to have your own encryption and use the same pool version as Oracle when you aren't 100% sure it's compatible. As to converting, yes, I agree. Unless you want/need one or more of the new 'feature' things, don't bother converting your pools. I do like the async destroy feature though :)
 
First post to the forum. Lots of great info here, thanks to you all.

I have a question regarding performance of iSCSI over 10GbE (or Fibre Channel) on an OI server, direct connected to a Mac. But first, background:

My goal is to build an OI-based server that I will connect to a single Mac via either iSCSI or Fibre Channel. In essence the OI box should just function like a huge hard drive attached directly to the Mac. I want to do it this way because (1) ZFS will give me data integrity and (2) I can put the storage in a different physical location from my Mac, which is good for security, noise and heat. I currently have a Promise Pegasus R6 to do the job, which is a Thunderbolt-connected hardware RAID with six disks. It's nice and fast and hasn't failed me in any way, but I would feel better with ZFS on the job and no proprietary hardware. Also moving the array away from the Mac would be nice. So I plan to sell the Pegasus once my OI solution is up and solid.

What I have built:
-Intel S1200BTS (actually S1200BTSR, supports v2 CPUs)
-Xeon E3-1225v2
-16GB Kingston ECC
-LSI 9211-8i SAS/SATA controller, flashed to newest IT firmware
-Hitachi 7K4000 4TB HDDs x8
-Intel 160GB x25m SSD (had laying around)
-SuperMicro CSE-M35T-1B SATA hot swap chassis x2
-Nexus Prominent 9 case

The CPU was originally in a board that needed a CPU with built-in video, thus the E3-xxx5 CPU choice; it does me no good in this mobo, but at least it still works.

Installed OI 151a5 (desktop) in a 30GB partition on the SSD. Maybe that's way too big, don't know. My thought is I can put the rest of the SSD to good use for something else later (ZIL, cache, I don't know). Suggestions welcome on the use of the SSD, like how big the OI boot partition should be and what I should use the rest for.

Made a raidz2 zpool out of the eight 7K4000 drives.
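
For reference, creating a raidz2 pool from eight disks boils down to a single command of this form (pool and device names below are placeholders):

Code:
# zpool create tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0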

As suggested by Gea long ago, to test write speed, ran:

time dd if=/dev/zero of=/tank/test bs=1024000 count=10000

and, to test read speed (after a reboot, since the results seemed wrong), ran:

time dd if=/tank/test of=/dev/null bs=1024000

My results averaged 10.93 seconds (907MB/s) for write and 11.29 seconds (907MB/s) for read.
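(For the read figure, that's just bytes moved over elapsed time: bs=1024000 × count=10000 ≈ 10.24 GB, and 10.24 GB / 11.29 s ≈ 907 MB/s.)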

I understand that maybe dd is not the best benchmark but I wanted to get a rough idea of what the hardware was capable of. The numbers, if accurate, are pretty impressive, to me at least.

So after all that, the reason for my post. I want to connect this thing to my Mac. But I don't want to choke it by connecting it with gigabit Ethernet. Seems like such a waste. And it would be a lot slower than my current Thunderbolt solution too. So I'm thinking I either need to go with 10Gb Ethernet + iSCSI or Fibre Channel. Alas, I will probably need to swap my iMac for a Mac Pro to be able to do so, unless I find a good Thunderbolt-connected solution. But low end refurb Mac Pros cost less than my fairly high end iMac should fetch, so I guess that's not a big deal.

Anyway, I'm curious for any real-world experience running iSCSI over 10GbE or running Fibre Channel, which I guess is either 4Gb or 8Gb. In essence, before taking the plunge of (1) switching my Mac, (2) getting an expensive card for it and (3) getting an expensive card for my OI box, I'd like to know if it will be worth it. How much of that raw performance should I expect to see at the Mac? If my numbers are to be believed, anything short of a 10GbE card will be a bottleneck. I do plan to test iSCSI over gigabit as soon as I can just to verify that it is a bottleneck as I suspect.

Also, any practical experience on the Mac side regarding iSCSI over 10GbE? GlobalSAN initiator work fine?

Sorry for the long post!
 
Silly question: at some time (not sure when), napp-it was changed to (apparently) cache the list of current disks. When I click on the smartinfo tab, it says "Soft-, Hard- and Transfer-errors and unconfigured/offline drives (updated on system-boot!)." I guess I don't understand why one would want this information to be static, but if it is, is there a way other than rebooting to get napp-it to refresh its info?
 
Silly question: at some time (not sure when), napp-it was changed to (apparently) cache the list of current disks. When I click on the smartinfo tab, it says "Soft-, Hard- and Transfer-errors and unconfigured/offline drives (updated on system-boot!)." I guess I don't understand why one would want this information to be static, but if it is, is there a way other than rebooting to get napp-it to refresh its info?

This message has been there from the very beginning of smartmontools support.
I have never rechecked whether it's updated now.

The caching of disk info in napp-it is only in the SAS2 slot monitoring extension. Collecting all the info needed to display the physical slot of SAS2 WWN disks is very slow and not needed when the disks are unchanged (napp-it 0.8k).

Without caching, it can take minutes until the slot info is displayed when there are lots of disks.
 
My goal is to build an OI-based server that I will connect to a single Mac via either iSCSI or Fibre Channel. In essence the OI box should just function like a huge hard drive attached directly to the Mac. I want to do it this way because (1) ZFS will give me data integrity and (2) I can put the storage in a different physical location from my Mac, which is good for security, noise and heat.

I would consider

- much more RAM
- optionally use file-based sharing instead of iSCSI
Reason: it's much easier to restore files from snapshots;
with iSCSI you must clone the whole disk to access a snapshot of a file,
or you must use Time Machine, which is not comparable in any way.

AFP is similar to iSCSI regarding performance.
 
This message has been there from the very beginning of smartmontools support.
I have never rechecked whether it's updated now.

The caching of disk info in napp-it is only in the SAS2 slot monitoring extension. Collecting all the info needed to display the physical slot of SAS2 WWN disks is very slow and not needed when the disks are unchanged (napp-it 0.8k).

Without caching, it can take minutes until the slot info is displayed when there are lots of disks.

My point was, if the display is static, things like s/h/t errors won't change, so I'm not sure what the point of displaying them is? Or am I confused?
 
My point was, if the display is static, things like s/h/t errors won't change, so I'm not sure what the point of displaying them is? Or am I confused?

It's basically the output of iostat -Enr,
so it displays the "current state".

But the displayed current state is not always the real state
(e.g. if you unplug and replace a disk, this is not always displayed immediately);
sometimes you need to unconfigure/configure a disk, sometimes you need a reboot,
but this is also controller dependent. SMART checks also modify the displayed soft errors.
But after a reboot, the info is absolutely valid.

But you are right. This info may be misleading and may not always be true.
I will think about removing it.
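
In the meantime, the raw counters can always be read directly from the shell (plain iostat, the same data napp-it parses):

Code:
# iostat -En | grep Errors     # one line per device with Soft/Hard/Transfer error counts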
 
Sorry if you answered this, but is there a way to click something and refresh it? I have a disk I have since pulled that's still showing up :)
 
noob question how do you install bonnie++ ?

using OI 151a5, I only see DD Benchmark

I did the
Code:
root@openindiana:~# pkg install bonnieplus
No updates necessary for this image.

sorry new to this
 
I would consider

- much more RAM
- optionally use file-based sharing instead of iSCSI
Reason: it's much easier to restore files from snapshots;
with iSCSI you must clone the whole disk to access a snapshot of a file,
or you must use Time Machine, which is not comparable in any way.

AFP is similar to iSCSI regarding performance.

Much more RAM? I thought 16GB was already too much... How much would you recommend?

True, I lose some of the nice ZFS stuff like snapshots, which is a pity. But, if I were to use AFP, I lose spotlight too, which seemed like a bigger pity. Or is that not true?

Did you have any comment regarding the 10GbE / FC question?

Thanks!
 
noob question how do you install bonnie++ ?

using OI 151a5, I only see DD Benchmark

I did the
Code:
root@openindiana:~# pkg install bonnieplus
No updates necessary for this image.

sorry new to this

I install it the same way during setup.
It's in menu Pool - Benchmark.
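
If the menu entry is missing, you can at least check the package and binary from the shell:

Code:
# pkg list bonnieplus    # shows the installed IPS package and version
# which bonnie++         # shows where the bonnie++ binary lives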
 
Much more RAM? I thought 16GB was already too much... How much would you recommend?

True, I lose some of the nice ZFS stuff like snapshots, which is a pity. But, if I were to use AFP, I lose spotlight too, which seemed like a bigger pity. Or is that not true?

Did you have any comment regarding the 10GbE / FC question?

Thanks!

about RAM
One of the advantages of ZFS is the ARC read cache. It uses available RAM as a cache to improve read performance, so ZFS can get faster the longer you use it.

So you can only say: the more RAM the better; use as much as you can plug in or afford. It's the best way to achieve performance and to allow most read requests to be delivered from RAM. So even several hundred GB of RAM can make sense.

http://www.anandtech.com/Show/Index...=2&slug=zfs-building-testing-and-benchmarking

about Spotlight
I have not tried Spotlight with netatalk 3 (it is supported with newer Apple servers)

about speed
With 1 Gb/s you can achieve about 100 MB/s sequential over the network in the best case. If you go to 10 Gb/s,
it is very hard to achieve 1 GB/s, but it seems possible to go up to 300-600 MB/s
- depending on hardware and file size.

Disks have between 70 and 150 MB/s transfer rates these days. If you want to achieve 600 MB/s, you need a disk stripe of at least 10 disks in Raid-0. With 8 disks in a single Raid-Z2 you have 6 data disks, and that can achieve at least 400 MB/s (inner disk sectors) locally with one large sequential transfer. With small files you have the problem that a single Raid-Z2 has only the I/O of a single disk, so values may be worse.

So the reachable values depend on hardware and on test patterns / use case.
 
It's a bit of a hassle trying to post HTML tables, but here goes:
Code:
NAME           SIZE   Date(y.m.d)  File  Seq-Wr-Chr  %CPU  Seq-Write  %CPU  Seq-Rwr  %CPU  Seq-Rd-Chr  %CPU  Seq-Read  %CPU  Rnd-Seeks  %CPU  Files  Seq-Create  Rnd-Create
areca-seagate  1.36T  2012.08.03   32G   119 MB/s    99    127 MB/s   22    43 MB/s  10    75 MB/s     81    88 MB/s   8     624.8/s    2     16     26622/s     23914/s
areca-wd       1.81T  2012.08.03   32G   99 MB/s     83    98 MB/s    17    42 MB/s  10    70 MB/s     76    92 MB/s   8     451.3/s    1     16     26793/s     26405/s
m1015-seagate  1.36T  2012.08.03   32G   107 MB/s    90    102 MB/s   18    46 MB/s  11    68 MB/s     73    102 MB/s  9     606.9/s    1     16     26896/s     22197/s
m1015-wd       1.81T  2012.08.03   32G   79 MB/s     67    77 MB/s    14    41 MB/s  10    84 MB/s     92    112 MB/s  10    457.2/s    1     16     26636/s     25519/s
This is on a Supermicro X7DWE motherboard with 16GB and dual E5440 Xeons. There are two SATA drive models here: Seagate ST31500341AS and Western Digital WD20EARX, one each on an IBM M1015 and an Areca 1260, both in JBOD mode. The OS is on a separate Intel 120GB SSD connected to a motherboard SATA port. This is a fresh install of OI and napp-it, no other drives connected, no other activity on the box.

I created one pool on each drive. There are no folders in them. I then ran Bonnie++ benchmarks from within the napp-it web-ui. I rebooted between each test. I was expecting the Seagates to be faster than the WD, so that's not a surprise. But it is a surprise to see the m1015 being slower than the 1260; as it was just the opposite during tests running in windows. The areca uses the included driver. I had to load the 3.0 imr_sas LSI driver for the m1015. I've done no tweaking or configuring of their setups, but would certainly welcome suggestions.

How does ashift affect performance here? I don't recall choosing it when I set up the pools, but the m1015-seagate is set to ashift 12, while the others are set to 9. Did napp-it make a choice behind the scenes? If so, why did it choose a different setting for the same kind of drive? Would going back and recreating the pools with different ashift make a difference?
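
For anyone who wants to check their own pools: zdb dumps the cached pool configuration, which includes each vdev's ashift:

Code:
# zdb | grep ashift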

My next test is to compare how a 6 drive raidz2 performs on each controller. My goal is to have that perform as best as possible on this box. I then plan on using the remaining connections for media storage for streaming.

I don't expect there to be more than a dozen devices (pcs, tablets, streamers, etc) accessing this box at any one time. It's a beast of a box for a home office and home media library.

Doing single drive testing was just a way to give me a baseline to compare the controllers. There's enough of a difference to make me wonder which one to use. I'll now go connect the 6 drive raidz2 pool and do some more bonnie testing.
 
I install it the same way during setup.
It's in menu Pool - Benchmark.

[attached screenshot: nappit-1.jpg]


Don't know why it's missing.

If I go in via PuTTY and run bonnie++, it says it's version 1.03c.

How can I add it back to napp-it?

Thanks
 
Hi,
I have now finished the setup on my OI ZFS machine; everything is working except one minor thing.

napp-it 0.8k & netatalk 3

What we have here:
1x Mac (ML)
2x Win 7 (Home)

When a Windows PC moves a folder that includes a "thumbs.db" file to the SMB share, the Mac isn't able to delete the folder; it says the folder is currently in use, but it deletes all the files in the folder except the thumbs.db....

Any tips?
 