ZFS build check & hardware RAID

Mastaba

I originally wanted to build my fileserver around an expensive Areca ARC-1882ix-24 controller. Then I read lots of threads about ZFS, which seems much safer against all the silent data corruption problems than hardware RAID.
So I changed my mind and rethought my build.

First question: am I missing something? Is there still any benefit to this kind of expensive hardware RAID setup over a cheaper and safer software RAIDZ that needs neither an expensive controller nor expensive TLER enterprise drives, while also having self-healing capabilities? Does hardware RAID give better performance in any way?

As data integrity is my priority I don't think I have much choice, but I would like to be sure anyway.

UPDATE

The config I thought of:

mobo: X9SCM-iiF
cpu: xeon E3-1270 V2
ram: 4*8GB DDR3 ECC
hba: 3*M1015
network: X540-T2
ssd: intel 335 240GB (boot/ZIL/L2ARC to be defined)
hdd: 24*5K4000
case: RM424pro
psu: seasonic P-760
os: ? (OpenIndiana, OpenSolaris, FreeNAS, Nexenta, ???)

mobo: X9SRH-7TF or X9SRL-F
cpu: xeon E5-1620
ram: x*16GB DDR3 ECC registered
hba: 2 or 3*M1015
network: none or X540-T2
boot: SLC USB key
ssd: ZIL intel 320 80GB + L2ARC intel 520 480GB
hdd: 24*5K4000
case: RM424pro
psu: seasonic P-760
os: OpenIndiana+napp-it
cooling: kama stay


Questions:

About the RAM size, I wonder if replacing this LGA1155 system with an LGA2011 one is worth it?
I could use a lot more RAM than 32GB, and I read that ZFS loves RAM.
But I also saw the "too much RAM" issue mentioned.
So what's the best amount of RAM to avoid these freezes?
Is increasing RAM above 32GB worth it? (128GB would be required for the 1GB RAM per 1TB HDD ratio, and more could always be useful)
Also, is deduplication worth it? Is it safe? (as it seems to be impossible to "undedupe" data)
It seems to consume a lot of CPU power and lots of RAM to work properly, so I wonder.
I won't use deduplication

About CPU power, what are ZFS's needs (with/without deduplication)?
Because there is also the E3-1220L with its tiny 17W TDP, but I'm afraid it would bottleneck performance.
Also, an LGA2011 setup would allow a 6-core CPU.

About the M1015, I read in a thread that there are different revisions (46M0861 & 46M0831); which one is the right one and how do I tell?
How do I cool them properly, as they seem to get very hot?
120mm fans cooling the PCI cards should do it, but unfortunately I couldn't find setups that don't eat PCI slots (I thought of placing the fans "topdown").
I noted the (discontinued) Kama Stay, but it still requires one PCI slot.
I hope the 3*120mm fanwall will be enough, as I can't find the Kama Stay or similar

About SSD, ZIL, L2ARC & HDD, is it worth it?
ZIL seems very dangerous to use, as losing it could lead to losing the whole of the data stored on the HDDs. Is the performance increase worth the risk of having such a weak link?

How do you install the OS? On a third SSD? Is it possible to use the ZIL SSD, as only 8GB are required for it? (from what I read)

The 5xxxRPM non-TLER HDDs should work quietly enough and give sufficient performance, I hope.
I also read that ZFS doesn't work well with 4K HDDs, but I can't find any 4TB non-4K HDD, so how do I proceed? Is there a way to build a proper RAIDZ using 4TB HDDs?
ashift=12

About L2ARC: while it is very useful for increasing the IOPS of slow 5xxxRPM drives, and safer than a ZIL since it is only a read cache, how does it deal with faster read performance?
What if the L2ARC SSD can't read as fast as the 24 drives can? (And can't even write data as fast as the HDDs can read.)
Because I don't think there is any consumer SSD that can reach the 1GB/s read (and even less write) speed attained by a 24-HDD array...
Still wondering if a lower-throughput L2ARC can bottleneck a large array in the case of sequential reads

Can ZFS use SGPIO from backplanes for spotting failed drive(s)?

About the OS, which one should I use?
What are the pros/cons of the different ZFS-capable OSes?
OpenIndiana+napp-it sounds like a nice choice

About the RAIDZ configuration, what's the best setup for 24 drives?
I read pools should be made of a precise number of drives, like 10 for best performance, but that would make a third pool of only 4 HDDs.
Also, what kind of RAIDZ?
-24 HDDs all in one RAIDZ2/RAIDZ3?
-2*10 RAIDZ2 + 4 in RAIDZ?
-3*8 RAIDZ1/RAIDZ2?
-2*12 RAIDZ2?
Is the performance drop significant if I don't make 10-drive pools? Because I don't want to lose too much disk space, as the main goal will be storage.
I will probably make a 24-HDD Z3


-----------------------------------------------------------------------------------------------------------------------------



Second build:

I also want to renew my Windows desktop multi-purpose config (internet, photo editing, light games, download box...).
The problem is that while using Windows, which is required for apps/games, I won't be able to benefit from ZFS's strengths and self-healing capabilities, which annoys me, as data could be corrupted before being written to the fileserver. I would also get the RAID5 write-hole problem if using hardware RAID5...
I read someone set up ZFS inside a VM, but lost his data because of some flush problem in the VM layer.
Is there any way to do this (a safe Windows RAID system) with checksums and all?
Some sort of automatic par/checksum system, added to a weekly scrub.

I know I could also write directly to the ZFS server over the network and even boot from it, but this would mean leaving the fileserver online 24/7, consuming power and making noise (probably more noise than the quiet desktop config I want), as it would never idle.
Any ideas for this problem?

I still have to choose between an all-in-one or two separate configs

The config I have thought of so far:

mobo: X9SCM-iiF
cpu: E3 1270 V2
heatsink: CR95C
ram: 16/32GB DDR3 ECC
network: X540-T2
video: passive 7750 or 7770
sound: xonar stx
ssd: intel 335 240GB
raid: areca ARC-1223 8i
hdd: 4*WD RED 3/4TB in RAID5
case: RSV-L4000
fans: 3*noctua NF-P12 for the middle 120mm fanwall
backplane: SK-34A-S2, removing the stock 80mm and using the fanwall for cooling instead
psu: seasonic P-660
os: windows7

in case of two configs:
desktop config:

mobo: X9SCM-iiF or X9SRH-7TF
cpu: E3-1270V2 or E5-1620
ram: 16GB DDR3 ECC or registered
network: X540-T2 or none
video: 7750/7770
sound: xonar stx/st
ssd: intel 335 240GB
case: RSV-L4000
fans: 3*NF-P12
psu: seasonic P-520 or P-660 (unsure about safe position of a fanless psu into a 4U case)
os: windows7
5"1/4 drive bays: LTO6, sata CF reader, 2.5" racks

ZFS config:
mobo: X9SCM-iiF or X9SRH-7TF
cpu: E3-1270V2 or E5-1620
ram: 4*8GB DDR3 ECC or 2/4*16GB DDR3 ECC registered
network: X540-T2 or none
ssd: ZIL 2*intel 320 80GB mirrored, L2ARC 2*intel 520 480GB striped
case: RSV-L4000
fans: 3*NF-P12
backplane(s): 1/2*CSE-M35T-1 black
hba: M1015 or none
hdds: 5/10*5K4000 raidz2
psu: seasonic P-520 or P-660
os: OI+napp-it
5"1/4 bays: 6*2.5>1*5"1/4 rack

in case of all-in-one:
all-in-one:

mobo: X9SRH-7TF
cpu: E5-1620
ram: 3*16GB DDR3 ECC registered
video: 7750/7770
sound: xonar st
ssd: 2*335 240GB boot mirrored, 2*320 80GB ZIL mirrored, 2*520 480GB L2ARC striped
case: RSV-L4000
fans: 3*NF-P12
backplane(s): 1/2*CSE-M35T-1 black
hba: M1015
hdds: 5/10*5K4000 raidz2
psu: seasonic P-520 or P-660
os: OI+napp-it
5"1/4 bays: 6*2.5>1*5"1/4 rack, LTO6, sata CF reader.



About compatibility, does the NoFan CR-95C fit into this case?
Does the SK-34A-S2 fit in the RSV-L4000, in front of the fanwall?
Or do you know a better rackmount case that also has a 3*120mm mid fanwall and could use it to cool the SK-34A-S2 backplane with its fan removed?
Still searching for alternative choices of 4U case with a 3*120mm fanwall

About the network, I thought of a 10GbE link between these two configs; the X540-T2 seems to be the thing to buy, but what about the switch?
I would need something with at least 8/12 1Gb ports and 4 10Gb ports.
I noted some interesting 24*1Gb + 4*10Gb switches within the $1000 price range from Cisco/HP/Netgear, like the HP E2910-24G, the SG500X-24-K9-NA or the GSM7328S-200NAS.
What about the noise? They seem to have built-in fans.

About backup, I thought of an LTO loader, but it seems these are subject to the same silent data corruption problem.
Is there a way to safely back up data without building a costly second ZFS server?

edit:
I just realized that the X9SCM-iiF has only two PCI-E 8x and two PCI-E 4x slots; is this going to create problems or bottlenecks?
For the M1015 (+8*5K4000)
For the X540-T2 10GbE NIC
For the Areca 1223-8i
For the 7750/7770
The problem would only occur in an all-in-one on the 1155 platform.
 
Questions:

About the RAM size, I wonder if replacing this LGA1155 system with an LGA2011 one is worth it?
I could use a lot more RAM than 32GB, and I read that ZFS loves RAM.
But I also saw the "too much RAM" issue mentioned.
So what's the best amount of RAM to avoid these freezes?
Is increasing RAM above 32GB worth it? (128GB would be required for the 1GB RAM per 1TB HDD ratio, and more could always be useful)
Also, is deduplication worth it? Is it safe? (as it seems to be impossible to "undedupe" data)
It seems to consume a lot of CPU power and lots of RAM to work properly, so I wonder.

ZFS can use as much memory as you give it, but that isn't the same as saying it actually needs it all.
Much depends on the type of data and your I/O profile.

Dedup also depends on the type of data - some data doesn't have much duplication (a media collection for instance), and in those cases dedup is a bit of a waste TBH.
You can test your data before enabling dedup though!
It can be quite resource hungry, so if you don't need it, don't use it!
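For what it's worth, you can get that estimate without enabling anything: zdb can simulate dedup against an existing pool. A minimal sketch, assuming a pool called "tank" (read-only, it changes nothing):

zdb -S tank
# prints a simulated dedup-table histogram; the summary line at the bottom gives
# the would-be dedup ratio - anything close to 1.00x means dedup would just burn
# RAM and CPU for nothing

If the ratio comes back near 1, that settles the question before you commit the resources.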


About CPU power, what are ZFS's needs (with/without deduplication)?
Because there is also the E3-1220L with its tiny 17W TDP, but I'm afraid it would bottleneck performance.
Also, an LGA2011 setup would allow a 6-core CPU.

Depends on how many pools/vdevs/users/features... it can vary from needing very little CPU, then upwards from there.


About SSD, ZIL, L2ARC & HDD, is it worth it?
ZIL seems very dangerous to use, as losing it could lead to losing the whole of the data stored on the HDDs. Is the performance increase worth the risk of having such a weak link?

I'm not sure what makes you think this.
All pools have a ZIL - though it's only used for synchronous writes - it's not needed for asynchronous writes.
You can speed up the ZIL by using a "log" device, essentially the ZIL is moved from the slower main pool, to faster storage, usually an SSD.
Due to the atomic nature of write I/Os and the way ZFS is designed, it's not really all that dangerous to lose a log device, unless you are unlucky and lose the log device and main memory at the same time - if you are worried about this, you can mirror the log devices.
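For reference, attaching a mirrored log device to an existing pool is a one-liner - a sketch with placeholder pool and device names:

# add a mirrored SLOG made of two SSDs to the pool "tank"
zpool add tank log mirror c4t0d0 c4t1d0
# it shows up under its own "logs" section, separate from the data vdevs
zpool status tank
# log devices can also be removed again later, by the vdev name zpool status shows
zpool remove tank mirror-1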

How do you install the OS? On a third SSD? Is it possible to use the ZIL SSD, as only 8GB are required for it? (from what I read)

You can do this, though a ZIL log device really wants a write optimised SSD as opposed to the general purpose SSD the OS wants.
The size is variable - depends on the server write throughput - but you are in the ballpark with the ZIL log device not needing to be all that big, relatively speaking!

The 5xxxRPM non-TLER HDDs should work quietly enough and give sufficient performance, I hope.
I also read that ZFS doesn't work well with 4K HDDs, but I can't find any 4TB non-4K HDD, so how do I proceed? Is there a way to build a proper RAIDZ using 4TB HDDs?

ZFS is OK with 4K disks, as long as you configure the pool from the outset for 4K disks (ashift=12).
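You can check what a pool actually ended up with via zdb; how you force it differs by implementation. A sketch, assuming a pool named "tank" - the -o ashift=12 flag is the ZFS-on-Linux/FreeBSD way, while illumos-based distros of this era typically need an sd.conf override or a patched zpool binary instead:

# verify the ashift of an existing pool (12 = 4K sectors, 9 = 512B sectors)
zdb -C tank | grep ashift
# where the flag is supported, force 4K alignment at creation time:
zpool create -o ashift=12 tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0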


About L2ARC: while it is very useful for increasing the IOPS of slow 5xxxRPM drives, and safer than a ZIL since it is only a read cache, how does it deal with faster read performance?
What if the L2ARC SSD can't read as fast as the 24 drives can? (And can't even write data as fast as the HDDs can read.)
Because I don't think there is any consumer SSD that can reach the 1GB/s read (and even less write) speed attained by a 24-HDD array...

It's not about throughput (that's how you measure sequential I/O) - it's about IOPS with small random I/O. L2ARC won't help if you are serving up HD movie files for instance (sequential I/O) - though it might help with supporting many clients doing small random reads. L2ARC won't help as much as ARC though - so increase ARC first if small random read performance is an issue - L2ARC is much cheaper though...
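If you do add one, a cache device goes in the same way as a log device, and the illumos kstat counters show whether the caches are actually earning their keep - a sketch with placeholder names:

# add an L2ARC (cache) device; it only ever holds copies of data, so losing it is harmless
zpool add tank cache c5t0d0
# ARC and L2ARC hit/miss counters on OpenIndiana/illumos
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses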


Can ZFS use SGPIO from backplanes for spotting failed drive(s)?

This is a hardware function - nothing to do with ZFS. If the system fails a drive, ZFS will notice it though, as soon as it does any I/O to that pool/vdev.

About the OS, which one should I use?
What are the pros/cons of the different ZFS-capable OSes?

That can't really be answered in a few paragraphs - though I might steer clear of Solaris itself unless you have a support contract with Oracle.

About the RAIDZ configuration, what's the best setup for 24 drives?
I read pools should be made of a precise number of drives, like 10 for best performance, but that would make a third pool of only 4 HDDs.
Also, what kind of RAIDZ?
-24 HDDs all in one RAIDZ2/RAIDZ3?
-2*10 RAIDZ2 + 4 in RAIDZ?
-3*8 RAIDZ1/RAIDZ2?
-2*12 RAIDZ2?
Is the performance drop significant if I don't make 10-drive pools? Because I don't want to lose too much disk space, as the main goal will be storage.

It's all a tradeoff in the end... a 24-disk single-vdev pool might give the best sequential I/O numbers, but it'd also give the longest resilver times, carries the highest risk of multiple disk failure and has the lowest IOPS.

The other question is whether you go for a single pool with, say, 3x 8-drive vdevs, or 3 separate 8-disk single-vdev pools...

It depends on how you manage your data and what that data is - unfortunately there's no one best approach to suit all situations. You have to tradeoff performance, capacity, cost and resilience against each other - eg if resilience and capacity are your priorities, then you must either sacrifice performance or increase cost!

As to performance, there is a sweetspot for vdev sizes

Z1 - 3, 5 or 9 drives (2, 4 or 8 data plus one parity)
Z2 - 4, 6 or 10 drives (2, 4 or 8 data plus two parity)
Z3 - 5, 7 or 11 drives (2, 4 or 8 data plus three parity)

but I really wouldn't get too hung up on it TBH, unless synthetic benchmarks are your thing, and you must squeeze every last ounce out of your config! :)

Eg - an 8-drive RAIDZ2 will still be faster on sequential I/O than a 6-drive RAIDZ2, but not by as much as you might expect - ie it probably won't be 33% faster (as you might expect with traditional RAID systems).
You might get a bit more performance from 4x 6-drive Z2 vdevs than you would from 3x 8-drive RAIDZ2 vdevs, but at the cost of another two drives' capacity (there's that tradeoff again :) )
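To make the layout question concrete, this is roughly what the two extremes look like at creation time - a sketch with placeholder disk names (a real system will have c*t*d*/WWN names):

# one pool built from three 8-disk raidz2 vdevs; writes stripe across the vdevs,
# so you get roughly three vdevs' worth of IOPS
zpool create tank \
  raidz2 disk1 disk2 disk3 disk4 disk5 disk6 disk7 disk8 \
  raidz2 disk9 disk10 disk11 disk12 disk13 disk14 disk15 disk16 \
  raidz2 disk17 disk18 disk19 disk20 disk21 disk22 disk23 disk24

# versus one 24-disk raidz3 vdev - more usable space, but a single vdev's worth
# of IOPS and much longer resilvers:
# zpool create tank raidz3 disk1 disk2 ... disk24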
 
Since you want 10Gbit, I would go with the integrated X9SRH-7TF. And you only need 2x M1015 now, since there is also an LSI 2308 on it.
 
Thanks for the advice!
Damn, this board looks nice!

edit: but there are no SFF-8087 ports, just individual SAS ports instead?

edit2: other options:
X9SRL-F with enough PCIe 8x slots.
X9DRH-7TF with 10GbE + 8*SAS via SFF-8087 + dual 2011 + 512GB max :eek:
 
edit: but there are no SFF-8087 ports, just individual SAS ports instead?
You use reverse breakout cables (like this) to connect it to the 4224 chassis. The same goes for the onboard SATA controller ports.

edit2: other options:
X9SRL-F with enough PCIe 8x slots.
My precious:cool:
I just pointed the X9SRH-7TF out because by buying a mobo with integrated 10Gbit you save a lot of $. But if you have the $ then go for the X9SRH-7TF. Just a note, the LSI 2208 is a high-end hardware RAID controller. Using it as an HBA for ZFS is kind of pointless.
edit: Sorry for the typo. In the last sentence I meant the X9DRH-7TF.
 
Nice!
About the X9SRL-F, can you use the first PCI-E slot? It seems very close to the RAM banks.

About the X9SRH-7TF, is the LSI 2308 better than or equal to Areca's ARC-1223-8i? Because this mobo could be a nice replacement for my desktop computer (second build).

I'm still wondering if it would be possible to build a RAIDZ under Windows using a VM?


Will installing an M1015 (filled with 8*5K4000) or an X540-T2 in a PCI-E 4x slot bottleneck their performance?
Because the X9SCM-iiF only has two x8 slots.

As I don't think I will use deduplication/compression, I wonder if it is still worth it to upgrade from LGA1155 to LGA2011. (more future RAM upgrades, more CPU upgrades when the 22nm E5-2600 v2 comes out, and more PCIe lanes for more 8x slots, as I couldn't find any 1155 mobo with enough lanes)
On the other hand the CPU seems to be more power hungry, and I'm also a little worried about the "too much RAM" issue.

Is it possible/recommended to install the OS on the ZIL or L2ARC SSD?


Do M1015 cards with the SFF-8087 ports placed at the rear end of the card exist, like on the more expensive LSI models? Because this would make it more practical to cool the cards with a top-down fan like the Kama Stay.
About this, does anyone know where to buy this now-discontinued Scythe Kama Stay, or any equivalent?

About the RAIDZ configuration: if I go with 2 vdevs of 12 HDDs each in RAIDZ2, is this like some RAID60? If I lose three HDDs in the same vdev, does that mean all data is lost, compared to a single RAIDZ3 of 24 HDDs?
 
I just pointed the X9SRH-7TF out because by buying a mobo with integrated 10Gbit you save a lot of $.

Thanks for that tip. I just did a quick check and it appears it's actually cheaper to purchase a motherboard with dual 10Gbit Intel NICs than to purchase an adapter with dual 10Gbit Intel NICs. Seems crazy. Although I have not looked at eBay...
 
About the X9SRL-F, can you use the first PCI-E slot? It seems very close to the RAM banks.
I have one M1015 in the first slot and it's fine.
About the X9SRH-7TF, is the LSI 2308 better than or equal to Areca's ARC-1223-8i? Because this mobo could be a nice replacement for my desktop computer (second build).
The LSI 2308 is an update to the 2008 (which is on the LSI 9211 and IBM M1015 for example), so no cache and RAID 1/0/10 only. That Areca is a full-featured RAID6 card.
I'm still wondering if it would be possible to build a RAIDZ under Windows using a VM?
If you want ZFS and Windows, just do an all-in-one like the rest of us :D
Will installing an M1015 (filled with 8*5K4000) or an X540-T2 in a PCI-E 4x slot bottleneck their performance?
Because the X9SCM-iiF only has two x8 slots.
Don't know about the network, but it should be fine for the HDDs.
As I don't think I will use deduplication/compression, I wonder if it is still worth it to upgrade from LGA1155 to LGA2011. (more future RAM upgrades, more CPU upgrades when the 22nm E5-2600 v2 comes out, and more PCIe lanes for more 8x slots, as I couldn't find any 1155 mobo with enough lanes)
On the other hand the CPU seems to be more power hungry, and I'm also a little worried about the "too much RAM" issue.
I used s2011 because I wanted the option of cheap RAM (have 32GB, going to buy 32 more) and enough PCIe lanes (and slots). My E5-2620, 10x RED 3TB, 3x M1015, 3 SSDs and SS860 Platinum runs at 90W idle and that's good enough for me.
Is it possible/recommended to install the OS on the ZIL or L2ARC SSD?
It's possible but not recommended. I know the idea of ZIL & L2ARC sounds cool, but until you build a system and find the speeds for your kind of use are too low, forget about them. I plan to have 2 or 3 pools so I'm not going to have 6 SSDs just for that. Besides, 10x RED RAIDZ2 works at 1Gbit just fine (with sync=default) and that's all I need for now.
Do M1015 cards with the SFF-8087 ports placed at the rear end of the card exist, like on the more expensive LSI models? Because this would make it more practical to cool the cards with a top-down fan like the Kama Stay.
About this, does anyone know where to buy this now-discontinued Scythe Kama Stay, or any equivalent?
That, and you could also use 50cm SAS cables. But for the M1015 they are too short; 60cm would work though.
That's a perfect way of adding SSDs and a PCIe cooler to the case. I modified my 120mm fan bracket to attach this. It's ugly and takes a lot of space, but I can attach an extra 3x 3.5" + 2x 2.5" drives to the case. It turns out the 120mm fans are blowing enough air over the M1015s, plus I have them on a PCI bracket with holes so the hot air gets right out of the case. I will upload a photo of the setup, but it's very messy and I'm not proud of it :( But I needed to add at least 3 SSDs and one 3.5" HDD (hot spare) to the setup and couldn't think of any other way to put all of that in one case.
About the RAIDZ configuration: if I go with 2 vdevs of 12 HDDs each in RAIDZ2, is this like some RAID60? If I lose three HDDs in the same vdev, does that mean all data is lost, compared to a single RAIDZ3 of 24 HDDs?
Yes and yes. I'm in the same situation, but I'm going to go with 4 (RAID10) and 2x 10 RAIDZ2 with separate pools for now.
 
If you lose the Logzilla (the ZIL on an SSD) then you will only lose your latest writes; you will not lose the entire zpool. Read the Wikipedia article about ZFS for more info on this.


For the problem of Windows using ZFS to get data corruption protection, you can install Windows on the Solaris server and boot your workstation via a remote connection using iSCSI. That way you don't have any hard disk in your workstation, because it uses the ZFS storage on the Solaris server, via LAN.
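If anyone wants to try that diskless-Windows idea, the rough COMSTAR recipe on a Solaris/illumos box looks something like this - a sketch with made-up names and sizes, and you still have to set up the iSCSI boot side (boot ROM or iPXE) on the client:

# carve a zvol out of the pool to act as the Windows "disk"
zfs create -V 120G tank/win7boot
# enable the COMSTAR iSCSI target framework and export the zvol as a LUN
svcadm enable -r svc:/network/iscsi/target:default
stmfadm create-lu /dev/zvol/rdsk/tank/win7boot
stmfadm add-view <LU-GUID-printed-by-create-lu>
itadm create-target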

If you are using the open-source Solaris distro called "SmartOS", then you can create a container/jail/zone/whatever you want to call it and install Windows in it via KVM. Then you connect your workstation PC to the SmartOS server and run everything from there, via LAN. Now Windows is installed on top of ZFS, so you get full data corruption protection. SmartOS boots entirely from a USB stick and does not install itself to any disk - it runs everything from RAM. It is mainly used to deploy lots of servers in the cloud; if you are a cloud provider, you will love SmartOS. Via KVM you can install many different OSes, and they all get the protection from ZFS and the transparency from DTrace.
http://smartos.org/

Running your workstation from your server uses a lot of power if your server has 24 disks. Therefore you can use one single SSD for the Windows install and cut power to the 24 disks. When you need the 24 disks, you turn them on using something like this:
http://www.lian-li.com/v2/en/product/product06.php?pr_index=487&cl_index=2&sc_index=6&ss_index=125

Actually, I am doing something similar. I have one single SATA disk for temporary storage, and when I need access to my zpool to offload/backup I just turn all the disks on, using a device similar to the one in the link. This way the server is silent, with one active SATA disk and one active SSD.

Or, you can install SmartOS on your Windows workstation at the bottom, and then run Windows on top of it via KVM.



Or you can install Solaris and install VirtualBox on top of Solaris. Then install Windows in VirtualBox and run Windows on top of Solaris. This is what I do. Heavy new 3D graphics games don't work, but older games like Quake 2 work fine in virtualized Windows on Solaris.

I actually use SunRay thin clients that are connected to my Solaris server. No software runs on the SunRay; it only shows bitmaps from the server output and forwards keyboard/mouse input to the server. The SunRay is totally OS-free, nothing runs on it. If you need more CPU power, you upgrade the server. One core can drive five heavy office users, so one quad core can drive 20 heavy office users. So, my girlfriend logs into Solaris via SunRay and boots up Windows in VirtualBox. I sit at the server and play Windows games via VirtualBox, while she does Office work. And everything is protected because it runs on top of ZFS. Neat.




EDIT: SmartOS seems to be the shit among the Solarish distros. Several of the famous Solaris kernel hackers have quit Oracle and joined SmartOS. They have improved SmartOS considerably.
http://www.theregister.co.uk/2011/08/15/kvm_hypervisor_ported_to_son_of_solaris/
With I/O-bound database workloads, he says, the SmartOS KVM is five to tens times faster than bare metal Windows and Linux (meaning no virtualization), and if you're running something like the Java Virtual Machine or PHP atop an existing bare metal hypervisor and move to SmartOS, he says, you'll see ten to fifty times better performance - though he acknowledges this too will vary depending on workload. "We can actually take SQL server and a Windows image and run it faster than bare metal windows. So why would you run bare metal Windows?"
"We're actually able to do instrumentation around Windows and Linux that Windows and Linux have never seen[!!!!!!this is thanks to DTrace!!!!], not even at Microsoft or Red Hat,"

As I understand it, if they virtualize, for instance, 32-bit Windows XP or 32-bit Linux, then you can only access 4GB RAM. But say you virtualize the OS on top of a SmartOS server with 128GB RAM; then you can cache a lot of data in RAM, and you can also use a 10Gbit NIC, which WinXP cannot do. So you can increase performance a lot.
 
If you lose the Logzilla (the ZIL on an SSD) then you will only lose your latest writes; you will not lose the entire zpool.

You shouldn't lose any writes as long as that's the only failure.
At least in more recent ZFS implementations, the main pool is updated from the ARC in main memory (or rather the transaction groups stored in the ARC), not from the ZIL - the ZIL (whether it's on a log device (eg an SSD) or in the main pool) is an intent log rather than a write cache.
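As an aside, whether the ZIL is involved at all is a per-dataset decision via the sync property - a quick sketch, with a made-up dataset name:

zfs get sync tank/data             # standard = honour sync write requests (the default)
zfs set sync=always tank/data      # push every write through the ZIL / log device
zfs set sync=disabled tank/data    # ignore sync requests - fast, but a crash can lose
                                   # the last few seconds of acknowledged writes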
 
Thank you all very much!

If you want ZFS and Windows, just do an all-in-one like the rest of us
What do you mean by "all-in-one"?

I plan to have 2 or 3 pools so I'm not going to have 6 SSDs just for that.
Must there be a separate ZIL/L2ARC for each pool?

I'm also interested in your homemade cooling setup using Lian Li's EX-36.


@brutalizer:

That's very interesting!
As I'd like to have my desktop computer independent from the fileserver, I don't think I will boot & write directly to its RAIDZ.

Is it better to run a virtualized Windows on top of SmartOS than the opposite? (I mean a 64-bit Windows 7 that could make use of more than 4GB RAM)

And what about compatibility and performance? You said heavy 3D games won't work; does that mean there is some performance and/or compatibility loss?
 
What do you mean by "all-in-one"?

Must there be a separate ZIL/L2ARC for each pool?

All-in-one = ESXi installed, pass the controller through to a storage VM, and that VM provides the datastore for the other VMs.

http://www.napp-it.org/napp-it/all-in-one/index_en.html



ZILs and L2ARCs are per pool, not global.
 
Oh fantastic, that's exactly what I dreamed of!

But where do I install Windows 7? Separately on the boot drive, or also virtualized in another VM?
 
@Dami

Wow, that's a very ingenious and well done setup; I don't understand why you're not proud of it.
This should come as a case option for non-DIY people.

@Silhouette:

So even with ashift=12 there will be a performance loss?
 
So even with ashift=12 there will be a performance loss?

No, you will lose some capacity (compared to ashift=9) because you get more slack space. How much depends on the data that you store. If you search you should find some discussions about this. I ended up switching back to ashift=9 for some of my servers, even though the performance difference is significant.
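A rough illustration of where the slack comes from (simplified, and ignoring RAIDZ parity and padding, which amplify the effect): space is allocated in whole sectors, so odd-sized blocks round up further with 4K sectors than with 512B ones.

# a 5 KB block costs ten 512B sectors at ashift=9 but two 4K sectors at ashift=12
echo "ashift=9: $(( 10 * 512 )) bytes   ashift=12: $(( 2 * 4096 )) bytes"
# -> ashift=9: 5120 bytes   ashift=12: 8192 bytes

With big files and the default 128K recordsize the overhead is small; lots of small blocks is where ashift=12 hurts.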
 
About ESXi & all-in-one, what kind of performance can I expect compared to a normal, non-virtualized Windows config?

*About desktop tasks, from internet browsing/mail/downloading apps to heavy photo editing (everything used at the same time under a pretty heavy load)?

*About games? It looks like DirectX games won't work well, but where is the limit? Can I at least output a 2560*1600 signal?

*About a soundcard, can I use one in an all-in-one?


Also, what are the CPU needs for:

*An all-in-one config; what's the best choice between more cores & more GHz?
Do I need to allocate cores, or is the load automatically balanced?
About RAM, I thought of allocating 16GB for the ZFS part and 16GB for the Windows part. (on a 1155 board; more like 32/16GB with 2011)

*ZFS storage with 5/10 HDDs in RAIDZ2?
Is a low-power E3-1220L with 32GB RAM enough, or will the throughput be CPU limited? (without dedupe, but with a possible 10GbE upgrade in the future)


Because I'm wondering what would be the best choice between:
1/ a classic Windows desktop config + a second ZFS config as storage space.
2/ an all-in-one config.

I know two configs would eat more energy than only one, but with one X9SCM-iiF + E3-1220L V2 + 32GB + M1015 for the ZFS box (can I use some of the SATA2 ports already on the mobo to complete a 10-HDD RAIDZ2, or is it not recommended to use them?), and another 1155 config with X9SCM-iiF + E3-1270 V2 + 16GB + some AMD GPU for the desktop, the total consumption should be limited, and this way I don't have any performance or compatibility problems since the desktop OS won't be virtualized.


On the other hand, an all-in-one config would perhaps be more interesting in a different way?
If I can do all the desktop tasks (= all but games) in the all-in-one without losing too much performance, I could build another config only for games, with the benefit of not mixing games & desktop apps and thus lowering the risk of crashing the whole system with a game.
Also the power consumption should be lower with only one config.

About the board, would it be more interesting to go for a 2011 config for the ZFS box, allowing more RAM upgrade capability (more than 32GB), more PCIe lanes & SAS + a 10GbE NIC already onboard, and a future 22nm upgrade?
 
About CPU power, what are ZFS's needs (with/without deduplication)?
Because there is also the E3-1220L with its tiny 17W TDP, but I'm afraid it would bottleneck performance.

What kind of load do you expect for your file server? I recently upgraded my home fileserver to that CPU and haven't seen problems. When I run a dd benchmark from napp-it, the CPU is still 79%-82% idle.

I have 6x 2TB 5400RPM green HDDs in RAIDZ2 and the write speed is a bit over 200MB/s with the pool being 90% full (need to get more drives soon...). During reads of the same big test file, kernel usage is about 6%-10%, so no problem there. Read speeds are around 250MB/s.

Ran prstat on dd while doing a 32GB write (16GB RAM):

PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
26434 root 0.0 6.6 0.0 0.0 0.0 0.0 89 4.4 83 39 153 0 dd/1

89% SLP means it is I/O bound.
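For anyone who wants to run the same kind of test, something along these lines should do it - a sketch with a made-up path, and a file of at least ~2x RAM so you aren't just benchmarking the ARC:

# sequential write test into a dataset on the pool (32 GiB of zeroes)
dd if=/dev/zero of=/tank/bench/testfile bs=1024k count=32768
# sequential read of the same file afterwards
dd if=/tank/bench/testfile of=/dev/null bs=1024k
# watch microstates while it runs: high SLP = waiting on I/O, high USR/SYS = CPU-bound
prstat -m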
 
Thanks for the input!

I'm still hesitating between two separate configs (Windows desktop + ZFS storage) or an all-in-one (I don't know much about its limitations, soundcard capability, compatibility & performance).

The expected load for the ZFS storage part would be internet download box/home server/DVB-S2 recording over the network, so nothing really heavy, but I don't want any throughput bottleneck in the case of a possible 10Gb network.

While turned on 24/7, I'd like to limit the power consumption as much as I can without hurting performance. (I plan to build a RAIDZ2 of 5 or 10 HDDs.)

The larger 24-drive storage server will obviously need more CPU power to run the RAIDZ3 and manage the higher throughput.

Here, Zarathustra[H] had to upgrade his dual core to a quad core when increasing the number of disks from 4 to 6.
SMB seems to eat a lot of CPU power too.

I don't know if a quad core or more is needed/sufficient/overkill.
Something like an E3-1270V2 or E5-1620.
 
I used s2011 because I wanted the option of cheap RAM (have 32GB, going to buy 32 more) and enough PCIe lanes (and slots). My E5-2620, 10x RED 3TB, 3x M1015, 3 SSDs and SS860 Platinum runs at 90W idle and that's good enough for me.

What's the benefit of using an E5-2620 instead of an E5-1620 in a single-CPU config? :confused:
 
I don't use SMB so I'm not sure how much that adds to the CPU load, but I doubt you will be able to fill 10GbE with 10 drives regardless of the CPU. A faster CPU is nice over the long run for sure, but you might be able to save enough $ on electricity to buy a new CPU after a few years :D.
 
@Dami
Interesting, I missed this one!
Are more cores better? Or did you need them for a specific reason?

@Terahz
Yeah, I just don't want the throughput to be CPU limited, as it won't be network limited.
If that works well enough I probably won't have to upgrade the CPU, as the ZFS server will be its only purpose. (Or maybe with a lower-consumption revision, like some 22nm Ivy Bridge-EP.)

If the consumption is higher, doesn't that mean I was CPU limited?
 
Are more cores better? Or did you need them for a specific reason?
Afaik there is no comparison of what's better (more cores with a low clock, or fewer cores with a high clock). For a pure ZFS setup, both (E5-x620) are enough. Because I have ESXi I went with more cores and a lower TDP.
 
About ESXi, can you use a soundcard in an all-in-one?
What are the compatibility and performance limitations?
 
Regarding the E2910-24G, I can tell you that it is loud. Right now I'm running mine with the top off and the fans disconnected and it still seems to operate fine. I also have a small silent home theater fan blowing on the CPU.
 
Thanks, that's interesting.
Do you think it would be possible to replace the fans with quieter ones?
How many watts does it consume?


About making a 24-drive ZFS RAIDZ3, is there a bottleneck problem using an LGA1155 config?
Because considering the lack of PCIe 8x slots, I'll have to install some of the M1015s into PCIe 4x slots.
Also the RAM would be limited to 32GB; do I need more for a ~70TB 24-drive Z3 array? For a ~26TB 10-drive Z2?
I heard about 1GB RAM for each TB of data, and 2GB for each 100GB of L2ARC, which would mean 32GB total is insufficient?

For the boot (OI/napp-it) I thought of using this:
[image]

with a Kingston Traveler Mini:
[image]

Am I right?
 
That RAM amount rule is for dedupe, to keep the dedup table in RAM.
For a file server, dedupe is impractical in most cases, and with 32GB you should be fine.
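To put a rough number on it, a back-of-the-envelope estimate - assuming an average block size around 128KB and something like 320 bytes of RAM per dedup-table entry (both are assumptions, real data varies):

# ~30 TB of data / 128 KB blocks = ~250 million blocks
# at ~320 bytes per DDT entry that is about 75 GB of RAM just for the table
echo $(( 30 * 1024 * 1024 * 1024 / 128 * 320 / 1024 / 1024 / 1024 ))    # prints 75

Which is exactly why dedup only makes sense when the data actually dedupes well.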

About those Kingstons, have you found any USB 3.0 ones with decent speeds?
Because the normal 2.0 ones are slow, 10/5MB/s tops.
I have them on several servers; they work fine if you don't poke the OS, but when changes are needed it takes ages to compile anything on them.
 
Does anyone know of any PCIe card with internal bootable USB3 ports capable of booting OpenIndiana for a ZFS config?
 
Does anyone know of any PCIe card with internal bootable USB3 ports capable of booting OpenIndiana for a ZFS config?

- There is currently no USB3 support in OI
- There is currently no ESXi support for datastores on USB

So prefer a small SATA SSD (20GB+) for booting
 
Thanks!
And what do you think of:

-a possible bottleneck from plugging an M1015 into a PCIe 4x slot (in the case of an 1155 config + 24 drives)
-a possible lack of RAM if using an 1155 platform limited to 32GB
>do you think a 2011 setup is worth it over 1155 for these possible issues?

-the limitations of an ESXi config (sound? graphics? games? What won't I be able to do compared to a normal, non-virtualized Windows desktop config?)
 
As long as you put the HDDs on the PCIe 4x slots, you're fine. But you can't get past the 32GB, so ask yourself if that is really worth paying 100-150€ more for a 2011 board.
With ESXi I don't think you can have any of that. I believe Hyper-V is more friendly.
Just FYI, I'm going to set up my ZFS box as bare metal and have the VMs (or a single OS, I don't know yet) on a separate box.
 
About compatibility, do you see any problems between these?
X9SCM-iiF + CR-95C + E3-1270V2 + RSV-L4000 + 3*Gentle Typhoon AP-12 (800RPM)

(mobo/heatsink and case/heatsink size compatibility, fans/heatsink/CPU cooling)
edit: the CR-95C seems to block the first PCIe slot... any idea for a silent heatsink? I thought of using the 3*120mm fanwall from the Rosewill to cool some passive heatsink.

What are the requirements for booting a ZFS OS like OpenIndiana? Does the boot drive need to have fast reads/writes, or is it only used at initial boot, after which the OS remains loaded in RAM?
 
SmartOS is a Solaris distro that boots from USB, and into RAM.

Never use a single vdev of 24 disks. Limit each vdev to 8-11 disks. Read the Wikipedia article on ZFS for details.
 
You mean like 4*6 Z2 vdevs? 8/24 parity drives (=1/3 of total space) seems overkill to me.
Or 3*8 Z2; 6/24 parity (=1/4 of space), a bit better but still a lot of waste, plus a non-optimal setup.
Or 2*12 Z2; 4/24 parity (=1/6), still a lot but a more balanced ratio, non-optimal setup too.
Or 2*11 Z3; 6/24 parity, optimal but costly (1/4), and 2 drives are left over...

Also I'd need to add more ZIL+L2ARC for each vdev (smaller L2ARC*2 for a 2*12 Z2, or a larger L2ARC for a 24-drive Z3?)
 
You mean like 4*6 Z2 vdevs? 8/24 parity drives (=1/3 of total space) seems overkill to me.
Or 3*8 Z2; 6/24 parity (=1/4 of space), a bit better but still a lot of waste, plus a non-optimal setup.
Or 2*12 Z2; 4/24 parity (=1/6), still a lot but a more balanced ratio, non-optimal setup too.
Or 2*11 Z3; 6/24 parity, optimal but costly (1/4), and 2 drives are left over...

Also I'd need to add more ZIL+L2ARC for each vdev (smaller L2ARC*2 for a 2*12 Z2, or a larger L2ARC for a 24-drive Z3?)

With 20+ disks, I would use 2x RAID-Z2 vdevs with a hotspare.
Advantages compared to one large Z3:
- double I/O performance
- similar sequential performance
- you can replace half of the pool with larger disks when you need more capacity
- shorter resilver time (also affects performance)
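A sketch of what that looks like in practice, with placeholder device names - the spare just sits in the pool waiting, and "replacing half the pool" is swapping the disks of one vdev for bigger ones, one at a time:

# add a hot spare to the pool
zpool add tank spare c6t23d0
# let vdevs grow automatically once every disk in them has been replaced with a larger one
zpool set autoexpand=on tank
# replace one disk with a bigger one, wait for the resilver to finish, then do the next
zpool replace tank c1t0d0 c7t0d0
zpool status tank         # shows resilver progress and the spare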

ARC and ZIL are per pool not per vdev

If you decide to go bare metal, OmniOS and ZFS-mirrored USB sticks are very nice.
I have prepared ready-to-use and fully configured USB sticks ("napp-it to go") for the HP MicroServer and common SuperMicro server boards from the X9 line. You just need a 16GB stick.
Download the image, copy it to the stick and you can run the box.
 
Why not just 1 M1015 and an HP expander? Or is 3x M1015 cheaper? I can't find them for the sub-$100 price any more. :(
 
Also, you should give NAS4Free a try. Although they do recommend 1GB RAM per TB, I've tried with even less than half of that and it performed great. Other than the nice graphs FreeNAS provides, I've decided to go with NAS4Free, as its ZFS seems lighter on RAM in my experience.

Start out with 16GB RAM if you're going to have around 30+TB, try the performance yourself, and another kit is just going to get cheaper.

If you go with ESXi 5.1 you can pass the M1015 through to the N4F VM, so it can work on any setup if it ever dies.
 
Interesting, thanks!

I've made my choice: three separate configs without virtualization (Windows desktop only, a small 24/7 ZFS NAS (10-drive Z2) and the larger ZFS storage box (24 drives)).
I read ZFS needs (for best performance) 1GB RAM for each TB of data even without dedupe, just for the metadata.
So I'll go for LGA2011 (power consumption is equivalent to 1155 Ivy Bridge at idle, so it should be fine).

@_Gea:
Sounds good, but which is better between 2*10 Z2 and 2*12 Z2? (optimal number of drives vs more drives; where's the tradeoff threshold?)
I won't need a hotspare, as I already planned to buy cold spares and can swap them myself. Also, having 4 wasted slots annoys me a little.

So I'll also need 2*ZIL and 2*L2ARC if I have 2 pools?

About the boot, is there a benefit to using an SSD instead of booting from a USB stick? (boot time should be longer, but after that?)

Also I will use 5K4000 HDDs; I already have 6 of them :)
 