"new" ZFS Server build

TType85

I am looking to put together a new ZFS server for work on the cheap. This is where all of our work gets stored. Nightly, the changes are backed up to our SAN in the data center and mirrored from there to our disaster recovery site. I would love to just keep the files on the SAN, but we only have a 10MB connection to our data center and one or two users working can max that out.

Hardware I have right now that will be re-used
Rosewill RSV-L4411 12 bay hot-swap rack mount case
Supermicro 550W PSU
8x 2TB WD RE4 Drives
4x 1TB WD RE4 Drives (actually have 6 in a raid 5 array right now)

New Hardware
I think I am going to go with the ASRock C2550D4I (12 SATA ports) and 8GB of ECC RAM.

I am going to use FreeNAS or OmniOS/napp-it for the OS.

We currently store about 5TB of data total and the growth per year is not that much, but I need to set it up in stages: put 4x 2TB drives in and copy the data over, then add the other four, and finally the 1TB drives.

Resiliency is more important than space, and speed-wise we are capped at gigabit network links.

My thought is to set up the first batch as a striped mirror, copy the data over, then add the second batch as more mirrors extending the first pool. I will want the 1TB drives as a separate striped mirrored pool.
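Roughly what I have in mind, assuming OmniOS-style disk names (the c1tXd0 names are just placeholders):

  # first batch of four 2TB drives as two mirrored vdevs (striped mirror)
  zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

  # ...copy the data over, then extend the pool with the second batch
  zpool add tank mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0

  # the 1TB drives as their own separate striped mirror pool
  zpool create tank2 mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0

Each mirror pair can lose one disk, and I can keep growing the pool two disks at a time later.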

Does that sound like a sane plan?
 
Just a heads up, I just bought a C2750D4I.

When I asked on the OmniOS forums they sent me the following info (some of the SATA ports might not work under Solaris due to the Marvell chipset):

A similar board might work: ASRock's C2750D4I, reviewed at Serve The Home: https://urldefense.proofpoint.com/v...K1f9WFgUxg= &4e502283de618a76f1a7a652bd7165e4

That has Intel i210 NICs which are supported in recent illumos (including the upcoming OmniOS r151008), and the first six SATA ports come from the C2750, so presumably they'll attach to ahci(7D). Also it will be easier to find ECC modules for the full-size DIMM slots than for the SODIMMs on the Supermicro.
 
Is the one I had in my OP the 4-core version of that one? The only problem with the 2750 is that for the price I can buy an i3/mATX Supermicro board and an eBay HBA.
 
You did choose the 4-core version, but it's most likely built very much the same. (I haven't seen the 2550 on the market yet; what's the cost?)

I just bought a 2750 system, and yeah, it's $400, but I see it like this:

1) Mini-ITX with 64GB RAM capability
2) Fanless CPU
3) LOW powered!!!

If mobo size is not a factor, then why not go with a bigger board and really make that system shine...

My ESXi hosts run L5539s (60W TDP) that can be dual-socketed, and those boards can grow up to 192GB of RAM per board!

It's all dependent on what you're looking for... I would say more cores is better, but that's me.
 
FreeNAS 9.2 won't support the i210 NIC. I'm not sure about the Marvell SATA ports, but I'm not optimistic about them.

If the NICs won't work and about half the SATA ports won't work, why use this board? Don't get me wrong, I'm really excited to see this board and I would love to run one through its paces, but the software support isn't there yet. It will be, I'm sure, but not for at least 3-6 months.

Also, 8GB of RAM is the bare minimum. The current rule of thumb is 1GB of RAM for every 1TB of disk; that would put you at 16GB minimum. Don't even think about skimping on RAM. ZFS doesn't run well when RAM is low.
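If you want to see how much of that RAM the ARC actually gets on FreeNAS/FreeBSD, roughly (on FreeNAS you'd normally add the tunable through the GUI rather than editing loader.conf by hand; the cap value is just an example):

  # current ARC size and ceiling, in bytes
  sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max

  # optional: cap the ARC at boot (example = 12GiB on a 16GB box)
  echo 'vfs.zfs.arc_max="12884901888"' >> /boot/loader.conf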
 
Looks like I'd be better off with older hardware.
Probably go with an 1155 Supermicro board with an i3 and 8GB ECC.

What about the disk config?

It is going to just serve files.
 
There is no rule of thumb with ZFS that says 1GB of RAM for every 1TB. Jeez, can people stop saying that? I'm starting to see it everywhere there is a ZFS discussion. You can run ZFS just fine on systems with only 1GB of system memory and petabytes of storage.
 
No, <4GB will give you interesting issues in the long run.
4-8GB is probably enough for most home users with <20TB.
//Danne
 
Oh and how come? Because your data doesn't fit in the ARC cache and you have to read data from the disk drive instead?
 
What kind of network transport are you going to use?

NFS or CIFS?

If NFS, I would invest in an L2ARC or SLOG SSD to help the transfers out, as those writes are synchronous and you would get poor performance without one.

If CIFS, then I would say you should be fine... I ran a VM that just did CIFS with 4GB RAM for a long time, with a passed-through LSI-1068E with IT firmware (JBOD mode), and never had an issue with it.

Disk config is based on how you want to preserve the data.

In most 8-drive configs I would do a RAID-Z2 (6 data + 2 parity) with a 1TB reservation. That gives you double-disk-failure capability, and yeah, you lose 4TB, but if protecting the data is important go Z2 or Z3 depending on the platter count.

With the 1TB drives the same rules apply; however, since there are only four, you might consider a RAID-Z1 with them to maximize space.

The only thing I would tell you is that for every Z level you go up, parity hits your I/O harder and harder when writing data: Z3 is going to consume more write I/O per block than Z2, and so on down to Z1.
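Something along these lines, with placeholder illumos-style disk names:

  # eight 2TB drives as a single RAID-Z2 vdev (6 data + 2 parity)
  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0

  # hold ~1TB back in an empty dataset so the pool can never run 100% full
  zfs create tank/reserved
  zfs set reservation=1T tank/reserved

  # the four 1TB drives as a RAID-Z1 to maximize space
  zpool create tank2 raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

The reservation just keeps other datasets from ever writing the pool completely full.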
 

We will have our in-office VMs on NFS or iSCSI (4-5 relatively lightweight Server 2008 R2 instances). The file share is going to be CIFS.

The SSD needs to be an SLC drive, correct?

On the drive config: if I can free up all eight 2TB drives at once I was going to do a RAID-Z2, but it looks like I can only do four at a time, so would I be better off with two 4-drive striped mirrors or two Z2s?
 
No, <4GB will give you interesting issues in the long run.
4-8GB is probably enough for most home users with <20TB.
//Danne
Like what issues? I have followed the ZFS mailing lists since the very beginning and been active on the largest ZFS forums, and I have never seen any such "issues" anywhere in 10 years. Maybe you can link to some issues? I myself have run a raidz1 pool on a 1GB RAM, 32-bit Pentium 4 server for over a year without any issues.

Sure, if you use deduplication you will get issues with anything under 1GB RAM per 1TB of disk. This is a well-known problem. ZFS dedup is not mature yet and should be avoided. But for regular use, there is no recommendation of 1GB RAM for each TB of disk. There never has been, since the very start.
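For what it's worth, if you ever want to see what dedup would actually cost on a given pool, you can simulate it without turning anything on ("tank" is just an example pool name):

  # walk the pool and print a simulated dedup table histogram and ratio
  zdb -S tank

The histogram it prints gives you a rough idea of how many unique blocks, and therefore how much RAM, the dedup table would need.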

Probably you have confused ZFS dedup with ordinary ZFS usage. But if you can link to some issues, please do. Show us these issues you refer to.
 
If you have more than one user you'll run into performance issues and eventually kernel crashes due to running out of RAM. If it works for you, fine; the i386 arch isn't even recommended, though.
//Danne
 
This is interesting. Do you have any links on this? Or is it hearsay?
 
The ethernet controller on the Supermicro board is not a Marvell part; it is an Intel controller. Only the PHY is Marvell (Intel does not build discrete PHYs, AFAIK). You don't need drivers for PHYs, since only the MAC has to deal with the PHY and the interface is pretty standard.

The ASRock boards are not as efficient as the Supermicro boards because they not only have discrete ethernet controllers, but also add a PCIe bridge to connect that many external controllers.

Even though the ASRock board does not have a fan, it will most likely get too hot without any active cooling. The C2550 is 14W and the C2750 is even 20W; anything above 10W would require an absurdly large heatsink for completely passive cooling, larger than the one on the board.
 
ZFS Administration Considerations

ZFS Storage Pools Recommendations
This section describes general recommendations for setting up ZFS storage pools.

Systems
Run ZFS on a system that runs a 64-bit kernel
Memory and Swap Space
One Gbyte or more of memory is recommended.
Approximately 64 Kbytes of memory is consumed per mounted ZFS file system. On systems with 1,000s of ZFS file systems, we suggest that you provision 1 Gbyte of extra memory for every 10,000 mounted file systems including snapshots. Be prepared for longer boot times on these systems as well.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
http://www.unixlead.com/zones/ZFS Best Practices Guide.doc

//Danne
 
The poster said he was going to run FreeNAS. FreeNAS is tuned to work with this. You can "run" FreeBSD in 24MB of RAM. It won't run well, and neither will FreeNAS on 1GB of RAM.

See we are arguing two different points. I think running a RAIDZ1 pool on 1GB of RAM is going to be so slow as to be unusable. You claim that it works. Yes, technically, it does work. I could put a 5HP lawn mower engine in my car and drive across the country too.

Look, this is absurd. You post no proof of your claims, yet demand everyone else prove theirs. Your configuration might operate, but performance will be terrible. You cannot possibly tell me that spending money on a ZIL or L2ARC before RAM is the best use of funds. Any reasonable L2ARC will need RAM just to index what is in the cache.

Please post a listing of your system config along with performance benchmarks.
 
Well, too little RAM is one thing; 1GB per TB is another thing entirely. I've got 8GB with 27TB in a single RAIDZ3 vdev, a GUI (OpenIndiana), I even run some Wine apps, and no performance troubles at all.
 
I think running a RAIDZ1 pool on 1GB of RAM is going to be so slow as to be unusable. You claim that it works. Yes, technically, it does work. I could put a 5HP lawn mower engine in my car and drive across the country too.
It would be an interesting test case. I myself have used 1GB RAM on a PC for over a year. It was a 32-bit CPU, so I got only 30MB/sec from a four-disk raidz1. ZFS is 128-bit and prefers 64-bit CPUs, or it will be slow. Also, the 32-bit ZFS code was not good, which could also be part of the slowness. But there were no issues.

It would be interesting if someone tried 1GB RAM on a multi-disk array and compared the same setup to, say, 8GB and 16GB.
 
We will have our in-office VMs on NFS or iSCSI (4-5 relatively lightweight Server 2008 R2 instances). The file share is going to be CIFS.

The SSD needs to be an SLC drive, correct?

On the drive config: if I can free up all eight 2TB drives at once I was going to do a RAID-Z2, but it looks like I can only do four at a time, so would I be better off with two 4-drive striped mirrors or two Z2s?

For a pure fileserver, you do not need a ZIL or sync writes, neither over CIFS nor NFS (disable sync). If you use NFS with ESXi, NTFS or ext filesystems I would enable sync, or with iSCSI I would disable the writeback cache, for data-security reasons. In such a case you should have a dedicated ZIL to improve performance, like a ZeusRAM (best of all) or an Intel S3700 (very good).

Your pool config with 8 disks depends on your needs. A 4 x 2-disk mirrored setup gives you the best I/O values (multiple RAID-10). A RAID-Z2 gives you more capacity but only 1/4 of the I/O values. I would go multi-RAID-10.
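As a rough sketch (pool, filesystem and device names are placeholders):

  # add a dedicated log device (SLOG) for sync writes
  zpool add tank log c4t0d0

  # sync behaviour is set per filesystem
  zfs set sync=disabled tank/smb       # plain CIFS fileshare
  zfs set sync=always tank/nfs-vm      # NFS datastore for ESXi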
 

Unfortunately the good ZIL drives are out of our budget.

I have the system set up right now, and to save money I decided to take one of our VM servers and make it an all-in-one.

Specs are an E3-1230v2, 32GB ECC RAM, Supermicro X9SCL, M1015. Loaded it with ESXi 5.5 and OmniOS/napp-it with 8GB of RAM allocated to it.

I had one of the drives in the first group of four start showing bad sectors, so I started an RMA on that one and WD sent me a 4TB RE4 to replace a 2TB RE4 (/sigh, while this looks like a big mistake in my favor, I really need the 2TB drive :eek: )

I picked up four 3TB Reds, right now set up as a RAID-Z2, and with an iSCSI drive (whatever the standard settings are) mounted in a Windows 2008 R2 server I got ~100MB/s read and write off it over a gigabit uplink to another server. I can't really expect more, so would a ZIL help? I might blow it out and do a striped mirror (RAID 10); the only concern with that is resiliency if two drives fail.
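For reference, on the OmniOS side the iSCSI target is basically just a zvol shared through COMSTAR; a rough sketch with placeholder names (I believe napp-it drives the same thing from its menus):

  # create a zvol to act as the iSCSI LUN (size is just an example)
  zfs create -V 500G tank/iscsi01

  # enable the iSCSI target service and share the zvol through COMSTAR
  svcadm enable -r svc:/network/iscsi/target:default
  itadm create-target
  sbdadm create-lu /dev/zvol/rdsk/tank/iscsi01
  stmfadm add-view <LU GUID printed by sbdadm>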
 

If you build an All-In-One with ESXi, you should use NFS because NFS can auto-reconnect. With iSCSI you must reconnect manually after each reboot. NFS is also more comfortable, as you can reach your VM files over CIFS or NFS to clone, move or back them up, or access snaps via Windows "Previous Versions".

iSCSI is fine if you want a ZFS backend for Apple, Windows or Linux machines, or for a storage head connected to mirrored iSCSI boxes.

For an All-In-One you can think about disabling sync (enabling write-back for iSCSI) for maximal performance. A ZIL is not used at all then. The danger is that your ESXi VMs may get corrupted after a crash, because the last 5s of writes may be lost. You can use a UPS to reduce the problem, and you should create snapshots and replications/backups to another system.

Regarding your performance benchmarks:
These 100 MB/s are sequential values. Multiple VMs are I/O sensitive; for concurrent small reads/writes your performance can drop to a fraction of that. If you enable secure sync writes, performance can slow down to 10% of the non-sync values.

In your case I would disable sync (or spend the money on a 100GB Intel S3700) and use a RAID-10 config with NFS for the VMs. Do backups. If you can spare more RAM for your OmniOS VM, that improves performance as well.
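A minimal sketch of the NFS side, with placeholder names:

  # filesystem for the VMs, shared to ESXi over NFS
  zfs create tank/nfs-vm
  zfs set sharenfs=on tank/nfs-vm
  zfs set sync=disabled tank/nfs-vm   # max performance; accept the last-5s-of-writes risk

  # in ESXi, add an NFS datastore pointing at <storage-vm-ip>:/tank/nfs-vm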

In my own AiOs I use fast SSD-only RAID-Z2 pools for VM datastores and RAID-Z2 disk-based ones for filers and backups.

For more, see my tuning page:
http://www.napp-it.org/manuals/tuning_en.html
 