As some people have suggested, I'm creating this thread to help people set up their fileserver using the ZFS filesystem. I will also try to explain some things about ZFS and give some links to get you started.
What is ZFS?
Sun Microsystems developed the Zettabyte File System (ZFS), which first showed up in Sun's Solaris operating system in mid-2005. ZFS is different from other filesystems like FAT, NTFS, Ext3/4, UFS, JFS and XFS because ZFS is both a filesystem and RAID engine in one package. This yields emergent properties; things that aren't possible if the two systems were separate.
Why is it so cool?
ZFS is the most advanced single-system filesystem available, with unique properties:
- ZFS is both Filesystem and RAID-engine.
- ZFS protects your data from corruption using checksums.
- ZFS is maintenance-free and requires no filesystem check; it automatically fixes any problems or corruption.
- ZFS can also act as (incremental) backup using snapshots, much like Windows Restore Points.
- ZFS is versatile; it allows you to grow your filesystem by adding more disks.
Some more technical features:
- It allows you to make any combination of RAID0 (striping), RAID1 (mirroring), RAID5 (single parity) and RAID6 (double parity)
- Because of its copy-on-write design, it's very resilient against crashes and other problems, and won't ever need a filesystem check!
- Because ZFS is both RAID and filesystem, it can use dynamic stripe sizes to adapt to the I/O workload.
- Aggressive caching and buffering make ZFS consume lots of RAM, but this benefits I/O performance.
- Transparent compression can significantly reduce on-disk size in some cases.
- ZFS can use SSDs as cache devices, increasing the performance of the entire array while adapting to your usage pattern.
In essence, ZFS offers excellent protection against data loss of any kind, short of disasters that physically affect the whole system, like flooding or fire. It's also very flexible, allowing you to change almost anything.
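To give a feel for how those RAID levels map to actual ZFS commands, here is a rough sketch; the pool name "tank" and the disk labels are only placeholders:
Code:
# a 2-disk mirror (RAID1)
zpool create tank mirror label/disk1 label/disk2
# a 4-disk double-parity RAID-Z2 (RAID6)
zpool create tank raidz2 label/disk1 label/disk2 label/disk3 label/disk4
# plain striping (RAID0): just list the disks without a vdev type
zpool create tank label/disk1 label/disk2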
What do I need to use ZFS?
Sadly, ZFS doesn't run on Windows, and Apple has withdrawn its plans to include it in Mac OS X. Windows falls short in offering advanced storage technology - all the hot stuff is on Linux or UNIXes like FreeBSD. So for the purpose of building our fileserver, we want something more modern. To use ZFS, you need:
- 64-bit dual-core/multi-core processor (AMD64)
- Lots of RAM (2GB minimum; 4GB+ recommended)
- Modern motherboard with onboard SATA + gigabit ethernet and PCI-express
- Operating system that supports ZFS. Currently: OpenSolaris, FreeBSD and FreeNAS
Tell me more about FreeNAS
FreeNAS is an open source OS based on FreeBSD, but adapted to function as an easy NAS operating system that is configurable over the network. FreeNAS installs very easily and should be easy to configure, but it has limited features. It also offers ZFS, but a slightly older version: v6 instead of the v13/v14 that FreeBSD/OpenSolaris use. It is generally your best option if you want something set up quickly, but for large filesystems or more complicated setups a full-fledged operating system is required.
So what about FreeBSD?
FreeBSD version 7.0 - 7.2 supports ZFS version 6. This is what FreeNAS uses.
FreeBSD version 8.0 supports ZFS version 13.
FreeBSD version 8.1 supports ZFS version 14.
FreeBSD 8.1 is not released yet, so FreeBSD 8.0 is the current stable release with a full kernel-based implementation of ZFS.
FreeBSD is a UNIX operating system and may be difficult to master. The installation in particular is a bit tricky, as it uses a dated text-based installer.
Wait, hold on! So you're saying this FreeBSD is command line stuff?
Well yes; the installation and setup in particular will be hard, and you should follow a howto document or be guided by someone familiar with FreeBSD. After that, you can use a Windows SSH client like PuTTY to connect to your FreeBSD server, so you can configure the server from your Windows PC while still working on a command line.
So what do I need to do on this command line? I prefer something graphical!
Yes, well, it can be useful to work with commands so you know exactly how you got there. If you write down the commands you used to create your filesystem, you know how to do it a second time. So while commands may be scary at first, they will feel more logical in the long run.
So what do these commands look like? Can you give me an example?
Certainly!
Code:
# creates a RAID-Z (RAID5) array called "tank" from disks 1, 2 and 3
zpool create tank raidz label/disk1 label/disk2 label/disk3
# create filesystems
zfs create tank/pictures
zfs create tank/documents
# enable compression for our documents directory only
zfs set compression=gzip tank/documents
# also store each file in the documents directory on all three disks, for maximum safety
zfs set copies=3 tank/documents
# snapshot the documents directory, creating a "restore point"
zfs snapshot tank/documents@2010-03-04
# made a mistake? simply roll back to the last snapshot
zfs rollback tank/documents@2010-03-04
# get status from your array
zpool status tank
But ZFS is still Software RAID right? Should i get a real Hardware RAID card instead?
No, certainly not! Doing so would mean losing the part of ZFS that lets it heal itself. You should let ZFS do the RAID and just use the onboard SATA ports. If those are not enough, expand with PCI-express controllers that present the disks to ZFS as normal SATA controllers. Never use PCI for anything! Only PCI-express.
But still isn't software RAID slower than hardware RAID?
On paper it's not; on paper software RAID is superior to hardware RAID. Hardware RAID adds latency that is unavoidable, while software RAID already has access to very fast hardware, and it is easier to implement an advanced RAID engine in software. Note that in the case of hardware RAID it is still the firmware (which is software) that actually implements the RAID, and that implementation may be simple and unsophisticated compared to ZFS.
As for speed, ZFS is fast enough, but never at the expense of data safety. Some unprotected filesystems may be a fraction faster, but ZFS adds a lot of reliability without sacrificing much speed.
So exactly how do I set up this ZFS?
I will be explaining this in detail later, but generally:
- First, install your operating system; I'm assuming FreeBSD here. The OS should be on a separate system drive, which can be a USB pendrive, a CompactFlash card, a parallel ATA disk or just a SATA drive. It's best if the system drive is completely separate from the disks that will be used by ZFS.
- Then connect your HDDs and let FreeBSD find them. Label the drives so each one has a name like label/disk1, label/disk2, etc. This avoids confusion and makes sure each drive is found and identified correctly, regardless of how it is connected (see the example after this list).
- ZFS is already included in FreeBSD, so there is nothing extra to install.
- Create ZFS RAID pool using the "zpool create" command
- Create ZFS filesystems
- Set various ZFS options
- Set permissions
- Set up Samba/NFS so you can use the filesystem from your networked computers
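As a rough illustration of the labeling and pool-creation steps above (the device names, pool name and user name are just examples and will differ on your system):
Code:
# give each raw disk a permanent label
glabel label disk1 /dev/ad4
glabel label disk2 /dev/ad6
glabel label disk3 /dev/ad8
# create the pool from the labels rather than the raw device names
zpool create tank raidz label/disk1 label/disk2 label/disk3
# create a filesystem and hand it to your user account
zfs create tank/share
chown youruser /tank/share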
Can i expand my ZFS RAID array?
Yes, but some restrictions apply.
What you cannot do is expand an existing RAID-Z (RAID5) or RAID-Z2 (RAID6) array with one or more disks.
What you can do is add new disks or whole arrays to an existing pool. So if you have a 4-disk RAID-Z, you can add another 4-disk RAID-Z to end up with 8 disks. The second array shares its free space with the first; in essence you get a RAID0 of two RAID5 arrays. ZFS can expand this way.
You can also expand mirrors and RAID0s. In the example above that is what actually happened: a new array is striped (RAID0) with the existing one. Newly created files will be written across both arrays, for additional speed. Setting copies=2 would make files in that filesystem be stored twice, spread over both RAID arrays, for extra redundancy.
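A sketch of what such an expansion looks like on the command line (pool and label names are examples):
Code:
# add a second 4-disk RAID-Z to the existing pool "tank"
zpool add tank raidz label/disk5 label/disk6 label/disk7 label/disk8
# attach an extra disk to an existing mirror member, growing the mirror
zpool attach tank label/disk1 label/disk9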
What if a disk fails?
You identify which disk is causing problems with the zpool status command, then replace the failed disk with a new one.
As long as the failures do not exceed the redundancy offered by ZFS, everything will continue to work, including write access.
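A minimal sketch of the replacement procedure (names are examples):
Code:
# see which disk is faulted
zpool status tank
# replace the bad disk with the new one
zpool replace tank label/disk2 label/disk2new
# check progress of the rebuild ("resilver")
zpool status tank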
Do I need to use TLER or RAID edition hard drives?
No, and if your drives have TLER you should disable it when using ZFS. TLER is only useful for mission-critical servers that cannot afford to freeze for 10-60 seconds, and to cope with poor-quality RAID controllers that panic when a drive stops responding for several seconds because it is performing recovery on some sector. Do not use TLER with ZFS!
Instead, allow the drive to recover from its errors. ZFS will wait, and the wait time can be configured. You won't end up with broken RAID arrays, which is common with Windows-based FakeRAID.
How future-proof is ZFS?
Now that Sun has been acquired by Oracle, the future of ZFS may be uncertain. However, it is open source and still in development. Several non-Sun operating systems have ZFS integrated, and projects like kFreeBSD may bring ZFS to Linux distributions like Ubuntu.
But ZFS is not very portable; only a few operating systems can read it.
However, you can connect the disks to a Windows machine and use VirtualBox/VMware to let FreeBSD inside a VM access the RAID and export it over the network. That works, but Windows should not touch your disks in any way. Simply choosing to 'initialize' the disks would lead to data loss and perhaps total corruption as key sectors get overwritten. ZFS is resilient, but such tampering may exceed the metadata redundancy of three copies per metadata block.
How do I maintain ZFS? Defragment, etc.?
You don't. You don't need to.
The only thing you need to do is make sure you get an email or alert when one of your drives fails or develops corruption, so you are aware and can intervene at the earliest opportunity.
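On FreeBSD, a simple sketch of how to do that (assuming the daily periodic mail to root already reaches you):
Code:
# /etc/periodic.conf - include pool health in the daily status mail
daily_status_zfs_enable="YES"
# manual check: prints "all pools are healthy" when nothing is wrong
zpool status -x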
Can ZFS replace a backup?
RAID alone can never replace a backup; RAID doesn't protect against accidental file deletion, filesystem corruption or a virus that wipes the drive. But ZFS can protect against that: using snapshots you can make incremental backups, so you can go back in time and retrieve each day's version of the filesystem.
A nightly snapshot is very useful, and snapshots do not use additional storage space unless you modify or delete files after the snapshot is taken.
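A sketch of such a nightly snapshot via cron (the filesystem name is an example; a real setup would also prune old snapshots):
Code:
# /etc/crontab - snapshot tank/documents every night at 03:00
0 3 * * * root /sbin/zfs snapshot tank/documents@`date +\%Y-\%m-\%d`
# list the snapshots you have accumulated
zfs list -t snapshot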
So yes, ZFS can replace a backup. But note that ZFS is advanced software with many lines of code, and any bug may still threaten your data. So for data you can't afford to lose, a real backup at another physical location is still highly recommended.
What about the ZFS support in Linux?
The Linux kernel is licensed under the GPL, which is not compatible with the CDDL license that ZFS uses. That means ZFS can't be integrated directly into the Linux kernel, which would be the best possible implementation. Instead, the zfs-fuse project implements ZFS in userspace, which has big drawbacks and is generally not suitable for most users.
Another effort implements ZFS as a kernel-level CDDL-licensed module linked against the Linux kernel; it has a working prototype but appears unmaintained. If you want ZFS, you need FreeBSD, FreeNAS or OpenSolaris.
How fast does ZFS go?
Real performance is too complicated to be reduced to simple numbers, and the buffering and caching in ZFS make benchmarking it quite hard. But it's very fast in real-world scenarios, and its speed should never be an issue - as long as you do not use PCI in your system!
What are ZFS cache devices?
ZFS is able to use SSDs in a special configuration, where they act as a cache for the HDDs. This is like having more RAM as file cache, except SSDs can hold much more than your RAM. Whenever you read something from the RAID array that is cached, the SSD serves the read request instead, with very low access times. ZFS tracks which data is accessed most and puts that on the SSD, so it automatically adapts to your usage pattern. You can have an array of many terabytes and a small SSD that serves the files you access every day, and get a real improvement in the performance of the array.
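Adding such a cache device is a one-liner (pool and device names are examples):
Code:
# add an SSD as a cache (L2ARC) device for pool "tank"
zpool add tank cache /dev/ad10
# see how the cache device is being used
zpool iostat -v tank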
Can I use hot-spares with ZFS?
Yes, you can add one or more hot-spare disks as 'spare' devices. These will be available to any array that becomes degraded, so you can share one hot-spare disk across multiple RAID-Z arrays, for example.
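For example (names are placeholders):
Code:
# add a hot-spare to the pool
zpool add tank spare label/disk9
# the spare shows up under "spares" in the status output
zpool status tank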
How much RAM can ZFS use?
A lot; the largest I've seen was 6.3GB. Memory usage depends on settings, number of disks, stripe size and, most of all, the workload. The harder you make ZFS work, the more memory it will consume, but it is memory well spent. For low-memory systems you can limit the memory ZFS uses, though this also limits performance. Generally, you should not use ZFS with less than 2GB of RAM without conservative tuning that disables a lot of the fancy ZFS features.
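On FreeBSD, limiting ZFS memory is typically done with loader tunables; a sketch with example values (tune them to your own system):
Code:
# /boot/loader.conf - cap the ZFS ARC cache and kernel memory
vfs.zfs.arc_max="1024M"
vm.kmem_size="1536M"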
Why can't I use ZFS on 32-bit?
You can, but memory constraints mean ZFS is limited to about 512MiB of kernel memory, where only minimal settings work. Under those conditions a heavy workload can cause ZFS to panic and crash. That isn't disastrous - just reboot and it works again without you having to do anything - but it's not the way you should use ZFS. ZFS is a 128-bit filesystem and feels at home on a 64-bit CPU with a 64-bit operating system.
How do I access ZFS from my Windows PC?
For that you need a network protocol. Windows file sharing is the most common, and it uses the CIFS/SMB protocol. Samba can be used to export your ZFS filesystem to your Windows PCs; you get a drive letter like X:\ that contains your ZFS filesystem. Other protocols are recommended though; NFS and iSCSI in particular work very well, but unfortunately they are not natively supported by Windows. While Samba works, it may limit throughput. It's a shame if your ZFS array does 400MB/s internally but over the network you're stuck at 40MB/s; that is a common issue with Samba.
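A minimal Samba share definition for a ZFS filesystem mounted at /tank/share might look like this (the share name, path and user are examples):
Code:
# smb.conf
[zfsdata]
   path = /tank/share
   read only = no
   valid users = alice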
How do you access your ZFS filesystem?
I use ZFS as a mass-storage disk and access it using NFS (Network File System), the preferred way to share files on Linux and the like.
I also use ZFS to store the system volumes of my five Ubuntu workstations. My desktop PCs don't have internal drives - everything is on the network, on ZFS. This makes backups much easier, as I can snapshot my system disks. The system drives are accessed using iSCSI, which also works on ZFS under FreeBSD. Booting also happens over the network, using PXE and specifically pxelinux.
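The ZFS side of such an iSCSI setup is just a ZFS volume (zvol); the iSCSI export itself is handled by separate target software (on FreeBSD, for example, istgt from the ports tree). A sketch, with example names and sizes:
Code:
# create a 20GB ZFS volume to export as an iSCSI disk for one workstation
zfs create -V 20G tank/workstation1
# zvols can be snapshotted like any other dataset
zfs snapshot tank/workstation1@2010-03-04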
The upside is that I have a lot of control over my data, especially because I can make incremental snapshots really easily. The downside is that performance is capped by network bandwidth, as I'm still using 1Gbps Ethernet. 10Gbps NICs are available but at a steep cost - more than $500 per NIC - and switches are even more exotic. I suspect prices will drop significantly in 2011, bringing 10 Gigabit to enthusiasts as well as the server market.
Please use this thread to discuss setting up ZFS and talk about its features. Feel free to ask questions.
Version history:
1.0 - initial version
1.1 - added Hot-Spare section, added section about how i access my ZFS