ZFS on Linux vs. MDADM ext4

Rudde93 · Limp Gawd · Joined: Nov 19, 2010 · Messages: 137
For various reasons I have decided to move away from Solaris to Linux, specifically CentOS 6, and I was wondering what would be best for my file storage on Linux. I need RAID-6 capability (RAID-Z2 in ZFS), so I have been reading up on the matter and found two possible solutions: MDADM or ZFS on Linux (http://zfsonlinux.org/). Judging from recent activity they have active developers on it, and it can only go one way I guess. But I know nothing about MDADM; I know it is not a filesystem, so I guess I would be using ext4 or something on top.

What will be the best/safest/fastest solution?
 

MDADM handles RAID on Linux systems.
BTRFS and ext2/3/4 are Linux filesystems.
LVM provides drive pooling for Linux systems.
ZFSonLinux is a project that ports the ZFS filesystem to Linux.
ZFS is most up to date and most stable within Solaris. Less so in the open source alternatives (OI, Linux, etc.), but they are completely usable. I just wouldn't put them in enterprise environments.

In terms of the best? This depends. MDADM and ZFS are both pretty quick (faster than Windows). But ZFS has some features that exist for ext4 only partially or not at all. The most stable version of ZFS is still the one in Solaris, while MDADM is more stable than ZFSonLinux or the open source ZFS offshoots.

So the decision is going to come down to what you want.

Safety? Solaris or MDADM.
Speed? ZFS will be faster usually however you need to give it enough resources for this to be the case.
Best? Your call.
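For reference, the two setups being compared look roughly like this (a sketch with placeholder pool/array/device names, not a tested recipe for your hardware):

```shell
# ZFS: one command creates the RAID-Z2 pool and mounts the filesystem.
# "tank" and the /dev/sd* names are placeholders.
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# mdadm equivalent: build the RAID-6 array, then put a filesystem on top.
mdadm --create /dev/md0 --level=6 --raid-devices=4 \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/storage
```

The one-command pool creation is part of why ZFS folds volume management and the filesystem into one layer, while the mdadm route stacks separate tools.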
 
How can ZFS be unstable and less safe than MDADM?

Because the open source alternatives are usually behind the ZFS version found in Solaris. There are many bug fixes for things like dedup and more in the "paid version".

That and MDADM is old as dirt and it's been widely used and tested.

That's not to say that ZFS in its open source versions is unusable. I'm currently using ZFSonLinux, but it's just not enterprise ready in my eyes the way Solaris+ZFS and Linux+mdadm+lvm are.
 
Well, I'm not an enterprise user, this is for my home server, but I still want the best I can get.

The server will have 24 GB of RAM, so enough for ZFS to play with. I need Linux because I can't find any good virtualization software for Solaris, and the current config is ESXi + Windows + Solaris. I'm switching to Linux because I don't get the storage access speed I need in Windows; I can run these applications natively on Linux. Solaris benches my pool at 700-900 MB/s, and I would like to get 400/400 MB/s read/write with ZFS on Linux (is that reasonable?); 500 would be nice. I will format the pools as recommended by ZFS on Linux.

My trouble with leaving ZFS for MDADM:

* Pool scrubbing
* Snapshots
* Automatic defrag.
 

mdadm+ext+lvm do those things. The main reason for ZFS is if you want dedup and block-level checksumming. Those are the things that mdadm+ext+lvm barely do or don't do at all.
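To illustrate, the mdadm/LVM counterparts of scrubbing and snapshots look roughly like this (device and volume names are placeholders; commands need root and a real array to run):

```shell
# Kick off a consistency check ("scrub") of an md array;
# progress shows up in /proc/mdstat.
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat

# LVM snapshot of a logical volume (vg0/data are placeholder names).
lvcreate --size 10G --snapshot --name data_snap /dev/vg0/data

# The ZFS equivalents, for comparison:
zpool scrub tank
zfs snapshot tank/data@snap1
```

Note the difference in granularity: the md check verifies parity across the whole array, while a ZFS scrub verifies per-block checksums of the data actually stored.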
 
If my OS disk crashes, is there any easy way to recover my MDADM pool on a fresh install?

And will I get my expected speed?
 

The number of steps to evict or commit a failed or new drive is basically the same between ZFS and MDADM.

Speed should be unaffected or improve.
 
Consider also that with ZFS, since volume+RAID+fs are all one system, you don't have to worry about how to divide up free space, etc...
 
What should be considered?

And what chunk size should be used for 10x 2TB and 10x 3TB?

I've seen it put down to 4 from 64? Why such a drastic change?
 
If my OS disk crashes, is there any easy way to recover my MDADM pool on a fresh install?

Just boot any recent livecd/dvd/usb stick from any distro and you should have automatic access to your data.
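Concretely, reassembly after a fresh install is just a couple of commands for either stack (a sketch; "tank" and the mdadm.conf path are assumptions for a CentOS-style layout):

```shell
# Scan all disks for md superblocks and reassemble known arrays.
mdadm --assemble --scan
# Persist the result so the array comes up on boot.
mdadm --detail --scan >> /etc/mdadm.conf

# The ZFS equivalent on a fresh install: scan disks and import the pool.
zpool import tank
```

In both cases the array/pool metadata lives on the member disks themselves, which is why losing the OS disk is not fatal.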
 
ZFS is not most up to date only on Solaris.
For example, TRIM support has been in FreeBSD HEAD for almost a month now. Do Solaris or Illumos have it yet?
ZFS on Linux is not quite production ready, but getting close. On FreeBSD and Illumos it is, and has been for years now.
 

What someone commits to zfs in open source does not a new zpool version make. Production ready and enterprise grade are two different things ENTIRELY.
 
I don't find any good virtualization software for Solaris
What do you mean by this statement? I am running VirtualBox on my Solaris server and everything works very well for my small needs: I run Windows and occasionally Linux, and sometimes Apple OS X.

VirtualBox can run the widest range of OSes out there, bar none: OS/2, Haiku, FreeBSD, MS-DOS, Windows 95, etc. ESXi does not support that many OSes.

But maybe you have very special needs, which makes VirtualBox infeasible for you?
 
ZFS is not most up to date only on Solaris.
For example, TRIM support has been in FreeBSD HEAD for almost a month now. Do Solaris or Illumos have it yet?
I would not consider TRIM that valuable to the ZFS community. Sure, it is nice to have, but it is not a major contribution. ZFS has been run for many years on SSDs without TRIM. Actually, some ZFS developers on the OpenSolaris side have been working on TRIM for a while.

I would much prefer the FreeBSD developers do something like bp rewrite or something else major, which has been missing since day one. But that needs a thorough understanding of ZFS, which BSD developers do not have. I don't expect BSD developers to make a significant contribution to ZFS.

The creators of ZFS come from Sun and have left Oracle to work on the open source version. I expect the creators of ZFS to have a better understanding of ZFS than BSD developers.

But sure, if BSD developers can make a significant contribution back to the ZFS community, instead of only porting from the ZFS creators on Solaris, I would be glad to be proven wrong. That would only strengthen ZFS for all users, which makes everybody happy!

But to believe that BSD developers will become strong ZFS developers and make significant contributions is far fetched. Until now, they have only ported everything the ZFS developers have done. They have done nothing new. It is like believing that BTRFS developers will surpass ZFS: BTRFS is just a clone of ZFS and mimics everything ZFS does. Sure, if BTRFS came up with something new and major, I would be glad. Until now the BTRFS devs have only been copying from ZFS, and to believe BTRFS will surpass ZFS is far fetched. How can someone that leeches off ZFS surpass ZFS? They must prove that they are able to make something new and innovative themselves. Until now, the BSD devs and BTRFS devs have not done that, only copying/porting from the Solaris devs.
 
BSD just ports ZFS from Solaris; as far as I know they don't develop anything at all for ZFS, except maybe patches to make it work properly on FreeBSD.
 
That is not true. Pawel Jakub Dawidek is a true ZFS developer; he has a great understanding of how both ZFS and the operating system work. He has done a lot of work on storage solutions and he is very active on the Illumos ZFS mailing list. He knows the VFS (which ZFS uses) very well, something most of you don't even know what is. The FreeBSD ZFS implementation is not a toy, it's the real thing, aimed at the enterprise.

There is a ZFS test suite that you can run to test how reliable your ZFS system is.
 
We don't claim the BSD port of ZFS is a toy. We doubt whether BSD developers can contribute to ZFS. I have never seen any contributions to ZFS from the BSD side. Can you name something? RAID-Z3? bp rewrite? Rebalancing of vdevs?

Of course, we would love if BSD devs did contribute to ZFS. It would only make ZFS stronger.
 
They've contributed back minor bugfixes, improvements, and extra functionality. The most major recent thing is TRIM support. ZFS feature flags and LZ4 compression, however, are a collaboration between Illumos, FreeBSD and some other developers.

And keep in mind that FreeBSD offers by far the most flexibility for your storage, because of the GEOM framework, specifically encryption and high availability.
 

The safest solution is mdadm :) very mature and stable.

If you need it as a backup server, ZoL on CentOS 6.x...

I am using ZoL on CentOS 6.3, knowing that some ZFS features are unsupported or buggy in ZoL.

In my situation:
On my backup server, I love ZoL. I moved from OI ZFS to ZoL on CentOS 6.3; the performance is better than OI, and NFS never hangs on ZoL (I randomly had hangs on OI when copying many small files, where the write rate would drop almost to 0).
The cumbersome part is that you need to use Samba for Windows sharing, which takes time to configure at the beginning.

Or you can try another mature ZFS, on OpenSolaris (or Solaris 11), or BSD...

Good luck!!!
 
I think it matters what you consider safe. Silent data corruption on mdadm + ext4 can go unnoticed for a long time, long enough that the corrupted files make their way into backups and you lose every good copy. I worry about that more than anything. ZoL has been just fine for me... I don't use dedup or anything fancy, I just want checksummed data + snapshots + scrubbing.

I'd rather my ZoL file server blow up so that I have to start from scratch and restore data from backups, vs. mdadm/ext4 not catching my data being corrupted and me losing it.
 
I worry about that more than anything. ZoL has been just fine for me... I don't use dedup or anything fancy, I just want checksummed data + snapshots + scrubbing.

Do you have any data that was created before you started using ZFS that you copied into ZFS when you started using it?
 
Not sure why you're asking... prior to ZFS I used mdadm + btrfs: mdadm for RAID-6 and btrfs for checksums. With that setup it couldn't actually fix any errors, but at least it could identify them. I was a bit desperate. Eventually I gave up and switched to Solaris+ZFS. To move the data, it was copied over NFS, and some I think I restored from backups... but all my backup drives were btrfs, so I had to mount them on a Linux system and then copy the data over NFS to the new Solaris system. Then to go from Solaris to Linux with ZoL, all I did was export the array on Solaris, move the SAS card to Linux and have it import the array. And when I say move the SAS card... all of these were VMs on ESXi, so I would just change which VM got the SAS card with pass-through. Actually the file server has an onboard SAS controller and two add-in SAS controllers... so it was a lot of shuffling which drives were on which SAS controllers and which VM got which SAS controllers via pass-through.
 
Are you trying to say the source of the data was not checksummed? So what? I'm not saying things can't go wrong or are perfect... I'm just trying to avoid things getting worse unknowingly.
 

He's probably looking at the filesystem you chose. Btrfs is experimental and in no way is it enterprise ready. Btrfs + mdadm would be even worse.
 
Are you trying to say the source of the data was not checksummed? So what? I'm not saying things can't go wrong or are perfect... I'm just trying to avoid things getting worse unknowingly.

I am just curious about the data that you are so concerned about getting corrupted.

What type of data files are you most concerned about getting corrupted?
 
Well, the data being stored should be irrelevant. I could be storing randomly generated data that is totally worthless; that doesn't change how well the storage/backup system prevents data loss. I'm just arguing that ZoL + good backups is safer than mdadm + ext4 + good backups. I see the most likely source of data loss as silent data corruption that goes unnoticed and makes its way onto backups. Anything obvious you can easily restore from backups.

I only focused on this because the person I initially replied to said "the safest solution is mdadm, very mature and stable."

I don't think that is true. I've lost data using mdadm + reiserfs... backups would have let me restore it all, but I came very close to the scenario I describe, where data is corrupted, not noticed, and makes it to backups. In my case a memory DIMM started to fail; the system hung and was rebooted. On startup fsck ran and thought the filesystem had errors... I said yes to correct them and it kept finding more and more. In the end I figured out that one memory DIMM had failed and was causing fsck to think the filesystem was corrupted when it was not, and when it 'fixed' these errors it was just going through corrupting data. After replacing the DIMM the filesystem could still be mounted, but tons of files were fucked up. Had this DIMM failure not been so severe, it could have resulted in silent data corruption that made it into the backups unnoticed.

Anyways, this is just one reason why I think ZoL is better than ext4+mdadm. My file server also hosts VM images, and using ZFS snapshots is so much nicer than any alternative, e.g. compared against ext4+lvm+mdadm. Or just how nice it is that a quick scrub of the array only scrubs the data and not unused space, unlike with mdadm.

I guess the one advantage of mdadm is expanding arrays by one disk. That is the only thing I wish ZFS could do that mdadm can.

Anyways, to your question: I'm not 'that' concerned about anything getting corrupted, I just see no reason not to protect against it. Most of my file server is stuff I don't really care about, TV shows and movies... a small amount is VM images that I use... and then backups of 'my documents' and stuff that I care about, and probably the most important stuff is pictures/videos I've taken. I'm not the greatest photographer, but I've spent lots of time taking pictures and post-processing them. The final images can be smaller, but at some intermediate steps the space is a bit larger; the initial raw files from a 5D2 are like 20MB or so? I dunno. I'd be pretty disappointed if I lost this type of stuff. I keep backups of everything because why not, but the smaller amount of stuff I really care about I keep two backups of. It's pretty unlikely, I think, that I'll lose it... except for that one situation where silent data corruption ends up in the backups. It's so easy to guard against, just use a filesystem with checksums + scrubbing; I see no reason not to.
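The snapshot workflow for VM images mentioned above is only a few commands (a sketch; "tank/vms" and the snapshot names are placeholder examples, not my actual layout):

```shell
# Instant, space-efficient snapshot of the dataset holding VM images.
zfs snapshot tank/vms@before-upgrade
zfs list -t snapshot

# Roll back if something goes wrong...
zfs rollback tank/vms@before-upgrade
# ...or clone the snapshot to spin up a writable test copy.
zfs clone tank/vms@before-upgrade tank/vms-test
```

Because snapshots are copy-on-write, they cost almost nothing to take, which is what makes them so much nicer than LVM snapshots for this use case.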
 
ext4 does not support partitions over 16 TB natively, so I went with MDADM + XFS. (How is it with defrag? Does it do it automatically, or what?)

So now I don't know what hypervisor I want to use; my alternatives are Debian + Proxmox, Debian + oVirt, or Fedora + oVirt.

CentOS didn't work out as well as I hoped.
 

Sorry, the update for e2fsprogs is in kernel 3.7. I try not to be a distro-whore, but I just prefer Debian-based distros. Everything is just easier. KVM works pretty well. Passthrough is a PITA, but other than that it's pretty good.
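On the XFS defrag question above: XFS does not defragment automatically; the `xfs_fsr` tool reorganizes file extents on demand, typically from a cron job. A minimal sketch, assuming the filesystem sits on /dev/md0 and is mounted:

```shell
# Report fragmentation on the (read-only opened) XFS filesystem.
xfs_db -r -c frag /dev/md0

# Reorganize fragmented files; operates on mounted XFS filesystems.
xfs_fsr /dev/md0
```

In practice fragmentation is rarely a problem on XFS for large sequential media files, so many people never run it at all.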
 
He's probably looking at the filesystem you chose. Btrfs is experimental and in no way is it enterprise ready. Btrfs + mdadm would be even worse.

At work (medical imaging research) I am just about to begin slowly migrating my 40TB+ of data to that (btrfs on mdadm RAID-6), after about 1.5 to 2 years of testing with a few TB of throwaway data but a similar usage pattern to the real data. My first array with real data will be just 6 x 2TB drives in mdadm RAID-6 with btrfs as the filesystem. I actually put the drives in the server today; however, they must all pass a 4-pass badblocks read/write test before being put into use. With that said, I will back this data up, just like all other data in my RAID arrays, onto my 2-drive LTO2 autochanger.
 