ZFS Recommendations

I'm looking for as much FIRST-HAND information as possible from anyone who has tried ZFS. I would also like any advice on building a server tailored to this purpose.

I am planning to build an OpenSolaris server at home which will eventually fill a Norco 4020 (20 3.5" bays), similar to Ockie's Galaxy 6, only I need to start on a much smaller scale.

Data integrity and redundancy are also 100% imperative to me, so I plan on having all of my data protected by parity, as I understand a ZFS pool is capable of doing. Currently I have over 4TB of data stored in 3 RAID5 arrays using various RAID5 implementations, but I want a server that will scale to fill all 20 bays of a Norco 4020 over time, even though I'm not going to be buying all the drives at once, and still keep all the files organized in a single volume. If I understand correctly, a single ZFS volume can expand to incorporate drives as they are added and can even be configured to span a local network if necessary.

Here is a link to the parts I intend to use over the long run.

http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=9737231

My budget up front is very small (enough for a card and 2-3 1TB HDDs), and I would like to start out simply by installing OpenSolaris on one of my existing machines, then migrate the pool into the Norco and onto a new board in time.

Any advice and/or resources would be highly appreciated.
 
even though I'm not going to be buying all the drives at once, and still keep all the files organized in a single volume. If I understand correctly, a single ZFS volume can expand to incorporate drives as they are added and can even be configured to span a local network if necessary.
ZFS pools can't expand one disk at a time and have parity... yet. There's an enhancement in progress that will make this possible, but it's not likely to happen for a few months at least.

Instead, what you should plan to do is expand a few disks at a time. For example, I've started with six disks in raidz2 and will buy six more in a few months. I'll add them like this:
zpool add mypool raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0
You could plan to do five sets of four disks, or four sets of five disks, or three sets of six disks plus boot disks.
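To make that concrete, here's a rough sketch of starting small and growing later. Device names like c1t0d0 are placeholders; substitute whatever format/cfgadm actually shows on your box:
Code:
# initial pool: one 4-disk raidz vdev
zpool create mypool raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0
# months later, grow the same pool by adding a second raidz vdev
zpool add mypool raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
# the extra capacity shows up immediately in every filesystem on the pool
zpool list mypool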
Here is a link to the parts I intend to use over the long run.

http://secure.newegg.com/WishList/PublicWishDetail.aspx?WishListNumber=9737231
I'd consider waiting for a different disk with fewer firmware problems to be out.
My budget up front is very small (enough for a card and 2-3 1TB HDDs), and I would like to start out simply by installing OpenSolaris on one of my existing machines, then migrate the pool into the Norco and onto a new board in time.

Any advice and/or resources would be highly appreciated.
How about 4 7K1000.Bs for $350, and use an onboard controller until you need the extra space? Configure them in raidz, and add more vdevs as necessary.

NB: You must not make your pool out of non-redundant vdevs. "zpool create apool disk1 disk2" means the pool can be lost when a single disk dies, even if you set copies=2.
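To spell that warning out (placeholder device names):
Code:
# BAD: two plain striped vdevs, no redundancy; losing either disk loses the pool
zpool create apool c1t0d0 c1t1d0
# GOOD: one redundant raidz vdev; the pool survives a single disk failure
zpool create apool raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0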

I've played with ZFS a lot, so if you have questions or a proposed configuration let me know. 64-bit is vital, and ECC memory is next on the list.
 
64-bit is vital, and ECC memory is next on the list.

Can you expand on this statement a bit more? Why the necessity for a 64-bit processor? Or are you speaking in terms of 64-bit PCI-X? Is memory utilization extremely high with this type of setup? I was figuring on 2GB of RAM being more than enough for my purposes. Also, while I do understand ECC RAM being the end-all for data integrity, I am just a home user and have to cut costs somewhere... I will not be relying on this to 'back up' my important data, as I will be using cloud and offsite (parents' house, yay) DVD storage for the stuff that is truly vital to me. But I want to build the most robust system I can without spending thousands of dollars on server hardware.

Another concern I have is just how much CPU ZFS uses in this type of configuration. I currently have an AMD Sempron (Socket 754, 2GB DDR(2?) memory) that I was thinking would work for the task temporarily, but it's pretty sluggish even running Ubuntu Hardy Heron, so I really don't know.

Finally, is 2 drives at a time sufficient to expand the array? Even 4 at a time into a central pool is better than what I've been doing with RAID5 (making a new array each time), not to mention all the other ZFS benefits. I guess I just don't know enough about how ZFS handles redundancy to know the best way to go about this.

Another bit of info: the main day-to-day purpose of this machine will be as a media server on a local network. There will not be high bandwidth demands on it other than transferring files to and from the server from time to time. The biggest question I'll have once I get this thing running is what the best protocol available in OpenSolaris is to serve these files to all of my various devices (Windows laptops, XBMC on a 1st-gen Xbox (very versatile), and a PS3 for high-def content), but maybe that is a question for another thread and another day.
 
Can you expand on this statement a bit more? Why the necessity for a 64-bit processor? Or are you speaking in terms of 64-bit PCI-X?
64-bit processor; ZFS likes the extra address space. This means you can address more memory and more devices at the same time. 64-bit PCI would be nice, but it's hard to find cheap boards that have it. You might consider holding out for the AOC-SASLP-MV8 controller to get better speed; it uses the much more readily available PCI Express interface.
Is memory utilization extremely high with this type of setup? I was figuring on 2GB of RAM being more than enough for my purposes.
2GB will suffice. ZFS will use as much RAM as you have for caching, but 2GB is plenty for home use.
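(If you ever do want to keep the cache from eating all of the RAM, there's an /etc/system tunable for capping the ARC; the value below is just an illustration, in bytes:)
Code:
# /etc/system -- cap the ZFS ARC at roughly 1GB; takes effect after a reboot
set zfs:zfs_arc_max = 0x40000000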
Also, while I do understand ECC RAM being the end-all for data integrity, I am just a home user and have to cut costs somewhere... I will not be relying on this to 'back up' my important data, as I will be using cloud and offsite (parents' house, yay) DVD storage for the stuff that is truly vital to me. But I want to build the most robust system I can without spending thousands of dollars on server hardware.
Understood.
Another concern I have is just how much CPU ZFS uses in this type of configuration.
Not enough to be a concern, in my experience. I ran OpenSolaris on a dual-P3 system at 866 MHz and got 50 MB/s raidz over 8 drives with one CPU pegged. These days they've multithreaded more things, so you can get a cheap dual-core AMD and benefit from that. And modern CPUs have better IPC than P3s, so even a 2 GHz box can probably do a few hundred megabytes per second, way more than you need for watching movies.
I currently have an AMD Sempron (Socket 754, 2GB DDR(2?) memory) that I was thinking would work for the task temporarily, but it's pretty sluggish even running Ubuntu Hardy Heron, so I really don't know.
I think network access is plenty fast on my machine; working at the console can be a little laggy sometimes, but I rarely do that and ssh is plenty fast.
Finally, is 2 drives at a time sufficient to expand the array?
You could do mirrors with this setup, but you lose half the space to redundancy.
Even 4 at a time into a central pool is better than what I've been doing with RAID5 (making a new array each time), not to mention all the other ZFS benefits. I guess I just don't know enough about how ZFS handles redundancy to know the best way to go about this.
I'd do a little bit of reading on how ZFS pools are composed.
Another bit of info: the main day-to-day purpose of this machine will be as a media server on a local network. There will not be high bandwidth demands on it other than transferring files to and from the server from time to time. The biggest question I'll have once I get this thing running is what the best protocol available in OpenSolaris is to serve these files to all of my various devices (Windows laptops, XBMC on a 1st-gen Xbox (very versatile), and a PS3 for high-def content), but maybe that is a question for another thread and another day.
I'd suggest installing the CIFS server. This provides a Windows-compatible file share. For *nix machines, NFS is available. These services are configurable through the zfs tools:
zfs set sharenfs=on data/Movies
zfs set sharesmb=name=music data/Music
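One more piece for the CIFS side: the in-kernel SMB service has to be running before the sharesmb property does anything. I believe the commands are roughly these on a current OpenSolaris build, but check the docs for your release:
Code:
# enable the kernel CIFS/SMB server (and its dependencies)
svcadm enable -r smb/server
# confirm the ZFS-managed shares are actually being published
sharemgr show -vp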

If you haven't done so already, I'd suggest reading the man pages for zfs and zpool in their entirety. Kinda dry, I guess, but they cover the territory pretty well. Not a bad use of half an hour.
 
I'd read all about fsck firsthand in regards to ZFS reboots before configuration.
............
Per the posts below: maybe I misread "ZFS slowdowns temporarily" as fsck's.
............
 
Unhappy_mage:

Thanks so much for all the great info. Any links you've found helpful would be greatly appreciated as well for my future reference. Maybe there's a good ZFS book somewhere? :D

I'd read all about fsck firsthand in regards to ZFS reboots before configuration.

And what did your reading lead you to believe? Do you have any specific resources I could check into? Downtime is an issue for me (when I want to watch TV, it ought to be running...), but a long filesystem check on reboot is not necessarily a problem, and in fact I would prefer it to having corrupted data once I'm up and running.
 
I would drop the Supermicro card as well and spend the savings on a slightly better motherboard. Here is my ZFS setup:
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131134 - Asus M2N-LR MB w/PCI-X
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231122 - Gskill 2 x 2GB Memory
http://www.newegg.com/Product/Product.aspx?Item=N82E16819103298 - Athlon X2 5050e
http://www.newegg.com/Product/Product.aspx?Item=N82E16822152102 - 5 x Samsung F1 1TB
http://www.newegg.com/Product/Product.aspx?Item=N82E16817104014 - FSP 600Watt P/S
Put it all in an Antec 900 case.

Planning on going up to a Norco case at some point but I've still got over 1TB left after RAIDZ so I'll probably wait until 2TB drives come out and build a second array at that point.

Performance is excellent. I went 64-bit as well; it's strongly recommended in every whitepaper, doc, and forum post about Solaris and ZFS. Solaris and ZFS are both optimized for 64-bit, and if you look around you can actually find people complaining about throughput performance on 32-bit platforms. For a mere $62 processor, going 64-bit was a no-brainer. I also opted for 4GB of memory as I run VirtualBox on this machine for a Server 2003 VM to cover my Windows requirements.

I run CIFS and have automated snapshots turned on (possibly the coolest part of ZFS). I can max out the gigabit link to the server when hitting it with transfers from two or more workstations at once (the drive speed on one machine isn't fast enough to do more than roughly 60MB/s average).

You really need to go for a larger number of drives at once if you are doing RAIDZ. Two drives at a time really won't do it, as you are essentially adding another RAID set under the pool, which means you will lose more space to parity. I'd go for more, smaller drives up front to meet your requirements, and you can always carve up a second pool later on when 2TB+ drives are out.

ZFS does online disk scrubbing; you just have to schedule it. It's strongly recommended for data integrity.

All in all, I'm very pleased with ZFS performance and reliability. The ease of setup is ridiculous (it was more of a pain to set up CIFS initially, as many of the guides on the net show how to set up CIFS on an older version of Solaris, which was completely different).
 
I'd read all about fsck firsthand in regards to ZFS reboots before configuration.
There is no filesystem checker for ZFS, only a "scrubber". That is, you don't have to run fsck which checks for inconsistent structures, but rather "zpool scrub" which checks for incorrect checksums and fixes them from parity if possible.
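For example, with a placeholder pool name:
Code:
# kick off an online check of every block against its checksum
zpool scrub mypool
# watch progress and see any errors that were found and repaired
zpool status -v mypool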
Unhappy_mage:

Thanks so much for all the great info. Any links you've found helpful would be greatly appreciated as well for my future reference. Maybe there's a good ZFS book somewhere? :D
zpool man page
zfs man page
ZFS Best Practices Guide

I run CIFS and have automated snapshots turned on (possibly the coolest part of ZFS).
Howto and demo here
I'd go for more, smaller drives up front to meet your requirements, and you can always carve up a second pool later on when 2TB+ drives are out.
You don't need a second pool, just a second vdev in the same pool. This lets you have all your storage in one place; the pool just gets bigger.
ZFS does online disk scrubbing; you just have to schedule it. It's strongly recommended for data integrity.
Agreed. Add something to root's crontab like this:
20 2 * * 0 /usr/sbin/zpool scrub mypool
and once a week at 2:20 AM it'll get automatically checked for errors.
 
You don't need a second pool, just a second vdev in the same pool. This lets you have all your storage in one place; the pool just gets bigger.

And THAT, right there, is my single favorite ZFS feature. Everything else is just icing on the cake to me. Looks like I've got reading to do, folks! Thanks for all the great resources. :D
 
My OpenSolaris/ZFS setup is a Gigabyte P45 board, an E5200 CPU, and 4 GB of RAM with 4x Samsung 1TB drives. You can save money by going with a motherboard that has enough SATA ports natively, which is why I chose the one I did. You'll want to check the Sun HCL first, though; popular motherboards will usually be on there if someone has tried them out. I've been playing with ZFS for a couple of years on Solaris Express. You want a 64-bit CPU (32-bit does work, but not nearly as well performance-wise), 4 GB of RAM is my suggestion (it runs OK on 2 GB, but again better with 4 GB), and you don't want to cut corners on hardware if going with a RAIDZ setup. The only thing I'd give a traditional hardware RAID5 setup over ZFS is that those cards usually have a battery backup for glitches that can wipe out data. Quality hardware with a UPS is a must-have IMO. Other than that, ZFS is the only filesystem I felt confident enough in to replace my aging SCSI RAID5 setup with :)
 
The only thing I'd give a traditional hardware RAID5 setup over ZFS is that those cards usually have a battery backup for glitches that can wipe out data.

The reason you make this statement, I assume, is because of the way ZFS caches its writes in order to checksum them and ensure data integrity, as well as writing in 'batches' so that the drives are not constantly seeking. Correct?

A good UPS is on my list. I've found a couple of server-grade ones available on Craigslist for under $300 right now, and I also have my eye on this one:

http://www.amazon.com/Tripp-Lite-SM...active/dp/B000DZRY9C/ref=reg_hu-wl_item-added

which seems affordable and ought to offer at least enough time to shut down the system in the event of power failure...
 
Can you expand on this statement a bit more? Why the necessity for a 64-bit processor?

The only reason people say this is that OpenSolaris and ZFS are not as well tested or developed on low-end systems like 32-bit machines with low memory. Sun develops and tests them on more powerful systems, and doesn't particularly care about your 2GB-memory junker server.
 
The only reason people say this is that OpenSolaris and ZFS are not as well tested or developed on low-end systems like 32-bit machines with low memory. Sun develops and tests them on more powerful systems, and doesn't particularly care about your 2GB-memory junker server.

Memory is cheap enough these days that this won't be an issue. I actually think that for this build I may end up migrating my existing 2.6GHz Core 2 Duo into a new mainboard and putting a Core 2 Quad into my gaming rig, so 64-bit should be a non-issue.
 
Finally, is 2 drives at a time sufficient to expand the array?

Okay, think about it like this: ZFS can't expand its arrays. All you can do is add more new arrays to a pool.

So you can have a pool with a 4-disk array, then later add another 4-disk array, and then later a 1-disk array, and all of these show up as one filesystem. But they are separate arrays; you can't add 1 disk to the 4-disk array.
 
ZFS can't expand its arrays.

This was the assumption I had wrong. I stand corrected and see exactly where this is going. It doesn't fit my optimum scenario, but it makes sense and still works better than any other option I have so far, not to mention all of the other pluses ZFS offers in terms of data integrity.
 
Assuming raidz, you don't add a disk to a raidz array, you add another raidz array. It's analogous to RAID5 becoming RAID50, a stripe of two RAID5 arrays.
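In command terms, the 'RAID50-like' result is just a pool with two raidz vdevs; device names here are placeholders:
Code:
# a pool striped across two raidz vdevs from day one
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
# or reach the same layout later with: zpool add tank raidz <four more disks>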
 
The reason you make this statement, I assume, is because of the way ZFS caches its writes in order to checksum them and ensure data integrity, as well as writing in 'batches' so that the drives are not constantly seeking. Correct?

A good UPS is on my list. I've found a couple of server-grade ones available on Craigslist for under $300 right now, and I also have my eye on this one:

http://www.amazon.com/Tripp-Lite-SM...active/dp/B000DZRY9C/ref=reg_hu-wl_item-added

which seems affordable and ought to offer at least enough time to shut down the system in the event of power failure...

Correct on the write caching; good hardware and a good UPS will help reduce the risk of losses of that kind, though.

Also, on RAIDZ expansion: there should be a solution to that sometime in the future. I know Sun is not very interested in adding the feature, but there has been some talk from a few people on the OpenSolaris project about how to go about it. I believe the basic concept has been worked out. I simply decided to build my RAIDZ array twice as large as I needed; hopefully true expansion will be out before it's full :)
 
Correct on the write caching; good hardware and a good UPS will help reduce the risk of losses of that kind, though.

Also, on RAIDZ expansion: there should be a solution to that sometime in the future. I know Sun is not very interested in adding the feature, but there has been some talk from a few people on the OpenSolaris project about how to go about it. I believe the basic concept has been worked out. I simply decided to build my RAIDZ array twice as large as I needed; hopefully true expansion will be out before it's full :)

Truth be told, I really was just mistaken about how it worked; all the hype I've been hearing went to my head, I guess. Still, I am very excited to get this up and running. Does anyone know the MOST drives you can configure in a single raidz and still only lose the equivalent of one drive's space to 'parity'? Can I do 6? Or 8?
 
Truth be told, I really was just mistaken about how it worked; all the hype I've been hearing went to my head, I guess. Still, I am very excited to get this up and running. Does anyone know the MOST drives you can configure in a single raidz and still only lose the equivalent of one drive's space to 'parity'? Can I do 6? Or 8?

You can do as many as you want (well, maybe 255 or something is the limit), but you'll want to keep it to single digits for performance reasons. A raidz stripe of N disks can only process as many random I/Os as a single disk, because ZFS writes each block across all the disks, so to check the parity you need to read all of them. Thus, if you have (say) 12 disks, you'll get better performance out of two 6-disk stripes than one 12-disk stripe. Sequential transfers are not affected by this, only random accesses.

Also, consider that if you have very wide stripes (say, 8 disks) you might be better off dedicating two disks per stripe to parity (i.e., use raidz2). Robin Harris isn't my favorite source, but his article "Why RAID 5 stops working in 2009" is worth a read. Basically, with a large stripe (8 1TB disks, say) the total volume of data is so big that if one disk goes, there's a pretty good chance you won't be able to read a particular sector from the remaining disks, and with RAID5 this means a failure. ZFS is a little smarter than the RAID5 he discusses: one missing sector should result in one corrupt file, or nothing wrong if it's metadata (there are multiple copies of that). But the point stands; if your data is important, and you have a lot of it, consider double parity.
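As a sketch of the two layouts being compared (placeholder device names):
Code:
# one wide single-parity stripe: 8 disks, ~7 disks of usable space, survives 1 failure
zpool create tank raidz  c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0
# same 8 disks with double parity: ~6 disks of usable space, survives any 2 failures
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0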
 
As far as I know there's no real limit on disks in a vdev or a zpool. It's also probably not recommended to go over 6 or 8 drives in a raidz due to the possibility of losing 2+ disks at the same time or during the resilvering (rebuild) process. Check out the ZFS Best Practices Guide; it has some info on pool setups and sizes. As previously mentioned, you grow zpools by replacing disks 1:1 with larger replacements, adding an additional vdev to the zpool, or creating another zpool. If you're planning on eventually filling 20 bays and only starting with a handful of disks, you're going to have to put a decent amount of thought into what you want the final setup to look like and plan accordingly.

For the UPS, make sure you get something that works with Solaris/OpenSolaris. APC is one of the best manufacturers and seems to be well supported.

Definitely consider the WD Green drives; they run cool and quiet and are more than fast enough for a media server. The power savings add up once you get to 20 drives; they would probably save you around $100 a year in electricity if the server is always on.

Whatever motherboard you choose, make sure you can realistically run 20 drives from it. You'll need something with 2x PCIe or 2x PCI-X.
 
Let me hijack this thread for a moment and ask: can you use drives of different sizes with ZFS and raidz without losing space as you would with, say, RAID5? And all being part of a single pool?

I use WHS now, mainly because I can use all of the drives I have of varying sizes and it's easy to set up and maintain. I use a CM Stacker case with room for a dozen drives. I currently have 10 drives on an old Asus P4C800-E board with a 2.4GHz P4 Northwood and 2GB of DDR 400MHz. It works well, but I am concerned about data corruption.
For drives I have 2x250GB, 2x500GB, 4x750GB, and 2x1TB.
I do have a spare Socket 775 micro-ATX board with 4 SATA ports. I also have 2 Promise TX4 PCI controller cards with 4 SATA ports each. I would need to buy a CPU for it; I do have 2GB of DDR2 RAM for it.

Since you can't add more disks to an array but rather more arrays to the pool, how does the redundancy happen? Does each array have its own parity disk? Or does it use a disk in the pool for all of the arrays? What if you just want to add a single disk to the pool; will it have its own redundancy?

Sorry for the noobish questions; I'm just trying to wrap my brain around this ZFS and raidz thing, and whether it's right for me. I do like what I've read about ZFS so far.
I have been looking at FreeNAS 0.7, which will have ZFS and raidz, but it's taking so long I'm beginning to wonder if it will ever see the light of day.
 
[...] ZFS is a little smarter than the RAID5 he discusses: one missing sector should result in one corrupt file, or nothing wrong if it's metadata (there are multiple copies of that). But the point stands; if your data is important, and you have a lot of it, consider double parity.

Awesome point, and well taken. This is something I will make sure to take into account. If I am able to build an 8-drive raidz2 initially, it really isn't any worse space-wise than building two 4-drive raidz arrays, only with the added benefit of protecting against any two of the drives failing, instead of one in each 4-drive array. Now the question is whether the cost/risk trade-off is worth it with single parity and 6 drives <hmmm...>
 
[...]
Since you can't add more disks to an array but rather more arrays to the pool, how does the redundancy happen? Does each array have its own parity disk? Or does it use a disk in the pool for all of the arrays? What if you just want to add a single disk to the pool; will it have its own redundancy? [...]

Now, I'm no expert yet, but this should answer at least that part of your question:

No, you can only have redundancy within a mirror or a raidz (known as virtual devices, or vdevs, within a pool). Pools themselves do not manage redundancy. It is my understanding that this handling of parity (essentially just like any old RAID) is far superior, if less convenient, than what WHS currently offers.

Question: how much data is redundant in your current WHS implementation? And how much space have you dedicated to this redundancy?
 
Question: how much data is redundant in your current WHS implementation? And how much space have you dedicated to this redundancy?

I have redundancy turned on for every share, so the entire thing is mirrored. Something like 2.2TB of actual data and about 6TB of total space with all my drives.
And of course the backups it's making of the other computers on my network are also taking up space, but that's not more than 200GB.
 
I have redundancy turned on for every share, so the entire thing is mirrored. Something like 2.2TB of actual data and about 6TB of total space with all my drives.
And of course the backups it's making of the other computers on my network are also taking up space, but that's not more than 200GB.

Your drives add up to roughly 6500GB of total space; you are saying you are only able to get 2.2TB usable out of that with WHS? If it's all mirrored, then the max space you should have available is 3250GB. If that's what you're seeing, you will be much better off with any implementation of raidz or mirroring under ZFS. You would create a mirrored pair from each of your matched drive pairs and a raidz with your 750GB drives, which would give you:
Code:
Your virtual devices (vdevs) would be:
2x250GB mirror  =  250GB
2x500GB mirror  =  500GB
2x1000GB mirror = 1000GB
4x750GB raidz   = 2250GB
Total usable storage using a single ZFS pool containing all vdevs = 4000GB

Total increase over your current config: roughly 750GB of extra space (a little less due to ZFS metadata), along with the benefits of ZFS data integrity.
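Roughly, the pool creation for that layout would look like this; the cXtYdZ names are placeholders you'd map to your actual disks first:
Code:
# one pool, four vdevs: three mirrors plus a raidz
zpool create tank mirror c1t0d0 c1t1d0                        # 2x250GB
zpool add    tank mirror c1t2d0 c1t3d0                        # 2x500GB
zpool add    tank mirror c1t4d0 c1t5d0                        # 2x1TB
zpool add    tank raidz  c2t0d0 c2t1d0 c2t2d0 c2t3d0          # 4x750GB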

Be sure to check the OpenSolaris hardware compatibility list before even thinking about migrating. Note that you won't be using any RAID functionality of your RAID cards or devices with ZFS; they will have to be able to run in pure SATA mode.

[...]
Definitely consider the WD Green drives; they run cool and quiet and are more than fast enough for a media server. The power savings add up once you get to 20 drives; they would probably save you around $100 a year in electricity if the server is always on.

Whatever motherboard you choose, make sure you can realistically run 20 drives from it. You'll need something with 2x PCIe or 2x PCI-X.
I was thinking about the WD Blacks because of the 5-year warranty, but low power usage is a goal of mine, so I might end up taking the Green side of things if I don't buy the 1.5TB Seagates I've been eyeing.

When you say 'realistically run 20 drives from it', what do you mean? Specifically, what would cause the PCI bus or another component to be a bottleneck? There are others on this board (namely Ockie, if I understand his setup properly) using the regular 33MHz PCI bus with dual PCI-X cards to support this many drives. It is true that data throughput is limited by this slow bus, but if the end result is a transfer out over gigabit Ethernet (which is slightly slower than the max the 33MHz PCI bus can handle), then what is the difference to the end user?
 
Your drives add up to roughly 6500GB of total space; you are saying you are only able to get 2.2TB usable out of that with WHS? If it's all mirrored, then the max space you should have available is 3250GB.

No, I'm saying that I have 2.2TB of actual data. I have 1.3TB or so of current free space.

So with the 2.2TB of duplicated data I am using roughly 4.4TB.
With 1.3TB free, that's a total of approximately 5.7TB of space. Of course, the backups of my other computers are not figured into that.
I don't have the server turned on at the moment, so I'm just guessing at the numbers.

Note that you won't be using any RAID functionality of your RAID cards or devices with ZFS; they will have to be able to run in pure SATA mode.

I don't have any RAID cards. The Promise cards I have are add-on cards only, just used to give me more SATA ports. But I will definitely check out the compatibility list, though I'd really rather just use the hardware I have so I don't have to spend any money.


I wonder how you would remove a drive from the array and pool without losing the data? In WHS, I just select the drive I want to pull out and hit remove, and it transfers all the data off to the other drives.

If I have the two 250GB drives mirrored and one of them should die, I'm not going to get another 250GB drive to replace it. I'd rather just pull both drives out (assuming I have enough free space).
 
No, I'm saying that I have 2.2TB of actual data. I have 1.3TB or so of current free space.

So with the 2.2TB of duplicated data I am using roughly 4.4TB.
With 1.3TB free, that's a total of approximately 5.7TB of space. Of course, the backups of my other computers are not figured into that.
I don't have the server turned on at the moment, so I'm just guessing at the numbers.

2.2TB + 1.3TB =~ 3.5TB of usable space is bigger than a pure mirror setup by the numbers I listed there, so it must be doing something different. I am not knowledgeable enough about either to debate the merits of raidz vs. WHS, and I think a different thread would be better suited for that, since ZFS is a complex enough topic as it is. I really am mostly interested in the checksum capabilities and the pool expansion with ZFS, so those, along with the required hardware, are what I'm trying to take away from this thread.

To your question on removal, here is the quote from the zpool man page:
Code:
zpool remove pool device ...

"Removes the specified device from the pool. This command currently only supports removing hot spares and cache devices. Devices that are part of a mirrored configuration can be removed using the “zpool detach” command. Non-redundant and raidz devices cannot be removed from a pool."

You would have to manually move the data off of the working 250GB mirror, then remove the vdev from your zpool (somehow that just sounds dirty when read out loud).
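For the mirror case specifically, dropping one side is a single command (placeholder names; note the follow-up later in the thread about removing whole vdevs):
Code:
# detach one disk from a two-way mirror; the remaining disk keeps serving data
zpool detach tank c1t1d0
# later, attach a replacement of equal or larger size to re-mirror
zpool attach tank c1t0d0 c1t5d0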
 
2.2TB + 1.3TB =~ 3.5TB of usable space is bigger than a pure mirror setup by the numbers I listed there, so it must be doing something different. I am not knowledgeable enough about either to debate the merits of raidz vs. WHS, and I think a different thread would be better suited for that, since ZFS is a complex enough topic as it is. I really am mostly interested in the checksum capabilities and the pool expansion with ZFS, so those, along with the required hardware, are what I'm trying to take away from this thread.

While the 1.3TB is free space, the reported amount hasn't been halved to take share duplication into account.
And of course one must take into account the actual formatted size of the drives; 1TB drives format to roughly 900GB.

If I mirrored two 500GB drives in other server software, you would have a reported amount of 500GB.

In WHS you turn on duplication on a share-by-share basis. It reports the total amount of free space. So if I have 1.3TB of free space and I copy over 300GB of data, it will actually take up 600GB as it's duplicated, and I will end up with 700GB of free space left.
WHS isn't a true mirror in the sense of RAID 1. You just turn on duplication for whatever shares you want, and anything placed into that share will be copied onto two drives.
 
[...]
In WHS you turn on duplication on a share-by-share basis. It reports the total amount of free space. [...]

Okay, and from the zfs man page, the equivalent feature that corresponds to the way WHS manages the whole thing would be:

Code:
copies=1 | 2 | 3

"Controls the number of copies of data stored for this dataset. These copies are in addition to any redundancy provided by the pool, for example, mirroring or raid-z. The copies are stored on different disks, if possible. The space used by multiple copies is charged to the associated file and dataset, changing the “used” property and counting against quotas and reservations.

Changing this property only affects newly-written data. Therefore, set this property at file system creation time by using the “-o copies=” option."
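As a quick sketch of using it (hypothetical pool/dataset names):
Code:
# keep two copies of every block in this dataset, spread across different disks when possible
zfs create -o copies=2 tank/important
# or turn it on for an existing dataset (only affects newly written data)
zfs set copies=2 tank/important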
 
So it could work the same. One could just go with ZFS, all drives pooled together no matter the size, and enable copies at creation, so you'll always have two copies of everything on different disks, protecting against drive failure.

And still get the benefits of ZFS too. I wonder, if one could just use these copies, why would anyone use RAID 1? It seems you would get the same benefits with less to manage.

Also, I wonder if it would make more sense to wait for FreeNAS 0.7, which includes ZFS and raidz, since that product is aimed at exactly what we are trying to do.
 
I was thinking about the WD Blacks because of the 5-year warranty, but low power usage is a goal of mine, so I might end up taking the Green side of things if I don't buy the 1.5TB Seagates I've been eyeing.

When you say 'realistically run 20 drives from it', what do you mean? Specifically, what would cause the PCI bus or another component to be a bottleneck? There are others on this board (namely Ockie, if I understand his setup properly) using the regular 33MHz PCI bus with dual PCI-X cards to support this many drives. It is true that data throughput is limited by this slow bus, but if the end result is a transfer out over gigabit Ethernet (which is slightly slower than the max the 33MHz PCI bus can handle), then what is the difference to the end user?

Well, it depends on what electricity costs are in your area, along with the heat and noise considerations. The 1TB Green drives idle around 2.8W, the Blacks around 6.5W. It depends how much the extra 2-year warranty means to you. If you're only serving a few users, then you can get away with PCI-X cards on the PCI bus, but if I were building now I would go with two of the PCIe cards just in case. At worst, it probably speeds up scrubs and resilvers. The miniSAS connectors on those cards also keep things neater in the case, if you're into that.
 
So it could work the same. One could just go with ZFS, all drives pooled together no matter the size, and enable copies at creation, so you'll always have two copies of everything on different disks, protecting against drive failure.
It doesn't work that way, unfortunately. If a drive actually fails, the whole pool will be failed, and will not be usable. It'd be nice if it worked that way, but it doesn't.
And still get the benefits of ZFS too. I wonder, if one could just use these copies, why would anyone use RAID 1? It seems you would get the same benefits with less to manage.
Yes, it would be nice, but without some method of tracking what disks are part of a pool it's not possible.

Your best bet is to have similar-sized disks. "For drives I have 2x250GB, 2x500GB, 4x750GB, and 2x1TB." Maybe you can sell the smaller drives and end up with 4x750GB and 4x1TB. Then you can set up two raidz vdevs, for a total of 2.25+3 = 5.25 TB. Later you can add another vdev composed of larger disks, or replace the 750s with larger disks and get a bigger pool.
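The replace-with-larger-disks route is one command per disk (placeholder names again); once every disk in the vdev has been swapped and resilvered, the vdev can grow to the new size:
Code:
# swap a 750GB disk for a larger replacement; ZFS resilvers onto the new device
zpool replace tank c2t0d0 c3t0d0
# repeat for the rest of that vdev, letting each resilver finish before the next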
2.2TB + 1.3TB =~ 3.5TB of usable space is bigger than a pure mirror setup by the numbers I listed there, so it must be doing something different.
It's probably not accounting for the fact that data written to the mirrored part of the WHS share will take up twice as much space as the actual data.
To your question on removal, here is the quote from the zpool man page:
Code:
zpool remove pool device ...

"Removes the specified device from the pool. This command currently only supports removing hot spares and cache devices. Devices that are part of a mirrored configuration can be removed using the “zpool detach” command. Non-redundant and raidz devices cannot be removed from a pool."

You would have to manually move the data off of the working 250GB mirror, then remove the vdev from your zpool (somehow that just sounds dirty when read out loud).
Read again: you can remove one device from a mirrored vdev, but then the remaining vdev is no longer redundant, and cannot itself be removed. Removing entire vdevs is in the works, but it's not a high priority and may be a few years out.
 
This is very interesting reading. From what I understand, ZFS's main focus is on data integrity without regard to the actual hardware used. WHS, on the other hand, is designed for ease of use, and there's nothing quite like Drive Extender available for the home user.

Does ZFS offer any kind of deduplication, such as Single Instance Storage? (I wish WHS would offer it for more than just PC backups.)
 
This is very interesting reading. From what I understand, ZFS's main focus is on data integrity without regard to the actual hardware used. WHS, on the other hand, is designed for ease of use, and there's nothing quite like Drive Extender available for the home user.
Yeah, ZFS isn't nearly as simple as WHS to set up, but in my opinion it provides sufficient advantage over WHS for advanced users that it's worth the trouble.
Does ZFS offer any kind of deduplication, such as Single Instance Storage? (I wish WHS would offer it for more than just PC backups.)
No. Dedup has been implemented by some ZFS-based products (green bytes, for example), but it's not in vanilla ZFS. It may eventually be implemented based on customer demand; bug 6677093 tracks this issue.

However, if you use snapshots, you can create the same kind of shared-common-blocks structure: create some files, take a snapshot, modify things, take another snapshot, modify some more things, and you can read the data of any of the snapshotted versions while using only as much storage space as is needed to store the total unique data.
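A tiny sketch of that workflow, with a hypothetical dataset mounted at /tank/docs:
Code:
# snapshots are read-only and share unchanged blocks with the live filesystem
zfs snapshot tank/docs@monday
# ...edit some files during the week...
zfs snapshot tank/docs@friday
# list the snapshots, then browse old versions under the hidden .zfs directory
zfs list -t snapshot
ls /tank/docs/.zfs/snapshot/monday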
 
Do any of you have any experience with or knowledge about this 'hybrid' storage setup that ZFS supports? Supposedly it caches some of its writes to an SSD first, then migrates them into the storage pool. Very, very interesting stuff. More here:
http://blogs.sun.com/ahl/entry/hybrid_storage_pools_in_cacm

What are the minimum hardware requirements for something like this? Could I buy a single SSD for the OS and use some of its space for this feature?

<BTW, I'm getting closer to pulling the trigger on some hardware. That ASUS board with 2 PCI-X slots has a combo deal with 4GB of ECC RAM now, for 200 bones total :-D >
 
Has anyone tried ZFS on FreeBSD? How does it compare to OpenSolaris?

My setup is a P5BV-SAS (8 SAS + IIRC SATA onboard) running FreeBSD, but I have been curious about OpenSolaris.

My ZFS pools are:

8 * 1.5TB raidz2
2 * 73G raptor mirror, * 5
2 * 750G mirror (boot)

E8400, 4GB RAM. So far, no issues...
 
I would drop the Supermicro card as well and spend the savings on a slightly better motherboard. Here is my ZFS setup:
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131134 - Asus M2N-LR MB w/PCI-X
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231122 - Gskill 2 x 2GB Memory
http://www.newegg.com/Product/Product.aspx?Item=N82E16819103298 - Athlon X2 5050e
http://www.newegg.com/Product/Product.aspx?Item=N82E16822152102 - 5 x Samsung F1 1TB
http://www.newegg.com/Product/Product.aspx?Item=N82E16817104014 - FSP 600Watt P/S
Put it all in an Antec 900 case.

Planning on going up to a Norco case at some point but I've still got over 1TB left after RAIDZ so I'll probably wait until 2TB drives come out and build a second array at that point.

Performance is excellent. I went 64-bit as well; it's strongly recommended in every whitepaper, doc, and forum post about Solaris and ZFS. Solaris and ZFS are both optimized for 64-bit, and if you look around you can actually find people complaining about throughput performance on 32-bit platforms. For a mere $62 processor, going 64-bit was a no-brainer. I also opted for 4GB of memory as I run VirtualBox on this machine for a Server 2003 VM to cover my Windows requirements.

I run CIFS and have automated snapshots turned on (possibly the coolest part of ZFS). I can max out the gigabit link to the server when hitting it with transfers from two or more workstations at once (the drive speed on one machine isn't fast enough to do more than roughly 60MB/s average).

You really need to go for a larger number of drives at once if you are doing RAIDZ. Two drives at a time really won't do it, as you are essentially adding another RAID set under the pool, which means you will lose more space to parity. I'd go for more, smaller drives up front to meet your requirements, and you can always carve up a second pool later on when 2TB+ drives are out.

ZFS does online disk scrubbing; you just have to schedule it. It's strongly recommended for data integrity.

All in all, I'm very pleased with ZFS performance and reliability. The ease of setup is ridiculous (it was more of a pain to set up CIFS initially, as many of the guides on the net show how to set up CIFS on an older version of Solaris, which was completely different).

I'm curious: the specs on that board say it uses Opteron CPUs, but the CPU you have chosen isn't an Opteron. That's not a typo?
I may just grab that and get some ECC RAM.

1TB WD Green drives are on sale too.
 
Has anyone tried ZFS on FreeBSD? How does it compare to OpenSolaris?

My setup is a P5BV-SAS (8 SAS + IIRC SATA onboard) running FreeBSD, but I have been curious about OpenSolaris.

My ZFS pools are:

8 * 1.5TB raidz2
2 * 73G raptor mirror, * 5
2 * 750G mirror (boot)

E8400, 4GB RAM. So far, no issues...

I tested out ZFS on FreeBSD before deciding on OpenSolaris, and I never had any issues despite it being considered an "experimental" feature. It worked fine for me in testing, but last I checked they were still lacking many of the features of a full implementation. Of course, I tested it in the early stages of 7.0; it might be up to speed now. I decided to go with OpenSolaris because it had the full-fledged implementation, and there were some updates Sun had made to ZFS at the time that I could never figure out whether FreeBSD had implemented or not.
 
Do any of you have any experience with or knowledge about this 'hybrid' storage setup that ZFS supports? Supposedly it caches some of its writes to an SSD first, then migrates them into the storage pool. Very, very interesting stuff. More here:
http://blogs.sun.com/ahl/entry/hybrid_storage_pools_in_cacm

What are the minimum hardware requirements for something like this? Could I buy a single SSD for the OS and use some of its space for this feature?
You can, sure, but I'd recommend trying it out for a little while before committing to it. You can't remove log devices yet, and the pool won't survive with the log device missing. So test and decide if the performance benefits are worth the risk. Later, when log devices are removable and SSDs are cheaper, you can always add a log device.
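For reference, attaching the SSD pieces is just another zpool add; the devices below are placeholders (e.g., two slices of one SSD), with the same caveat as above about log devices:
Code:
# L2ARC read cache; cache devices can be removed later if you change your mind
zpool add mypool cache c3t0d0s0
# separate intent log (slog); NOT removable on current builds, so think twice
zpool add mypool log c3t0d0s1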
<BTW, I'm getting closer to pulling the trigger on some hardware. That ASUS board with 2 PCI-X slots has a combo deal with 4GB of ECC RAM now, for 200 bones total :-D >
Wow, very nice. That's quite a bit cheaper than my solution.
 