How much RAM for ZFS dedup:

you need a lot.

there aren't many use cases where dedup is beneficial vs just enabling lzjb system-wide. even in the cases where dedup is useful (VDI-type stuff) you may be better off just using snapshot clones, which is sort of a manual dedup but without the potentially massive memory overhead.
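
To put a rough number on "a lot": the dedup table (DDT) has to stay resident in RAM (or L2ARC) to keep writes usable, and it grows with the number of unique blocks. A minimal back-of-envelope sketch, assuming the commonly quoted ballpark of ~320 bytes of in-core footprint per DDT entry (the real figure varies by platform and pool):

```python
# Back-of-envelope DDT sizing. Assumptions (ballpark, not exact for any pool):
# ~320 bytes of in-core footprint per DDT entry, one entry per unique block.
BYTES_PER_DDT_ENTRY = 320

def ddt_ram_gib(unique_data_tib, avg_block_bytes):
    """RAM (GiB) needed to keep the whole DDT resident."""
    entries = unique_data_tib * 1024**4 / avg_block_bytes
    return entries * BYTES_PER_DDT_ENTRY / 1024**3

for tib in (1, 10, 50):
    print(f"{tib:>3} TiB unique @ 128 KiB blocks -> ~{ddt_ram_gib(tib, 128*1024):6.1f} GiB DDT")
    print(f"{tib:>3} TiB unique @   8 KiB blocks -> ~{ddt_ram_gib(tib, 8*1024):6.1f} GiB DDT")
```

At 128 KiB blocks that works out to roughly 2.5 GiB of DDT per TiB of unique data; VM and database workloads with small blocks multiply that by 16x or more, which is where "you need a lot" comes from.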
 
I agree with this. Dedup on ZFS is immature and should be avoided for some time. It is easier and more trouble-free to buy more disks than to enable ZFS dedup.

However, compression is good and works fine on ZFS.
 
the big problem with any dedup is that unless you have a decent level of control over the data, your dedup ratios aren't that great.

example:

enterprise email: you can get some good dedup ratios here. emails with lots of folks CC'd, one or more attachments, and so on. real easy to see the benefits of dedup here.

hosted email for lots of small to medium-sized companies: it's all email, right? yes, but you won't get the same dedup ratios because every company is receiving different emails.

however, lzjb compression system-wide is a win in both scenarios, because text compresses extremely well.

I was talking with one of EMC's VCDX guys the other day and he told me even EMC doesn't do dedup for things like cloud and VDI. they too are getting better results with compression and snapshot clones.

dedup is a nice bullet-point feature, but the usefulness really just isn't there in most use cases.
 
A quick FAQ on the benefits (or lack thereof) of ZFS dedup can be found at http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup. While dedup is a nice theoretical exercise, most of the users here who are running home media servers will see little to no benefit for the additional overhead (based on their storage patterns). It is useful for production servers in some cases, but it is not something you should just enable thinking it is going to give you massive storage savings.
 
It's relatively easy to profile the data being considered for deduplication, which makes it easier to decide whether to bother.

For a home media server though, I agree with the previous poster - it'll almost certainly not be worthwhile!
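
For what it's worth, on a pool that already holds the data, zdb -S <poolname> will simulate dedup and report the ratio you would get without actually turning it on. For data that isn't on ZFS yet, a rough block-hashing pass gives a similar ballpark. A minimal sketch (fixed 128 KiB blocks, ignoring compression and recordsize boundaries, so it only approximates what ZFS would do):

```python
# Rough dedup-ratio profiler: hash fixed-size blocks under a directory and
# count how many are duplicates. Only an approximation of ZFS behaviour,
# but enough to decide whether dedup is worth considering for a data set.
import hashlib
import os
import sys

BLOCK_SIZE = 128 * 1024  # assume 128 KiB records; adjust to your recordsize

def profile(root):
    seen = set()
    total_blocks = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    while True:
                        block = f.read(BLOCK_SIZE)
                        if not block:
                            break
                        total_blocks += 1
                        seen.add(hashlib.sha256(block).digest())
            except OSError:
                continue  # skip unreadable files
    unique = len(seen)
    ratio = total_blocks / unique if unique else 1.0
    print(f"{total_blocks} blocks scanned, {unique} unique, "
          f"approx dedup ratio {ratio:.2f}x")

if __name__ == "__main__":
    profile(sys.argv[1] if len(sys.argv) > 1 else ".")
```

If the reported ratio is barely above 1x, the DDT overhead is almost certainly not going to pay for itself.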
 
My company is looking at using Nexenta as the foundation of the storage stack of a public cloud solution. Therefore, we will definitely be implementing dedupe, as we will be thin-provisioning all VMs and over-subscribing the storage we have.
FYI, we are going to have dual head nodes for each storage pool and they will have a minimum of 384GB of RAM.
 

don't. all I can say is don't. talk with your Nexenta rep if you don't believe me, but you'll destroy your performance. 384GB may sound like a lot but it really isn't. I'm building more or less the same thing you are with 512GB per head and won't even consider dedup.

you'll get much better results by creating a plugin that manages manual snapshot clones.
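
For reference, a minimal sketch of what that clone-based "manual dedup" looks like, using the standard zfs snapshot/clone commands (the dataset names are made up for illustration; a real plugin would need naming, quota, and promotion logic around this):

```python
# Minimal sketch of "manual dedup" via snapshot clones: one golden VM image,
# many writable clones that share all unmodified blocks with it.
# Dataset names are hypothetical; requires the zfs CLI and root privileges.
import subprocess

GOLDEN = "tank/vm/golden-image"   # hypothetical master dataset
SNAP = f"{GOLDEN}@deploy"         # snapshot the clones will branch from

def zfs(*args):
    subprocess.run(["zfs", *args], check=True)

def deploy_clones(count):
    # Snapshot the golden image once...
    zfs("snapshot", SNAP)
    # ...then hand each VM a clone; only the blocks a VM changes cost space.
    for i in range(count):
        zfs("clone", SNAP, f"tank/vm/guest{i:03d}")

if __name__ == "__main__":
    deploy_clones(10)
```

Each clone shares every unmodified block with the snapshot it came from, so a hundred nearly identical VM images cost little more than one, and none of it touches the DDT or eats RAM.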
 