ESXi ZFS box: worth it?

balnazzar

I'm trying to understand if a ZFS system makes sense when installed into ESXi.

Right, the ZFS-capable guest OS does use ZFS, but underneath it the virtual disks still sit on ESXi's own filesystem, VMFS, which as far as I can tell is much less reliable than ZFS.

Is this correct?? :confused:
 
I can't speak to VMFS, but ZFS would indeed not make sense on top of a virtual disk. Use a physical disk passed in via VT-d or an RDM.
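If you go the RDM route, the mapping file is created from the ESXi shell with vmkfstools; roughly like this (the disk identifier and datastore path are just placeholders):

Code:
# -z creates a physical-compatibility RDM (raw SCSI pass-through); use -r for virtual mode
vmkfstools -z /vmfs/devices/disks/naa.XXXXXXXXXXXXXXXX /vmfs/volumes/datastore1/zfsguest/disk1-rdm.vmdk

You then attach the resulting .vmdk to the guest as an existing disk.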
 
Thanks for the reply. I had already thought about VT-d, but unfortunately my present ESXi box doesn't support VT-d. Perhaps RDM could be viable...
 
I used ZFS with virtual disks just to play with it. It was fun to "destroy" a virtual disk while the VM was running and assign a new one to the pool to recover.
But for the real deal only native or passthrough access makes sense; ZFS prefers raw access to the disks to ensure consistency. Don't even put it on top of a hardware RAID.
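That recovery is just the normal resilver workflow; a quick sketch with made-up pool and device names:

Code:
zpool status tank                  # pool shows DEGRADED with the "destroyed" disk
zpool replace tank c2t3d0 c2t5d0   # swap the dead vdev for the new disk
zpool status tank                  # watch the resilver complete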
 
ZFS likes raw disks, but we do see it being used inside guests for various things (it makes a great volume manager).
 
Uhm, it appears I'd have to buy an additional controller (M1015?) to do passthrough: one controller has to stay with the hypervisor, and the other gets passed to the guest...

My little-cheap virtualization lab is getting expensive...
 
I'm confused. I thought you couldn't do vt-d? If you're using RDM, you certainly don't need those disks on a separate controller.
 
While I'm doing preliminary experiments on a regular desktop system, I'm planning to switch to a cheap T110 II, which does have VT-d. But as I said, once you add a controller, disks, and ECC memory, the T110 turns into a not-so-cheap thing.
 
I haven't really played with ZFS, but my understanding was that even if you don't use any of the funky RAIDZ stuff, there are still plenty of benefits that apply when running virtualised?

I'm thinking dedupe, snapshots, etc.
 
Do NOT use dedupe! There are very few use cases where it makes sense, and it is extremely memory hungry (so much so that people have reported file deletions taking days). Yes, though, you still get checksumming and a number of other nice features. What I like about an ESXi datastore hosted on a ZFS filesystem over NFS: you can snapshot the filesystem, do some stuff to the VM and, if it's borked, revert to the snapshot. Or, if you do periodic snapshots, you can clone a snapshot to another mount point, pull a specific VM's vmdk off it, and recover that way. You do not suffer any performance penalties from the snapshots...
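In ZFS terms that workflow is just snapshot / rollback / clone; a rough sketch with an invented dataset name:

Code:
zfs snapshot tank/esxi@before-patching
# ...patch (or break) the VM...
zfs rollback tank/esxi@before-patching    # whole datastore back to that snapshot (most recent snapshot only, otherwise -r)
# or recover a single VM without rolling everything back:
zfs clone tank/esxi@nightly tank/restore
# copy that VM's .vmdk out of /tank/restore, then destroy the clone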
 

Interesting, thanks for sharing that - those are the things you don't always hear about (I knew dedupe was memory hungry, but not the deletion point).
 
The problem comes up when your dedupe table no longer fits in RAM and spills onto disk. EVERY single lookup against the dedupe table then has to read (and possibly write) the disk before anything else can happen.
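If anyone is tempted anyway, you can at least estimate the dedupe table size before turning it on; something like this (pool name is just an example):

Code:
zdb -S tank            # simulates dedupe and prints a DDT histogram plus the expected ratio
zpool status -D tank   # once dedupe is enabled, shows the actual DDT entry counts

Figure on the order of a few hundred bytes of RAM per DDT entry, and the histogram tells you whether it will ever fit in memory.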
 
I've found NFS performs nowhere near as well as iSCSI on ZFS. I've tried many different configurations and distros trying to close the gap.

The best out-of-the-box performance I have found so far is Nexenta. I was passing through an IBM M1015 HBA and had allocated the VM 8GB of RAM.

If you do go NFS you will definitely need a dedicated ZIL (SLOG) device.
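Adding the log device itself is a one-liner; device names here are made up:

Code:
zpool add tank log mirror c3t0d0 c3t1d0   # mirrored SSD SLOG (a single unmirrored log device also works)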
 
I'm of the school of thought that if you are doing an all-in-one, you can just set sync=disabled at little, if any, risk. If you do, NFS should perform at least as well as iSCSI for writes (that was my experience anyway...)
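For reference, it's just a dataset property; with a hypothetical dataset name:

Code:
zfs get sync tank/esxi            # default is "standard"
zfs set sync=disabled tank/esxi   # writes are acknowledged without waiting for the ZIL

Just understand that on a power cut you can lose the last few seconds of acknowledged writes.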
 
Plus, if you're doing an all-in-one, the advantage of NFS over iSCSI is that ESXi will keep retrying the NFS datastore, which won't be available until the SAN VM comes up. With iSCSI, in my experience, ESXi would not retry the target on its own and would require user intervention to get your other guests online.
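For what it's worth, the NFS datastore can also be mounted and checked from the ESXi shell; the host IP, export path, and label below are just examples:

Code:
esxcli storage nfs add --host 192.168.1.50 --share /tank/esxi --volume-name zfs-nfs
esxcli storage nfs list   # shows the datastore as accessible again once the SAN VM is back up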
 
Yeah I had the same experience. I think it's not that vsphere doesn't try, but the timeout is much shorter...
 
Yep, it doesn't come up by itself; you have to boot the ZFS VM and then rescan the iSCSI software HBA (commands sketched below).

I would really like to know where you guys are getting this NFS performance from, though, as I've tried everything - a dedicated ZIL, sync=disabled, etc. Nothing like the speeds I get out of iSCSI.
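The manual kick looks roughly like this (the adapter name varies per host):

Code:
esxcli iscsi adapter list                               # find the software iSCSI adapter, e.g. vmhba33
esxcli storage core adapter rescan --adapter vmhba33    # or: esxcli storage core adapter rescan --all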
 
What ZFS distro are you using for the guest? What vnic?

I'm currently using Nexenta, but I've also tried NAS4Free, FreeNAS, raw FreeBSD, OI & Napp-IT.

I'm using the vmxnet3 NIC, though I have also tried the e1000.

I just cannot seem to get close to the performance with NFS that I get with iSCSI.
 
Be warned, RDM doesn't work with drives over 2TB, and using virtual drives is a bad idea.

If you don't have hardware passthrough and a SATA card to pass through, the best option IMO is SmartOS, optionally with FiFo installed for management. I'm using this currently and I'm very happy with it. A dedicated hypervisor with native ZFS and no limits.
 
It's worth noting that the people over in the FreeNAS forums recommend STRONGLY against using FreeNAS virtualized, especially without Direct I/O.

That being said, I've been running mine using IOMMU to pass through my M1015 controller for a good amount of time now with no issues (other than some initial setup tweaks).

I would strongly advise against running ZFS on any VMDK virtual drives. RDM sounds better, but it's still not something I'd rely on if you value your data. Just seems too risky to me.
 
I started using ESXi 5.1 with FreeBSD: a passed-through LSI HBA and a bunch of 2TB WD Reds, with an SSD soon to be added as cache. It's still not fully migrated and is in a kind of test phase. Once I'm confident with the setup I will move the stuff over from my QNAP and make the QNAP the backup storage. A task for rsync.
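Adding the SSD later is a single command; the device name is a placeholder:

Code:
zpool add tank cache ada1   # the SSD becomes an L2ARC read cache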
 
Be warned, RDM doesn't work with drives over 2TB, and using virtual drives is a bad idea.

If you don't have hardware passthrough and a SATA card to pass through, the best option IMO is SmartOS, optionally with FiFo installed for management. I'm using this currently and I'm very happy with it. A dedicated hypervisor with native ZFS and no limits.


I'm not sure this is true. I am using an all-in-one setup with three 3TB drives and I have all my space available.
 
I've found NFS performs nowhere near as well as iSCSI on ZFS. I've tried many different configurations and distros trying to close the gap.

The best out-of-the-box performance I have found so far is Nexenta. I was passing through an IBM M1015 HBA and had allocated the VM 8GB of RAM.

If you do go NFS you will definitely need a dedicated ZIL (SLOG) device.

Did you try TUNING NFS a little?

Have a read here, as some of the default settings are way, way too low:

http://utcc.utoronto.ca/~cks/space/blog/solaris/SolarisNFSServerTuning
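On a Solaris-derived storage VM (Nexenta, OI, etc.) I believe those knobs live under sharectl; something along these lines, with the value only as an example:

Code:
sharectl get nfs                   # show the current NFS server properties
sharectl set -p servers=1024 nfs   # raise the maximum number of NFS server threads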
 
I've binned any idea of NFS now. It actually went into the datacenter today, back on iSCSI with 4Gb FC.

Pulling about 400MB/s sequential and 15,000+ random IOPS - that will do fine :)
 
I've found NFS performs nowhere near as well as iSCSI on ZFS. I've tried many different configurations and distros trying to close the gap.

The best out-of-the-box performance I have found so far is Nexenta. I was passing through an IBM M1015 HBA and had allocated the VM 8GB of RAM.

If you do go NFS you will definitely need a dedicated ZIL (SLOG) device.

Odd...

I found the absolute opposite.


iSCSI for me had tons of inefficiencies and was kind of slow.

NFS was nice and lightweight and HAULED!
 
I'm not sure this is true. I am using an all-in-one setup with three 3TB drives and I have all my space available.

You need VT-d and a SATA controller to pass through, in my experience. Neither virtual drive images nor RDM would let me use drives over 2TB last I tried. Even if all my drives were <= 2TB I still wouldn't use that solution, because I would be in trouble at the next upgrade. If I had VT-d then ESXi would be viable, but as it is I find SmartOS is the only sane choice, and a competitive one even if I did have VT-d. I would much rather have ZFS in the hypervisor than in the guest OS.
 
You need VT-d and a SATA controller to pass through, in my experience. Neither virtual drive images nor RDM would let me use drives over 2TB last I tried. Even if all my drives were <= 2TB I still wouldn't use that solution, because I would be in trouble at the next upgrade. If I had VT-d then ESXi would be viable, but as it is I find SmartOS is the only sane choice, and a competitive one even if I did have VT-d. I would much rather have ZFS in the hypervisor than in the guest OS.

I agree that ZFS on ESXi should use VT-d to pass through a controller, and ESXi itself should be on a hardware mirror in that case. Not ideal, since ESXi itself can't be on ZFS, but it still beats not having ZFS at all. This is actually the setup I run myself: an M1015 passed through with IOMMU (VT-d). SmartOS seemed pretty good, but it has no passthrough, and I'm passing more than just my RAID controller through (also a NIC for a router VM's external interface, to keep it separated from VMware's network for security, and likely soon a GPU for an HTPC VM).

I wish people would either give up on virtualization platforms that aren't VMware/Xen, or else start actually delivering features like IOMMU so they rival the real platforms. I don't want no KVM crap. I see no purpose for it. (And I hate bloated kernels anyway. Dear Linux: your entire OS should not be contained in the kernel. What next, in-kernel web browsers?)
 
Kind of hyperbolic, IMO. "only sane choice"? Seriously? Doesn't do much for your credibility. Given how ESXi works, I can't think of any legitimate concern vis-a-vis not having zfs at the hypervisor level.
 
I agree that ZFS on ESXi should use VT-d to pass through a controller, and ESXi itself should be on a hardware mirror in that case. Not ideal, since ESXi itself can't be on ZFS, but it still beats not having ZFS at all. This is actually the setup I run myself: an M1015 passed through with IOMMU (VT-d). SmartOS seemed pretty good, but it has no passthrough, and I'm passing more than just my RAID controller through (also a NIC for a router VM's external interface, to keep it separated from VMware's network for security, and likely soon a GPU for an HTPC VM).

I wish people would either give up on virtualization platforms that aren't VMware/Xen, or else start actually delivering features like IOMMU so they rival the real platforms. I don't want no KVM crap. I see no purpose for it. (And I hate bloated kernels anyway. Dear Linux: your entire OS should not be contained in the kernel. What next, in-kernel web browsers?)

I don't have any issues with my KVM setups... what issues are you seeing? BTW, it's a kernel module, not the whole kernel!
 
I don't have any issues with my KVM setups... what issues are you seeing? BTW, it's a kernel module, not the whole kernel!

It's not that I use it and see issues. It's that it lacks support for many things (can't beat Xen for IOMMU; the only other decent hypervisor with IOMMU is VMware), and I honestly don't care if it's a "kernel module." Stop bloating the damn kernel. The whole OS is not supposed to be in the kernel. The Linux kernel is way, way, WAY too bloated, and because EVERYTHING is in the kernel, the kernel version matters so much to software and drivers. You often need kernel updates for software/driver updates. What a horrible system in general. Imagine needing a kernel update every time new drivers come out for Windows... LOL. And also, why do they think that "release candidates" should add new features and have huge changelogs in general? Linux RCs should be alphas, releases betas, and they don't have any quality 'release' channel at all.

Anyway, as I was saying, I'll continue to recommend real virtualization options.
 
Kind of hyperbolic, IMO. "only sane choice"? Seriously? Doesn't do much for your credibility. Given how ESXi works, I can't think of any legitimate concern vis-a-vis not having zfs at the hypervisor level.

Way to quote me out of context

" If I had VT-D then ESXi would be viable, but as is I find SmartOS is the only sane choice"

I do not have VT-d and I do have drives larger than 2TB. Teach me how to set that up on ESXi without splitting the drives into partitions.

I use SmartOS because it works on my platform without any limitations.

ESXi limits my RAM and limits my drive size to 2TB unless I use VT-d, which then limits my platform. As it is, it is not the appropriate choice for MY situation.

In general my whole post was a clarification of the first reply in the thread:

"Use a physical disk passed in via vt-d or an RDM. "
However the OP was not cautioned on the 2TB limit on RDM. I felt it was prudent because the OP replied...
"but unfortunately my present esxi box doesn't support vt-d. Perhaps RDM could be viable... "

All of a sudden, telling them that 2TB is the RDM limit becomes very important, don't you think?

My whole experience with ESXi can basically be summarized as 'limitations'. You may find that an overarching statement, but again, it was MY experience.
 
I'm currently using Nexenta, but I've also tried NAS4Free, FreeNAS, raw FreeBSD, OI & Napp-IT.

I'm using the vmxnet3 NIC, though I have also tried the e1000.

I just cannot seem to get close to the performance with NFS that I get with iSCSI.

Ahh,

I can't speak to the virtual networking; I don't use it much anymore. I have a dual-port Intel PRO/1000 PT NIC DirectPath I/O forwarded to my FreeNAS guest, on which I've set up LACP using BSD's lagg system to my 802.3ad-enabled switch (sketch below).

I prefer relying on virtual hardware as little as possible.
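The lagg side of that is only a few lines of FreeBSD rc.conf; the interface names and address are assumptions (FreeNAS does the same thing through its GUI):

Code:
ifconfig_em0="up"
ifconfig_em1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 192.168.1.20 netmask 255.255.255.0"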
 