ZFS as ESXi datastore

JamesP
In my system, ESXi boots first from an SSD and then launches the NAS4FREE VM, which uses a datastore on the same SSD. The remaining VMs are then launched one by one; these other VMs use a datastore directory hosted on NAS4FREE and exported via NFS. Just to be clear, the other VMs do not know anything about NAS4FREE; they think they are simply accessing a locally attached drive on /dev/sda1.

NAS4FREE is performing really well on my network, and Windows hosts are able to both read and write at 60-100 MB/s over CIFS.

My virtual machines running on ESXi are not so lucky. They can read from their 'local' disk at 100 MB/s, but writes are around 6 MB/s. Is there anything I can do to speed this up? Or is there just too much overhead going from the VM to ESXi and then through NFS to NAS4FREE?
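
(A quick way to confirm sync writes are the culprit, assuming GNU dd inside the Ubuntu guest; the paths are just examples:)

Code:
# Buffered writes - look fast while the guest page cache absorbs them:
dd if=/dev/zero of=/tmp/test.bin bs=1M count=512
# Force each write to stable storage before dd continues:
dd if=/dev/zero of=/tmp/test-sync.bin bs=1M count=512 oflag=dsync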

Thanks,
JP
 
Turn off sync writes - ESXi issues its NFS writes synchronously, which is why CIFS is fine while the VMs crawl. Do look up the risks involved with doing this, though.
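
On v28-era ZFS this is a per-dataset property; a minimal sketch, assuming the NFS export lives on a hypothetical dataset tank/vmstore:

Code:
# WARNING: a power loss or panic can drop the last few seconds of
# writes that the guests believe were committed.
zfs set sync=disabled tank/vmstore
zfs get sync tank/vmstore             # verify
zfs set sync=standard tank/vmstore    # revert to the default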
 
Turn off sync writes - ESXi issues its NFS writes synchronously, which is why CIFS is fine while the VMs crawl. Do look up the risks involved with doing this, though.

Thanks for the tip and pointer! I guess my CIFS writes were asynchronous so that's why I didn't see this problem before. Seems that the best option (without turning off sync writes) is to add a ZIL drive. Now to find a cheap reliable ZIL drive...
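
For reference, attaching a dedicated log device is a one-liner; a sketch with placeholder pool/device names (tank, da2):

Code:
# Add an SSD as a dedicated ZIL (SLOG) device; pool v19+ can remove it again.
zpool add tank log da2
# A mirrored pair is safer against SLOG failure:
# zpool add tank log mirror da2 da3
zpool status tank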

JP
 
How much memory have you allocated for your NAS4FREE VM? Also, what ZFS version do you have running?
 
Thanks for the tip and pointer! I guess my CIFS writes were asynchronous so that's why I didn't see this problem before. Seems that the best option (without turning off sync writes) is to add a ZIL drive. Now to find a cheap reliable ZIL drive...

JP

Unless your ZIL drive has supercaps or a BBU (i.e., an expensive SSD), you really are not much safer than with sync off.
 
Unless your ZIL drive has supercaps or a BBU (i.e., an expensive SSD), you really are not much safer than with sync off.

Ya, I'm trying to find one with caps but still haven't found a reasonably priced one... You'd think that someone would sell an adapter with capacitors that you could place in series with the power lines of a regular SSD that would guarantee power for X seconds after a power failure...

JP
 
How much memory have you allocated for your NAS4FREE VM?

It's got 12G allocated to it and the 'other' VM (Ubuntu) has 2G allocated to it.
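
(On the FreeBSD side you can sanity-check how much of that memory the ARC is actually using; two standard sysctls:)

Code:
sysctl kstat.zfs.misc.arcstats.size   # current ARC size in bytes
sysctl vfs.zfs.arc_max                # configured ceiling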

Also, what ZFS version do you have running?

It's v28... I guess v28 is better than previous versions with regard to the ZIL, but I'm not sure how this translates into v28 ZIL disk requirements. It seems that basic (non-supercap) SSDs are still not recommended.
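
(For anyone checking their own setup, with tank as a placeholder pool name:)

Code:
zpool get version tank   # pool version - 28 here
zfs get version tank     # filesystem version - v5 at pool v28
zpool upgrade -v         # list everything this build supports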

JP
 
Ya, I'm trying to find one with caps but still haven't found a reasonably priced one... You'd think that someone would sell an adapter with capacitors that you could place in series with the power lines of a regular SSD that would guarantee power for X seconds after a power failure...

JP

The SSD controller needs to know it lost power so it can flush its cache.
 
I was reading an article about using ZFS with NFS for VMware platforms; there is supposedly a sweet spot for the record size (default 128k) somewhere around 8-16k. I have been futzing around with this setting as well. From what I read, setting it lower gains IOPS over throughput. Maybe LOP has some suggestions.
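
A sketch of the tunable in question, using a hypothetical tank/vmstore dataset:

Code:
zfs set recordsize=16k tank/vmstore
zfs get recordsize tank/vmstore
# Note: only blocks written after the change use the new size;
# existing data keeps its old block size until rewritten.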

Also, ensure your NFS network is isolated from your primary network so you aren't saturating your primary IP subnet with NFS data (create a separate NIC).

Using OmniOS w/NFS for my VDP machine
 
Ya, I'm trying to find one with caps but still haven't found a reasonably priced one... You'd think that someone would sell an adapter with capacitors that you could place in series with the power lines of a regular SSD that would guarantee power for X seconds after a power failure...

JP

I think you meant 'in parallel with'?
 
I think you meant 'in parallel with'?

Umm, I guess if such a product were to exist, it would be connected in series as in power supply -> supercap dongle -> SSD. Inside the dongle, the supercap would be wired in parallel with the power supply lines, but it would also need some sort of series switch between the supercap and the power supply to prevent the charge stored in the supercap from flowing back down the supply lines in case the power failed.

The SSD controller needs to know it lost power so it can flush its cache.

Oh, this is something new I didn't know about. I wonder if typical SSDs have a timeout that automatically flushes the cache after X seconds? Then the supercap would only need to supply power for X+1 seconds to ensure that all data had successfully been written.

Kickstarter project anyone? :)

JP
 
Umm, I guess if such a product were to exist, it would be connected in series as in power supply -> supercap dongle -> SSD. Inside the dongle, the supercap would be wired in parallel with the power supply lines, but it would also need some sort of series switch between the supercap and the power supply to prevent the charge stored in the supercap from flowing back down the supply lines in case the power failed.



Oh, this is something new I didn't know about. I wonder if typical SSDs have a timeout that automatically flushes the cache after X seconds? Then the supercap would only need to supply power for X+1 seconds to ensure that all data had successfully been written.

Kickstarter project anyone? :)

JP

They don't ;) That's what else is special about enterprise grade SSDs.
 
I was reading an article about using ZFS with NFS for VMware platforms; there is supposedly a sweet spot for the record size (default 128k) somewhere around 8-16k. I have been futzing around with this setting as well. From what I read, setting it lower gains IOPS over throughput. Maybe LOP has some suggestions.

Also, ensure your NFS network is isolated from your primary network so you aren't saturating your primary IP subnet with NFS data (create a separate NIC).

Using OmniOS w/NFS for my VDP machine

That's part of the fun of performance tuning - finding those sweet spots. Lots of tuning on both sides to do.
 
That's what else is special about enterprise grade SSDs.

It seems that there are several Intel drives now that have 'enhanced power loss data protection' (a supercap equivalent). There's the 320, the 710, and the S3700, with the S3700 being the newest, going for $250 for 100GB. Still trying to figure out if any other manufacturers have anything equivalent :(

I actually moved my datastore from NFS to iSCSI, and although it is much better, it still isn't stable. Best I can guess is that small transactions are still quite slow, but burst speed (dd'ing 1GB files) is good enough at 100 MB/s.
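
For anyone trying the same move, the ZFS side of it is simple; a sketch with placeholder names (tank, a 200G LUN):

Code:
# Create a zvol to back the iSCSI LUN instead of an NFS file datastore.
zfs create -V 200G tank/esxi-lun0
# On FreeBSD it appears as /dev/zvol/tank/esxi-lun0 for the target config.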

James
 
dd is a lousy benchmark - it only really measures linear I/O with a single outstanding command ;)

Use Iometer or I/O Analyzer :)
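
(If you'd rather stay on the command line, fio can approximate the same kind of test from inside a Linux guest; these are all standard fio options:)

Code:
# Random 4k writes, 32 outstanding I/Os, direct I/O, 60-second run:
fio --name=randwrite --rw=randwrite --bs=4k --size=1g \
    --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --time_based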
 
In case anyone is interested, I went with the S3700 and I get up to 100MB/sec now with large file writes (dd, etc.).

Before adding the ZIL drive (these are write speeds measured by Iometer):
Code:
Request size   Ops/sec   Throughput (MB/s)
  4k               67         0.3
  8k               74         0.6
 16k               72         1.1
 32k               70         2.0
 64k               40         2.3
128k               20         2.4

After adding ZIL:
Code:
Request size   Ops/sec   Throughput (MB/s)
  4k             2.9k        11
  8k             1.9k        14
 16k             2.2k        30
 32k             1.1k        34
 64k             1.2k        68
128k             0.7k        91

A few more details of how I did the tests:
http://hardforum.com/showpost.php?p=1039777725&postcount=4

JP
 