Physical vs. Virtual FreeNAS + iSCSI vs. NFS: performance testing results.

Johnyblaze

Hi everyone,

I've wanted to see how FreeNAS performs in back-to-back tests on physical vs. virtual hosts on the same hardware for a while now, and I finally have all the parts together to do some initial testing. My vision is a single box that can do it all for my homelab and for testing: ESXi as the hypervisor, with FreeNAS virtualized on top of it and hosting a nested datastore for the rest of my ESXi VMs.

I wanted to share my results in case anyone else was curious; I haven't seen anything like this benchmarked before. This is not a production system and the results are far from scientific, but they should give you a little insight into the performance of this type of configuration.

Pertinent hardware specs:
  • FreeNAS 9.10.1
  • SuperStorage Server 6048R-E1CR36L chassis
  • X10DRH-iT motherboard
  • Dual Xeon 2620 v4 CPUs
  • 128 GB RAM
  • LSI 3008 HBA
  • 14x Seagate IronWolf 10TB 7,200 RPM drives in a single pool consisting of 7x mirrored vdevs that are striped (RAID 10); a rough creation sketch follows this list.
  • 2x 80 GB Intel SSD DC S3510, overprovisioned to 10 GB, for SLOG
  • 1x 256 GB Samsung 850 Pro SSD for cache
  • 2x onboard Intel X540 NICs (MTU 9000)
  • Netgear XS708E 10GbE switch with VLANs set up for the storage network to isolate its traffic.
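For reference, here is a rough sketch of how a pool with that layout might be created from the shell. FreeNAS normally builds this through the GUI and uses gptid labels, so the device names below are hypothetical placeholders, and I'm assuming the two S3510s are mirrored for the SLOG.

Code:
  # 7 mirrored vdevs striped together (RAID 10 style), hypothetical device names
  zpool create vol0 \
    mirror da0 da1  mirror da2 da3  mirror da4 da5  mirror da6 da7 \
    mirror da8 da9  mirror da10 da11  mirror da12 da13
  # mirrored SLOG on the two overprovisioned S3510s, L2ARC on the 850 Pro
  zpool add vol0 log mirror ada0 ada1
  zpool add vol0 cache ada2
  # confirm the layout
  zpool status vol0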
Physical host setup:

FreeNAS was installed on a USB device in the SuperStorage Server 6048R-E1CR36L chassis with the X10DRH-iT motherboard. The onboard Intel X540 10GbE NICs were plugged into the Netgear XS708E 10GbE switch. On the ESXi host side of things, I used a Supermicro X10SDV-TLN4F based 1U server, which has an Intel Xeon D-1540 SoC and integrated Intel X552 10GbE NICs, also plugged into the same Netgear XS708E switch. The screenshots below are from a Windows Server 2012 R2 VM on the iSCSI and NFS datastores.
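One sanity check that may be worth running with MTU 9000 end to end is a don't-fragment ping with a jumbo-sized payload from both sides; the IPs and vmkernel interface below are hypothetical placeholders.

Code:
  # from the FreeNAS shell: 8972-byte payload + 28 bytes of headers = 9000, with DF set
  ping -D -s 8972 192.168.10.20
  # from the ESXi shell: same idea against the FreeNAS storage IP, via a specific vmkernel NIC
  vmkping -d -s 8972 -I vmk1 192.168.10.10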

Physical FreeNAS, iSCSI VM, sync=disabled

iscsi_physical.png

Physical FreeNAS, NFS VM, sync=disabled

nfs_physical.png

I forgot to screenshot the results for the physical host with sync=always, but the writes were about the same as what you'll see below with the FreeNAS VM, around 100 MB/s.

VM host setup:


The general process for the VM setup was as follows:

A FreeNAS VM was created on a SATA DOM datastore that is physically in the host. The LSI 3008 HBA was then passed through to the FreeNAS VM (PCI passthrough) so the VM has full access to the disks. VMXNET3 NICs were used for the VM, and the MTU was set to 9000 in both ESXi networking and FreeNAS. Either an iSCSI zvol target or an NFS share was set up in the FreeNAS VM and presented back to the ESXi host, where I created a nested datastore. I then created a Windows Server 2012 R2 VM on that nested datastore. The disk performance results below are from the various instances of the Windows Server 2012 R2 VMs that were on the aforementioned datastores. The Intel E1000E NIC was used for the Windows Server 2012 R2 VMs, with the MTU set to 9014 bytes.
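For anyone following along, toggling the sync behavior between test runs is just a ZFS property on the zvol or dataset backing the datastore; the dataset name here is a hypothetical placeholder.

Code:
  # sync=disabled: acknowledge writes before they are on stable storage (fast, unsafe)
  zfs set sync=disabled vol0/iscsi-vm
  # sync=always: force every write through the ZIL/SLOG (safe, slower)
  zfs set sync=always vol0/iscsi-vm
  # check the current setting
  zfs get sync vol0/iscsi-vm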

VM FreeNAS, iSCSI VM, sync=disabled

iscsi_vm.png
VM FreeNAS, iSCSI VM, sync=always

iscsi_vm_sync_writes.png

VM FreeNAS, NFS VM, sync=disabled

nfs_vm.png
VM FreeNAS, NFS VM, sync=always

nfs_vm_sync_writes.png
Questions:
  • Based on the info I provided, is there any glaring reason why my performance metrics take such a hit with sync=always?
  • Which do you choose for your ESXi datastores, iSCSI or NFS, and why? Right now I'm leaning towards NFS based on these results and the hassle I see in general with iSCSI tuning on FreeNAS.
  • What other disadvantages do you see with this configuration as it relates to virtualizing FreeNAS, considering you can pass through an HBA and this is a non-production environment?
Conclusion:

I'm still not totally sold one way or the other on virtualizing FreeNAS, but the power savings of running fewer servers is tempting. It's also tempting to use sync=disabled for the ESXi datastores, but I know that's not smart, so I need to get that figured out. Additionally, virtualized FreeNAS is pretty annoying after a reboot or power outage: you have to either SSH into the ESXi host and rescan the storage adapters (which I realize you can automate, but that doesn't work if you're using FreeNAS volume encryption) or do it manually from the GUI before you can power on the VMs that live on the nested datastore hosted by the FreeNAS VM. It's also, of course, a big negative that this type of setup takes down all of your nested VMs whenever something goes wrong and the system goes down.
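For what it's worth, the rescan I do over SSH is basically the stock esxcli rescan; something along these lines, though your exact workflow may differ.

Code:
  # rescan all storage adapters so the target presented by the FreeNAS VM is seen again
  esxcli storage core adapter rescan --all
  # then rescan for VMFS volumes so the nested datastore shows back up
  esxcli storage filesystem rescan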

Furthermore, my results show that sync=always is a huge performance killer, even with Intel DC S3510 SSDs for a SLOG. I'm not sure if I'm doing something wrong, but the performance loss is just too great. I don't really care if sync=disabled winds up destroying my VMs in the ESXi datastore, since Veeam is easy enough to use, but I would be upset if I lost the data on the rest of my FreeNAS volume, so that's concerning. I just can't imagine that the Intel DC S3510 is too slow to use as a proper SLOG and will only put out 100 MB/s. I wish I had an NVMe drive to test SLOG performance with...

At any rate, I will keep testing and see if I can get the performance to an acceptable level whilst still using sync=always. My results make me feel like virtualizing FreeNAS will be a viable option for my non-production environment, and I don't really see any downsides as long as you can pass through a proper HBA to the VM, but again, I'm not sure which way I'll go just yet as I want to do more testing. I just wanted to share a few of my initial tests and start a post to get some discussion going. I'll report back once I come to a conclusion on how I'm going to set things up or if I have any other interesting data to share. I would enjoy any questions, opinions, or feedback you have on this configuration, and I'd like to hear about your similar setups. I'd also really like to figure out why things are so slow with sync=always. Thanks for any feedback.
 
Sweet write-up!!! I did a few rounds of testing similar to this, albeit less comprehensive and on a much smaller scale (in terms of hardware), a while back. It was very barebones, quick and dirty (while yours appears to be well laid out and methodical, so kudos). I also neglected to do a write-up and take screenies. Mine was almost strictly for the purpose of testing the viability of moving to 10GbE (direct-attach copper between my bare-metal FreeNAS and ESXi boxes), but it went extremely well, so I picked up some X520-DA2's and cables. Thanks for sharing your procedure, results, and the accompanying screenies, much appreciated.

I also wanted to mention that, for me and my use case, adding a SLOG and L2ARC didn't boost read/write performance numbers so much as it boosted IOPS for the pool. It also seemed to have a positive impact on the loading of CIFS/SMB shares for clients browsing said shares (the directory contents loaded much quicker in Explorer and Finder).
 

Thanks for your feedback. I appreciate your comments. I was taking the screenshots for my own amusement and then figured, hey, why not share it with the forum. Yes, I also noticed that CIFS shares browse a lot more quickly, so that's always a positive.
 

Haha yeah unfortunately in my case it was all an afterthought (in regards to the screenies, documentation and info sharing). Thanks again and keep up the good work. If you do more like this please feel free to keep us updated!

Follow-up questions:
- What's your overall power consumption looking like?
- How's that Xeon-D working out for you? (I'm more tempted than ever but for a number of various reasons I'm still waiting/hesitant, performance wise they look great but it's mostly a cost & feature set thing for me)
- Any particular reason you went with 10GBASE-T over 10GBASE-CX4 (or some other 10GBASE standard)?
 
Thanks for sharing your results.

With regards to the performance with sync=always:

Your performance is exactly what is to be expected.

Look in the product spec for your SLOG devices: [Page 9/33 table 4]
Intel® SSD DC S3510 Series Product Specification

The 80 GB model is specced at 110 MB/sec sequential write. SLOG devices have a sequential write usage pattern, so the numbers are exactly what they should be. Your performance is bad because your SLOG devices are holding back your whole system. I think you would be faster with sync=always and no SLOG devices at all. You could try rebuilding the pool without the SLOG devices and testing again.
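If you want to measure the SLOG candidates directly before rebuilding anything, FreeBSD's diskinfo has a sync-write test that is commonly used to vet SLOG devices, assuming your FreeNAS build's diskinfo supports the -S flag. Note that -w is destructive to any data on the device, and the device name below is a placeholder.

Code:
  # destructive sync-write latency/throughput test against an unused SSD
  diskinfo -wS /dev/ada0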

I am also suspicious of the 850 Pro as cache. There is no way that cache disk will keep up with the rest of the pool. When you are reading uncached data from the platters, that data has to be written to the cache disk to populate it. You can read at 1700+ MB/sec, which is considerably higher than the write speed of the cache drive. I am not certain whether sequential reads populate the cache, but my guess is that most reads will go to the platters and not to the 850 Pro, because the 850 is overloaded. If you are able to retest, ditch the cache drive and test with just the 14 spinners.
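One way to check whether the cache drive really is the bottleneck would be to watch per-device I/O while a benchmark is running; the cache device name here is a placeholder.

Code:
  # per-vdev I/O every 5 seconds; the cache device shows up in its own section
  zpool iostat -v vol0 5
  # per-disk busy% and latency, filtered on the cache device
  gstat -f ada2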

I would not consider anything less than a high-end NVMe SSD for cache or SLOG duty in that system.

Building storage servers is hard :)
 

- Power consumption for the FreeNAS box:

2016-10-05 18_27_14-Mozilla Firefox.png

- The Xeon-D is really amazing. That thing is like 50 watts at idle, can take 128 GB RAM, is really surprisingly fast and works so well for a homelab ESXi VM host. Highly recommended.

- No reason for 10GBASE-T in particular. It just felt easier since I already had the Xeon-D, which has 10GBASE-T ports on it, and the Netgear 10GbE switch, not to mention many motherboards are starting to come with 10GBASE-T adapters.
 

Thanks for the feedback. Yes, I did a little more testing, and like you said, the DC S3510 is simply a poor choice for a SLOG device. The following screenshot is with the DC S3510s removed (no SLOG on the pool) and sync=always, so writes are hitting the spinning disks directly.

iscsi_physical_no_slog_sync_writes.png

Thanks for the info on the 850 Pro. When you break it down like that, it's clear it won't be fast enough. I really appreciate it. I probably don't even need it for my workload since the server has 128 GB of RAM anyway. I should probably just take it out, as it's only going to slow things down.
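If it helps, a cache device can be dropped from a live pool without rebuilding anything; the device name below is a placeholder, so check zpool status for the real one first.

Code:
  zpool status vol0          # find the exact name of the cache device
  zpool remove vol0 ada2     # detach the L2ARC device from the pool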

Oh and building storage servers is really hard! I'm getting close, though... I'm doing all this in part so I can make some informed recommendations to my clients with confidence. I'll get there.

Thanks again!
 
Thanks a lot for taking the time.

If you want to do some more tests, I think the file size you picked is too small (1 GB); that will mostly go to RAM. That's why there is such a huge difference between sync=always and sync=disabled.

I would go with fewer passes (maybe 2 passes, but with a 16 or 32 GB file); worth a try if you have time.
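As a rough alternative to a bigger CrystalDiskMark file, a quick sequential check straight from the FreeNAS shell could look like the sketch below. Keep in mind /dev/zero compresses to almost nothing if compression is on for the dataset, which would inflate the numbers; the path is a placeholder.

Code:
  # write and read back a 32 GB file to get past most of the ARC
  dd if=/dev/zero of=/mnt/vol0/ddtest bs=1M count=32768
  dd if=/mnt/vol0/ddtest of=/dev/null bs=1M
  rm /mnt/vol0/ddtest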
 

You're absolutely right. I totally missed that. I'm ordering the 400GB DC P3700 and will re-run the tests at 32GB to see how that changes things.
 

Wow, I'm impressed and surprised by the power numbers there, nice. Maybe I'm weird, but upon reading the specs I immediately ruled out any Xeon-D motherboard that doesn't have SFP+. But I am really looking forward to joining the Xeon-D club one day, hopefully soon.
 

Yes, it is really great considering it's got 14x HDDs, 3x SSDs, 2x 8-core 2620 v4 CPUs, 4x DIMMs, an LSI 3008 HBA, and 2x 1400 watt PSUs. I was surprised, too. SFP+ would be nice for sure. I wish everything had SFP+.
 

SFP+ is great right?!?! After doing research, I figured it would be the most versatile and come out on top. Do you happen to have pics of the setup(s)?
 
UPDATE:

400 GB DC P3700 came in. Results as follows:
  • Ultimately decided to run FreeNAS on the physical hardware and ditch virtualizing it, because as you'll see below it runs so well that I've decided to introduce this box into a production environment at a client's site once I finish up more testing.
    • This client currently uses individually managed ESXi hosts that all have DAS, all running the free ESXi hypervisor. The new box will be used as SAN storage for the hosts' VM datastores. The hosts will be clustered, managed with vCenter Server, and used with vMotion.
  • Since VAAI is so important I ultimately decided to use iSCSI for the datastores.
  • I removed the 2x 80 GB Intel SSD DC S3510 SLOG devices (overprovisioned to 10 GB) from the server.
  • I installed the 400 GB DC P3700.
  • I partitioned the 400 GB DC P3700 so that only 8 GB is used for the SLOG (effectively overprovisioning it) using the following commands:
    • Code:
      # create a GPT partition table on the P3700
      gpart create -s gpt nvd0
      # add a single 8 GB, 4k-aligned partition labeled log0; the rest of the drive stays unused
      gpart add -t freebsd-zfs -b 2048 -a 4k -l log0 -s 8G nvd0
      # attach the labeled partition to the pool as the SLOG
      zpool add vol0 log gpt/log0
      # verify the partition layout
      gpart show nvd0
  • The DC P3700 now looks like this from the shell and GUI:
dc_p3700_gpart_info.png
dc_p3700_fn_gui_info.png
  • I set up a task to run the following command post init:
init_script.png
  • I spun up a VM with a VMXNET3 NIC, set the MTU to 9000, and re-ran CrystalDiskMark with a 16 GiB test file.
    • Physical FreeNAS, iSCSI VM, sync=always, with the DC P3700 being used as the SLOG.
iscsi_physical_sync_writes.png
I still want to test a few things here and there, but overall I am happy with the results and learned a ton setting this up. I do have another DC P3700 that I could use as a cache drive, but at this point I'm not even sure I need it. I have to do more testing to look at ARC hit rates and see whether it's even necessary, and also test a few other odds and ends. I'll keep everyone updated as things progress. Thanks to everyone for all the help and feedback thus far!
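For the ARC hit rate question, the raw counters are exposed as sysctls on FreeBSD (hit ratio is hits divided by hits plus misses), so a quick look might be something like this.

Code:
  # primary ARC hit/miss counters
  sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
  # L2ARC counters (only meaningful while a cache device is attached)
  sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses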
 

Intriguing results, thanks for the update. I have a similar setup at home: a FreeNAS box for storage of various kinds running on bare metal and two ESXi boxes running a variety of VMs and OSes. All three have 10GbE cards using SFP+ in direct-attach. But for the datastores I ended up using SSDs and exporting via NFS.
 

Thanks for the info. The issue is that FreeNAS only supports VAAI on its iSCSI datastores, not on NFS, which is kind of a big deal IMO.
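If you ever want to confirm which primitives a datastore's backing device is actually advertising, ESXi can report VAAI status per device; this lists all attached devices.

Code:
  # show VAAI (ATS / Clone / Zero / Delete) support for attached devices
  esxcli storage core device vaai status get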
 

Oh yeah, VAAI is pretty sweet; I've only read about it. Nothing I've done has required it, so it wasn't something I investigated in practice, only in research and theory. I ended up with SSDs purely for speed and separation from the main pool. I also wasn't thrilled by the note in the FreeNAS docs about iSCSI fragmentation (which could be mitigated, speed-wise, by going with SSDs) or the guidance to keep space utilization at 50 or 75%. So I went with NFS.
 

Yup, all good points. I just set the pool's "Available Space Threshold (%)" to 80 and the extent's "Available Space Threshold (%)" to 50, so I think that will cover it. I also only used 50% of the extent in ESXi for the datastore.
 

Ooooh yeah, that stuff too. I got confused when I first started my experiment with the extents, file vs. device (or something), etc. So I did more research, and after seeing an update to a page I'd read early on, I found they were now recommending NFS, so I did that, lol. To each their own, I always say, and use the right tool for the job.
 