ESXi 6.5u2 Network speed between VMs

mda

Hello All,

I have a question re: what SCP speeds I should be seeing when copying from VM to VM.

The box I'm using is quite beefy for what I'm supposed to be doing:

HW
Lenovo Server
2x Xeon Silver 4114 (10c/20t each CPU)
96GB RAM
2 Ethernet ports, 1 Gigabit each
4x Intel 4600 SSDs in RAID10 via a dedicated RAID card
Each one can do about 400MB/s read and write, so with RAID10 I'm expecting a bit more than what I'm getting

ESXI Config
I'm an idiot/noob at this and didn't configure ESXi much. I just set my management IP to static and added the 2nd Gigabit Ethernet port to work as a NIC team.

Guest VMs
RHEL 5.8 64bit, using the VMXNet3 Ethernet Driver
RHEL 5.8 32bit, using the VMXNet3 Ethernet Driver
Each has quite a bit of resources allocated: at least 10 cores per VM and over 16GB RAM, with no overallocation of resources. These, as well as a tiny Ubuntu VM just used for remote desktop and other minor things, are the only VMs on this computer.

Both VMs see a 10Gb Ethernet link (please see pic attached).


**This may or may not be significant: I set up both VMs initially with the E1000 hardware. I installed VMware Tools BEFORE switching the hardware to the VMXNet3 driver in ESXi. No other changes were made to the guest OSes.
Edit: Not significant. I made 2 VMs with the VMXNet3 driver set as the initial hardware; there was no network until I installed VMware Tools. Exactly the same result.

Issue
I'm copying about 50 big 4GB files from one VM to the other via SCP, and I'm seeing transfer rates between 110-125MB/s. I'm assuming this should be quite a bit faster, since I'm probably not bottlenecked by the Ethernet or the SSDs.

I've read that since file transfers between the two VMs go through the vSwitch and not through the physical Ethernet ports, I should be seeing a much faster copy. Why is this not the case?

The exact command I'm using to transfer:
Code:
scp -r -c arcfour <files> <user>@<ip>:/directory

Thanks!
 

Attachments

  • ETH.JPG (30.5 KB)
You have the right idea with the VMXNet3 drivers, but VM to VM doesn't quite work that way: it still routes through the vSwitch, which is the same speed as the physical ports (1Gb).
That's your bottleneck. 1 Gigabit is about 125 MB/s max, so 110-125MB/s seems about right.

To get that internal 10Gb rate you would need to create an isolated network between the two VMs without physical adapters attached to the vSwitches: https://kb.vmware.com/s/article/2043160
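For reference, a rough sketch of what that could look like from the ESXi shell (the vSwitch and port group names here are just placeholders; the KB article above covers doing the same thing in the UI):

Code:
# Create a standard vSwitch with no physical uplinks, plus a port group on it.
# Attach both VMs' vmxnet3 NICs to "InternalPG" so their traffic never leaves the host.
esxcli network vswitch standard add --vswitch-name=vSwitchInternal
esxcli network vswitch standard portgroup add --portgroup-name=InternalPG --vswitch-name=vSwitchInternal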

That transfer path should yield a max of about 950MB/s instead of 125MB/s (depending on the transfer workload).
I think your SSDs would be your bottleneck at that point.


A second option would be to do link aggregation with the two 1Gb ports, if you have a switch that supports it.
A third option would be to buy some cheap 10Gb cards, though you'd need a 10Gb switch then :p http://a.co/7X7b21c
 
I'm no expert, but I just tried a quick iperf test between two VMs on my 6.5 install out of curiosity, as I'm pretty certain I read in the VMware host config book that VMs connected to a vSwitch aren't limited to 1Gbps. These VMs are both Debian stretch, 1 vCPU, 1GB RAM, with low utilisation, on a Xeon D-1541 host connected to a vSwitch with a 10Gbps uplink to my router.

Code:
root@mrtg:/home/mrtg# iperf -c tftp -i 1
------------------------------------------------------------
Client connecting to tftp, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.10.26 port 41600 connected with 192.168.10.23 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  2.65 GBytes  22.8 Gbits/sec
[  3]  1.0- 2.0 sec  2.18 GBytes  18.7 Gbits/sec
[  3]  2.0- 3.0 sec  1.72 GBytes  14.8 Gbits/sec
[  3]  3.0- 4.0 sec  1.71 GBytes  14.7 Gbits/sec
[  3]  4.0- 5.0 sec  1.81 GBytes  15.5 Gbits/sec
[  3]  5.0- 6.0 sec  1.82 GBytes  15.6 Gbits/sec
[  3]  6.0- 7.0 sec  1.85 GBytes  15.9 Gbits/sec
[  3]  7.0- 8.0 sec  1.86 GBytes  16.0 Gbits/sec
[  3]  8.0- 9.0 sec  1.70 GBytes  14.6 Gbits/sec
[  3]  9.0-10.0 sec  1.87 GBytes  16.1 Gbits/sec
[  3]  0.0-10.0 sec  19.2 GBytes  16.5 Gbits/sec
 
Thanks for the info.

What you guys said so far just makes sense.

I'd basically either need a 10Gbps uplink to a physical switch or a new dedicated virtual switch...

the latter is cheaper but I need to do a little more research.

I have a Cisco SG300 with LAG on hand, but I haven't gotten around to setting it up for this purpose yet.

Thanks again!
 
Two VMs connected through the same VMware vSwitch shouldn't use the uplink at all, so buying a new switch shouldn't be necessary to solve this problem. Once you are transferring out of the vSwitch, e.g. in a router-on-a-stick type config where there's a router firewalling subnets, a better switch may be needed to increase bandwidth. For a single-threaded transfer, LAG won't help either.

In terms of optimising vmware, this is a good read https://www.amazon.com/VMware-vSphe...swatch_0?_encoding=UTF8&qid=1530773845&sr=8-1
 
Check the vSwitch for traffic shaping
Check the guest for the latest VMware Tools
Check the guest NIC config and make sure the speed doesn't show as negotiated at 1Gbps (see the ethtool sketch below)
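For that last check, one quick way from inside the guest (the interface name is a guess; adjust for your setup):

Code:
# On a vmxnet3 NIC this should report Speed: 10000Mb/s, not 1000Mb/s
ethtool eth0 | grep -i speed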
 
I'm no expert, but I just tried a quick iperf test between two VMs on my 6.5 install out of curiosity, as I'm pretty certain I read in the VMware host config book that VMs connected to a vSwitch aren't limited to 1Gbps. These VMs are both Debian stretch, 1 vCPU, 1GB RAM, with low utilisation, on a Xeon D-1541 host connected to a vSwitch with a 10Gbps uplink to my router.

Interesting, I get the same result too: ~9.96Gbit from Server 2012R2 -> Ubuntu 16.04 64-bit. I had the opposite thought for some reason.

mda, assuming you just did a default ESXi install, I can't think of why it wouldn't be a 10Gb transfer. Can you install iperf and check the results to confirm it's not network based?
The only other thought would be a bad manual configuration, or your storage bottlenecking you for some reason.
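For what it's worth, a minimal iperf run between the two VMs could look like this (iperf2 syntax to match the output above; the IP is a placeholder):

Code:
# On the receiving VM:
iperf -s

# On the sending VM (10 second run, reporting every second):
iperf -c 192.168.1.20 -i 1 -t 10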
 
I'll check out iperf sometime next week when I set up new VMs to test.

Thanks again everyone!
 
Don't bother checking iperf. scp is typically limited by single-threaded CPU capacity; on nearly every box I have seen, this tops out at <150MB/s. You need to use a different protocol if you want faster transfers.
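A quick way to sanity-check that on the sending VM while a transfer is running (just an illustration; any process monitor will do):

Code:
# If the scp/ssh process sits near 100% of one core during the copy,
# the cipher/CPU is the bottleneck rather than the network or the disks.
top -b -n 1 | grep -E 'scp|ssh'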
 
Sorry, I haven't gotten around to trying iperf yet...

So is Samba the fastest way to do file transfers to utilize the 10Gb link and SSDs?
 
Sorry, I haven't gotten around to trying iperf yet...

So is Samba the fastest way to do file transfers to utilize the 10Gb link and SSDs?

Your VMs are running a super old version of Linux, which likely means an equally old Samba version; you will struggle to find a suitable RPM with a recent enough SMB client. The only other easy, fast option is NFS.

If you are going to the trouble of compiling and installing a recent Samba and tools, you might want to investigate HPN-SSH instead, as that may be simpler and get you what you need.
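For completeness, a rough NFS sketch for two RHEL 5 boxes might look something like this (the paths, subnet, and server IP are placeholders):

Code:
# On the VM exporting the files:
echo '/data 192.168.1.0/24(rw,async,no_root_squash)' >> /etc/exports
service portmap start
service nfs start
exportfs -ra

# On the other VM:
mkdir -p /mnt/data
mount -t nfs 192.168.1.20:/data /mnt/data
cp /mnt/data/bigfile.bin /local/target/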
 
VM to VM is purely limited by CPU speed, as it does not touch the NIC hardware. The bandwidth limitation is likely due to benchmark setup.

For reference, with multiple threads and multiple VMs, we've hit around 60 Gbps in our lab environment (not very real world) with Xeon E5 v2s.
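(The exact tooling used for that isn't stated, see the question further down, but a multi-stream test is commonly approximated with iperf's parallel-client flag; a minimal sketch with a placeholder IP:)

Code:
# 8 parallel TCP streams for 30 seconds
iperf -c 192.168.1.20 -P 8 -t 30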
 
Would also like to know the CPU resource usage during large contiguous transfers.

Also have you benchmarked the array? What cache mode is it running in?
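For a rough baseline of the array (nothing scientific; the path and size are placeholders):

Code:
# Sequential write bypassing the page cache, then a sequential read back.
dd if=/dev/zero of=/data/ddtest.bin bs=1M count=4096 oflag=direct
dd if=/data/ddtest.bin of=/dev/null bs=1M iflag=direct
rm -f /data/ddtest.bin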
 
Make sure the latest version of VMware Tools is running and you are using the vmxnet3 adapter.
 
VM to VM is purely limited by CPU speed, as it does not touch the NIC hardware. The bandwidth limitation is likely due to benchmark setup.

For reference, with multiple threads and multiple VMs, we've hit around 60 Gbps in our lab environment (not very real world) with Xeon E5 v2s.

dexvx, would you mind expanding on the methodology you used to test this? OSes, protocol, etc.?
 