ESXi 6.5u2 Network speed between VMs

Discussion in 'Virtualized Computing' started by mda, Jul 4, 2018.

  1. mda

    mda [H]ard|Gawd

    Messages:
    1,288
    Joined:
    Mar 23, 2011
    Hello All,

    I have a question re: what SCP speeds I should be seeing when copying from VM to VM.

    The box I'm using is quite beefy for what I'm supposed to be doing:

    HW
    Lenovo Server
    2x Xeon Silver 4114 (10c/20t each CPU)
    96GB RAM
    2 Ethernet ports, 1 Gigabit each
    4x Intel 4600 SSDs in RAID10 via a dedicated RAID card
    Each drive can do about 400MB/s read and write, so with RAID 10 I'm expecting a little more than what I'm getting

    ESXI Config
    I'm an idiot/noob at this and didn't configure ESXi much. I just set my management IP to static and added the 2nd Gigabit Ethernet port to work as a NIC team

    Guest VMs
    RHEL 5.8 64bit, using the VMXNet3 Ethernet Driver
    RHEL 5.8 32bit, using the VMXNet3 Ethernet Driver
    Each has quite a bit of resources allocated: at least 10 cores per VM and over 16GB RAM, with no overallocation of resources. These, plus a tiny Ubuntu VM used just for remote desktop and other minor things, are the only VMs on this machine.

    Both VMs see a 10Gb Ethernet link <please see pic attached>


    **This may or may not be significant: I set up both VMs initially with the E1000 adapter and installed VMware Tools BEFORE switching the hardware to the VMXNet3 driver in ESXi. No other changes were made to the guest OSes.
    Edit: Not significant. I made 2 VMs with the VMXNet3 driver set as the initial hardware; there was no network until I installed VMware Tools. Exactly the same result.

    Issue
    I'm copying about 50 big 4GB files from one VM to the other via SCP, and I'm seeing transfer rates between 110-125MB/s. I'm assuming this should be quite a bit faster, since I probably am not bottlenecked by the Ethernet or the SSDs.

    I've read that since VM-to-VM transfers go through the vSwitch and not through the physical Ethernet ports, I should be seeing a much faster copy between the 2 VMs. Why is this not the case?

    The exact command I'm using to transfer:
    Code:
    scp -r -c arcfour <files> <user>@<ip>:/directory
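(One quick way to rule the SSDs in or out, before blaming the network, would be a direct-I/O dd run on each VM; the paths below are placeholders:)

```shell
# Sequential write test on the destination VM; oflag=direct bypasses
# the page cache so RAM doesn't inflate the number (writes a 4GB file).
dd if=/dev/zero of=/path/to/testfile bs=1M count=4096 oflag=direct

# Sequential read of one of the existing 4GB files on the sending VM.
dd if=/path/to/bigfile of=/dev/null bs=1M iflag=direct
```

If both numbers come back well above 125MB/s, storage isn't the bottleneck.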

    Thanks!
     

    Attached Files:

    • ETH.JPG (30.5 KB)
    Last edited: Jul 5, 2018
  2. Spartacus09

    Spartacus09 Limp Gawd

    Messages:
    434
    Joined:
    Apr 21, 2018
    You have the right idea with the VMXNet3 drivers, but VM-to-VM doesn't quite work that way: it still routes through the vSwitch, which is the same speed as the physical ports (1Gb).
    That's your bottleneck; 1 Gigabit works out to about 125 megabytes per second max, so 110-125 seems about right.

    To get that internal 10Gb rate you would need to create an isolated network between the two VMs without physical adapters attached to the vSwitches: https://kb.vmware.com/s/article/2043160

    That transfer path should yield a max of about 950MB/s instead of 125MB/s (depending on the transfer workload)
    I think your SSD would be your bottleneck at that point.
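The KB article boils down to something like the following from the ESXi shell (the names vSwitch1 and VM-Internal are placeholders; the same thing can be done in the web UI):

```shell
# Create a vSwitch with no physical uplinks attached; VM-to-VM traffic
# on it never leaves host memory.
esxcli network vswitch standard add --vswitch-name=vSwitch1

# Add a port group for the two VMs to attach their vmxnet3 NICs to.
esxcli network vswitch standard portgroup add --portgroup-name=VM-Internal --vswitch-name=vSwitch1
```

Then point a second NIC in each VM at VM-Internal and run the copy over that subnet.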


    Secondary option would be to do link aggregation with the two 1Gb ports if you have a switch that supports it.
    Third option would be to buy some cheap 10Gb cards, you'd need a 10Gb switch then though :p http://a.co/7X7b21c
     
    Last edited: Jul 5, 2018
  3. vikingboy

    vikingboy n00bie

    Messages:
    30
    Joined:
    Dec 27, 2007
    I'm no expert, but I just tried a quick iperf test between two VMs on my 6.5 install out of curiosity, as I'm pretty certain I read in the VMware host config book that VMs connected to a vSwitch weren't limited to 1Gbps. These VMs are both Debian stretch, 1 CPU, 1GB RAM with low utilisation, on a Xeon D-1541 host connected to a vSwitch with a 10Gbps uplink to my router.

    Code:
    root@mrtg:/home/mrtg# iperf -c tftp -i 1
    ------------------------------------------------------------
    Client connecting to tftp, TCP port 5001
    TCP window size: 85.0 KByte (default)
    ------------------------------------------------------------
    [  3] local 192.168.10.26 port 41600 connected with 192.168.10.23 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0- 1.0 sec  2.65 GBytes  22.8 Gbits/sec
    [  3]  1.0- 2.0 sec  2.18 GBytes  18.7 Gbits/sec
    [  3]  2.0- 3.0 sec  1.72 GBytes  14.8 Gbits/sec
    [  3]  3.0- 4.0 sec  1.71 GBytes  14.7 Gbits/sec
    [  3]  4.0- 5.0 sec  1.81 GBytes  15.5 Gbits/sec
    [  3]  5.0- 6.0 sec  1.82 GBytes  15.6 Gbits/sec
    [  3]  6.0- 7.0 sec  1.85 GBytes  15.9 Gbits/sec
    [  3]  7.0- 8.0 sec  1.86 GBytes  16.0 Gbits/sec
    [  3]  8.0- 9.0 sec  1.70 GBytes  14.6 Gbits/sec
    [  3]  9.0-10.0 sec  1.87 GBytes  16.1 Gbits/sec
    [  3]  0.0-10.0 sec  19.2 GBytes  16.5 Gbits/sec
     
  4. mda

    mda [H]ard|Gawd

    Messages:
    1,288
    Joined:
    Mar 23, 2011
    Thanks for the info.

    What you guys said so far just makes sense.

    I'd basically either need a 10Gbps uplink to a physical switch or a new dedicated virtual switch...

    The latter is cheaper, but I need to do a little more research.

    I have a Cisco SG300 with LAG on hand, but I haven't gotten around to setting it up for this purpose yet.

    Thanks again!
     
  5. vikingboy

    vikingboy n00bie

    Messages:
    30
    Joined:
    Dec 27, 2007
    Two VMs connected through the same VMware vSwitch shouldn't use the uplink at all, so buying a new switch shouldn't be necessary to solve this problem. Once you are transferring out of the vSwitch, i.e. say a router-on-a-stick type config where there's a router firewalling subnets, a better switch may be needed to increase bandwidth. For a single-threaded transfer, LAG won't help either.

    In terms of optimising vmware, this is a good read https://www.amazon.com/VMware-vSphe...swatch_0?_encoding=UTF8&qid=1530773845&sr=8-1
     
  6. Eickst

    Eickst [H]ard|Gawd

    Messages:
    1,764
    Joined:
    Aug 24, 2005
    Check the vSwitch for traffic shaping
    Check the guest for the latest VMware Tools
    Check the guest NIC config and make sure the speed doesn't show as negotiated at 1Gbps
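For that last check, something like this inside the guest (assuming the interface is eth0) will show the negotiated speed:

```shell
# vmxnet3 should report 10000Mb/s here; 1000Mb/s suggests the VM is
# still running on the emulated E1000 adapter.
ethtool eth0 | grep -i speed
```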
     
  7. Spartacus09

    Spartacus09 Limp Gawd

    Messages:
    434
    Joined:
    Apr 21, 2018
    Interesting, I get the same result too: ~9.96Gbit from Server 2012 R2 -> Ubuntu 16.04 64-bit. I had the opposite thought for some reason.

    mda, assuming you just did a default ESXi install, I can't think of why it wouldn't be a 10Gb transfer. Can you install iperf and check the results to confirm it's not network-based?
    The only other thought would be a bad manual configuration, or your storage bottlenecking you for some reason.
     
  8. mda

    mda [H]ard|Gawd

    Messages:
    1,288
    Joined:
    Mar 23, 2011
    I'll check out iperf sometime next week when I set up new VMs to test.

    Thanks again everyone!
     
  9. ljw1

    ljw1 n00bie

    Messages:
    14
    Joined:
    Dec 19, 2011
    Don't bother checking iperf. scp is typically limited by single-threaded CPU capacity; on nearly every box I have seen, it tops out below 150MB/s. You need to use a different protocol if you want faster transfers.
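To illustrate the single-thread limit: one cheap workaround is to run several scp streams in parallel, so each stream gets its own CPU core for the cipher work. The paths, destination host, and the 4-stream count below are made up; the arcfour cipher matches the original command:

```shell
# Fan the 4GB files out over four concurrent scp processes.
# Each stream is CPU-bound on its own core instead of all files
# queueing behind one single-threaded cipher.
ls /data/*.img | xargs -n 1 -P 4 -I{} scp -c arcfour {} user@192.168.10.23:/dest/
```

Aggregate throughput then scales with the number of streams, up to whatever the disks or network allow.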
     
    SGalbincea likes this.
  10. mda

    mda [H]ard|Gawd

    Messages:
    1,288
    Joined:
    Mar 23, 2011
    Sorry, haven't gotten around to trying iperf yet..

    So is Samba the fastest way to do file transfers that will utilize the 10Gb link and SSDs?
     
  11. ljw1

    ljw1 n00bie

    Messages:
    14
    Joined:
    Dec 19, 2011
    Your VMs are running a super old version of Linux, which may limit you to an equally old Samba version; you will struggle to find a suitable RPM with a recent enough SMB client. The only other easy, fast option is NFS.

    If you are going to the trouble of compiling and installing a recent Samba and tools, you might want to investigate HPN-SSH instead, as that may be simpler and get you what you need.
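For the NFS route, a minimal sketch (the export path, subnet, and server IP are placeholders):

```shell
# On the VM holding the files: export the directory read-only, async.
echo '/data 192.168.10.0/24(ro,async)' >> /etc/exports
exportfs -ra

# On the receiving VM: mount the export and copy locally.
# No ssh cipher in the path, so the single-core scp limit goes away.
mount -t nfs 192.168.10.23:/data /mnt/nfs
cp /mnt/nfs/*.img /local/dest/
```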
     
  12. dexvx

    dexvx Gawd

    Messages:
    937
    Joined:
    Aug 14, 2002
    VM-to-VM is purely limited by CPU speed, as it does not touch the NIC hardware. The bandwidth limitation is likely due to the benchmark setup.

    For reference, with multiple threads and multiple VMs, we've hit around 60Gbps in our lab environment (not very real world) with Xeon E5 v2s.
     
  13. Easius

    Easius Limp Gawd

    Messages:
    323
    Joined:
    Jan 1, 2009
    Would also like to know the CPU resource usage during large contiguous transfers.

    Also have you benchmarked the array? What cache mode is it running in?
     
  14. ironforge

    ironforge [H]ard|Gawd

    Messages:
    1,212
    Joined:
    Feb 7, 2006
    Make sure the latest version of VMware Tools is running and you are using the vmxnet3 adapter.
     
  15. KaaCeeGeek

    KaaCeeGeek n00bie

    Messages:
    3
    Joined:
    Sep 8, 2017
    dexvx, would you mind expanding on the methodology you used to test this? OSes, protocol, etc.?