ESXi 5.5 with LACP Enabled?

KapsZ28

Is there any real benefit to using this? I have a 5.5 environment set up and enabled LACP on my Production distributed port group with the mode set to Active. I also enabled LAG on the switch ports going to each ESXi host. There are two 1 GbE NICs per host. I ran an iperf test between two VMs on different hosts with 10 parallel sessions, and the throughput is still only 900+ Mbps. The distributed port group is set to IP hash, and I tried several different hashing algorithms on the switch.
 
Hashing is static, meaning two connections might get hashed across both NICs, or they might both land on one of the two. You don't get to control that; the hashing algorithm does. If you had more VMs and more clients connecting to them, you'd see better distribution of connections over the two links. It's just statistics and a mathematical formula; it's not intelligent. It's basically a simple XOR that spits out a 1 or a 2 for the NIC to use. Depending on the algorithm you choose and the luck of the IPs/ports/protocols being hashed, you can still end up on the same port.
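Toy illustration of that (a minimal Python sketch, not VMware's actual hash, just the same idea: a deterministic function of the headers picks the NIC):

Code:
# Sketch of a static hash-based NIC picker (illustrative only, not
# VMware's real algorithm). The point: the same header fields always
# map to the same NIC, with no awareness of how loaded each link is.
import ipaddress

def pick_nic(src_ip, dst_ip, num_nics=2):
    # XOR the two addresses and fold the result down to a NIC index.
    h = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return h % num_nics

# Two different clients hitting the same VM can still land on the
# same NIC -- it's luck of the draw, not balancing.
print(pick_nic("10.0.0.5", "10.0.1.20"))  # -> 1
print(pick_nic("10.0.0.7", "10.0.1.20"))  # -> 1 (collision)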

I normally use Route Based on Physical NIC Load (LBT). You don't need LACP enabled for this (in fact you can't), and it monitors your NICs. If a NIC stays above 75% utilization over a 30-second window, it'll move flows to a less loaded NIC.
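The logic behind LBT is roughly this (a hypothetical sketch using the threshold from the vSphere docs, not the actual ESXi code):

Code:
# Sketch of the Load-Based Teaming idea (illustrative, not ESXi code):
# look at mean uplink utilization over a 30-second window and move
# some flows off any uplink that's running hot.
THRESHOLD = 0.75     # move flows when an uplink exceeds ~75% utilization
WINDOW_SECONDS = 30  # ...averaged over a 30-second interval

def rebalance(uplinks):
    """uplinks: dict of name -> mean utilization (0.0-1.0) over the window."""
    coolest = min(uplinks, key=uplinks.get)
    for name, util in uplinks.items():
        if util > THRESHOLD and name != coolest:
            print(f"shift some VM flows from {name} to {coolest}")

rebalance({"vmnic0": 0.82, "vmnic1": 0.20})  # would shift load to vmnic1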
 
When you enable LACP in vSphere, it specifically says to make sure the teamed NICs are set up for IP hash. I believe this is a new feature in 5.5. I guess I should read up on the VMware documentation to see if and when it is recommended.
 
Here is what I am talking about.

http://kb.vmware.com/selfservice/mi...nguage=en_US&cmd=displayKC&externalId=2051826

I enabled LACP on the distributed port group and enabled LAG on the physical switch. Right now the LAG Hash on the physical switch is set to "Source/Destination IP and source/destination TCP/UDP Port." There are 6 other hashing algorithms to choose from.

Also, an unrelated question. Right now I have a distributed switch set up with two 1 GbE NICs just for management, and another distributed switch set up with two 1 GbE NICs for Production traffic. Would it be better to put all four NICs into the Production distributed switch and enable management on two of them? That way we'd have four 1 GbE NICs per host for production. All the 1 GbE NICs go to the same switches anyway.
 
vSphere 5.5 added some enhancements to LACP. You can now have multiple LAGs in a VDS uplink group, and there are 22 hashing algorithms. You need to make sure both sides are set to the same one.

No reason to have management on its own VDS. No reason at all. I'd put all 4 NICs in an uplink group in one VDS. What sort of storage are you using? FC? NFS? iSCSI?

I'd probably put Management and vMotion on two of them: Management primary on NIC1, standby on NIC2, and vice versa for vMotion. Then I'd take the other two NICs and use them for VM traffic. This is assuming you aren't doing IP storage.
 
OK, so that was my next question about the multiple LAGs. Since we have 4 NICs going to 2 separate switches, I would need to create 2 LAGs per ESXi host: one LAG with 2 NICs, and a second LAG with the other 2 NICs. And you're saying it's fine to have both of those LAGs in the same VDS and port group?

Although VMware supports 22 hashing algorithms, the Dell switch we have only has 7 options. Below is a screenshot. Which algorithm would you recommend?

[screenshot: the Dell switch's LAG hashing options]
As for the iSCSI, we are using two 10 GbE Dell switches with a Dell PowerVault SAN. Each ESXi host has two 10 GbE NICs, and the SAN has four 10 GbE connections. So there is no iSCSI traffic on the 1 GbE switches.
 
Unless your two switches are stacked or can do something like Cisco VSS or vPC and have the ability to do port-channels across switches, then yes, you'd need to create two LAG groups, one to each switch. I honestly don't think it's worth it outside of a few specific cases. I'd go LBT and be done with it...

Really, the hashing type you choose depends on your traffic/protocol type. If you choose Source/Dest IP, it's going to hash on just the source and destination IP, meaning a connection from a client to a VM will only ever go over a single NIC, since the source/dest IP pair never changes during a connection, or even across multiple connections over different protocols/apps. If you use Src/Dst IP and Src/Dst Port, then communications from a single client to a single VM could go over multiple NICs *IF* they come from, or go to, different ports. It just increases the chance of better load distribution. Note that I never say load balancing. It's not balanced.
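To make that concrete, here's a toy comparison (same caveat as before: illustrative Python, not the switch's real hash):

Code:
# Toy comparison: IP-only hashing pins every connection between one
# client/VM pair to a single NIC; mixing in the ports lets separate
# connections spread out. Illustrative only -- real switches differ.
import ipaddress

def hash_ip(src, dst, nics=2):
    return (int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))) % nics

def hash_ip_port(src, sport, dst, dport, nics=2):
    return (int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))
            ^ sport ^ dport) % nics

src, dst = "10.0.0.5", "10.0.1.20"
# Ten parallel streams from one client, each on a different source port:
for sport in range(50000, 50010):
    print(sport,
          "IP-only ->", hash_ip(src, dst),            # same NIC every time
          "| IP+port ->", hash_ip_port(src, sport, dst, 5001))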
 