UCS VM-FEX, should I be using it?

So we just got a new UCS chassis, 6 blades, and 2 FIs to start a new vSphere cluster.

I was initially going to set up each blade with two vNICs (one on each fabric) with all the necessary VLANs tagged and join them to the existing vDS where all the port groups are already configured. That would match the existing IBM hosts, which each have 2x10Gb uplinks in the vDS, and minimize setup time.

Then I found out about VM-FEX and some questions came up:

1. Does it work with the VMware vDS, or do I have to buy the Nexus 1000V?
2. Is anyone actually using high-performance mode?
3. What is the point of standard mode over trunks to the existing vDS? AFAIK in this mode there is a vNIC on the blade for each VM vNIC, but traffic still passes through the hypervisor, so I don't see a benefit other than being able to manage all the NICs from a central place.
 
I wouldn't normally. It's cool, but not something I see a lot of use for.

The way we do UCS deployments is to create vNICs in UCS for each traffic type and set QoS and rate limiting as needed for things like IP storage, vMotion, FT, etc. You can do this somewhat with NIOC, but we prefer to do it in UCS. I'm not as big a fan of doing one vNIC per fabric and just using those in the uplink groups.
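If you script the UCS side, a vNIC per traffic type ends up looking roughly like this with Cisco's ucsmsdk. Treat it as an untested sketch: the host, credentials, service profile DN, vNIC/VLAN names, and the "vMotion-QoS" policy are all placeholders for whatever exists in your org.

```python
# Sketch using Cisco's ucsmsdk: add a vMotion vNIC to a service profile,
# pinned to fabric A, with an existing UCS QoS policy attached.
# Host, credentials, DNs, names, and the "vMotion-QoS" policy are all
# placeholders -- untested, adjust to your own org layout.
from ucsmsdk.ucshandle import UcsHandle
from ucsmsdk.mometa.vnic.VnicEther import VnicEther
from ucsmsdk.mometa.vnic.VnicEtherIf import VnicEtherIf

handle = UcsHandle("ucsm.example.local", "admin", "password")
handle.login()

sp_dn = "org-root/ls-ESX-Blade-01"   # service profile (or template) DN

# vMotion vNIC on fabric A only; no fabric failover, the vDS handles failover
vnic = VnicEther(parent_mo_or_dn=sp_dn,
                 name="vMotion-A",
                 switch_id="A",
                 mtu="9000",
                 order="3",
                 qos_policy_name="vMotion-QoS")  # pre-created QoS/rate-limit policy
# Carry only the vMotion VLAN on this vNIC, untagged toward the vmkernel port
VnicEtherIf(parent_mo_or_dn=vnic, name="vMotion", default_net="yes")

handle.add_mo(vnic, modify_present=True)
handle.commit()
handle.logout()
```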

What you don't want is active/active vNICs on both fabrics for something like vMotion. You'll end up throwing vMotion traffic out of the FIs and across your upstream switches. That's bad. So we'll do two vNICs for vMotion, one active, one standby. Same with something like FT, but with the fabrics flipped, so normally vMotion goes over one fabric and FT over the other, and should a fabric fail they'll both still be fine. That way you have redundancy but no vMotion or FT going outside the FIs.
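On the vSphere side, that active/standby piece is just the uplink failover order on the vMotion port group. If you want to script it, something like this pyVmomi sketch (the port group object and uplink names are placeholders):

```python
# Minimal pyVmomi sketch: pin a vMotion distributed port group to one
# uplink active and one standby so vMotion stays on a single fabric
# unless that fabric fails. "pg" is an already-located
# vim.dvs.DistributedVirtualPortgroup object; uplink names are placeholders.
from pyVmomi import vim

def pin_vmotion_uplinks(pg, active="Uplink-FabricA", standby="Uplink-FabricB"):
    order = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
        inherited=False,
        activeUplinkPort=[active],     # normal path: fabric A only
        standbyUplinkPort=[standby],   # fabric B takes over only on failure
    )
    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy(
        inherited=False,
        uplinkPortOrder=order,
    )
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        uplinkTeamingPolicy=teaming,
    )
    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        configVersion=pg.config.configVersion,  # avoids concurrent-edit conflicts
        defaultPortConfig=port_cfg,
    )
    return pg.ReconfigureDVPortgroup_Task(spec=spec)
```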
 
The upstream switches are Nexus 5Ks, and all the older IBM hosts are directly connected there with their 10Gb NICs and separate HBAs. All the storage devices are connected there too (XIV, VNX (FC and NAS), Nimble iSCSI, Nexsan FC (Commvault library), the Commvault physical servers, a Spectra tape library, and some others). The 5Ks uplink to the customer's core switches.

I had given some thought to doing a vNIC per traffic type, but the active/active scenario spilling traffic to the 5Ks did not occur to me, since all the older ESX hosts have to go through the 5Ks anyway. Other than QoS control per traffic type inside the FIs, can you elaborate on the benefit of this setup vs. a trunk per fabric?

Thanks for the very helpful advice as always.
 
It's just about where you want to control your traffic flows. You can do it in NIOC... but if you don't do it there, and you don't do vNICs per traffic type, how do you keep vMotion from using 8Gb/s and possibly causing I/O starvation (if using IP storage) or overrunning VM traffic?

And I never want vMotion to leave those FIs unless there are servers outside UCS that are part of the vSphere cluster... but even then, I don't want intra-UCS vMotion leaving an FI.
 
Right now I am capping vMotion in its port group, and NIOC is disabled on the vDS. As soon as I get Storage DRS configured, NIOC is next on my list, but I haven't thought of a use for it yet other than limiting test/dev VMs, and I can just as easily shove those into a port group and throttle them there.
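For anyone curious, the cap is just traffic shaping on the dvPortgroup. Roughly like this if you script it with pyVmomi (the bandwidth numbers are only an example):

```python
# Rough pyVmomi sketch: cap vMotion by enabling traffic shaping on its
# distributed port group. The 4Gb/s figure is only an example value.
from pyVmomi import vim

def cap_vmotion(pg, limit_mbps=4000):
    bps = limit_mbps * 1_000_000   # shaping bandwidth is set in bits per second
    shaping = vim.dvs.DistributedVirtualPort.TrafficShapingPolicy(
        inherited=False,
        enabled=vim.BoolPolicy(inherited=False, value=True),
        averageBandwidth=vim.LongPolicy(inherited=False, value=bps),
        peakBandwidth=vim.LongPolicy(inherited=False, value=bps),
        burstSize=vim.LongPolicy(inherited=False, value=100 * 1024 * 1024),  # bytes
    )
    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        inShapingPolicy=shaping,    # traffic from the vmkernel port into the vDS
        outShapingPolicy=shaping,   # traffic from the vDS back to the port
    )
    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        configVersion=pg.config.configVersion,
        defaultPortConfig=port_cfg,
    )
    return pg.ReconfigureDVPortgroup_Task(spec=spec)
```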

I really have no choice about vMotion leaving the FIs until the IBM hosts go away next year and get replaced with more UCS blades. It wouldn't be that much traffic anyway, since the UCS blades will be in their own cluster, so if I set it up active/standby as you suggested, vMotions initiated by DRS for that cluster should not leave the FIs.

Commvault is another story. We have two blades in the chassis, but the older media agents and CIs are on the 5Ks. We are limited by the customer's upgrade cycle, and until everything is in UCS I am trying to make it as easy to manage as possible without creating bottlenecks.
 
Thinking about it some more, knowing that more blades are coming and that this will be our standard server tech from now on, I might as well make a new vDS and set up the UCS vNICs per traffic type. QoS in hardware makes perfect sense. As I get more blades, they will already have a good service profile template and a vDS that is set up correctly.

Sticking with the legacy setup on account of the older hardware would just make changing everything later a much bigger job.

When I set it up tomorrow I'll hopefully have a better idea and some more specific questions. The first thing is to figure out how to streamline Commvault traffic; we should have just converted all the MAs to UCS.
 
I agree with NetJunkie. We usually use UCS QoS over NIOC. Having the hardware handle QoS just makes more sense to me, especially since the CoS IDs are passed upstream so the QoS can apply further into the LAN.

We don't have any clients using VM-FEX, however. It seems to me like another one of those fringe scenarios where someone needs absolute bleeding-edge performance out of their VMs' network, like stock traders and so on.

It also adds complexity when upgrading UCS and VMware since you have to upgrade the VEM as well.

Unless you have a critical need for VM-FEX, I wouldn't bother with it.
 
Curveball time:

I just set this up like you guys suggested and updated our two SAs so we are all on the same page. One came back and said that the FI in end-host mode is not a switch, and that there is no point in forcing active/standby since vMotion will go through the upstream switch anyway.
 
They're not switches in the traditional sense; however, they are aware of the MAC addresses handed down to the blades, and if MAC1 shows up on FI-A looking for MAC2, which is also assigned to FI-A, then FI-A will simply forward the traffic to the appropriate chassis without sending it upstream.

Putting the FIs in switch mode instead of end-host mode would simply allow you to plug other non-UCS devices into the FIs and have them act as switches between the servers and those devices.
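If it helps to picture it, the forwarding decision in end-host mode boils down to something like this toy sketch (purely illustrative Python, not actual FI code or behavior):

```python
# Toy sketch of the end-host-mode forwarding decision described above.
# Purely illustrative -- not actual FI/NX-OS behavior or code.

def forward_frame(dst_mac, local_macs, uplink_pin):
    """Decide where an FI sends a frame it received from a blade vNIC.

    local_macs:  dict of MAC -> server port for vNICs pinned to this FI
    uplink_pin:  the border/uplink port the source vNIC is pinned to
    """
    if dst_mac in local_macs:
        # Both MACs live behind this FI, so the frame is switched locally
        # between chassis ports and never touches the upstream 5Ks.
        return f"local server port {local_macs[dst_mac]}"
    # Unknown/remote destination: end-host mode never floods or learns
    # upstream MACs; the frame just goes out the pinned uplink.
    return f"pinned uplink {uplink_pin}"

# Example: vMotion between two blades whose active vNICs are both on FI-A
local_macs = {"00:25:b5:00:00:1a": "Eth1/1/3", "00:25:b5:00:00:2a": "Eth1/1/7"}
print(forward_frame("00:25:b5:00:00:2a", local_macs, "Po1"))   # stays inside the FI
print(forward_frame("00:de:ad:be:ef:01", local_macs, "Po1"))   # goes upstream
```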
 
Switch mode would also cause the FIs to participate in spanning tree and therefore start blocking ports. There are VERY few good reasons to put FIs in switch mode.
 