vSphere Design Considerations - Network Questions

Vader

While I continue to study and work through some vSphere design scenarios, a couple of questions have come up. They pertain to some of the 10GbE vendor technologies (HP Flex-10 and Cisco Unified Fabric) and how network bandwidth segmentation is done at the vendor layer and then presented to vSphere.

Looking at some scenarios, for instance: if vMotion, DRS, and evacuating VMs from a host for maintenance need to be robust, how do you handle that? My question is really about whether vSphere is being presented the proper bandwidth to do either 4 or 8 concurrent vMotions. Is there an in-between? Say I present 4Gb of bandwidth from the infrastructure side to vSphere; does that allow me to do more than 4 vMotions, or am I limited? In other words, do I have to present 10Gb, and does vSphere need to see 10Gb, in order for me to do 8 vMotions, or is it 4 vMotions at 1Gb and 8 at 10GbE with nothing in between?

In a scenario with blades, say you have two onboard 1GbE controllers and a dual-port 10GbE CNA. How do you segment the VMkernel traffic for management, vMotion, FT, and VM traffic? I know this is a loaded question and obviously needs more information per scenario. I understand that NIOC and VLANs could be used if there isn't segmentation at the vendor layer in this scenario, but it seems we can't address the SPoF here. I guess I should accept that we can't address the SPoF in most cases?

Will this be an issue for the VCDX, etc., or is it understood going through that process?
 
Max 4 concurrent vMotions on a 1Gb NIC, 8 on a 10Gb NIC. This is per host, even if you have more than one VMkernel port for vMotion.
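If it helps to verify which bucket a host falls into, here is a rough PowerCLI sketch (assuming an existing Connect-VIServer session; the host name is a placeholder and the property names are as I recall them, so treat it as a sketch) that lists the vMotion-enabled VMkernel ports and the reported link speed of the physical uplinks:

# Host name is a placeholder
$vmhost = Get-VMHost -Name "esx01.lab.local"

# vMotion-enabled VMkernel ports on the host
Get-VMHostNetworkAdapter -VMHost $vmhost -VMKernel |
    Where-Object { $_.VMotionEnabled } |
    Select-Object Name, IP, PortGroupName

# Negotiated speed of the physical uplinks (10000 = 10Gb, which allows 8 concurrent vMotions)
Get-VMHostNetworkAdapter -VMHost $vmhost -Physical |
    Select-Object Name, BitRatePerSec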

For blades, it really depends on your design, and it's very flexible and complex. Take Flex-10 as an example: you can carve each 10Gb CNA port into 4 vNICs, so you would have 8 vNICs per blade. Cisco UCS is similar, unless you have the Palo adapter, which gives you up to 128 vNICs. That's the hardware side; how you use it really depends on the network design to the upstream switches.
 

I understand all that, but look at my scenario with only two 1GbE interfaces and two 10GbE CNAs. How would you really design around the SPoF in that case?

To reiterate on vMotion traffic: it's either 4 at 1GbE or 8 at 10GbE, nothing in between, even if you provide additional bandwidth beyond 1GbE, whether through vendor-specified bandwidth allocation or NIOC. I guess if evacuation and HA/DRS are a primary concern for a customer, you just wouldn't limit the port capacity and would go with a different host/network architecture altogether.

What would be the teaming setup for multiple vMotion NICs utilizing multiple VMkernel ports? Is one NIC in standby while the other is active, and vice versa on the second vMotion VMkernel port? I believe in iSCSI the second adapter is in the unused state... didn't know if it was similar.
 
I don't think you need to care about multiple vMotion NICs, since the reason for them is to go beyond a 1Gb interface to speed up vMotion, and you can give the vMotion NIC more than 1Gb of bandwidth in Flex-10. Multi-NIC vMotion is active/active, and you can use MPIO to utilize both NICs in the team for iSCSI too.

You have 8 NICs to create 4 teams; how you use the teams is up to you. The deciding factor would be the security requirements.

Whether all teams are active/active or active/standby depends on the configuration between Flex-10 and the uplink switch. Again, it depends on whether the traffic pattern is north/south or east/west.
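Regarding the teaming question above: the usual multi-NIC vMotion layout on a standard vSwitch is two VMkernel ports, each with one uplink active and the other standby (inverted on the second port); iSCSI port binding is different in that the second adapter is set to unused. A minimal PowerCLI sketch, assuming a vSwitch named vSwitch0 that already has vmnic2 and vmnic3 as uplinks (names, VLANs, and addresses are placeholders):

$vmhost = Get-VMHost -Name "esx01.lab.local"
$vsw    = Get-VirtualSwitch -VMHost $vmhost -Name "vSwitch0"

# Two vMotion VMkernel ports (each call also creates its port group)
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vsw -PortGroup "vMotion-01" `
    -IP 192.168.100.11 -SubnetMask 255.255.255.0 -VMotionEnabled $true
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vsw -PortGroup "vMotion-02" `
    -IP 192.168.100.12 -SubnetMask 255.255.255.0 -VMotionEnabled $true

# Invert the failover order so each port group has one active and one standby uplink
$pg1 = Get-VirtualPortGroup -VMHost $vmhost -Name "vMotion-01"
$pg2 = Get-VirtualPortGroup -VMHost $vmhost -Name "vMotion-02"
Get-NicTeamingPolicy -VirtualPortGroup $pg1 |
    Set-NicTeamingPolicy -MakeNicActive vmnic2 -MakeNicStandby vmnic3
Get-NicTeamingPolicy -VirtualPortGroup $pg2 |
    Set-NicTeamingPolicy -MakeNicActive vmnic3 -MakeNicStandby vmnic2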
 
Hmmm... I guess I'm a bit confused here. If I give vMotion more bandwidth, but it's not the full 10GbE, then I'm stuck with 4 vMotions at a time, but adding bandwidth will speed up the actual vMotion process, correct?

I think I explained this wrong. I'm looking at this specific scenario because it's used as an example study case for the VCAP-DCA/DCD and VCDX.

The scenario is this:

The customer has a current virtualized infrastructure on 3.5 and uses blades. Each blade contains 2 onboard 1GbE NICs and 2 dual-port 10GbE CNAs. FC is used for storage.

The customer requires an upgrade to vSphere 5.x, complete load balancing across the hosts for the customer's applications, and quick evacuation of VMs to perform maintenance, etc., in a small window of opportunity, since it's a 24/7 shop.

In this case you have MGMT, vMotion, and VM networks. Provide the best solution to resolve the SPoF and performance concerns based on the customer's needs.

I know VLANs and NIOC can be used, or the vendor piece may be Cisco, or HP using Flex-10, which makes it a bit easier. Assume that the vendor doesn't have that capability.
 
I don't know of anything between 1Gb and 10Gb. If you have 1Gb or above but below 10Gb for vMotion, it is 4 concurrent vMotions; the extra bandwidth can speed up the vMotion process but does not increase the concurrency limit.

In your case, you might use the onboard 1Gb NICs for MGMT, carve a vNIC from Flex-10 for vMotion and another vNIC for FT (if required), and you'd have 2 vNICs left for VMs. Bandwidth can be dynamically adjusted on the fly for each vNIC.
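To make that layout concrete, here is a minimal PowerCLI sketch of what the host side might look like, shown as if building the host networking from scratch (vmnic numbering, port group names, VLAN IDs, and addresses are all placeholders; the actual Flex-10 vNIC-to-vmnic mapping depends on the enclosure profile):

$vmhost = Get-VMHost -Name "esx01.lab.local"

# Management on the two onboard 1Gb NICs so it does not hang off a single CNA
$mgmtNics = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name vmnic0, vmnic1
$sw0 = New-VirtualSwitch -VMHost $vmhost -Name "vSwitch0" -Nic $mgmtNics
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $sw0 -PortGroup "MGMT" `
    -IP 10.0.10.11 -SubnetMask 255.255.255.0 -ManagementTrafficEnabled $true

# Flex-10 vNICs (appearing here as vmnic2..vmnic5) carry vMotion, FT, and VM traffic
$cnaNics = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name vmnic2, vmnic3, vmnic4, vmnic5
$sw1 = New-VirtualSwitch -VMHost $vmhost -Name "vSwitch1" -Nic $cnaNics
New-VirtualPortGroup -VirtualSwitch $sw1 -Name "VM-Prod" -VLanId 200
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $sw1 -PortGroup "vMotion" `
    -IP 10.0.20.11 -SubnetMask 255.255.255.0 -VMotionEnabled $true
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $sw1 -PortGroup "FT" `
    -IP 10.0.30.11 -SubnetMask 255.255.255.0 -FaultToleranceLoggingEnabled $true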

I don't understand the concern about small windows for maintenance; the beauty of VMware is that you never need to bring a VM down to do VMware maintenance. I've performed a lot of maintenance during business hours while keeping VMs up and never had any problem.
 
Depending on the cluster setup (N+1, etc.), you would be running at risk during maintenance. Quicker evacuation of high-capacity VMs on a host may be required. I've dealt with all types of clients, some extremely anal about risk.
 
I wanted to bring this post back up, as I'm digging into UCS more and more and trying to absorb and learn as much as I can.

I hate to say "best practice" or use words like "typical" because I realize each design is different, so here are my questions.

1. With UCS, assume that I'm using a VIC and have plenty of logical capacity. When you create the logical interfaces, is it similar to HP FlexConnect, where you can assign actual bandwidth limits to the logical interface, or am I stuck with each logical Ethernet NIC being 1GbE?

2. What is the total amount of downstream bandwidth PER BLADE to the fabric interconnects? Or is that per chassis (I/O modules), where you can assign a percentage of the total to each blade?

3. How is NIOC used in conjunction with the FCoE piece? Say I am using two 4Gb FC connections per blade and that leaves me x%; can I just set up two logical uplinks with the remaining bandwidth to my vSphere hosts and then just use NIOC with VLANs?

Bear with me here, I'm trying to get some facts down.
 
1. Pretty sure each vNIC is going to register as a 10Gb NIC to vSphere... need to confirm. But you can set in QoS how much actual bandwidth it gets, what it can burst to, etc.

2. With an M81KR you get two 10Gb ports, 10Gb from each FI. With the newer 1240 and 1280 you could do 20Gb or 40Gb from each FI to a blade, assuming you have enough links on the I/O modules.

3. NIOC isn't aware of unified I/O. That can cause an issue if you really hit points of contention, as FC can take 4Gb of each port... if it needs it.
 
As for NIOC... instead of doing that with UCS, we normally do all the QoS in the FIs. We do something like 6 or 8 vNICs on each blade (NFS/iSCSI, management, vMotion, FT, etc.). Each traffic type gets its own vNIC, and then QoS policies are applied to handle the traffic. That way vMotion CAN use up to 10Gb if available, but it won't overrun FCoE or, say, NFS/iSCSI.
 
OK... so it's much smarter than I thought. So basically the bandwidth is there, controlled per QoS policy, but each class can burst up to whatever is available?

I'm also assuming that you are VLANing all those vNICs?

On a side note, can you set up QoS with UCS Manager, and/or do the FIs have a CLI as well? I'm guessing yes, since it is Cisco... lol. Can this be a template within UCS Manager? I'll have to fire up the emulator when I get home.
 
You can use the UCSM CLI, but it's very painful and not meant for normal use. It is NOT IOS or NX-OS. There's some good PowerShell integration; we have a script that can stand up and configure an entire UCS system in 30 seconds.

We do VLAN all the vNICs, but they can be in the same VLAN. And yes, you can do a lot with UCSM's QoS.
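To give a feel for that PowerShell integration, here is a minimal read-only sketch using Cisco's UCS PowerTool module (the module and property names are from memory and vary by version, and the FI address and credentials are placeholders, so treat this as a sketch rather than gospel):

Import-Module CiscoUcsPS   # UCS PowerTool; newer releases ship it as Cisco.UCSManager

# Connect to the fabric interconnect cluster VIP
$ucs = Connect-Ucs -Name "10.0.0.50" -Credential (Get-Credential)

# Inventory: blades and their service profiles
Get-UcsBlade | Select-Object ChassisId, SlotId, Model, OperState
Get-UcsServiceProfile -Type instance | Select-Object Name, AssignState, AssocState

# The QoS system classes (Bronze, Silver, etc.) that the vNIC QoS policies reference
Get-UcsQosClass | Select-Object Priority, Weight, Mtu, AdminState

Disconnect-Ucs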
 

1. Yeah, they appear as 10Gb in VMware. Looking at one right now.

Vader, the big thing to remember with UCS QoS is that you're really prioritizing and giving guaranteed minimums to each type of traffic. You typically don't set maximums like you do with Flex-10, so any type of traffic can utilize the full 10Gb as long as nothing else needs it at the same time.

For example, I'm setting up an environment for a client right now where they have 2 vNICs for each type of traffic: management, VMotion, and virtual machines.

In UCS, I assign them like this:

1. Management (the two vNICs): "Bronze" class, 1Gbps maximum, 1Gbps minimum (10% weight)
2. VMotion: "Silver" class, no maximum, 1Gbps minimum (10% weight)
3. FCoE (vHBAs): "FC" class, no maximum, 4Gbps minimum (40% weight)
4. VM traffic: "Best Effort" class, no maximum, 4Gbps minimum (40% weight)

In this scenario, even if all the traffic failed over to one fabric, every type of traffic has a guaranteed minimum no matter what happens, but each (except for management) is free to consume the entire 10Gbps so long as it's not stopping any other traffic from reaching its minimum.
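Just to show the arithmetic behind those minimums (my own illustration using the weights above), each class's guaranteed share of a 10Gb link is its weight divided by the total weight:

# Guaranteed minimum per class = (class weight / total weight) * link speed
$linkGbps = 10
$classes  = @{ 'Bronze (mgmt)' = 1; 'Silver (vMotion)' = 1; 'FC (vHBA)' = 4; 'Best Effort (VMs)' = 4 }
$total    = ($classes.Values | Measure-Object -Sum).Sum

foreach ($c in $classes.GetEnumerator()) {
    $minGbps = [math]::Round($linkGbps * $c.Value / $total, 1)
    "{0,-18} weight {1} -> guaranteed {2} Gbps minimum" -f $c.Key, $c.Value, $minGbps
}
# The two 10% classes come out to ~1Gbps each and the two 40% classes to ~4Gbps each;
# anything above the minimum stays available to whichever class needs it.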
 
Thank you for providing that example. It reinforces what Netjunkie stated and lays it out nicely.

I'm still interviewing as much as possible, and the next one on my list is extremely promising. I've scratched off EMC, as they are unresponsive, so I'm shooting for VARs at this point. The particular VAR I'm interviewing with now is a Cisco/VMware/Dell shop, and I'll be coming into the consulting practice during a boom in business. Of course UCS is part of that, so I need to learn it.

Of course they would provide training and all the hands-on, but it would be good to come in knowing the fundamental architecture of UCS.
 

Good luck in your job hunt. I remember seeing your thread about interviewing with EMC. It sucks that they will not return your phone call.

I have thought long and hard about looking for a job with a VAR. I have been a network admin/manager at two different companies over my 11-year career. My background has been traditional networking and Windows servers for most of that; however, for the past couple of years I have been working with VMware (vSphere and SRM), EMC products (NS-480, RecoverPoint, Centera, Cloud Tiering Appliance), and Cisco UCS and Nexus products as we have been upgrading our datacenter and adding a DR site. I have thoroughly enjoyed working on the datacenter side of things, and I would like to continue my career down that path. My goal is the CCIE Data Center certification and possibly the VCDX. To reach that goal, I know it would benefit me to work for a VAR, as I could get more hands-on experience with the different products.
 

I'm coming up on 2.5 years with a VAR and I've learned more in that time than the previous 6 years of my professional IT career.

However, working as a consultant isn't for everyone and the culture of the company you work for will make a big difference, too. But, if you find a great company to work for and like the change, I've found working for a VAR to be much more satisfying and rewarding than any other IT job I've held.
 

I wouldn't get too discouraged about unreturned calls. Unless you have a rock-solid "in", getting fast-tracked is often difficult, or at least it has been for all of my friends who work for major vendors. My timeline from applying on the website to getting an offer letter from VMware was just short of 6 months. In that timeframe I questioned my self-worth, until it was over and I found out that they're just really busy :)

You obviously have a strong grasp if you're asking the right questions like this about converged networking, contention, etc. Learning the little idiosyncrasies of vendor-specific capabilities will come easily since you already have the conceptual building blocks in place.

Best of luck in your future endeavors!


As for the study case you mentioned: I can't speak to the VCDX defense, but I am under the impression that you will have to go fully upstream with your design in terms of the vendor-specific capabilities. As far as the DCD goes, I've never seen anything beyond just being aware of converged networking and NIOC.
 
Anything in your design is open for questions in a VCDX defense, from high-level things down to why you checked that box and what else that causes.

Speaking of UCS...passed the new UCS Design exam this morning.
 
Thank you all for your support! EMC was a long road: it started last September and went through months of interviews, up to the final presentation. I built what I thought was a strong rapport with the manager, only to be ignored the last couple of months. I know what you're going to say, that they weren't interested. Well, I felt like I went on a date with a hot girl and got all the way to third base, but no home run. I was very disappointed, but I've taken a positive attitude, learned from it, and moved on. I did learn some things, and it only benefited me with more interview experience.

Tomorrow I have my technical interview, very excited. This company has been straightforward about the process, and they have delivered so far. Also, there are some nice fringe benefits: the office space is very cool, almost like the Facebook offices as shown in The Social Network movie, with a gym and showers in the facility, etc.

What I'm most excited about is that they want to build up a data center expert, since they are fairly new to that space (mostly Cisco networking, UC, and security), and their DC business around UCS and VMware is growing rapidly, like most companies'.
 

That is what I need. I can only learn and practice so much on production equipment. :p I love a challenge, and I just don't get that challenge where I work now.
 

You have the right attitude. Just learn from it all and you will be better for it. Good luck with the technical interview.
 

Excellent! Sounds like you found a winner! Best of luck on the tech screening
 
The technical interview went great; I'm heading up Monday to meet with the VP. The uncomfortable conversation about compensation was actually very comfortable, and it's looking like I'm going to get most of what I want. I may lose a week of vacation, but oh well, I knew I had to make some sacrifices; it's looking like a possible $35k pay increase with a quarterly bonus!

My focus will be on data center virtualization; networking/fabric with Cisco MDS/Nexus/Catalyst; compute with UCS and Dell M1000e; storage with EqualLogic/Compellent; and of course a bevy of VMware products. Can't wait!! I'll update you guys Monday evening!
 
Congrats, hope it works out well for you. I'm contemplating the job hunt myself and not looking forward to it. I'm planning to finish up my VCP in the next month or so, and then I'm looking at the CCIE Data Center.
 

Congrats! Definitely exciting stuff!
 
I'm getting more into the UCS Unified Fabric and I'm a bit confused about the products, especially the Nexus 2K line.

Where would you apply that? Wouldn't it just be easier to go with 5Ks for top of rack, or is the cost difference that great? Also, since a lot of these handle FCoE, what would be the need for the MDS line except in larger deployments? Wouldn't you just run all your FC via the 2Ks, 5Ks, or even 7Ks?

BTW, the interview today went great. I hope to be accepting an offer very soon!
 
Think of Nexus 2Ks as modules off of a 5K or 7K. They extend their reach and port count. A 2K is far cheaper than a 5K, so you can do a pair of 5Ks and extend 2Ks off of them for more 10Gb or 1Gb ports.

Eventually we'll probably move to all FCoE, but that's not so workable today. UCS doesn't support multi-hop FCoE right now, for example. You also have a TON of MDS gear, knowledge, and trust in the world. FCoE hasn't proven itself yet.
 
That's what we're putting in our new environment: Nexus 5596s with 2248s top of rack. It's going to be nice to get some Nexus experience.
 
To add to what NetJunkie said: if you need 10Gb to your servers, some of the fabric extender models, such as the 2232PP, support 10Gb server connectivity via twinax, so cabling costs for 10Gb are much cheaper as well since you do not have to use fiber. The 2232TM supports 1/10GBASE-T. The 2232PP and 2232TM are also the only two that support FCoE; the other models only support 10Gb for the fabric uplinks to your 5K or even 7K. You also have fewer nodes to manage, since the fabric extenders are managed through the 5K/7K. Not all of the fabric extender models are supported on the 7K; however, all are supported on the 5K, I believe.


Edit:

Think about how the UCS blade chassis is designed. You have your 6100 fabric interconnects, which are like your 5K switches. Within the UCS blade chassis itself you have your 2100-series fabric extenders (think of these as your ToR switches). Internally, the fabric extenders are connected to each blade in the chassis. You then have uplinks from the fabric extenders to the fabric interconnects. Your fabric interconnects can support multiple chassis, which is a huge cost savings when you need to grow.
 
That makes sense; the ability to extend a "module" from your 5Ks/7Ks is pretty awesome. I can see where that would come in handy.

Let's go back to the MDS question, though. Since the FIs have FC module capability, in an FC environment would you just run your FC connectivity off of that and into an MDS SAN fabric? Is that how it's usually deployed, with the normal 10GbE LAN traffic moved up the stack to a 2K and then to a 5K/7K, or just to a 5K/7K?
 

I will tell you how it is done in our environment, which is a UCS blade environment. We do not have 5Ks for our other physical servers, since those servers use local storage only, with the exception of one Linux server. On our 6120XP fabric interconnects we have the 8-port FC modules installed on both. We actually connect our storage array (NS-480) directly to this; we do not go through an MDS for connectivity. We also have uplinks from the FIs to our 7K, and this carries the Ethernet traffic. We do have two MDS switches that are also connected to our storage array, so that our RecoverPoint appliances can access the arrays for replication purposes. We also have one physical Linux server connected to the array via the MDS switches.

For the 5Ks it is similar. Think of the 5K as the point where the fabric splits into FC and Ethernet. You have FC ports on it that you connect to your traditional MDS switches for SAN connectivity, or you can connect them directly to your SAN. It really depends on your environment and your needs. You also have uplinks from your 5K to your distribution/core switches, such as the 7K or the 6500 series, which carry your Ethernet traffic.

Now, if your SAN supports FCoE, then you will not need traditional FC ports for connectivity. It can connect directly to your 7K (as long as you have the correct 10Gb module) or your 5K switches. There is just so much flexibility in the Nexus line that it allows you to add it to your existing FC environment and have a seamless migration to FCoE.

I hope this makes sense.

Edit: I am by no means an expert on this. If I am wrong I hope others will step in to correct me. I am just going off of my own experience in my particular environment.
 
A couple more questions. From the FEXs to the blades there are lanes of bandwidth, I'm guessing much like a PCI bus or some sort of high-speed interconnect. In the case of the older FEXs that do 4 ports each, does that equate to 10GbE per blade, or, in the case of a full-width blade, does that blade get 20GbE of bandwidth? Also, in the case of the newer FEXs where you can go to 80Gb, do you get 20Gb per blade? Can you port-channel the new FEXs so that the bandwidth is available to all blades and balances out? Is this how the FEX-to-blade relationship works?
 
Ok..ok..I can take a hint...lol..thanks for the link!:cool:
 
With the first-gen I/O modules (210x), blade I/O ports are pinned to I/O module uplink connections. I can post the chart, but Blade 1 pins to the first port, Blade 2 to the second, etc. If you only connect 2 uplinks, it just pins back and forth between the 2; if you do 4, it goes through all 4 twice (8 blades across 4 uplinks).

With the second-gen modules (220x) you can port-channel the links from the module to the FI. Then you can set how it load-balances those (IP hash, MAC hash, etc.) to get better throughput. With the new VICs you can also do 80Gb to a blade and do some port channeling there. So it can get complex... or just do simple pinning.
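A quick sketch of that static pinning as I understand it (my own illustration of the round-robin behavior described above, not an official Cisco chart): each blade slot pins to uplink ((slot - 1) mod uplinks) + 1.

# Gen-1 IOM static pinning: blade slot -> IOM uplink, round-robin by slot number
foreach ($uplinks in 1, 2, 4) {
    "With $uplinks uplink(s):"
    foreach ($slot in 1..8) {
        $pin = (($slot - 1) % $uplinks) + 1
        "  Blade $slot -> IOM uplink $pin"
    }
}
# With 2 uplinks the blades alternate between ports 1 and 2;
# with 4 uplinks, slots 1-4 and 5-8 each walk through ports 1-4.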
 