ESXi 5 VLAN tagging?

digilink

So I'm just getting started with ESXi and struggling a bit to get my networking right. I have two VLANs:

vlan1 - 10.0.0.0/24
vlan2 - 10.0.1.0/24

I want to use vlan1 as the management network and vlan2 for the VMs. I'm using the on-board NIC on the mobo in the ESXi server, and it's attached to a Netgear gigabit smart switch that is configured as an 802.1q trunk for these VLANs.

If I leave VLAN tagging off or set it to 4095 (for all VLANs) on the management network, it grabs an address via DHCP and I am able to connect to it just fine. If I set it to 1 (which is what it should be), networking dies. For the VM network, I set the VLAN tag to 2, and my VMs will not pull a DHCP address; there appears to be no network connectivity at all.

I must be missing something, but I'm struggling to find out what. Any docs I have run across suggest setting the VLAN tag on the virtual switch, but I do not appear to have that option.

Anyone have any ideas?
 
When you set up the port group within the vSwitch, that's where you add the VLAN tag. In the properties of the vSwitch, choose Add -> Connection Type: Virtual Machine. The Port Group Properties page that follows has the Network Label and VLAN ID fields.

For the management network, you'll choose VMkernel instead of Virtual Machine when setting up the port group.
 
Just so I can beat a dead horse... you really shouldn't use VLAN 1. It is not a best practice. The reasons are sort of lame, but if you have the opportunity to use something else -- I would.

As suggested, you'll need to set the port groups to have matching VLAN tags as appropriate. You should already have a default VMkernel port group, and probably a VM Network port group. You modify the VLAN tag by going to the properties of the vSwitch, then selecting the port group and clicking Edit. You should be able to type in the VLAN ID at this point. If you hose this up for the management network while using the vSphere Client... you're going to need to sort it out via the DCUI.
 
I'm just gonna throw this out - is your DHCP server on either of those VLANs?
 
I'm just gonna throw this out - is your DHCP server on either of those VLANs?

Yes

Just so I can beat a dead horse... you really shouldn't use VLAN 1. It is not a best practice. The reasons are sort of lame, but if you have the opportunity to use something else -- I would.

I 100% agree with you, but I never got around to actually changing it. I'm just trying to get up and running at the moment and will most likely change it in the future.

Oddly..... the only way I can get this to work is to set the VLAN ID to 4095 on both interfaces. According to the documentation, 4095 is for all VLANs, and it's pulling an IP from the VLAN 1 subnet for the management network as well as the VM network. I previously had a Proxmox install working on this same setup (actually it's the same machine/switch port, I'm taking ESXi for a ride to see how I like it) and it worked fine. I had to edit /etc/network/interfaces manually, but after I got it set up it dealt with both VLANs without an issue.

So still kinda lost.... I'm open to changing my VLAN tags as suggested, but in theory this should be working without resorting to that unless I am still missing something....
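
A quick way to sanity-check which VLAN a lease really came from is to map the address back to the subnets from the first post. This is just a minimal sketch: the subnets are the ones listed above, while the helper name `vlan_for_address` and the sample addresses are made up for illustration.

```python
import ipaddress

# Subnets from the first post: vlan1 = 10.0.0.0/24, vlan2 = 10.0.1.0/24.
VLAN_SUBNETS = {
    1: ipaddress.ip_network("10.0.0.0/24"),
    2: ipaddress.ip_network("10.0.1.0/24"),
}


def vlan_for_address(addr: str):
    """Return the VLAN whose subnet contains addr, or None if none matches."""
    ip = ipaddress.ip_address(addr)
    for vlan_id, subnet in VLAN_SUBNETS.items():
        if ip in subnet:
            return vlan_id
    return None


# Hypothetical leases, for illustration only.
print(vlan_for_address("10.0.0.57"))  # 1
print(vlan_for_address("10.0.1.57"))  # 2
```

If a VM that is supposed to sit on vlan2 reports an address that maps to 1, its DHCP exchange is happening on the untagged/native side.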
 
If I am not mistaken, 4095 is the "open" setting that covers every VLAN. I could be wrong in my understanding of that one.
Here's a link on VLANs from a Google search: http://communities.vmware.com/thread/85501
 
Paste the config of your physical switch. It's not tagging VLANs. The reason VLAN 1 works when you set to 0 or 4095 is that it's assuming VLAN 1 is native (meaning, untagged) and it works. By default that's how most switches are...VLAN 1 is native and untagged.

Paste your config and I'll help you fix it. It's easy.
 
NJ is right: the default VLAN on the switch is not tagged. Make the native VLAN on the switch port connected to the VMware host something other than 1 or 2; that will force the port to tag VLAN 1 and VLAN 2 traffic.
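
To make the native/tagged distinction concrete, here is a rough toy model of a switch port. Everything in it is an assumption for illustration: the `SwitchPort` class is hypothetical, the memberships are not read from the Netgear screenshots, and dropping frames for VLANs the port is not a member of is only one common ingress behavior.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SwitchPort:
    """Simplified 802.1Q port; real switches differ in the details."""
    pvid: int                                        # VLAN given to untagged ingress frames
    tagged: Set[int] = field(default_factory=set)    # VLANs sent out with a tag
    untagged: Set[int] = field(default_factory=set)  # VLANs sent out without a tag

    def ingress(self, frame_vid: Optional[int]) -> Optional[int]:
        """Return the VLAN a received frame lands in, or None if it is dropped."""
        vid = self.pvid if frame_vid is None else frame_vid
        return vid if vid in self.tagged | self.untagged else None

    def egress(self, vid: int) -> Optional[str]:
        """Describe how a frame in VLAN vid leaves the port, or None if it does not."""
        if vid in self.tagged:
            return f"tagged {vid}"
        if vid in self.untagged:
            return "untagged"
        return None


# Typical default: VLAN 1 is native and untagged, VLAN 2 is tagged.
default_port = SwitchPort(pvid=1, tagged={2}, untagged={1})
print(default_port.egress(1))    # untagged
print(default_port.egress(2))    # tagged 2

# Suggested change: move the native VLAN away (25 here) so 1 and 2 are both tagged.
trunk_port = SwitchPort(pvid=25, tagged={1, 2}, untagged={25})
print(trunk_port.egress(1))      # tagged 1
print(trunk_port.ingress(None))  # 25
```

In this model, an untagged frame from the host on the second port lands in VLAN 25, so a management port group would need an explicit tag of 1 to stay on VLAN 1.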
 
Thanks a bunch :) No CLI, GUI only, so here's some screenshots. It's a Netgear GS724T switch, router is pfSense. ESXi server is connected to port 9, port 24 is towards router:

VLAN 1 membership:



VLAN 2 membership:



PVIDS:



This was working in the past with my Proxmox box, so I think my switch is config'd correctly but never can tell :)

Thanks for any insight, I appreciate it :)
 
Port 9 is a tagged member of VLAN 1 but also has a PVID (native VLAN) of 1, so the two settings might conflict. Try setting the PVID on port 9 to something other than 1 or 2.
 
Try setting pvid on port 9 to something other than 1 or 2

I set it to 25 to test.... crashed and burned :( I cannot connect back to it unless I set it back to where it was (PVID of 1).
 
Can you post a screenshot of the networking configuration for the ESX host?
 
Can you even trunk vlan one on a netgear switch? Some of those web managed only jobs can't tag vlan 1 and it is what it is. Don't use vlan 1.
 
You lost the connection to ESX because your "Management Network" has no VLAN tag, which means it takes the native VLAN of the switch port (25 in your test). Change it to 1 and it should work.
In VMware, 4095 means the port group will accept traffic on any VLAN and pass the tags through to the guests, but it's up to the hosts on that port group to specify which VLAN they want to use. If a VM on a 4095 port group sends untagged packets, they end up on whatever the native VLAN (PVID) of the switch port is.
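
For reference, the three cases for a port group's VLAN ID can be summed up in a few lines. This is only a sketch; the function name `portgroup_mode` is made up, and the EST/VST/VGT labels are the names VMware's documentation uses for these modes.

```python
def portgroup_mode(vlan_id: int) -> str:
    """Classify a standard vSwitch port group by its VLAN ID setting."""
    if vlan_id == 0:
        # No tagging by the vSwitch; frames land on the physical port's native VLAN.
        return "EST (external switch tagging)"
    if 1 <= vlan_id <= 4094:
        # The vSwitch tags outgoing frames and strips the tag on the way in.
        return "VST (virtual switch tagging)"
    if vlan_id == 4095:
        # Tags pass through untouched; the guest OS must do its own tagging.
        return "VGT (virtual guest tagging)"
    raise ValueError(f"VLAN ID out of range: {vlan_id}")


for vid in (0, 1, 2, 4095):
    print(vid, portgroup_mode(vid))
```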
 
Can you even trunk vlan one on a netgear switch? Some of those web managed only jobs can't tag vlan 1 and it is what it is. Don't use vlan 1.

That might also pose a problem. I know you can tag VLAN 1 on ProCurve and Cisco, but I've never tried it on Netgear, D-Link, or any other SOHO gear.
 
From my first post:

If I leave VLAN tagging off or set it to 4095 (for all VLANs) on the management network, it grabs an address via DHCP and I am able to connect to it just fine. If I set it to 1 (which is what it should be), networking dies. For the VM network, I set the VLAN tag to 2, and my VMs will not pull a DHCP address; there appears to be no network connectivity at all.
So it would appear that I had it set up correctly to begin with, based on your suggestion above? If that's the case, I am truly at a loss, as this should work as intended..... It worked with Proxmox, which is nothing more than vanilla Debian Lenny with KVM/OpenVZ installed and a fancy web GUI. I had two virtual Ethernet interfaces with tagging on both and they worked fine, so my assumption is that the problem could be with ESXi, but I'm still at a loss as to what it could be, as everything appears to be config'd properly.
 
You need to make both changes: set VLAN 1 on the management network and change the PVID for the port on the switch. After you change the PVID, make sure the port is still a member of VLAN 1 and is tagging.

Also, what handles DHCP on your network? Is it another VM server or a physical one?
 
I have one of those switches and just set up VLAN trunking on it last week. I was only using VLAN 1 before, as we weren't doing anything special, so I just left all switches set to 1. I didn't have any issue using VLAN 1 as part of the trunk, other than traffic appearing to want to skip over to the other VLAN. That seemed to go away when I changed my network to a different VLAN number and left nothing on 1. But as far as traffic goes, it all passed just fine. Like everyone else, I am thinking that it's just not trunking on the server correctly.

That said, why are ports 14-23 set as untagged for both VLANs?
 
That's for some VoIP phones and ATAs for my VoIP network. Not all of them are in use, but I reserved them for VLAN 2....

I'm really beating my head against the wall with this. I'm starting to think there may be an issue with the on-board NIC I'm using. It's a Realtek 8168. It seems to do everything but VLAN tagging, and it must support that because I've had it working under Linux before; it just may not work under ESXi. I haven't been able to confirm that, though.
 
I doubt it's the onboard nic and more the switch or config of the vswitches.
 
A vSwitch config isn't complex at all. If you set it to tag a VLAN and it doesn't work, then your physical switch is configured wrong. My guess is it's that Netgear switch. A lot of those have weird issues...
 
Can you even trunk vlan one on a netgear switch? Some of those web managed only jobs can't tag vlan 1 and it is what it is. Don't use vlan 1.

I have the same switch at home. With the newest firmware the GUI/options look completely different.

And yes, the thing is screwed to hell, but it works. I am tagging VLAN 1 and that does seem to work... With pfSense on a tagged port, machines on untagged VLAN 1 can get online, and the same goes for the Wi-Fi AP on a tagged port, EXCEPT I can't access the switch config from a tagged port, and there's no configuration option for a management VLAN, only an IP address. Oh yes, and SNMP can only be configured on a per-host (IP address) basis, and only a max of 4 IP addresses can be configured.

gs724t.png
 
Actually, that is the older firmware. I have one of those. I actually have a GS724Tv1 (the one you have there), a v2, and a v3. The newer firmware looks like what he posted.

What issues do you have with them? That is what we are using throughout the entire office without much issue. The only real issue I've seen is that some device that only links at 10Mbps half duplex (by design) causes them to go crazy and stop responding to anything. They still pass traffic just fine, but the management goes down. After a reboot it would be fine for a few days, then do it again. I haven't narrowed down which device it was, as it was in a room full of old devices that sync at 10Mbps half duplex, so I just moved all the non-gigabit stuff off of it. Who knows, it could have just been the number of 10Mbps connections. But I have a GS748T, 2 of the GS724Tv3, a GS724Tv2, and a GS724Tv1, and they all work fine. You might want to check whether your firmware is at the current version.

As for the management VLAN, if you are talking about the Netgear, that can be changed on the newer ones. The default is 0 for all VLANs, but you can change that; I was thinking that the v1s had that option also. As for SNMP, that limit is only for traps. If you enable SNMP, you can query it from any machine.
 
I have the same issue. The only solution is to use static IP addresses, or to replace your Realtek NIC with a Broadcom or Intel one.

I have seen that when a virtual machine requests an IP from VLAN 10, the DHCP server responds on the native VLAN 1.

This is not the case after I swapped the network interface for something else, like Broadcom/Intel.
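
If you capture the raw frames, you can check which VLAN a reply actually came back on by looking for the 802.1Q tag (TPID 0x8100 right after the two MAC addresses, VLAN ID in the low 12 bits of the next two bytes). A minimal sketch, assuming you already have the frame bytes; `vlan_id_of` and `make_frame` are made-up helpers, and the MAC addresses and payload are dummies.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id_of(frame: bytes) -> Optional[int]:
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q header")
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID


def make_frame(vid: Optional[int]) -> bytes:
    """Build a dummy IPv4 frame header, tagged with vid or untagged if vid is None."""
    dst = b"\xff" * 6                    # broadcast destination MAC
    src = bytes.fromhex("020000000001")  # made-up source MAC
    tag = b"" if vid is None else struct.pack("!HH", TPID_8021Q, vid)
    return dst + src + tag + struct.pack("!H", 0x0800) + b"\x00" * 46


print(vlan_id_of(make_frame(10)))    # 10
print(vlan_id_of(make_frame(None)))  # None
```

A request that leaves tagged with 10 and a reply that shows up as None would match the behavior described above. Keep in mind that some NICs and drivers strip the tag before a capture tool sees the frame.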
 
An update on this.... I got a new network card (Intel flavor, gigabit), VLAN problem solved :D

I suspected it might be the onboard NIC causing the issue, and my suspicions were correct. Realtek and Linux have always given me fits. I should have just bought an Intel card to begin with. Live and learn!!!!

Thanks to all who responded :)
 
You know, that actually explains an odd issue I had with a set of realtek nics too... except mine transmitted everything on "every" vlan, somehow... I just never investigated and moved them to a different network segment to avoid the problem (lab boxen)
 
I didn't read any of this thread so I will just comment on the OP's first post.

VLAN 0 in VMware is akin to VLAN 1 on a switch. That is why you're seeing it work on VLAN 0 or 4095 and not 1. Blame VMware for this snafu.
 
VLAN 1 and 2 are now working just fine, it was a NIC issue. I've got VLAN 1 on the mgmt network, and VLAN 2 for the VM network.
 
I know this is an old post, but I see DHCP issues like this all the time in what I work on (VoIP). If the DHCP server is on a trunk port, and not an access port or a port with a voice VLAN, it will always respond to the DHCP request on the native VLAN for some reason.
 