Virtual pfSense and switch VLANs

Outlaw85

Wasn't sure if this should be posted here or in virtualization.

I put pfSense in my 2-node ESX cluster but am having issues with it communicating over an L2 switch with VLANs. I'm not expecting anybody to have the golden ticket since I'm working with old hardware, but I'm hoping to get some direction.

What I've tried:
Set only the port group VLAN in ESX and the VLAN in pfSense for the WAN port (no communication, no WAN IP).
Set the VLAN only on 3 ports of the switch. This appeared to allow two VLANs (1, 10). This may be where I have to dig in. All switch ports default to VLAN 1.
I also tried setting those 3 ports as trunk but didn't see a change.

Note: I tried with and without the port group VLAN and pfSense VLAN assigned.

Hardware:
HP DL380g6 x2
Cisco Catalyst 2948G-GE-TX
AMD shuttle running pfSense dedicated (WAN in / LAN out). Also running DHCP

Working configuration
1605506249978.png


Non-working configuration
1605506284426.png
 
Quick update for anybody interested. I was able to do more digging and found this is a common issue for services with a modem and 1 assignable MAC address. Because the managed switch has a MAC and the modem doesn't route, I can't get the traffic to the proper port for the pfSense VM. I'm not finding anything for the switch that allows "port forwarding" either. At least for now, it looks like I will have to keep the small unmanaged switch in place to keep redundancy.
 
I'd have to dig into what you're doing, but the reasoning doesn't seem to make a lot of sense to me. I'm going to assume you assigned an IP to the management switch on that VLAN? If you're truly doing L2 I don't think you would even see a single MAC from the switch itself, since it's not residing on that network. Otherwise you should have as many as 5 MACs + your VM. Each port on the switch could have its own MAC, and definitely each port on the nodes themselves has a MAC address assigned to it. If failover is working and your VM is getting the proper lease, then that means the virtual MAC on your VM is being used by the modem. So it would seem odd to me that you can accomplish that through L2 VLANs on the hosts, but not L2 VLANs on the switch.
 
Thanks for the reply. It's very possible I am stating the reason it's not working incorrectly. It seemed to make sense when I read it though. I'll try to find the links again and share them.

This is where my lack of networking knowledge will shine. I'm not sure what you mean by assigning an IP to the management switch on the VLAN. I did not set any IPs on the switch.

I do not have any VLANs set on the hosts/vswitches when it's working with the unmanaged switch. It's a generic DLink gigabit switch.
1605726194623.png



I was able to confirm the vlan tagging on the Cisco switch was working using a different computer. When I have the cables from the DLink plugged into the Cisco using VLANs, pfSense doesn't get an IP.


*update*
I was able to find one of the posts describing the same issue. I'm still looking for the other.
https://www.reddit.com/r/homelab/comments/8ttwwd/spectrum_internet_nonsense/

Also the Spectrum modem - ARRIS TM1602A
 
So right in that post,

Basic... but I don’t know how many times this has hung me up for a while...
In my area, even with your own modem, Spectrum locks the modem to communicating with a single MAC address. It resets... but only after the modem is physically powered off for 30 seconds (I think, I usually go a minute to be safe).
When you're switching between a setup that works and one that doesn't, I've had it just be that I didn't wait long enough too many times to count...

Odd. I have spectrum (with my own modem, though — a Motorola surfboard, forget the model).
Modem connects to a unifi switch. The modem switch port is tagged with my WAN VLAN.
The only other thing in that VLAN is a PFsense VM. The eSX host ports are all open for passing any VLAN traffic, and VMware takes care of tagging the virtual interface for me.
I’ve never had any issue. Can successfully vmotion the PFsense VM between hosts.

Same poster.

So basically what they are stating is that if there is ANYTHING else that the modem sees on that network, then it will bomb out. If it dies then you need to unplug the modem, wait 30 seconds, and then try again (And wait a minute to see if it works).

From my experience PFSense can also be really finicky with changes like that. You probably want to do this: power down said VM, unplug the modem, wait 30 seconds, plug the modem back in. Wait a minute, power on the PFSense VM. Wait 2 or 3 minutes, then check and see what happens. PFSense can get annoyed if there isn't a device there ready to hand it a lease, and then likes to make it even more interesting. I have seen it pick up the modem's 10.x.x.x assignment, and doing a release / renew via PFSense allowed it to get the proper IP.


So basically just make sure that your VM is on your VLAN 10 inside of ESX, that only the ESXi server is set to tag VLANs, and that you're not passing untagged traffic to any of your VMs. On the switch, the port the modem is plugged into isn't trunked and is configured for VLAN 10 as well. Make sure if you display VLAN 10 that the switch isn't trying to run MGMT on that VLAN, which is what will cause it to assign a MAC address to that VLAN. From PFSense itself open up Diagnostics, ARP Table. Type WAN in the search box, change the drop down to "interface", then click search. If there are more than 2 entries there, you need to figure out what that MAC address is and see what you can do to keep it from being on your network. (It could be something as simple as ESXi being set to allow MGMT traffic on VLAN 10, which would probably put MAC addresses from both ESXi hosts on VLAN 10.)
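
The "more than 2 entries" check above can be sketched in a few lines of Python. This is only an illustration: the ARP entries, IPs, and MAC addresses below are made up, and the helper name `unexpected_wan_macs` is hypothetical. In practice you would copy the rows from the pfSense ARP Table page by hand.

```python
def unexpected_wan_macs(arp_entries, expected_macs, interface="WAN"):
    """Return MACs seen on `interface` that are not in `expected_macs`."""
    expected = {mac.lower() for mac in expected_macs}
    seen = {mac.lower() for iface, _ip, mac in arp_entries if iface == interface}
    return sorted(seen - expected)


# Made-up rows in the form (interface, IP, MAC), for illustration only.
arp_entries = [
    ("WAN", "203.0.113.20", "00:50:56:aa:bb:01"),  # pfSense WAN vNIC
    ("WAN", "203.0.113.1", "00:01:5c:11:22:33"),   # ISP gateway
    ("WAN", "203.0.113.50", "00:50:56:cc:dd:02"),  # stray device on the WAN VLAN
    ("LAN", "192.168.1.10", "3c:52:82:00:00:10"),  # ignored: not on WAN
]

# The two MACs you expect to see: the pfSense WAN interface and the gateway.
expected = ["00:50:56:AA:BB:01", "00:01:5C:11:22:33"]

strays = unexpected_wan_macs(arp_entries, expected)
print(strays)
```

Anything the function returns is a MAC that should be tracked down and kept off the WAN VLAN.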
 
I read that too, and if that is required every time power is lost or I need to restart for some reason, that is less than ideal. For me it's annoying but not a big deal. If I'm traveling for work and something happens, that's not going to fly with the family. I do recall seeing it pulling a 192.168.. and a 10.0.. at some point. Maybe I was partially there but didn't wait long enough.

So far, it's been solid with the "working config". I can restart pfsense if needed and the net comes back without issue, same with vmotion and restarting the modem.


-Waited 1min+ with power off on modem /shutdown pfSense VM
-Plugged into switch port configured with vlan 10 (no trunk)
-Powered on modem
-Waited a few minutes
-Started pfSense VM, waited at least 5 minutes. (Originally used NDP table as ARP table must not have been ready? It didn't give info for at least 10-15min?)

The second WAN entry gets a WAN IP. Guessing this is a gateway?

On managed switch (not working)
1605748204227.png


On unmanaged switch (working)
1605748228341.png


On managed switch (not working)
1605748281519.png


1605748355679.png


The devices on port 42 and 43 are just the crappy workstation I'm typing this from and from my other switch (daisy-chained). Port 48 tag was going to be if I get another host.
 
So are you tagging vswitch1 in your environment? From the configuration you're posting it looks like you are simply using the managed switch as a glorified hub with a separate network. So if you had configured any type of tagging in ESXi or pfsense it's not going to work. The way you should be setting this up is:

2/45 Trunk ESXi 1
2/46 Trunk ESXi 2
2/47 VLAN 10 cable modem

vSwitch1, new vmkernel adapter with a vlan ID of 10. Put the VM onto that network.

Can you post a picture of the vswitch1 diagram from ESXi? I can't tell if you set it up to tag traffic or not.
 
Sorry.

As of now. no tagging in ESXi/vSwitch1 or pfSense. I did try that previously, which of course wasn't working (with and without switch ports tagged)
I did NOT create a vmkernel adapter.

1605810634318.png


I will have to wait till tonight to try reconfiguring the trunk ports.
Question- Do I want to have my dedicated pfsense ports on trunk? If I trunk this, I have to remove port 1 (default)?

Thank you for your patience and taking the time to explain this!!
 
So you can mix the traffic on your ESXi hosts because you have multiple interfaces. It's perfectly fine that vSwitch0 is not trunked, and vSwitch1 is. That's pretty standard affair. I actually purposefully keep at least one NIC in pfsense configured to not be setup on the trunk because when you have VLAN issues you lose access to the web interface. It's about 10,000x easier to fix issues from web than from terminal that way. But if you do get trunking working then it's easy enough to just throw a bunch of virtual NICs into PFSense and put them on different VLANs, so you can segment out your traffic.

The only thing to make sure you don't do is set up one end to tag traffic but not the other. An example would be setting a VLAN ID in ESXi but not setting the switch port as trunk. If you do that you're sending tagged traffic directly onto whatever VLAN you configured. So if switch port 2/41 is set to VLAN12, plugged into ESXi1 port 3 and not set to trunk, the ESXi traffic won't be untagged by the switch and put onto the proper VLAN. Instead the switch will just put traffic onto its own VLAN12 but leave it tagged as VLAN10. That would mean the host that is plugged into, say, 2/42 on VLAN 12 would have to be VLAN aware and be set up to handle tagged traffic. So in the case of the modem, it's not going to be able to handle said traffic and won't be able to communicate.
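
The "both ends must agree" rule can be sketched as a toy model in Python. This is not how any particular switch is implemented, and the `Port`, `ingress_vlan`, and `reaches_modem` names are made up for illustration. One simplifying assumption: a tagged frame arriving on an access port is treated as unusable. The post above describes it being forwarded with the tag still attached, and other switches may simply drop it, but either way a VLAN-unaware modem cannot use it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    mode: str      # "access" or "trunk"
    vlan: int = 1  # access VLAN, or native VLAN when mode is "trunk"


def ingress_vlan(port: Port, tag: Optional[int]) -> Optional[int]:
    """VLAN a frame is classified into on arrival, or None if unusable.

    `tag` is the 802.1Q VLAN ID on the frame, or None for an untagged frame.
    """
    if port.mode == "trunk":
        # Trunk: honour the tag; untagged frames fall into the native VLAN.
        return port.vlan if tag is None else tag
    if tag is None:
        # Access port: untagged frames join the port's VLAN.
        return port.vlan
    # Model assumption: a tagged frame on an access port never reaches a
    # VLAN-unaware device in usable form.
    return None


def reaches_modem(host_port: Port, host_tag: Optional[int], modem_port: Port) -> bool:
    """True if a frame from the host arrives untagged at the modem's access port."""
    return modem_port.mode == "access" and ingress_vlan(host_port, host_tag) == modem_port.vlan


modem = Port("access", 10)
results = {
    "tagged 10 -> trunk": reaches_modem(Port("trunk", 1), 10, modem),
    "tagged 10 -> access 10": reaches_modem(Port("access", 10), 10, modem),
    "untagged -> access 10": reaches_modem(Port("access", 10), None, modem),
    "untagged -> trunk (native 1)": reaches_modem(Port("trunk", 1), None, modem),
}
for case, ok in results.items():
    print(f"{case}: {'works' if ok else 'fails'}")
```

The two working rows are the two consistent setups: tag on the host with a trunk on the switch, or no tagging on the host with an access port on the switch.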


So basically I would change vSwitch1 so that the WAN port of PFSense is using a vmkernel adapter pointed to VLAN10. Then do the trunk configuration on the switch. Remember that because you did the trunk configuration on the ESXi hosts and the switch, all of the VLAN configuration is completely transparent to both the modem and the PFSense VM. They don't know and don't care that they are on a VLAN, so it should work the same way the unmanaged switch does. That said, I would actually have kind of expected your configuration to work, unless the physical NIC that you're bridging PFSense on is interfering for whatever reason. If it worked unmanaged I would have expected it to work managed. But as someone else posted, they confirmed they have it working with VLANs, so it wouldn't be a bad idea to try it that way since you're likely wanting to get to that point anyway. You probably want to set up ESXi trunking, and use another VM and a workstation to just test the trunk to make sure it works. It will be a lot easier if you have two hosts which you can trust to make sure it's working before testing DHCP from the modem to PFSense WAN.
 
Woot! THANK YOU!
1605835466515.png


Tested from my "crash cart" lol. Soooo bad. 5G WIFI test from phone did 166dwn/23up. I'd call that full speeds.
Also confirmed vmotion working with speedtest and download running at the same time.


It's working so it must be right :D but here are the screen caps to compare to above. At least for others if they try the same... or just me when I have to redo this for some reason and forget :D:D
1605835910056.png


1605836535442.png
 
So you actually have a bit of a mess there Outlaw! I can't actually tell from the picture which one of the two methods is making it work...

You set "Native VLAN" to VLAN10. So what that does is it will automatically take any non tagged traffic and go ahead and tag it to VLAN10. Basically that's a long way of saying if you plug in something that is not vlan aware, instead of that traffic staying on the default vlan (VLAN 1) it puts that traffic on VLAN10 instead.

So then on the ESXI host, it looks like you have an adapter that's probably in use that's untagged, and one that is tagged to VLAN10. The untagged traffic from the VM is going to end up on VLAN10 regardless, because the native VLAN is set to VLAN 10.

To explain that in another way.

2/45 and 2/46 traffic which is untagged will be placed on the "native VLAN". In this case it's just tossing it onto VLAN 10 instead of VLAN 1. That seems to be how "PFSense WAN" is set up to work. You're just using VLAN 10 as an L2 network and nothing is actually VLAN aware.

For "PFSense-WAN", you are explicitly saying tag this traffic, and then on the switch it's set to trunk, so the switch knows to take any tagged traffic and put it onto the correct VLAN. In this case it sees VLAN10 traffic, and puts that traffic onto VLAN10.

In case A the server is sending untagged traffic, and the switch is picking it up and knows to tag it and put it onto VLAN10. In Case B the server is properly tagging the traffic, then the switch is picking up the tagged traffic and putting it onto the correct vlan. In both cases it should work, however Case B is the "more correct" way to do it. It looks like you're doing Case A. The reason why it's more correct to do B is that as soon as you wanted to put another VM onto that server, and let's say it's supposed to land on VLAN11, Case A won't put it on VLAN11. Case A will put it on VLAN10 because that's the native VLAN.
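
Here is a small Python sketch of why Case A stops working once a second VLAN is involved. It only models how a trunk port classifies incoming frames (tagged frames keep their VLAN, untagged frames land on the native VLAN). The function name and the VM names are made up for illustration.

```python
def trunk_ingress_vlan(frame_tag, native_vlan):
    """VLAN a trunk port assigns to an incoming frame.

    `frame_tag` is the 802.1Q VLAN ID, or None if the frame is untagged.
    """
    return native_vlan if frame_tag is None else frame_tag


# VLAN each VM is *supposed* to end up on.
intended = {"pfSense WAN": 10, "other VM": 11}

# Case A: the host sends everything untagged, and the trunk's native VLAN is 10.
case_a = {vm: trunk_ingress_vlan(None, native_vlan=10) for vm in intended}

# Case B: the host tags each VM's traffic, and the native VLAN stays at 1.
case_b = {vm: trunk_ingress_vlan(vlan, native_vlan=1) for vm, vlan in intended.items()}

print("Case A:", case_a)
print("Case B:", case_b)
```

In Case A both VMs end up on VLAN 10 no matter what was intended. In Case B each VM lands on the VLAN its port group tagged it with.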

But if for some reason Case B just doesn't want to work, just make sure that other VMs you want to use are not set that way. Otherwise they will land onto that VLAN10 and cause you problems.


So all I think you need to try to do to make this work correctly is this:

On the "pfSense WAN" connection, click the 3 dots in the top right corner. On the screen where it gives you the name, at the bottom right it should say VLAN ID. Type in "10" there and hit save. See if the connection stays working, and if it does, great! You shouldn't need that second adapter called "pfSense-WAN" at all. The confusing part is that I think the "VMKernel" option is a combined option for creating both new VLANs and management adapters. You probably don't want or need a MGMT adapter for your ESXi on VLAN 10. If you remove that and everything is still working, then you should be golden. If you need to create new adapters you should be able to just follow whatever steps you did to make the first "pfSense WAN", but during those steps there should be a place for VLAN. It has a dropdown, but you can just type in the VLAN ID there if it's not already listed. (Which it never is, because you are making a new VLAN.) You need to do that operation each time you want a new VLAN, and obviously you need to do that for both hosts. But once you get started, making new VLANs is as simple as creating a new NIC for PFSense in ESXi, putting it onto that VLAN, and then setting up that NIC in PFSense. Then on the switch just set the port to the correct VLAN and you'll have another isolated network that can only talk to other hosts that are on that VLAN.
 
Yep. I see what you mean.

I was able to clean it all up.
--Removed the vmkernel port
--Added the VLAN to the pfSense WAN port group
--Set the native VLAN back to 1 on 2/45 and 2/46

Everything is still working and hopefully correct now :)
 
If it's still working after those changes, Congrats!!!!

You now have VLANs on your ESXi box and are making use of a managed switch. It's always a hurdle the first time you try to get through that part of configuring, but after it's done it seems much more simple looking back at it. Now that you have it set up, creating new VLANs is a piece of cake. You can carve out your entire network so your IoT is walled off from things you care about and really start taking advantage of the switch you have.
 
Thanks! :D

IoT lol. I'm fighting it but will lose in the end I'm sure.

I do have one Unifi AP for learning on and it supports using VLANs. Got some learning to do there :) I'll start a new thread when I get stuck again.

Thanks again!
 
So the Unifi is exactly the case where you want to use Native VLAN. Their controller software is somewhat dumb in that it has to be able to broadcast to hit the new APs as you set them up. So what you would do is this. Set up your favorite Linux Distro, install the Unifi Controller software on it. Put that VM onto say VLAN 11, then setup your AP on a trunk port, but make the native VLAN 11 as well. That way when you plug in the AP you can provision it through the controller, but once it's setup it will support the other VLANs for trunking. It also makes it so that MGMT interface of the AP is on VLAN11, thus allowing you to prevent any access to manage the AP from other networks.

It's actually fairly simple to do, and with the settings you have now it shouldn't take much to get set up. The only thing I'm not sure of is how PFSense reacts to adding additional NICs to its VM to route the traffic. When you add more NICs to PFSense on a physical box, you can basically expect it to jumble all of the ports, so you get to set it up again from scratch. (While trying to figure out one by one which port is which.) If that's the case then I'd just go ahead and toss like 8 additional NICs at your PFSense VM, then you'll have plenty of NICs available for future expansion.
 
Going to be a bit.. lol
I was trying to get multiple-subnet DHCP working following a doc and... well.. I bricked pfSense setting bridged connections. And nope, I didn't snapshot it. And it won't boot far enough to do a setting revert. :( Lesson learned. I'll be setting up a separate DHCP server.

In regards to your wifi settings, I did mess with this a little bit. I couldn't get the wifi to work when I set the native VLAN on the AP switch port and the port group in ESX.
 
So one thing you should make sure you start doing is going into PFSense, click on diagnostics, backup and restore. Then click the big download as XML button. Takes like 5 seconds to do and it will save the entire config (Including password hashes) into a file. So if you brick it you can just restore that XML to a new install using the restore wizard. Snaps are probably easier to restore, but config backups are still a good thing to have.
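
If you want a quick sanity check that a downloaded backup is at least readable, something like the Python sketch below could work. Treat it as an assumption-heavy illustration: the sample XML is made up and heavily trimmed, the element layout (`pfsense` root, `interfaces`, `if`) should be checked against your own backup file, and `summarize_backup` is a hypothetical helper, not part of pfSense.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily trimmed stand-in for a backup file. A real backup is far larger.
SAMPLE_BACKUP = """<?xml version="1.0"?>
<pfsense>
  <interfaces>
    <wan><if>vmx0</if><ipaddr>dhcp</ipaddr></wan>
    <lan><if>vmx1</if><ipaddr>192.168.1.1</ipaddr></lan>
  </interfaces>
</pfsense>"""


def summarize_backup(xml_text):
    """Parse a backup and map each logical interface to its device name.

    Raises xml.etree.ElementTree.ParseError if the file is truncated or corrupt.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "pfsense":
        raise ValueError(f"unexpected root element: {root.tag}")
    interfaces = root.find("interfaces")
    if interfaces is None:
        return {}
    return {iface.tag: iface.findtext("if", default="?") for iface in interfaces}


summary = summarize_backup(SAMPLE_BACKUP)
print(summary)
```

With a real file you would read its contents from disk and pass the text in. A parse error means the download is not usable for a restore.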

For the AP you already had a config that should work, but to keep it even more simple, maybe don't start off with the AP on a trunked port at all. Pick a VLAN, put the AP on it, and see if you can get that VLAN working in PFSense so the AP is able to pick up a lease. Then put another VM onto that same VLAN to configure the controller software. If you get that far then you can reconfigure the AP, then change the port over to a trunked port but keep the native VLAN as whatever you originally configured everything on.
 
Almost forgot about this :/

I did run a backup xml today. Thanks for that.

I think I got the AP stuff working right. I just need to secure it at the pfSense level, as it can traverse any subnet. I created a firewall rule for any/any to test.
The AP port is configured with Trunk on default native vlan 1
The controller VM is on a tagged port group and using the correct subnet.
In the controller software I created the network with the VLAN and then assigned the wireless network to it.
--This is currently working. I did notice the AP moved back to my x.x.1 subnet but the controller is on x.x.2. My phone grabbed a x.x.2 IP so it does appear to be tagging correctly.
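
A quick way to double-check which network a leased address belongs to is Python's `ipaddress` module. The subnets below are placeholders standing in for the x.x.1 and x.x.2 networks mentioned above, and the phone's address is made up. Swap in your real ranges.

```python
import ipaddress

# Placeholder networks standing in for the x.x.1 and x.x.2 subnets.
SUBNETS = {
    "default VLAN (x.x.1)": ipaddress.ip_network("192.168.1.0/24"),
    "wireless VLAN (x.x.2)": ipaddress.ip_network("192.168.2.0/24"),
}


def which_subnet(ip):
    """Return the label of the subnet containing `ip`, or None if none match."""
    addr = ipaddress.ip_address(ip)
    for name, network in SUBNETS.items():
        if addr in network:
            return name
    return None


phone_subnet = which_subnet("192.168.2.57")  # made-up lease for the phone
print(phone_subnet)
```

If a wireless client reports an address in the default VLAN's range instead, the SSID is probably not being tagged the way the controller settings suggest.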
 