Intermittent packet loss with pfSense?

colinstu

Check out these RRD graphs

7zq6GgK.png


QeYyzGi.png


PeErtCL.png


Anyone see anything like this before? Any ideas what the cause could be / where could I look to narrow down the problem?

Running 2.1.4-RELEASE i386 on a Dell OptiPlex 980. Onboard gigabit NIC used for WAN, ancient half-height Linksys 10/100 NIC for LAN.

Edit: TWC 30/5 internet goes to a Motorola SB6141 modem, then to the pfSense router, which connects to an HP 24-port switch. Traffic shaping is turned off (not configured). No CARP.
 
I'm no guru, but I'd start by swapping the NIC assignments around: give your LAN the onboard gigabit NIC and your WAN the half-height Linksys. If the packet loss follows the card, there's your problem!

Or just straight out swap them both for a couple of Intels.

That's the route I'd take anyway...
 
Onboard gig nic used for WAN, ancient half-height Linksys 10/100 nic for LAN.

Dude... Why skimp on the NICs?

Get something like this, or this. That's exactly what I use in my pfSense router, and I can verify they are good for 100Mbps+ as well as thousands of simultaneous connections. They are 64-bit cards, but install in 32-bit slots just fine. To say the cards are inexpensive is a bit of an understatement... There is no reason to still be using some ghetto old network card.
 
As an eBay Associate, HardForum may earn from qualifying purchases.
How does your internet work during those time periods? Since the loss is on your WAN port, that most likely means your connection is having issues during those times.

What pfSense is doing is pinging the gateway device from Time Warner. It gives you an idea of the quality of the connection back to your ISP. You probably want to look at your modem's status page and check the signal levels: http://192.168.100.1
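For a sense of what the loss figure on such a graph represents, here is a minimal Python sketch that turns a series of gateway probe results into a loss percentage. It is not pfSense's actual monitoring code; the function name `loss_percent`, the window size, and the sample probe history are all assumptions made for illustration.

```python
from collections import deque


def loss_percent(results, window=20):
    """Return the percentage of unanswered probes among the newest results.

    `results` is an iterable of booleans: True if the gateway answered the
    probe, False if it timed out. The name and the default window size are
    illustrative assumptions, not pfSense internals.
    """
    recent = deque(results, maxlen=window)  # keeps only the last `window` probes
    if not recent:
        return 0.0
    lost = sum(1 for answered in recent if not answered)
    return 100.0 * lost / len(recent)


# Hypothetical probe history: 20 probes, 4 of them unanswered.
history = [True] * 16 + [False] * 4
print(loss_percent(history))  # 20.0
```

The point of the window is that the figure only reflects recent probes to one specific gateway address, which is why it can disagree with a ping to some other host.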

Downstream power should be within ±7 dBmV and upstream power needs to be below 51 dBmV on each channel. Downstream signal-to-noise ratio should be above 32 dB.

The onboard NIC in that machine is just fine and you shouldn't have any issues using it for your WAN. This particular graph has nothing to do with your LAN NIC, as the graph is solely from the pfSense box to your ISP. If you look at the graph for LAN and see packet loss then you definitely have an issue with that particular NIC, but it most likely works fine as well.
 
A NIC does exactly one physical connection.

Fixed that for you. Unless of course you have a NIC with multiple ports on it. But what he meant was that it is fine with thousands of open states: computer A on port X mapped to, say, Amazon on port 443; computer A on port Y mapped to Google on port 443; computer B on port Z mapped to Gmail on port 993; and so on. There can be thousands of connections or states between each computer and each destination. Smaller SOHO routers (like Linksys, TRENDnet, etc.) don't always handle a large number of connections because they have very limited memory and CPU power. pfSense can scale to whatever needs you have, since you can upgrade the amount of memory or CPU power in your pfSense box.
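To illustrate the idea of states, here is a rough Python sketch of a state table keyed by protocol, source, and destination. It is a toy model, not how pfSense or pf is implemented; `FlowKey`, `StateTable`, the table size, and all addresses (private and documentation ranges) are hypothetical.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Identifies one tracked connection (one "state")."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class StateTable:
    """Toy state table with a fixed capacity, standing in for router memory."""

    def __init__(self, max_states):
        self.max_states = max_states
        self.states = {}  # FlowKey -> number of packets seen

    def track(self, key):
        """Count a packet for `key`; return False if a new state does not fit."""
        if key in self.states:
            self.states[key] += 1
            return True
        if len(self.states) >= self.max_states:
            return False  # table full: a small router starts failing here
        self.states[key] = 1
        return True


# Hypothetical flows; the table is deliberately tiny to show it filling up.
table = StateTable(max_states=3)
flows = [
    FlowKey("tcp", "192.168.1.10", 50001, "203.0.113.10", 443),   # computer A, site 1
    FlowKey("tcp", "192.168.1.10", 50002, "198.51.100.20", 443),  # computer A, site 2
    FlowKey("tcp", "192.168.1.11", 50003, "203.0.113.30", 993),   # computer B, mail
    FlowKey("tcp", "192.168.1.11", 50004, "198.51.100.40", 80),   # one flow too many
]
for flow in flows:
    accepted = table.track(flow)
    print(flow.dst_ip, flow.dst_port, "tracked" if accepted else "rejected: table full")
```

In this model the limit is memory in the router, so a box with more RAM simply gets a larger `max_states`.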
 
A "connection" or "state" is a loose agreement between 2 hosts. A NIC sees Ethernet frames and at most, it does IP/TCP/UDP checksum calculation.

One "connection" at 1000pps is more work for the NIC than 100 "connections" at 1pps each.

Using the concept of TCP connections in the context of NICs is just completely pointless.
 
I was referencing a dual-port NIC in the context of how the router performs once installed, the same way one might reference tires in the context of how the car performs.

Did you actually have anything to contribute to this thread, or do you just get off on arguing out-of-context technicalities?
 
Seeing how this forum is frequented by not-so-experts, it helps to use a common and correct language.

"NIC xyz is good for thousands of connections" is just plain nonsense. There's no way to talk around it. Try going to a car club and saying "I installed new tires, now my motor is stronger".

The reason why so many clueless people are running around is that the experts - or so-called experts - are using a dumbed down language to accommodate the newbies, instead of using the correct language of their field so the newbies can actually learn from it.

Dumbing down pisses me off. Now I'm finished.
 
"NIC xyz is good for thousands of connections" is just plain nonsense. There's no way to talk around it.

Given what I actually posted, and the context of the conversation during which I posted it, I think it was pretty obvious exactly what I meant. Large numbers of simultaneous connections, such as using BitTorrent with a large number of seeders and/or peers, are a frequent cause of router problems. In the case of pfSense and other home-built routers, switching to more robust network cards often serves to diminish or completely cure the problem.
 
It's most likely fine. ICMP is a really bad way of determining packet loss, as any sane router will ignore such requests under load.
//Danne
 
Dude... Why skimp on the NICs? Get something like this, or this.

I was using what I had lying around; I haven't put a dime into this router yet (well, obviously besides the power to run it).

How about newer Intel PCIe half-height cards? Do they all work out of the box with pfSense? And as far as installing new cards after pfSense is set up, do you just power the router down, install the cards, boot it back up, log in, and configure the new interfaces? Or is some extra work needed for the device to be found / drivers installed / etc.?

How does your internet work during those time periods? Since the loss is on your WAN port, that most likely means your connection is having issues during those times.

What pfSense is doing is pinging the gateway device from Time Warner. It gives you an idea of the quality of the connection back to your ISP. You probably want to look at your modem's status page and check the signal levels: http://192.168.100.1

Downstream power should be within ±7 dBmV and upstream power needs to be below 51 dBmV on each channel. Downstream signal-to-noise ratio should be above 32 dB.

The onboard NIC in that machine is just fine and you shouldn't have any issues using it for your WAN. This particular graph has nothing to do with your LAN NIC, as the graph is solely from the pfSense box to your ISP. If you look at the graph for LAN and see packet loss then you definitely have an issue with that particular NIC, but it most likely works fine as well.

When this packet loss is happening, the internet will feel extremely slow and sometimes pages will refuse to load.

8 downstream channels, all of them at -3 dBmV. SNR is 37 dB.
4 upstream channels, all between 46 and 48 dBmV. Everything looks to be in order there.
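As a sanity check, these readings can be compared against the rule-of-thumb limits quoted above (downstream power within ±7 dBmV, upstream power below 51 dBmV, downstream SNR above 32 dB). A minimal Python sketch of that comparison follows; the function name `check_signal` is hypothetical, and the individual upstream values are assumed examples inside the reported 46 to 48 dBmV range, since the exact per-channel numbers were not posted.

```python
def check_signal(down_power_dbmv, down_snr_db, up_power_dbmv):
    """Return a list of problems; an empty list means every channel is in range.

    Limits are the rule-of-thumb figures from this thread, not values taken
    from a specification.
    """
    problems = []
    for ch, power in enumerate(down_power_dbmv, start=1):
        if abs(power) > 7:
            problems.append(f"downstream ch {ch}: power {power} dBmV outside +/-7")
    for ch, snr in enumerate(down_snr_db, start=1):
        if snr <= 32:
            problems.append(f"downstream ch {ch}: SNR {snr} dB not above 32")
    for ch, power in enumerate(up_power_dbmv, start=1):
        if power >= 51:
            problems.append(f"upstream ch {ch}: power {power} dBmV not below 51")
    return problems


down_power = [-3.0] * 8               # reported: all 8 channels at -3 dBmV
down_snr = [37.0] * 8                 # reported: SNR 37 dB
up_power = [46.0, 47.0, 47.0, 48.0]   # assumed values within the reported range

print(check_signal(down_power, down_snr, up_power))  # [] means nothing out of range
```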

The "Quality" RRD graphs are only available for WAN.

It's most likely fine. ICMP is a really bad way of determining packet loss, as any sane router will ignore such requests under load.
//Danne

No load is being applied (besides completely normal / average web browsing) ... no crazy torrent downloads / external file transfers / etc.
 
uUedO4P.png


Packet loss has gotten worse. Went from 5% to 20% and then 30%. I finally swapped the LAN and WAN interfaces, rebooted the router, and then finally rebooted the modem. The packet loss stopped once I did all that. Sure enough, though... a while later the packet loss started right back up, first at 30% and now at 40%.

The WAN interface was the onboard Intel, but now it's on the Linksys card. When packet loss is this high, the internet will feel slow and sometimes pages refuse to load on the first or second try. Oddly enough, if I do "ping google.com -t" and let it run for minutes... not a single time has it had an issue pinging. Have tried other domains too with the same effect.

I just bought parts for an entirely new router (plus an Intel dual gigabit PCIe NIC) so we'll see if that helps at all. Looking at the modem's webpage and reading more about safe power levels, 50 dBmV looks to be the highest safe/in-range power level, and currently 1 of my 4 channels is right there at 50, the others being 49/48/47 (though it seems some ISPs have different ranges, with a maximum anywhere from 52 to 57).

Another worrisome stat on the modem is the codewords:
HTzPPyd.png


I never looked at these numbers before any of these issues, so I don't know if these uncorrectable codewords are just normal amounts or abnormal. I'm also going to try putting a fan on the modem to see if that helps. It's not in a really hot place or anything... maybe it's just failing?
 
Do you actually have any connectivity issues apart from looking at the graph?
//Danne
 
The uncorrected codewords aren't all that high, but it largely depends on how long the modem has been on. I can hit 5k or more if it's been on for a few days, but I know that my connection is also having major issues right now. If you have 4 upstreams then the power level tops out at around 51 dBmV or maybe a tad less. The more upstreams, the lower the maximum power level. So 1 upstream can be near the 57 dBmV mark, but 2 is around 54 dBmV, 3 is somewhere around 51 dBmV, and 4 might be a tad under that.
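Put as arithmetic, the headroom on each upstream channel is the approximate ceiling for that channel count minus the current transmit level. The Python sketch below uses the rough ceilings from this post and the 50/49/48/47 dBmV readings reported earlier in the thread. The table and the function name `upstream_headroom` are illustrative assumptions, not values from the DOCSIS specification, and 51 is used as an upper bound for 4 channels since the real figure may be slightly lower.

```python
# Approximate ceilings in dBmV by number of bonded upstream channels,
# taken from the forum post above (not from a specification).
APPROX_MAX_UPSTREAM_DBMV = {1: 57.0, 2: 54.0, 3: 51.0, 4: 51.0}


def upstream_headroom(levels_dbmv):
    """Return per-channel headroom in dB below the approximate ceiling."""
    count = len(levels_dbmv)
    if count not in APPROX_MAX_UPSTREAM_DBMV:
        raise ValueError(f"no approximate ceiling known for {count} channels")
    ceiling = APPROX_MAX_UPSTREAM_DBMV[count]
    return [ceiling - level for level in levels_dbmv]


levels = [50.0, 49.0, 48.0, 47.0]  # readings reported earlier in the thread
print(upstream_headroom(levels))   # [1.0, 2.0, 3.0, 4.0]
```

With only about 1 dB to spare on the hottest channel, a small change on the line could push the modem to its limit, which fits the suggestion to have the ISP check the connection.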

I don't know how bad your ISP's customer service is, but it might be worth seeing if someone can just look at it. IMO all of the DOCSIS 3.0 setups are very touchy with bonded upstreams and have no wiggle room for poor signals. If it's intermittent it might be harder to track, but I'd still guess it's a bad splitter / drop / poor connection, or something on the line. The ISP will likely blame the modem first, even though once again I bet a modem swap isn't going to magically fix the issue.

I don't think it has anything to do with your firewall. You could always try hooking something else to the line and see if it experiences the same issues over time. Most likely it will.
 
Rebooted the modem earlier today and so far no issues as far as packet loss etc. I also put a 120mm fan on it; maybe that can help.

I bought and own the SB6141 modem to avoid paying $/mo to TWC to rent one of their garbage ones (typically some Ubee POS). Bought it over a year ago and really haven't noticed any problems.
 