
Isolating network performance issues

ThatITGuy

Hi all,

I want to learn more about which tools to use for isolating performance issues across a network. My current problem is that performance takes a hit when my kids are streaming on their tablets, but I would also like to identify where the slowdowns are in my network and spot potential issues (a cable going bad, etc.) so I can be a bit more proactive. I know I can ping from one device to another and see some slowdown, but that doesn't always show the issue. I have also used tracert in the past, but that doesn't always show me anything either. What other tools/commands do you use to help isolate issues? If one device on the network is causing all of the load, how do you find which one it is (other than going to each one, unplugging it from the network, and seeing whether the issue is resolved, which may not work well when transient bursts are causing the issue)?
I currently have gig fiber feeding into the AT&T modem/router, which feeds a switch. I have an AP connected to the switch, along with my computers and other devices.
 
Performance on your network, or your Internet performance and bandwidth?
 
Internal network would be the priority. I can ping/tracert and find most Internet connectivity problems.
 

Ping is a valuable tool, especially when you start to play with the df (don't fragment) and size options. A link with mismatched speed/duplex might look good with a default ping, but when you increase the payload to 1000 bytes you'll see loss. Traceroute is less useful on most home networks, whereas netstat -rn, route print, or arp -a are very useful.

As far as what equipment is consuming bandwidth, this is where infrastructure equipment starts to differentiate itself. Enterprise or business class switches, firewalls, or access points will flat out expose this information, usually from the mgmt UI. I'll use Fortinet as an example since that is what I have here. I can log in and see all sessions, sources and destinations, including bandwidth used in real time. I can see historic and real-time bandwidth information on the individual switch ports. All of this from the web UI. If I need more granular information I can run reports on the log data or set up packet captures based on triggers. With much consumer gear you'll never know any of that and will be stuck with the cable game. That is not to say ALL consumer gear is that way, but most of it is. This is why you must do your research up front with a defined set of criteria.
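For anyone who wants to script the don't-fragment/size ping described above, here is a minimal sketch in Python. It only builds the command line; the flag sets are the usual ones for Windows, macOS, and Linux (iputils) ping, and the address 192.168.1.1 is just a placeholder, not something from this thread.

```python
import platform


def build_ping_command(host, payload_bytes=1000, count=10, system=None):
    """Return an argv list for a don't-fragment ping with a fixed payload size."""
    system = system or platform.system()
    if system == "Windows":
        # -f sets don't-fragment, -l sets the payload size, -n sets the count
        return ["ping", "-f", "-l", str(payload_bytes), "-n", str(count), host]
    if system == "Darwin":
        # macOS: -D sets don't-fragment, -s sets the payload size, -c sets the count
        return ["ping", "-D", "-s", str(payload_bytes), "-c", str(count), host]
    # Linux (iputils): -M do prohibits fragmentation
    return ["ping", "-M", "do", "-s", str(payload_bytes), "-c", str(count), host]


if __name__ == "__main__":
    # Placeholder target; substitute a device on your own LAN.
    print(" ".join(build_ping_command("192.168.1.1")))
```

Pass the returned list to subprocess.run if you want to actually send the pings, then compare the loss figure against a default-size ping to the same host.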

I would also suggest that a local iperf server is great to have. It takes all of 2 min to load up on a Linux VM or box.
 
I'd check the logs in your router and also your AP (if it has a GUI/login). Both of those may show which device is consuming bandwidth. The post by Nicklebon about different ping options is also very valuable.
 
The long and short of it: if a wired device is working at all, then it's probably not an internal problem. Even a 4K HDR stream isn't going to saturate a 100 Mbps link, so outside of you having GigE and someone downloading from Steam, under normal circumstances the internal network will basically never be the choke point. Ping doesn't have anything to do with throughput (other than that it will get worse when the network is saturated), so it's not particularly relevant for testing bandwidth.
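To put rough numbers on that, here is a back-of-the-envelope sketch. The per-stream bitrates are assumptions for illustration (25 Mbps is a commonly quoted ballpark for one 4K stream and 5 Mbps for HD; neither figure comes from this thread), so plug in your own.

```python
def link_utilization(stream_mbps, link_mbps):
    """Fraction of a link consumed by a set of concurrent streams."""
    return sum(stream_mbps) / link_mbps


if __name__ == "__main__":
    # Assumed bitrates: one 4K stream plus three HD tablet streams.
    streams = [25, 5, 5, 5]
    for link in (100, 1000):
        pct = link_utilization(streams, link) * 100
        print(f"{link} Mbps link: {pct:.0f}% used")
```

Under those assumptions the streams add up to 40 Mbps, well short of even a 100 Mbps wired port.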

The easiest bandwidth test? Copy a Linux ISO between two PCs over a network share. SMB2/3 is plenty fast enough to saturate 1 gig links, and with SSDs both ends can handle it just fine. It's either working or it's not, and if it's not, then it's an issue with the device itself and not the network*. (*About 99% of the time. Sure, 1% of the time the cable could be running at only 100 Mbit.)
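If you'd rather have a number than eyeball the copy dialog, a timed copy can be sketched like this. The paths in the usage comment are hypothetical, and keep in mind the result covers disks, SMB, and OS caching as well as the wire, so treat it as a whole-system figure.

```python
import os
import shutil
import time


def timed_copy_mbps(src, dst):
    """Copy src to dst and return the achieved rate in megabits per second."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (size_bytes * 8) / (elapsed * 1_000_000)


# Example usage (hypothetical paths; dst would be on the mounted network share):
#   rate = timed_copy_mbps("/home/me/linux.iso", "/mnt/share/linux.iso")
#   print(f"{rate:.0f} Mbps")
```

On a healthy gigabit path you would expect something in the high hundreds of Mbps; a result pinned just under 100 suggests a link that negotiated at 100 Mbit.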

The IT professional answer? Spend a bunch of time trying to set up and configure iperf so you can do full bandwidth testing in memory. It's kind of a pain to get working, and probably not worth the hassle for a home environment. It's still only good for theoretical network testing, which, like I said, is probably the last reason why something isn't working correctly. If you're in an actual enterprise environment, then you're just going to look at the bandwidth graphs you should be generating using SNMP (or at all of the email alerts you're receiving because a link crossed some threshold).
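For the curious, those SNMP bandwidth graphs boil down to polling an interface octet counter (for example ifInOctets or ifHCInOctets from IF-MIB) twice and dividing the difference by the polling interval. A minimal sketch of that arithmetic, with the polling itself left out and the sample values invented for illustration:

```python
def counter_rate_bps(prev_octets, cur_octets, interval_s, counter_bits=64):
    """Bits per second between two octet-counter readings, allowing one wrap."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction handles a counter that rolled over once between polls.
    delta_octets = (cur_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


if __name__ == "__main__":
    # Invented readings taken 300 seconds apart.
    print(counter_rate_bps(1_000_000, 38_500_000, 300))
```

The counter_bits parameter is there because the older 32-bit counters wrap quickly on fast links, which is why 64-bit counters are usually preferred when the device offers them.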

What you're really asking for is a managed switch and/or router. You are basically asking why people use managed switches and things like pfSense. In pfSense you log in and click Status > Traffic Graph. Switch to your LAN interface and see which IPs pop up in the table on the right side of the screen. Go look up your DHCP leases, and then move that PC to its own VLAN and throttle the heck out of it! J/K
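The same "which IP is at the top of the table, and whose is it" lookup can be done by hand if your gear can export per-flow byte counts. This is only a sketch: the (source IP, bytes) tuple format, the addresses, and the lease names are all made up for illustration and do not match any particular router's export.

```python
from collections import Counter


def top_talkers(flows, leases=None, n=3):
    """Sum bytes per source IP and label the top n using a DHCP lease map."""
    totals = Counter()
    for src_ip, byte_count in flows:
        totals[src_ip] += byte_count
    leases = leases or {}
    # Fall back to the bare IP when there is no lease entry for it.
    return [(leases.get(ip, ip), total) for ip, total in totals.most_common(n)]


if __name__ == "__main__":
    # Hypothetical flow records and lease table.
    flows = [("192.168.1.20", 500), ("192.168.1.31", 9000),
             ("192.168.1.20", 700), ("192.168.1.31", 1000)]
    leases = {"192.168.1.31": "kids-tablet"}
    print(top_talkers(flows, leases))
```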

The problem you're having is that the ISP's device is very basic and limited in functionality. I'd guess more of the exotic routers these days can probably give you some type of traffic graph, or tools to help track down a client. Probably one of the other easiest methods is that the "lights don't lie". Glance at your switch lights and look for the one that is blinking faster than the others. Once again, low-end SOHO devices might not do a great job representing this, but higher-end devices do. I've never seen a Cisco switch that doesn't make it painfully obvious which port has all of the activity on it.

If you're talking about wireless issues, then that's an entirely different animal. Apparently the new version of inSSIDer can even show spectrum utilization for a certain channel, so it would be a great resource to help troubleshoot whether a particular channel is being saturated. Wireless is far more complex because the bandwidth is dynamic. Unlike a switch, where the port is either 100 Mbps or 1 gig, the same wireless client could be getting anywhere from like 20 to 600 Mbps depending upon its signal strength, neighboring devices, interference, etc. So while you can easily predict whether a port on a switch has enough bandwidth, you need to use tools like inSSIDer to determine if Wi-Fi is the issue. You could use iperf to conduct bandwidth tests in this case, and it would be helpful to see if the result stays consistent or if it's all over the board.
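One rough way to put a number on "consistent versus all over the board" is the spread of the per-interval results relative to their average. A small sketch, where the sample lists are invented and the idea of using coefficient of variation is my own suggestion rather than anything iperf reports itself:

```python
import statistics


def throughput_spread(samples_mbps):
    """Return (mean, coefficient of variation) for per-interval throughput samples."""
    mean = statistics.mean(samples_mbps)
    stdev = statistics.pstdev(samples_mbps)
    return mean, (stdev / mean if mean else 0.0)


if __name__ == "__main__":
    # Invented samples: a steady wired run and an erratic wireless run.
    for label, samples in (("wired", [940, 938, 942, 941]),
                           ("wireless", [300, 40, 220, 15])):
        mean, cv = throughput_spread(samples)
        print(f"{label}: mean {mean:.0f} Mbps, variation {cv:.2f}")
```

A value near zero means the link is steady; a large value means the rate is swinging, which on Wi-Fi usually points at airtime contention or interference rather than raw signal strength.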


Ping is a valuable tool especially so when you start to play with df and size options. A link that has mismatched speed/duplex might look good with a default ping but when you increase the payload to 1000 bytes you'll see loss.

This is definitely a thing in an enterprise environment, especially if you're using jumbo frames, VLANs, VPNs, etc. You can also manually force a speed/duplex. In a SOHO environment it's unlikely you'll find anything other than a 1500 MTU and links autoconfigured at 100 Mbit full duplex or gig full duplex, so it's not too likely you'll be running into those kinds of issues. Basically, if you have unmanaged switches then no one can actually go in there and fiddle with the settings, so it's either going to work or it's not. I don't think I've ever seen a cabling issue cause half duplex on an unmanaged device. (Not saying there isn't a 1% chance of it, but if you're seeing half duplex it's probably because someone messed with a config.) If you're using managed devices then this would definitely apply.
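As a side note on the 1500 MTU: the largest ping payload that fits in one unfragmented packet is the MTU minus the headers. A quick sketch of that arithmetic, assuming plain IPv4 with no IP options (20-byte IP header, 8-byte ICMP header):

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu=1500):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


if __name__ == "__main__":
    print(max_ping_payload())      # standard Ethernet MTU
    print(max_ping_payload(9000))  # a common jumbo-frame MTU
```

So with the don't-fragment bit set, a 1472-byte payload should pass on a standard 1500 MTU path and 1473 should fail; VPNs and tunnels lower that ceiling.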
 

The issue with copying ISO images is that the network may not be the limiting factor. For example, I have 2 laptops that, because of their hard drives and software configuration, will never come close to saturating a gig link with a file copy, but have no issues doing so with iperf. Certainly copying an image is an overall system test, and when used in conjunction with something like iperf you can narrow the problem down. Both pieces are needed to do this.

As far as iperf being a pain to configure ... you're doing something wrong or overthinking it. It takes less than 2 min to run "sudo apt install iperf3" to pull down and install it on a Linux VM or a Pi. Running it consists of typing iperf3 -s. Perhaps your definition of "kind of a pain to get working" differs from most?

The speed/duplex issue isn't a cabling issue. It is generally the result of a user who thinks they should be hard coding it on their end for performance reasons. The infrastructure isn't relevant in these cases, as both ends of the cable must match, meaning hard code both ends or auto both ends. Doing anything else screws up clocking, which results in a system that will "work" but have utterly abysmal performance.
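With iperf3 -s running on the server, the client side can be scripted too. This is a sketch that assumes iperf3 is installed on the client and that a default TCP test is being run; it uses the -c, -t, and -J (JSON output) options and reads the receiver-side rate from the report. The function names are my own.

```python
import json
import subprocess


def parse_iperf3_mbps(report):
    """Extract receiver-side TCP throughput in Mbit/s from an iperf3 JSON report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1_000_000


def run_iperf3(server, seconds=10):
    """Run an iperf3 TCP client test against server and return Mbit/s received."""
    result = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    )
    return parse_iperf3_mbps(json.loads(result.stdout))


# Example usage (the address is a placeholder for wherever iperf3 -s is running):
#   print(f"{run_iperf3('192.168.1.50'):.0f} Mbps")
```

Running this from a few different clients, wired and wireless, and comparing it with a file copy between the same two machines is one way to separate network limits from disk or software limits.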
 
I think I may just set up iperf. I think I have a spare Raspberry Pi I could stick it on, and it will be fun for learning.
As for my equipment, I should have mentioned that my switch is a Cisco SG350, so I do have managed (Layer 3) switching, along with a management UI that does not have full Catalyst-level features but does have some advanced ones. I have actually not gone into it for traffic graphs, in part because I have put my network tinkering on hold until I can get my own router working, one which can handle VLANs. I understand that wireless connectivity will also play a part in this for wireless devices. It was more a puzzle I wanted to understand how to solve, especially in cases where nicer equipment may not be in use and I may not even have access to the equipment. I have always enjoyed playing around with networking, but my actual job does not require it, other than having enough knowledge to challenge infrastructure teams when they try to pass blame between server, LAN, and "Web" teams as to why something isn't working.
I am currently using the AT&T modem/router as the DHCP server because I cannot get it to play nice with their cable box when I use my own router. Previously, I did have multiple networks (and DHCP servers) created on the router, using VLANs to determine which network a device was assigned to, with one each for wired devices, my wireless devices, guest wireless devices, and IoT (with IoT and guest having cross-network packets dropped). Once I get my own router back up and running, I would like to play around with tracking per-port utilization as well as per-client utilization.
I think all of this just spawned from my noticing some streaming performance problems and wondering how I could prove to myself that the cause was outside my network, or how to pinpoint the culprit if it was within my network (especially since it was more noticeable when I had kids on their devices). At any given time there could be 1 NAS server, 2 wired PCs, 1 wired Xbox One, a wireless laptop, 3 tablets, 2 phones, a Chromecast, and an Amazon Tap accessing my network. My first instinct was to check the wireless signal, but it was good (and should be, since the AP is on top of a tall bookcase directly beneath where the Chromecast is, with only a rather thin floor between). It was then that I realized that beyond some basic ping/tracert, I do not have enough understanding of what to do next, and not just in this specific instance, but in being able to "quickly" pull up some proof of where the issue lies. I.e. how do smarter people do this?
 


If you're not seeing full throughput on your devices then you could troubleshoot further, but outside of 5400 rpm spinners you're probably not running into a device that can't handle a gig connection. The problem with iperf is basically the assumptions you made. Just run "sudo apt install" on a nonexistent VM on a nonexistent VM host. Or go out and buy a Raspberry Pi, and spend an hour figuring out how to load a Linux image onto an SD card. Thankfully YouTube is great and will help the average person who isn't already running a VM stack.



Watching this 5 minute tutorial will show you how to get it working on Windows, which is what the vast majority will have handy. It shows you that you need to set up the listening daemon, but then you obviously need to put it on the client side too, and type in another specific command, "iperf -c ip address", to get it to connect. If you were to just type "iperf /?" you'd be greeted with a huge list of commands. You're just supposed to know that you only need to type "iperf -s" on the daemon side, and that the client side is pre-programmed to know what default port to look for, so you don't have to specify a port. It's also crucial that if the "allow access" box pops up you click allow, or the firewall won't actually let the server side listen on the port it wants. All of that should be trivial to the average person who never touched iperf before, right?
 
I.e. how do smarter people do this?

It's not really that people are smarter about any of it. It's just picking somewhere to start and keeping in mind the "KISS" principle. If you're seeing issues, start small. Turn off devices one at a time until the problem goes away. That alone should give you an idea of what caused it. Then what you need to do is start mapping things out, so that potential issues may show themselves. You can run iperf to your heart's content, but you first need a game plan for what you're trying to accomplish. Putting iperf on a Raspberry Pi, then running bandwidth tests between the Pi and another host that's hardwired on the same VLAN on the same switch, probably isn't going to tell you much if your actual use case is streaming Netflix to an iPad. You need to understand where the traffic is flowing to and from first, then try to replicate that if you're going to succeed. Likewise, testing to a Pi that is on another VLAN when the two devices having issues with each other are on the same VLAN isn't really a fair comparison. You might see performance issues with routing, but that's not the cause of your troubles.

If wireless is involved then just assume it's a wireless problem. The first thing to do with a wireless device is test it wired if possible; if you can't, wireless is probably where you want to start looking. Having great signal doesn't really mean much when it comes to Wi-Fi because it's a shared medium. Every device connected to the wireless network is sharing that airtime. By contrast, if your switch has 28 gig ports, it can usually handle every port at full bandwidth.

I think the big thing to differentiate as well is the hardware itself. On the forums here I generally assume someone is using SOHO equipment with unmanaged devices unless they say otherwise. Most people on here simply don't have the knowledge to go around and mess with things and cause the more obscure issues, even if they did have a managed device. If you're troubleshooting an actual SMB or enterprise installation, then all bets are off. Assuming the person who configured it knew what they were doing never works out, so you need to start simple, but then you might need to dig a lot deeper into the issue. Because there are more things to tweak, there are far more things to go wrong. Once again, asking questions that give you an idea of someone's knowledge will usually help narrow down where to look. If you ask them a very technical question and they don't really know the answer, then they probably didn't know how to actually go in and configure what you were talking about.

As for helping improve your knowledge, you are basically on the right track. Simply trying to reach a goal and running into roadblocks on your home equipment will help you learn things you don't know. Probably the most important part is that you first need to know what's "normal". You may or may not be able to just pick out a solution in 30 seconds. In some cases it could take hours of troubleshooting to figure something out. But the more you pay attention to your network, the more you know what it should look like. So it's easier to spot when something isn't right, because it's different than it was before.

So with that said, document what's actually on when you're seeing the issues. Is it one Xbox and an iPad? Are they using Netflix? Is your Xbox downloading updates? Do you have any Windows PCs on? What time of day is it? Does the problem only occur at night? All of those are potential clues to where the problem might be.
 