Please forgive the wall of text, but I think I should start at the beginning:
I'm volunteering for a non-profit that has a T1 WAN backhaul.
Normally, I can ping "www.google.com" and get ping times of minimum=15ms, maximum=150ms, average=18-50ms, depending on how many other users are on the web.
However, if downstream bandwidth (from the remote location to us) becomes saturated, ping times spike to 400-800ms.
I'll be the first to admit that I have loads of "book learning," so to speak, on networking and computer technology, but not much first-hand, practical experience, particularly with WAN technologies. I have been subscribed to the "Security Now" podcast from Leo Laporte's Twit.tv podcast network, and I remember Steve Gibson devoting a whole episode to the "Buffer Bloat" phenomenon.
If you are NOT familiar with this issue, I'll refer you to the Wikipedia page at http://en.wikipedia.org/wiki/Bufferbloat
TL;DR summary of buffer bloat:
Larger-capacity buffering by a router of packets that must go out over a low-bandwidth network link is a bad thing. It causes packets to wait a long time (relatively speaking) to go through, and this can cause issues with timing-sensitive traffic (online gaming like WoW, VoIP, web browsing) and other real-time communications.
IIRC, Steve explained that excessive buffering delays the packet drops that TCP relies on to detect congestion and slow down, so TCP's congestion control breaks down, which can further exacerbate the problem.
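To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes a full T1 payload rate of 1.536 Mbit/s (24 channels at 64 kbit/s) and that all of the extra latency is time spent waiting in a single first-in, first-out buffer. The function names are made up for illustration, and real equipment may behave differently.

```python
# Assumption: full (unchannelized) T1 payload rate, 24 x 64 kbit/s.
T1_PAYLOAD_BPS = 1_536_000


def queue_delay_ms(buffered_bytes, link_bps=T1_PAYLOAD_BPS):
    """Milliseconds the link needs to drain a FIFO queue of buffered_bytes."""
    return buffered_bytes * 8 / link_bps * 1000


def buffered_bytes_for_delay(delay_ms, link_bps=T1_PAYLOAD_BPS):
    """Bytes that must be sitting in the queue to add delay_ms of latency."""
    return delay_ms / 1000 * link_bps / 8


if __name__ == "__main__":
    # One full-size 1500-byte packet alone takes several ms on a T1.
    print(f"1500-byte packet: {queue_delay_ms(1500):.1f} ms")
    # The 400-800 ms spikes from the ping tests, expressed as queued data.
    for extra_ms in (400, 800):
        kb = buffered_bytes_for_delay(extra_ms) / 1000
        print(f"{extra_ms} ms of queueing ~ {kb:.1f} kB buffered")
```

Under those assumptions, 400-800ms of added delay would correspond to roughly 77-154 kB of packets queued ahead of each ping, which is only about 50 to 100 full-size packets.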
With all of that said, here is what I am asking:
1. Am I completely off base here? The discussion in the Security Now podcast was about home routers, where saturating the more limited upstream bandwidth reduced the available downstream bandwidth.
2. Am I correct that if saturating downstream bandwidth causes these issues, the problem is on the provider's side of the T1 line, because that is where the downstream bottleneck is?
3. Does anyone have a T1 or bonded T1 that they can test the way I have tested ours (saturating upstream or downstream bandwidth and comparing ping times to a reliable major website)?
4a. Does a CSU/DSU count as a "Hop" across a router, and would it normally have an IP address (or two since it's a router)?
4b. If I traceroute www.google.com over our T1, I see:
- Hop 1 = 192.168.1.1 = our NAT router
- Hop 2 = an IP adjacent to our public IP, in the same /30 network. Ping is 1ms regardless of T1 utilization.
- Hop 3 = the source of the trouble: 9ms ping when the T1 is idle, 400+ms when downstream is saturated.
Hop 1's IP is the LAN interface of our NAT router, reached over a 100Mbit Ethernet link from my computer.
Hop 2's IP is the 10/100baseTX interface of the CSU/DSU mounted on the wall near the NAT router.
Hop 3's IP is the T1 side of the CSU/DSU on the provider's end of the T1 line, and this is where the choke point is: the remote CSU/DSU has a 10/100 or faster link coming in, but only a T1 going out.
Am I interpreting the above correctly, and is this setup typical or standard for a T1 internet link?
5. Am I correct that, if properly managed, the ping time to google.com should be more like 50-100ms with occasional timeouts (maybe 10-25% loss, from replies being dropped by the remote CSU/DSU)?
6. From my phone (via Verizon 4G), I can ping hop 3 from the traceroute and get consistent, reasonable ping times of 50-60ms regardless of T1 utilization. Pinging hop 2 from my phone gives times consistent with what I'm getting at that moment when pinging google.com from the T1 line. Am I correct that this is further evidence that the issue is on the provider's end of the T1 line?
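For the /30 observation in question 4b, here is a small sketch of how one could confirm that two addresses share a /30 using Python's standard ipaddress module. The addresses below are placeholders from the 203.0.113.0/24 documentation range, not our real ones, and the helper names are made up for illustration.

```python
import ipaddress


def slash30_hosts(addr):
    """Return the two usable host addresses of the /30 containing addr."""
    net = ipaddress.ip_interface(f"{addr}/30").network
    return [str(host) for host in net.hosts()]


def same_slash30(addr_a, addr_b):
    """True if addr_b falls inside the /30 network that contains addr_a."""
    net_a = ipaddress.ip_interface(f"{addr_a}/30").network
    return ipaddress.ip_address(addr_b) in net_a


if __name__ == "__main__":
    public_ip = "203.0.113.1"  # placeholder for the NAT router's public IP
    hop2_ip = "203.0.113.2"    # placeholder for the hop 2 address
    print(slash30_hosts(public_ip))
    print(same_slash30(public_ip, hop2_ip))
```

A /30 has only two usable addresses, so if the check passes, hop 2 and our public IP would be the two ends of the same small point-to-point style subnet.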
Thanks for your help,
Tim D.