IPsec, a little help perhaps.

xphil3

[H]ard|Gawd
Joined
Nov 11, 2005
Messages
1,212
So I'm stumped, and I thought maybe I could get some other insight into my little problem. Here is the scenario:

Two broadband sites (15 Mbit+ down, 1.5 Mbit+ up each) are connected via a LAN-to-LAN IPsec tunnel. Since I have around 5 different networks behind each of these routers, I'm running a routing protocol between the routers (OSPF, tuned down quite a bit). Since I'm running a routing protocol, I must also run GRE tunnels between the routers, so I'm running GRE over IPsec.

The GRE tunnels are configured as follows:

interface Tunnel101
ip address 192.168.254.1 255.255.255.252
ip mtu 1400
ip ospf authentication message-digest
ip ospf authentication-key
ip ospf hello-interval 120
ip ospf 50 area 501
keepalive 10 3
tunnel source FastEthernet0/0
tunnel destination x.x.x.x
tunnel path-mtu-discovery
tunnel bandwidth transmit 15000
end

As you can see, the MTU has been changed to 1400 so I'm not fragmenting anything when data is put into the tunnel (I was having problems with a few applications). Both sides are configured the same, just with the source and destination IPs/interfaces reversed. Hopefully that description is good. The problem is this.

When I start an SSH connection from location A to a Linux box at location B, it's fine: I can log in and run any command I like. But when I run something like top or ps (anything with large or continuous output), the entire SSH session locks up and I have to close it and reconnect to the host. This connection goes over the GRE tunnel and is encrypted with IPsec. When I connect to the same server without going over the tunnel (directly to location A's public IP address, with ports forwarded on its router), I can run top fine, run through all my processes, run yum, etc., so I know the problem isn't the server.

The weird part is, I have run music, movies, and remote desktop over the tunnel without any problems, and still can. The only problem I'm having is with this SSH connection to this host through the tunnel! CPU utilization on the routers is around 5%, tops. I have also SSH'd into the far-end router's console through the tunnel and run a show tech without my SSH connection dying. I have no idea what the problem is.

Any thoughts/recommendations? I was thinking about upgrading IOS, though I'm on a very recent release as it is. Any help is greatly appreciated.
 
PMTU discovery needs to have ICMP functionality to work. If you statically configure MTU on both sides to 1400, why do you need this line? Not sure if this will fix anything, but I was curious.

From the conditions you described, it appears as if UDP traffic is fine, but TCP is having issues. Have you tried adjusting the TCP MSS on the interface? Also, you know this better than I would.. the latest IOS is rarely the best IOS :). Maybe you should search the bug tracker to see if anything pops up on your version.
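
To make the MSS suggestion concrete, here is a minimal sketch of clamping the MSS on the tunnel interface from the original post. It assumes the standard IOS `ip tcp adjust-mss` interface command and plain IPv4/TCP headers with no options; the value 1360 is just the 1400-byte tunnel MTU minus 40 bytes of headers, a starting point rather than a tested number.

```
interface Tunnel101
 ! 1400 (ip mtu) - 20 (IPv4 header) - 20 (TCP header) = 1360
 ! Rewrites the MSS option in TCP SYNs crossing this interface
 ip tcp adjust-mss 1360
```

The same clamp would go on the tunnel interface at the other end. UDP and ICMP traffic is not touched by this, since only TCP SYN segments carry an MSS option.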

Interesting issue... let me know when you resolve it.
 
[QUOTE]PMTU discovery needs to have ICMP functionality to work. If you statically configure MTU on both sides to 1400, why do you need this line? Not sure if this will fix anything, but I was curious.[/QUOTE]
Heh, nice catch. That was supposed to be removed. It was part of my testing to see if the MTU needed to be lower than 1400. ICMP is also functioning between the two interfaces, but it still made no difference.

[QUOTE]From the conditions you described, it appears as if UDP traffic is fine, but TCP is having issues. Have you tried adjusting the TCP MSS on the interface? Also, you know this better than I would.. the latest IOS is rarely the best IOS :). Maybe you should search the bug tracker to see if anything pops up on your version.[/QUOTE]
Yes, I have adjusted the TCP MSS on the interface to the calculated overhead, and a bit below (like the 1400). Same problem. I'm not sure TCP is the culprit here; I have successfully run webcams over the tunnel, as well as webcam streams over HTTP, all using TCP.

[QUOTE]Interesting issue... let me know when you resolve it.[/QUOTE]
I hope I do resolve it. What I want to do is throw OpenSSH on a different box and see if I have similar issues. Kind of weird how it works on the router that terminates the tunnel and not on the server behind that router. Still, both TCP connections are being established over the tunnel.
 
If I were you, I'd run a telnet daemon on that box temporarily and see if that's more stable. Run the same commands, etc. I realize that the payload will be entirely different, and probably smaller, but try to take smaller steps and build up from there. You could also try starting an SSH session from the source router to the destination server.

Although, from what QHalo said, and the fact that you can SSH to the terminating router, I wonder if the server is sending a packet larger than 1400 bytes that is getting rejected at the router. If you run tcpdump on the server, see if you get any ICMP type 3, code 4 messages (fragmentation needed but the don't fragment (DF) bit is set). If you do, this is the problem.

Oh, and one stupid question... you don't have any OSPF route flaps when this issue occurs, right?
 
Update:

Not to double post, but I just brought the TCP MSS down to 1320 (guess I have a bit more overhead than I thought) and it's working perfectly now. I didn't think it would have required that much offset. I'm glad I read your post, just2cool; it made me go back to it :D
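
For anyone landing here later, a sketch of what that final change looks like, assuming it was applied with the standard `ip tcp adjust-mss` command on the Tunnel101 interface at both ends (the exact lines were not posted):

```
interface Tunnel101
 ip mtu 1400
 ! 1320 MSS + 40 bytes of IPv4/TCP headers = 1360-byte packets,
 ! leaving 40 bytes of headroom under the 1400-byte tunnel MTU
 ip tcp adjust-mss 1320
```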

Thanks!
 