Network craziness

ashman

OK, I am not sure what the hell is going on, but something is wrong with either one of my servers or something else on my network, and I would like some advice/tips.

I recently bought a Cisco SG200 26-port gigabit switch to use at home to replace an old 16-port gigabit Dell switch I had been using for years. I have about 12 drops throughout my house, most of which are in the basement, where my office and my equipment are. Everything was fine when I replaced the switch. Upstairs I have an 8-port D-Link gigabit switch behind my TV, with an Xbox, a Boxee Box and a few other things plugged into it. In my office I have my machine and three others, one of which is a physical server running Server 2012. It has 16 drives internally and acts as my backup server for two of my 5 NASes; it runs a Windows program called Second Copy and copies media to two RAID 5 partitions. In the past I've had no issues with this server, it has run fine, but recently I replaced two Dell Perc 5i RAID controllers with one LSI 16-port SATA RAID controller and just got it up and running again. For some reason this server is now having network issues, and I am not sure why. It has an Asus P5Q Premium board, a Core 2 Duo 8400 and 8GB of RAM, and it has worked well for years. Now, though, when I run a Second Copy job, it runs sometimes for 5 minutes, other times for longer, but inevitably it ends with the message that the network drive it's copying from is no longer available - this would be one of my two NASes. I've tried mapping the drive via hostname and IP; it makes no difference. I have my own Windows domain, and this server is a member server. There is nothing in the event log that shows the network connection is dropping. I made some physical changes to my network yesterday while troubleshooting this problem: I removed the 26-port Cisco switch and plugged everything into a backup Dell 16-port gigabit switch (unmanaged), and three of my NASes are connected to an 8-port Cisco SG200 switch (managed), where I am using LAG to bind the two network ports from each NAS.
Also, the backup server and another machine are now connected to an 8-port D-Link gigabit switch (unmanaged), which is then plugged into a network drop. Before these changes, when I pinged the NAS from the backup server, the pings were all over the place: there was latency and dropped packets everywhere. I'd get maybe 3 or 4 replies, then two or three timeouts, and repeat. Since the physical changes I can ping the NAS from the backup server without any timeouts, but the Second Copy jobs are still failing with the error "The specified network name is no longer available". Also, when I RDP into the backup server from my Mac (Hackintosh), it sometimes freezes or is slow to respond. I did have a static IP on the backup server but am now using DHCP; it doesn't seem to make a difference. The backup server's motherboard, the Asus P5Q Premium, has four onboard network interfaces, and I am using a different one now than before, which still doesn't seem to make much of a difference. I also tried an Intel Pro desktop network card and it was also dropping packets wildly, but that was before I made the physical changes. I am not using jumbo frames or VLANs; it's pretty straightforward. I have replaced the network cables at both ends as well.

Sorry if I am rambling, I hope it's coherent.
 
everything on the same subnet? what is routing everything? am I correct in understanding that you have 3-4 switches all plugged in together?

sounds like you need to power down everything and bring them up one at a time... starting with the master switch that all the other switches are plugged into.
 
+1 to what notarat said. Learn to at least split your thoughts into segments (paragraphs). :)

jjandrob has a good thought. If you don't want to power down everything first, at least disconnect the links to the other switches and re-test.

I've seen similar problems when someone creates a broadcast storm (via Ethernet loop) on a 100 Mbit (unmanaged) switch hung off a gigabit network. Since the 100 Mbit uplink isn't enough to saturate the gigabit-capable uplink, the gigabit connections on the other switches continue to mostly work, but start behaving oddly. Not saying that's your problem, but just using it to illustrate that you might need to think "outside the box" to troubleshoot.

Have you tried a packet capture on a random port, to look for oddities? How about a packet capture on the server during the file transfer? Any oddities at, or just before, the point of failure?
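If a full capture is too much to start with, a timestamped probe log can be a cheap first step, since the timestamps can later be lined up against a capture. Below is a minimal Python sketch, assuming Python is available on the backup server and that the NAS accepts TCP connections on the standard SMB port 445. The function names `tcp_probe`, `summarize` and `monitor` are made up for illustration, and "nas1" in the usage note is a placeholder hostname.

```python
import socket
import time
from datetime import datetime


def tcp_probe(host, port=445, timeout=1.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def summarize(samples):
    """Summarize probe results (milliseconds, or None for a failure)."""
    total = len(samples)
    failures = sum(1 for s in samples if s is None)
    longest = run = 0
    for s in samples:
        # Track the longest run of consecutive failures.
        run = run + 1 if s is None else 0
        longest = max(longest, run)
    ok = [s for s in samples if s is not None]
    return {
        "sent": total,
        "failed": failures,
        "loss_pct": round(100.0 * failures / total, 1) if total else 0.0,
        "longest_outage": longest,
        "max_ms": max(ok) if ok else None,
    }


def monitor(host, count=60, interval=1.0):
    """Probe host `count` times, print one timestamped line per probe."""
    samples = []
    for _ in range(count):
        result = tcp_probe(host)
        stamp = datetime.now().isoformat(timespec="seconds")
        print(stamp, "FAIL" if result is None else f"{result:.1f} ms")
        samples.append(result)
        time.sleep(interval)
    return summarize(samples)
```

Calling something like `monitor("nas1", count=600)` while a copy job runs would show whether the failures cluster around the moment the job dies. A probe like this only tests TCP reachability, so it complements a packet capture and does not replace one.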
 
Sorry, that was poorly written and expressed, I can do better than that.

My network looks like this:

Main switch at the moment is a 16-port Dell unmanaged gigabit switch. Connected to it are an 8-port unmanaged D-Link gigabit switch, an unmanaged 5-port Netgear switch and a managed 8-port Cisco switch. I do have a managed 26-port Cisco SG200 switch but that is not in play at the moment.

Three of my NASes are connected to the 8-port managed Cisco switch and I am using LAG, combining the two Ethernet ports from each NAS.

I have not done any packet sniffing or powered down switches.

The problem seems to be with my one backup server and the fact that it's dropping packets for some reason. When I run a Second Copy job on this server to copy data from one of my NASes to this server, it invariably ends abruptly with 'the network name is no longer available' error in Second Copy. When I RDP into the server, the drive mapping is there and connected. Very odd.
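One way to narrow this down would be to take Second Copy out of the picture and copy a single large file with a small script that records exactly when and where the transfer dies. The following is only a rough Python sketch under that assumption. `copy_with_log` is a hypothetical helper, not part of Second Copy or Windows, and the paths in the usage note are placeholders.

```python
import time
from datetime import datetime


def copy_with_log(src, dst, chunk_size=4 * 1024 * 1024, log=print):
    """Copy src to dst in chunks; return (bytes_copied, error_or_None)."""
    copied = 0
    started = time.monotonic()
    try:
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            while True:
                chunk = fin.read(chunk_size)
                if not chunk:
                    break
                fout.write(chunk)
                copied += len(chunk)
    except OSError as exc:
        # A dropped share surfaces here as an OSError; log the time,
        # the byte offset reached and the elapsed seconds.
        stamp = datetime.now().isoformat(timespec="seconds")
        elapsed = time.monotonic() - started
        log(f"{stamp} failed after {copied} bytes, {elapsed:.1f}s: {exc!r}")
        return copied, exc
    return copied, None
```

For example, `copy_with_log(r"\\nas1\media\big.mkv", r"D:\test\big.mkv")` would print a timestamp and byte offset if the share drops. If the same failure appears here, the backup software is probably not the cause, and the timestamp gives something to search for in a capture or in the switch logs.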
 
my advice still stands.

power everything off and bring the switches up first. then your computers. See if that resolves your issue.
 
two things I have learned about networking gear:

1. Hubs/switches tend to fail one port at a time.

2. A failing network interface will tend to lose/scramble more and more packets over time, until it becomes unusable.

I might try disabling the LAG and using the NAS first on one port, then the other.

I might also try limiting the NAS in question to 100 Mbit.
 