intermittent problem - can't access server

kezs

hello everyone.

I've been having an issue with my network and it's driving me mad.

I recently moved my office and kept the same network layout as before: a computer running windows server 2008 connects to a gigabit switch, which connects to a few computers and 3 other gigabit switches that connect to the rest of the computers.

well, for some reason one of the computers (dual xeon running xp 64 sp2) sometimes fails to connect to the server. it still sees the server, and the server can still connect to it and transfer files quite fast (70-85 mb/s). it's intermittent: the whole weekend it was working fine. now this afternoon the xeon stopped being able to access the server. restarting the xeon won't do anything, but restarting the server solves it.

I've got 13 computers in total, so I don't think it's got anything to do with number of connections or network usage (which sometimes peaks to 70% on the server but is generally between 0 and 5%). disabling firewall and AV on both ends won't do anything.

up until recently we were occasionally connecting a 3G modem to one of the computers, and I thought that was messing with the network. but it's been about a week now that we've stopped doing that (they finally installed dsl :) ).

so, any ideas? I have absolutely no clue anymore.

thanks in advance.
 
fails to connect how?
NetBIOS name?
IP?
FQDN?

can you ping it, just not browse it via UNC path?

or is it an application that connects to the server that's failing to connect?

what kind of network setup? AD? Workgroup?
 
can't access it through windows explorer (nor can any app). UNC/mapped network drives try to connect for a while and then I get the message "The specified network name is no longer available." I can still see it as a workgroup computer.

the server pings fine from the xeon. it resolves the name and everything. IP is assigned by DHCP. it's a workgroup network, no domain.
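
Since ping only proves that ICMP gets through, one way to narrow this down is to check whether the file-sharing TCP ports on the server accept connections from the affected workstation. Below is a minimal Python sketch, assuming Python is available on the workstation. The function name `tcp_port_open` is made up for illustration; ports 445 (SMB over TCP) and 139 (NetBIOS session service) are the standard Windows file-sharing ports.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (hypothetical host name, replace with the real server name or IP):
#   for port in (445, 139):
#       print(port, tcp_port_open("myserver", port))
```

If both ports fail while ping still answers, that would suggest the problem sits in the file-sharing service or session handling rather than in cabling or basic IP connectivity, though this is only a rough indicator.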
 
j-sta's post also asked how you were specifying the server to connect to... in your case, it would either be "\\ComputerName" or "\\ComputerIPAddress".

Also... Is the server's IP address static, and outside of the DHCP lease range?
 
can it be accessed via \\ipaddress rather than \\computername ?

any errors listed in event viewer when it fails?

and it's only this 1 computer with the issue, correct?
 
thanks for the replies, j-sta and PTNL.

sorry: nope, \\ipaddress doesn't work either. event viewer doesn't list anything when it fails.

the server IP is also assigned by DHCP, and has been 192.168.0.103 since I can remember.

to me, everything points to a physical error, since we moved and kept the whole structure and configuration. but the server still transfers to and from the affected computer...

edit: that's what bugs me: not only was the whole network configuration the same before, but it also works after restarting the server. something happens that makes the server reject the computer's request. net stats server shows 23 permission violations, but that number doesn't change after I try connecting to the server again.
 
ok.. so workstation fails to connect to server, but server can still connect to workstation.
and only 1 workstation is affected. correct?

what AV application?
 
yep. I'm using avast on the workstation, rising on the server. but, like I said, I disabled both for a while (days), restarted both machines, and it didn't help.
 
hmmm..

well, the only time I ever came across an issue similar to yours, it affected all workstations, and all PCs were running Symantec. The symptoms were exactly alike; desktops could not connect to "server" but could ping it, and server could connect to desktop.

The fix to my particular situation was to modify the registry to increase the IRPStackSize, as it was a (somewhat) known issue.

It's something you could certainly try on the server if you'd like.

HKLM\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters

create new DWORD called "IRPStackSize" with a hexadecimal value of 14.
 
thanks! I'll try that, let's see how it goes.

I just found out that a specific app (3d studio max's backburner monitor) is able to connect to the manager, which is running on the server. it retrieves the list of projects and all. it even retrieves the project files, but then it can't "find" the output path (which is mapped to the server).

well, I'll give your IRPStackSize a couple of days and I'll report back. thanks again.
 
does the 3d studio max have a server-side application running on the server, which the monitor on the workstation connects to? If so, it's very possible that the IRPStackSize fix should resolve the issue.

and actually, the hexadecimal value you may want to bump up to around 20+. I was thinking the default was around 10.

The IRPStackSize parameter specifies the number of stack locations in I/O request packets (IRPs) that are used by Windows 2000 Server, by Windows Server 2003, and by Windows XP. You may have to increase this number for certain transports, for media access control (MAC) drivers, or for file system drivers.

The default value of the IRPStackSize parameter is 15. The range is from 11 (0xb hexadecimal) through 50 (0x32 hexadecimal).

http://support.microsoft.com/kb/285089
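
Because the registry editor accepts the DWORD in either hexadecimal or decimal, it's easy to mix the two up. Here is a small Python sketch that converts a typed value and checks it against the decimal range quoted above (11 to 50, default 15). The helper name `check_irp_stack_size` is hypothetical and only illustrates the arithmetic; it does not touch the registry.

```python
from typing import Tuple

# Decimal limits as quoted from the KB article above.
IRP_MIN, IRP_MAX, IRP_DEFAULT = 11, 50, 15


def check_irp_stack_size(text: str, base: int) -> Tuple[int, bool]:
    """Parse text in the given base (10 or 16) and report whether the
    resulting decimal value falls inside the documented range."""
    value = int(text, base)
    return value, IRP_MIN <= value <= IRP_MAX


# "14" entered as hexadecimal is 20 decimal: in range, above the default.
print(check_irp_stack_size("14", 16))
# "14" entered as decimal stays 14: in range, but below the default of 15.
print(check_irp_stack_size("14", 10))
# "20" entered as hexadecimal is 32 decimal.
print(check_irp_stack_size("20", 16))
```

So the same digits can land on either side of the default depending on which radio button is selected in the DWORD editor.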
 
yep, that's the case. now that you mentioned it, I remember something similar going on back when our server was running XP (you gotta start somewhere, right?). though XP did have that 10 connections limitation so IRPStackSize didn't help much then. I changed it all the way to hex 30 (48 decimal), so let's see how that goes. once again, thanks!
 
bad news: the problem persists (already). :confused:

rebooted the server after modifying the registry, right? :p

and I confused myself with the value for IRPStackSize. Decimal value is 11 to 50 with default of 15. Not hexadecimal value.
 
riiiiight :p

yeah, I figured that, since a decimal value of 14 would be making it smaller :)
anyway, it's 48 (decimal) right now and same problem. :/
 
and no errors on the server stating the server service stopped, or anything in the server's event logs?
 
yep, no errors at all. the workstation just doesn't find the server, or where it's supposed to save its rendered work (mapped drive from the server). like someone just disconnected it.

the event viewer does report several hundreds of errors and warnings from PrintSpooler (can't locate CutePDF driver, and the printer installed on the server is doubled as "redirected" or something). I doubt it's got anything to do with this, though.
 
good news, or terrible news: I'm having some trouble accessing the server from a couple of other workstations as well - though, so far, it just takes a really long time and doesn't error out on those. net stats server on the server shows 83 system errors.
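
For an intermittent problem like this, it can help to record when the share is slow or unreachable so the times can be compared across workstations. Below is a rough Python sketch of a probe that times a directory listing of a share. The function names and the example path are hypothetical, and a hung share may block the listing for as long as Windows takes to give up.

```python
import os
import time
from datetime import datetime
from typing import Tuple


def probe_share(path: str) -> Tuple[bool, float]:
    """Try to list a directory; return (success, elapsed seconds)."""
    start = time.monotonic()
    try:
        os.listdir(path)
        ok = True
    except OSError:
        # Missing path, access denied, network name unavailable, etc.
        ok = False
    return ok, time.monotonic() - start


def log_line(path: str) -> str:
    """Build one timestamped log line describing the probe result."""
    ok, elapsed = probe_share(path)
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    status = "OK" if ok else "FAIL"
    return f"{stamp} {status} {elapsed:.2f}s {path}"


# Example (hypothetical share path): print(log_line(r"\\myserver\projects"))
```

Running something like this every minute from two or three workstations would show whether the slowdowns hit everyone at the same moment, which points at the server, or only one machine at a time.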
 
The lack of a consistent internal DNS still has me suspecting that it is the real culprit... Set the server with an IP address that is outside of the DHCP lease range (along with gateway, DNS, etc.), and add a HOSTS file entry on the Xeon that resolves to the server. Restart both machines, and see if it works any better.
 
thanks PTNL. I'm keeping your suggestion as a measure of last resort... because it implies structural changes that might affect the rest of the network which is functioning well.

but there's an interesting development to the story. yesterday I accessed the server via remote desktop connection using the problematic workstation with my server administrator password (no security problems there). even though the shared folders on the server are set to Everyone -> Co-owner, here's what happens: the network mapped drives still disconnect and apps can't access the server, but then I just log on to the workstation and double click the mapped network drives and they connect and work again. so everything points to a permission/security issue. I just don't know what that means.
 
You can temporarily try a HOSTS file entry on just the Xeon box to point to the DHCP-assigned IP of the server.

Might be worth it to try that on the one test machine, and see if the flakiness you noted on the permissions still persists.
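
For reference, a HOSTS entry is just an IP address followed by whitespace and the name, one per line, in `%SystemRoot%\System32\drivers\etc\hosts` on Windows. Here is a tiny Python sketch that builds such a line. The server name `OFFICE-SERVER` is a made-up placeholder since the real name isn't given in the thread, the IP is the DHCP-assigned address mentioned earlier, and `hosts_line` is a hypothetical helper.

```python
import ipaddress


def hosts_line(ip: str, name: str) -> str:
    """Return a HOSTS file line mapping name to ip.

    Raises ValueError if ip is not a valid IPv4/IPv6 address.
    """
    ipaddress.ip_address(ip)  # validation only; result is discarded
    return f"{ip}\t{name}"


# Placeholder name; substitute the server's actual computer name.
print(hosts_line("192.168.0.103", "OFFICE-SERVER"))
```

Keep in mind the entry goes stale if DHCP ever hands the server a different address, so this is only suitable as a temporary test.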
 
good idea, thanks. that would be a more conservative way to test that solution.

what I just did was a net config server /autodisconnect:-1 on the server. since I only have 12 workstations, I don't think that's gonna be a problem, and I guess it should work as a workaround...
 
after 24 hours, I think I can safely say that the net config server /autodisconnect:-1 command solved my problems. it certainly doesn't explain why it started happening in the first place, but our good ole xeon is back from its forced vacation.

thanks for the input, folks!
 
after almost a year, I'm sorry to say I still have the same problem. I've tried everything. the server still disappears for random computers for a few minutes through the day. the server still sees every other computer. but sometimes even the server can't access its own mapped network locations.

it's driving me nuts. I'm about to switch to linux, but that's a lot of work. any measures of last resort?
 
Ok so.

Looking at what you wrote, you are running your entire network off the DHCP of either a router (DSL?) or a scope on your server, but I am pretty sure the MS software won't allow that.
You are using NetBIOS to resolve the names of systems/servers and connect to them.
Applications using the server and its services work without many problems until they fall back on the network shares and, in essence, NetBIOS again.

In my experience NetBIOS issues are related to DNS. Even though NetBIOS can be very reliable, when IP addresses change regularly (whatever the reason) it becomes more and more unstable.

Is it an option to enable the DHCP, WINS and DNS services on the server, or another similar setup? A Linux distro can do it too, of course. Then change routers, servers and possibly printers and such to static addresses, leaving only clients dynamic. This would enable DNS registration and updating on the clients, so all clients use the same DNS info and the server is always at the same spot. Much of this can be done with most routers as well, but I have found them not half as dependable as a server. And honestly, in one year how many man-hours have you lost due to this? :)

Last thing is DNS from your internet: as I assume you changed ISP connection, they might be using a last-resort resolver which redirects all non-resolvable names to a default address. I use OpenDNS, for example, and if I were not using a local DNS server, local addresses would resolve to their default web redirect, making the PC or server unreachable.
 