Windows Server 2003 - Shares Appear to Drop Randomly

Desperate need for a solution here as well....exactly the same problem occurs here and there doesn't seem to be anything to fix this so far....

We literally tried everything here, from permissions to DNS to reinstalling shares to changing NICs....nothing seems to help.

Windows 2003 latest patches and MSSQL Server 2005 installed on it.
 
Yup still had the problem happen yesterday, and both servers at this site are running with 100% clean event logs....there are no errors being generated.
 
Windows doesn't always log kernel memory shortages. Are you sure you're not running out of kernel memory?
 
Windows 2003 here with XP clients. We have multiple Win2k3 machines with shares that don't show this problem, only the one. Also, none of our machines use DHCP; all of them have static addresses.
 
Yeah check the LMCompatibilityLevel reg setting, make sure it matches the rest of your domain. If say the broken machine is set to '2' and the rest are '5' you will have issues.
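
As a rough illustration of that comparison, the Python sketch below takes a hand-collected map of server names to their LmCompatibilityLevel values and reports which machines differ from the majority. The host names and levels are made up for the example; on a real machine the value normally sits under HKLM\SYSTEM\CurrentControlSet\Control\Lsa and you would have to collect it yourself.

```python
from collections import Counter

# Short summary of what each LmCompatibilityLevel value (0-5) means.
LEVELS = {
    0: "send LM and NTLM responses",
    1: "send LM and NTLM, NTLMv2 session security if negotiated",
    2: "send NTLM response only",
    3: "send NTLMv2 response only",
    4: "send NTLMv2 only; DCs refuse LM",
    5: "send NTLMv2 only; DCs refuse LM and NTLM",
}

def find_outliers(levels_by_host):
    """Return (majority_level, {host: level}) for hosts that differ from the majority."""
    majority, _ = Counter(levels_by_host.values()).most_common(1)[0]
    outliers = {h: lvl for h, lvl in levels_by_host.items() if lvl != majority}
    return majority, outliers

# Hypothetical values gathered by hand from each server.
observed = {"FS01": 5, "FS02": 5, "SQL01": 2, "DC01": 5}
majority, outliers = find_outliers(observed)
for host, level in outliers.items():
    print(f"{host}: level {level} ({LEVELS[level]}), rest of domain uses {majority}")
```

Any host this prints is the first place to look for a mismatch.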
 
Sounds like a NetBIOS/browse list problem. Have you tried mapping the share via NET USE or Explorer during the downtime? I would try narrowing this problem down to being UNC-specific and then start looking at MUP cache/WINS, etc.
 
All DNS records are fine, no duplicates on the network either. WINS is not even in use at all. I even went as far as checking the reverse records, to no avail.
 
so the OP and a second person have a sql server dropping shares? Typically we don't combine the two. Have either of you tried stopping the SQL server to see if the shares ever drop? That would be the next thing I'd do.
 
Sounds like a reasonable idea to try. However, this is a company critical server that we can't just "stop" lol

Especially since the problem does not occur in a pattern but randomly. We can't take the SQL server offline for days in hopes the problem occurs or not.
 
Unfortunately, the actual application is dependent on these shares. It does not connect to the DB directly. Don't get me started on that...
 
I've been told by Microsoft Support the root cause of this issue may be Symantec Endpoint 11...anyone else running this?...I am reading through a thread on their forums talking about this exact same issue. Microsoft told me to contact them regarding an upgrade to 12 free of charge...apparently 11 had a lot of problems.
 
Not running SE here.
 
Well, other than pointing to possible problems with Symantec....my one server which was the most problematic seems to have been fixed by following these steps from Microsoft. However, I think this only applies if you're running DNS on the server.

* Removed hotfixes KB951746 and KB951748 and rebooted.

* Created these two registry values:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DNS\Parameters\SocketPoolSize

DWORD value should be 500

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\MaxUserPort

DWORD value should be 65535

* Restarted Server and then reapplied hotfixes KB951746 and KB951748 and rebooted.
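
For anyone who would rather import the two values than type them in regedit, here is a minimal Python sketch that just builds the text of a .reg file. It assumes the 500 and 65535 above are decimal (the post doesn't say) and that SocketPoolSize and MaxUserPort are DWORD values under the two Parameters keys; the helper names are made up for the example. It only prints text and doesn't touch the registry.

```python
def dword_line(name, value):
    """Format one REG_DWORD entry the way .reg files expect (8 hex digits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"{name}: {value} does not fit in a DWORD")
    return f'"{name}"=dword:{value:08x}'

def build_reg(entries):
    """entries maps a registry key path to {value_name: int}."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, values in entries.items():
        lines.append(f"[{key}]")
        lines.extend(dword_line(n, v) for n, v in values.items())
        lines.append("")
    return "\n".join(lines)

SERVICES = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
reg_text = build_reg({
    # Values taken from the steps above, assumed to be decimal.
    SERVICES + r"\DNS\Parameters": {"SocketPoolSize": 500},
    SERVICES + r"\Tcpip\Parameters": {"MaxUserPort": 65535},
})
print(reg_text)
```

Review the output before importing it anywhere, and back up the keys first.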
 
It's been over a week since I sent the logs to MS Support... Still no update. This is extremely frustrating.

mlowe01, I will check out your suggestions.
 
Thought I'd share with the thread a recent experience I had...

I had a server at one client that started to exhibit issues somewhat similar to the OP's issue here. This was a Small Business Server 2003 R2 machine that was introduced into an existing Active Directory following a Microsoft article on installing SBS into an existing AD, which I've done quite a few times without problems.

This particular server would become rather unresponsive at times...and these times would be random. Some weeks it would happen once or twice....other times it would run smoothly for weeks before the problem surfaced. It would stop shares, and logins for client workstations would become painfully slow (a typical sign of a DNS problem). Logon to the server's local console, or RDP, would be painfully slow until the desktop came up. Bringing up Task Manager would show the lsass.exe process running away with CPU utilization. A reboot of the server, which would take about half an hour once it exhibited these signs, would cure the issue once it came back up. Once the problem surfaced, shares would drop, client logins became slow, and sometimes DNS would hang so badly that clients couldn't even surf the 'net.

I had the case open with Microsoft for over a month...going through several techs. Finally the 3rd tech to EasyAssist into the server seemed to fix the issue...as I'm going on 2 months now without the issue returning.

Below is what he had in the e-mail to me from the case....

"PROBLEM: server goes unresponsive
RESOLUTION: We Check the lanmanserver parameters for a sizreqbuf and maxworkitems values, try setting the values as such:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters]

"SizReqBuf"=dword:00001104

"MaxWorkItems"=dword:00002EE0
"
 
I'd caution anyone against just changing these values unless you know what you are doing. These two settings have a direct impact on your kernel (nonpaged pool) memory usage. If set incorrectly, they can cause your system to hang. Modifying these values can help in some cases, but I would advise against changing them unless advised by Microsoft support.
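
For reference, the two hex DWORDs in the quoted Microsoft e-mail are easier to reason about in decimal. This small Python sketch parses the snippet as posted and decodes it; the regex and function name are just for illustration.

```python
import re

# The registry snippet exactly as quoted from the support e-mail.
snippet = r'''
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters]
"SizReqBuf"=dword:00001104
"MaxWorkItems"=dword:00002EE0
'''

DWORD_RE = re.compile(r'^"(?P<name>[^"]+)"=dword:(?P<hex>[0-9A-Fa-f]{1,8})$')

def parse_dwords(text):
    """Return {value_name: int} for every dword line in .reg-style text."""
    values = {}
    for line in text.splitlines():
        m = DWORD_RE.match(line.strip())
        if m:
            values[m.group("name")] = int(m.group("hex"), 16)
    return values

decoded = parse_dwords(snippet)
print(decoded)  # {'SizReqBuf': 4356, 'MaxWorkItems': 12000}
```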

http://support.microsoft.com/kb/232476
http://support.microsoft.com/kb/317249

btw x64 solves a lot of these issues because it's not limited to 256 MB of nonpaged pool memory and 470 MB of paged pool memory.

http://blogs.technet.com/askperf/ar...-management-understanding-pool-resources.aspx
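
To put the lanmanserver values from the support e-mail next to that 256 MB figure, here is a back-of-the-envelope Python sketch. It assumes, as a simplification, that every work item could hold one request buffer of SizReqBuf bytes at the same time; real usage depends on load and per-item overhead, so treat the result as a rough ceiling and not a measurement.

```python
NONPAGED_POOL_LIMIT = 256 * 1024 * 1024  # 32-bit ceiling quoted in this thread

def worst_case_buffer_bytes(siz_req_buf, max_work_items):
    """Upper bound if every work item held one full request buffer (simplified)."""
    return siz_req_buf * max_work_items

# SizReqBuf=0x1104 and MaxWorkItems=0x2EE0, as in the quoted e-mail.
used = worst_case_buffer_bytes(0x1104, 0x2EE0)
share = used / NONPAGED_POOL_LIMIT
print(f"{used} bytes, {used / 2**20:.1f} MiB, {share:.1%} of the pool")
```

Under that assumption the buffers alone could claim a bit under a fifth of the 32-bit nonpaged pool, which is why these values are worth treating with care.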

The #1 cause for this issue is storing PST files (that your users connect Outlook to) on your file servers.

http://blogs.technet.com/askperf/archive/2007/01/21/network-stored-pst-files-don-t-do-it.aspx
 
Still no resolution here... I think I've received better support response from my home ISPs...
 
Yep, that's the support I was referring to.

You have a case number with them? (SRX)
Usually they call me to death til the problem is resolved..and they can close the case. Usually on top of things...they call you each day to see how things are progressing.
 
SRZ081105000021

It's been hit or miss. Admittedly, part of the problem is lack of regular physical access to the problem and lack of client response as well.

I receive e-mails from them almost daily. The extent of the e-mail is simply 'I want to ensure I am doing everything, please confirm you are receiving my communications'. However, when I provide additional information, logs, or something else, they virtually never respond unless I ask multiple times.

They provided an app to generate the kernel memory dumps that I could run over TS, but all the dumps were corrupt (at least 3 of them). So they claimed it was their application causing the issue. I finally uploaded a 'good' dump from a keyboard generation, and now they lost the workspace where I uploaded it.

I've had to copy the technical lead and manager on the case on about 10 different occasions just to get a response from an e-mail.
 
Ok, I've been watching this thread for a long time now, as one of my clients had VERY similar symptoms and this was the only article that was almost spot on. My client started having random 'lockups' of their client-server application. This application uses both UNC pathing to shares on the server and COM+ on a Dell PE1800 with Server 2003 R2 SP2. They started complaining that they couldn't access anything on the server via UNC, and shortly after that the app (and server) would lock up altogether. After a ton of troubleshooting, DNS was ruled out. We tried everything that we could think of and find on the internet, including what was in this article, without luck. Again, it was random at best. Sometimes it would go a couple of weeks without a lockup and other times it would only be a couple of hours. Finally, they couldn't take it anymore and bought a new server (they wanted to upgrade anyways) and asked me to migrate the apps off it to the new server. So off I went to move it all over...(here comes the best part...)

So I installed all the applications on it and started moving all the client's shared folders...bang...it hung up again after getting an access denied on one particular folder! The only real option was to hard boot it to unlock it. So I tried taking ownership of the folder; as soon as I did, it locked up again. Well, after looking a little closer I noticed that the folder was encrypted (which oddly enough was the only one on the server). So I tried to un-encrypt it and move it one more time...well it locked the server AGAIN!! (seeing the pattern here??). Anyway, long story short, there was one particular user who was encrypting her files on the server (obviously didn't work too well) and every time she tried to access them the server locked. After asking the client some more questions, the whole story fit together with this one user and the 'randomness' of when this occurred. This explained why it was numerous times one day and then wouldn't happen again for a week or two...

Once those files were 'cleaned up' things went back to a life without lockups. I hope this helps at least one other person who is going through the hell that we did (and the kicker was, they were blaming our app when it was the user the whole time!).

I hope this helps!!!

Good luck...
 
I assume you mean using the built-in Windows encryption? I checked today and it doesn't look like any folders or files are encrypted. Bummer.

Did ALL of your UNC paths drop, or just that one?

My case is still open with MS and has been escalated to the US teams (performance and networking). They were having a rough time getting a memory dump that wasn't corrupt, and they're also a little confused by the problem. It has been a much more pleasant experience to work with these guys. The first line or two was extremely painful.
 
Have you checked when the systems are updating their time? I've seen a similar problem where a 2003 server got a time from an internet time service that was two hours out. Even checking the system logs (source: Kernel-General) revealed that the machine I'm using right now (2008) has been out by nearly 5 minutes in the last month.

It should read:
The system time has changed to 20/02/2009 3:00:25 pm from 20/02/2009 3:00:25 pm.
Not:
The system time has changed to 10/02/2009 4:43:30 pm from 10/02/2009 4:39:09 pm.

By default, Kerberos requires all systems in a domain to have clocks within 5 minutes of each other, so it's worth looking into...
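
If you have a lot of these events to go through, the two sample messages above can be compared mechanically. This is a minimal Python sketch that assumes the day/month/year, 12-hour format shown in those samples and an English locale for the am/pm marker; the function name is made up, and logs from other regional settings will need a different format string.

```python
import re
from datetime import datetime

TIME_FORMAT = "%d/%m/%Y %I:%M:%S %p"  # matches the samples quoted in this post
PATTERN = re.compile(r"The system time has changed to (.+?) from (.+?)\.$")

def clock_jump_seconds(message):
    """Return how far the clock moved (new time minus old time), in seconds."""
    match = PATTERN.search(message.strip())
    if match is None:
        raise ValueError("not a time-change message")
    new, old = (datetime.strptime(s, TIME_FORMAT) for s in match.groups())
    return (new - old).total_seconds()

good = "The system time has changed to 20/02/2009 3:00:25 pm from 20/02/2009 3:00:25 pm."
bad = "The system time has changed to 10/02/2009 4:43:30 pm from 10/02/2009 4:39:09 pm."
print(clock_jump_seconds(good))  # 0.0
print(clock_jump_seconds(bad))   # 261.0
```

The second message is a jump of 4 minutes 21 seconds in a single correction.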
 
I'm assuming it was the Windows encryption but I never did find out for sure. We were working with some onsite IT people who took it over once the issue was identified. They cleaned it up from there. And yes, all UNC failed, access to shares failed, RDP to the server failed (hung after login) and remote management failed. Basically, anything you tried to do to gain access to the server was locked out. DCDIAG would come back with a bunch of entries with "error 64, The specified network name is no longer available". After a reboot, the dcdiag was clean. Occasionally it would unlock itself and normal operation would resume, but that wasn't too often.
 
We are experiencing the same problem with shared folders dropping. Is there a quick way to search for encrypted files?

Thanks.
 
I used Explorer, located the shares, and viewed the contents. The encrypted files will show in a green font (at least they did for me).
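
If eyeballing green file names isn't practical on a big share, a scripted scan is another option. The Python sketch below walks a folder tree and lists entries whose Windows file attributes include the "encrypted" flag. The D:\Shares path is a placeholder, the function names are made up, and it assumes Python 3.5 or later on the server; on non-Windows systems the attribute field doesn't exist, so the scan simply finds nothing.

```python
import os
import stat

def is_encrypted(attributes):
    """True if the Windows file-attribute bitmask has the EFS 'encrypted' flag."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ENCRYPTED)

def find_encrypted(root):
    """Walk root and return paths whose attributes carry the encrypted flag."""
    hits = []
    for folder, dirs, files in os.walk(root):
        for name in dirs + files:
            path = os.path.join(folder, name)
            try:
                # st_file_attributes only exists on Windows; elsewhere treat as 0.
                attrs = getattr(os.lstat(path), "st_file_attributes", 0)
            except OSError:
                continue  # unreadable entry: skip it and keep scanning
            if is_encrypted(attrs):
                hits.append(path)
    return hits

if __name__ == "__main__":
    for path in find_encrypted(r"D:\Shares"):  # hypothetical share root
        print(path)
```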
 
Unfortunately we do not have any encrypted files. I ran the command efsinfo /s:d: | find ": Encrypted" to do a search.

Any other ideas why our shares drop randomly? We also notice that from time to time the server will stop allowing remote desktop sessions and the console will not allow us to log in, forcing us to physically shut it down.
 
Set up a perfmon capture. Look at your kernel memory usage.
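
Once you have a capture exported to CSV (for example with typeperf), something like the Python sketch below can flag samples where nonpaged pool gets close to the 32-bit limit mentioned earlier in the thread. The sample rows and the FS01 host name are made up, and the 80% threshold is an arbitrary choice for illustration; the counter name is the standard Memory\Pool Nonpaged Bytes.

```python
import csv
import io

NONPAGED_LIMIT = 256 * 1024 * 1024  # 32-bit nonpaged pool ceiling cited earlier in the thread

# Made-up sample in the CSV layout that perfmon/typeperf exports.
SAMPLE = r'''"(PDH-CSV 4.0)","\\FS01\Memory\Pool Nonpaged Bytes","\\FS01\Memory\Pool Paged Bytes"
"02/20/2009 15:00:00.000","120586240.000000","210763776.000000"
"02/20/2009 15:05:00.000","187695104.000000","215000000.000000"
"02/20/2009 15:10:00.000","241172480.000000","219000000.000000"
'''

def rows_over_threshold(csv_text, counter_suffix, limit, fraction=0.8):
    """Return (timestamp, bytes) rows where the counter exceeds fraction * limit."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    column = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    flagged = []
    for row in reader:
        if not row or not row[column].strip():
            continue  # skip blank rows or samples with no value
        value = float(row[column])
        if value > fraction * limit:
            flagged.append((row[0], value))
    return flagged

alerts = rows_over_threshold(SAMPLE, r"\Pool Nonpaged Bytes", NONPAGED_LIMIT)
for stamp, value in alerts:
    print(f"{stamp}: {value / 2**20:.0f} MiB nonpaged pool")
```

If the flagged timestamps line up with the times the shares dropped, that points fairly strongly at pool exhaustion.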
 
I have been experiencing this problem for the last 6 months and so far have no solution. I would like to share my experience as well, which may help in solving this problem. We first faced this problem on an HP ProLiant DL380 G5 with Windows Server 2003 Standard Edition. We then planned to move the data to another server and purchased an HP ProLiant DL185 G5 with Microsoft Windows Storage Server 2003 R2 installed (with all the latest patches applied). It worked fine with new data and new files. After a period of time we moved the data from our old server to the new server and kept the same access and share-level permissions. Unfortunately, from day 1 after that the problem started on the new server as well, and it is still continuing. Another thing: only .xls, .pdf and .doc files reside on the server (it is used purely as a file server). Sometimes it will work without any problem for 2 days, and other times it gives trouble more than 5 times in the same day, and every time we need to restart the server to regain share access from the client computers.

I have tested all the troubleshooting steps mentioned in this thread, but to no avail.

Now we are thinking of upgrading to Windows Server 2008. Does anyone have any suggestions on that?
 
OK, I own an IT company and a friend of mine who also owns his own IT business in another city called me today about this same issue. Now, he noticed that not only are you unable to access the shares until you reboot, but that when he goes to the properties of the shares they are no longer shared... Also, it's not all of the shares, just some of them, and never the same ones.

My first questions to him were: How much RAM is in the system?
What type of server hardware, and so on.

Most of the questions I asked him are here except the amount of RAM.

His system is a Dell PowerEdge 1900 with 1GB RAM running SBS 2003!
"1GB RAM, are you NUTS!" is what I said. (He didn't set up the system initially and the client is a bit odd.)

Well, he wasn't in front of the system when he called (I wish he had been), so I asked him to check the virtual memory when he gets in front of it. If the system is running out of resources this could cause the problem; I have seen this happen on desktop machines in a peer-to-peer setup before, though never on a server.

For your virtual memory settings, make sure you have it set to, as I prefer, 2.5 times your RAM.
If you have 4GB make it 10GB; with 2GB make it 5GB. And don't be afraid to make the initial and maximum sizes the same number, i.e. manual settings. System managed is OK, but manual is better as you will know for sure that you have enough virtual memory. Also, when I set up a system I always create a dedicated partition to host the page file.
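
That rule of thumb works out as below. This is only the arithmetic from the post turned into a tiny Python sketch; the function name is made up, and the 2.5 multiplier is the poster's preference, not an official recommendation.

```python
def pagefile_mb(ram_gb, multiplier=2.5):
    """Page file size in MB using the RAM x 2.5 rule of thumb from the post."""
    return int(ram_gb * multiplier * 1024)

for ram in (1, 2, 4):
    size = pagefile_mb(ram)
    # Same value for initial and maximum size, as the post suggests.
    print(f"{ram} GB RAM -> initial {size} MB, maximum {size} MB")
```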

I haven't heard back from him just yet, and may not this week. But if this helps you guys with your issue, let me know and I will pass it along to him.

-NS
 