Server 2008 intermittent Lockup, HELP

acoda

n00b
Joined
Mar 24, 2009
Messages
13
Hey All,

I am not sure if this is the right place to post, but I need some help from any sysadmins/IT B@USSES out there, because I am at my wits' end.

TLDR: HP server keeps locking up; when it comes back online it looks like nothing happened. The only clue is a "failed to process Group Policy" error. It has been a chronic/progressive issue.

TAKE COVER! Long story incoming,
I work for a small IT shop and I have a client with an HP ProLiant server running Server 2008 R2. This is a small business worth a lot of money, 6 total users in the system. They do not run Exchange. They have 2 main software packages on their server (one is scan management software, the other runs a SQL DB); these run locally on the client machines, and they do not use Terminal Services. Mostly the server is for domain security and file service.

About 7 months ago the server started to drop network connectivity at random intervals, no rhyme or reason; it would go down for a few minutes and then come back up. There had been no major changes to the server at that time. This got progressively worse over a few weeks until they were down for an hour or more at a time. The weird thing is that when it comes back up there are no warning signs in the Windows error logs, NONE, ZIP, NADA. We had HP warranty replace everything except the case and PSU. After the replacement of the HDDs (mobo RAID-1), mobo, processor, and RAM, things got a lot better for a bit. However, now they seem to be getting worse again. Nothing they run is that taxing; system usage stays below 60% at nearly all times (my rig is stronger than their server). I have only been on the server right before it went down twice. Both times it was plenty snappy with low system usage, then sluggish for about 20s, and then nothing: a lockup for an indeterminate time, then POP, back up and everything is fine.

When the server goes down I can still ping it, but nobody is home. When I am local and looking at the monitor, it just locks up, there is no responsiveness for a period of time (1-30 min), and then it pops back to normal. I can get everywhere and all network drives come back like nothing happened. The ONLY clue I have that anything happened is in the System event log, which says Group Policy failed to process. The same failed GP errors also show up locally on the client machines. When the system is up I can get to the policy location just fine.

I have removed AV from the system, the firewall is off, and nearly everything has been stripped to bare bones. Group Policy has been stripped to a minimum, so security is not where I want it right now. I want to uninstall the server packages, but getting them at a good down time and then getting the vendors to reinstall would be a damn nightmare (all my homies in the trenches know all about that). My boss is an MCSE and has been doing this for two decades, but he does not know what to do and does not have the time to devote to it. I have only been in the server game for 18 months and I cannot think of what the heck to do.

So this is my plea to my [H] betters out there: any suggestions on what to try next would be supremely helpful. I have tried to look into OS monitoring, but I need something that does not cost $$$$$ or generate GB/day of log files, as they only have 50 GB of free disk space.
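For the "cheap monitoring that does not eat the disk" part, one option is a tiny heartbeat script left running on the server. This is only a rough sketch and assumes Python can be installed on the box; the file name, interval, and size limits are made-up defaults, not anything from a real product. It writes a beat every few seconds to a size-capped log and flags any wake-up that arrives much later than expected, which should put a timestamp and a duration on each lockup once the server recovers.

```python
import logging
import time
from logging.handlers import RotatingFileHandler


def find_stalls(beats, interval, tolerance):
    """Return (last_good_beat, gap) for every gap longer than interval + tolerance."""
    stalls = []
    for prev, cur in zip(beats, beats[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, gap))
    return stalls


def make_logger(path, max_bytes=5_000_000, backups=3):
    """Build a size-capped logger: roughly (backups + 1) * max_bytes on disk at most."""
    logger = logging.getLogger("heartbeat")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def run(path="heartbeat.log", interval=5.0, tolerance=2.0):
    """Log a beat every `interval` seconds and flag late wake-ups as stalls."""
    logger = make_logger(path)
    last = time.monotonic()
    while True:
        time.sleep(interval)
        now = time.monotonic()
        gap = now - last
        if gap > interval + tolerance:
            # The script itself was frozen, so the stall shows up after the fact.
            logger.warning("STALL: woke up %.1fs after the previous beat", gap)
        else:
            logger.info("beat")
        last = now
```

Calling `run()` from a scheduled task at startup would be one way to keep it going. With the defaults above the logs should top out at roughly 20 MB, and the STALL lines can be lined up against the Group Policy errors to see whether they match.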
 
Quick and easy/cheap test: run VMware Converter on the server to convert it to a VM, then see if that VM also locks up/drops at random.
If it gets sluggish first, that sounds like memory or disk access. AV was uninstalled, OK. What about turning off VSS (Volume Shadow Copy Service)?
I'm assuming DNS on the network is working fine. Is that server the DNS server, and is it also doing DHCP?
 
This client does not really like the idea of me taking a copy of their info off site (lawyers need to stop talking). Are you thinking I just throw it on a spare desktop that they have on site? I will try turning off VSS over a weekend when no changes are being made and see what happens. Yes, the server is running the DHCP and DNS roles. Neither of these is affected when it locks up: users can still resolve sites and nobody has disconnected while it is down. I have never converted a machine to a VM. I see the program is available; which VMware product should I use if I virtualize?
 
Depends on what hardware you have available... you can use ESX if you want, or VMware Player too.
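Since ping and DNS apparently keep answering while the shares go dead, it might also help to time the file service itself from one of the client machines. Below is a minimal sketch, again assuming Python is available on a client; the share path in the usage note is hypothetical and the thresholds are guesses. It lists a folder on the share at a fixed interval and prints a timestamped line whenever the listing fails or takes too long.

```python
import os
import time


def timed_listing(path):
    """List a directory and return (seconds_taken, entry_count, error_text)."""
    start = time.monotonic()
    try:
        count = len(os.listdir(path))
        error = None
    except OSError as exc:
        count, error = None, str(exc)
    return time.monotonic() - start, count, error


def watch(path, interval=10.0, slow_after=2.0):
    """Print a timestamped line whenever the listing is slow or fails."""
    while True:
        elapsed, count, error = timed_listing(path)
        if error is not None or elapsed > slow_after:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} took={elapsed:.1f}s entries={count} error={error}")
        time.sleep(interval)
```

Something like `watch(r"\\SERVER\share")` with the output redirected to a file would give outage start times as seen from the client side. The listing call may simply block for the whole lockup, in which case the line appears once the server comes back and the `took=` value is the outage length.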
 