Terrible Windows Server 08R2 Performance

USMCGrunt

I have a custom-built machine running a fresh install of Server 2008 R2: i5-2500, 16GB of RAM, OS installed on a RAID 5 array. Initially, this thing ran like a bat out of hell, but ever since it joined the domain it has slowed down to the point that getting from BIOS handoff to Server Manager being fully started takes about 15-20 minutes. CPU utilization is nearly none, RAM utilization is nowhere near using the page file, and the array is fine. Originally, I thought it was a RAM issue so I replaced the RAM. A few weeks later, Active Directory said it was taking too long to get things done and suggested faulty hardware; the HDD was constantly running, so I replaced it with this HDD array and reinstalled everything fresh. I'm still getting the issue, to the point that I can't install Adobe Reader: it just hangs at 97% for like 30 minutes before I kill the process. The server has the following roles running:

ADDS
DHCP
DNS
FILE SERVICES
PRINT MANAGEMENT
IIS
WSUS

I'm at a loss for what's going on and don't know how to track down what's holding things up. Any ideas??
 
Ran memtest?

Could be a lot of things. Next thing I would do is boot to a Knoppix or Ubuntu live CD, see if everything is snappy, and copy some files across the network.

Then, in Windows, update all drivers, in particular the network and storage drivers.
Once you install AD, everything is done over the network interface, and if it's bad you get no end of problems.
Run Performance Monitor to see what is getting hit.
 
Tried different RAM, different RAM slots, different brands, and different quantities with no change. All drivers and the OS are up to date. I've got perfmon up, but nothing stands out to me as being an issue other than this: when acting on a program, say closing out Server Manager, the process stops responding for a few moments before closing and turns red in perfmon.

Could there be a BIOS setting that does this? I stopped WIDS and WSUS thinking maybe it's a weird SQL issue, but it didn't change anything.
 
It's a fresh install, nothing has been done to it to expose it to malicious code.
 
Ya

1. RAID 5 on what drives / RAID card? Please do not say anything over 146GB or so, because then you should be on RAID 10, and only on RAID 5 if you've got some decent SAS drives.

Also, did you check your domain policies?

After which role was installed did it get slow?
 
Hardware RAID 5 using the onboard RAID controller from the motherboard, which is a Gigabyte GA-Z68X-UD3H-B3, with a secondary RAID 1 array using the Marvell RAID chip built into the motherboard for non-OS duties (WSUS updates, network shares, etc.). Using WD Red 1TB drives that replaced a Seagate 500GB HD.

What policies am I looking for that could be the culprit?

I am not 100% on which role it started after but I think it was after ADDS. I had to go through dcpromo a couple times before it was able to successfully replicate all of the AD partitions...the net connection seemed to be having issues at the time.

The only hardware left that could be the issue is the motherboard/CPU. I've replaced the RAM, tried other sticks of RAM, replaced the one HDD with multiple, higher-quality HDDs, and replaced the PSU. This problem was happening before with the single hard drive, obviously not using the RAID controller, so I can't see how that would be the cause of it, unless it's just the motherboard crapping out. I have other motherboards I could throw in. I just don't want to go through the process of rebuilding the server again and have it not be the cause; it took about 14 hours to get it back up.
 
AD has the POTENTIAL to slow down a machine, especially if you install it to a separate partition (done so that you can still get at it if the drive with the OS on it crashes). It also depends on how large your AD environment is.

Just a guess. Try to uninstall AD services and see what happens. AD on 2008 R2 usually only takes a few mins to install and uninstall.
 
I can't quite recall but I'm pretty sure I installed all the ADDS stuff onto the OS drive...can't remember what they're called...sysvol, log file, and something else I believe....but I left the three at their default install paths. I will see if I can get it uninstalled and reinstalled tomorrow after work hours.
 
Those onboard RAID "controllers" are software RAID. A RAID 5 on one will hit the CPU pretty hard, and in my experience performance is bad. RAID 1s generally are not that bad.
 
Where exactly does the slowdown happen during boot? What screen takes the longest? Are there other DCs that this server can reach or is it the only one on the domain?

And why does it matter how long it takes to boot? Are there other performance issues or just slow booting? This box should be running 24x7 and should only have to boot for updates.
 
The Windows logo screen seems to take a usual amount of time; the slowdown starts once it gets past that. I will get a mouse cursor with nothing else for about 30 seconds, then loading of group policies takes another 5-7 minutes. Then, when you hit Ctrl-Alt-Del, the message goes away but it just sits with a blank screen for about 10 seconds, the login screen is slow to pop in, and there's a slow response between hitting a key and the character actually showing on screen. I mention the startup time as it's something most people are familiar with, even if its relevance to a server isn't obvious. The whole thing is performing like crap; even though it's not server-grade hardware, it's still capable of performing. I have MMCs (including Server Manager) hang for 30 seconds to a minute and occasionally crash out altogether. I managed to get Symantec Endpoint Manager installed last night but, after it took 30-45 minutes to crawl through the install, I watched it initialize its database for about two hours before I went to bed; I'm not sure when it finished, but when I woke up in the morning it was complete. Opening File Explorer or any application at all is an extreme exercise in patience. I think the only way you could get a real sense of the kind of performance it's exhibiting is to install this same load onto a single-core Celeron with 512MB of RAM.

I've watched Performance Monitor and the disks are going buckwild with reads/writes, CPU activity is around 20% with most of that coming from the NT System & Kernel process, and RAM usage is around 3.6GB in use, 10GB cached, and the rest free (16GB total), so I know it's not jumping in and out of the page file. I've monitored the average disk queue and there are short periods every 15-30 minutes where it will jump to 50 for a few seconds, but outside of that the queue stays <5. I've managed to get Process Explorer installed and dug into it looking for excessive threads, but there aren't any.

So after screwing with this trying to figure out the bottleneck, about 90% of it suddenly disappeared without me doing absolutely anything. The system was still slightly, very very slightly, slow to respond, to the point that you wouldn't notice it if you weren't looking for it like I am. I did notice that the RAID management software that came with the onboard controller is initializing the array, but it's going stupidly slow. It's been initializing since Saturday night or Sunday morning and it's still only at 45% as of a couple hours ago. I would imagine, though, that any kind of disk activity would show up on these performance monitors, but there's nothing to indicate the system is starved for resources other than the actual experience. It was at the point that when a user machine needed admin permissions to run something, I'd put in the admin credentials, the server couldn't respond fast enough, and the user machine came back with bad user/pass.


DNS working ?

I've verified all services are working and have had no error messages thrown at me by DNS. NTDS has thrown an Event ID 508 which is, to paraphrase, "I'm trying to write a file to disk but it took too damn long (60-180s). There might be faulty hardware." It was this same message that led me to replace the RAM with higher capacity/quality RAM and later replace the single HDD that was running it with the RAID arrays.
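
Since the Event ID 508 message is about a disk write taking too long, one rough way to put a number on it is to time small writes that are forced to disk. This is only a sketch under assumptions: the function name, block size, and write count are made up for illustration, it writes a throwaway temp file, and it does not reproduce how NTDS actually does its I/O. Point it at a folder on the array you want to test.

```python
import os
import tempfile
import time


def time_synced_writes(directory, count=20, size=4096):
    """Write `count` small blocks, flushing each to disk, and return
    the per-write latencies in seconds (hypothetical helper)."""
    latencies = []
    payload = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # block until the OS reports the data as flushed
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    results = time_synced_writes(tempfile.gettempdir())
    worst_ms = max(results) * 1000
    avg_ms = sum(results) / len(results) * 1000
    print(f"worst {worst_ms:.1f} ms, average {avg_ms:.1f} ms")
```

If individual flushed writes regularly take seconds rather than milliseconds, that points at the storage path (drives, controller, driver) rather than at AD itself.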
 
In Resource Monitor, how much data is it reading/writing to the RAID array, and what are the active time % and disk queue? Have you run any disk benchmarks against the array? It really sounds like it's an issue with the drives or controller. Have you tried putting everything on a single disk and seeing if performance improves?

Also, what drivers are you using? Most desktop hardware vendors don't have drivers for a Server OS.
 
Is it on a network? Then you should still check for malicious code, fresh install or not. And I'll echo the comments about fake RAID. Have you checked the drives? One could be going bad.

If CPU usage is around 20% and it's a quad-core machine, then maybe one core is getting maxed.
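
The arithmetic behind that guess, as a small sketch: on a four-core CPU such as the i5-2500, one fully busy core with the others idle reads as 25% overall, so a total near 20% is consistent with a single core running close to flat out. The function names and the 90% threshold are illustrative assumptions, and the per-core readings in the example are invented, not measurements from this server.

```python
def overall_percent(per_core_percent):
    """Overall CPU utilization is the mean of the per-core figures."""
    return sum(per_core_percent) / len(per_core_percent)


def saturated_cores(per_core_percent, threshold=90.0):
    """Return the indexes of cores at or above `threshold` percent busy."""
    return [i for i, p in enumerate(per_core_percent) if p >= threshold]


# Invented example: one core pegged, three idle.
sample = [100.0, 0.0, 0.0, 0.0]
print(overall_percent(sample))   # 25.0
print(saturated_cores(sample))   # [0]
```

The practical upshot is to look at the per-core graphs rather than the overall figure.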
 
Active time bounces around a lot but it's mostly sub-10%, with disk queue being <5. Not sure how to check total data being read/written to the drive. Based on the processes that are writing and roughly how much they're writing in total, I'd guess at it being in the area of 30KB/sec. The graph is very spiky though, so it's like it's writing 30-40KB, going idle for a second, and spiking up again, over and over.

I haven't run any benchmarks because I haven't had the time to sit and babysit an install. Unfortunately, this IS a production server and the only one on the site so I can't pull it down and experiment on what could be causing the problem. Was hoping someone had an idea of how to diagnose what it is that's causing this behavior.
 
It sounded to me like DNS was not working correctly (slow logons, slow GPO, etc.).
The server has itself for its DNS entry on the network connection? (That's static too, right?)
The DNS console has some other DNS servers as forwarders?
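
A quick way to check whether name resolution itself is slow is to time a lookup from the affected machine. This is a minimal sketch using only Python's standard library; `time_lookup` is a hypothetical helper, and the hostname in the example is a stand-in you would replace with the domain's own name or a DC's name. It goes through the OS resolver, so it reflects whatever DNS servers the network connection is configured with.

```python
import socket
import time


def time_lookup(hostname):
    """Resolve `hostname` via the OS resolver and return
    (elapsed_seconds, sorted_addresses). Addresses is empty on failure."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None)
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = []
    return time.perf_counter() - start, addresses


if __name__ == "__main__":
    # "localhost" is only a placeholder for the name you want to test.
    elapsed, found = time_lookup("localhost")
    print(f"{elapsed * 1000:.1f} ms -> {found}")
```

Lookups that take seconds, or that fail and then succeed on retry, would support the DNS theory; consistently fast lookups would point elsewhere.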
 
Well, it's got another DC as its primary DNS with itself as the secondary, as per Microsoft's best practice recommendation. Also, DNS isn't throwing any warnings/errors, and performance is very slow with functions not pertaining to DNS itself.
 
This is a dumb question, but you do have it set up with a static IP address, right?
 
What's showing up in the Event Viewer under System?
 
It honestly sounds like a DNS issue. Try ditching the loopback altogether, making the first IP your primary and the second IP this server.
 
Nothing shows up in the Event Viewer under System; all logs are clean.

As for it being a DNS issue: I'm not sure why you think it's DNS if you've read all the issues that the system is experiencing. What part of a DNS issue would cause File Explorer to open slowly, application installation to take a long time, or basic OS navigation to be nearly impossible?

BTW, the array has finished initializing, yet the system has returned to a snail's pace with a vengeance. I can't even get a context menu to appear or the screen to update without trying to invoke another action. For example, if I want a window to go full screen, it won't until I try doing something else, such as right-clicking on the screen. Hovering over an icon in the taskbar will not bring up the little text bit about the icon. The workaround to get the display to refresh itself is to bring up Task Manager: the 1-second update of the graphs is enough to force the screen to update, so those little context menus and such pop up.

I attempted to do a complete backup earlier so that I could avoid having to rebuild the entire OS in the event that I replace the motherboard or something like that, but the attempted backup failed twice due to timeout. I'm going to replace the SATA cables as a last-ditch effort before going in and replacing the motherboard and, most likely, having to rebuild the array and OS. If I have to do that, I am just going to move to a RAID 1 array, as it appears RAID 5 could also be the culprit.
 
Just attempted another complete backup and it's working this time around but, although it's got a full gigabit connection from endpoint to endpoint, the transfer rate is pegged at 8%. That's an artificial ceiling that I can only attribute to resource "starvation" or something. Still, 8% of a gigabit connection is around 10MB/s and the full backup is 110GB, so I'm hoping it sustains and doesn't fail for the 3-4 hours this is gonna take.
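
Those figures check out, as this back-of-the-envelope sketch shows. It assumes decimal units (1 Gbit/s = 125 MB/s, 1 GB = 1000 MB) and ignores protocol overhead; `transfer_estimate` is a made-up helper name.

```python
GIGABIT_MB_PER_SEC = 1_000_000_000 / 8 / 1_000_000  # 125 MB/s at line rate


def transfer_estimate(link_fraction, size_gb):
    """Return (rate in MB/s, duration in hours) for a transfer that uses
    `link_fraction` of a gigabit link to move `size_gb` gigabytes."""
    rate_mb_s = GIGABIT_MB_PER_SEC * link_fraction
    seconds = size_gb * 1000 / rate_mb_s
    return rate_mb_s, seconds / 3600


rate, hours = transfer_estimate(0.08, 110)
print(f"{rate:.0f} MB/s, {hours:.1f} hours")  # 10 MB/s, 3.1 hours
```

So 110GB at a steady 8% of gigabit lands at the low end of the 3-4 hour guess, provided the rate holds.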
 
Upgrade to the latest SQL Server version

New versions of SQL Server also contain fixes for bugs, take advantage of new CPU instruction sets, and are filled with the latest and greatest in software development techniques.

Hmm....I know nothing about SQL and it's scary to me, lol. How far behind is 08R2's SQL server version from the newest one?
 
Check the CPU temp. Is it throttling from an improperly mounted heatsink?
 
That's something I assumed wasn't an issue because the system won't even allow a load to be put on the CPU, but I just checked it using RealTemp: idle temps were between 40-44°C and, while transferring a 3GB file, they went up to 60°C. Not great temps, but nowhere near high enough to start throttling. I watched the clock stay at a solid 3.2GHz for the 7 minutes or so it took to transfer the file.


I also went to a PC connected to the same infrastructure as the server and was able to transfer the same file at 60MB/s, so that confirms the infrastructure isn't to blame for that particular scenario.
 
Did you boot a Linux live CD and run some tests?
That would tell you if it was hardware or software.
 