Weird server overhaul I'm workin' on

YeOldeStonecat
So this metal fabrication shop I took on as a client....some guy working there originally set up their network. On the cheap. Server was an HP baby server..ML110 or ML150...basically a glorified desktop trying to run as a server. The guy purchased it on the cheap..had just 1 gig of RAM, pair of 160 gig SATA drives doing onboard wanna-be RAID 1.
C partition 20 gigs, D partition 140 gigs. C drive with 600 megs free...lol. I came in, immediately stuffed RAM in there to bring her to 4 gigs. 1 of the hard drives in the mirror was tanked..so she was running on broken RAID.

So I come up with a plan to upgrade those drives. Well, I'd want a real server in there..but no budget. So I pick up 4x drives, WD Black Edition drives..1TB each.
Swap out failed drive...allow mirror to rebuild. Swap out 2nd drive, allow to rebuild.
Cheesy software onboard fake-RAID won't allow expansion...so yeah 840 gigs of the first array go wasted..oh well.
Add 3rd and 4th drives...build RAID 1. Plan is to move the D partition from first array, onto the second array..and then stretch old C to take up the remaining newly freed up 140 gigs. So end result is RAID 1 160 gigs for C, and new 1 TB RAID 1 for D.
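
For anyone following the numbers, here's a rough sketch of the capacity math from the plan above. The sizes come straight from the post; the helper name `wasted_capacity` is made up for illustration, and it assumes a "1TB" drive is treated as a round 1000 gigs.

```python
# Sizes in GB, taken from the post; the helper name is hypothetical.
def wasted_capacity(drive_gb: int, array_gb: int) -> int:
    """Space left unusable when a bigger drive joins a fixed-size mirror."""
    return max(drive_gb - array_gb, 0)

NEW_DRIVE_GB = 1000   # WD Black 1TB, assumed to be a round 1000 GB
OLD_ARRAY_GB = 160    # original RAID 1 size, which the fake-RAID can't grow
OLD_C_GB, OLD_D_GB = 20, 140

wasted = wasted_capacity(NEW_DRIVE_GB, OLD_ARRAY_GB)
new_c_gb = OLD_C_GB + OLD_D_GB  # C stretched over the space D used to occupy

print(f"wasted per drive on array 1: {wasted} GB")
print(f"C after the move: {new_c_gb} GB, D on array 2: up to {NEW_DRIVE_GB} GB")
```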

Here's the fun part. I fire up Acronis Disk Director...and in prepping the partition adjustments....I find that the D drive is primary and active! Yeah...boot.ini, ntldr, ntdetect are on D! Yet \Windows is on C. :eek:

WTF.

Just moved 'em over to C and edited boot.ini, changed primary 'n active over to C via Acronis, reboot time!

Sittin' here doing this stuff remotely from the comfort of my couch while watching WWII in Color on History Channel. Some nail biting reboots about to go on here!
 
It surely amazes me at some of the stuff people do...We had an sc440 running exchange, sql, a terminal server, AD/dns/dhcp, and server for the ERP software we use when I first got into the IT end of things...yeah....We have beef now...

I've been trying to get some IT talent in where I work so I can get away from it 100%. One guy they found worked at a company of 10 people and did all of their IT needs....this person was recommended for the job:eek:. If possible, I want someone more skilled than I so I sleep wonderfully at night!

The issue is ALWAYS budget. On the upside, everything from servers to network to cabling has been 100% redone and as the old stuff breaks new stuff shows up. I have 0 old desktops being used, everything is now thin client, laptop, or modern desktop. Next up, network printers and scanners and dump the remaining desktops:)
 
Dang. Good thing you're on the couch....makes any waits more enjoyable.;)
 
Doh! Over a half hour now....guess she didn't like the adjustments of that last reboot..she ain't back up. ***sigh

Owner in San Diego now...just texted him that the server had a glass jaw and went down for the count! Small shop of 8 or so peeps....he's really the only one that would slightly miss no e-mail to his smartphone over the weekend.

Until then, I'll be rewinding the steps I did in my head wondering what step went awry.
Copied ntldr, ntdetect, and boot.ini to C. Edited boot.ini to change from partition 2 to partition 1. Renamed old boot.ini on D. Used Acronis Disk Director to change C to primary partition and make it active. Reviewed twice more..committed change, rebooted.
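
To make the boot.ini step concrete, here's a minimal sketch of that edit. The sample file contents are hypothetical (the real boot.ini from this server was never posted), and `repoint_partition` is just an illustrative helper. It only shows the partition(2) to partition(1) swap in the ARC paths, nothing more.

```python
import re

# Hypothetical boot.ini contents for illustration; the real file on this
# server was not posted.
SAMPLE_BOOT_INI = """[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(2)\\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\\WINDOWS="Windows Server 2003" /fastdetect
"""

def repoint_partition(text: str, old: int, new: int) -> str:
    """Swap partition(old) for partition(new) in every ARC path."""
    return re.sub(rf"partition\({old}\)", f"partition({new})", text)

updated = repoint_partition(SAMPLE_BOOT_INI, old=2, new=1)
print(updated)
```

Both the default= line and the entry under [operating systems] need to agree, which is why the sketch rewrites every occurrence rather than just the first.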

//scratches head

Ah well, Monday morning..a little recovery console 'n FIXBOOT to kick off the day.
 
Was it in a domain or nah?

SBS03.
Yeah..SBS03 was running on 1 gig of RAM on SATA disks..lol. Oh yeah..and it's just a Pentium D processor, not even a freaking Xeon.
So SBS, (Exchange), Quickbooks database server, SolidWorks server...deals with huge graphic files..runs on SQL Express. And some server software to control their WaterJet machine.
 
Looks like rebuild time.

I normally take an image first with Acronis/Paragon, then I would just build a new RAID array and do a restore.

Let me know how it goes.
 
Bah, no need for a rebuild...just need to get to recovery console and fire up fixboot. Or any disk/partition editing software off a util CD for that matter. My hunch is Acronis didn't do a complete job of the bootsector on partition 1.
 
Wow. That's awesome. I'm just doing an SBS2011 migration this weekend. At least the source server is an actual server...
 
Do you have an IP KVM there or does the server support IPMI?
 
If that HP ML150 has the lights out system, then he can attend to most of it remotely without worry. Depending on which version of Acronis he used he may have created his own issue. SolidWorks is a beast of a program. We use that at work. The CAD machine gags running some of those files. Win 7 x64 8GB 3.0GHz Core quad with a 2GB NVidia CAD card we paid near $2k for. Then again our SolidWorks files sometimes take upwards of 30 minutes to compile. When I first got involved they were using Windows XP 32bit with 4GB and the compile times were near 16 hours. They changed them to Win 7 x64 and the compile times dropped to 3 minutes. Adding another 4GB dropped it another minute and let them actually have 2 projects open at the same time.... who'd a thought?
 
A little bit too risky to do it remotely, don't you think?

Depends on the client. I do a lot of remote work...to keep the costs down for the client.
Certainly say.. a healthcare agency, or a nursing home, someone that needs 24x7x365 stuff...yeah, would do onsite. This is a small shop that's 20 minutes away and not open on weekends. Having me do upgrades at my leisure allows for lower cost to owner. Versus having me be there after production hours....which means higher rates. Yeah..didn't work this time, but >90% of the time it works.
 
I wonder if Acronis booted into its CD-based thing. I know when I resize a drive with those tools and restart, it takes a while to come up.
 
Heh. I have a town government client with a similar setup put together by a previous, now out of business computer repair shop. Dell PE840 I think, Pent D 2.4 I think, 1GB RAM, two 160GB HDDs not in any RAID format, running Server 2003, domain naming follows the convention of townname.com. Not .local. When I first started working for them, they had no centralized antivirus program. Had AVG Free on several computers. ClamAV on the server, which I don't think was updating properly. They were instructed to do tape backups using a single tape that had never been removed in probably 2 or 3 years. Server and tape drive were full of dust. NTBackup wasn't even working either.

Right now, this server is dog slow. Barely does anything. File storage for 7 computers. Trend Micro WFBS that I put on a couple years ago got upgraded to a newer version of the software, and that probably explains why it is slow. I think I may look for a new AV product (but that's a separate thread).

Anyway, I've ordered a new Dell T410 to replace the server, using the specs from a suggestion thread I posted here a couple months ago. Just waiting for Dell to build the server to specs, and ship.

I too have had nail biting experiences while working on clients' servers after hours or on weekends. If those damn Spiderlinx KVMoIP units were less than $100, I probably would buy a few of them and install them at key clients, ones with monthly maintenance contracts. That'd save my butt on reboots that fail for whatever reason.
 
That's a nice server you went with, I do a lot of them. Have one next to my desk, 2.53GHz, 4 250GB SATA, 8GB for a 4 person office with SBS 2011. Exchange, QB, and files.

I have yet to use the IP KVM on that.
 
Stonecat, you should try out Paragon, it will make migration to new hardware and RAID configuration change suspense a thing of the past.
 
Re: boot.ini, ntldr, and ntdetect sitting on D while \Windows is on C. I remember in the old days when we used to do that on purpose, because the old school viruses couldn't read D: drives, so if you installed your boot stuff on a non-C:\ drive you couldn't get affected by those viruses. :p Good ole days.
 
Weirdest thing I've seen in a long time.
She simply will not boot if I make the C partition active. "A Disk Read Error Occurred, Press ctrl+alt+del to restart" comes up any time I make her active. Confirming boot.ini, ntldr, and ntdetect are on the C drive, and boot.ini is edited correctly to the first partition.

Made her active in Disk Mangler, or try it with Acronis, or try it from a linux CD with GParted....same issue. System won't boot. So I gotta get back in there booting from the linux rescue CD and use GParted to make the 2nd partition active again.
 
You can try to boot up anytime after any of these operations have completed.

Step ZERO .... make sure you physically know which drive is the C: drive. Verify this by unplugging a drive and using a boot disk.

Step One move boot files to the correct drive and make the C: drive active.

Step Two confirm boot.ini

Step Three chkdsk C: /f

Step Four fixmbr

Step Five fixboot

Step Six verify the system is booting to the correct drive via BIOS (unplug the other drive....triple check the boot order).

Step Seven bootcfg /scan

Step Eight bootcfg /rebuild


Should now work.
 
You can try to boot up anytime after any of these operations have completed.

Step ZERO .... make sure you physically know which drive is the C: drive. Verify this by unplugging a driver and using a boot disk..

How do I unplug a driver?

Physically, knowing which drive is the C drive is a little more complicated..it's a RAID setup, see my first post. 160 gig RAID 1 array, 2x partitions..20 gig C and 140 gig D. Can't just unplug one drive to test.

I went through all those other steps a few times over this morning. There's something wonky in there. I'm wondering if, since the C drive is so jammed full, ntldr got copied too deep into the partition (it used to insist on sitting within the first 2 gigs of a partition..used to be wonky in the NT 4 days, I thought they cured that issue with sp5 or sp6 or at least 2K Server...but maybe not).
 
What about making an image? Copy the annoying partition elsewhere and perform a repair install. Then repartition and copy the other partition's data back.
 
I'm telling you Stonecat, give Paragon a try. Look forward to reading an update from you.
 
Update...in order to get the server up and running for the client (was starting to get near noon..I didn't want them to run much more than 2-3 hours without their server)...had to come up with a quick fix. Even had my colleague come in..he's wicked experienced with server disasters and he was stumped at this issue.

I tried Acronis, I tried Paragon, and GParted. All had the same result. This server would simply NOT properly allow me to switch primary and active partitions, without giving that error on the next reboot.

So in order to get the server back at the client and in production....combined with allowing me to do the partition moving later on, remotely, at night....we did a sort of quick and dirty trick.

"Made an NT boot disk" on a CD. I copied the ntldr, ntdetect, and boot.ini to a CD. Server has CD as first bootable device. boot.ini points to the partition that Windows is on. Bootable device on the server (set to CD) over-rides whatever the Active partition is. This way I can now juggle the partitions around..and not worry about which drive/partition Windows "thinks" is active or not....the bootable CD will always over-ride.
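
If you want to sanity check the folder before burning, something like this rough sketch would do. It only checks that the three loader files are present in a staging folder; `missing_boot_files` and the staging folder are made up for illustration, and it says nothing about how the CD itself gets made bootable. Note that ntdetect lives on disk as NTDETECT.COM.

```python
from pathlib import Path
import tempfile

# The three loader files the post mentions; on disk ntdetect is NTDETECT.COM.
BOOT_FILES = ("ntldr", "ntdetect.com", "boot.ini")

def missing_boot_files(staging: Path) -> list[str]:
    """Return the loader files not present in the staging folder (case-insensitive)."""
    present = {p.name.lower() for p in staging.iterdir() if p.is_file()}
    return [name for name in BOOT_FILES if name not in present]

# Demo with a throwaway folder that is deliberately missing boot.ini.
with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp)
    (staging / "NTLDR").write_bytes(b"")
    (staging / "NTDETECT.COM").write_bytes(b"")
    demo_missing = missing_boot_files(staging)

print(demo_missing)
```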

Once the partitions are separated from the current 1x array..and moved around..I'll go onsite and try to run without the CD...see if the active partition will "stick" properly after all that juggling.
 
Hmmm, stupid question maybe, but ever thought the problem lies within the RAID setup, and by you moving stuff it does a sync and breaks whatever you have done? Sometimes it is as stupid as the RAID software running in Windows that keeps all that in check but also breaks it when you do something "abnormal".
 
Why don't you simply get a couple USB 2 TB drives, back up the data off the RAID, and just kill it and start over? Save yourself and your client hassle for the future. (As much as you can when it comes to Windows and servers)
 
Interesting experience. You'd probably get sick if you had to walk in our "server room" (I almost did) with its spaghetti mess and 10 year old desktops running OpenBSD [one of our recently replaced critical routing servers had a Windows 98 license on the tower], but we're running four-nines (99.99%) uptime over the past 5 years.... so it can't be that bad. That said, all the *real* servers are now HP DL380s.

Good choice on the Enterprise Black drives, they've been rock solid for me.

HP ML150s are perfectly fine hardware, especially for non-critical servers. They're extremely cost effective for our massive file archive dumps because first party SCSI/SAS drives are absurdly overpriced compared to generic enterprise-level SATA drives. Cost per TB starts mattering a lot more when you're dealing with 50 TB, a lot of which will never be accessed again unless we get audited.

1GB RAM is a bit of a serious deficiency though....
 