Halloween IT Horror Stories

blarg

IT CAME FROM THE SERVER ROOM!

For a bit of fun, let's post our horror stories from life in IT.

I'm dealing with one right now. My company is consolidating 3 buildings into one new building. The architect didn't bother to include network drops or cabling in the spec, and nobody caught it. Then management decided they wanted the server room and IT department moved to the new building for no reason. For the grand finale, my co-worker who was working with AT&T to move our DS3 and new fiber to the new building quit and didn't leave any info on the project.
 
I came into work this morning to everyone waiting for me at the door. The local telco moved from ATM to GigE equipment overnight. Needless to say, something screwed up: I had no internet, and the MPLS line between us and one of our remote locations was screwed up too. Four phone calls and two hours later, the internet is back. Another 45 minutes and two more phone calls, and the MPLS line is back.

I used to work for the local telco. I know how shit works. Needless to say, heads got ripped off this morning.
 
So, today I have the pleasure of migrating a broken Server 2003 Std + Exchange 2003 Std domain to SBS 2008 Premium. Now, I was not involved in the purchase of anything and no one asked for my input. The owners of this other company just decided to go out and buy their own server after I mentioned their old one was really screwed up and they needed to consider migrating to a new one. That is the short version of the whole story. So, anyway, I'm in the middle of this whole migration, it's going reasonably smoothly, and then the frigging power goes out. Their new ML350 G6 doesn't like their UPS and shuts off immediately. The old server is so screwed up that it couldn't shut down cleanly before their UPS went dead. Went to go grab some lunch and wait for the lights to come back on, which they finally did about 45 minutes ago. I start up both servers and run the SBS 2008 migration wizard again. Now I'm getting errors that the source server doesn't meet requirements... I just want to gouge both of my eyes out with a grapefruit spoon right about now...
 
I have an accounting firm client I don't hear from often; they sort of take care of their own stuff.

4x servers onsite. I get a call early Monday morning (just a few days ago) from one of the early-arrival ladies there; she usually gets to the office about 0630. "I can't get into CSA Payroll!"

I walk her through checking to see if the server is up and running... nada. The other servers are up fine, though.

I have her go over to it: it's ice cold. Flip through the KVM... it's sitting at "Initializing" on the RAID controller at POST.

Ruh roh! :eek: "I'll be right there!"

I get there... power cycle... same thing. Ah well, this server did this once before; I reseated the RAID controller and it came back fine. So I slide the side off, yank the SCSI cable to the drives, pull the card, reseat it, plug the cables back in, power up... this time a different error from the RAID controller. Ugh. I call a buddy at my office, just a few minutes away, and have him run over with a spare RAID controller we have from a recent trade-in, a Proliant server we retired from a restaurant chain in NYC. And a new SCSI ribbon cable. I put those in, and now she initializes... but I get a 1x drive failure in the RAID 5 array. 3x drives total (small, lightweight, old server)... I figure, OK, one drive failed, we'll be alright. Looooong bootup. And of the remaining 2 drives, one of them sounds absolutely horrible, screeching and scratching like fingernails down a chalkboard. The progress bar hangs for a while, and then the lovely blue screen, STOP 0x7B. Oh yeah, that's a great one to see on a server.

Several more times of reseating everything... no avail. Knowing this server was planned to be retired soon and only has a few folders shared on it, I figure I can slap the RAID controller into another Proliant server of theirs and, like slaving another hard drive, just browse the shares, copy 'n paste away, remap some stuff, and they're back in business. So I power down another, bigger Proliant server and put the card and drives in. Boot up, the RAID controller loads... wait a while... open up My Computer... it shows just 1 drive letter (should have shown 2, for the old C and D partitions), and it has no size, no data, nada. Ruh roh... MBR probably dorked.

Take the old server back to the office, boot from the Server CD, and attempt a repair. She manages to boot and begin Windows Setup, and see the partitions, but as soon as I attempt the repair she hangs. Nice!

Boot from UBCD... run Avira NTFS4DOS... manage to see the partitions and kick off a chkdsk, which completes after having some difficulty.

Power cycle the server, Windows starts loading... takes a long time... one of the remaining 2 drives in the broken RAID 5 array making a bit of horrendous noise, but she gets to login, I get to the desktop, and I can browse the D drive. EXCELLENT!

Start to copy data from the D partition to an external USB drive, a little 500 gig Western Digital Passport. Yeah, this server is old... only USB 1 ports. Ugh... approx 50 gigs to copy. Power down the server and take it home with me. Boot up at home, begin the copy process. Goes for about a half hour... that failing drive of the remaining working pair starts making really loud noises, the copy process hangs, Explorer hangs. Power down the server, let it cool, power up... copy some more... about 30 minutes later it starts hanging again. Power down, and this time I break out a floor fan and lay it down sideways on the hard drives, blowing full force, and begin the copy process again. The server makes some ugly noises and the copy process really slows down, but she never hangs again except for 2x files that were corrupted. I nurse her along until about 1am, fall asleep, get up at 6am, and nurse her along some more.

Brought the USB drive back onsite the next day, copied the data to one of the other servers, redid the login script, pretty much all set. Outta there by 1 in the afternoon.
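(If I ever have to babysit a copy off a dying array like that again, I'd skip Explorer and script it with robocopy from the Server 2003 Resource Kit; you knock the retry count way down so it doesn't beat on bad sectors forever, and the log tells you exactly which files it had to skip. Paths, drive letters, and share names below are just placeholders, not the client's actual setup.)

    rem Pull everything off the dying D: partition onto the USB drive (E:),
    rem one retry per file, one-second wait, log anything that won't read.
    robocopy D:\ E:\rescue /E /R:1 /W:1 /LOG:C:\rescue.log

    rem Afterward, the login script change is just pointing the old mapped
    rem drive at the share's new home on one of the surviving servers.
    net use S: /delete
    net use S: \\SURVIVINGSERVER\data /persistent:no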

Yeah, time to sell them a newer backup strategy, as this server just had a local backup... which obviously failed them in this situation.
 
My day went fine.

Earlier this week, we migrated one of our terminal servers from Virtual Server 2005 to VMware Server.

Yeah. VMware Server just makes it run slow as hell, and I have half of the production staff freaking out because they can't punch in.

No problem, power down the one on VMware, fire up the one on VS2005.

Ugh. VMware Tools must not have installed correctly. The VMware one won't shut down.

So I run to the server, yank the network cable from it, and just let the whole thing do its thing... power up the other TS, badda boom, done.

Yeah, I wish they would just let me get some good hardware, namely fast disks.

Not too sold on running VMware Server on top of Win2k3...
 
Yikes, no external backup?

This is one thing I have been anal about at work. In fact, come hell, death, or high water, I am going to push for some 1.5TB drives so I can back up our VHDs to disk.
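(The backup itself wouldn't be anything fancy; a nightly robocopy of the VHD folder onto the big drive would cover it, as long as the VMs are shut down or otherwise quiesced first so the copies are consistent. Paths below are placeholders, not our real layout.)

    rem Copy every .vhd under the VM folder to the backup disk, with a
    rem short retry/wait and a log of anything that failed to copy.
    robocopy D:\VirtualMachines F:\vhd-backup *.vhd /E /R:2 /W:5 /LOG:C:\vhdbackup.log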
 
Today, Friday, at 4pm, I find out that a construction crew repairing a sinkhole on our property sliced through a CATV trunk serving four buildings. FML.
 
How did you guess?? It's only about 5 years old, too. :p

Like one of these SC1000 units? I had a client that got one of those; their ML350 (G4, I think) also freaked out about it, randomly rebooting or flat out not powering up properly at all. Those models don't have enough nut for a decent-sized server.

[image of an APC SC1000 UPS]
 
Yikes, no external backup?

It originally had a tape drive, DAT, but it was a G2 server... a PIII 1.0 or 1.2. Yeah, shows ya the age. Regardless, I guess their backup to tape with BE 9 or whatever got unreliable, so they started doing backup to disk. Normally NTBackup can peel open Backup Exec files, but when doing backup to disk BE chops it into 1 gig files, and NTBackup cannot deal with the catalog of many files making up a backup archive. So restoring from backup wasn't going smoothly. I got to the point where I was going to toss that ancient server anyway, so restoring it locally wasn't the plan.

Never before had to deal with a RAID 5 box where one drive has failed and a second is about to jump off the cliff at any moment... that wasn't fun.
 
Like one of these SC1000 units? I had a client that got one of those; their ML350 (G4, I think) also freaked out about it, randomly rebooting or flat out not powering up properly at all. Those models don't have enough nut for a decent-sized server.

[image of an APC SC1000 UPS]

Pretty much like that, except the older, even more sucky model, a Back-UPS XS 1500. It's yellowed white in color, if that tells you anything. :rolleyes:
 
This is one thing I have been anal about at work. In fact, come hell, death, or high water, I am going to push for some 1.5TB drives so I can back up our VHDs to disk.
Ya, I hear ya. I worked at a place where I had to fight tooth and nail for any kind of backup solution every time we brought a new server online. They all thought I just liked making an ass out of myself and spending money (2 for 2, actually), but then came "The Day". SQL Server crashed; the cheap-ass RAID card (bought prior to my employment) had finally died, as I had been saying it would for almost a year, taking the array with it. Nothing short of an act of God would pull data off those drives at that point.

Except we had the external backups I had been advocating, and I was able to bring them back up and running in a couple of hours (large data set). Of course, I was on contract at the time, so it took me a couple of hours to get out there for them. They learned the value of external backups that day, let me tell you. As to the next lesson, decent UPSes... well, we're still working on it.
 
Like one of these SC1000 units? I had a client that got one of those; their ML350 (G4, I think) also freaked out about it, randomly rebooting or flat out not powering up properly at all. Those models don't have enough nut for a decent-sized server.

[image of an APC SC1000 UPS]

Yep, we had that exact same problem with some of our clients' HP servers too. I believe it was something to do with the PSUs in the servers themselves; I can't remember exactly what it was, though. We contacted APC and they told us which UPSes to use on those servers. We got those (they were pretty expensive) and they worked like a charm. I think it was something to do with the PSUs wanting a proper sine wave or something...
 
Anutter hour and I'm going to a new jazz club with da wife. I'll be sure to have a glass of Shiraz for ya! :D or twenty!

I hate you in the face right now.

I'm getting ready to pack up and head home for a few hours. Need to re-group before I just toss the server out the window. regsvr32 this, regsvr32 that... it never ends...
 
High school computer lab for me. Yeah, showing my age.
[image of an Apple IIGS computer lab]

You know what's sad? That was my junior high school lab, and I'm only 19. :D hahaha. Yeah, they were QUITE behind on technology. My first computer was from when they replaced all of those, so I got to take home an Apple IIGS and some P1s and P2s. That was grade 8. I started into computers in the 2000s on 1980s hardware, haha. I still liked my Commodore 64 and 128 better than my Apple IIGS though... mmm, "LOAD 8"
 
Ya, I hear ya. I worked at a place where I had to fight tooth and nail for any kind of backup solution every time we brought a new server online. They all thought I just liked making an ass out of myself and spending money (2 for 2, actually), but then came "The Day". SQL Server crashed; the cheap-ass RAID card (bought prior to my employment) had finally died, as I had been saying it would for almost a year, taking the array with it. Nothing short of an act of God would pull data off those drives at that point.

Except we had the external backups I had been advocating, and I was able to bring them back up and running in a couple of hours (large data set). Of course, I was on contract at the time, so it took me a couple of hours to get out there for them. They learned the value of external backups that day, let me tell you. As to the next lesson, decent UPSes... well, we're still working on it.


We got the battery backups. I would like more, but it's enough to get the servers to shut down safely.

We had a transformer blow out on the service pole.

One of the sales guys goes, "We should have a generator." :facepalm:

Then the maintenance guy strings 300' of 120V extension cable from one building to another.

I told him hell no; he insisted. I plugged in the battery backup. I have never heard a plug make that kind of snapping sound. :eek: Yeah... that cord got hot fast.
 
So, after having waited all night for MS to call me back, which they were supposed to do at three different times and never did, I gave up and called and told them to have a technician call me at 9am this morning. Went on-site and waited 45 minutes for MS to call; they didn't. So I called again, and they didn't see anywhere in their notes that I was supposed to receive a call back. So they escalated the case, AGAIN, and then told me to wait another two hours for a call back. Needless to say, I am not having a very fun Halloween weekend...
 
I just pushed out an upgrade of critical financial software; it went off without a hitch. In fact, given the way I did it, all the workstations finished the upgrade before I even got here; I just went around and spot-checked.
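(For anyone curious how that kind of hands-off push usually works: a machine startup script that runs the installer silently from a share and drops a marker file so it only runs once. Everything below, the share, the file names, and the /quiet switch, is made up for illustration, not the actual package.)

    @echo off
    rem If the marker file exists, this workstation was already upgraded
    rem on an earlier boot, so do nothing.
    if exist "%WINDIR%\finapp-upgrade.done" goto :eof

    rem Run the installer silently from the deployment share.
    \\FILESERVER\deploy\finapp-upgrade.exe /quiet

    rem Drop the marker only if the installer reported success.
    if %errorlevel%==0 echo upgraded> "%WINDIR%\finapp-upgrade.done"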

Now I'm migrating machines from eDirectory to Active Directory. A boring Halloween so far (just the way I like it!).
 