Official Red's server room thread

Took a break/got busy over Christmas. Back at it now, though I got delayed since the screws for the battery rack fans aren't long enough, so I need to buy longer ones tomorrow. I also ran into an issue with the converter for the PDUs... bought the wrong ends. So it looks like I have to put everything on hold until tomorrow, when I can get to the hardware store. I wish Home Depot was 24/7... I don't know how many times I've been delayed because I had to wait till the next day to get something.

That said,

The DC wiring is in:





Not exactly cable porn, but it works. That stuff ain't easy to work with that's for sure.

The front of that:



That meter should also tell me % capacity left and other details when running on UPS. I still have to calibrate it, but it's probably best to wait till it's online.

PDUs are in as well:



Currently, one PDU will be UPS protected, and the other will just be surge. Once I add more UPS capacity then I'll just move the surge one over to it. Most of the load will be on the first PDU that is UPS protected. I can put up to 750 watts on it.
 
you should put an EPO switch somewhere
 
needs an EPO and a wiring panel, also you should probably not have your fuses mounted to a 2x6, they get hot if they blow.
 
The over/under on a fire destroying Red's house is +/- 6 months.

Bets start at 0.10BTC.
 
I already have this setup, this is just an improvement on it. The current setup has no fuse, no cut off switch, and the batteries are sitting on the floor right next to the inverter. There has not been any fire in the 1+ year I've been running it.

Those fuses are in plastic enclosures, if they could get hot enough to make wood spontaneously combust I'm pretty sure they would not be made of plastic. Ever try to hold a lighter to a piece of wood? It's not that easy to catch on fire without some paper and kindling. We are talking 100 amps (max) at 12vdc here, not 100,000.

I will be adding monitored smoke detectors on both sides once I get around to experimenting with some so I can interface with my monitoring system. I currently have one already installed though it's not monitored.

I was actually thinking about an emergency power off switch, though it would require some heavy-duty contactors (AC and DC side). It's something I can always add in the future. Right now all I'd have to do is kill the breaker and both battery switches.
 
It's not difficult.

get something like this

http://www.automationsource.com/p-6...-yellow.aspx?gclid=CLzTm8mB67sCFSvl7AodVwgAhg

and have it attached to some large contactors for your DC and AC sides. Press the button and the contactors de-energize, turning off your AC and DC simultaneously.

also those automotive fuses do melt, I have a friend that used to do high end car audio stuff and I have seen the fuses and holders melt without the fuse itself failing...
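
For a rough sense of how a holder can overheat without the fuse ever blowing: heat at a connection goes as I²R, so a slightly loose or corroded contact dissipates real power even below the fuse rating. A minimal sketch; the contact resistances are made-up illustrative values, not measurements from any setup in this thread.

```python
def contact_heat_watts(current_a: float, contact_ohms: float) -> float:
    """Power dissipated in a resistive contact: P = I^2 * R."""
    return current_a ** 2 * contact_ohms


# Hypothetical contact resistances, from a clean joint to a corroded one.
for milliohms in (0.2, 1.0, 5.0):
    watts = contact_heat_watts(100.0, milliohms / 1000.0)
    print(f"{milliohms} mOhm at 100 A -> {watts:.1f} W")
```

Sustained heat in a small plastic holder would be consistent with holders softening while the fuse element itself stays intact.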

a large battery plant is no joke... I have worked on some 120v battery plants (yes, 120V DC) that are freaking scary...

dropped a screwdriver across some terminals? don't worry about it, the screwdriver turned into green smoke and a bright flash!

battery plants are great, nice to see someone doing one but they are extremely dangerous if not implemented properly and most importantly respected!
 
That link is not loading, but yeah, that's what I'd do: get some contactors controlled by the switch. Probably going to do that in the future. Being in Canada, stuff like this is harder to get; I typically have to order from eBay and wait many weeks or even months for shipping from China/US, so it usually requires advance planning. Right now an emergency shutdown would simply require turning both of those switches off. The inverter-charger won't supply AC power without batteries even if AC is on. Even if it did, I could just shut the breaker off too.

Though if I do the emergency cut off I could incorporate it with my home environmental/control system so it could be done remotely too. Heck, even automatically based on certain environmental conditions. Ex: extremely high or low temp. Could be a two stage system, first stage sends shut down signal to all servers, second cuts off all power. A serious life emergency would skip the first stage.
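
The two-stage idea could be sketched roughly like this. It is only a decision function, assuming the control system polls a temperature reading and knows whether the servers have already gone down; the threshold values and all names here are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    GRACEFUL_SHUTDOWN = "graceful_shutdown"  # stage 1: signal all servers to shut down
    CUT_POWER = "cut_power"                  # stage 2: open the AC and DC contactors


@dataclass(frozen=True)
class Limits:
    # Hypothetical thresholds in degrees C; tune for the actual room.
    low_temp: float = 5.0
    high_temp: float = 40.0


def decide(temp_c: float, life_emergency: bool, servers_down: bool,
           limits: Limits = Limits()) -> Action:
    """Pick the action for one polling cycle."""
    if life_emergency:
        # A serious life emergency skips stage 1 entirely.
        return Action.CUT_POWER
    out_of_range = temp_c <= limits.low_temp or temp_c >= limits.high_temp
    if not out_of_range:
        return Action.NONE
    # Stage 2 only runs once stage 1 has finished.
    return Action.CUT_POWER if servers_down else Action.GRACEFUL_SHUTDOWN
```

A real version would also want a timeout so that stage 2 still fires if a server hangs during shutdown.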

And yeah, I've heard plenty of horror stories about battery plant accidents; you have to treat them with respect. I can't find many details on it, but I heard about an explosion in a Toronto CO many years back. A wrench was dropped on the battery terminals. Practically the whole building was gone and everyone was without phone service till they got an emergency mobile CO wired up.
 
Fans are in!



Tomorrow is my last day off till I have to go back to work so I'll probably take a break, but if I get up early I might do the shut down and move the inverter-charger over to the new rack and the other 2 batteries and hook everything up. It should be the last time in years that I have to shut down that box so hopefully it comes back up gracefully then I don't have to worry about it anymore.

I'll have to rewire the alarm points as well for the voltage and AC fail conditions. I'll also add the H2 sensor near the exhaust fan on the back.
 
This is probably the final update for a while. I have some cleaning up to do but everything is more or less done. The inverter-charger has been moved to its new and permanent location. Today was a big day as that involved shutting down everything. It was eerily quiet in the basement for a while. I did not have to fight too hard with the old dinosaur server, though I did have to deal with an NTP and DNS issue. The RAID array was OK, which is a relief since it usually fails when that server is shut down. I will more than likely get a drive failure soon, but it's nice that I don't have to deal with it right off the bat.


Both battery banks online:





Inverter in the new rack:



Power cable management done and everything plugged into proper PDU:





Full view:




I also kinda regret not building my own PDU with multiple inputs/outputs, as these are kind of flimsy and the meter is only precise to the nearest amp. On the other hand it would have been a lot of work, and I could never have made something this slim, as I'd want to use proper junction boxes and treat it like a wall outlet if I had done it custom.


I'm just happy that the server shutdown went more or less smoothly and now everything is online. My voltage divider seems to be inaccurate despite me calibrating it, so I need to find out why. My phone was telling me 12ish volts when the actual voltage was 11.9. I will let the batteries charge up and equalize properly and then run a load test again. The amp meter display near the battery switches jumps around so much that the reading is practically useless too, which is unfortunate as that would have been a good way to find out what my load is and do a rough estimate of my run time.
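
For anyone following along, this is roughly what the divider math and a two-point calibration look like. It is a sketch only: the 10-bit ADC, 5 V reference and resistor values are assumptions, and the second calibration point in the usage lines is invented for illustration.

```python
def adc_to_volts(count: int, vref: float = 5.0, adc_max: int = 1023,
                 r_top: float = 30_000.0, r_bottom: float = 10_000.0) -> float:
    """Convert a raw ADC count to the voltage at the top of the divider."""
    v_pin = count * vref / adc_max               # voltage seen at the ADC pin
    return v_pin * (r_top + r_bottom) / r_bottom  # undo the divider ratio


def two_point_calibration(raw_lo: float, true_lo: float,
                          raw_hi: float, true_hi: float) -> tuple:
    """Return (gain, offset) such that true ~= gain * raw + offset."""
    gain = (true_hi - true_lo) / (raw_hi - raw_lo)
    offset = true_lo - gain * raw_lo
    return gain, offset


# Reported 12.0 V when a meter said 11.9 V; the 13.8 -> 13.6 pair is hypothetical.
gain, offset = two_point_calibration(12.0, 11.9, 13.8, 13.6)
corrected = gain * 12.0 + offset
```

Calibrating at two points spread across the working range (one near low-voltage cutoff, one near float voltage) corrects both gain and offset error, which a single-point tweak cannot do.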

But all in all, I'm fairly happy with the result and that this is done. I was nervous about doing the power switch over because of having to shut down the main server. I was even contemplating making a suicide cord so I can plug it into the power bar and plug it into the same AC leg as the inverter, so that when I unplug it from the inverter there is still power on the power bar, and I can feed the cables through that way and "walk" the power over. But then I would have been stuck with a power bar, I wanted to completely get rid of power bars.

The non UPS PDU has a power bar for the purpose of surge protection, but at some point I may buy a proper rackmount surge protector/filter.
 
Absolutely epic build!! Great job and I personally love the telco DC cable!! That stuff is seriously tough to work with.
 
Just did a battery load test today. Got about 4 hours till the voltage got dangerously low. I think the inverter shuts off at 11 volts and I caught it at 11.04 so I turned the breakers back on.

TBH, I'm not that impressed, considering I was getting 4 hours with just the two batteries. I'm thinking they may need to equalize properly, and maybe the charger did not top them up enough, so now that I've discharged them and am charging them up again they may equalize better. I also need to check water levels once they are fully charged; the older two are probably due for some.

Still 4 hours is not too bad though, and I have to remember I did add more equipment since the last time I did a load test. Probably 100w worth or so. (server and switch) I know server was only using like 70w when I tested and I never tested the switch.
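
A back-of-the-envelope runtime estimate can be sketched as below. Every number passed in is an assumption (bank capacity, inverter efficiency, how deep you are willing to discharge); none of them come from the load test above, and the model ignores the fact that lead-acid batteries deliver less capacity at higher discharge currents.

```python
def runtime_hours(capacity_ah: float, load_w: float, bus_v: float = 12.0,
                  inverter_eff: float = 0.85, usable_fraction: float = 0.5) -> float:
    """Rough runtime: usable amp-hours divided by battery-side current."""
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    battery_amps = load_w / (bus_v * inverter_eff)  # DC current drawn from the bank
    return capacity_ah * usable_fraction / battery_amps


def extra_load_penalty(capacity_ah: float, base_load_w: float, added_w: float) -> float:
    """Hours of runtime lost by adding `added_w` watts to `base_load_w`."""
    return runtime_hours(capacity_ah, base_load_w) - runtime_hours(capacity_ah, base_load_w + added_w)
```

Plugging in a measured load before and after adding equipment gives a quick feel for how much of a disappointing test result is down to the extra 100w or so rather than the batteries.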
 
Well, here is the problem with using dd: the actual sled the drive is in has electronics on it that communicate with the enclosure backplane. A section of the disk (the last 40MB or so), called "DACStor", holds information such as the enclosure serial number, some signature data, and in particular the sled serial number (the sleds have unique serial numbers).

The reason you can't just dd the DACStor region from one disk to another is because the cloned DACStor wouldn't contain the correct sled serial number. Basically the disks are "married" to the sleds at the factory.
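
If you just want to look at that region (read-only), the arithmetic for locating it is simple. This sketch assumes the region is exactly 40 MiB and that the disk uses 512-byte sectors; the real size is only "40MB or so", so treat the result as a starting point for inspection, not a layout spec.

```python
def tail_region(disk_bytes: int, region_bytes: int = 40 * 1024 * 1024,
                sector_size: int = 512) -> tuple:
    """Return (first_sector, sector_count) for a region at the end of a disk."""
    if region_bytes > disk_bytes:
        raise ValueError("region is larger than the disk")
    sector_count = region_bytes // sector_size
    total_sectors = disk_bytes // sector_size
    return total_sectors - sector_count, sector_count
```

The two numbers map directly onto dd's skip= and count= arguments (with bs=512) for dumping the tail of a disk to a file.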

The procedure I outline in the link I sent (if you had a full controller), outlines a method to force the controller to accept disks that weren't factory signed, as well as a procedure to force the controller to write a new DACStor to the disk inserted. This process will basically marry the disk to the sled for you. However, again, because there is no way for the controller to sign the DACStor region (again this is done at the factory), you have to do some tricks to force disable disk signing in the controller.

This procedure isn't possible with the JBOD expansion controllers you have, however, because those controllers don't have a serial connection and are really just "dumb" expansion devices with a small amount of logic to run things.

And as far as the sector sizes, it is possible for the controller to do odd things like 520 bytes per sector instead of 512 bytes per sector. But I've personally never seen it with the LSI based controllers (which is what the IBM Fast-T and Sun/LSI StorageTek enclosures use).

Hello gatekeep,
This is my first post to this forum so I hope my message formats correctly. I'll take my best shot at not violating any etiquette as well.

I also have one of these disk arrays. Mine is an SGI TP-9300s. It uses the same or very similar sled/interposer/drive marriage as the IBM DS4000 series.

It took me a while to realize the proprietary nature of drive replacement on these systems. At first I just had the expansion chassis with the Emulex SR-1216 storage router & ESM units. When I had a drive fail and I couldn't get a (genuine Engenio!) replacement drive to be recognized, I thought I might need the raid chassis and controller to somehow prep the drive. Imagine my distress after setting that up to discover that the replacement drives are still not recognized, but simply disappear from Santricity shortly after being inserted into the shelf.

It appears that the signature check is made in the expansion chassis by the SR-1216 in my case. I was thrilled beyond belief when I scoured the 'net and found your "work-around". Sadly, even though my #2882 controllers have the VxWorks serial console and support entering all that you wrote, that flag doesn't seem to be implemented in my firmware. Perhaps LSI locked that down in later releases or it was an OEM request. The bummer is that I am actually using the correct, CERTIFIED Engenio drives and sleds! However as "reconditioned" or re-certified drives, the DACstore area has been wiped clean and thus doesn't match the signature in the interposer card.

The real trick here seems to be how to read or determine the unique signature of the interposer module. If that was known, it could be eventually found on the disk and perhaps manually edited.

However, I am not aware of any way to read the signature from that sled. I have tried doing various hex and ascii searches for the outwardly visible numbers on the sled but no soap. I figured that the signature must be in the DACstore area since that is the only reserved area on the disk. The HPA and DCO are disabled and/or not used.
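
For reference, the kind of search described above can be scripted along these lines. It is a sketch that checks a few guessed encodings of a label in a dumped image; the list of encodings is an assumption, and the signature may well be stored in some form none of them match.

```python
import string


def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which `needle` occurs in `data`."""
    offsets, start = [], 0
    while True:
        pos = data.find(needle, start)
        if pos == -1:
            return offsets
        offsets.append(pos)
        start = pos + 1


def search_label(data: bytes, label: str) -> dict:
    """Search `data` for several encodings of a number printed on the sled."""
    if not label:
        raise ValueError("label must not be empty")
    encodings = {
        "ascii": label.encode("ascii"),
        "ascii-reversed": label.encode("ascii")[::-1],
        "utf-16le": label.encode("utf-16le"),
    }
    # If the label is itself valid hex, also try it as packed bytes.
    if len(label) % 2 == 0 and all(c in string.hexdigits for c in label):
        encodings["packed-hex"] = bytes.fromhex(label)
    return {name: find_all(data, needle) for name, needle in encodings.items()}
```

Running it over a dump of just the reserved region from a known-good drive and its sled label would narrow things down quickly if the number is stored in any plain form.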

If you or anyone else knows a way to turn off signature checking on these LSI / Engenio / SGI controllers, please speak up! :D

The bare drives are made by Seagate and are model number #ST3400832NS, and must have minimum firmware version 5.23. They are Engenio part number #17105-01. As I mentioned earlier, I am using the correct replacements but they are not "new" and therefore (apparently) don't have the right-- or possibly any-- matching signature. Alternatively, if someone knows a company who can read the interposer chip and write the proper signature on the drive as a service, I'd sure be interested in knowing about that as well. Right now, I have 4 Engenio replacement drives and sleds that appear perfectly good but don't have the signatures, effectively bricking them for this application.

Gatekeep, if you or anyone else has a potential fix for this issue, it would be fantastic! My eyes and ears are now open... ;) Thanks for listening!
 
The real trick here seems to be how to read or determine the unique signature of the interposer module. If that was known, it could be eventually found on the disk and perhaps manually edited.

The interposers contain a controller with embedded logic firmware. It's not easy to read the data out of them. Without the original data from the disk in that sled it's nearly impossible to get the correct DACStor information. The other problem with "dd"'ing is that the region usually contains information that tells the backplane how large the attached disk is, which is why my original procedure contains instructions to write a DACStor region to the new disks.

If you or anyone else knows a way to turn off signature checking on these LSI / Engenio / SGI controllers, please speak up! :D

If the driveSignMode=0 variable doesn't work, it is very likely SGI had LSI remove the ability to switch off signing. It is also possible that LSI removed the variable in newer versions; the array I had was a Sun StorageTek 2655 with older firmware (but nearly the same firmware as on the IBM DS4000).

PS: We should try not to hijack Red's thread! Sorry Red!
 
Thanks for the info, gatekeep. Please accept my apologies for hijacking your thread, Red! I should have started another and hoped that gatekeep would find it. Will do so next time!

BTW Red-- awesome attention to detail in the AC and DC power infrastructure!
--RPM
 
No problem, if you do start a thread maybe link to it from here, I'd be curious myself if there happens to be an easy work around. Though I don't really use mine in production so I would not really do anything drastic.

I also read somewhere these have a total max capacity anyway so even if you did manage to get 3TB drives in there I don't think you'd see all 3TB.
 
FWIW; Most of these arrays are based on the older LSI tech, IIRC the firmware doesn't have the capability to read SATA drives beyond 2TB. They will get flaky with anything above 1TB (it'll work but the array may act strangely).
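
A 2TB ceiling is exactly what 32-bit sector addressing with 512-byte sectors produces. Whether that is the actual cause in this firmware is an assumption, not something confirmed here, but the arithmetic is easy to check:

```python
def max_addressable_bytes(lba_bits: int, sector_bytes: int) -> int:
    """Largest capacity reachable with a given LBA width and sector size."""
    return (2 ** lba_bits) * sector_bytes


limit = max_addressable_bytes(32, 512)
print(limit, "bytes =", limit / 2 ** 40, "TiB")
```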
 
Just came across this, saw your old post about the original car battery inverter/ups setup.

Good to see the updates!

I am interested in more info on your monitoring and HVAC/control plans with the Arduinos.
 
Currently the monitoring is a fully custom app. I will probably release it at some point; there are still some bugs to work out, and I may expand its features. The Arduino is basically just coded to display a list of all the input values. A script connects to the serial port and updates a file with the values. Then the monitoring software just looks at the file using sed and other commands to get the right line/value.
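
As an illustration of the file-lookup step (not the actual app, which uses sed), here is a small Python equivalent. The one-value-per-line "name=value" format and the point names are assumptions made up for this sketch.

```python
def parse_values(text: str) -> dict:
    """Parse 'name=value' lines into a dict of floats, skipping anything malformed."""
    values = {}
    for line in text.splitlines():
        name, sep, raw = line.partition("=")
        if not sep:
            continue  # no '=' on this line, e.g. a banner or partial write
        try:
            values[name.strip()] = float(raw)
        except ValueError:
            continue  # value was not numeric
    return values


def read_point(text: str, name: str, default=None):
    """Look up a single monitored point by name."""
    return parse_values(text).get(name, default)
```

Skipping malformed lines matters here because the file can be read while the serial script is halfway through rewriting it.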

Here's some screenshots:








Did not start on the HVAC yet; I will probably do that in the summer. Basically it will be a series of dampers or similar to move air around based on various environmental conditions, while trying to save the heat in winter. Going to need a lot of dampers for this though, and those are not cheap, so I may have to figure out a different way to do it. There will also be an HRV involved. Maybe I will build an HRV where the core can move around on a rail, and the core will have different sections to direct air through the various pipes. For example, if it's hot in the server room but very cold outside, rather than put the heat outside it will move it elsewhere in the house and grab air from the crawlspace for the server room. Then in summer it will exhaust outside. I made a few plans but have not settled on one yet.
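
The routing rule in that example could be written down like this. It is purely a sketch of the decision, with hypothetical temperature thresholds, and it leaves the summer intake source unspecified because no source has been chosen for that case.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    exhaust_to: str              # "none", "house" or "outside"
    intake_from: Optional[str]   # "crawlspace", or None where not decided


def route_air(server_room_c: float, outside_c: float,
              hot_room_c: float = 27.0, cold_outside_c: float = 0.0) -> Route:
    """Decide where server room heat should go for the current conditions."""
    if server_room_c < hot_room_c:
        return Route("none", None)            # room is fine, leave dampers alone
    if outside_c <= cold_outside_c:
        return Route("house", "crawlspace")   # keep the heat, pull cool crawlspace air
    return Route("outside", None)             # warm season: dump the heat outdoors
```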
 
Impressive, but my main question is what you caught in the mouse trap!
 
That one always falsely triggers for some reason, I only bother going up in the attic if all 3 trip. So far so good. I'll have to fix that one trap some time. I think I just have to sand down the contacts, and I may need to look at doing some kind of current limiting on the loop. I think it corrodes.
 
Battery maintenance day!



Probably been over a year since I've done it. All cells checked out ok, had to add a bit of water in some of the cells but that's about it. Still need to check the other bank though.
 
New server! Will replace my current server which is an "everything" server including VM server. The old server may potentially be kept around and once I migrate stuff over I will update it to the same VM solution I use for the new server, and it can act as HA or what not. Though I'm starting to get near my UPS's capacity so I'll probably end up leaving it off or on a smaller UPS.

Hardware:

Intel Xeon E3 1270 V2 Quad Core Processor 3.5GHZ 8MB LGA1155 69W Retail Box *IR-$27*
Kingston KVR16E11K4/32 32GB 4X8GB Kits DDR3-1600 CL11 DIMM ECC Quad Channel Memory
Kingston SSDNow V300 120GB 2.5in SATA3 LSI SandForce Solid State Disk Flash Drive SSD
Supermicro 822T-400LPB 2U 6 3.5in Hot Swap 7LP EATX 400W Chassis
Supermicro X9SCL+-F mATX LGA1155 DDR3 ECC 2GBE IPMI 3PCIE 6SATA2 9USB2.0 Motherboard

Racked it today. It's now on UPS power and fully connected and labeled.









Starting to run out of ports on my switch! I have a 10/100 switch I scored cheap off ebay a while back, I might actually put it into service. A lot of stuff can be moved to 10/100 such as my HTPC, IPMI, Wifi APs etc...

Now that it's out of the way I will just leave memtest running till I decide to start working on it. The nice thing is from this point I can do everything remotely including installing an OS. I love the fact that it has a dedicated IPMI port, so when I decide to mess around with link aggregation I don't have to worry about losing network.
 
Probably more to come in fall.

Need to build the HVAC system, pipe it up, and once it's active I'll start sealing the room up. Still plan to go to a 48v dual conversion system as well, but I think that will be a while; it probably won't be cheap. :p
 
I love the fact that it has a dedicated IPMI port, so when I decide to mess around with link aggregation I don't have to worry about losing network.

Supermicro makes great gear and I, too, love the IPMI port. It's nice having a dedicated switch for the IPMI ports that is segmented from normal LAN traffic.

All of my whitebox server stuff is based around Supermicro motherboards and cases.
 
Simply amazing, makes my network look like a high school project :p. Keep the updates coming Red!
 
Thanks! Not much has changed since I last updated.

Did have a battery go into thermal runaway though. That was interesting. I also realized that I will face a challenge when I go to a 48v system if I ever do get a bad battery. These days, companies manufacture a batch of something, then change it, so you can't get the exact same battery years down the line. Not a big deal when they're all in parallel, but in a 48v system that would be a big problem, since you want all batteries in a string to be the same. So when I switch to a 48v system I need to have a contingency plan in case I get a bad battery. Maybe I can have a spare that is on a separate 12v trickle charger or something. Ideally I'd be going with higher end batteries and not get any failures. I want to look into solar, so I'll probably look at more batteries at that point. Trojan golf cart batteries or something. They'd be put in a separate room too. That battery cabinet, as nice an idea as it was at the time, is cumbersome, as I need to disconnect and move the batteries for maintenance. Those suckers are heavy, especially when you're being careful not to get acid on your clothes. :p

As a side note, I do want to redo the cabling at some point, even the DC cabling I'm not that happy with. So that is a potential project for later that I'll have to remember to update here. I also want to finish the server room ceiling, and the walls. Basically the ceiling will involve adding insulation and vapour barrier between each joist. To seal it up nicely. The joists will still be exposed so I can mount stuff like wiring etc.

I still need to retire my old mail server too, I just never get around to it. The new server (VM) is mostly operational except for the mail part. I want to go with virtual users, but there is no procmail equivalent that works for that. I need to be able to run arbitrary apps on a per-mailbox basis. I have a few special mailboxes that execute custom apps, for example. So I will actually be writing my own mail transport app that will handle that.
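
The per-mailbox part of that could look something like the sketch below. It is not the actual app: the handler table, the address and the example command are all hypothetical, and real delivery for mailboxes without a handler is left out.

```python
import subprocess
import sys
from typing import Optional

# Hypothetical table: mailbox address -> command that receives the message on stdin.
HANDLERS = {
    "upper@example.com": [
        sys.executable, "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ],
}


def dispatch(recipient: str, message: bytes,
             handlers: dict = HANDLERS) -> Optional[subprocess.CompletedProcess]:
    """Pipe `message` to the app registered for `recipient`, if there is one.

    Returns None when no handler is registered, meaning normal delivery applies.
    """
    command = handlers.get(recipient.lower())
    if command is None:
        return None
    # No shell is involved, so message content cannot be interpreted as commands.
    return subprocess.run(command, input=message, capture_output=True,
                          timeout=30, check=False)
```

Passing the command as a list rather than a shell string is the main design choice here, since the message content comes from untrusted senders.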

Edit: Figured I'd add the pics of the dud battery, kinda hard to tell what's going on in there, but basically all the cells are empty except one. It was really hot lol.





Yum.
 
Haha I bought central A/C this summer so still paying that off. No changes to server room for a while, but I do have it on my todo list to enclose it properly and add hvac. Still did not do that yet. :p Crazy how time flies, what feels like months is years. Been procrastinating a lot of projects. :p

But yeah next step is probably hvac and closing it off. I will need to build the hvac handler in my garage and it's too cold in there now so that will go to next spring at the soonest. I still need to find a decent cad program for Linux so I can design it. All the ones I tried suck.
 
Have not updated this in a while but not much has changed.

I kinda moved my efforts towards other parts of the house for now and doing things as money permits. If it counts, the hot aisle outside wall is insulated now. :p Been insulating the basement to prep for eventual drywalling.

 