Dashcat2 Build

On September 2, the company I worked for downsized, and I was one of the ~15% cut. I learned later that I was being paid about 40% less than my position demanded. Instead of going directly to another job (I was offered a great spot at a cool company, with interesting work and a sweet pay rate), I'm refreshing myself and going back to school: Bridgerland ATC for the first half and USU for the second. I was planning to get into Drafting, but the Robotics program is too appealing to ignore. I can learn SolidWorks on my own time anyway, and a lot of what I'll cover in Robotics is integral to my USU studies.

My later work will require Dashcat's computing power. I have no choice but to continue. What luck... obsolescence that isn't.

I might have to scavenge Infiniband gear. It's the cables that are expensive, oddly.
 
Cheers to the positive turn on a big change. And good luck with the new career path.

Having stared at this a bit, it seems you can pretty much count on about $1k per device for 40Gb/s connectivity if you go with Mellanox, and a bit less for QLogic. This seems to hold true whether you're connecting as few as 8 ports or as many as 36.

Either way, that's a monster get for home use. Good luck!
 

Thanks for the ups. I appreciate it. I didn't think anyone still followed this pretty much defunct thread from almost six years ago.

I can't go 40Gbps at this point. The nodes and server are all PCI-X 64/133 architecture. Basically, it's down to a 10Gbps (Infiniband SDR 4x) approach. I see dual-port cards for USD$10, cables for USD$10, and 24-port switches for USD$200.

Dashcat is 16-node with two cold-spare worker nodes and a cold-spare server. Eighteen ports for the nodes, two ports each for the servers (if needed to quell saturation), and two ports to my PCIe workstation is a good fit. If I elect to cold-plug the backup server (or I just don't need two ports each for the servers), I have two spare ports to throw at a long-distance shotgunned fiber link reaching up to 10km away.

That's my whole city and then some, even with our local fibers. Shooting twin beams to BATC from my house is only about three miles (4.67km). Granted, at that distance, the speed of light in the fiber (~2/3 c, thanks to the glass's refractive index) becomes a relevant issue, with a ~26 microsecond propagation delay.
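Out of curiosity, here's the delay math as a quick sketch. The 4.67km distance is from above; the ~1.47 refractive index is my assumption for standard silica fiber (it's where the ~2/3 c figure comes from), and it lands in the same ballpark as the ~26 µs figure once transceiver latency is added on top.

```python
C = 299_792_458   # speed of light in vacuum, m/s
N_FIBER = 1.47    # assumed refractive index of silica fiber

def one_way_delay_us(distance_km: float) -> float:
    """One-way propagation delay over fiber, in microseconds."""
    v = C / N_FIBER                        # ~2.04e8 m/s, roughly 2/3 c
    return distance_km * 1000 / v * 1e6

print(f"House to BATC (4.67 km): {one_way_delay_us(4.67):.1f} us")
print(f"Max SDR fiber reach (10 km): {one_way_delay_us(10.0):.1f} us")
```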

I'm glad to have these problems to consider.
 
I guess I'm at least persistent. I'm back in college, refreshing mah skillz.

I came in expecting a UCAT certificate, which guarantees me $18 an hour now, and planned to press onward to an Associate degree that would boost that by some bucks. That's not the endpoint, though: 2017 starts the Bachelor program in my field. This is perfect timing.
 
VR lit this project back up: Unreal Engine 4's Swarm, which farms lighting builds out across networked machines.

Looks like I can make something out of this, after all.

I didn't see this coming.
 


This is new. Does it work?

Preview says so. I'm going to have fun with this.
 
It was yesterday that I stabbed my flag in the dirt, taking control by admitting defeat, in a way.

As most of those who have followed this thread already know, I have had a lot going on, and at 13:43 MDT on 7/28/2016 I declared a "Fuck That Shit" to the world and took my pills.
Bupropion (generic for Wellbutrin) -- Originally just an antidepressant, which I needed anyway. Since 2004, it's also been used for ADD/ADHD, for the same reason Strattera (Atomoxetine) is.
Buspirone (generic for Buspar) -- Anti-anxiety. Ever since my Mom's final ambulance ride and my subsequent chase in my Camaro SS, sirens going by my house put me on edge.
feat. Pitbull, er, Zolpidem (generic for Ambien) -- Wellbutrin/Bupropion causes insomnia for the first month. It did the same when I was a teen. This fixes it.

My entire bill for this month was like USD$48, and Ambien was $9 of that. This should never have surprised me: overclocking costs money, whether it's silicon or brain matter.

I need to give a shoutout to whoever is behind the prescription discount card my town is giving out. I'd have paid more than double without it.
 
This has been really difficult for reasons I've already mentioned. I started this project around this time in 2009, and the technology was already four years old then. Today, it's not the CPU but the GPU that dictates speed, and the problem is that the motherboards I use are PCI-X, not PCIe of any kind. PCI-X 133 to PCIe 1.0 x4 bridge cards are available, but they're expensive and still limited in a lot of ways. I'm working on getting the cluster computing concepts mastered. I will never give up.
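To put numbers on "still limited": a quick comparison of theoretical bus bandwidths, using the standard published figures (a modern GPU slot included for contrast).

```python
# Theoretical peak bandwidths; PCI-X is a shared parallel bus,
# PCIe figures are per direction.
buses = {
    "PCI-X 64-bit/133MHz":  133.33e6 * 8,     # ~1.07 GB/s, shared
    "PCIe 1.0 x4 (bridged)": 4 * 250e6,       # ~1.00 GB/s
    "PCIe 3.0 x16 (modern GPU)": 16 * 985e6,  # ~15.8 GB/s
}
for name, rate in buses.items():
    print(f"{name:>26}: {rate / 1e9:5.2f} GB/s")
```

On paper the bridge barely loses anything; the limitations are presumably elsewhere, in latency, compatibility, and everything hanging off a single x4 link.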

I really hope this is a Launch And Catch-Fire moment. Seven years is a long time. Steel is one thing, but dog years are longer than silicon years.
 

motherofgodcat.jpg


Four of those nodes have more cores at a faster speed than my own.
The upgrade price from 32GB to 64GB per node is dirt cheap and would give four nodes as much RAM as my whole cluster.

I wish I could get a bunch of the mobo/CPU/RAM sets and put those into my cases. I'm sure it doesn't work that way, though.

PCIe slots mean GPUs become an option.
 

Funny you should say that...

http://www.natex.us/ReWork-Intel-S2600CP2J-Motherboard-2x-E5-2670-SR-p/s2600cp-h8-32gb.htm
 
This is probably the most solid progress I've made on this machine in a long time. I have the cable glands that will allow me to hook my 8AWG welder cable to a sub-panel for the cluster, and I'm figuring out breaker sizes at the moment. The wiring on my 30A circuit can take 50A intermittent and 40A continuous within spec. (Note: USA power is split-phase, 120V per leg for a net 240V between poles and 120Vrms to neutral/ground.) I prefer to steady-state at 30A like a standard clothes dryer, but larger, cooler wires are a bonus on the voltage-drop front.

My IceBox intelligent PDUs will take two standard IEC computer power inlets, which are normally rated for 10A, each feeding five nodes. In practice, heavier-gauge cable yields better ampacity, as does better contact material. I can maintain 20A per branch (on a 120VAC leg, of course) with 14AWG cable without coming anywhere close to 60C. That's 480W per node. I don't need that much.
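For clarity, the arithmetic behind that 480W figure, using only the numbers above:

```python
FEED_AMPS = 30
FEED_VOLTS = 240          # split-phase: 120V per leg, 240V pole-to-pole
BRANCH_AMPS = 20
BRANCH_VOLTS = 120
NODES_PER_BRANCH = 5

feed_watts = FEED_AMPS * FEED_VOLTS              # 7200 W steady-state total
branch_watts = BRANCH_AMPS * BRANCH_VOLTS        # 2400 W per branch
per_node = branch_watts / NODES_PER_BRANCH       # 480 W per node

print(feed_watts, branch_watts, per_node)        # 7200 2400 480.0
```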
 
When I first conceived the Dashcat Supercomputer, I thought about the idea of a massive bank of systems with dedicated GPUs rendering video in somewhat-realtime fashion, with, say, 120 nodes kicking out a completed frame every two seconds and those frames being stitched together to form a video stream. I used to wonder how to switch GPU outputs quickly enough to accomplish this, but after that Natex link provided by Jorona, I realized it's no longer necessary to coordinate video outputs when each frame can be sent over a network connection.

When I built Dashcat, the goal was Gigabit Ethernet, and that was economical at the time. It's still feasible now if I'm kicking out a 4K frame every few minutes. The nodes in the listing Jorona posted have 10GbE onboard, and that's a game changer. I got to thinking: what if I transplanted the guts from one of these nodes into the Dashcat server and swapped the Dell 5324 for a switch with Gigabit ports but 10GbE uplinks, one uplink connected to the server to aggregate the worker nodes and another to my workstation? I wouldn't need Infiniband.

848x480x32bpp video at 60Hz consumes ~781 Mbps.

1920x1080x32bpp video at 60Hz consumes ~4 Gbps.
2160x1200x32bpp video at 90Hz consumes ~7.5 Gbps. (today's VR HMDs)
3840x2160x32bpp video at 24Hz consumes ~6.4 Gbps.
3840x2160x32bpp video at 30Hz consumes ~8 Gbps.

GbE can handle 3fps at 4K (I would need 20 nodes) (334ms additional latency)
GbE can handle 15fps at 1080p, with 12fps more likely (I would need 5 nodes) (67ms additional latency)

I do believe Dashcat is going to become a general playground, not just a render farm. The capability is there now.

What I find interesting is what happens when this is extended out to 100GbE (which will be cheap, eventually).

7680x4320x32bpp is just over 1 Gb per frame; the same at 90Hz is ~95.5 Gbps.

What happens when I use a 4x VR resolution?

8640x4800x32bpp = 1.327 Gb per frame = 70Hz to fit in a 100 Gbps stream
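All of those figures fall out of one formula (width x height x bits per pixel x refresh rate), so here's a quick sketch that reproduces them; pure arithmetic, no assumptions beyond the numbers above.

```python
def stream_gbps(w: int, h: int, bpp: int, hz: float) -> float:
    """Uncompressed video bandwidth in Gbps."""
    return w * h * bpp * hz / 1e9

cases = [
    ("848x480 @ 60Hz",   848,  480, 60),
    ("1080p @ 60Hz",    1920, 1080, 60),
    ("VR HMD @ 90Hz",   2160, 1200, 90),
    ("4K @ 24Hz",       3840, 2160, 24),
    ("4K @ 30Hz",       3840, 2160, 30),
    ("8K @ 90Hz",       7680, 4320, 90),
    ("4x VR @ 70Hz",    8640, 4800, 70),
]
for name, w, h, hz in cases:
    print(f"{name:>15}: {stream_gbps(w, h, 32, hz):6.2f} Gbps")

# Node-count math for GbE-fed 4K: one frame is 3840*2160*32 bits
# (~265 Mb), so a 1 Gbps link carries ~3.8 frames/s; 60fps wants ~20 nodes.
frame_mb = 3840 * 2160 * 32 / 1e6
print(f"4K frame: {frame_mb:.0f} Mb -> {1000 / frame_mb:.1f} fps over GbE")
```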

We'll get there eventually. I remember when 1024x768 XGA was the holy grail, then 1080p. Now I'm looking at buying my first 4K TV/monitor.
 
So... Last week I got a QLogic Silverstorm 24-port CX4 Infiniband SDR switch for $10 while I was out Picking. And that was at the most unusual DI store on the Wasatch Front.

There's a lot to be said for "How?", and I'm putting that to work in 2017. This is already disturbingly effective, like nabbing Mewtwo with an errant snotball.

Oh... My Pokemon GO is showing.

This thread may as well be my life history, at this rate.

If anyone benefits from this at any age or level, just pay it forward. But leave the finer points for questions requiring answers later.

"Fryode only ran at Infiniband 10G to learn what he did. That gear is free now if it hasn't gone to the trash."

I'll adapt. Nobody will be left behind. (But I'll need to get helpers of the Third Street Saints-tier.)

Holy shit....
 
The 3D rendering capability of Dashcat2 has been replaced with two GTX 1080 cards.

However, this is not the end. I built the groundwork upon which a scalable architecture can still exist, if necessary.
Updated nodes will have GPUs installed. In the meantime, I'm studying MPI networking, because the slot left over after the GPU will carry an Infiniband card.
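Not Dashcat code, just a minimal sketch of the message-passing pattern that MPI study revolves around, using the mpi4py bindings (assumes an MPI runtime such as Open MPI or MPICH is installed):

```python
# ring.py -- pass a counter around all ranks once.
# Run with, e.g.:  mpiexec -n 4 python ring.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Rank 0 starts the ring and collects the final count.
    comm.send(1, dest=(rank + 1) % size)
    total = comm.recv(source=size - 1)
    print(f"Counter made the full ring: {total} hops")
else:
    # Everyone else receives from the left, increments, sends right.
    counter = comm.recv(source=rank - 1)
    comm.send(counter + 1, dest=(rank + 1) % size)
```

The nice part is that an MPI stack handles the fabric underneath, so code like this wouldn't change when the Infiniband cards go in; only the transport gets faster.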

The next version will be Dashcat 3.
 
And we await its arrival and subsequent thread for it! :D
 
Thanks. I have no idea when it will happen. I don't know if documenting the learning process of HPC (High-Performance Computing) is worthwhile for anyone. I still learn a lot from the current iteration of this cluster.

Among other limitations, I can't keep it fed with two LACP-shotgunned Gigabit Ethernet links from my workstation. I have to go to a 10G uplink, which requires swapping out the switch. I guess I get to learn something new.
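A toy oversubscription calculation shows the problem, assuming the worst case where all 16 workers pull data at line rate at once:

```python
NODES = 16
NODE_LINK_GBPS = 1.0      # GbE per worker node
UPLINK_GBPS = 2 * 1.0     # two LACP-bonded GbE links from the workstation

demand = NODES * NODE_LINK_GBPS
print(f"Worst-case node demand: {demand:.0f} Gbps")
print(f"Workstation uplink:     {UPLINK_GBPS:.0f} Gbps")
print(f"Oversubscription:       {demand / UPLINK_GBPS:.0f}:1")
# A single 10GbE uplink would cut that 8:1 down to a livable 1.6:1.
```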

The 10Gb Infiniband switch is the data path, while Gigabit Ethernet is the signal/command path, when needed. I wish my server had PCIe instead of PCI-X, but it works for now.

The day PCIe GPUs can be added to each node will rock.
 

Thank you.
 
I found that the screws spin out of the standoff when I Dremel them. I switched to holding them with a pair of wire cutters (huh huh... dykes) at the very top, where damage to the threads means nothing. It worked, but it takes a long time. I got to thinking there had to be a better way, and I found it: Lowes had some flat-head countersunk machine screws that will only require me to use a larger bit to bevel the edge of each hole, which had been the plan all along, as I want the new holes de-burred.

The best part is this will make the rails self-aligning.

I tried a test-fit with the modified panhead screws and found that my rack is not to proper spec and has to be modified itself. I have to grind away at least 1/16" of steel from both sides at the front to get the rack to take my rails and the associated servers.

I tried to upload photos of my recent progress, but my FTP server is being a prick about bulk uploads so I can't for now.

I have some bar nuts I can use as guides while I Dremel the hell out of the rack. That will have to wait for the weekend, however. It means the machines are going in this weekend, though. Adjustment will probably take another week and that will give me time to wait for my next delivery of cooling fans. The ones on my CPU heatsinks are overpowered, I found. They will be replaced with low-speed units while the high-speed fan in each machine will be replaced with an original mid-speed unit.

I learned about step-drill bits a week back and thought, "Where have these been all my life?" The basic use is drilling a hole of the correct size; the advanced trick is using the next step up to de-burr the hole you just drilled.
 
Falcon Heavy... If Elon Musk can shoot a car into space out past Mars' orbit, the least I can do is finish the Dashcat machine.

I have video of myself going ballistic during the launch. The Super Bowl was cool and I'm glad the Eagles won, but I was looking forward to watching Falcon.

Who didn't love seeing those side boosters stick the landing?

At one point during the live Starman stream, I went grocery shopping, and it dawned on me just how far we've come. I had watched a Mars-capable, reusable rocket shoot a car into space, and I had a live, high-quality video feed from space on my phone, with no wires, and it was cheap for me to do so.

That launch renewed my sense of hope. Winter is always difficult for me and this one was dark, indeed, until that launch.

Elon Musk was right that the usual dummy payload of concrete or steel was boring. Not to mention expensive because it has to be made a certain way. Instead, people got paid to modify his old car and perch it on top of the most powerful rocket our species has in operation right now. True, the center core was lost due to failed landing relight, but it was never intended to be reused anyway.

-- I like my women like I like my rockets: Flight-proven --
I'm so putting that on a shirt, by the way

Anyway, I've finished a portable version of the main power load center (breaker box) I used for the original install of the Dashcat Supercomputer. It has a clothes-dryer plug at one end of the very same 8AWG SOOW 4-conductor cable I used for the first version (with about 20 feet coiled up), the same model of load center, and possibly the same breakers, but I forget. Each of the four breakers in use right now has a standard computer-style power cord as its output. The 20A breakers have 14AWG cable (good for 18A), while the 15A breakers have 16AWG cable (good for 13A).

IMG_8157s.jpg

IMG_8160s.jpg

IMG_8161s.jpg


I had been plugging both rails into the same 120V circuit, resulting in unbalanced load. That's over now.

The lithium coin cells used for CMOS backup drained due to the machine being off for so long (owing to it being pretty much useless). True, I had installed CR2025 cells (3V, 160mAh) because I got a bunch of them cheap, but CR2032 cells don't have all that much more capacity at 225mAh. Hmmm... what to do?

IMG_8163s.jpg

IMG_8164s.jpg


BAM! A lithium thionyl chloride AA cell at 3.6V and 2450mAh. It works just fine. These cells have a ridiculously long shelf life, too, plus I have a few hundred of them. I'm thinking I'll zip-tie a spare inside each case.
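For scale, a back-of-the-envelope lifetime comparison. The capacities are the ones above, but the ~3 µA standby draw is an assumption (real boards vary), and for the LiSOCl2 cell, self-discharge would matter long before the arithmetic does.

```python
HOURS_PER_YEAR = 8766

def backup_years(capacity_mah: float, draw_ua: float = 3.0) -> float:
    """Theoretical CMOS-backup lifetime, ignoring self-discharge."""
    return capacity_mah * 1000 / draw_ua / HOURS_PER_YEAR

for name, mah in [("CR2025", 160), ("CR2032", 225), ("LiSOCl2 AA", 2450)]:
    print(f"{name:>10}: {backup_years(mah):5.1f} years")
```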

But what good is replacing the CMOS battery if the machine is just going to sit? It's not going to, and that's why a new OS was needed. I previously installed Windows XP Pro on these machines, and its 32-bit address space cripples them by allowing access to only 4GB of the 16GB RAM in each node. I need a 64-bit solution! Enter Windows 7 Pro x64... oh wait. These machines only have USB 1.1, so installing from a thumbdrive would take forever. Damn. Do I even have any optical drives anymore, let alone IDE? Actually, I did.

IMG_8165s.jpg

IMG_8166s.jpg


It was in my Mom's old desktop from 2002. The tray was finicky about ejecting, but the drive read just fine. Well, until...


IMG_8170s.jpg



Rats. I did my homework and learned that the BIOS version I have doesn't have the correct ACPI support. But I got to thinking: it had been so many years since I was hooked up with the v2.09 BIOS that there was a chance a newer version had come out. After quite a bit of searching, I found what I needed at MyDigitalLife, in a thread by user tqhoang: https://forums.mydigitallife.net/threads/arima-hdama-and-hdamb-motherboard-bios-fixes.20603/

There I found reference to a v2.18 BIOS (technically for Rev G) that user omegadraconis had confirmed (https://forums.mydigitallife.net/th...herboard-bios-fixes.20603/page-12#post-344178) to be compatible with the E290 revision of the HDAMA motherboard. This was perfect, because I have both Rev E and Rev D motherboards, and there's really no reason to doubt backward compatibility.

Sidenote: I had a bit of a "WTF?" moment when my work was linked by user robert_newbie: https://forums.mydigitallife.net/th...herboard-bios-fixes.20603/page-13#post-346309

IMG_8171s.jpg


What followed was a nightmare of drives not working, swapping one of the v2.09 flash chips back in, banging my head against the wall and, finally, switching the jumper on the back of the DVD drive from Master to Cable Select. *facepalm*

IMG_8175s.jpg


The new BIOS worked! There was just one more tiny problem: the Gigabit Ethernet NICs didn't have a driver packed with the Windows 7 SP1 install disc. It took some digging at Broadcom's site, but I found what I needed. They turn out to be Windows XP and Windows Server 2003 drivers, but they work just fine.

www.fryode.com/DashcatPSC/win_xp_2k3_32-15.2.0.4.zip

www.fryode.com/DashcatPSC/win_xp_2k3_x64-15.2.0.4.zip

IMG_8176s.jpg


With that, I have a solid foundation and can work with Linux at my own pace now, if necessary.

Now I just need to mod the other 15 nodes and the servers.

Since this wouldn't be a proper post without music:



starmanbebop.jpg
 
Ten freaking rounds of updates later, there are no more important or optional Windows updates left. That was brutal and took all day.

IMG_8178s.jpg


Fortunately, I've kept myself busy learning about the specifics of Windows 7 in this application. I still have a lot to learn and a lot to install before I can image the disk and mirror it to the rest of the nodes.

I'm considering whether I even need a dedicated server in the sense of an actual Windows Server OS, since I've never done that before. All I really need is a glorified NAS box, from the look of things.

With all the waiting, I got more hardware work done: each node now has a battery holder. I ran out of red wire, though.

IMG_8177s.jpg

For my next trick, I need to figure out how to attach hard drives to the bracket in the blades without a drive sled. Measure and drill looks like the best way. I don't know if I have enough IDE cables, but they're not all that hard to find.

IMG_8179s.jpg
 
Pics don't load... :eek:

I'll try a repost.

I just finished the first of six blade servers. Firmware was updated to v2.18. I had to get creative about mounting the new CMOS battery.

IMG_8184s.jpg


I drilled the hard disk bracket and mounted a drive. I figured out how to get the Power and HDD LEDs working (wires were too short for this model of motherboard).

IMG_8187s.jpg


Just for fun, I found a way to add the same Radioactive Green LEDs as the 2U units have.

IMG_8191s.jpg
 
I have no idea what's going on here. I guess I won't be placeholding with image filenames anymore.

Here are the photos renamed:

IMG_8184rs.jpg

IMG_8187rs.jpg

IMG_8191rs.jpg
 
I finished modding all six of the blades. I'm about to start on the rest of the 2U nodes. Those won't take near as long. It's 1:30AM right now.

This is how I wired the lights.
IMG_8192s.jpg


Just for fun, I opened my Infiniband switch to see what's inside.
IMG_8193s.jpg


That's it. That's all that's in that huge 1U case.
IMG_8194s.jpg


That SO-DIMM is a whole 128MB of PC2100 ECC DDR.
 
MisterDNA said:
Outside my window now is *IDENTIFYING LANDMARK*, badass mountains and *SNIP*

GAH! Reading through this and you tripped up my Google Maps OCD!

You'd already noted whereabouts you lived.

Your picture and your note on the landmark gave me everything I needed.

Not posting it, and I KNOW this was 6 years ago and you've moved since.

Please don't think me creepy and stalker-ish. I'm not (well, not much). And my chances of being in Utah, let alone anywhere near you, are virtually nil.

I'm into home improvement shows like This Old House and the various Mike Holmes series. And they usually provide me with enough ancillary info that I can locate the project homes in under an hour...
 
Okay, entire thread binged.

Damn dude. What a trip!

One thing with the Win7 nodes.

1: Image the fuckers if at all possible. As you noted, Win7 updates are a bitch and 5/8ths. And that's IF you don't run into one of the update servers with a corrupted Win7 manifest (then it takes even LONGER).

2: If you don't want to take the time to image, I suggest getting WSUSOffline, pulling down all available updates, and storing them. This gives you a plug-and-play, auto-reboot-managing updater if you ever need to reload in the future. It'll still take the better part of a day, but everything is local: no downloads and no chance of an update server with a horked manifest.

As to the Infiniband, you'd think they'd just cut it down to something like a rackmount switch instead. Or half-depth.

Question. How's the school/job situation working for you at the moment?
 
I just finished modding all 16 compute nodes and the server.

The server is interesting. It has a Rev G motherboard with the v2.13 firmware. I'm not going to reflash that to v2.18 until I can get a backup of v2.13 (I have a bootable CD I used for flashing).

I'm noticing a problem that might be due to the BIOS, the CMOS battery, or some kind of boot failsafe: some of the CMOS data sometimes gets wiped after power-cycling, turning off the "Power on after AC power is restored" option. That isn't ideal, since my machines are controlled by relays inside the ICEBox 3000 applying or removing AC power.

Another CMOS option that gets reset is the RAM test. I don't have the RAM do any kind of test on startup because there's no point. When CMOS gets a case of amnesia, the option switches back to the default "Just Zero It", which slows boot time. I also notice the interconnect speeds drop back to default settings.

If it's the battery having too high a voltage, I can fix that with a diode in series for a 0.6V drop.

If it's the BIOS, I'll have to see if I can customize a version with the defaults set to what I want for this application, or see if v2.13 is Windows 7 compatible.

If it's a matter of the firmware watching for a successful boot (most likely, based on the evidence so far), it will be solved when I finalize the Windows 7 Pro image on Node 1 and mirror that image to all of the other nodes. I notice Node 1 never fails to start up correctly.


One big problem I ran into was finding enough IDE cables and IDE hard drives for all of the nodes. When I was originally planning 10 nodes, I bought 10 former TiVo 160GB hard disks; this was before I figured out how to install the HDAMA motherboards in the six blades. I collected drives over the years, but that meant I ended up with a few 80GB drives, a 300GB, and a 500GB. The 500 is reporting SMART errors. There's a 120GB IDE disk in the server with Ubuntu on it, but I'm not going to be using that anymore. Since the server has a 12-port RAID card with four ports not connected to the SATA backplane, I can easily add a SATA disk for the OS.
 
I simply had to get a shot in the dark. The short stack is being faked with a standard power strip and cords, because I don't have enough of the right kind of power cords to plug into the back of the ICEBox yet.

I sort of wish I could put LEDs in the server, but there's literally nowhere to put them.

IMG_8203s.jpg


I figure giving a small life update is a good idea at this point.

For those who may be new to this thread, the namesake of the now-not-so-supercomputer was my cat, Dash. He was adopted from a family that was moving away. He was named after Dash from The Incredibles.

Dash supposedly ran away, but later evidence suggested the sadistic property manager of the trailer park where I was living set his pitbull on him.

At the end of 2016, I decided adopting a cat would be a good idea since being lonely sucks and cats are always entertaining.

A few days later, a couple from a town about 20 miles away found a kitten wandering around in the freezing cold across from their house. They took her in, but couldn't keep her so I adopted her.

I originally named her Ping, but changed it to Spring a few days later.

Here she is, trying to climb the leg of the lady I adopted her from.
spring0.jpg


She was nervous the first couple of days and would spend most of her time hidden, but she was eating and using her litterbox. She had obviously been abandoned and probably expected to be abandoned again. I had to earn her trust.
spring1.jpg


My daughter loves her "kitty sister"
spring2.jpg


Had to get her fixed to avoid kittens. She was howling for a boyfriend at 7 months old... freaking lolita. Interesting fact: the father of Alix, the neighbor girl who was killed, performed the surgery.
The Cone of Shame didn't stay on nearly as long as it was supposed to, so it's a good thing I paid extra for the laser over the scalpel, since it heals faster.
spring3.jpg


Five months after adoption. (featuring my sunburned legs)
spring4.jpg
 
Cute cat! Nice to hear you found one to adopt! Only douches dump pets, which should be considered family members as well <3

Your rack looks like so much dirty hardware porn, and so futuristic! :D
 
*Epic thread binged.

-- The thread may be epic, but it doesn't hold a candle to the man who lived it. Thank you for giving us a window into your journey.
 
It's been brought to my attention that the former archive of LNXI documentation is gone. I found my files in the early AM hours this morning while sorting through over 20 years and 10TB worth of data.

I'm uploading it here for posterity.

fryode.com/lnxi/LNXI-archive-fryode.zip
 
I'm alive and typing on a clicky keyboard with Cherry switches. RightTFon! This is odd. I'm approaching 40. I hope you can learn from my mistakes. Don't waste your time. And watch the skies. You'll see crazy shit. #walkatnight.
 
That's it for me. Maybe I can serve as a Hack-A-Day Fail. A decade. I'm off the proverbial pot. I just wanted to offer closure.
 