hpc cluster and folding

crashbox · May 16, 2011

First of all, thanks for the help with getting SMP to work on my spare server today (you know who you are). Now for the fun stuff.

I work for a company that makes dental implants and has developed a new CT scan machine. During the R&D process we acquired a few Tesla M2050s and some Dell Precision T7500s with dual X5660s. Currently neither the Tesla cards nor the workstations are in use, and I checked them into my inventory. There are 10 workstations and 10 Tesla cards in total; I had 11 workstations but decided to change my system out.

Since I have the room and cooling available in my server room, would it be possible to set these up as a Beowulf cluster running Gentoo or possibly BSD (I'd prefer to use a custom-compiled kernel for this) and then have Folding@home run on this HPC cluster? Let me know your thoughts on whether this would even be worth doing vs. just making an image of an OS install and running each machine as a standalone folding box.

Thanks

W.Feather · May 17, 2011

For ease of use, separate OSes on each machine would be best. I don't know much/anything about clusters; Tobit would be the one to answer the Gentoo question (his preferred Linux OS, IIRC).

silicide · May 17, 2011

I looked into this at one point for a Beowulf cluster of a dozen or so old Pentium 4 boxes. If I recall, you need an MPI-enabled program to properly take advantage of clusters like that, and F@H isn't. This was a while ago now, though, so that may have changed. Gentoo image still gets my vote though.

SazanEyes · May 17, 2011

I agree with Lord_of_Doors. I did some checking on the Windows HPC Server side of things, and that apparently doesn't work either. Since F@H isn't cluster-aware, it expects all the resources to be local. Because they're not, bandwidth limitations kill performance. You're better off running a client on each box.

nitrobass24 · May 17, 2011

F@H is basically a big HPC cluster.

Each of our computers is a node: it downloads a WU to process and sends the result back to the head node (Stanford).

Currently even though we have the computer power to make quite powerful HPC clusters at home, this will not work for F@H because of the data that must be exchanged during the processing of a protein. Current networks do not have low enough latency for this to be anywhere near efficient.
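To put rough numbers on the latency point, here is a minimal sketch of how a fixed per-step network cost eats into scaling. The function name, the 1 ms of compute per step and the 100 µs exchange latency are all made-up illustrative assumptions, not measured F@H figures.

```python
def parallel_efficiency(nodes, step_compute_s, latency_s, exchanges_per_step=1):
    """Fraction of ideal speedup kept when every step pays a fixed network cost.

    Toy model (an assumption, not how any F@H core actually works):
    compute divides evenly across nodes, and each step also waits for
    `exchanges_per_step` network exchanges of `latency_s` seconds each.
    """
    if nodes == 1:
        return 1.0  # a single box exchanges nothing over the network
    per_step = step_compute_s / nodes + exchanges_per_step * latency_s
    speedup = step_compute_s / per_step
    return speedup / nodes


# Illustrative, assumed numbers only.
STEP_COMPUTE_S = 1e-3   # 1 ms of work per simulation step on one node
LATENCY_S = 100e-6      # 100 microseconds per exchange over the network

for nodes in (1, 2, 4, 10):
    eff = parallel_efficiency(nodes, STEP_COMPUTE_S, LATENCY_S)
    print(f"{nodes:2d} nodes: {eff:.0%} of ideal speedup")
```

With these assumed numbers, ten nodes keep only about half of their ideal speedup, and it gets worse as the per-step compute shrinks or the number of exchanges per step grows. Ten independent clients never pay that cost at all.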

Also, there is no HPC-aware F@H application because the current cores do not use MPI.

Basically, run a single OS on each box and we will help you set it up.

crashbox · May 17, 2011

Well, that's a shame. I was looking forward to the challenge of setting up an HPC cluster; I was even going to name it HAL 9000.... Now another question: how well do these Tesla cards fold? And is it even worth the performance vs. heat?

Mtnduey · May 17, 2011

I have personally tested out various ways of doing HPC folding with no success as of yet, but this is likely due to my limited knowledge of F@H and not enough time to juggle real work & F@H playtime.

I had an eight-node InfiniBand-based HPC grid/cluster to play with a while back and was hopeful of getting it to farm the workload out to the eight identical nodes, but no joy: the app server that would have farmed out the workload was only a 4-core machine, since it was just the front end of the HPC farm, so no bigadv WUs. Wish that had worked, boy oh boy do I.

Linux front end WU server
8x 32 core Power 7 servers acting as the HPC/Grid

I've also tried using PowerVM Lx86 to run F@H on IBM Power Series hardware, but Lx86 is still a bit flaky IMO; it bombs out at the end of the install each time.

Wish they had a client for AIX and Solaris. I have some pretty whiz-bang servers that would love to stretch their legs on F@H.

Activate: AMD · May 17, 2011

crashbox said:
Well, that's a shame. I was looking forward to the challenge of setting up an HPC cluster; I was even going to name it HAL 9000.... Now another question: how well do these Tesla cards fold? And is it even worth the performance vs. heat?

Those Teslas are GTX 470 based, no? In that case they're probably good for 10-15k PPD each. Small change compared to what 10 dual X5660s would be able to do on bigadv, but nothing to sneeze at. Power consumption-wise, I'd probably skip them and focus my efforts on the dual-hexes.

dropper · May 17, 2011

I have a Tesla on one of my blades but I cannot seem to get it to start a unit. It just errors out. Tried different settings of the GPU flag with no difference. I couldn't find anything on Google about it.

In my case, it isn't about the points, but I need a good way to stress the card.

Keith

nitrobass24 · May 17, 2011

@Dropper,

Which Tesla?
Can you post your log?

If it's a C1060, you need to use the "-forcegpu nvidia_g80" flag.

crashbox · May 19, 2011

Well, after I had all the workstations imaged and ready to go, the owner pulled the plug on the project. He decided the resources were better spent elsewhere, even though they have been lying in a closet for 2 months collecting dust.

So instead I'm just going to start building my own folding boxes for use at home. I have an AMD 1090T (6 core) not really doing anything, so if I just used that for folding, does anyone have an idea what my average PPD would be?

Haitch · May 19, 2011

An AMD 1055 Phenom II X6 @ 3500 MHz is good for around 13-15K PPD; the 1090T would be in that ballpark.

H.

R-Type · May 20, 2011

Yeah, you're pretty heavily capped by the SMP points until you get up to 8 threads and can run bigadv.

Case in point: IntelBurnTest pegs my 4.7 GHz 2600K at about 80% faster than my 4.0 GHz Q9550 in Linpack. However, because the 2600K can run bigadv, it pulls down 47k PPD in Windows while the Q9550 does 12k...
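Working those figures through as a quick sketch (the variable names are just illustrative; the inputs are the numbers quoted above):

```python
# Figures quoted in the post above.
linpack_ratio = 1.80   # 2600K is about 80% faster than the Q9550 in Linpack
ppd_2600k = 47_000     # 2600K running bigadv
ppd_q9550 = 12_000     # Q9550 running regular SMP

# How much more PPD the 2600K earns than the Q9550.
ppd_ratio = ppd_2600k / ppd_q9550

# How much of that gap is NOT explained by raw Linpack speed,
# i.e. roughly what being able to run bigadv is worth here.
bigadv_factor = ppd_ratio / linpack_ratio

print(f"PPD ratio: {ppd_ratio:.2f}x")
print(f"Beyond raw speed: {bigadv_factor:.2f}x")
```

So the 2600K earns roughly 3.9x the points on about 1.8x the raw throughput, which leaves a bit over 2x that comes from bigadv eligibility rather than speed.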
