Linux Core 17 - situation in June 2013

Nicolas_orleans

Now that Core 17 works on Linux, I am considering some GPU folding. The easiest way would be to add GPUs to my signature rig.

I did some trials with my old GT430: it folds under Ubuntu 12.04 with the latest nvidia-current drivers. It's not worth it, though, since the hundred or so PPD gained on the GT430 are far less than the PPD lost to the 2-minute increase in 8101 TPF.

Questions (lots of them, as a matter of fact):

1/ Did anyone do GPU folding on an Asus Z9PE-D8 with dual LGA 2011 Xeons previously doing bigadv SMP32? If so, how many logical cores did you remove from bigadv folding per card, and what was the related PPD drop? (Question for AndyE and his quad Titans on the very same motherboard :))

2/ What's the extra power draw per card when folding? I know TDP and power draw are related, though not the same: for example, the GTX 780 has a TDP of 250 W, but its reported 3DMark power draw is around 410 W. What's the FAH power draw?

3/ What's the Linux driver situation for AMD, especially the Radeon HD 7970 / 7990? I found no posting on how they behave with Core 17 under Ubuntu now that OpenCL 1.2 has been included in the Radeon drivers...

4/ Some time ago (Core 15 and SMP2) there were core affinity settings to apply in order to optimize SMP+GPU on the same rig. Is it the same with krakenized bigadv + Core 17, and how do you set affinities for GPU folding?

Regards

Nicolas
 
I haven't done GPU folding on bigadv rigs - all I know is it isn't easy to optimize.

It should be under the TDP (are you sure that 410 W wasn't total system load?). My guess would be more around 200 W - but if you are sizing the PSU, using the TDP isn't ridiculous.

AMD is currently a no-go in Linux. We think something was up with the drivers - though no testing has been attempted for a while.
 
Since you mentioned me:
ad 1) I haven't yet done a -bigadv / 4 GPU combo. I used the 4 GPUs with -smp on the remaining 28 CPU cores. I might give it a try now.
Currently I am running all the GPUs in dual GPU configurations on low power / low cost systems (adding 110 Euro per GPU), leaving the 32 cores on my 2 Dual Xeon systems for -bigadv jobs only. As of today, 3 systems with 2 GPUs each are active (4x Titan, 2x GTX 780). Later this week, I will get the missing components for the other 3 systems (2x GTX 780 and 4x AMD 7970).

The reason for deviating from the former big system / multiple GPUs approach is system stability. It is much easier to keep GPUs cool in this setup, the low power CPUs don't add too much energy consumption, and only high cost mobos provide a stable 4-GPU platform. "Cheaper" 4x PCIe x16 mobos with LGA 1155 weren't as stable as the big workstation boards.

ad 2) The computational density of FAH GPU workloads is lower than with utilities like FurMark, as fewer transistors are concurrently active. Expect approx. 70% of TDP with current core17 WUs; 8900 runs a bit higher than 7663. My dual GPU rigs with a little Celeron CPU draw the following while crunching 7663 WUs:

2x GTX Titan: 440 watt
2x GTX 780: 430 watt
2x AMD 7970 GE: 380 watt

The Dual Xeon system with 4 Titans drew - depending on the WUs - around 1100-1300 watts (with 2x Xeon E5-2687W and 128 GB ECC RAM). Without the Titans, power consumption is back to 500 watt.

Without the "missing" 4 CPUs needed for the GPUs and now back to fully 32 cores, the E5-2687W system is currently delivering between 300k ppd (8101) and 400k ppd (8105). This is about the same range one dualGPU rig is producing (with Titans). For WU 7663 (330k ppd) and 8900 (400k ppd) - with less energy than the Dual Xeon system.

From my (short) folding experience, if PPD is the yardstick, the days of 2P -bigadv systems seem to be "over" - unless there is another change in the way points are calculated. This is not the case for 4P systems - today.

Some cost calculations:
http://hardforum.com/showthread.php?p=1039914829&highlight=#post1039914829

cheers,
Andy
 
I use that motherboard and do bigadv with 4 GPUs in Linux. How many cores you give the GPUs depends on how many you have, and how fast they are. Core_17 has a decent amount of overhead, but the total amount depends on how fast your CPU is (not an issue in the scenario you're describing) and, most importantly, what the TPF is. Most of the overhead comes from checkpoints and error checking, which are done deterministically every 2%. So a WU with a TPF of 1 minute has roughly 4 times the overhead of a WU with a TPF of 4 minutes (or similarly, a GPU that is 4 times as fast requires 4 times as much CPU overhead per unit time).
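That scaling, in code (a minimal sketch of the proportionality only; the 4-minute reference TPF is arbitrary):

```python
def relative_overhead(tpf_seconds: float, reference_tpf: float = 240.0) -> float:
    """CPU overhead per unit time, relative to a 4-minute-TPF WU.

    Checkpoint/error-check work is a fixed cost every 2% of the WU, so a
    GPU that turns frames over N times faster demands N times the CPU
    work per unit time.
    """
    return reference_tpf / tpf_seconds

print(relative_overhead(60))   # 1-min TPF -> 4.0x the overhead
print(relative_overhead(240))  # 4-min TPF -> 1.0x (the reference)
```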

On 7663 I couldn't drop below 3 dedicated cores for the 4 GPUs without GPU TPFs suffering (and since smp29 doesn't work anyway, there was no point in not assigning 4). On 8900 I can assign 2 cores (for 4 GPUs) and only lose 1-2 s of TPF. But that's enough to make it a wash (the PPD lost from 1-2 s on each GPU = the PPD gain of smp30 over smp28 for my CPUs... 2670s overclocked to 105 BCLK). There is no simple formula. If I had faster CPUs (so the gain from smp30 was larger) and/or slower GPUs (so there was less CPU overhead per unit time), the benefit of smp30 / 2 cores for the GPUs would be larger. You really need to use trial and error to find the sweet spot for your particular setup.
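If you want to price out the GPU side of that wash: under the quick-return bonus, PPD goes roughly as TPF^-1.5, so a 1-2 s TPF change is easy to estimate (a sketch; the base points, k and deadline below are invented, only the sensitivity matters):

```python
import math

def qrb_ppd(base_points: float, k: float, deadline_days: float,
            tpf_seconds: float) -> float:
    """PPD under the quick-return-bonus formula:
    points = base * max(1, sqrt(k * deadline / days_taken))."""
    days_taken = 100 * tpf_seconds / 86400  # 100 frames per WU
    bonus = max(1.0, math.sqrt(k * deadline_days / days_taken))
    return base_points * bonus / days_taken

base, k, deadline = 1000, 26.4, 6.0  # hypothetical WU constants
for tpf in (120, 121, 122):  # a 2-minute TPF plus the 1-2 s in question
    print(f"TPF {tpf}s -> {qrb_ppd(base, k, deadline, tpf):,.0f} PPD")
```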
 
Thanks guys! So, two ways:
- build a dedicated GPU folder: costs more to set up, but no fine tuning required
- fine-tune like Quisarious: more work, but less cost to set up
... I will give this advice some serious thought!
 
If the quantity of boxes doesn't bother you, a dedicated GPU folder is the way to go IMO.
 
What are the Linux drivers like for AMD cards, anyone know? The CPU usage is much lower with them.
 
I would agree with Musky; it's what I've done. The benefit of the GPU folder, especially for Nvidia, is that you can use a low spec CPU to power the cards, and more importantly (correct me if I'm wrong, guys) Linux doesn't seem to support Boost or overclocking. My 780 folded a great deal slower in Linux than it did on Windows on a P8900: Windows PPD ~165k, Linux PPD ~130k.

Biffa, as far as I know AMD won't fold under Linux "out of the box"; however, if you go to the folding forums there are a few guides to get it running with Wine etc.
 
Hmm, that is odd. My 660's boost seems to work fine in Linux. At any rate - it is a little faster in Linux than in Windows. And beyond flashing the bios there is no OC.

AMDs have full OC and monitoring in Linux; however, the drivers available at the time were poor-performing and unreliable. Don't waste your time trying AMD in Linux at this point. I'd wait for better drivers and official testing/support.
 
If the quantity of boxes doesn't bother you, a dedicated GPU folder is the way to go IMO.

Do you suggest Windows or Linux for the dedicated GPU folder?

What GPUs are rocking the PPD/watt?

Thanks.
 
What GPUs are rocking the PPD/watt?

Based on Anandtech's report of FAHBench 1.2.0 results for Explicit / Single Precision with the latest OpenMM 5.1, taking the GTX Titan at 160 kPPD as the standard and 80% of TDP as the power draw, and using the cheapest card prices available in Europe from a quick search yesterday, this should give something like the list below (arithmetic sketched right after it):

(cards sorted by increasing ns/day in Anandtech's FAHbench)

GTX 680 - 750 PPD/W - 275 PPD/Euro
GTX 770 - 650 PPD/W - 300 PPD/Euro
GTX 690 - 550 PPD/W - too expensive for me
GTX 780 - 750 PPD/W - 250 PPD/Euro
GTX Titan - 800 PPD/W - too expensive for me
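The arithmetic behind these numbers, for anyone who wants to plug in their own prices (a sketch under the assumptions above: 160 kPPD Titan baseline, 80% of TDP as draw; the ns/day ratios have to be read off Anandtech's chart):

```python
# Each card's PPD is the Titan's 160 kPPD scaled by its FAHBench ns/day
# ratio; power is taken as 80% of TDP. Worked through for the Titan
# itself, which reproduces the 800 PPD/W line in the list above.

TITAN_PPD = 160_000
POWER_FRACTION = 0.80

def ppd_per_watt(nsday_ratio_vs_titan: float, tdp_watts: float) -> float:
    return TITAN_PPD * nsday_ratio_vs_titan / (POWER_FRACTION * tdp_watts)

def ppd_per_euro(nsday_ratio_vs_titan: float, price_euro: float) -> float:
    return TITAN_PPD * nsday_ratio_vs_titan / price_euro

print(ppd_per_watt(1.0, 250))  # GTX Titan, 250 W TDP -> 800.0 PPD/W
```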

So I am starting to think dual GTX 770...

Does anybody have more information on GPU Boost under Linux, and on Linux vs Windows OpenCL performance?
 
GPU boost works in linux (on every card I've tested). There's no appreciable difference between windows and linux in terms of TPF (clock for clock). So, all things being equal (OS costs not among them...), windows makes more sense for a dedicated GPU box just because of the ease of adjusting/overclocking.
 
One of the GPU developers says Linux is 5-7% faster than Windows. Just passing along this info since I've not tried either solution yet. :)

http://www.reddit.com/r/Folding/comments/1dsylw/i_am_yutong_zhao_iama_gpu_core_developer_at/

Search for TheBlademaster01 and look at the developer's response.
 
He has said that, but I'm not sure where it came from, because no one I know of saw any speed advantage during the internal testing of the Linux client (on Nvidia; AMD is a CF in Linux).
 
Do you know if there is a performance advantage for folding on a PCI-E 3.0 motherboard that can run 2 slots at 16x against a PCI-E 3.0 motherboard that runs 2 slots at 8x?

There is a price premium to have 2 PCI-E 3.0 slots at 16x; is it worth it for GPU folding?
 
There is no performance advantage, you can even run in PCIe 2.0 8x or lower and it won't make a difference.
 
Here is my current component selection and recommendation list for 2xGPU systems:
1) Reuse any unused things first :)
2) CPU: Celeron G1610 (low cost, low power, sufficient to drive 2 GPUs). It is not used for folding; it wouldn't make any significant contribution anyway, even if I replaced it with an i7-3770K at 10x the price.
3) Mobo: The cheapest motherboard with a splitter (one PCI Express 3.0 x16 slot or two PCIe 3.0 x8) I found was the MSI Z68MA-G45. v3.0 is not needed with current cores, but it doesn't hurt to use a balanced board. This mobo has the 2 GPU PCIe slots 3 slots apart (important if you choose GPUs with open coolers, so they can suck air from the side). Together with the Celeron, this mobo consumes less than 20 watt with an idle desktop, so it's a low energy overhead. Yesterday I got a few of those for assembly.
4) PSU: I settled on a PSU type which supports 2x 8-pin and 2x 6-pin PCIe connectors to drive 2 GPUs reliably. In my country a 630 watt be quiet! model was the price leader for the performance and quality range I looked for. There are many PSUs around - choose "wisely" from what is available in your area (see the sizing sketch after this list).
5) Any 2x2GB low cost RAM kit will be OK.
6) Case: I do have 3 systems with low cost 40 Euro cases. If you keep the side panel open and flip them 90 degrees to let the hot air ascend, you are set. These cases are sufficient. I am currently rehosting my systems to Coolermaster HAF XB cases; I find them an interesting concept, as I can feed all GPUs an unobstructed air flow. They are more expensive (80 Euro) and not really needed if you want to keep cost down.
7) Fans: If you select blower type GPUs (e.g. Titan, GTX 780), make sure you feed them fresh air in the intake area. If you have open vent GPUs (like the 7970), place a 200mm fan on top of the 2 GPUs (in the flipped case with the open side panel) to extract the hot air from the open GPU coolers.
8) Use the latest released drivers.
9) Start with no overclocking, check the folding stability of your system, and slowly move up if you want.
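On point 4, the sizing arithmetic (a sketch: the ~70%-of-TDP folding draw comes from my measurements earlier in the thread, while the 80 W platform allowance and 1.3x headroom factor are just my rules of thumb):

```python
# Rough PSU sizing for an n-GPU folding box: folding draw is ~70% of
# TDP per card, plus CPU/board overhead, plus headroom for transients
# and to keep the PSU in its efficiency sweet spot.

def psu_watts(n_gpus: int, gpu_tdp_w: float, platform_w: float = 80,
              folding_fraction: float = 0.70, headroom: float = 1.3) -> float:
    load = n_gpus * gpu_tdp_w * folding_fraction + platform_w
    return load * headroom

print(psu_watts(2, 250))  # ~559 W -> the 630 W unit I chose has margin
```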

Cost optimized, the total cost was 220 Euro, with the Coolermaster HAF XB it was 260 Euro. Plus the GPU costs.

Have fun,
Andy
 
There is no performance advantage, you can even run in PCIe 2.0 8x or lower and it won't make a difference.

3) Mobo: The cheapest motherboard with a splitter (one PCI Express 3.0 x16 slot or two PCIe 3.0 x8) I found was the MSI Z68MA-G45. v3.0 is not needed with current cores, but it doesn't hurt to use a balanced board.

Thanks guys. Andy, one question: on the MSI website, it appears the B3 revision of the board supports one PCIe 2.0 x16 slot or two PCIe 2.0 x8, but you mention PCIe 3.0?
 
I have the G3 version (not the B3):
http://www.msi.com/product/mb/Z68MA-G45--G3-.html#/?div=Basic

It has PCIe 3.0 support. Please don't forget that with current CPUs the PCIe controller is on the CPU die and not in the PCH anymore, so the supported revision depends on the CPU type (and mobo).

Andy

Understood, and thanks for the reminder. To get PCIe 3.0, I need a 22 nm Ivy Bridge CPU and a mobo with PCIe 3.0 support.
Regards
Nicolas
 
There is no performance advantage, you can even run in PCIe 2.0 8x or lower and it won't make a difference.

Note that this is currently true; however, there have been rumblings of a hybrid core that uses both CPU and GPU in a single core. There is no time frame as to when this might come to fruition, but it might change the current state of affairs.
 
Two GTX 770 ordered! I will start with them on my Asus Z9PE-D8 WS and see what happens with SMP30 + 2x Core 17.

Any difference in terms of bigadv performance whether I pick:
- 2 PCIe slots on the same CPU (I was thinking of using slots 5 and 7 for space reasons; both are linked to CPU2)
- one PCIe slot on each CPU?
 
I don't think it matters.

I've tried getting fancy, splitting/pooling slots based on CPU and assigning GPU cores to particular CPUs, and have never gotten any improvement in performance (and in many cases managed significantly worse).

I've never used a display for my linux box, but I know that people that have used this board for hackintoshes have found that the GPU used to drive the display has to be on CPU1 (so slot 1 or 3, can't remember if it had to be slot1...). But that could just be a vagary of OSX.
 
With the Asus Z9PE-D8WS mobo, slots 1,2,4 are connected to CPU1 and slots 3,5,6 are connected to CPU2.
 
Thanks Quisarious... so it appears there is no need for me to play with affinities / core locking and so on. Not bad news!

I can testify that on Linux my GT430 works flawlessly in PCIe slot 7, i.e. linked to CPU2... as you say, it may be an OSX-only issue.
 
With the Asus Z9PE-D8WS mobo, slots 1,2,4 are connected to CPU1 and slots 3,5,6 are connected to CPU2.

Where did you get this info?

In the manual, it states slots 1-4 are provided by CPU1 (1/3 run at x16 if 2/4 are both empty, x8 otherwise; 2/4 always run at x8), while 5-7 are provided by CPU2 (5/7 always run at x16, 6 runs at x8).
 
Quisarious,
sorry, my mistake - I mixed up the D8 and D16 (I have both).
The assignment I described is for the D16 board, not the D8;
your description is the correct one for the D8 board.
Andy
 
Sig rig 2 is up and running with 2 x GTX 770 under 319.32. Both cards monitored in Nvidia X Server settings.

... but FAHClient 7.3.6 does not appear to detect the cards. I posted a question in the official FF regarding the location of the Linux equivalent of GPUs.txt.

If any of you knows where it is, I could check if it contains the magic line
0x10de:0x1184:2:3:GK104 [GeForce GTX 770]

Both of the cards are correctly detected by Linux and the X server as
0x10de:0x1184: NVIDIA Corporation

So they should be able to fold? :confused:
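Something like this is what I'd run against the file once located (a sketch assuming the colon-separated vendor:device format of the magic line above; the path argument is a placeholder):

```python
# Check whether GPUs.txt contains an entry for our vendor:device pair,
# assuming lines in the colon-separated format shown above, e.g.
#   0x10de:0x1184:2:3:GK104 [GeForce GTX 770]

VENDOR, DEVICE = "0x10de", "0x1184"  # NVIDIA, GTX 770

def is_listed(gpus_txt_path: str) -> bool:
    with open(gpus_txt_path) as f:
        for line in f:
            fields = line.strip().split(":")
            if (len(fields) >= 2
                    and fields[0].lower() == VENDOR
                    and fields[1].lower() == DEVICE):
                return True
    return False

print(is_listed("GPUs.txt"))  # placeholder path, wherever the client keeps it
```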
 
Solved with a manual download of GPUs.txt into /var/lib/client

The dual GTX 770s are folding.

I will post some numbers this weekend, stay tuned.
 
I have to consolidate my boxes - at least for the summer - so this discussion is very interesting and helpful.
 
After almost 4x 8900: 90-95 kPPD per card, i.e. 180-190 kPPD for the system.
There is a very interesting post from bollix47 on how to adjust the fan speed of Nvidia cards in Linux. I have not tried it yet, but it gives more options than the default driver does (PowerMizer and... nothing more).
 
Does anyone know the GTX 770's power consumption while folding these units? HD 7970s seem to get around 100k ppd while pulling about 160 W, from what I've seen.
 
On my system the power draw is between 320 W and 345 W from the wall, with the 55 W TDP Celeron's two threads used 100% by the Nvidia OpenCL driver. So I would say pretty close to 150 W per card.

So, based on a few days folding these 8900s: around 600 PPD/W per card, with a system average in the range of 525-550 PPD/W when you include CPU, board, RAM, HDD, and fans.
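The back-of-envelope behind those figures (the 40 W platform allowance for the Celeron, board, RAM and PSU losses is my guess, not a measurement):

```python
wall_w = (320 + 345) / 2      # midpoint of the measured wall draw
platform_w = 40               # assumed CPU/board/RAM/PSU-loss allowance
per_card_w = (wall_w - platform_w) / 2
print(per_card_w)             # ~146 W -> "pretty close to 150 W per card"

card_ppd = 92_500             # midpoint of 90-95 kPPD on the 8900s
print(card_ppd / per_card_w)  # ~630 PPD/W per card
print(2 * card_ppd / wall_w)  # ~556 PPD/W at the wall, near the 525-550 range
```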

Right now I have a stable 1150 MHz turbo on one card and 1137 MHz on the other, but the Coolbits trick in the post I mentioned earlier works for one card (the one used for display), not the other.
 
Thanks for the data.
Looks like the GTX 770 is slightly more energy efficient than the AMD 7970 GE.

My dual 7970 system with probably the same Celeron CPU (G1610) pulled 380 watt from the wall with pre-P8900 core17 units. With P8900, energy use climbed to 400 watt, and with P8902 it hovered around 420 watt.
 
Yes, it's the same G1610 Celeron. I admit I followed one of your posts recommending this two-threaded Celeron for a dedicated dual GPU rig :D
 