RTX 3000 series for distributed computing.

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
I just watched the "live" presentation by Jensen. Here is a quick summary of the new RTX 3000 series cards. Nothing on distributed computing, not even a mention of Folding@home, even though Nvidia is one of its corporate sponsors/donors. I'm guessing the new cards will still translate to improvements in crunching, particularly in power efficiency.

Here is the pre-recorded official presentation by Nvidia and the webpage that summarizes the performance.

Card summary:
RTX 3080, Sept 17, $699
RTX 3070, sometime in Oct, $499. According to Jensen, this card is comparable in performance to the RTX 2080 Ti.
RTX 3090, Sept 24, $1499



I'm guessing the 3070 and 3080 will be great cards for distributed computing. The 3070 with 5,888 cores seems very compelling!
 
Last edited:

Endgame

Limp Gawd
Joined
Jan 10, 2007
Messages
315
Folding@home doesn't care about total VRAM all that much, and I don't think it uses Tensor cores at all. Overall, I think performance will come down to the number of CUDA cores, which means it's possible that the 3080 takes the efficiency crown this time around?

I think I'm in for the 3090 anyway, but whatever I buy will end up spending more time crunching for F@H than gaming.
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
Doubling the CUDA cores is expected with the node shrink, plus some more RT and Tensor cores packed in. Games are quickly taking advantage of these new cores. For DC projects, it's kind of a waste that the software isn't optimized to use the RT and/or Tensor cores. So far I've seen Amicable Numbers, and maybe GPUGrid, do much better on Turing than on Pascal cards with roughly the same CUDA count and clock speed. Not sure how easy it is to implement. I understand these cards' primary target is not DC, but just saying.
 

motqalden

[H]ard|DCOTM x3
Joined
Jun 22, 2009
Messages
1,546
Yeah, it's not totally clear what made the 2000 series so much faster than the 1000 series for BOINC. 2080s are close to twice as fast in a lot of projects, and they have nowhere near twice the CUDA cores of a 1080 Ti, for example.
I'm sure we'll see that carry over, if not improve, along with a big boost from the CUDA core count.
 

Toconator

Gawd
Joined
Jul 8, 2005
Messages
706
Folding@home doesn't care about total VRAM all that much, and I don't think it uses Tensor cores at all. Overall, I think performance will come down to the number of CUDA cores, which means it's possible that the 3080 takes the efficiency crown this time around?

I think I'm in for the 3090 anyway, but whatever I buy will end up spending more time crunching for F@H than gaming.
It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080's tho ...
 

Toconator

Gawd
Joined
Jul 8, 2005
Messages
706
Yeah, it's not totally clear what made the 2000 series so much faster than the 1000 series for BOINC. 2080s are close to twice as fast in a lot of projects, and they have nowhere near twice the CUDA cores of a 1080 Ti, for example.
I'm sure we'll see that carry over, if not improve, along with a big boost from the CUDA core count.
I would guess it's architectural. For instance, concurrent FP and integer ops, etc., and some projects benefit from that, as frequency didn't improve much if at all. My 1070 Ti and 1060 will do 2 GHz+ easily, and the 1050 does 1.95 GHz with little effort.
 

Endgame

Limp Gawd
Joined
Jan 10, 2007
Messages
315
It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080's tho ...
True, though my wife is much more concerned with points per watt than total points, as she knows how much trash hardware I could conjure up if given free rein
 

motqalden

[H]ard|DCOTM x3
Joined
Jun 22, 2009
Messages
1,546
I would guess it's architectural. For instance, concurrent FP and integer ops, etc., and some projects benefit from that, as frequency didn't improve much if at all. My 1070 Ti and 1060 will do 2 GHz+ easily, and the 1050 does 1.95 GHz with little effort.
I agree with the general idea, but the part I find interesting is that this didn't seem to translate to the same level of improvement on the gaming/FPS front. The 2080 Ti beat the 1080 Ti by only about 30%, and it had an almost 25% CUDA core increase. You would think these magic under-the-hood improvements, along with the core count uplift, would have brought more to the table. Perhaps a lot of them were inherited from the cards intended for non-gaming tasks and weren't really suited for gaming improvements. *shrug*
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
Saw this German article posted by Stefan, translated to English.

If the project uses FP32 calculations, we should see almost double the computational speed, all else being equal (i.e. almost double the CUDA cores). However, there's no significant gain if the project uses integer operations. Anyone know which DC projects use only integer operations? Nothing about FP64, so some AMD cards (VII, 7970, 280X) are still king unless you get Nvidia non-consumer cards/setups.
 

The_Heretic

Certified [H]
Joined
Jun 22, 2001
Messages
13,159
It looks like you'll get 17,408 CUDA cores for $100 less with 2 x 3080's tho ...
A question that comes up with these cards' design is: do you want multiple GPUs in the same case? That's one of the reasons I'm at least considering getting the 3090 for the Threadripper and just running that one GPU in it. Although I may just get a single 3080 and call it a day.
 

Toconator

Gawd
Joined
Jul 8, 2005
Messages
706
True, though my wife is much more concerned with points per watt than total points, as she knows how much trash hardware I could conjure up if given free rein
Just go stealth then. Get a case with a solid side panel instead of a window, replace any tool-less thumbscrews with real screws and lock your toolbox. It's just one more GPU, "harmless" you might say. What she doesn't know … ;)
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
Here is the closest thing to distributed computing (CUDA & OpenCL) benchmark performance of the RTX 3000 cards that I've read this morning: https://videocardz.com/newz/nvidia-...080-performance-in-cuda-and-opencl-benchmarks.

The n-body simulation is widely used in astrophysics. The RTX 3080 performed up to 78% faster than the RTX 2080 Super in this test. On average, the RTX 3080 sees about a 68% computational increase over the RTX 2080 Super and 38-41% over the 2080 Ti. Nothing on power efficiency. We'll know more after Sep 14, but I'm guessing we are going to see a lot of gaming benchmarks and probably very few of these.



edit: Considering that the 3080 has twice the CUDA core count of the 2080 Ti, the overall average compute performance is not doubled but only 38-41% higher. It could be memory bandwidth limited (760 vs 616 GB/s), limited by memory bus width (320- vs 352-bit), an unoptimized driver, or some combination of all of these. We'll know more when the 3090's compute performance is out.
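A quick back-of-envelope check on the bandwidth theory (my own speculation, not from any review; core counts and bandwidth figures are from the public spec sheets):

```python
# Rough sanity check: does memory bandwidth explain why doubling CUDA
# cores only yields ~40% more compute? (Speculative roofline-style
# reasoning; specs from public spec sheets.)

cores_3080, cores_2080ti = 8704, 4352        # CUDA cores
bw_3080, bw_2080ti = 760.0, 616.0            # memory bandwidth, GB/s

compute_scaling = cores_3080 / cores_2080ti  # 2.0x raw FP32 throughput
bandwidth_scaling = bw_3080 / bw_2080ti      # ~1.23x

print(f"compute scaling:   {compute_scaling:.2f}x")
print(f"bandwidth scaling: {bandwidth_scaling:.2f}x")
# A bandwidth-bound workload would scale closer to 1.23x than 2x,
# which is at least in the neighborhood of the ~1.38-1.41x observed.
```

If the observed average sits between the two scaling factors, a mix of compute-bound and bandwidth-bound kernels would be one plausible explanation.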
 
Last edited:

motqalden

[H]ard|DCOTM x3
Joined
Jun 22, 2009
Messages
1,546
A question that comes up with these cards' design is: do you want multiple GPUs in the same case? That's one of the reasons I'm at least considering getting the 3090 for the Threadripper and just running that one GPU in it. Although I may just get a single 3080 and call it a day.
Yeah, I didn't even think about this! It's gonna dump heat straight up into the fan of the card above it. A lot of the heat does this with past cards too, but I'm sure some of it escapes over the top and doesn't get sucked in.
 

The_Heretic

Certified [H]
Joined
Jun 22, 2001
Messages
13,159
Yeah, I didn't even think about this! It's gonna dump heat straight up into the fan of the card above it. A lot of the heat does this with past cards too, but I'm sure some of it escapes over the top and doesn't get sucked in.
Could be solved by some form of ducting, perhaps, and/or some serious air movement in the case. But I don't think the past method of just adding another GPU to increase output will be the best solution for the 30xx. At least initially.
 

motqalden

[H]ard|DCOTM x3
Joined
Jun 22, 2009
Messages
1,546
Also, because of the overall size of the airflow pattern coming off the back, I expect the air won't be as hot as what comes off the top vent area of more traditional coolers. For example, when I put my hand behind the tower cooler on my CPU, the air is warm but not super hot like it is coming off the top of my graphics card. Warm air moving at a high rate won't be much of a burden for the card above it. Additionally, if you increase the fan speed on the GPU, the exhaust air will be cooler.
 

Toconator

Gawd
Joined
Jul 8, 2005
Messages
706
Yeah, I didn't even think about this! It's gonna dump heat straight up into the fan of the card above it. A lot of the heat does this with past cards too, but I'm sure some of it escapes over the top and doesn't get sucked in.
AIB cards will probably have different fan choices, more like what we're used to seeing, and then it's just the status quo. Good case/airflow choices should mitigate the issue.
 

runs2far

Gawd
Joined
Nov 7, 2011
Messages
915
edit: Considering that the 3080 has twice the CUDA core count of the 2080 Ti, the overall average compute performance is not doubled but only 38-41% higher. It could be memory bandwidth limited (760 vs 616 GB/s), limited by memory bus width (320- vs 352-bit), an unoptimized driver, or some combination of all of these. We'll know more when the 3090's compute performance is out.
From my understanding of the details Nvidia has shared, the 3000 series doubles the FP32 units per SM, which is what Nvidia uses to count CUDA cores, while leaving the number of INT32 units unchanged, which must limit improvements for some workloads.
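A rough way to picture that limit is an Amdahl-style mix model. This is purely illustrative, not how real kernels actually schedule work (FP and INT issue can overlap), but it shows why a workload's INT32 share caps the gain:

```python
# Hypothetical Amdahl-style model: if FP32 throughput doubles while
# INT32 throughput stays flat, the overall speedup depends on the
# FP32/INT32 mix of the workload. Illustrative only.

def overall_speedup(fp32_fraction, fp32_speedup=2.0, int32_speedup=1.0):
    """Speedup for a workload spending fp32_fraction of its time in FP32."""
    int32_fraction = 1.0 - fp32_fraction
    return 1.0 / (fp32_fraction / fp32_speedup
                  + int32_fraction / int32_speedup)

for frac in (1.0, 0.8, 0.5):
    print(f"{frac:.0%} FP32 -> {overall_speedup(frac):.2f}x")
# 100% FP32 -> 2.00x, 80% FP32 -> 1.67x, 50% FP32 -> 1.33x
```

So even a modest integer share in a project's kernel pulls the observed gain well below the headline 2x core count.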

EDIT:

Looks like Pututu already shared this info.
 

Sparky

2[H]4U
Joined
Mar 9, 2000
Messages
3,232
I will be getting a 3090 to replace my TITAN V, then sell it next year and get a TITAN if they come out.
That's just me.
I am also building a new system with an 18-core processor to replace my 8-core 5960X from 2015, which I hope will be ready for the 20-year anniversary of F@H on Oct 1.
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
Awesome build. Please share results here when you get one after Sep 24th (y)
 

sirmonkey1985

[H]ard|DCer of the Month - July 2010
Joined
Sep 13, 2008
Messages
22,149
I will be getting a 3090 to replace my TITAN V, then sell it next year and get a TITAN if they come out.
That's just me.
I am also building a new system with an 18-core processor to replace my 8-core 5960X from 2015, which I hope will be ready for the 20-year anniversary of F@H on Oct 1.
The 3090 is the Titan. They inadvertently killed the naming scheme with the RTX Titan (or it was intentional, who knows), so there's nothing they can really call it except something stupid like the RTX Titan X. Also, because of that, AIBs get to produce the cards, which was never an option before; Titans were always an Nvidia-only product, other than the first Titan, which let some of the AIBs sell the reference model under their brand.
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
Videocardz has links to all the reviews of the 3080. I didn't have time to go through the reviews that include compute benchmarks.

I use the Geekbench website as a reference to estimate relative compute performance. Performance will also depend on how the DC application software is written. If someone has the card in hand soon, please post results.

CUDA benchmark: about 28% faster than the 2080 Ti and 66% faster than the 2080S, according to Geekbench


OpenCL: 39% faster than the 2080, 65% faster than the 2080S, according to Geekbench

 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
From TechPowerUp

This is the first voltage-frequency graph I've seen for the 3080. I think the sweet spot could be around 0.85 V while still getting a decent core clock (~1700 MHz); this is going to be a very efficient card if tuned properly. Not sure what power limit setting this corresponds to, but I sometimes use MSI Afterburner to set a fixed voltage in the voltage-frequency curve.

Note that power goes up as the square of the voltage (or current), assuming the system's electrical resistance varies very little within a narrow range of operating temperatures.
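To put rough numbers on that, here's a back-of-envelope dynamic-power sketch (P ∝ f·V²; the stock clock/voltage point below is an assumption for illustration, not a measured spec):

```python
# Rough dynamic-power estimate for undervolting: P ~ f * V^2.
# The "stock" values below are assumptions for illustration only.

def relative_power(f, v, f_ref, v_ref):
    """Dynamic power relative to a reference clock/voltage point."""
    return (f / f_ref) * (v / v_ref) ** 2

# Hypothetical stock point vs the ~0.85 V / ~1700 MHz sweet spot above
stock_mhz, stock_v = 1900.0, 1.00
uv_mhz, uv_v = 1700.0, 0.85

p = relative_power(uv_mhz, uv_v, stock_mhz, stock_v)
print(f"~{(1 - p):.0%} lower dynamic power for "
      f"~{(1 - uv_mhz / stock_mhz):.0%} lower clock")
# -> ~35% lower dynamic power for ~11% lower clock (under these assumptions)
```

That kind of asymmetry is why undervolted cards tend to win on points per watt even when they lose a little raw throughput.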

 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,909
From the same folding forum link in my previous post.

With cuda running, 5.6M PPD is possible.

Some quick numbers from Project 11765 in Linux:

TPF 73s - GTX 1080Ti running OpenCL/ 1.554 M PPD
TPF 57s - GTX 1080Ti running CUDA / 2.253 M PPD
TPF 49s - RTX 2080Ti running OpenCL/ 2.826 M PPD
TPF 39s - RTX 2080Ti running CUDA / 3.981 M PPD
TPF 36s - RTX 3080 running OpenCL / 4.489 M PPD
TPF 31s - RTX 3080 running CUDA / 5.618 M PPD
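For anyone who wants to sanity-check those numbers, here's a quick script (PPD figures copied from the list above, in millions) that works out the CUDA-over-OpenCL gains and the generational jump:

```python
# Cross-check of the CUDA-vs-OpenCL gains from the Project 11765
# numbers quoted above (PPD in millions).

results = {
    # card: (opencl_ppd, cuda_ppd)
    "GTX 1080 Ti": (1.554, 2.253),
    "RTX 2080 Ti": (2.826, 3.981),
    "RTX 3080":    (4.489, 5.618),
}

for card, (ocl, cuda) in results.items():
    print(f"{card}: CUDA is {cuda / ocl - 1:+.0%} over OpenCL")

# Generational jump, CUDA to CUDA: 3080 vs 2080 Ti
print(f"RTX 3080 vs RTX 2080 Ti (CUDA): {5.618 / 3.981 - 1:+.0%}")
```

Interesting that the CUDA-over-OpenCL gain shrinks on the 3080; could just be early drivers.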

I do expect the numbers might improve once the drivers have matured a bit, generally in about 6 months. By then, we might have a new version of FahCore_22 that unlocks more performance too!
 

Endgame

Limp Gawd
Joined
Jan 10, 2007
Messages
315
Very nice, especially if you were one of the lucky few to get a 3080. I want to upgrade my launch-day 1080 Ti, but honestly I'm pretty happy with how it has held up. I completely struck out on the 3080, and I don't expect any better with the 3090, so I guess I'll have to live with 2.2M PPD :D
 