GPU linux apps for NumberFields

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,976
After all these years, we finally have our first GPU app. It's only a beta version for 64-bit linux with Nvidia GPUs. Support for other platforms and GPUs will be coming soon.

If you'd like to help test this app, you will need to check the "run test applications" box on the project preferences page. I generated a special batch of work for this app from some older WUs that I have truth for. This will help to find any potential bugs that are still there.

A few potential issues:
1. This was built with CUDA SDK version 10.1, so it requires a relatively new Nvidia driver and only supports compute capability 3.0 and up. If this affects too many users out there, I will need to rebuild with an older SDK.
2. I was not able to build a fully static executable, but I did statically link the libraries most likely to be a problem (i.e. pari, gmp, and the C++ standard library).

Please report any problems. I am still relatively new to the whole GPU app process, so I am sure there will be issues of some kind.

Also, feel free to leave comments regarding what platform, GPU, etc. I should concentrate on next. I was thinking I would tackle linux OpenCL (i.e. ATI/AMD) next, as that should be a quick port of what I did with Nvidia. I think the windows port will take much longer, since I normally use mingw to cross-compile, and I don't think that's compatible with the nvidia compiler.

https://numberfields.asu.edu/NumberFields/forum_thread.php?id=362
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,976
Eric (the admin) has a GTX 1050 and his points per hour (pphr) is about 28,740, so a 1050 will produce 689,760 PPD. A 1080 Ti will be about 4.5 times faster just going by CUDA core counts, so roughly 3M PPD, assuming it scales linearly.
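For anyone checking the arithmetic, here's a quick sketch (all figures are the rough numbers quoted above, not benchmarks):

```python
# Back-of-the-envelope PPD estimate from the numbers quoted above.
pphr_1050 = 28_740                 # reported points per hour on a GTX 1050
ppd_1050 = pphr_1050 * 24          # points per day

cuda_ratio = 4.5                   # rough 1080 Ti / 1050 speedup quoted above
ppd_1080ti = ppd_1050 * cuda_ratio

print(ppd_1050)                    # 689760
print(round(ppd_1080ti))           # 3103920, i.e. roughly 3M PPD
```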



 

atp1916

[H]ard|DCoTM x1
Joined
Jun 18, 2004
Messages
4,802
Yessss :D

If they can build in Linux, I'm sure it'll be ported to x86.
 

PecosRiverM

Weaksauce
Joined
Mar 23, 2018
Messages
84
It's not that hard
If I can do it anyone can. (after all I've been told I'm not a smart donkey)
 

RFGuy_KCCO

DCOTM x4, [H]ard|DCer of the Year 2019
Joined
Sep 23, 2006
Messages
911
Oh, I can run it easily enough. It isn’t difficult, really. My problem is the lack of hardware monitoring programs for Linux. Give me GPU-Z, CPU-Z, AIDA64, and HWInfo64 ported to Linux and I’d run it. Until then, it’s a non-starter for me.
 

PecosRiverM

Weaksauce
Joined
Mar 23, 2018
Messages
84
I forget who on our team wrote some of those. I don't think a port is coming anytime soon, though.
I don't use them (don't remember ever using them, either).
 

runs2far

Gawd
Joined
Nov 7, 2011
Messages
917
FYI, from the Numberfields front page.

GPU app status update
So there have been some new developments over the last week. It's both good and bad.

First of all, some history. The reason I waited so long to develop a GPU app is because the calculation was heavily dependent on multi-precision libraries (gmp) and number theoretic libraries (pari/gp). Both of these use dynamically allocated memory which is a big no-no in GPUs. I found a multi-precision library online that I could use by hard coding the precision to the maximum required (about 750 bits), thereby removing the dependence on memory allocations. The next piece of the puzzle was to code up a polynomial discriminant function. After doing this, I could finally compile a kernel for the GPU. That is the history for the current GPU app. It is about 20 to 30 times faster than the current cpu version (depends on WU and cpu/gpu speeds).

But then I got thinking... my GPU polynomial discriminant algorithm is different from the one in the PARI library (theirs works for any degree and mine is specialized to degree 10). So to do a true apples-to-apples comparison, I replaced the PARI algorithm with mine in the cpu version of the code. I was shocked by what I found... the cpu version was now about 10x faster than it used to be. I never thought I was capable of writing an algorithm that would be 10x faster than a well established library function. WTF? Now I'm kicking myself in the butt for not having done this sooner!
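For reference (this is the textbook identity, not the project's specialized degree-10 code), the discriminant of a degree-n polynomial f is tied to the resultant of f and its derivative by disc(f) = (-1)^(n(n-1)/2) · Res(f, f') / lc(f). A quick sympy check on a monic degree-10 example:

```python
from sympy import symbols, discriminant, resultant, diff

x = symbols('x')
f = x**10 + 3*x + 2          # an arbitrary monic degree-10 example
n = 10

# disc(f) = (-1)^(n(n-1)/2) * Res(f, f') / lc(f); here lc(f) = 1
d = (-1) ** (n * (n - 1) // 2) * resultant(f, diff(f, x), x)
assert d == discriminant(f, x)
```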

This brings mixed emotions. On one side, it is great that I now have a cpu version that is 10x faster. But it also means that my GPU code is total crap. With all the horsepower in a present day GPU I would expect it to be at least 10x faster than the equivalent cpu version. Compared with the new cpu version, the gpu is only 2 to 3 times faster. That is unacceptable.

So the new plan is as follows:
1. Deploy new cpu executables. Since it's 10x faster, I will need to drop the credit by a factor of 10. (Credits/hour will remain the same for the cpu but will obviously drop for the GPU)
2. Develop new and improved GPU kernels.

I don't blame the GPU users for jumping ship at this point. Frankly, the inefficiency of the current GPU app just makes it not worth it (for them or the project).

For what it's worth, I did have OpenCL versions built. The Nvidia version works perfectly, but the AMD version is buggy for some reason, as is the windows version. Since I will be changing the kernels anyway, there is no point in debugging them yet.

Points have been adjusted again, I do not know what it means for the CPU side, but it looks like PPD for GPU has gone up.
 