GPU linux apps for NumberFields

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,976
After all these years, we finally have our first GPU app. It's only a beta version for 64-bit linux with Nvidia GPUs. Support for other platforms and GPUs will be coming soon.

If you'd like to help test this app, you will need to check the "run test applications" box on the project preferences page. I generated a special batch of work for this app from some older WUs that I have truth for. This will help to find any potential bugs that are still there.

A few potential issues:
1. This was built with CUDA SDK version 10.1, so it requires a relatively new Nvidia driver and only supports compute capability 3.0 and up. If this affects too many users out there, I will need to rebuild with an older SDK.
2. I was not able to build a fully static executable, but I did statically link the libraries most likely to be a problem (i.e. pari, gmp, and the C++ standard library).

Please report any problems. I am still relatively new to the whole GPU app process, so I am sure there will be issues of some kind.

Also, feel free to leave comments regarding what platform, GPU, etc. I should concentrate on next. I was thinking I would tackle linux OpenCL (i.e. ATI/AMD) next, as that should be a quick port of what I did with Nvidia. I think the windows port will take much longer, since I normally use mingw to cross-compile, and I don't think that's compatible with the nvidia compiler.

https://numberfields.asu.edu/NumberFields/forum_thread.php?id=362
 

pututu

[H]ard DCOTM x2
Joined
Dec 27, 2015
Messages
1,976
Eric (the admin) has a GTX 1050 and his points per hour (pphr) is about 28,740, so a 1050 will produce 689,760 PPD. A 1080 Ti will be about 4.5 times faster just going by CUDA core counts, so roughly 3M PPD, assuming it scales linearly.
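For anyone checking the arithmetic, here's a quick sketch (all figures are the rough numbers quoted above, not benchmarks):

```python
# Back-of-the-envelope PPD estimate from the numbers quoted above.
pphr_1050 = 28_740                 # reported points per hour on a GTX 1050
ppd_1050 = pphr_1050 * 24          # points per day

cuda_ratio = 4.5                   # rough 1080 Ti / 1050 speedup quoted above
ppd_1080ti = ppd_1050 * cuda_ratio

print(ppd_1050)                    # 689760
print(round(ppd_1080ti))           # 3103920, i.e. roughly 3M PPD
```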



 

atp1916

[H]ard|DCoTM x1
Joined
Jun 18, 2004
Messages
4,802
Yessss :D

If they can build in Linux, I'm sure it'll be ported to x86.
 

PecosRiverM

Weaksauce
Joined
Mar 23, 2018
Messages
84
It's not that hard
If I can do it anyone can. (after all I've been told I'm not a smart donkey)
 

RFGuy_KCCO

DCOTM x4, [H]ard|DCer of the Year 2019
Joined
Sep 23, 2006
Messages
911
Oh, I can run it easily enough. It isn’t difficult, really. My problem is the lack of hardware monitoring programs for Linux. Give me GPU-Z, CPU-Z, AIDA64, and HWInfo64 ported to Linux and I’d run it. Until then, it’s a non-starter for me.
 

PecosRiverM

Weaksauce
Joined
Mar 23, 2018
Messages
84
I forget who on our team wrote some of those. I don't think a port is coming anytime soon, though.
I don't use them (don't remember ever using them, either).
 

runs2far

Gawd
Joined
Nov 7, 2011
Messages
917
FYI, from the Numberfields front page.

GPU app status update
So there have been some new developments over the last week. It's both good and bad.

First of all, some history. The reason I waited so long to develop a GPU app is because the calculation was heavily dependent on multi-precision libraries (gmp) and number theoretic libraries (pari/gp). Both of these use dynamically allocated memory which is a big no-no in GPUs. I found a multi-precision library online that I could use by hard coding the precision to the maximum required (about 750 bits), thereby removing the dependence on memory allocations. The next piece of the puzzle was to code up a polynomial discriminant function. After doing this, I could finally compile a kernel for the GPU. That is the history for the current GPU app. It is about 20 to 30 times faster than the current cpu version (depends on WU and cpu/gpu speeds).

But then I got thinking... my GPU polynomial discriminant algorithm is different from the one in the PARI library (theirs works for any degree and mine is specialized to degree 10). So to do a true apples-to-apples comparison, I replaced the PARI algorithm with mine in the cpu version of the code. I was shocked by what I found... the cpu version was now about 10x faster than it used to be. I never thought I was capable of writing an algorithm that would be 10x faster than a well established library function. WTF? Now I'm kicking myself in the butt for not having done this sooner!
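For reference (this is the textbook identity, not the project's specialized degree-10 code), the discriminant of a degree-n polynomial f is tied to the resultant of f and its derivative by disc(f) = (-1)^(n(n-1)/2) · Res(f, f') / lc(f). A quick sympy check on a monic degree-10 example:

```python
from sympy import symbols, discriminant, resultant, diff

x = symbols('x')
f = x**10 + 3*x + 2          # an arbitrary monic degree-10 example
n = 10

# disc(f) = (-1)^(n(n-1)/2) * Res(f, f') / lc(f); here lc(f) = 1
d = (-1) ** (n * (n - 1) // 2) * resultant(f, diff(f, x), x)
assert d == discriminant(f, x)
```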

This brings mixed emotions. On one side, it is great that I now have a cpu version that is 10x faster. But it also means that my GPU code is total crap. With all the horsepower in a present day GPU I would expect it to be at least 10x faster than the equivalent cpu version. Compared with the new cpu version, the gpu is only 2 to 3 times faster. That is unacceptable.

So the new plan is as follows:
1. Deploy new cpu executables. Since it's 10x faster, I will need to drop the credit by a factor of 10. (Credits/hour will remain the same for the cpu but will obviously drop for the GPU)
2. Develop new and improved GPU kernels.

I don't blame the GPU users for jumping ship at this point. Frankly, the inefficiency of the current GPU app just makes it not worth it (for them or the project).

For what it's worth, I did have OpenCL versions built. The Nvidia version works perfectly, but the AMD version is buggy for some reason, as is the windows version. Since I will be changing the kernels anyway, there is no point in debugging them yet.

Points have been adjusted again, I do not know what it means for the CPU side, but it looks like PPD for GPU has gone up.
 