GPU overclocking

mwarps · Sep 29, 2006

I was just running a few basic calculations.

If a modern vid card, running around 500MHz core speed throws out 500GFLOPS, that's about 1000 flop/cycle.

Overclocking your GPU by 2 MHz would be like borging a new PC.

I think I need to change my pants.

kaleb_zero · Sep 29, 2006

Not to burst any bubbles, but I think that's a peak-performance, best-case, zero-overhead figure. I imagine it will be closer to 250 GFLOPS, meaning you'll have to raise the core speed a whopping 4 MHz to achieve the processing power of an additional PC.
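For anyone who wants to check the back-of-envelope math in the two posts above, here is a minimal sketch. The 2 GFLOPS "one extra PC" figure is an assumption: it is only what the 2 MHz and 4 MHz numbers imply, not something either poster stated, and the function names are made up for illustration.

```python
def flops_per_cycle(gflops, core_mhz):
    """Floating-point operations completed per core clock cycle."""
    return (gflops * 1e9) / (core_mhz * 1e6)


def mhz_for_extra_gflops(extra_gflops, gflops, core_mhz):
    """Core clock increase (MHz) needed to gain extra_gflops,
    assuming throughput scales linearly with core clock."""
    per_cycle = flops_per_cycle(gflops, core_mhz)
    return (extra_gflops * 1e9) / per_cycle / 1e6


# Assumed value implied by the posts: one "extra PC" is about 2 GFLOPS.
PC_GFLOPS = 2.0

peak = mhz_for_extra_gflops(PC_GFLOPS, gflops=500, core_mhz=500)
realistic = mhz_for_extra_gflops(PC_GFLOPS, gflops=250, core_mhz=500)
print(f"peak figure: {flops_per_cycle(500, 500):.0f} flop/cycle, {peak:.1f} MHz per PC")
print(f"halved figure: {flops_per_cycle(250, 500):.0f} flop/cycle, {realistic:.1f} MHz per PC")
```

Both results lean on peak rates and perfectly linear scaling, so treat them as upper bounds rather than predictions.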

drizzt81 · Sep 29, 2006

who cares if that doesn't mean that you get 120ppd more

mwarps · Sep 30, 2006

even if it's 4MHz.. I have been scouring for OC figures with watercooling. Some are getting almost 800 MHz core compared with 650 stock. Sooooo very exciting

mhouston · Sep 30, 2006

The scaling is not quite that simple, gents, as you are using very much peak rates. We do scale close to linearly with core clock; memory clock is less important at the moment. The boards are already running a little warm, and we've had correctness issues at ~100C, but at default 3D clocks and voltages we stay below 90C with the stock reference cooler. We do run significantly cooler than 3DMark06 and Oblivion.

mwarps · Oct 1, 2006

Not a linear increase, pooh.
Oh well. Overclocking will be done under water for me, I can't stand heat.

Macaholic · Oct 1, 2006

mwarps said:
Not a linear increase, pooh.
Oh well. Overclocking will be done under water for me, I can't stand heat.

Just to play it safe. How about doing a few reps on the GPUBench so you won't hurt yourself.

Jon855 · Oct 1, 2006

mwarps said:
even if it's 4MHz.. I have been scouring for OC figures with watercooling. Some are getting almost 800 MHz core compared with 650 stock. Sooooo very exciting

I can hit 700 Core and 690 Mem on air easily... Although that's the highest I can push my Video card with the Catalyst Control Panel... Not a big fan of ATi Tools here. Aftermarket cooling of course.

Viper87227 · Oct 1, 2006

mwarps said:
even if it's 4MHz.. I have been scouring for OC figures with watercooling. Some are getting almost 800 MHz core compared with 650 stock. Sooooo very exciting

This is why I wish I could use my 7900GT. Volt mods FTW! 450MHz to 680MHz is none too shabby.

mwarps · Oct 1, 2006

Macaholic said:
Just to play it safe. How about doing a few reps on the GPUBench so you won't hurt yourself.

95% of the people here have never heard of this.

How about leaving some useful instructions for us instead of a drive-by spamming? Please don't say "read the docs", either. These docs are obviously written by the developer and useless to people who are "trying not to hurt themselves"

Downloaded it, ran everything... Doesn't tell me anything useful since it requires an intimate knowledge of what it's doing

mhouston · Oct 1, 2006

Yes, we don't produce a single number as that is bogus for what we do. GPUBench is designed to let us understand the architectures better, and specifically how the memory system works, how the instruction issue works, and how fast readback and download are so we know the costs of offloading to the board.

But, you can play with the clocks on your board and run the instruction issue test with MAD's to see what your board can crank out in terms of FLOPs.

Or, you can run the GPUBench.pl script and generate the webpages you see in the results section on the GPUBench site.

mwarps · Oct 1, 2006

mhouston said:
Yes, we don't produce a single number as that is bogus for what we do. GPUBench is designed to let us understand the architectures better, and specifically how the memory system works, how the instruction issue works, and how fast readback and download are so we know the costs of offloading to the board.

But, you can play with the clocks on your board and run the instruction issue test with MAD's to see what your board can crank out in terms of FLOPs.

Or, you can run the GPUBench.pl script and generate the webpages you see in the results section on the GPUBench site.

Ah, very good. Time to get perl on my windows box again, looks like

EDIT: No, never mind. making the program work on win32/activeperl5.8.x should not be *this* hard.

Somewhere in the documentation provided, there should be some sort of simple "getting started" section listing the requirements (jgraph and epstopdf, to say the least), including which distribution of perl will actually read "/" as opposed to "\". I don't want to flame here, but again, a very good percentage of us are not running development environments on our folding machines, or even our main boxen.

Thankfully, all the little applets seem to work fine. Just the perl is fubar.

drizzt81 · Oct 1, 2006

those gpubench numbers are interesting. apparently for subtraction the 7800GTX is beyond excellent??

mhouston · Oct 1, 2006

@mwarps: point taken, but GPUBench was really set up for GPU developers, not end users. I thought the readme said what tools were needed, but that is probably only in the CVS tree. We haven't done another release since most of the GPGPU folk just grab from CVS.

@drizzt81: We fixed this just recently. Nvidia was correctly optimizing out the shader based on the constants we were passing in. We just haven't rerun the graphs.

mwarps · Oct 1, 2006

mhouston said:
@mwarps: point taken, but GPUBench was really set up for GPU developers, not end users. I thought the readme said what tools were needed, but that is probably only in the CVS tree. We haven't done another release since most of the GPGPU folk just grab from CVS.

@drizzt81: We fixed this just recently. Nvidia was correctly optimizing out the shader based on the constants we were passing in. We just haven't rerun the graphs.

Yeah, I figured that end users really don't need this sort of information.

Fiddled with some of the thingies, and my 7900GS Go is about 7.5GFLOPS.. Slow I am assuming for this sort of thing we're doing

I am guessing that the reason Macaholic brought the program into the thread was because of the error calculations.. Just so we could see that the GPU isn't producing wildly insane results.

drizzt81 · Oct 1, 2006

mhouston said:
@drizzt81: We fixed this just recently. Nvidia was correctly optimizing out the shader based on the constants we were passing in. We just haven't rerun the graphs.

thanks for that info. And welcome to the forums

mhouston · Oct 1, 2006

mwarps said:
Yeah, I figured that end users really don't need this sort of information.

Fiddled with some of the thingies, and my 7900GS Go is about 7.5GFLOPS.. Slow I am assuming for this sort of thing we're doing

I am guessing that the reason Macaholic brought the program into the thread was because of the error calculations.. Just so we could see that the GPU isn't producing wildly insane results.

That sounds too low, my guess is that's 7.5 GInst/sec, so multiply by 8 for GFlops.

The FAH client does do sanity checking on its own, but you may not find out about the error until you've put in a bunch of work. The issue with using something like 3DMark or a game is that you might not visually notice errors unless things are REALLY bad. Since we are a numerical app, we'll notice this sooner. My guess is this will also affect GPU physics acceleration. The issue is that while we could do something like Prime95 for GPUs, basically something to test heat and numerics, it will not be an exact match for every GPGPU app. Hrmm, maybe we can throw something together.
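A quick sketch of the instructions-to-FLOPs conversion mentioned above. The factor of 8 is taken from the post; the breakdown into a 4-wide vector times 2 flops per MAD (one multiply plus one add) is the usual way such a factor arises, but it is an assumption here, as is the function name.

```python
# Assumed breakdown of the "multiply by 8": a MAD is 2 floating-point
# operations, issued across a 4-component vector.
FLOPS_PER_MAD = 2
VECTOR_WIDTH = 4


def ginst_to_gflops(ginst_per_sec, width=VECTOR_WIDTH, flops_per_lane=FLOPS_PER_MAD):
    """Convert an instruction issue rate (GInst/sec) to GFLOPS."""
    return ginst_per_sec * width * flops_per_lane


# The 7.5 figure reported for the 7900GS Go, read as GInst/sec:
print(ginst_to_gflops(7.5), "GFLOPS")
```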

mwarps · Oct 1, 2006

mhouston said:
That sounds too low, my guess is that's 7.5 GInst/sec, so multiply by 8 for GFlops.

The FAH client does do sanity checking on its own, but you may not find out about the error until you've put in a bunch of work. The issue with using something like 3DMark or a game is that you might not visually notice errors unless things are REALLY bad. Since we are a numerical app, we'll notice this sooner. My guess is this will also affect GPU physics acceleration. The issue is that while we could do something like Prime95 for GPUs, basically something to test heat and numerics, it will not be an exact match for every GPGPU app. Hrmm, maybe we can throw something together.

It might be a good idea to put something like that together, because I know there are a lot of us here who might want to try it, we're a pretty crazy bunch.

marty9876 · Oct 1, 2006

Last edited by mhouston : Today at 04:09 PM. Reason: I need spell check in posts...

Google toolbar has spellcheck, can't live without it. All thou, never really use it here...

+1 for a Prime95 for GPU's.

drizzt81 · Oct 1, 2006

marty9876 said:
+1 for a Prime95 for GPU's.

+2 for a Prime95 for GPU's.

mhouston · Oct 1, 2006

marty9876 said:
All thou, never really use it here...

Apparently so. ;-)

marty9876 · Oct 1, 2006

mhouston said:
Apparently so. ;-)

go borg that thing

http://spire.stanford.edu/ganglia/

mhouston · Oct 1, 2006

No PCIe slots. This is why I want someone to make higher end X1k's in AGP. ;-) We do have a really good interconnect. ;-)

For more info:
http://spire.stanford.edu

marty9876 · Oct 1, 2006

I think they might have a Linux client out for FAH, you might want to check with Stanford...

mhouston · Oct 1, 2006

Yes, but SPIRE is a research system that we have to run performance runs on, so no folding. Since the quarter has started again, that machine will start cranking on research soon. But, we did help the folding folks build a nice GPU cluster for folding.

marty9876 · Oct 1, 2006

If possible, can you provide any info on the GPU cluster you guys keep referring to? We're always interested in folding porn here.

Links/pics/root access - we're flexible.

Just giving you a hard time on SPIRE.

mhouston · Oct 1, 2006

How about LLNL's Gauss. You can put 512 R580's in that thing. That design was a follow-on to the SPIRE design, built by Graphstream.

http://www.graphstream.com (image on front page)

Too bad we all can't have a machine like that...

marty9876 · Oct 1, 2006

Interesting that the install mixes both ATI and Nvidia systems.

mhouston · Oct 1, 2006

I believe that setup is still running 256 G70 based Quadro's

drizzt81 · Oct 4, 2006

I have a question about that GPUBench:

the last test in the set checks for precision of the output. Would it make sense to take a close look at this when overclocking? If precision drops -> clockspeed too high?

mhouston · Oct 4, 2006

Not really. That is basically a test for how close to IEEE the GPU is. As people overclock, the hope is that the tests built into the FAH client will catch major issues. However, it may take quite a while for things to go wrong. But, usually when things go bad, they go REALLY bad.

drizzt81 · Oct 4, 2006

mhouston said:
Not really. That is basically a test for how close to IEEE the GPU is. As people overclock, the hope is that the tests built into the FAH client will catch major issues. However, it may take quite a while for things to go wrong. But, usually when things go bad, they go REALLY bad.

Since it obviously is detrimental to the F@H team to submit WRONG results, rather than none, is there any recommendation on how to test for 'long-term' stability? In the end, I "need" to overclock my GPU (why else did I pay for that watercooling loop) yet would love to fold... or should I just wait for "gpu prime95" to be released at some point in the future and have per-game custom settings?

mwarps · Oct 4, 2006

drizzt81 said:
Since it obviously is detrimental to the F@H team to submit WRONG results, rather than none, is there any recommendation on how to test for 'long-term' stability? In the end, I "need" to overclock my GPU (why else did I pay for that watercooling loop) yet would love to fold... or should I just wait for "gpu prime95" to be released at some point in the future and have per-game custom settings?

Pretty much a similar response here.

I have a need to overclock based on a watercooling system as well.

I think something that we can do in the meantime is run a bunch of the "precision" benches from GPUbench and run a diff on the outputs from them. If they change, the OC is bad. Please note this is just an idea and I don't know precisely how the "precision" program works
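A minimal sketch of the "run it several times and diff the outputs" idea. It assumes the benchmark output has already been captured as text by whatever means; no GPUBench command line or output format is assumed, and the function names are hypothetical. As mhouston's reply earlier in the thread points out, the precision test measures closeness to IEEE rather than stability under load, so identical outputs here would not prove an overclock is safe.

```python
import hashlib


def digest(text):
    """Hash one captured run, ignoring trailing whitespace differences."""
    normalized = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def outputs_consistent(outputs):
    """True if every captured run produced the same normalized output."""
    return len({digest(o) for o in outputs}) <= 1


# Hypothetical captured outputs from three runs at the same clocks.
runs = ["sin 1.0e-06\ncos 2.0e-06\n", "sin 1.0e-06\ncos 2.0e-06\n", "sin 1.0e-06\ncos 9.9e-01\n"]
print("consistent" if outputs_consistent(runs) else "outputs differ, back off the clocks")
```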

drizzt81 · Oct 4, 2006

mwarps said:
Pretty much a similar response here.
I have a need to overclock based on a watercooling system as well.

I think something that we can do in the meantime is run a bunch of the "precision" benches from GPUbench and run a diff on the outputs from them. If they change, the OC is bad. Please note this is just an idea and I don't know precisely how the "precision" program works

that is exactly what I asked here:

drizzt81 said:
I have a question about that GPUBench:

the last test in the set checks for precision of the output. Would it make sense to take a close look at this when overclocking? If precision drops -> clockspeed too high?

and got this response to it, which says "it doesn't work like that":

mhouston said:
Not really. That is basically a test for how close to IEEE the GPU is. As people overclock, the hope is that the tests built into the FAH client will catch major issues. However, it may take quite a while for things to go wrong. But, usually when things go bad, they go REALLY bad.

Maybe the precision benchmark only runs a few iterations, instead of a 'stress test'?

mhouston · Oct 4, 2006

Correct, the GPUBench stuff won't heat up a board enough to be interesting. I'll see if we can come up with something numerically sensitive that we can run deep enough to check for badness. This may take a while, as getting the FAH client stable is primary right now, that and my thesis. ;-)
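To illustrate what "numerically sensitive and run deep" could look like, here is a rough sketch of the comparison logic only. It runs on the CPU in plain Python, so it will not heat up or exercise a GPU, and it is not the test the FAH team had in mind. The logistic map is a swapped-in example of a chaotic iteration where a single flipped bit quickly changes the final value; all names and parameters are assumptions.

```python
def logistic_checksum(steps, x0=0.123456789, r=3.99):
    """Iterate the chaotic logistic map; any arithmetic error along the
    way is amplified, so the final value acts as a sensitive checksum."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def run_stress(passes, steps, reference=None):
    """Repeat the computation and count passes whose result differs
    bit-for-bit from the reference (ideally taken at stock clocks)."""
    if reference is None:
        reference = logistic_checksum(steps)
    mismatches = 0
    for _ in range(passes):
        if logistic_checksum(steps) != reference:
            mismatches += 1
    return mismatches


bad = run_stress(passes=20, steps=10_000)
print("stable" if bad == 0 else f"{bad} bad passes, hardware is producing wrong results")
```

A real version would run the same iteration on the card itself, for long enough to reach steady-state temperature, and compare against a reference computed at default clocks.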

mwarps · Oct 4, 2006

mhouston said:
*snip* This may take a while, as getting the FAH client stable is primary right now, that and my thesis. ;-)

*jaw agape*

Dude. You're a monster.

stonedwaldo420 · Oct 6, 2006

When overclocking with FAH on my 1900XT, I have just been careful to adjust the fan speed so that temps stay below 75C. It REALLY helps that it has been getting down into the 30s at night here.

Anyway, I also did some tests while overclocking, and it appears that increasing just the GPU core clock gives about 93% scaling (i.e., a 10% increase in core clock speed results in a 9.3% increase in folding performance). That is really nice scalability. Is that what other folks have been getting? What about you at Stanford? Just thought I would share my results.
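If anyone wants to compute the same scaling figure from their own runs, a minimal sketch follows. The clock and performance numbers are placeholders chosen to reproduce the 10% and 9.3% example above, not measurements, and the function name is made up.

```python
def scaling_efficiency(base_clock, oc_clock, base_perf, oc_perf):
    """Fraction of a core clock increase that shows up as performance.
    1.0 means perfectly linear scaling."""
    clock_gain = oc_clock / base_clock - 1.0
    perf_gain = oc_perf / base_perf - 1.0
    return perf_gain / clock_gain


# Placeholder numbers: +10% core clock, +9.3% folding performance.
eff = scaling_efficiency(base_clock=100.0, oc_clock=110.0, base_perf=100.0, oc_perf=109.3)
print(f"scaling efficiency: {eff:.0%}")
```

Performance can be in any consistent unit (points per day, frames per hour), since only the ratio matters.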

mhouston · Oct 6, 2006

The temperature target in testing is <90C, so you can save your ears a little and let the board run hotter if you wish. If you don't mind the noise, running the chip cooler for this and games will make the chip last longer. But, we are probably talking multi-year happiness at high temps already. The main issue with running the fan high 24/7 is we don't have stats on the bearing life in the fans they use. I know from cluster experience that we are probably also talking in terms of a few years, but these boards haven't been out that long to know. ;-)

We are trying to throw together a stress test so people can see when they start throwing bad data when overclocking or running warm.
