GTX 1080 NVIDIA presentation leaked

Overclocking headroom on the 1080 looks to be massive, if that report is to be believed. Over 2 GHz with a water block.
 
Depends on how far past 2 GHz it goes. It needs to hit at least 2100 MHz to overclock as well as a 980 Ti (+25%).
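
Napkin math on that, assuming the ~1700 MHz stock boost clock floating around in the leaks (that figure is the assumption here):

```python
# What clock does the 1080 need to match a 980 Ti-style +25% overclock?
# Assumes a ~1700 MHz stock boost clock from the leaked slides.
stock_boost_mhz = 1700
oc_headroom = 0.25

target_mhz = stock_boost_mhz * (1 + oc_headroom)
print(f"needs roughly {target_mhz:.0f} MHz")  # -> needs roughly 2125 MHz
```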
 
If it hits 2 GHz at all, that would be pretty impressive.

We'll see.
 
I'm shocked they changed the SM configuration... If each SM has double the number of SPs as GP100, then the number of registers per SM is exactly the fucking same as Maxwell hahaha
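
For anyone who wants the napkin version (the 64K registers per SM is from the public Maxwell/GP100 whitepapers; GP104 keeping 128 cores per SM is the leaked/assumed part):

```python
# Registers per CUDA core. 64K 32-bit registers per SM is the whitepaper
# figure for Maxwell SMM and GP100; GP104's 128-core SM is assumed from the leak.
regs_per_sm = 64 * 1024

for name, cores in [("Maxwell SMM", 128), ("GP100 SM", 64), ("GP104 SM (assumed)", 128)]:
    print(f"{name}: {regs_per_sm // cores} registers per core")
# Maxwell SMM: 512 registers per core
# GP100 SM: 1024 registers per core
# GP104 SM (assumed): 512 registers per core  <- same ratio as Maxwell
```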
 
Yep they were, pretty sure this card will be able to do more than 2 GHz on water, if you can deliver enough power to it.
 

Exactly. One 8-pin isn't enough if you want to hit 2.5 GHz, which some people claim is possible.

Bring on the MSI Lightning card!
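
Rough sketch of why one 8-pin looks tight for 2.5 GHz. The 180 W TDP and ~1733 MHz boost are the figures in the leaks; the cubic power scaling is pure napkin assumption:

```python
# Very rough power estimate at 2.5 GHz. Assumes P scales ~cubically with
# clock (P ~ f * V^2, with voltage rising roughly linearly with frequency).
base_power_w = 180.0      # leaked TDP
base_clock_mhz = 1733.0   # leaked boost clock
target_clock_mhz = 2500.0

est_power_w = base_power_w * (target_clock_mhz / base_clock_mhz) ** 3
budget_w = 150 + 75       # one 8-pin (150 W) plus the PCIe slot (75 W)
print(f"~{est_power_w:.0f} W estimated vs {budget_w} W of in-spec power")
# -> ~540 W estimated vs 225 W of in-spec power
```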
 
I think this is how it will work: Pascal SMs still cannot do graphics + compute concurrently because of the expensive context switch.

The solution is to partition the GPC (i.e., at the level of individual SMs) between graphics and compute.

The problem with Maxwell is that the SM partitioning (afaik) could only be altered at each drawcall, whereas with Pascal's pixel-, triangle-, and instruction-level preemption the repartitioning can be done at finer granularity.

I expect this to work on Maxwell as well, but it will need careful profiling to avoid pipeline stalls.
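
Toy model of the difference, with all numbers invented, just to show why a partition that's fixed for a whole drawcall leaves SMs idle:

```python
# Static vs. dynamic SM partitioning, with made-up work units.
def static_time(gfx_sms, cmp_sms, gfx_work, cmp_work):
    # Split is fixed for the whole drawcall; finished SMs can't switch queues.
    return max(gfx_work / gfx_sms, cmp_work / cmp_sms)

def dynamic_time(total_sms, gfx_work, cmp_work):
    # Fine-grained preemption: any idle SM picks up whatever work remains.
    return (gfx_work + cmp_work) / total_sms

SMS, GFX, CMP = 20, 90.0, 30.0
print(f"static 50/50 split: {static_time(SMS // 2, SMS // 2, GFX, CMP):.1f}")
print(f"dynamic:            {dynamic_time(SMS, GFX, CMP):.1f}")
# static 50/50 split: 9.0   (compute SMs finish at 3.0 and idle for the rest)
# dynamic:            6.0
```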

Also, anyone notice how The Witcher 3 is featured on their async compute slide?
 
Yeah, but it coulda been 96 pixels per clock... at 2000 MHz.

That would require increasing the ROP:bus-width ratio, which they already did with Maxwell.

This will be fine, man; it's memory bandwidth that's the question mark.
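
The bandwidth worry in numbers, using the leaked 64 ROPs and the rumored 10 Gbps GDDR5X on a 256-bit bus (all of it unconfirmed):

```python
# Peak pixel write traffic vs. memory bandwidth, napkin version.
rops = 64
core_clock_ghz = 2.0      # the hoped-for overclock
bytes_per_pixel = 4       # 32-bit color, ignoring Z/blending traffic

fill_gb_s = rops * core_clock_ghz * bytes_per_pixel
mem_gb_s = 10 * 256 / 8   # 10 Gbps GDDR5X on a 256-bit bus

print(f"{fill_gb_s:.0f} GB/s of pixel writes vs {mem_gb_s:.0f} GB/s of bandwidth")
# -> 512 GB/s of pixel writes vs 320 GB/s of bandwidth
```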
 
Pascal's SMs can now run both compute and graphics kernels at the same time; it's in the slides. This was not doable with Maxwell. As a whole you could have different queues on different SMs, but if you tried to force one Maxwell SM to do both, you'd end up with major underutilization of the ALUs whenever the scheduler predicted incorrectly, hence the performance penalties on heavy compute tasks, as the graphics queue would have to wait.
 
Can you link to the specific slide? I didn't see anything to this effect :p

This suggests the context switch penalty has been reduced.
 
Well, check the dynamic load slides.

There is more... it just hasn't been leaked yet.

The dynamic load slide mentions GPU partitioning; nothing explicitly references SM-level concurrency and/or a reduced context switch penalty.

Can't wait for more info :D

If there's more that hasn't leaked yet... then start leaking already! XD
 
I think it's clear from the slides that there's no more static partitioning with Pascal. The slide also explains why Maxwell performance tanks with async.
 
Yeah, I got that, but how does this relate to the context switch cost on the SMs? Am I having a brainfart?

Afaik Maxwell can only repartition at the drawcall.
 
The pre-emption slide mentions an under-100-microsecond switching cost for games. It didn't really address SM-level concurrency, which is not necessarily something you want anyway, i.e. compute and graphics fighting over the same L1 cache.

Still don't quite get how pre-emption is a general solution for async. Pre-emption is useful if you want to interrupt a running task to make way for a higher priority kernel (like VR time warp). But async is more about running multiple kernels concurrently to take full advantage of all available execution resources.
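
For scale, here's what a <100 µs switch costs against a frame budget (refresh rates picked arbitrarily):

```python
# A <100 microsecond preemption hit as a fraction of the frame budget.
switch_cost_us = 100.0

for hz in (60, 90):  # desktop and VR refresh rates
    frame_budget_us = 1e6 / hz
    print(f"{hz} Hz: {switch_cost_us / frame_budget_us:.1%} of a frame")
# 60 Hz: 0.6% of a frame
# 90 Hz: 0.9% of a frame
```

So it's cheap enough for one-off interruptions like time warp, but it still says nothing about keeping two kernels resident at once.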
 
My understanding is that you won't have SM-level concurrency, but that concurrency arises from having multiple SMs (each working on either graphics or compute) executing g+c asynchronously and concurrently. Actually, if they're different SMs it will be in parallel.

The preemption should allow for repartitioning the SMs, no?
 
Preemption and context switching should only be used for prioritizing or syncing; think of the critical path.
Speaking of the critical path, notice how Nvidia wrote "path optimization" on the slide with the clocks. It ties in nicely to all our rants about clock speed being mainly an architectural limitation.
 
According to Nvidia's slides, the 1080 should be 70% faster than a 980. Hot damn.
Show me a 2.5 GHz OC and we have a winner.
 
A Titan with 3840 SPs @ 2 GHz would be 15 TFLOPS :D
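
The math checks out, assuming that rumored 3840-SP part exists and actually holds 2 GHz:

```python
# Peak FP32: shader processors x 2 FLOPs per clock (FMA) x clock speed.
sps, clock_ghz = 3840, 2.0
print(f"{sps * 2 * clock_ghz / 1000:.2f} TFLOPS")  # -> 15.36 TFLOPS
```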

Yup, plus some nice HBM2 tossed into the mix. That's the true Pascal; this is all just midrange fluff ;) If I were in the market for midrange Pascal, the $380 1070 in SLI seems a way better value than the $600 1080.
 
Halfway there to make that FS/FT post quota! Keep those two-word non sequiturs coming, you're almost there!

Edit: Holy smokes, just noticed you joined yesterday. You must really want to buy/sell something.

Almost! But no, I don't know why it says I just joined yesterday; I've been here for over a month, I believe. Definitely didn't post 50+ in a day lol

Edit: It's fucking May lmao
 