Hello GV100

Shintai

R&D Budget: $3B USD
https://devblogs.nvidia.com/parallelforall/inside-volta/

Volta is 50% more power efficient than Pascal.
1455 MHz boost, down from 1480 MHz on GP100.
Samsung HBM2 at 1.75 Gb/s per pin.
320 TMUs (+50% vs GP100)
TSMC 12nm FFN, a 4th-generation 16nm process made specially for Nvidia to their demands.
Maxed out the reticle size of TSMC's process.
5376 CUDA cores on the chip; 4 SMs disabled for yield, like GP100.
30 TFLOPS FP16, 120 TFLOPS Tensor.
300W card.
NVLink 2.0.
Also a full-height, half-length, single-slot 150W card with lower clocks.
FP32 and INT32 are now separate cores (they can operate independently and in parallel).
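For anyone wanting to sanity-check the headline numbers, the usual back-of-envelope formulas reproduce them. A rough sketch using the figures above (my arithmetic, not official math):

```python
# Peak FLOPS = 2 ops per FMA x cores x clock (rough sanity check).
cuda_cores = 5120        # enabled cores (5376 on die, 4 SMs disabled)
boost_ghz = 1.455        # boost clock in GHz

fp32_tflops = 2 * cuda_cores * boost_ghz / 1000    # ~14.9 TFLOPS
fp16_tflops = 2 * fp32_tflops                      # ~29.8, the "30 TFLOPS FP16"

# HBM2 bandwidth: 4096-bit bus at 1.75 Gb/s per pin.
hbm2_gb_s = 4096 * 1.75 / 8                        # 896 GB/s

print(round(fp32_tflops, 1), round(fp16_tflops, 1), hbm2_gb_s)
```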



 
So how long did it take from GP100 to GP104 launch?
Looks like I'm skipping the 1080Ti.
 
And here I thought the 600 mm² die on P100 gave Nvidia nowhere to grow.

Looks like I was wrong :D
 
Holy 815 mm²... huge die, lots of CUDA cores. ~50% faster than Pascal.
 
Volta arrives in Q3 for GV100; consumer cards won't be around for a bit longer. But it makes one optimistic about seeing something for consumers in the next year.
Holy 815 mm²... huge die, lots of CUDA cores. ~50% faster than Pascal.
I wonder how much is due to the massive size of this beast. 3840 CUDA cores on P100 to 5120 on V100, a 33% increase, with a 50% increase in general performance. Very interesting.
 
Volta arrives in Q3 for GV100; consumer cards won't be around for a bit longer. But it makes one optimistic about seeing something for consumers in the next year.

I wonder how much is due to the massive size of this beast. 3840 CUDA cores on P100 to 5120 on V100, a 33% increase, with a 50% increase in general performance. Very interesting.

Yeah, I figured there'd be an efficiency improvement like Maxwell's, but smaller (because each additional efficiency step is harder than the last). Sounds like I was right.
 
Yeah, I figured there'd be an efficiency improvement like Maxwell's, but smaller (because each additional efficiency step is harder than the last). Sounds like I was right.

It's going to be hard to judge any efficiency improvements directly though, since the change to HBM will have a big impact on both performance and power.
 
Can the mods merge the two Volta threads? It's silly that I feel compelled to post this twice lol

 
GV104, if I had to guess, will be:
256-bit GDDR5X/GDDR6 at 12-14 Gbps
3584 cores
240 TMUs
128 ROPs
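If those guessed specs hold, memory bandwidth is easy to work out (bus width and data rates here are just the guesses above, nothing confirmed): GB/s = bus bits / 8 × per-pin Gb/s.

```python
# Bandwidth implied by the guessed GV104 memory config (speculative).
bus_bits = 256
bw = {gbps: bus_bits // 8 * gbps for gbps in (12, 14)}
print(bw)   # {12: 384, 14: 448} GB/s
```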
 
P100 already used HBM2. Same shit, higher clocks.

Consumer Volta will use GDDR6, just like consumer Pascal used GDDR5X.

True. The question will be if they'll eventually roll out a Titan / Ti model with HBM. I'm guessing this will depend largely on the price/availability of the HBM parts, and the performance they get out of GDDR6.
 
True. The question will be if they'll eventually roll out a Titan / Ti model with HBM. I'm guessing this will depend largely on the price/availability of the HBM parts, and the performance they get out of GDDR6.

GV102 will use GDDR6. Forget all the HBM dreams. GV102 may have 768 GB/s via GDDR6.
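That 768 GB/s figure lines up with a 384-bit bus at 16 Gb/s GDDR6. Both numbers are my assumptions for illustration, neither is confirmed:

```python
# 768 GB/s works out exactly for an assumed 384-bit bus at 16 Gb/s per pin.
bus_bits = 384     # assumed x02-class bus width
pin_gbps = 16      # assumed GDDR6 data rate
gv102_gb_s = bus_bits // 8 * pin_gbps
print(gv102_gb_s)  # 768
```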
 
P100 already used HBM2. Same shit, higher clocks.

Lower clocks, actually. I heard some people on Reddit saying the 15 TFLOPS figure is disappointing for such a large die, so I obviously stepped in to clear this up.

815 / 610 (GV100 vs GP100) = 1.336
(5120 × 1455) / (3840 × 1480) = 1.31

So not quite linear scaling with die size, but hey, the cache is absolutely massive, there's tons of register file, and the new Tensor Cores certainly take up die space. It's ALMOST scaling linearly with size, and that's not to mention the clock deficit.

I would expect 30% increased performance across the board at every price point.
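The scaling arithmetic above, spelled out (same numbers as the post, just computed):

```python
# Die area ratio vs raw throughput ratio (cores x clock), GV100 vs GP100.
die_ratio = 815 / 610
tput_ratio = (5120 * 1455) / (3840 * 1480)
print(round(die_ratio, 3), round(tput_ratio, 3))   # 1.336 1.311
```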
 
GV102 will use GDDR6. Forget all the HBM dreams. GV102 may have 768 GB/s via GDDR6.

I've seen a ton of people claim they won't buy a new card without HBM. Doesn't make sense to me...I don't care what process it's built on, what kind of memory it uses or what brand name it stamped on it. I really care about FPS, and as long as I get more FPS, I'm happy. I do care a bit about power consumption, but as long as it's <300W, I'm happy.
 
Lower clocks, actually. I heard some people on Reddit saying the 15 TFLOPS figure is disappointing for such a large die, so I obviously stepped in to clear this up.

815 / 610 (GV100 vs GP100) = 1.336
(5120 × 1455) / (3840 × 1480) = 1.31

So not quite linear scaling with die size, but hey, the cache is absolutely massive, there's tons of register file, and the new Tensor Cores certainly take up die space. It's ALMOST scaling linearly with size, and that's not to mention the clock deficit.

I would expect 30% increased performance across the board at every price point.

How can anyone be effing disappointed! (Not directed at you, but man, some do not get how important Pascal, followed now by Volta, really is.)
They need to realise it is a hybrid mixed-precision effing GPU with FP64 (large cores and heavy TDP demand) at a massive 7.5 TFLOPS, while also delivering 15 TFLOPS FP32, and they are disappointed...
Makes me want to swear, especially as the idiots have not realised this is another large efficiency/IPC jump, with FP32/FP16 mixed precision taken to a new level again, let alone the fundamental changes that have already been emphasised.

So on the same 16nm-class node (albeit the latest iteration) they managed to keep the identical 300W TDP of P100 while not only increasing die size by 33.6% but increasing core performance by 41.5%, and that ignores the massive FP16/FP32 mixed-precision acceleration improvements.
In other words, they are squeezing more onto the chip than ever before at the same TDP. They also increased the NVLink connections from 4 to 6 and raised the per-link bandwidth as well: 50% more links, each going from 40 GB/s to 50 GB/s. I think some really do not get what Nvidia has achieved (again, not you).
Cheers
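The NVLink numbers mentioned work out as follows (per-link figures as stated in the post; the aggregates are my arithmetic, not an official spec):

```python
# Aggregate NVLink bandwidth: number of links x per-link GB/s.
p100_total = 4 * 40    # 160 GB/s aggregate on P100
v100_total = 6 * 50    # 300 GB/s aggregate on V100
print(v100_total, v100_total / p100_total)   # 300 1.875
```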
 
I think it's safe to expect GV104 will be around Titan Xp / 1080 Ti performance.
 
For some reason, I cannot stop seeing JHH dancing on the stage in those snapshots...
 
It will be interesting to see how long they hold back the Titan X version after the "normal" high-end card this time. I do wish they would give us the option to buy a consumer card with HBM2, even if it cost more.
 
That would probably create complexity in driver development, so not such a good option. Added to that, would all that extra bandwidth even be needed for a consumer GPU?...

Outside of form factor size, I don't see any immediate advantages for consumers for HBM products.
 
That would probably create complexity in driver development, so not such a good option. Added to that, would all that extra bandwidth even be needed for a consumer GPU?...

Outside of form factor size, I don't see any immediate advantages for consumers for HBM products.

I am not sure it would make an appreciable difference or be worth it either, since it sounds a lot more costly to make. It would be interesting just to see what, if anything, it brings to the table compared to GDDR5X/GDDR6 on the same card in terms of real-world performance.

In any case, this makes watching how Vega turns out and is priced interesting too.
 
I am not sure it would make an appreciable difference or be worth it either, since it sounds a lot more costly to make. It would be interesting just to see what, if anything, it brings to the table compared to GDDR5X/GDDR6 on the same card in terms of real-world performance.

In any case, this makes watching how Vega turns out and is priced interesting too.


It would be quite a bit: bus width differences, different cache accesses, timings, etc. Think about using different memory timings for RAM on a motherboard and how that affects performance, then scale that up to a GPU that needs to be fed consistently across all its units. Drivers govern this quite a bit. And yeah, cost is going to be higher to maintain two different variants too.
 