NVIDIA Shares Blackwell GPU Compute Stats: 30% More FP64 Than Hopper, 30x Faster In Simulation & Science, 18X Faster Than CPUs

erek

[H]F Junkie

"NVIDIA has shared more performance statistics of its next-gen Blackwell GPU architecture which has taken the industry by storm. The company shared several metrics including its science, AI, & simulation results versus the outgoing Hopper chips and competing x86 CPUs when using Grace-powered Superchip modules.

NVIDIA's Monumental Performance Gains With Blackwell GPUs Aren't Just Limited To AI; Science & Simulation See Huge Boost Too

In a new blog post, NVIDIA has shared how Blackwell GPUs are going to add more performance to the research segment which includes Quantum Computing, Drug Discovery, Fusion Energy, Physics-based simulations, scientific computing, & more. When the architecture was originally announced at GTC 2024, the company showcased some big numbers but we have yet to get a proper look at the architecture itself. While we wait for that, the company has more figures for us to consume."



Source: https://wccftech.com/nvidia-blackwe...0x-faster-simulation-science-18x-faster-cpus/
 
Must be a bit of a strange feeling to be a customer on the same day that:
https://nvidianews.nvidia.com/news/nvidia-grace-hopper-ignites-new-era-of-ai-supercomputing
Driving a fundamental shift in the high-performance computing industry toward AI-powered systems, NVIDIA today announced nine new supercomputers worldwide are using NVIDIA Grace Hopper™ Superchips to speed scientific research and discovery. Combined, the systems deliver 200 exaflops, or 200 quintillion calculations per second, of energy-efficient AI processing power.

they show how much better the replacement, due to be released in a matter of months, will be....
 
Meanwhile, GTA 6 will launch on consoles at 30fps and will still sell 500 million copies in pre-orders....but every RTX-enabled game will still sell small because Call of Duty and its ilk are on version 14, and people get enough of a dopamine hit out of the freebie versions of Warzone and Fortnite and Counter-Strike and so forth, which they run at low detail settings because the internet tells them that's what the cool kids do to be competitive.

Hrrrrrrmmmm....

But on the upside, people who run multi-monitor for sims (the backbone of the PC gaming community since time immemorial) will be that much closer to achieving consistent 60 or 120fps at 4k x 3 screens.......so there's that.
 
But on the upside, people who run multi-monitor for sims (the backbone of the PC gaming community since time immemorial) will be that much closer to achieving consistent 60 or 120fps at 4k x 3 screens.......so there's that.

Still praying to the Korean Chaebol gods for an OLED version of the 57” G9 mini-LED
 
Meanwhile, GTA 6 will launch on consoles at 30fps and will still sell 500 million copies in pre-orders....but every RTX-enabled game will still sell small because Call of Duty and its ilk are on version 14, and people get enough of a dopamine hit out of the freebie versions of Warzone and Fortnite and Counter-Strike and so forth, which they run at low detail settings because the internet tells them that's what the cool kids do to be competitive.

Hrrrrrrmmmm....

But on the upside, people who run multi-monitor for sims (the backbone of the PC gaming community since time immemorial) will be that much closer to achieving consistent 60 or 120fps at 4k x 3 screens.......so there's that.
Gaming Blackwell won't be based on enterprise Blackwell, which this article is about.
Int8 vs floating point...
Surprised they're not pushing int4 in this comparison. We can double the speed by further cutting the size of the data type in half!

https://developer.nvidia.com/blog/int4-for-ai-inference/
 
If they changed the precision of the AI model used for the simulation being run, they wrote it so small below the low-res image that I am not even able to read it
 
Gaming Blackwell won't be based on enterprise Blackwell, which this article is about.

Surprised they're not pushing int4 in this comparison. We can double the speed by further cutting the size of the data type in half!

https://developer.nvidia.com/blog/int4-for-ai-inference/
Last time I checked, Nvidia's Int4 was upwards of 120x faster than their nearest competitor; that's a pretty boring bar chart.
 
Surprised they're not pushing int4 in this comparison. We can double the speed by further cutting the size of the data type in half!
Such marketing fluff. SIMD within a register is a thing: you can pack two int4s into an int8, two int8s into an int16, and so on and so forth. We've been doing it for years. Not sure what data paths Nvidia is taking to hit speed claims of that order of magnitude.
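For what it's worth, here's a minimal sketch of that packing trick in plain C (illustrative only; this says nothing about how NVIDIA's actual INT4 data paths work). Two 4-bit lanes live in one byte and get added SWAR-style, with masking so a carry out of the low lane can't spill into the high one:

```c
#include <stdint.h>
#include <stdio.h>

/* Pack two 4-bit lanes into one byte: hi in bits 7..4, lo in bits 3..0. */
static uint8_t pack_int4x2(uint8_t hi, uint8_t lo) {
    return (uint8_t)(((hi & 0x0F) << 4) | (lo & 0x0F));
}

/* SWAR add of two int4x2 values: mask each lane's sum so an overflow in
 * bits 3..0 cannot spill into the high lane. Each lane wraps modulo 16. */
static uint8_t add_int4x2(uint8_t a, uint8_t b) {
    uint8_t lo = (uint8_t)(((a & 0x0F) + (b & 0x0F)) & 0x0F);
    uint8_t hi = (uint8_t)(((a & 0xF0) + (b & 0xF0)) & 0xF0);
    return (uint8_t)(hi | lo);
}

int main(void) {
    uint8_t x = pack_int4x2(3, 9);   /* lanes {3, 9} */
    uint8_t y = pack_int4x2(5, 9);   /* lanes {5, 9} */
    uint8_t z = add_int4x2(x, y);    /* lanes {8, 2}: 9+9 wraps mod 16 */
    printf("hi=%u lo=%u\n", z >> 4, z & 0x0F);
    return 0;
}
```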

The only scenario I can think of where you'd need that fine-tuned granularity is when thread synchronization costs precious cycles. Then again, I'm not a hardware optimization expert.

Floating Point is different though. SWAR'ing floats is more about making lemonade. FP8 already has like 3 or 4 different widely used formats, at least within the topic of AI inference.
 
disappointing if true...once again the 5090 seems like the card Nvidia wants you to buy while gimping all the other models...seems like they didn't learn their lesson from the 4000 series release

Fresh rumors claim Nvidia's next-gen Blackwell cards won't have a wider memory bus or more VRAM—apart from the RTX 5090

Other than the RTX 5090, none of the forthcoming Blackwell cards will be sporting more VRAM than the current Ada Lovelace models, assuming those specs are correct...if the successor to the RTX 4090 does have a 512-bit memory bus, we could be looking at a graphics card with 32GB of VRAM...

https://www.pcgamer.com/hardware/gr...mory-bus-or-more-vramapart-from-the-rtx-5090/
 
disappointing if true...once again the 5090 seems like the card Nvidia wants you to buy while gimping all the other models...seems like they didn't learn their lesson from the 4000 series release

Fresh rumors claim Nvidia's next-gen Blackwell cards won't have a wider memory bus or more VRAM—apart from the RTX 5090

Other than the RTX 5090, none of the forthcoming Blackwell cards will be sporting more VRAM than the current Ada Lovelace models, assuming those specs are correct...if the successor to the RTX 4090 does have a 512-bit memory bus, we could be looking at a graphics card with 32GB of VRAM...

https://www.pcgamer.com/hardware/gr...mory-bus-or-more-vramapart-from-the-rtx-5090/
Well, just going to GDDR7 will have huge implications for performance. I'd be interested to see what that does for the 5060 class, where memory was obviously an issue this time around; the rest will be fine.
 
Are we reaching the limits of GPU innovation?

Not sure about calling memory bus width and amount of VRAM "innovation"? Performance gains while keeping those the same would be more like innovation; bus width and memory quantity are things you can simply pay for (in die size, PCB, power, and money) without that much new R&D involved. At a minimum, the 5090's bus width and memory quantity are by definition well understood and feasible.

seems like they didn't learn their lesson from the 4000 series release
Or they learned their lessons perfectly? The leak about the laptop memory bus would match that Twitter account and could be true, but what lesson does a company draw when it has had maybe its best run ever since the 4000 series release?


[Charts: add-in board shipment data (Q1 2024) and GPU market revenue worldwide, FY2019-2024, by quarter]



By December 2024/early 2025, Micron could have 24 Gb (3 GB) GDDR7 modules available:
https://overclock3d.net/news/memory/micron-reveals-the-future-of-gddr7-memory/

Which could open the door to 12/18/24 GB of VRAM for 128/192/256-bit buses, and 16/24/32 GB by the time of the Super refresh with 4 GB per chip.
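The arithmetic behind those figures is just chips-per-bus times density: each GDDR7 device sits on a 32-bit channel. A quick sketch in plain C, assuming the rumored 3 GB and 4 GB module densities from the Micron roadmap article linked above:

```c
#include <stdio.h>

/* Rough VRAM math: each GDDR7 device uses a 32-bit channel, so
 * chip count = bus width / 32, and capacity = chips * GB per chip.
 * The 3 GB / 4 GB densities are assumptions from the rumored roadmap. */
int main(void) {
    const int buses[] = {128, 192, 256};
    for (int i = 0; i < 3; i++) {
        int chips = buses[i] / 32;
        printf("%3d-bit bus: %d chips -> %2d GB with 3 GB chips, %2d GB with 4 GB chips\n",
               buses[i], chips, 3 * chips, 4 * chips);
    }
    return 0;
}
```

Which reproduces the 12/18/24 GB and 16/24/32 GB figures above.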
 
Are we reaching the limits of GPU innovation?

Not sure about calling memory bus width and amount of VRAM "innovation"? Performance gains while keeping those the same would be more like innovation.


Or they learned their lessons perfectly?


I mean, if the products are selling as fast as they can make them, the only thing it teaches them is to charge more.
 
I think the 4090 was an anomaly...I don't think gamers or consumers in general are going to want to pay $1600+ for a new GPU every new generation
 
IMO, the 5060 and 5060 Ti should define this generation.

The 4060 was just 2.15x to 2.2x a 1060 6GB, while the 4070 Ti Super is 2.2x to 2.3x a 2070 Super and 3x a 1070 Ti.

GB206 needs a bigger jump than last time, relative to the others, to make up that growing gap, and with Ray Reconstruction and DLSS the 5060 Ti at least should be able to play new titles with RT on fine.
 
I think the 4090 was an anomaly...I don't think gamers or consumers in general are going to want to pay $1600+ for a new GPU every new generation
3090 was $1500. If the price is too high for "consumers in general" (hint: it is) then those people can buy something more affordable down the stack. Not every high-end luxury is meant for everyone.
 
3090 was $1500. If the price is too high for "consumers in general" (hint: it is) then those people can buy something more affordable down the stack. Not every high-end luxury is meant for everyone.

normally that's the case but starting with the 4000 series Nvidia made it so that the only card worth getting was the 4090 (prior to the Super releases)...the 4080 and 4070 were not worth it at all (in terms of price/performance)...normally the 70 series offers great value but not this time...looks like it's going to continue into the 5000 series
 
normally that's the case but starting with the 4000 series Nvidia made it so that the only card worth getting was the 4090 (prior to the Super releases)...the 4080 and 4070 were not worth it at all (in terms of price/performance)...normally the 70 series offers great value but not this time...looks like it's going to continue into the 5000 series
Actually, those were all linearly priced according to performance within the series. The super cards only made it even better. Don't get stuck on the names.
 
normally that's the case but starting with the 4000 series Nvidia made it so that the only card worth getting was the 4090 (prior to the Super releases)...the 4080 and 4070 were not worth it at all (in terms of price/performance)...normally the 70 series offers great value but not this time...looks like it's going to continue into the 5000 series
The people who will pay $1600 for a GPU are mostly people who will pay whatever for the 5090 when it comes out.
 
Actually, those were all linearly priced according to performance within the series. The super cards only made it even better. Don't get stuck on the names.

the Super cards were what the original 4080/4070 pricing should have been...the original 4080 was a good card with terrible pricing...the 4080 Super corrected it but by that time it was already too late for most people
 
I mean, if the products are selling as fast as they can make them, the only thing it teaches them is to charge more.

They are in some markets, but not so much in the gaming market; with the exception of the 4090, most of the 4000 series did not sell that well. The 4080 and down have been easy to find, even at my lackluster Best Buy.
 
The people who will pay $1600 for a GPU are mostly people who will pay whatever for the 5090 when it comes out.

Personally I am hoping Nvidia charges them like it's an enterprise card. We'll see how badly they want the best.
 