NVIDIA RTX 50 “Blackwell” GB202 GPU GDDR7 384-bit memory

Nvidia and AMD have products lined up for multiple generations. They purposely slow-roll incremental improvements in each gen release to keep the profits coming in.
Ding, ding, ding...you win for bringing everyone back to reality.

Would it not be much more profitable to release that superior, smaller die that gives the same result because of a better architecture? (If both have it, keeping it secret from the other would not matter much.)
Look at all the gimped 40 series this gen - they crippled a ton of cards for the mighty $ - did they have to? Probably not. But, it cut costs and they assumed ppl would still buy the product at the overcharged prices.

Do you really think they won't gimp the next gen? And now they are making excuses for gimping 4090s because China doesn't like AI? Are ppl really falling for all this?!?
Theoretically, a 5090 would be pretty good for AI? What's going to happen to it now?
 
Look at all the gimped 40 series this gen -
I feel you are trying to say one thing and its exact opposite at the same time. My point was exactly that: Nvidia makes the best GPU architecture it can, so it can deliver the performance at the lowest cost it can (smallest die, smallest bus, lowest power possible). They do not make, as your previous sentence suggests, a GPU die that is 2-3 generations behind and way too big and way too expensive to make for what it needs to be; or at least, why would they?
 
So I just watched an interesting video that seems to clarify a few potential performance parameters for the upcoming 5090.

In a nutshell: 60% raster gains, a 2x increase in overall compute power (probably mostly used for AI), greatly increased ray tracing gains at 2.2x, and a new DLSS that will increase performance while reducing the current latency issues.

So the 200%+ performance gain rumors? That part ONLY pertains to ray tracing, not raster. So that's a huge distinction to note right there. The question is, will that be enough now that we can turn on all the ray tracing goodies in games like Cyberpunk and Alan Wake 2 without suffering huge performance hits to frame rates?

Now looking at this slide, if we're making the assumption that GB202 is the 5090 and GB203 is the 5080... That's gonna be a huge performance drop-off to the 80 model, with only 56% of the SMs of GB202.

[Attached slide: rumored Blackwell GB202 / GB203 specifications]


Usually we've only seen about a 25% performance difference between the top-of-the-line 90 series and the 80 series. But 44% fewer SMs? Does that mean the 5080 will be roughly 44% SLOWER than the 5090? That level of nerfing seems extremely bad. And imagine if Nvidia charges $1200 for that? LOL.
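A quick back-of-the-envelope check of what that rumored SM ratio implies (a rough sketch only; the 56% figure comes from the slide above, and real performance will not scale linearly with SM count):

```python
# Rough sanity check of the rumored GB203 vs GB202 SM ratio.
# The 0.56 figure is from the leaked slide above -- treat it as a rumor.
gb203_share = 0.56                 # GB203 SMs as a fraction of GB202's
sm_deficit = 1 - gb203_share       # 0.44 -> "44% fewer SMs"
print(f"GB203 has {sm_deficit:.0%} fewer SMs than GB202")

# Fewer SMs does not translate 1:1 into lower frame rates; big dies
# rarely scale linearly (see the 4080 vs 4090 discussion later in the thread).
```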

Take it with a grain of salt but the guy claims to have sources so we shall see.

Video: https://www.youtube.com/watch?v=P-Txgox-vmI
 
Earliest rumor I recall reading that had the name Blackwell was end of 2021.

No, it's called research & development. Do you really believe that they're just sitting on product and drip feeding it to the masses?
The 4090 is LITERALLY a cut-down die, and Nvidia sells the fully-enabled die in an enterprise-only product.

Yes, Nvidia sits on products and sells the minimum-viable-product to make their profit.
 
But 44% fewer SMs?
4080: 76 of 80 SMs (AD103)
4090: 128 of 144 SMs (AD102)

Minus ~40% for the card and 44% for the die: the gap between the 4080 and 4090 was about the same, I think. That already felt like the limit in terms of gap, so it would be a bit surprising to make it even bigger. If the 5090 is more cut down relative to its die than the 5080, like with Lovelace (which would make sense), maybe the gap ends up similar to or smaller than Lovelace's in that regard.
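For reference, the same percentages worked out from the SM counts above (enabled SMs per card vs. total SMs per die):

```python
# Ada Lovelace SM counts quoted above: (enabled SMs, total SMs on the die).
ada = {
    "RTX 4080 (AD103)": (76, 80),
    "RTX 4090 (AD102)": (128, 144),
}
enabled_4080, total_ad103 = ada["RTX 4080 (AD103)"]
enabled_4090, total_ad102 = ada["RTX 4090 (AD102)"]

card_gap = 1 - enabled_4080 / enabled_4090   # ~0.41 -> "minus ~40% for the card"
die_gap = 1 - total_ad103 / total_ad102      # ~0.44 -> "minus 44% for the die"
print(f"4080 vs 4090 (enabled SMs): -{card_gap:.0%}")
print(f"AD103 vs AD102 (full dies): -{die_gap:.0%}")
```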
 
So I just watched an interesting video that seems to clarify a few potential performance parameters for the upcoming 5090.

In a nutshell: 60% raster gains, a 2x increase in overall compute power (probably mostly used for AI), greatly increased ray tracing gains at 2.2x, and a new DLSS that will increase performance while reducing the current latency issues.

So the 200%+ performance gain rumors? That part ONLY pertains to ray tracing, not raster. So that's a huge distinction to note right there. The question is, will that be enough now that we can turn on all the ray tracing goodies in games like Cyberpunk and Alan Wake 2 without suffering huge performance hits to frame rates?

Now looking at this slide, if we're making the assumption that GB202 is the 5090 and GB203 is the 5080... That's gonna be a huge performance drop-off to the 80 model, with only 56% of the SMs of GB202.

View attachment 635394

Usually we've only seen about a 25% performance difference between the top-of-the-line 90 series and the 80 series. But 44% fewer SMs? Does that mean the 5080 will be roughly 44% SLOWER than the 5090? That level of nerfing seems extremely bad. And imagine if Nvidia charges $1200 for that? LOL.

Take it with a grain of salt but the guy claims to have sources so we shall see.

Video: https://www.youtube.com/watch?v=P-Txgox-vmI

Nobody was saying +200%. Double the performance is a +100% increase.
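For anyone mixing up the two ways of stating a gain, the conversion is trivial (a generic illustration, not tied to any specific leak):

```python
# A performance multiplier of N x corresponds to a (N - 1) * 100 % increase.
def pct_increase(multiplier: float) -> float:
    """Convert a performance multiplier (e.g. 2.0x) into a percentage increase."""
    return (multiplier - 1) * 100

print(pct_increase(2.0))   # 2x   -> +100%
print(pct_increase(2.2))   # 2.2x -> +120%
print(pct_increase(3.0))   # 3x   -> +200%
```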
 
Looking back, do the rumors of XX% performance increase in the new upcoming vid cards ever turn out to be true? 🫠
 
4080: 76 of 80 SMs (AD103)
4090: 128 of 144 SMs (AD102)

Minus ~40% for the card and 44% for the die: the gap between the 4080 and 4090 was about the same, I think. That already felt like the limit in terms of gap, so it would be a bit surprising to make it even bigger. If the 5090 is more cut down relative to its die than the 5080, like with Lovelace (which would make sense), maybe the gap ends up similar to or smaller than Lovelace's in that regard.
I'm thinking the same thing as well.
 
Looking back, do the rumors of XX% performance increase in the new upcoming vid cards ever turn out to be true? 🫠
Because of how much wiggle room a simple XX% performance increase number has, I can imagine it happening.

At a certain resolution, for a certain set of games, with RT on, or once overclocked.

https://videocardz.com/newz/nvidia-...n-rtx-3090-ti-in-3dmark-time-spy-extreme-test
NVIDIA GeForce RTX 4090 is reportedly 66% faster than RTX 3090 Ti in 3DMark Time Spy Extreme test

The rumors had the exact CUDA core count on the nose and the 3DMark score perfectly right, which was not a bad approximation of the performance gain in an extreme scenario for the 4090 over Ampere.

Once you get close enough to launch they can be. Far out, I am not sure the simulators are good enough to be right, or whether they even know if it will work. I.e., a rumor can both be true (in the sense that people in the know are aiming at those results) and not happen at the same time.
 
So I just watched an interesting video that seems to clarify a few potential performance parameters for the upcoming 5090.

In a nutshell: 60% raster gains, a 2x increase in overall compute power (probably mostly used for AI), greatly increased ray tracing gains at 2.2x, and a new DLSS that will increase performance while reducing the current latency issues.

So the 200%+ performance gain rumors? That part ONLY pertains to ray tracing, not raster. So that's a huge distinction to note right there. The question is, will that be enough now that we can turn on all the ray tracing goodies in games like Cyberpunk and Alan Wake 2 without suffering huge performance hits to frame rates?

Now looking at this slide, if we're making the assumption that GB202 is the 5090 and GB203 is the 5080... That's gonna be a huge performance drop-off to the 80 model, with only 56% of the SMs of GB202.

View attachment 635394

Usually we've only seen about a 25% performance difference between the top-of-the-line 90 series and the 80 series. But 44% fewer SMs? Does that mean the 5080 will be roughly 44% SLOWER than the 5090? That level of nerfing seems extremely bad. And imagine if Nvidia charges $1200 for that? LOL.

Take it with a grain of salt but the guy claims to have sources so we shall see.

Video: https://www.youtube.com/watch?v=P-Txgox-vmI

LOL, that shocks you? You have seen the massive gap that exists between the 4080 and 4090, right?
 
Nobody was saying +200%. Double the performance is a +100% increase.

I can quote at least one person from this forum who said +200%.

RTX-5xxx series launching this Fall, why buy the soon to be outdated 4090 now? The 5090 is rumored to be 200% faster, seriously.

This person probably heard an inaccurate rumor that didn't specify that only the raytracing performance would be doubled. So it's very important to specify particulars when talking about the new card.

So in reality, according to the latest rumor, the projected performance is +60% raster performance and roughly 2x ray-tracing performance.

Until we have actual benchmark results, it's still hard to say if this jump in ray tracing performance will finally make it so that we can just turn ray tracing on all the time with very little performance hit, or if it still needs to take another leap.
 
I can quote at least one person from this forum who said +200%.



This person probably heard an inaccurate rumor that didn't specify that only the raytracing performance would be doubled. So it's very important to specify particulars when talking about the new card.

So in reality, according to the latest rumor, the projected performance is +60% raster performance and roughly 2x ray-tracing performance.

Until we have actual benchmark results, it's still hard to say if this jump in ray tracing performance will finally make it so that we can just turn ray tracing on all the time with very little performance hit, or if it still needs to take another leap.
I think this person probably made the mistake of thinking 2x = 200%

I personally doubt that number. Maybe 2x RT performance but not raster. I do think it’ll be far more efficient.

One thing I’d like to point out even though it was posted a while ago.

Intel/AMD/nvidia do not have products lined up for several generations. They have roadmaps. Big difference. These product roadmaps depend on manufacturing processes that are still being developed, so they certainly don't have 6090s just sitting around waiting for release.
 
I also think 2x or 3x (+200%) in raster makes little sense. Make it +60-70% like the 4090 was, but on a ~350 mm² die (with all that implies for power delivery, PCB, and other cost savings) instead of 600 mm²+, and sell it at "$1700" with a ridiculously high margin; it will be backordered anyway with that amount of performance, unless AMD or Intel change the game completely. And you gain the certainty that the 6090 can be +100% if it needs to be.

+120% in a heavy ray-tracing scenario, say Alan Wake 2 at 4K DLAA with path tracing going from 30 fps on the 4090 to 65 on a 5090? Sure, maybe, that kind of jump would make sense.

That's what the 3090 to 4090 jump was, so it is possible, more so in yet-to-be-released titles (or patched versions of Cyberpunk / Alan Wake that fully take advantage of the added power).

A 60% gain is huge, really huge, if you do not move the price up. That is what the Pascal 1080 delivered over a 980 at 1440p when it launched, and significantly better than the 3080 over the 2080 Super at launch, and that card was virtually impossible to buy at MSRP for its entire existence.
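Checking the Alan Wake 2 numbers above (the frame rates are the post's hypothetical figures, not benchmarks):

```python
# Percentage gain implied by the hypothetical 30 fps -> 65 fps jump above.
fps_4090 = 30   # 4K DLAA path tracing, figure assumed in the post
fps_5090 = 65   # speculated
gain = (fps_5090 / fps_4090 - 1) * 100
print(f"+{gain:.0f}%")   # ~+117%, i.e. roughly the "+120%" mentioned above
```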
 
I also think 2x or 3x (+200%) in raster makes little sense. Make it +60-70% like the 4090 was, but on a ~350 mm² die (with all that implies for power delivery, PCB, and other cost savings) instead of 600 mm²+, and sell it at "$1700" with a ridiculously high margin; it will be backordered anyway with that amount of performance, unless AMD or Intel change the game completely. And you gain the certainty that the 6090 can be +100% if it needs to be.

+120% in a heavy ray-tracing scenario, say Alan Wake 2 at 4K DLAA with path tracing going from 30 fps on the 4090 to 65 on a 5090? Sure, maybe, that kind of jump would make sense.

That's what the 3090 to 4090 jump was, so it is possible, more so in yet-to-be-released titles (or patched versions of Cyberpunk / Alan Wake that fully take advantage of the added power).

A 60% gain is huge, really huge, if you do not move the price up. That is what the Pascal 1080 delivered over a 980 at 1440p when it launched, and significantly better than the 3080 over the 2080 Super at launch, and that card was virtually impossible to buy at MSRP for its entire existence.

I doubt they would be able to pull off a 60% performance uplift while also shrinking the die size down to 350 mm² and greatly lowering power consumption. A 4090 is 609 mm² and a 3090 is 628 mm², and their power draw is similar, with the 4090 being slightly more efficient. If the 5090 keeps the same die size and power draw as a 4090 then yes, I can believe a 60% gain, but if they go all the way down to 350 mm² then I don't think a 60% gain is happening.
 
I doubt they would be able to pull off a 60% performance uplift while also shrinking the die size down to 350 mm² and greatly lowering power consumption.
Yes, I also doubt it. Achieving 60% without upping the size and power would already be a giant achievement and win (there is not the same free performance boost from a node change this time, though maybe a free one from the memory). I just meant: if you could come up with something with a +120% raster jump at the usual die size, it would probably be better to make a smaller, cheaper, less powerful card (that still destroys the 4090).
 
Yes, I also doubt it. Achieving 60% without upping the size and power would already be a giant achievement and win (there is not the same free performance boost from a node change this time, though maybe a free one from the memory). I just meant: if you could come up with something with a +120% raster jump at the usual die size, it would probably be better to make a smaller, cheaper, less powerful card (that still destroys the 4090).

I have a feeling they can still do that even if the gains aren't that big. Let's say the Blackwell 102 die is 60-70% faster than AD102 in raster, and the Blackwell 103 die is 30-40% faster than AD102. Why not just sell the RTX 5090 with the Blackwell 103 die for $1500 instead of using the 102 die? Then Nvidia can claim that they increased performance while at the same time "lowered prices" to make gamers happy, when in reality the price went up from AD103 (RTX 4080 Super at $999) to Blackwell 103 at $1500, and they can sell all the Blackwell 102 dies solely to the AI market at 10x the profit margin.
 
That RedGaming guy is just another clown, like Moore's Law.
Yeah. I was more commenting on how that poster was surprised at the speculated massive performance difference between the 5080 and 5090 as if we don't already see that with the 4080 vs the 4090. The 4090 literally has 40% more CUDA cores than the 4080. Difference drops a bit with the 4080 Super, but not much, maybe 37-38%.

So not sure what he's surprised about.
 
Yeah. I was more commenting on how that poster was surprised at the speculated massive performance difference between the 5080 and 5090 as if we don't already see that with the 4080 vs the 4090. The 4090 literally has 40% more CUDA cores than the 4080. Difference drops a bit with the 4080 Super, but not much, maybe 37-38%.

So not sure what he's surprised about.
No, the 4090 "literally" has 68% more cores than the 4080 and does not scale worth a shit in most cases. Even at 4k the average performance difference is less than 30% in the latest techpowerup review.
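The 68% comes straight from the published core counts:

```python
# CUDA core counts from Nvidia's public spec sheets.
cores_4090 = 16384
cores_4080 = 9728
extra = (cores_4090 / cores_4080 - 1) * 100
print(f"The 4090 has {extra:.0f}% more CUDA cores than the 4080")   # ~68%
```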
 
No, the 4090 "literally" has 68% more cores than the 4080 and does not scale worth a shit in most cases. Even at 4k the average performance difference is less than 30% in the latest techpowerup review.
Whoops, you are absolutely right. I went the wrong way on the calculator in reading it.

So even more makes the point. Spec wise anyway. As for actual performance difference, yep agreed.
 
No, the 4090 "literally" has 68% more cores than the 4080 and does not scale worth a shit in most cases. Even at 4k the average performance difference is less than 30% in the latest techpowerup review.
If you exclude the outliers that do not scale well with hardware no matter the video card, like Borderlands 3 and Hitman 3, it's an even 30% difference between the 4080 FE and 4090 FE.
 
If you exclude the outliers that do not scale well with hardware no matter the video card, like Borderlands 3 and Hitman 3, it's an even 30% difference between the 4080 FE and 4090 FE.

Pretty sure this has always been the case no? If you double the CUDA cores in the same generation/architecture while keeping everything else equal, you will not get anywhere near double the performance.
 
Pretty sure this has always been the case no? If you double the CUDA cores in the same generation/architecture while keeping everything else equal, you will not get anywhere near double the performance.
There are architectures that have scaled much better than others. The current architecture scales rather poorly and the only other one I've ever seen that scaled as bad was Kepler. I think what's really holding the 4090 back though is that it's only using 72 of the 96 MB of cache.
 
Pretty sure this has always been the case no? If you double the CUDA cores in the same generation/architecture while keeping everything else equal, you will not get anywhere near double the performance.
The 6900 XT doubled the 6700 XT's core count and was around 50% faster at 4K, 40% at 1440p I think; it helped that the memory bandwidth and power budget were both much higher as well.

I imagine that, like power, the returns diminish as you scale things up?
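One rough way to put a number on that, using the figures quoted above (approximate, and it ignores the bandwidth and power differences just mentioned):

```python
# Per-core "scaling efficiency": how much of the extra hardware shows up as performance.
def scaling_efficiency(core_ratio: float, perf_ratio: float) -> float:
    """perf_ratio / core_ratio; 1.0 would be perfect linear scaling."""
    return perf_ratio / core_ratio

# 6900 XT vs 6700 XT, using the rough numbers from the post above.
print(scaling_efficiency(2.0, 1.5))   # ~0.75 at 4K
print(scaling_efficiency(2.0, 1.4))   # ~0.70 at 1440p
```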
 
The 6900 XT doubled the 6700 XT's core count and was around 50% faster at 4K, 40% at 1440p I think; it helped that the memory bandwidth and power budget were both much higher as well.

I imagine that, like power, the returns diminish as you scale things up?
There are always diminishing returns, but I don't recall any other time it's been this "bad" since Kepler, as mentioned above. It's very apparent with the 4080 SUPER vs. 4080, where there is barely a 1% increase in raster performance with 5% more cores. The 600W 4090 also showed that increasing the power doesn't do much, either.
 
There are architectures that have scaled much better than others. The current architecture scales rather poorly and the only other one I've ever seen that scaled as bad was Kepler. I think what's really holding the 4090 back though is that it's only using 72 of the 96 MB of cache.
Agreed. Hope they do another new architecture shakeup for Blackwell, as aside from some minor differences and increased cache, Lovelace and Ampere are largely very similar architectures. The last major shakeup was going from Pascal to Turing: aside from the addition of Tensor and RT cores, each CUDA core could now do both an INT32 and an FP32 operation concurrently per clock cycle, whereas for Pascal and prior it was either INT or FP per cycle per core, not both. Ampere is also very similar to Turing but iterated on it by further splitting the data path dedicated to INT32 operations in Turing so it can now handle either INT32 or FP32, in addition to the primary FP32 data path.

Functionally speaking, the CUDA core itself and how the data moves hasn't changed much over the past 3 gens aside from Ampere improving on Turing's initial split of INT/FP data.
 
So, same chip capacity at speeds one-third faster than GDDR6.
https://www.techpowerup.com/320185/...s-blackwell-to-use-28-gbps-gddr7-memory-speed

So basically a 192-bit GDDR7 bus would be the same speed as a 256-bit GDDR6X bus. That's not factoring in any cache speed and size increases.

My best guess: still 384-bit with 24 GB VRAM for the ultra high end, 256-bit with 16 GB at the next tier, and a bunch of cards at 192-bit using both 12 GB and 24 GB options. I would hope most 128-bit cards will be 16 GB, with maybe a few at 8 GB. 128-bit / 16 GB will likely be the vast majority of midrange cards.
 
Then again, 96-bit GDDR7 will be the same speed as 128-bit GDDR6X, or around 160-bit GDDR6. That should be fast enough for the midrange, and 12 GB would be a perfect VRAM size for that performance level.
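The bus-width equivalences above fall out of a simple bandwidth calculation (assuming 28 Gbps GDDR7 per the linked article, and typical ~21 Gbps GDDR6X / ~18 Gbps GDDR6 speeds for comparison):

```python
# Peak theoretical bandwidth = (bus width in bits / 8) * per-pin data rate in Gbps.
def bandwidth_gb_s(bus_bits: int, pin_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s for a given bus width and per-pin speed."""
    return bus_bits / 8 * pin_rate_gbps

configs = {
    "192-bit GDDR7  @ 28 Gbps": bandwidth_gb_s(192, 28.0),   # ~672 GB/s
    "256-bit GDDR6X @ 21 Gbps": bandwidth_gb_s(256, 21.0),   # ~672 GB/s
    "96-bit  GDDR7  @ 28 Gbps": bandwidth_gb_s(96, 28.0),    # ~336 GB/s
    "128-bit GDDR6X @ 21 Gbps": bandwidth_gb_s(128, 21.0),   # ~336 GB/s
    "160-bit GDDR6  @ 18 Gbps": bandwidth_gb_s(160, 18.0),   # ~360 GB/s
}
for name, bw in configs.items():
    print(f"{name}: {bw:.0f} GB/s")
```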
 
There are architectures that have scaled much better than others. The current architecture scales rather poorly and the only other one I've ever seen that scaled as bad was Kepler. I think what's really holding the 4090 back though is that it's only using 72 of the 96 MB of cache.
Yep, it's why the 4070ti Super seems to be weaker against the 4080 than many would have guessed.
 
Yep, it's why the 4070ti Super seems to be weaker against the 4080 than many would have guessed.
Probably showed the cache was doing more than we thought. I was initially very critical of the 192-bit of the 4070 Ti when it launched but I feel like I have to change my opinion somewhat. Simply bumping the bus and capacity didn't do much for the 4070 Ti Super so it shows maybe Nvidia isn't totally BSing here with their cache architecture and how much lifting it is actually doing for the memory subsystem.
 
Probably showed the cache was doing more than we thought. I was initially very critical of the 192-bit of the 4070 Ti when it launched but I feel like I have to change my opinion somewhat. Simply bumping the bus and capacity didn't do much for the 4070 Ti Super so it shows maybe Nvidia isn't totally BSing here with their cache architecture and how much lifting it is actually doing for the memory subsystem.
There is likely a point of diminishing returns for cache, but it could prove to be a more cost-effective way of compartmentalizing products for the manufacturer instead of 'fusing off' good silicon.
 