Nvidia's RT performance gains not as expected?

Meeho

AdoredTV did an analysis of Nvidia's theoretical vs practical RT performance gains. It seems that architectural bottlenecks severely impact RT performance and that it is in fact raster gains that are bringing the majority of the improvement in RT games for Lovelace.

This could have interesting implications for how far behind in RT AMD actually is. [probably still far but less so]



 
AdoredTV did an analysis of Nvidia's theoretical vs practical RT performance gains. It seems that architectural bottlenecks severely impact RT performance and that it is in fact raster gains that are bringing the majority of the improvement in RT games for Lovelace.

This could have interesting implications for how far behind in RT AMD actually is.




Erm, I don't think this says what you think this says.
 
I see the 4090 being 2.3 times faster overall in ray tracing than the 6950 xt on the techpowerup review including the modest results in Far Cry 6. The 7900 xtx is supposedly only going to be a little over 50% faster overall in ray tracing than the 6950 xt so that means even the 4080 will be well ahead in ray tracing. In fact it looks like the MSRP price gaps for the 4090 and 4080 over the 7900xtx will line up with the ray tracing advantage.
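A quick back-of-the-envelope check of that last claim, as a minimal Python sketch: the RT ratios are the ones quoted above, while the launch MSRPs ($1,599 / $1,199 / $999) are my assumption, not from this post.

```python
# Ratios quoted in the post above; the MSRPs are assumed launch prices, not from the post.
rt_vs_6950xt = {"4090": 2.3, "7900 XTX": 1.5}             # overall RT speedup vs the 6950 XT
msrp = {"4090": 1599, "4080": 1199, "7900 XTX": 999}       # assumed USD launch MSRPs

price_gap = msrp["4090"] / msrp["7900 XTX"]                # ~1.60x
rt_gap = rt_vs_6950xt["4090"] / rt_vs_6950xt["7900 XTX"]   # ~1.53x
print(f"4090 vs 7900 XTX: price gap {price_gap:.2f}x, RT gap {rt_gap:.2f}x")
```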
 
It is a bit strange, because the premise seems to be that despite Lovelace having way more ray tracing TFLOPS than Ampere, the gen-on-gen gain looks similar to the 3090 Ti over the 2080 Ti.

But would that not have been true of the 3090 Ti as well, Ampere having more ray tracing TFLOPS than Turing?

RT cores
2080 Ti: 68
3090 Ti: 84 (+23%)
4090: 128 (+52%)

Is that the issue?

One aspect that makes this a bit more (or maybe less) complicated is that I am not sure you can look at an FPS-to-FPS ratio to know how much faster a system is at a specific task.

Say you are running at 100 fps, 10 ms a frame.

If you get twice as fast at a step that was taking 1 ms of the frame, you will not go to 200 fps but more like 105 fps. For a video card to double the frame rate just from faster RT calculation, it would need to be, I would imagine, way more than twice as fast at it, a bit like how you need to more than double everything a video card does to double your FPS, because there are non-GPU elements involved in a frame that will not necessarily go down, and could even go up as they get less time since the last render to do their work.
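A minimal sketch of that frame-time math in Python, with hypothetical numbers: speeding up only the RT slice of a frame barely moves the FPS unless RT dominates the frame time.

```python
# Hypothetical numbers: the effect on FPS of speeding up only the RT slice of a frame.
def fps_after_rt_speedup(frame_ms: float, rt_ms: float, rt_speedup: float) -> float:
    """FPS once only the RT portion of the frame gets 'rt_speedup' times faster."""
    new_frame_ms = (frame_ms - rt_ms) + rt_ms / rt_speedup
    return 1000.0 / new_frame_ms

# 100 FPS baseline (10 ms/frame), RT takes 1 ms of it, RT gets 2x faster:
print(fps_after_rt_speedup(10.0, 1.0, 2.0))  # ~105 FPS, not 200
# Same baseline, but RT takes 8 ms of the 10 ms frame:
print(fps_after_rt_speedup(10.0, 8.0, 2.0))  # ~167 FPS
```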



Nothing will be a perfect measure of RT performance, but some proxies will be better than others. RT-heavy titles like Metro Exodus, Control and maybe best of all Quake 2 RTX will be better proxies of pure gen-on-gen RT performance increases. According to this:

https://www.dsogaming.com/wp-content/uploads/2022/10/NVIDIA-RTX-4090-ray-traced-benchmarks-4K.png

2080 Ti / 3080 / 4090
Control: 28 / 38 / 76
Metro Exodus: 25 / 31 / 70
Quake 2 RTX: 20 / 38 / 76

At 4K with RT, the 3080 maybe often has VRAM issues here, but that is the idea: putting a very light RT title like Far Cry in the mix just pollutes the data.
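For what it's worth, here is a small Python snippet computing the gen-on-gen ratios implied by just those quoted FPS numbers (treat them as rough, not a controlled benchmark):

```python
# Gen-on-gen scaling implied by the FPS numbers quoted above (4K, RT-heavy titles).
fps = {
    "Control":      {"2080 Ti": 28, "3080": 38, "4090": 76},
    "Metro Exodus": {"2080 Ti": 25, "3080": 31, "4090": 70},
    "Quake 2 RTX":  {"2080 Ti": 20, "3080": 38, "4090": 76},
}
for game, f in fps.items():
    print(f"{game}: 3080 = {f['3080'] / f['2080 Ti']:.2f}x the 2080 Ti, "
          f"4090 = {f['4090'] / f['3080']:.2f}x the 3080")
```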
 
Why compare 4090 to 3090ti when comparing latest generation to last? You'd compare 4090 to 3090. Further, that also skews the results at the other end as well, as you should be looking at 2080ti to 3090. (Titan doesn't count since it wasn't a 2090, and was a $2500 professional part)

This just seems like general idiocy.
 
Why compare 4090 to 3090ti when comparing latest generation to last? You'd compare 4090 to 3090. Further, that also skews the results at the other end as well, as you should be looking at 2080ti to 3090. (Titan doesn't count since it wasn't a 2090, and was a $2500 professional part)

This just seems like general idiocy.
It seems you didn't get the idea of this analysis at all.
 
It seems you didn't get the idea of this analysis at all.
Seems to me like he got the idea just fine, and he is pointing out what should be obvious. The 4090 is not a 3090Ti competitor, but a 3090 competitor. So why compare the RT performance of the 3090Ti vs the 4090, other than to make the 4090 look slightly worse than it actually is.

Case in point, 10 game benchmark RT performance:
[10-game RT benchmark chart]
 
I see the 4090 being 2.3 times faster overall in ray tracing than the 6950 xt on the techpowerup review including the modest results in Far Cry 6. The 7900 xtx is supposedly only going to be a little over 50% faster overall in ray tracing than the 6950 xt so that means even the 4080 will be well ahead in ray tracing. In fact it looks like the MSRP price gaps for the 4090 and 4080 over the 7900xtx will line up with the ray tracing advantage.
The 7900 xtx is priced precisely to account for its rasterization + raytracing performance vs its competition, just like the 6-series was. AMD is following nV in lockstep on the horrendous pricing that's been set by Lovelace. This considered, the 7900xtx will come very close to the 4080 (4070) in rasterization (+/- 10%), and far behind in RT. Early Q2, nV will release its 4080 Ti to counter.
 
The 7900 xtx is priced precisely to account for its rasterization + raytracing performance vs its competition, just like the 6-series was. AMD is following nV in lockstep on the horrendous pricing that's been set by Lovelace. This considered, the 7900xtx will come very close to the 4080 (4070) in rasterization (+/- 10%), and far behind in RT. Early Q2, nV will release its 4080 Ti to counter.
The 7900xtx should be well ahead of the 4080 in rasterization from everything they have shown.
 
They really haven't shown jack shit. Until 3rd party benchmarks come out, nobody really knows anything and anyone saying otherwise is theorizing at best or making shit up at worst.
Yeah you are right as all they did was clearly lie and the 7900xtx will certainly be at least 20% slower than what they showed...
 
Yeah you are right as all they did was clearly lie and the 7900xtx will certainly be at least 20% slower than what they showed...
I clearly didn't say that. You should perhaps not put words into other people's mouths.
 
I clearly didn't say that. You should perhaps not put words into other people's mouths.
You are making zero sense. They showed some benchmarks but you are saying they did not show "jack shit" and that nobody knows anything until 3rd party benchmarks come out. Whether you can accept it or not based on what they did show it looks like the 7900xtx will easily be ahead of the 4080 in rasterization.
 
You are making zero sense. They showed some benchmarks but you are saying they did not show "jack shit" and that nobody knows anything until 3rd party benchmarks come out. Whether you can accept it or not based on what they did show it looks like the 7900xtx will easily be ahead of the 4080 in rasterization.
They have shown "up to" and best case scenarios. We will see how well those age with actual benchmarks.
 
Why compare 4090 to 3090ti when comparing latest generation to last? You'd compare 4090 to 3090. Further, that also skews the results at the other end as well, as you should be looking at 2080ti to 3090. (Titan doesn't count since it wasn't a 2090, and was a $2500 professional part)

This just seems like general idiocy.
Because it's not about the product number, it's about the hardware RT processing power increases vs gaming performance gains.
 
Because it's not about the product number, it's about the hardware RT processing power increases vs gaming performance gains.
If RT performance were measured in a vacuum, sure; but it's not. So comparing apples to apples is important.
 
Because it's not about the product number, it's about the hardware RT processing power increases vs gaming performance gains.
Sure it is. The drive of the video was to show generational RT gains, but they aren’t comparing the same category of product from each generation. By using a 3090ti it makes ampere look substantially better than both the generation before and after since they aren’t comparing apples to apples. By your thinking they should just use a 4080 for the latest gen and a 2060 for the gen before just to make ampere look that much better, I guess.
 
If RT performance were measured in a vacuum, sure; but it's not. So comparing apples to apples is important.
Sure it is. The drive of the video was to show generational RT gains, but they aren’t comparing the same category of product from each generation.
I don't think you've watched the video. Or understood what it was comparing.
 
No, we watched the video. It's logically inconsistent.
It's logically consistent to those that want any reason to slag Nvidia, even if the cards don't match.

You are making zero sense. They showed some benchmarks but you are saying they did not show "jack shit" and that nobody knows anything until 3rd party benchmarks come out. Whether you can accept it or not based on what they did show it looks like the 7900xtx will easily be ahead of the 4080 in rasterization.

Nvidia does this same thing. Releasing slides with almost no context showing how amazing their cards are. They get LAMBASTED here for months, until those slides are actually proven to be correct (surprisingly).

Will this generation be faster than the last, for AMD? I would imagine so. Will it be AS FAST, who knows and we won't know until benchmarks are released.
 
I'm just gonna come out and say it... this guy is an AMD fanboy doofus. Nvidia's RT performance has gone through the roof this generation. How do I know this?

The Ray Tracing feature test in 3DMark is an excellent measure of RT performance.

My previous RTX 3080 got 48 FPS when overclocked as far as possible. My RTX 4090 gets...

[3DMark DirectX Raytracing feature test result screenshot]



That's right friends... a roughly 3x improvement in RT performance going from a cut down GA102 to a cut down AD102. Obviously this is not a perfect apples-to-apples comparison (anyone have an RTX 3090 they can test?), but this should put an appropriate amount of egg on Mr. AdoredTV's face.


Also, if anyone here has a 6800/XT, 6900 XT or 6950 XT, could you do us a solid and run the DirectX Raytracing feature test? I'm having some difficulty finding results for those GPUs.
 
I can see what it could mean: we can find tests where the 3090 vs 4090 difference is bigger when RT is not on, or some tests where the 3090's jump from the 2080 Ti seems bigger than the 4090's jump from the 3090.

[Quake 2 RTX 2160p benchmark chart]


[RTX 4090 generational rendering improvements chart]

But are those within margin-of-error amounts, and how pure a proxy for ray tracing milliseconds are they?
 
His whole position is just really perplexing. He is saying that relative to the raster increases gen on gen, the raytracing increases are poor, and by 'poor' meaning only slightly outpacing the raster increases. As if somehow the relative increase to raster is somehow more relevant or more important or more consequential than the outright gen on gen increase of RT performance overall.
 
His whole position is just really perplexing. He is saying that relative to the raster increases gen on gen, the raytracing increases are poor, and by 'poor' meaning only slightly outpacing the raster increases. As if somehow the relative increase to raster is somehow more relevant or more important or more consequential than the outright gen on gen increase of RT performance overall.
Which is completely nuts. There's a 2-3x increase in RT performance from the 30 to 40 series. In just 2 years time, Nvidia took RT from a cool tech to actually playable at high framerates at 1440p and 4K native. But, as I alluded to in my previous post, this guy is a blatant AMD fanboy. He doesn't even try to disguise it.
 
I'm just gonna come out and say it... this guy is an AMD fanboy doofus. Nvidia's RT performance has gone through the roof this generation. How do I know this?

The Ray Tracing feature test in 3DMark is an excellent measure of RT performance.

My previous RTX 3080 got 48 FPS when overclocked as far as possible. My RTX 4090 gets...

[3DMark DirectX Raytracing feature test result screenshot]


That's right friends... a roughly 3x improvement in RT performance going from a cut down GA102 to a cut down AD102. Obviously this is not a perfect apples-to-apples comparison (anyone have an RTX 3090 they can test?), but this should put an appropriate amount of egg on Mr. AdoredTV's face.
You do realize you're just making his point, right?

His whole position is just really perplexing. He is saying that relative to the raster increases gen on gen, the raytracing increases are poor, and by 'poor' meaning only slightly outpacing the raster increases. As if somehow the relative increase to raster is somehow more relevant or more important or more consequential than the outright gen on gen increase of RT performance overall.
No, he is saying that raw-power RT increases aren't realized in the real world, and that the relatively smaller raw-power improvements of prior gens delivered a more substantial performance increase than the much larger raw-power increase of the latest gen.
 
No, he is saying that raw-power RT increases aren't realized in the real world
What do you mean by that? That in the real world RT is a low enough percentage of the total workload that a big increase in raw RT performance does not lead to much overall gain?

Which could be true outside the known exceptions, but that is quite a different point from the one in the title. Obviously, if the slowdown induced by the milliseconds spent on RT in a frame was already less of an issue, that makes it harder to show a large gain from it. But in the few games and tasks where RT is still a large portion of a frame, is there anything unexpected going on that shows up?
 
What do you mean by that? That in the real world RT is a low enough percentage of the total workload that a big increase in raw RT performance does not lead to much overall gain?

Which could be true outside the known exceptions, but that is quite a different point from the one in the title. Obviously, if the slowdown induced by the milliseconds spent on RT in a frame was already less of an issue, that makes it harder to show a large gain from it. But in the few games and tasks where RT is still a large portion of a frame, is there anything unexpected going on that shows up?
It's inconclusive. Either the RT portion of the frame is too small to contribute more to overall performance, or the RT calculations are hitting an architectural bottleneck, fighting with raster for resources (possibly L1/L2 cache, among others).
 
Maybe, but they went from 6144 KB of L2 cache to 98304 KB on AD102, plus around +80% at the L1 level as well.

Memory bandwidth had a large gain for the 3090 (936.2 GB/s) over the 2080 Ti (616.0 GB/s), but stays virtually the same for the 4090 (1,008 GB/s) over the 3090 (936.2 GB/s). The 4090 has the 3090 Ti's memory bandwidth.

The much larger cache seems to make up for it in many respects, but one could think it will show up from time to time in terms of not always scaling as much for everything.

If they ever make a GDDR7X revision, maybe that will show up.

To put the thesis in perspective, RT TFLOPS:
2080 Ti: 42.9
3090 Ti: 78.1 (+82%)
4090: 191 (+145%)

How much more RT TFLOPS Ada has over the 3090 Ti is not obvious from the real-world performance numbers.
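To make that contrast concrete, here is a quick Python sanity check of the percentage gains implied by the spec numbers quoted in this post (taking those figures as given):

```python
# Percentage gains implied by the bandwidth and RT TFLOPS figures quoted above.
bandwidth = {"2080 Ti": 616.0, "3090": 936.2, "4090": 1008.0}  # GB/s
rt_tflops = {"2080 Ti": 42.9, "3090 Ti": 78.1, "4090": 191.0}

def pct_gain(new: float, old: float) -> str:
    return f"+{(new / old - 1) * 100:.0f}%"

print("Bandwidth, 3090 vs 2080 Ti:   ", pct_gain(bandwidth["3090"], bandwidth["2080 Ti"]))     # +52%
print("Bandwidth, 4090 vs 3090:      ", pct_gain(bandwidth["4090"], bandwidth["3090"]))        # +8%
print("RT TFLOPS, 3090 Ti vs 2080 Ti:", pct_gain(rt_tflops["3090 Ti"], rt_tflops["2080 Ti"]))  # +82%
print("RT TFLOPS, 4090 vs 3090 Ti:   ", pct_gain(rt_tflops["4090"], rt_tflops["3090 Ti"]))     # +145%
```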
 
I question how relevant RT actually is. Looking at Lumen lighting and the complex geometry of Nanite, that looks way better or more realistic than games with RT, runs faster, bla bla bla. Anyways, games made for consoles will be using tech that works, not needing a 4090, fake frames, etc. to make it work. Just thinking.
 
I question how relevant RT actually is. Looking at Lumen lighting and the complex geometry of Nanite, that looks way better or more realistic than games with RT, runs faster, bla bla bla. Anyways, games made for consoles will be using tech that works, not needing a 4090, fake frames, etc. to make it work. Just thinking.
Lumen uses RT massively (just in software mode), no?
https://docs.unrealengine.com/5.0/en-US/ray-tracing-performance-guide-in-unreal-engine/

It performs well as long as nothing is dynamic and everything can be pre-baked, with a quite specific scene structure, unlike universal RT.

With the latest shader execution reordering news, hardware-mode Lumen is apparently finally faster than software-mode Lumen:
https://wccftech.com/ser-improves-ue5-lumens-hardware-rt-performance-says-nvidia/
 
Lumen uses RT massively (just in software mode), no?
https://docs.unrealengine.com/5.0/en-US/ray-tracing-performance-guide-in-unreal-engine/

It performs well as long as nothing is dynamic and everything can be pre-baked, with a quite specific scene structure, unlike universal RT.

With the latest shader execution reordering news, hardware-mode Lumen is apparently finally faster than software-mode Lumen:
https://wccftech.com/ser-improves-ue5-lumens-hardware-rt-performance-says-nvidia/
Lumen does not need ray tracing for dynamic lights, does a great job on light bounce. Reflections use cube maps. It does support hardware ray tracing if turned on which does improve reflections.

Point is you can make outstanding graphically rich games using little to no ray tracing with UE 5.1. Ray tracing can be resource heavy killing performance and limiting the game.
 
That used to be true, but not anymore since the 4090 was released and I imagine that next generation will make it even better.
So everyone will have a 4090? lol. And even a 4090 cannot average 60 fps without DLSS in Cyberpunk maxed right now. Think how much more demanding that game will be once the overdrive update comes out that cranks the ray tracing even higher. And games will be much more demanding before the next gen cards come out too. It is a never ending cycle.
 
So everyone will have a 4090? lol. And even a 4090 cannot average 60 fps without DLSS in Cyberpunk maxed right now. Think how much more demanding that game will be once the overdrive update comes out that cranks the ray tracing even higher. And games will be much more demanding before the next gen cards come out too. It is a never ending cycle.
Good thing is dlss 3 frame gen will be coming with it ;). Doubles the fps! Also with ada's huge rt increase, the frame time will not be hurt by overdrive as much as prior series are.
 
Good thing is dlss 3 frame gen will be coming with it ;). Doubles the fps! Also with ada's huge rt increase, the frame time will not be hurt by overdrive as much as prior series are.
You clearly have not been keeping up with how DLSS 3 works...


https://www.techspot.com/article/2546-dlss-3/

"A game run with DLSS 3 at 60 FPS feels like a 30 FPS game in its responsiveness, which is not great and makes the technology unsuitable for transforming slower GPUs into 60 FPS performers."

"We would recommend a minimum of 120 FPS before DLSS 3, targeting around 200 FPS with frame generation enabled."
 
Would like to see how Flight Simulator performs in VR with DLSS 3, could be good or bad. Artifacts can be magnified more in VR, the smoothness would be astounding if it works as good as it does on a typical monitor. Since AMD may have something similar (Fluid Motion), I will just wait.
 
Good rundown of what is new in Unreal 5.1, nice changes, and yes, software ray tracing while also supporting hardware ray tracing. 5.0 looked amazing but 5.1 took it to the next level. I would like performance numbers with GPUs using Lumen and Nanite. What is faster: using Nvidia RTX or using Lumen's built-in global illumination? Which is easier to implement? Will have to download Unreal 5.1 and play around with it.

 