Nvidia's RT performance gains not as expected?

Why should anyone be interested in this? This is the most meaningless comparison ever. Yes, RT gains are not proportional to rasterization gains this gen, but who cares if the card is 60-70% faster anyway?
 
You do realize you're just making his point, right?


No, he is saying that the raw RT power increases aren't being realized in the real world, and that the relatively smaller raw-power improvements of prior generations delivered a more substantial performance increase than the much larger raw-power increase of the latest gen.
What does that even mean?

98% improvement 3090 -> 4090 compared to 60% from 2080 Ti -> 3090.

105% improvement 3090 -> 4090 compared to 63% from 2080 Ti -> 3090.

109% improvement 3090 -> 4090 compared to 47% from 2080 Ti -> 3090.

99% improvement 3090 -> 4090 compared to 62% from 2080 Ti -> 3090.

Edge case. Don't know what is going on with Far Cry 6. F1 22 and Watch Dogs Legion are also AMD titles but don't exhibit this behavior.

88% improvement 3090 -> 4090 compared to 62% from 2080 Ti -> 3090.

97% improvement 3090 -> 4090 compared to 54% from 2080 Ti -> 3090.

76% improvement 3090 -> 4090 compared to 49% from 2080 Ti -> 3090.
Why should anyone be interested in this? This is the most meaningless comparison ever. Yes, RT gains are not proportional to rasterization gains this gen, but who cares if the card is 60-70% faster anyway?
It's AdoredTV. He is reaching for anything to badmouth NVIDIA.
 
I'm just gonna come out and say it... this guy is an AMD fanboy doofus. Nvidia's RT performance has gone through the roof this generation. How do I know this?

The Ray Tracing feature test in 3DMark is an excellent measure of RT performance.

My previous RTX 3080 got 48 FPS when overclocked as far as possible. My RTX 4090 gets...

[attached screenshot: 3DMark result]


That's right friends... a roughly 3x improvement in RT performance going from a cut down GA102 to a cut down AD102. Obviously this is not a perfect apples-to-apples comparison (anyone have an RTX 3090 they can test?), but this should put an appropriate amount of egg on Mr. AdoredTV's face.


Also, if anyone here has a 6800/XT, 6900 XT or 6950 XT, could you do us a solid and run the DirectX Raytracing feature test? I'm having some difficulty finding results for those GPUs.
 
Ray tracing is quite CPU intensive. Part of the problem here is that TPU is testing on a 5800X. I see a huge difference going from a 10900K to a 12900K, for instance, with RT in Cyberpunk: about a 10-15% improvement in RT fps at 4K in open city scenes.
 
Ray tracing is quite CPU intensive. Part of the problem here is that TPU is testing on a 5800X. I see a huge difference going from a 10900K to a 12900K, for instance, with RT in Cyberpunk: about a 10-15% improvement in RT fps at 4K in open city scenes.
Because with RT much less culling is possible, meaning more objects, shaders and compute work -> more draw calls -> higher CPU usage.
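As a toy illustration of that point (a minimal Python sketch with made-up numbers, not how any real engine is structured): a rasterizer can frustum-cull most of the scene before issuing draw calls, while the RT path has to keep off-screen geometry resident in its acceleration structure because reflection and GI rays can still hit it.

```python
# Toy illustration: why RT keeps more of the scene "live" than raster.
# All numbers here are invented for the example.
import random

random.seed(1)
# Fake scene: each object is just a position along one axis.
scene = [random.uniform(-100.0, 100.0) for _ in range(10_000)]

def in_frustum(x: float) -> bool:
    """Pretend the camera only sees objects with -30 <= x <= 30."""
    return -30.0 <= x <= 30.0

# Rasterizer: cull anything outside the view frustum, draw the rest.
raster_draw_calls = sum(1 for x in scene if in_frustum(x))

# Ray tracer: every object stays in the BVH/TLAS, because an off-screen
# object can still show up in a reflection or contribute bounce lighting.
rt_objects_kept = len(scene)

print(f"raster draw calls after culling: {raster_draw_calls}")
print(f"objects the RT path must keep  : {rt_objects_kept}")
```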

The finer point of the video was the performance hit going from pure rasterization to RT, which will remain massive for some time. Most folks game with less than a 3070's worth of RT performance. Of all the RT-capable cards out there, how many gamers are actually using RT? Especially with the performance hit?

I look at how Lumen lighting performs and looks and almost come to the conclusion that Nvidia's RT advantage is irrelevant. In professional applications Nvidia is kicking some serious ass though.
 
My issue with RT is that even on Lovelace, the performance loss is still significant even though we are on the 3rd gen RTX GPU now. In the heaviest RT titles you are looking at around 40-50% reduction in frame rate if you max out RT, which for now is totally fine since my display maxes out at 120Hz anyways (LG CX) so I can just crank everything up, max out the RT, flip DLSS on and still get full use out of my GPU and display since I'm not exceeding 120fps under those conditions. But once a 4K 240Hz OLED comes out and now the choice is either to play at 240fps with RT off or cut my fps in half to play with RT on I would probably go for RT off at 240fps. When are we going to get to the point where we can flip a bunch of RT effects on to the max and lose no more than 15-20% performance for doing so? Ideally I'd want 0% performance loss for RT but that's an unrealistic expectation.
 
I am not sure it would "work" like that. An RT workload can easily be made arbitrarily long to run: you can add rays, shrink the volumetric regions that affect them, make the way they are affected more complex, and do the same for the surfaces they bounce off (and how many bounces you allow).

Maxing out RT should always reduce performance (otherwise the game is not letting you push it hard enough, almost by definition); it is more about whether you prefer the look for the cost. As we go toward 240 fps, frame generation becomes a fully usable thing (the added latency can be quite low when the game already runs at 130+ fps without it). There exists a theoretical scene so complex (and at a resolution so large) that RT becomes the faster way to render it, but we are probably many miles from that.
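A back-of-the-envelope way to see why "maxed out" RT is a moving target: every quality knob multiplies the ray count. A minimal sketch with assumed, illustrative settings (not taken from any particular game):

```python
# Rough ray-budget arithmetic; the settings are illustrative only.
width, height = 3840, 2160      # 4K output
samples_per_pixel = 2           # rays launched per pixel
max_bounces = 2                 # each bounce spawns another segment to trace

primary_rays = width * height * samples_per_pixel
ray_segments = primary_rays * (1 + max_bounces)   # worst case

print(f"primary rays per frame : {primary_rays:,}")   # ~16.6 million
print(f"ray segments per frame : {ray_segments:,}")   # ~50 million
print(f"segments per second at 120 fps: {ray_segments * 120:,}")
```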
 
My issue with RT is that even on Lovelace, the performance loss is still significant even though we are on the 3rd gen RTX GPU now. In the heaviest RT titles you are looking at around 40-50% reduction in frame rate if you max out RT, which for now is totally fine since my display maxes out at 120Hz anyways (LG CX) so I can just crank everything up, max out the RT, flip DLSS on and still get full use out of my GPU and display since I'm not exceeding 120fps under those conditions. But once a 4K 240Hz OLED comes out and now the choice is either to play at 240fps with RT off or cut my fps in half to play with RT on I would probably go for RT off at 240fps. When are we going to get to the point where we can flip a bunch of RT effects on to the max and lose no more than 15-20% performance for doing so? Ideally I'd want 0% performance loss for RT but that's an unrealistic expectation.
The performance loss with ray tracing is never going to significantly improve.
 
The performance loss with ray tracing is never going to significantly improve.

Based on what we've seen with the first 3 generations of RTX I can agree with that. Guess RT will continue to be something I keep off unless the card is just so fast that I can make the most out of my display with RT on like the 4090 is doing for my CX.
 
The performance loss with ray tracing is never going to significantly improve.
This is what was said about FSAA in the first 5-7 years it was around too. And I'd say it's been something people don't even think about turning on or off for, hmmm, maybe at least 6 years now? More? I definitely stopped thinking about it with the 1080 Ti...
 
This is what was said about FSAA in the first 5-7 years it was around too. And I'd say it's been something people don't even think about turning on or off for, hmmm, maybe at least 6 years now? More? I definitely stopped thinking about it with the 1080 Ti...
FSAA isn't ray tracing. FSAA performance can be improved by simply upping the pixel fill rate and memory bandwidth. While both metrics are massive compared to times of yore, nobody uses FSAA anymore because it is basically useless with how dependent rasterization is on the pixel pipeline for at least the last decade. FSAA can't anti-alias transparent textures or post-process effects.
 
I am not sure it would "work" like that. An RT workload can easily be made arbitrarily long to run: you can add rays, shrink the volumetric regions that affect them, make the way they are affected more complex, and do the same for the surfaces they bounce off (and how many bounces you allow).

Maxing out RT should always reduce performance (otherwise the game is not letting you push it hard enough, almost by definition); it is more about whether you prefer the look for the cost. As we go toward 240 fps, frame generation becomes a fully usable thing (the added latency can be quite low when the game already runs at 130+ fps without it). There exists a theoretical scene so complex (and at a resolution so large) that RT becomes the faster way to render it, but we are probably many miles from that.
Which begs the question: is RT the right solution for games? As games get more geometrically complex, you need more rays bouncing around to get a lighting calculation, and the number of rays currently injected is actually extremely low for a high-quality RT solution. All lighting models are gross approximations to begin with; a real candle in a dark room has a practically infinite number of photons bouncing around, something no foreseeable GPU will ever duplicate at that level of lighting detail. So RT, or the hybrid approach used in today's games, has real limitations -> there are other ways to calculate lighting that can be faster and maybe even more realistic in the end. More likely a combination of techniques in the end:
https://docs.unrealengine.com/5.0/en-US/lumen-technical-details-in-unreal-engine/

The dedicated RT hardware and AI fixed-function units that Nvidia committed to also mean more and more die space dedicated to those fixed-function solutions. That is another point made by the video: how much more can Nvidia increase RT when process nodes are becoming so prohibitive in cost? AMD, instead of tying up a larger and larger percentage of die space with fixed function, can put in more shaders, which can handle RT but also compute, shading, etc. Less fixed function, but more capable shaders. To me it seems like Nvidia went in a direction that is not sustainable or usable for the vast majority of PC gamers. One should not have to buy a 4090-level card just to get good framerates in RT games. Frame generation tech is interesting, so maybe after a few tweaks and updates it will come into its own like DLSS 1.0 -> 2.x did. When the 4080 price hit $1199 for a card so cut down from their top card, the writing may be on the wall.
 
Which begs the question: is RT the right solution for games? As games get more geometrically complex, you need more rays bouncing around to get a lighting calculation, and the number of rays currently injected is actually extremely low for a high-quality RT solution.
From my complete lack of understanding: not really. With extreme RT you throw, say, 4 samples per monitor pixel, and whether the scene has 14,000-triangle humans or 30,000-triangle humans does not change much; the cost grows more like the log of the complexity, versus linear for the traditional approach, to the point that with enough complexity RT could become faster. You do not need to make sure every triangle gets hit by rays; you naturally get what you need, and you still threw just 4 samples per camera pixel, which will still all bounce, say, a maximum of 2 times in the scene and so on, regardless of the triangle count, at least as I read it. The nature of the triangles matters more (mirror-like versus complex lighting bounce versus semi-transparent); maybe it makes the intersection work a bit more complex, but I am not sure you need more rays, since the rays you have will just make contact with a smaller and better representation of the world's triangles.
 
From my complete lack of understanding: not really. With extreme RT you throw, say, 4 samples per monitor pixel, and whether the scene has 14,000-triangle humans or 30,000-triangle humans does not change much; the cost grows more like the log of the complexity, versus linear for the traditional approach, to the point that with enough complexity RT could become faster. You do not need to make sure every triangle gets hit by rays; you naturally get what you need, and you still threw just 4 samples per camera pixel, which will still all bounce, say, a maximum of 2 times in the scene and so on, regardless of the triangle count, at least as I read it. The nature of the triangles matters more (mirror-like versus complex lighting bounce versus semi-transparent); maybe it makes the intersection work a bit more complex, but I am not sure you need more rays, since the rays you have will just make contact with a smaller and better representation of the world's triangles.
For RT, it is more than just your viewpoint: objects off screen also have to have rays cast against them, which can interact with the items on screen, especially for reflections but also for lighting in general. When you are dealing with tens of millions of triangles which block rays, more rays are needed for an accurate lighting calculation for those triangles. RT is really just an approximation; the more rays, the more realistic the lighting becomes, but it is affected by the complexity of the scene. Simple walls and textures with a few objects, compared to thousands of objects, current RT can probably handle decently; hundreds of moving people, tens of millions of triangles and vast distances make RT very performance heavy and limiting. HDRI lighting was a realistic but more fixed system of lighting, where a light map pixel's brightness would be used to calculate the brightness of anything exposed to it, just not very dynamic. Lightmaps generated from actual ray tracing gave that very realistic look, which is still used today, augmented by RT. Still, at 4K rendering, Cyberpunk 2077 on the fastest RT card, the 4090, is 42 fps. Who is RT in that game made for? How many can really take advantage of that? For me with the 3090, using balanced DLSS and some settings, I get like 51 fps - lol. Other games seem to be way smarter about using RT and look great, etc., but then some games don't even use RT and can look even better.
 
For RT, you can increase the performance by increasing the GPU's ability to process the math needed for it. That's separate from things like fillrates etc. So you can directly increase its performance independently of other functions. That's true for both AMD and Nvidia, despite the slightly different approaches to its acceleration.

FSAA isn't ray tracing. FSAA performance can be improved by simply upping the pixel fill rate and memory bandwidth. While both metrics are massive compared to times of yore, nobody uses FSAA anymore because it is basically useless with how dependent rasterization is on the pixel pipeline for at least the last decade. FSAA can't anti-alias transparent textures or post-process effects.

FSAA = SSAA. It was MSAA that had some trouble with transparencies and shader effects (and textures in general really, as it also couldn't deal with the texture aliasing seen when using anisotropic filtering).

The reason we moved away from SSAA was that there was no way to improve its performance directly. It will always be slower than when it's off, because you are literally rendering the game at a higher resolution and downsampling. That's why we moved to anti-aliasing methods that didn't do this, like MSAA, which tries to edge-detect and up/downsample only the edges of polygons, and post-processing AA methods like FXAA/SMAA and TAA.
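A quick sketch of why SSAA cost can never really be "fixed": the pixel work scales with the square of the supersampling factor, by definition. (Illustrative Python, assuming cost is proportional to shaded pixels.)

```python
# Supersampling cost scaling (assumes cost ~ number of shaded pixels).
base_w, base_h = 1920, 1080

for factor in (1, 2, 4):        # 1 = off, 2 = 2x2 SSAA, 4 = 4x4 SSAA
    pixels = (base_w * factor) * (base_h * factor)
    print(f"{factor}x{factor} SSAA: {pixels:>12,} shaded pixels "
          f"({factor * factor}x the work of native)")
```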
 
For RT, you can increase the performance by increasing the GPU's ability to process the math needed for it. That's separate from things like fillrates etc. So you can directly increase its performance independently of other functions. That's true for both AMD and Nvidia, despite the slightly different approaches to its acceleration.



FSAA = SSAA. It was MSAA that had some trouble with transparencies and shader effects (and textures in general really, as it also couldn't deal with the texture aliasing seen when using anisotropic filtering).

The reason we moved away from SSAA was that there was no way to improve its performance directly. It will always be slower than when it's off, because you are literally rendering the game at a higher resolution and downsampling. That's why we moved to anti-aliasing methods that didn't do this, like MSAA, which tries to edge-detect and up/downsample only the edges of polygons, and post-processing AA methods like FXAA/SMAA and TAA.
Now, instead of rendering at a higher resolution and downsampling, we render at a lower resolution and upsample. Progress. Then we add non-rendered frames in between. Moving forward. Maybe the move to RT is not the best path ;)

Yes, RT is very compute and memory intensive -> smart caches and more shaders should help out a lot.
 
For RT, it is more than just your viewpoint: objects off screen also have to have rays cast against them, which can interact with the items on screen.
It depends on which direction you go: if you start from the eye to do your ray tracing, it will not care about visibility; it will reach anything that emits light.

When you are dealing with tens of millions of triangles which block rays, more rays are needed for an accurate lighting calculation for those triangles.

Maybe the more triangles there are, the more benefit you get from added rays, but you really do not need them; the way it works is you have X rays and each either hits a triangle or not. If you have smaller triangles it works better by default, without having to add ray samples or add much time to the render.

Ray distance is obviously an issue (no walls to kill them, etc.), but whether the number of triangles on the models used is a big issue, I am not sure. I could make a benchmark to try it; in a very quick Blender test:

Triangle ratio | Time ratio | Triangles  | Render time (s) | Triangles per second
1              | 1.00       | 141,446    | 3.68            | 38,436.41
4              | 1.03       | 565,784    | 3.78            | 149,678.31
16             | 1.29       | 2,263,136  | 4.74            | 477,454.85
64             | 2.05       | 9,052,544  | 7.55            | 1,199,012.45

Increasing the triangle count on the models changed the render time little; even multiplying it by 4 only added 3% more time.
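A minimal sketch of why the triangle count barely matters in that test: with a BVH, the work per ray grows roughly with the depth of the tree, which is about log2 of the triangle count, not with the count itself. (The balanced-tree assumption below is a simplification, not Blender's actual implementation.)

```python
# Illustrative only: BVH traversal depth vs. triangle count.
import math

# Triangle counts from the quick Blender test above.
for triangles in (141_446, 565_784, 2_263_136, 9_052_544):
    depth = math.ceil(math.log2(triangles))   # ~depth of a balanced binary BVH
    print(f"{triangles:>10,} triangles -> ~{depth} BVH levels per ray")

# Prints roughly 18, 20, 22, 24 levels: a 64x increase in triangles adds only
# ~6 extra levels of traversal per ray, which is why render time in the table
# grows far more slowly than the triangle count.
```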
 
It depends on which direction you go: if you start from the eye to do your ray tracing, it will not care about visibility; it will reach anything that emits light.

When you are dealing with tens of millions of triangles which block rays, more rays are needed for an accurate lighting calculation for those triangles.

Maybe the more triangles there are, the more benefit you get from added rays, but you really do not need them; the way it works is you have X rays and each either hits a triangle or not. If you have smaller triangles it works better by default, without having to add ray samples or add much time to the render.

Ray distance is obviously an issue (no walls to kill them, etc.), but whether the number of triangles on the models used is a big issue, I am not sure. I could make a benchmark to try it; in a very quick Blender test:

Triangle ratio | Time ratio | Triangles  | Render time (s) | Triangles per second
1              | 1.00       | 141,446    | 3.68            | 38,436.41
4              | 1.03       | 565,784    | 3.78            | 149,678.31
16             | 1.29       | 2,263,136  | 4.74            | 477,454.85
64             | 2.05       | 9,052,544  | 7.55            | 1,199,012.45

Increasing the triangle count on the models changed the render time little; even multiplying it by 4 only added 3% more time.
With your tests, increase the triangles by increasing the number of objects, each with their own separate materials and properties, not necessarily just by putting more triangles on one object. Light bounces, color bleed, diffusion, reflections and shadows have to be calculated for each pixel. Just rendering one material, one object with nothing around it, is very much different. Put a bounding box around it, or limit the distance of the rays. Complex objects with different materials, crevices and different light sources would also contribute to increased workload if those added triangles add significant dimensions to the object.
 
With your tests, increase the triangles by increasing the number of objects, not necessarily just by putting more triangles on one object
Yes, that is a specific element of complexity, i.e. triangle count.

Shadows always end up being calculated per pixel regardless. Again, I could not fully explain why, but the complexity of ray tracing tends to be log(triangles), not the number of triangles as for regular raster. The complexity of the materials would be important; whether there are many different ones in the same scene, I am not sure matters much.

John Carmack talked about this back in the day:
https://arstechnica.com/gadgets/201.../?comments=1&comments-page=2#comment-23723213
Because ray tracing involves a log2 scale of the number of primitives, while rasterization is linear, it appears that highly complex scenes will render faster with ray tracing, but it turns out that the constant factors are so different that no dataset that fits in memory actually crosses the time order threshold.

Like he said, that threshold is so far away it may never be reached, but in theory the kind of scene that would need 2 TB of VRAM to fit, or with some very high complexity, could maybe be not that much slower on pure RT silicon than on raster silicon.

One variable is that non-specialized rendering technology has continued to advance a lot (in performance and quality); look at what a PS4 can do with the latest COD.

And it could very well keep that edge forever, especially for productions that have the giant resources and money to do all the complex pre-baking, and with Unity/UE being so widely available to everyone with little to no entry cost, that could mean virtually everyone.
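A minimal sketch of the crossover Carmack describes, with completely invented constant factors just to show the shape of the curves; it is exactly those constants that keep rasterization ahead for any scene that fits in memory.

```python
import math

# Illustrative cost models (arbitrary units, invented constants):
#   raster cost ~ c_raster * N             (linear in triangle count N)
#   RT cost     ~ rays * c_node * log2(N)  (logarithmic in N, huge per-ray cost)
c_raster = 1.0
c_node = 40.0                   # per-ray, per-BVH-level cost; deliberately large
rays = 8_300_000                # ~one ray per pixel at 4K

for n in (10**5, 10**7, 10**9, 10**11):
    raster_cost = c_raster * n
    rt_cost = rays * c_node * math.log2(n)
    winner = "RT" if rt_cost < raster_cost else "raster"
    print(f"N = {n:>15,}: raster {raster_cost:.2e}, RT {rt_cost:.2e} -> {winner}")

# With these made-up constants the crossover only happens around 10^10-10^11
# primitives, far beyond anything that fits in VRAM today.
```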
 
Yes, that is a specific element of complexity, i.e. triangle count.

Shadows always end up being calculated per pixel regardless. Again, I could not fully explain why, but the complexity of ray tracing tends to be log(triangles), not the number of triangles as for regular raster. The complexity of the materials would be important; whether there are many different ones in the same scene, I am not sure matters much.

John Carmack talked about this back in the day:
https://arstechnica.com/gadgets/201.../?comments=1&comments-page=2#comment-23723213
Because ray tracing involves a log2 scale of the number of primitives, while rasterization is linear, it appears that highly complex scenes will render faster with ray tracing, but it turns out that the constant factors are so different that no dataset that fits in memory actually crosses the time order threshold.

Like he said, that threshold is so far away it may never be reached, but in theory the kind of scene that would need 2 TB of VRAM to fit, or with some very high complexity, could maybe be not that much slower on pure RT silicon than on raster silicon.

One variable is that non-specialized rendering technology has continued to advance a lot (in performance and quality); look at what a PS4 can do with the latest COD.

And it could very well keep that edge forever, especially for productions that have the giant resources and money to do all the complex pre-baking, and with Unity/UE being so widely available to everyone with little to no entry cost, that could mean virtually everyone.
Interesting; that was from 2013, before current hardware. The 16 bounces of light for slightly glossy surfaces was an interesting comment. Will people with the majority of RT cards have a good experience, or should I say a better experience, when they turn on RT? I would say no. Even 3090 owners turn RT off at times due to the performance hit in games; it is not worth the other compromises that wreck IQ, increase lag, and cost more performance than what RT adds, if it is even noticed. Current RT games speak loudly about gamers' experience with and without. Anyway, I'm playing Crysis 2 Remastered now, a nice improvement over the old one, and it is only using software ray tracing and performing well on the 3090 at 4K, of course with quality DLSS.
 
Interesting; that was from 2013, before current hardware. The 16 bounces of light for slightly glossy surfaces was an interesting comment. Will people with the majority of RT cards have a good experience, or should I say a better experience, when they turn on RT? I would say no. Even 3090 owners turn RT off at times due to the performance hit in games; it is not worth the other compromises that wreck IQ, increase lag, and cost more performance than what RT adds, if it is even noticed. Current RT games speak loudly about gamers' experience with and without. Anyway, I'm playing Crysis 2 Remastered now, a nice improvement over the old one, and it is only using software ray tracing and performing well on the 3090 at 4K, of course with quality DLSS.

I'm playing Crysis 2 Remastered as well and I believe it only has RT reflections, but maybe I'm wrong. While the RT reflections are cool and all, I have to say they don't make a night-and-day difference to the experience. If I didn't have a 4090 I could just as easily turn RT off and still enjoy the game all the same. Really, the only games where I see RT truly upgrading the visuals are fully path-traced games like Portal RTX. But just imagine the kind of GPU we would need in order to run a fully path-traced Crysis, maybe an RTX 9090 Ti lol.
 
I'm playing Crysis 2 Remastered as well and I believe it only has RT reflections, but maybe I'm wrong. While the RT reflections are cool and all, I have to say they don't make a night-and-day difference to the experience. If I didn't have a 4090 I could just as easily turn RT off and still enjoy the game all the same. Really, the only games where I see RT truly upgrading the visuals are fully path-traced games like Portal RTX. But just imagine the kind of GPU we would need in order to run a fully path-traced Crysis, maybe an RTX 9090 Ti lol.
It does more than reflections: global illumination/lighting, color bleed and shadows.
 
It seems some people confuse the generational performance increase caused by having more cores clocked higher with a generational performance increase in the core itself.

The RTX 4000 series has a minor RT performance improvement which seems to come only from improved caches; otherwise RT performance is identical.
If RT performance per core were improved, then we would see bigger framerate differences in RT games compared to pure rasterization.

What I think will happen is that Nvidia will avoid significant changes they cannot make without increasing the transistor budget for RT cores, unless AMD catches up to them in relative RT performance, because the only thing which matters to Nvidia is being ahead of its competition, not overall RT performance. Nvidia needs to have the GPU which sits at the top of the charts, the premium product one buys to play at Ultra quality settings.
 
And where did you get that from? I know they have a new lighting system but it's not RTGI and there's no mention of RT shadows, just RT reflections.

https://www.eurogamer.net/digitalfoundry-2021-crysis-2-3-remastered-tech-review

The problem with that article is that it is about consoles. PCs can push the RT beyond what is done on the consoles, and a later patch let that go even further...

https://screenrant.com/crysis-remastered-ray-tracing-pc-update-boost-mode/

The latest PC update for Crysis Remastered, Version 2.1.2, adds an "experimental" Ray Tracing Boost Mode that enables ray tracing across nearly every surface.

But it's also worth noting that your article points out that Crysis Remastered was already using SVOGI on all platforms. SVOGI is a form of ray tracing and, as the name already suggests, handles the global illumination in the game.
 
The problem with that article is that it is about consoles. PCs can push the RT beyond what is done on the consoles, and a later patch let that go even further...

https://screenrant.com/crysis-remastered-ray-tracing-pc-update-boost-mode/

The article covers both console and PC. The screenrant article only seems to refer to Crysis 1 Remastered. I should probably go back and revisit that one with this new experimental mode.

"We'll talk about the PC versions first, as this is where both Crysis 2 and Crysis 3 have received the most love. In this case, I'd highly recommend watching the video embedded below, where Alex Battaglia experiences the maximum joy of having both Crysis and ray tracing once again combined into one glorious whole. But there is actually more to the games that just RT additions alone, certainly in the case of Crysis 2 Remastered."

"However, the PC version of Crysis 3 Remastered does receive a lot more love. What we don't get is much in the way of remastered art, mind you, while SVOGI is off the table. The most obvious improvement is ray traced reflections, again improving the realism of materials and specular lighting overall. Crysis 3 is filled with a lot of organic surfaces and environments, however, and here, ray tracing helps by eliminating a lot of cubemap glow and light leakage."
 
The article covers both console and PC. The screenrant article only seems to refer to Crysis 1 Remastered. I should probably go back and revisit that one with this new experimental mode.

"We'll talk about the PC versions first, as this is where both Crysis 2 and Crysis 3 have received the most love. In this case, I'd highly recommend watching the video embedded below, where Alex Battaglia experiences the maximum joy of having both Crysis and ray tracing once again combined into one glorious whole. But there is actually more to the games that just RT additions alone, certainly in the case of Crysis 2 Remastered."

Sorry, had a late edit to add more detail.

But both 2 & 3 also use SVOGI to do the GI, and that's a form of RT.
 
If RT performance per core were improved, then we would see bigger framerate differences in RT games compared to pure rasterization.
A typo, and you meant less? Not necessarily.

The RTX 4000 series has a minor RT performance improvement
Based on what? It seems to have exploded in that regard:

[chart: NVIDIA GeForce RTX 4080 V-Ray RTX benchmark scores]


[chart: 3DMark DirectX Raytracing feature test scores]


RT cores:
2080 Ti: 68
3090 Ti: 84 (+23%)
4090: 128 (+52%)

RT TFLOPS:
2080 Ti: 42.9
3090 Ti: 78.1 (+82%)
4090: 191 (+145%)
 
Would be nice if these gains actually translated into real games though.
[chart: RTX 4090 Metro Exodus benchmark]


The 4090 virtually doubles the 3090 in some RT-heavy games; I am not sure how that is possible with only a minor RT boost.

https://www.dsogaming.com/pc-perfor...x-4090-benchmarks-30-most-demanding-pc-games/

In the most demanding games, the best proxy for RT performance:


fps   | Dying Light 2 | Control | Metro Exodus RT Edition | Quake 2 RTX
4090  | 62            | 76      | 70                      | 76
3080  | 34            | 38      | 31                      | 38

Obviously the non-RT parts of a game's workload (CPU, composition, traditional raster and so on) did not improve as much as the ~2.4x jump in RT TFLOPS here, but that is still quite the jump.
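For what it's worth, the speedups implied by those numbers (just dividing the fps figures quoted above):

```python
# 4K RT-heavy fps figures quoted above (4090 vs 3080).
games = {
    "Dying Light 2": (62, 34),
    "Control": (76, 38),
    "Metro Exodus RT Edition": (70, 31),
    "Quake 2 RTX": (76, 38),
}
for name, (fps_4090, fps_3080) in games.items():
    print(f"{name:<24} {fps_4090 / fps_3080:.2f}x")
# Roughly 1.8x to 2.3x across these titles.
```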
 
A typo, and you meant less? Not necessarily.
I said it wrong, yes.
In games it looks like RT effects didn't get 'lighter' on framerate enough to talk about an RT performance increase. You still lose quite a lot of performance by enabling RT.

Based on what? It seems to have exploded in that regard:

...

RT cores:
2080 Ti: 68
3090 Ti: 84 (+23%)
4090: 128 (+52%)

RT TFLOPS:
2080 Ti: 42.9
3090 Ti: 78.1 (+82%)
4090: 191 (+145%)
Not sure what these tests measure, but yeah, it might be possible that some things related to RT are much faster and it just doesn't translate that well to today's games, e.g. they are barely shooting any rays, and the other steps (note: ray tracing has several steps to it beyond finding ray-geometry intersections) still take proportionally the same time as before, so the overall improvement seems smaller than some aspects of it are.

It may well be that future games with heavier use of RT will see bigger generational improvements between Turing/Ampere and Ada Lovelace.
 
The 4090 virtually doubles the 3090 in some RT-heavy games; I am not sure how that is possible with only a minor RT boost.
The 4090 has more cores and faster clocks than whichever RTX 3000 series card you pick.

Imho the 4090 vs the 3090 Ti is a nice case study.
The 4090 has 16384 shaders clocked at 2235 MHz = 36,618,240
The 3090 Ti has 10752 shaders clocked at ~1950 MHz = 20,966,400
36,618,240 / 20,966,400 = 1.7465
From a pure cores-times-clock point of view the 4090 should be 1.7465 times faster than the 3090 Ti. This is ignoring the memory, which is virtually the same.

Let's take your game example: 72.8 / 42 = 1.7333
Obviously it is not that simple, BUT it is this simple :)

If they made an Ampere card with the same number of cores as the 4090 and clocked it at the same clocks the 4090 uses, such a card would surely be more bottlenecked by slower memory (both bandwidth and average latency), but for the most part it would be almost as fast as a 4090.
One could even downclock the memory on a 3090 Ti to make it proportionally slower, to get a decent estimate of how fast such a 3090 Ti with more/faster cores would be vs the 4090, and with it estimate the architectural improvements between Ampere and Ada Lovelace.

tl;dr: more, faster cores ==> more FPS
No good reason with current data to claim Ada Lovelace differs in any significant way from Ampere in RT or otherwise.
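The same napkin math as a tiny script, using the shader counts and clocks from the post above and the 72.8 / 42 fps game example, assuming performance scales purely with shaders x clock:

```python
# Napkin math: estimate 4090 vs 3090 Ti purely from shader count x clock,
# then compare with the measured game result quoted above.
shaders   = {"RTX 4090": 16384, "RTX 3090 Ti": 10752}
clock_mhz = {"RTX 4090": 2235,  "RTX 3090 Ti": 1950}   # clocks used in the post

throughput = {gpu: shaders[gpu] * clock_mhz[gpu] for gpu in shaders}
predicted = throughput["RTX 4090"] / throughput["RTX 3090 Ti"]
measured = 72.8 / 42.0

print(f"predicted from shaders x clock: {predicted:.4f}x")   # ~1.75x
print(f"measured in the game example  : {measured:.4f}x")    # ~1.73x
```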
 
In games it looks like RT effects didn't get 'lighter' on framerate enough to talk about an RT performance increase. You still lose quite a lot of performance by enabling RT.
Those seem to be 2 different things. If the non-RT per-frame milliseconds went down, you need a massive jump in RT performance just for the RT effects to keep the same relative performance decrease, no?
If they made an Ampere card with the same number of cores as the 4090 and clocked it at the same clocks the 4090 uses, such a card would surely be more bottlenecked by slower memory (both bandwidth and average latency), but for the most part it would be almost as fast as a 4090.
I am not sure what size of die you would need to fit that many cores on Samsung 8 nm, or how many thousands of watts it would have drawn, but I can imagine.

No good reason with current data to claim Ada Lovelace differs in any significant way from Ampere in RT or otherwise.
I feel there are 2 different conversations here. The rather obvious one, that RT performance relative to raster performance does not seem to have changed significantly from Ampere to Lovelace, gets shifted into "RT performance does not differ in a significant way from Ampere" (despite showing a massive gain in pretty much pure path-traced scenarios...).
 
Those seem to be 2 different things. If the non-RT per-frame milliseconds went down, you need a massive jump in RT performance just for the RT effects to keep the same relative performance decrease, no?
Going from e.g. a 3060 to a 3090 Ti we see a nice performance improvement across the board; RT performance improves about as much as rasterization performance.
Then going to the 4090 we see the same thing.

The conclusion about what happened should be pretty obvious.

--------------------------
Apparently the Ada Lovelace RT performance improvements which Nvidia claimed only apply to new features which need to be coded for. If that is the case, then it would make sense that we do not see improvements in current titles.
 
That RT performance exploded at least as much as raster did? Like it does if we go from a 3060 to a 3090 Ti?
If RT performance increased as much as raster performance, and by chance it is exactly as much as the increase in core count times clock, then it means there is actually no improvement in the RT core design.

Personally I think that whatever future RT performance increases we will get from new RTX 4000 features could be accomplished just fine on Ampere and even Turing, but because Nvidia worries about user experience, those cards won't support these features/improvements and you will have to get a new GPU to get them 🙃
 
Possible
RT core count:
2080 Ti: 68
3090 Ti: 84 (+23%)
4090: 128 (+52%)

RT TFLOPS:
2080 Ti: 42.9
3090 Ti: 78.1 (+82%)
4090: 191 (+145%)

It went from 1860 to 2520 MHz while gaining 52% more cores: 1 * 1.52 * 1.35 = 2.05, and the TFLOPS gain still seems to be higher than that.

But the Turing to Ampere TFLOPS jump was more impressive by that napkin math, though I am not sure how valid it is:
1.23 * (1860 / 1545) = 1.48 vs the +82%. I would imagine that by now the low-hanging-fruit easy gains have all been made by the third gen, and gen-over-gen gains will be like anything else.

But we are having a very different conversation from the original claim and going into a bit of a bizarro world: yes, if Ampere had had most of the technological advantages Lovelace has over it, by packing more than twice the transistors per mm² that it did, it would be quite similar....

It is like saying that if the 7900 XTX had only 80 CUs with half the transistors in them, still had 512 GB/s of memory bandwidth instead of 960 GB/s, and still ran at 2 GHz, the gain would be small.
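The RT-core version of that napkin math, using the core counts, boost clocks and RT TFLOPS figures listed above (cores x clock is only a crude proxy; it ignores any per-core changes):

```python
# RT-core napkin math from the figures above.
rt_cores  = {"2080 Ti": 68,   "3090 Ti": 84,   "4090": 128}
boost_mhz = {"2080 Ti": 1545, "3090 Ti": 1860, "4090": 2520}
rt_tflops = {"2080 Ti": 42.9, "3090 Ti": 78.1, "4090": 191.0}

def ratio(metric, new, old):
    return metric[new] / metric[old]

for new, old in (("3090 Ti", "2080 Ti"), ("4090", "3090 Ti")):
    cores_x_clock = ratio(rt_cores, new, old) * ratio(boost_mhz, new, old)
    tflops = ratio(rt_tflops, new, old)
    print(f"{old} -> {new}: cores x clock {cores_x_clock:.2f}x, "
          f"RT TFLOPS {tflops:.2f}x")

# 2080 Ti -> 3090 Ti: ~1.49x from cores x clock vs ~1.82x in RT TFLOPS.
# 3090 Ti -> 4090  : ~2.06x from cores x clock vs ~2.45x in RT TFLOPS.
```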
 