AMD RDNA 2 gets ray tracing

Ah yes, you are right. Volta has Tensor cores but not the same RT cores as Turing.

But in any case, it does support DXR, and the gist of my point was that the programs run in parallel, which they do.

I must have misread this line from Wikipedia. Sorry for the confusion.
RTX runs on Nvidia Volta- and Turing-based GPUs, specifically utilizing the Tensor cores (and new RT cores on Turing) on the architectures for ray tracing acceleration.
https://en.wikipedia.org/wiki/Nvidia_RTX
 
Well, first of all, the pic you shared does show concurrent execution of RT alongside other work, so your premise is wrong.

The parallelism of work on modern GPUs is mostly dependent on how the game engine schedules tasks as all recent hardware can process graphics and compute concurrently. Obviously there are dependencies during the rendering of a single frame and there is an order in which work needs to be completed.
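To make that concrete, here is a minimal D3D12 host-side sketch of how an engine exposes that concurrency (the helper name and the assumption of an already-created device are mine, purely for illustration, not from any particular engine):

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Illustrative sketch: one direct (graphics) queue and one async compute queue.
// Work submitted to the two queues can overlap on hardware that supports
// concurrent graphics + compute; ordering between them is expressed with fences.
// Error handling is omitted to keep the sketch short.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&graphicsQueue));

    D3D12_COMMAND_QUEUE_DESC asyncDesc = {};
    asyncDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;   // compute-only, runs alongside
    device->CreateCommandQueue(&asyncDesc, IID_PPV_ARGS(&computeQueue));
}
```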

Given the inherent divergence between rays, it actually makes sense to write ray-hit parameters out to an intermediate buffer and then shade only once RT is done. This would maximize the coherence of memory requests during shading.

Firing off shading immediately for each individual ray hit, without any sort of sorting or batching, would wreak havoc on any architecture, including RDNA.
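To illustrate the point, here is a minimal CPU-side sketch of that deferred, batched approach (the `RayHit` layout, the material IDs and `ShadeDeferred` are invented for the example, not taken from any real engine): hits are written to a buffer first, then sorted by material so that shading runs over coherent data.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical hit record written out by the ray-tracing pass.
struct RayHit {
    uint32_t pixel;       // which pixel the ray belongs to
    uint32_t materialId;  // material at the hit point
    float    t;           // hit distance along the ray
    float    u, v;        // barycentric coordinates at the hit
};

// Shade only after all hits are available, sorted by material so that
// memory accesses during shading stay as coherent as possible.
void ShadeDeferred(std::vector<RayHit>& hits)
{
    std::sort(hits.begin(), hits.end(),
              [](const RayHit& a, const RayHit& b) {
                  return a.materialId < b.materialId;
              });

    for (const RayHit& hit : hits) {
        // Placeholder for real shading: fetch the material, evaluate the
        // BRDF, accumulate into hit.pixel.
        (void)hit;
    }
}
```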
Thanks for the feedback. As for the Metro Exodus frame capture, that is a timeline of the rendering, and it does indicate more of a serial dependency between the RT cores and shading.
  1. The first part is setting up the geometry, optimized by using primitives or building blocks, instancing, etc. Then the rays are cast using the RT cores, which appear to work independently of the shaders, finding what intersects the geometry along with any secondary bounces.
  2. At least in this example, the shading part is not dispatching any additional rays after that block. The data from the RT stage is moved to the shader cores in a read/write cycle; they are not sharing memory access.
AMD: If the patent is followed, which seems to be the case based on what Mark Cerny indicated, it appears that:
  1. The rays are cast to find intersections with objects for the associated shaders
  2. Any object-associated shaders can cast additional rays as needed
  3. For example, an object with a reflective component, like water, glass, or a mirror, can have more rays cast right in the shader to determine what will be reflected so the pixel can be rendered
  4. The data is local to both the shader and the RT engine, which is not the case with Turing (AMD owns the patent). In the AMD design, any extra rays are cast as needed, while with Nvidia you cast enough, way more rays just in case, sufficient to supply the shaders with the data they need
  5. The RT part and shading can also work independently at the same time instead of waiting for each other
If memory is not local to both types of operations, finding geometry intersections and shading, then going back and forth with reads and writes to each other's buffers, caches, etc. would, I would think, be very problematic.

AMD combined those operations to share the same memory pool (their patent), allowing rays to be cast as needed and not wasting processing, time, or memory on rays that contribute little to the scene.

Objects or materials like a rock, for example, do not need rays for reflection, so no extra rays are needed and no time is wasted processing them. For Nvidia, the intersection finding and material shading are separate, so extra rays have to be cast just in case they are needed by the object's materials.

How all of this really works out remains to be seen. I am just interested in how each approach is accomplished and in the benefits, or lack thereof, of the different methods. From what I gather, AMD's method is more versatile and programmable for efficiency. A clear white paper and some good engineering/developer discussions with AMD should make this clear.
 
I think that makes sense. I really do hope AMD found some "special sauce" but it could be more hype knowing AMD and the stuff that they talked about before that never materialized.
 
I think that makes sense. I really do hope AMD found some "special sauce" but it could be more hype knowing AMD and the stuff that they talked about before that never materialized.
Exactly, what matters is how it performs and whether it gets the support to take graphics to the next level.
 
If AMD's ray tracing solution is indeed more efficient than Nvidia's, would Nvidia need more raw power in their GPUs to come out on top in terms of final ray tracing performance?
 
If it were likely to generate more favor than complaint, they would have released it already.
Therefore I would say AMD's solution must still be an unpresentable work in progress.
Not to say they won't get there some day, but you can wish in one hand...
 
The thing is, Nvidia's ray tracing solution is inefficient. It's not even leveraging all the resources that Turing cards provide. It was supposed to use the Tensor cores, but the denoising is done on the shaders.

If Nvidia can get issues like this sorted in their next-gen GPUs, then I can't see how AMD will be able to get near that level of performance in RT without some sort of dedicated RT hardware onboard the card.

Reading Noko's post above, if what he says is correct, then it looks like AMD is going the route of reducing the amount of ray tracing being done in any scene.

I just wonder whether that will cause problems with comparisons. Using the rock example above: let's say there is a frame with rocks. Nvidia ray traces 20% of the scene, AMD only does 10% because it doesn't ray trace the rocks. Will that be fair in benchmarks, especially if there is a visual difference?

These will be interesting times.
 
The thing is, Nvidia's ray tracing solution is inefficient. It's not even leveraging all the resources that Turing cards provide. It was supposed to use the Tensor cores, but the denoising is done on the shaders.

If Nvidia can get issues like this sorted in their next-gen GPUs, then I can't see how AMD will be able to get near that level of performance in RT without some sort of dedicated RT hardware onboard the card.

Reading Noko's post above, if what he says is correct, then it looks like AMD is going the route of reducing the amount of ray tracing being done in any scene.

I just wonder whether that will cause problems with comparisons. Using the rock example above: let's say there is a frame with rocks. Nvidia ray traces 20% of the scene, AMD only does 10% because it doesn't ray trace the rocks. Will that be fair in benchmarks, especially if there is a visual difference?

These will be interesting times.
That will be up to the individual, whether the visual/perf tradeoff is worth it... and what the visual difference, if any, is. Heck, turning RT on has a visual difference for a sacrifice of performance... so do all levels of RT. Sometimes the lowest level of RT is good enough, sometimes the highest level still sucks. It's up to the implementation and use. Maybe we'll get back to having IQ included in benchmarks.
 
The thing is, Nvidia's ray tracing solution is inefficient. It's not even leveraging all the resources that Turing cards provide. It was supposed to use the Tensor cores, but the denoising is done on the shaders.

If Nvidia can get issues like this sorted in their next-gen GPUs, then I can't see how AMD will be able to get near that level of performance in RT without some sort of dedicated RT hardware onboard the card.

Reading Noko's post above, if what he says is correct, then it looks like AMD is going the route of reducing the amount of ray tracing being done in any scene.

I just wonder whether that will cause problems with comparisons. Using the rock example above: let's say there is a frame with rocks. Nvidia ray traces 20% of the scene, AMD only does 10% because it doesn't ray trace the rocks. Will that be fair in benchmarks, especially if there is a visual difference?

These will be interesting times.

You'd think that if Nvidia can get the Tensor core denoising working on their next gen GPUs it could be done on the current Turing cards as well. I've been wondering what happened to the Tensor core denoising on Turing cards.
 
Thanks for the feedback. As for the Metro Exodus frame capture, that is a timeline of the rendering, and it does indicate more of a serial dependency between the RT cores and shading.
  1. The first part is setting up the geometry, optimized by using primitives or building blocks, instancing, etc. Then the rays are cast using the RT cores, which appear to work independently of the shaders, finding what intersects the geometry along with any secondary bounces.
  2. At least in this example, the shading part is not dispatching any additional rays after that block. The data from the RT stage is moved to the shader cores in a read/write cycle; they are not sharing memory access.
AMD: If the patent is followed, which seems to be the case based on what Mark Cerny indicated, it appears that:
  1. The rays are cast to find intersections with objects for the associated shaders
  2. Any object-associated shaders can cast additional rays as needed
  3. For example, an object with a reflective component, like water, glass, or a mirror, can have more rays cast right in the shader to determine what will be reflected so the pixel can be rendered
  4. The data is local to both the shader and the RT engine, which is not the case with Turing (AMD owns the patent). In the AMD design, any extra rays are cast as needed, while with Nvidia you cast enough, way more rays just in case, sufficient to supply the shaders with the data they need
  5. The RT part and shading can also work independently at the same time instead of waiting for each other
If memory is not local to both types of operations, finding geometry intersections and shading, then going back and forth with reads and writes to each other's buffers, caches, etc. would, I would think, be very problematic.

AMD combined those operations to share the same memory pool (their patent), allowing rays to be cast as needed and not wasting processing, time, or memory on rays that contribute little to the scene.

Objects or materials like a rock, for example, do not need rays for reflection, so no extra rays are needed and no time is wasted processing them. For Nvidia, the intersection finding and material shading are separate, so extra rays have to be cast just in case they are needed by the object's materials.

How all of this really works out remains to be seen. I am just interested in how each approach is accomplished and in the benefits, or lack thereof, of the different methods. From what I gather, AMD's method is more versatile and programmable for efficiency. A clear white paper and some good engineering/developer discussions with AMD should make this clear.
Where do you pull all this bullshit from?
AMD having a patent means only that they rephrased the same thing differently and filed for a patent. If they had not, they would be getting sued by Nvidia for infringing Nvidia's patent.

The way DXR works is defined by Microsoft: rays are always cast as needed, and all secondary shader logic is written as shader programs in a recursive manner.
Local memory pool == cache. The more cache you have the better, since in some cases, e.g. when you are shooting rays for adjacent pixels, it is very likely you won't need different BVH entries to find the intersections; or, put another way, if a given execution unit already has some of the needed data it does not need to fetch it again. From this it is conceivable that just by increasing cache, RT performance should increase.
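For anyone unfamiliar with the recursion point above, here is a tiny CPU sketch of the "rays are cast as needed" model (the one-sphere scene and all type names are made up; real DXR shaders are HLSL, this is just the shape of the logic): secondary rays are only spawned by shading code that decides it needs them, bounded by a depth limit.

```cpp
#include <cmath>
#include <cstdio>
#include <optional>

// Tiny CPU sketch of "rays are cast as needed": secondary rays are spawned
// only by shading logic that wants them, and recursion is bounded by maxDepth.
struct Vec3 {
    float x, y, z;
    Vec3 operator+(Vec3 b) const { return {x + b.x, y + b.y, z + b.z}; }
    Vec3 operator*(float s) const { return {x * s, y * s, z * s}; }
};
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };
struct Hit { Vec3 point, normal; bool reflective; };

// Stand-in for BVH traversal: a single reflective unit sphere at the origin.
static std::optional<Hit> Intersect(const Ray& r) {
    float b = 2.0f * dot(r.origin, r.dir);
    float c = dot(r.origin, r.origin) - 1.0f;
    float disc = b * b - 4.0f * c;
    if (disc < 0.0f) return std::nullopt;
    float t = (-b - std::sqrt(disc)) * 0.5f;
    if (t <= 0.001f) return std::nullopt;
    Vec3 p = r.origin + r.dir * t;
    return Hit{p, p, true};            // for a unit sphere the normal equals the point
}

static Vec3 Trace(const Ray& ray, int maxDepth) {
    std::optional<Hit> hit = Intersect(ray);
    if (!hit) return {0.2f, 0.3f, 0.5f};              // "miss shader": sky colour
    Vec3 color = {0.1f, 0.1f, 0.1f};                  // base shading at the hit
    if (hit->reflective && maxDepth > 0) {            // only reflective surfaces recurse
        Vec3 d = ray.dir + hit->normal * (-2.0f * dot(ray.dir, hit->normal));
        color = color + Trace(Ray{hit->point, d}, maxDepth - 1) * 0.5f;
    }
    return color;
}

int main() {
    Vec3 c = Trace(Ray{{0.0f, 0.0f, -3.0f}, {0.0f, 0.0f, 1.0f}}, 2);
    std::printf("traced colour: %.2f %.2f %.2f\n", c.x, c.y, c.z);
}
```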

Your whole premise is that Turing ray tracing is terrible, and all of your ignorant "facts" flow from that first assumption.
 
You'd think that if Nvidia can get the Tensor core denoising working on their next gen GPUs it could be done on the current Turing cards as well. I've been wondering what happened to the Tensor core denoising on Turing cards.
Metro Exodus frame analysis shows Tensor core usage.
From what I know, not all games use Tensor cores for denoising. It is really the developer's choice what they use...
 
This is why we have review sites: because random users will post crap based on ignorance.

Notice how claims are being tossed around that have either no foundation in reality or are pure BS:
"Serial raytracing"
"Inefficient"

Well, as soon as AMD puts out its hardware these kinds of posts (and posters) will seem very hilarious... as some people want 2+2=3,7875422...2
 
Metro Exodus frame analysis shows Tensor core usage.
From what I know, not all games use Tensor cores for denoising. It is really the developer's choice what they use...
Isn't the tensor core usage on that example from DLSS?

Where do you pull all this bullshit from?
AMD having a patent means only that they rephrased the same thing differently and filed for a patent. If they had not, they would be getting sued by Nvidia for infringing Nvidia's patent.

The way DXR works is defined by Microsoft: rays are always cast as needed, and all secondary shader logic is written as shader programs in a recursive manner.
Local memory pool == cache. The more cache you have the better, since in some cases, e.g. when you are shooting rays for adjacent pixels, it is very likely you won't need different BVH entries to find the intersections; or, put another way, if a given execution unit already has some of the needed data it does not need to fetch it again. From this it is conceivable that just by increasing cache, RT performance should increase.

Your whole premise is that Turing ray tracing is terrible, and all of your ignorant "facts" flow from that first assumption.
Read the patent, it is all there; look at para. 0023, and the fig. 18 ladder logic may help as well. The first point of the abstract on the first page sums it up:
http://www.freepatentsonline.com/20190197761.pdf
 
This is why we have review sites: because random users will post crap based on ignorance.

Notice how claims are being tossed around that have either no foundation in reality or are pure BS:
"Serial raytracing"
"Inefficient"

Well, as soon as AMD puts out its hardware these kinds of posts (and posters) will seem very hilarious... as some people want 2+2=3,7875422...2
lol, you make me laugh -> please keep it up :D
 
Isn't the tensor core usage on that example from DLSS?
Good point.
So no tensor core denoising in this game.

Read the patent, it is all there; look at para. 0023, and the fig. 18 ladder logic may help as well. The first point of the abstract on the first page sums it up:
http://www.freepatentsonline.com/20190197761.pdf
I think it does work exactly the same way in Turing; AMD uses the words "texture unit" a lot, probably to avoid infringing on Nvidia patents.
It doesn't matter where you put the BVH-handling functionality: separate cores, texture units, shader units, etc.

Do you have any performance results from RDNA2?
 
Good point.
So no tensor core denoising in this game.


I think it does work exactly the same way in Turing; AMD uses the words "texture unit" a lot, probably to avoid infringing on Nvidia patents.
It doesn't matter where you put the BVH-handling functionality: separate cores, texture units, shader units, etc.

Do you have any performance results from RDNA2?
Well, after geometry setup and BVH building, the BVH is sent to the CU; the shaders then determine whether the ray tests against the bounding volumes or boxes will be done or not. To me this looks way different from what Nvidia is doing.

As for performance, the only indication, and it is really not specific, is from the brief ray tracing talking points in the PS5 presentation video about complex animations and reflections. Mark Cerny:
". . . I've already seen a PS5 title that is successfully using raytracing-based reflections in complex animated scenes, with only modest costs. . . ."

Does that mean something like BF5, where Turing took a significant hit when doing just reflections, or something less? No idea. We just have to wait and see how this pans out and how AMD actually explains how everything works. Everything Mark Cerny said about ray tracing indicates that this patent is applicable to how it works; at least he called it by the same name, Ray Intersection Engine, and described how the two can work independently in the CU.
 
If AMD's ray tracing solution is indeed more efficient than Nvidia's, would Nvidia need more raw power in their GPUs to come out on top in terms of final ray tracing performance?
Once every console and their programmers deem it a must-do? I think "Yes" is the correct response. What do you think?
 
Good point.
So no tensor core denoising in this game.


I think it does work exactly the same way in Turing; AMD uses the words "texture unit" a lot, probably to avoid infringing on Nvidia patents.
It doesn't matter where you put the BVH-handling functionality: separate cores, texture units, shader units, etc.

Do you have any performance results from RDNA2?

All we have is this:

I guess that is why false statements get used as an excuse /shrugs

Another "fun" fact is that dude "moore's law is dead" who posts a lot of...well crap (he is cleary biased on NO engineer) who claimed "Consoles first".
Welll....lol:
https://www.pcgamer.com/amd-big-navi-first-rdna-product/

"There's a lot of excitement for Navi 2, or what our fans have dubbed as the Big Navi" says Devinder Kumar, AMD's CFO, at the recent Bank of America Securities Global Technology Conference. "This will be our first RDNA 2 based product."

That is what happens when you listen to people posting based on their "feelings"... foot-in-mouth syndrome ensues.
 
Reading Noko's post above, if what he says is correct, then it looks like AMD is going the route of reducing the amount of ray tracing being done in any scene.

Casting a ray requires an explicit call from the game via DXR. It's not up to hardware or drivers to decide on the number of rays to cast. Unfortunately noko does not seem to understand how this works.
 
Casting a ray requires an explicit call from the game via DXR. It's not up to hardware or drivers to decide on the number of rays to cast. Unfortunately noko does not seem to understand how this works.

Yeah, we don't need a repeat of the tessellation override... the devs decide how the game is rendered, not AMD.
 
If memory is not local to both types of operations, finding geometry intersections and shading, then going back and forth with reads and writes to each other's buffers, caches, etc. would, I would think, be very problematic.

AMD combined those operations to share the same memory pool (their patent), allowing rays to be cast as needed and not wasting processing, time, or memory on rays that contribute little to the scene.

Objects or materials like a rock, for example, do not need rays for reflection, so no extra rays are needed and no time is wasted processing them. For Nvidia, the intersection finding and material shading are separate, so extra rays have to be cast just in case they are needed by the object's materials.

The memory architecture will be critical for performance but has nothing to do with the number of rays cast. The casting of rays is completely controlled by the game engine.

I don't understand what you're trying to say with the rock example. You wouldn't blindly cast reflection rays for non-reflective surfaces on any architecture.

From what I gather, AMD's method is more versatile and programmable for efficiency. A clear white paper and some good engineering/developer discussions with AMD should make this clear.

If AMD's hardware is more versatile or programmable it will only matter if DXR is also updated to support the greater flexibility.
 
Well, after geometry setup and BVH building, the BVH is sent to the CU; the shaders then determine whether the ray tests against the bounding volumes or boxes will be done or not. To me this looks way different from what Nvidia is doing.

Yes, AMD's patent describes an approach where traversal over intermediate BVH nodes is controlled by shader code. Unfortunately DXR doesn't support that functionality so it's a moot point unless AMD is going to roll out a custom RT api like they did with Mantle.

Also, this approach would likely be much, much slower given all of the data moving back and forth between the RT engine and shader core during traversal. Think of how slow anisotropic filtering would be if the TMU had to ask the shader core to select each mip level during texture mapping.
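For anyone wondering what "traversal controlled by shader code" even means in practice, here is a very rough CPU caricature (2D boxes, a made-up node layout, nothing from the patent's actual hardware split): only the box test is "fixed function", while the stack, the loop and the decision of what to visit next live in the shader, so every iteration is a shader/engine round trip.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Rough caricature of shader-managed BVH traversal. The node layout, the 2D
// boxes and the scene below are invented purely to illustrate the control flow.
struct Ray  { float ox, oy, dx, dy; };
struct Node {
    float minX, minY, maxX, maxY;   // bounding box
    int   left, right;              // child indices, -1 for a leaf
};

// Stand-in for the fixed-function intersection engine: a 2D slab test.
static bool EngineTestBox(const Ray& r, const Node& n) {
    float tx1 = (n.minX - r.ox) / r.dx, tx2 = (n.maxX - r.ox) / r.dx;
    float ty1 = (n.minY - r.oy) / r.dy, ty2 = (n.maxY - r.oy) / r.dy;
    float tmin = std::max(std::min(tx1, tx2), std::min(ty1, ty2));
    float tmax = std::min(std::max(tx1, tx2), std::max(ty1, ty2));
    return tmax >= tmin && tmax >= 0.0f;
}

// "Shader"-managed traversal: each loop iteration hands a node to the engine,
// gets the result back, and decides what to push next -- that back-and-forth
// is the overhead being discussed above.
static int CountLeafHits(const Ray& r, const std::vector<Node>& bvh) {
    int hits = 0;
    std::vector<int> stack = {0};                     // start at the root
    while (!stack.empty()) {
        int idx = stack.back(); stack.pop_back();
        const Node& n = bvh[idx];
        if (!EngineTestBox(r, n)) continue;           // engine result returns to "shader"
        if (n.left < 0) { ++hits; continue; }         // leaf: would test triangles / shade
        stack.push_back(n.left);                      // shader chooses what to visit next
        stack.push_back(n.right);
    }
    return hits;
}

int main() {
    std::vector<Node> bvh = {
        {0.0f, 0.0f, 10.0f, 10.0f, 1, 2},    // root
        {0.0f, 0.0f, 4.0f, 10.0f, -1, -1},   // left leaf
        {6.0f, 0.0f, 10.0f, 10.0f, -1, -1},  // right leaf
    };
    Ray r{-1.0f, 5.0f, 1.0f, 0.01f};                  // nearly horizontal ray through both leaves
    std::printf("leaf boxes hit: %d\n", CountLeafHits(r, bvh));
}
```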
 
Yes, AMD's patent describes an approach where traversal over intermediate BVH nodes is controlled by shader code. Unfortunately DXR doesn't support that functionality so it's a moot point unless AMD is going to roll out a custom RT api like they did with Mantle.

Also, this approach would likely be much, much slower given all of the data moving back and forth between the RT engine and shader core during traversal. Think of how slow anisotropic filtering would be if the TMU had to ask the shader core to select each mip level during texture mapping.
Shaders control what to do with each intersection, e.g. do some calculation and general logic, cast additional rays, do color blending, etc., and it makes sense to expose such functionality to developers, because it makes the API fully programmable and allows for all kinds of effects, even those beyond the imagination of the API's developers.

On the other hand, exposing functionality that deals with the hardware implementation of BVH traversal does not seem to bring any benefit to developers. I also find it hard to imagine how an application developer could optimize aspects of the rendering pipeline better than driver developers can. It should be the driver developers' job to select the best execution unit for each ray intersection and to decide what data should be kept in cache.

As far as the developer is concerned, there might as well be a black hole inside the GPU chip with the rendered scene created inside it. All the developer really cares about is that the GPU and its driver implementation match the reference rendering, and what the performance is.
 
Shaders control what to do with each intersection, e.g. do some calculation and general logic, cast additional rays, do color blending, etc., and it makes sense to expose such functionality to developers, because it makes the API fully programmable and allows for all kinds of effects, even those beyond the imagination of the API's developers.

On the other hand, exposing functionality that deals with the hardware implementation of BVH traversal does not seem to bring any benefit to developers. I also find it hard to imagine how an application developer could optimize aspects of the rendering pipeline better than driver developers can. It should be the driver developers' job to select the best execution unit for each ray intersection and to decide what data should be kept in cache.

As far as the developer is concerned, there might as well be a black hole inside the GPU chip with the rendered scene created inside it. All the developer really cares about is that the GPU and its driver implementation match the reference rendering, and what the performance is.
Tell that to Vulkan and DX12. The entire point was that the game/engine has a higher-level view and can reduce unnecessary calls and locks, etc. This is also why Nvidia tended to do better in DX11*: they had more time/$ invested in optimising their drivers per game. It's a fine line between doing as much as possible in the driver and letting someone who has a much broader view of how it runs do the work. If you have chosen one way to do things in the driver and everyone uses it the same way, then it works well. If people want to utilize it differently, then you have to expose it at a lower level. Like I said, it's a fine line, because you could just expose GPU registers and allow game devs full access, but obviously that would make implementation much harder and would be really difficult for people to get started with (not to mention there would be no compatibility at all). It's always hard to figure out where that line should be, as you can see by Vulkan/DX12 walking it back some.
All that said, Microsoft already has its implementation for the next DXR, so regardless of how it's implemented internally it has to follow the same interface (unless they expose it in Vulkan through extensions or something).

* Edit:
Better as in better than the difference in hardware alone would suggest; I am not trying to say that this was the only reason they did better.
 
Casting a ray requires an explicit call from the game via DXR. It's not up to hardware or drivers to decide on the number of rays to cast. Unfortunately noko does not seem to understand how this works.

Yes, I agree. Sorry, I shouldn't have written that post when I was tired. My point got lost in my badly written post. I wasn't agreeing with Noko.
 
Once every console and their programmers deem it a must-do? I think "Yes" is the correct response. What do you think?

I would think yes as well but at this point we'll have to wait and see if the AMD ray tracing method really and truly is more efficient.

Metro Exodus frame analysis shows Tensor core usage.
From what I know, not all games use Tensor cores for denoising. It is really the developer's choice what they use...
Isn't the tensor core usage on that example from DLSS?

Would Tensor core denoising provide any real advantages over doing it with shaders or is it just not worth it?
 
I would think yes as well but at this point we'll have to wait and see if the AMD ray tracing method really and truly is more efficient.




Would Tensor core denoising provide any real advantages over doing it with shaders or is it just not worth it?

If you move denoising off the shaders, the Tensor cores will do that compute, taking a load off the shaders.
 
If you move denoising off the shaders, the Tensor cores will do that compute, taking a load off the shaders.

Makes sense. It would be interesting to see current RTX games get an update to use Tensor denoising and compare the performance versus before.
 
Would Tensor core denoising provide any real advantages over doing it with shaders or is it just not worth it?

Possibly, but denoising using general compute is hard enough and requires intimate knowledge of the game engine. Training a deep learning model to do the same thing is next-level hard, and most developers don't have that type of expertise in-house. All the Tensor core use cases in games so far are implemented by Nvidia, i.e. DLSS.
 
If you move denoising off the shaders, the Tensor cores will do that compute, taking a load off the shaders.
Does it? Or does the shader have to complete the shading first and then pass the result off to be denoised?
 
Yes, AMD's patent describes an approach where traversal over intermediate BVH nodes is controlled by shader code. Unfortunately DXR doesn't support that functionality so it's a moot point unless AMD is going to roll out a custom RT api like they did with Mantle.

Also, this approach would likely be much, much slower given all of the data moving back and forth between the RT engine and shader core during traversal. Think of how slow anisotropic filtering would be if the TMU had to ask the shader core to select each mip level during texture mapping.

Since Microsoft owns and helped design the Xbox Series X, and Sony the PS5, you don't think they will incorporate methods to use the hardware into their APIs? That is ridiculous.

Casting a ray requires an explicit call from the game via DXR. It's not up to hardware or drivers to decide on the number of rays to cast. Unfortunately noko does not seem to understand how this works.

You don't seem to know much about what I do or don't know. I am not sure which statement of mine you are so confused about. Enlighten us with your grace. Shaders are programs in general, if you didn't know.


There are valid use cases for programmable traversal. This Intel paper is a good place to start. But again without API support it's an academic discussion.

https://software.intel.com/content/...ersal-with-an-extended-programming-model.html
More gibberish; it will be supported, if worthwhile, by the designers and makers, as in Microsoft, Sony, and AMD. Of course, they could have wasted millions on hardware design and features and then not supported them. You must know better.
 
Since Microsoft owns and helped design the Xbox Series X, and Sony the PS5, you don't think they will incorporate methods to use the hardware into their APIs? That is ridiculous.

I suggest instead of getting upset you improve your understanding of the things you’re discussing. We don’t know what Microsoft will do in the future. We only know what DXR is today.
 
I suggest instead of getting upset you improve your understanding of the things you’re discussing. We don’t know what Microsoft will do in the future. We only know what DXR is today.
So you don't know; thank you for the clarification. Not upset at all, you're assuming too much again.

Found this, which pretty much aligns with what I came up with. Cyberpunk 2077 may be RDNA2-friendly as far as ray tracing goes:

https://itigic.com/amd-will-introduce-ray-tracing-in-its-console-and-pc-gpus/
 
So you don't know

I never claimed to know how RDNA or PS5 raytracing hardware will work. I leave that to you. We don’t even know if the shader based approach described in the patent will ever see the light of day.

In the same patent it says “Alternatively, shader functionality can be implemented partially or fully as fixed-function, non-programmable hardware external to the compute unit”.

Found this, which pretty much aligns with what I came up with. Cyberpunk 2077 may be RDNA2-friendly as far as ray tracing goes:

https://itigic.com/amd-will-introduce-ray-tracing-in-its-console-and-pc-gpus/

That article offers no new information. It’s speculation based on the patent and no better than the posts in this thread.
 
Found this, which pretty much aligns with what I came up with. Cyberpunk 2077 may be RDNA2-friendly as far as ray tracing goes:
https://itigic.com/amd-will-introduce-ray-tracing-in-its-console-and-pc-gpus/
How could it not if Cyberpunk 2077 will use the same DXR?

This sentence from the article is ridiculous:
AMD’s system is much simpler than NVIDIA’s for developers, since in the case of Huang’s, programmers have to work with at least two different engines (Shader + RT) . Instead, AMD reuses many of its units and buses, where only the new intersection motors are the novelty as such.

Why are humans so stupid? 😖
 
How could it not if Cyberpunk 2077 will use the same DXR?

This sentence from the article is ridiculous:
AMD’s system is much simpler than NVIDIA’s for developers, since in the case of Huang’s, programmers have to work with at least two different engines (Shader + RT) . Instead, AMD reuses many of its units and buses, where only the new intersection motors are the novelty as such.

Why are humans so stupid? 😖
Don't know what you mean by "why are humans so stupid"; you're a human, I take it, so you might have to look in the mirror and figure that one out. I don't think humans are stupid, just wrong 90% of the time ;)

As for DXR, it has been updated and here is what Microsoft said:
“Microsoft and AMD worked closely on the development of the DirectX 12 Ultimate feature set to ensure a great experience with AMD RDNA 2 architecture”
– Bryan Langley, Graphics Group Program Manager, Microsoft
https://community.amd.com/community...suals-with-amd-rdna-2-and-directx-12-ultimate

Obviously DXR did not support AMD raytracing, at least that is how I take it, and the update addressed that while also supporting Nvidia better (which should include Ampere). Now I don't know whether two code paths will be needed, one for the Nvidia method and one for the AMD method, or whether that has been abstracted away. We just have to see the results, which is what really counts for most folks handing over the bucks.
 
Don't know what you mean by "why are humans so stupid"; you're a human, I take it, so you might have to look in the mirror and figure that one out. I don't think humans are stupid, just wrong 90% of the time ;)
Nvidia uses RT cores, CUDA cores, and the TPU for all texture-related stuff. It is even stated in Nvidia's description of how to program DXR.
"AMD raytracing," as you call it, does the same thing. They just do not explicitly call the added BVH intersection-finding functionality an additional "core" but rather a TPU extension.
Both implementations just added BVH intersection-finding functionality and reused existing engines.
Also, DXR explicitly requires the programmer to create geometry structures and to provide shaders that control what to do with a found intersection (or a miss shader in case no intersection was found), so all DXR-compatible hardware needs a similar implementation. Not identical, as there can be any number of differences, but those will only show themselves in performance characteristics.

Then how can the AMD implementation be simpler for developers?
Internal implementation differences will show themselves only in performance characteristics, e.g. one can be faster in some areas than the other, or, stated differently: have bottlenecks in different parts of the rendering pipeline.

Obviously DXR did not support AMD raytracing, at least that is how I take it, and the update addressed that while also supporting Nvidia better (which should include Ampere). Now I don't know whether two code paths will be needed, one for the Nvidia method and one for the AMD method, or whether that has been abstracted away. We just have to see the results, which is what really counts for most folks handing over the bucks.
DXR is an API, and you create hardware and drivers for the API, not the other way around. That holds even if Microsoft works with hardware manufacturers when creating these APIs.

At this point in time, however, we only know that AMD will have a DXR 1.1-capable GPU, and Nvidia claims Turing is DXR 1.1 compatible and even has beta DXR 1.1 drivers. With this information, if we were to assume anything, it would be that there won't be any differences between RDNA2 and Turing ray tracing.

https://news.developer.nvidia.com/directx-12-ultimate-preview/ said:
DirectX Raytracing 1.1
DXR traces paths of light with physics calculations, enabling highly accurate simulations. DXR 1.1 adds three major new capabilities:

  • ExecuteIndirect: GPU work creation now allows ray tracing. This enables shaders on the GPU to invoke and control ray tracing without an intervening round-trip back to the CPU. This is useful for adaptive ray tracing scenarios like shader-based culling, sorting, classification, and refinement.
  • Incremental State Objects: State objects can now be incrementally compiled to enable streaming engines to efficiently add new ray tracing shaders as needed when the player moves around the world and new objects become visible without causing performance issues like stuttering
  • TraceRayInline: Inline ray tracing is an alternative form of ray tracing that gives developers the option to drive more of the ray tracing process, as opposed to handing work scheduling entirely to the system (dynamic-shading). It is available in any shader stage (including compute shaders, pixel shaders etc) which also allows for easier integration into existing game engines. Both the dynamic-shading and inline forms of ray tracing use the same opaque acceleration structures.
ps. Yes, I am a simple, humble human being... for now 🙃
 
Obviously DXR did not support AMD raytracing

That's not obvious at all. DXR 1.1 is a software update. It doesn't change hardware requirements and has nothing to do with "AMD raytracing".

https://devblogs.microsoft.com/directx/dxr-1-1/

"None of these features specifically require new hardware. Existing DXR Tier 1.0 capable devices can support Tier 1.1 if the GPU vendor implements driver support."
 