AMD’s upcoming flagship GPUs should be 3x faster than the RX 6900 XT

A little disappointed that Nvidia is staying monolithic, but yeah, both the AMD and Nvidia flagships are supposed to be 600W+. That is gonna get spicy.
For now. Currently this is AMD's magic sauce that is finally seeing action. Soon it will be the standard for all. 600W AMD? Not even close. Might hit 350W, but they won't need to go pedal to the metal like Nvidia will to hold the crown. Nvidia will show well this next round, but like Intel's 12th gen they will need to tear a page from the Fermi playbook to do so. And it will sell very well of course :D
 
For now. Currently this is AMD's magic sauce that is finally seeing action. Soon it will be the standard for all. 600W AMD? Not even close. Might hit 350W, but they won't need to go pedal to the metal like Nvidia will to hold the crown. Nvidia will show well this next round, but like Intel's 12th gen they will need to tear a page from the Fermi playbook to do so. And it will sell very well of course :D
AMD’s Instinct MI250 cards were announced at 550W; the gaming versions of their cards traditionally have higher boost clocks than their workstation cards, and consequently higher power draw.
 
AMD’s Instinct MI250 cards were announced at 550W; the gaming versions of their cards traditionally have higher boost clocks than their workstation cards, and consequently higher power draw.
Yes traditionally, but no ;)
 
Yes traditionally, but no ;)
So they are both two chiplets, each running 80 CUs, both with similar base clocks, but somehow the consumer one is going to pull half the juice. I just don’t see it.

Based on the benchmarks of the MI250 vs the A100, the MI250 is between 1.6 and 2.8x faster than the A100, except for AI workloads where it falls behind. Launching the 7990x at half that power would manage to edge out a 3090, probably the 3090 Ti as well, but I'm not sure how that would stack up against a 4090.
 
So they are both two chiplets, each running 80 CUs, both with similar base clocks, but somehow the consumer one is going to pull half the juice. I just don’t see it.

Based on the benchmarks of the MI250 vs the A100, the MI250 is between 1.6 and 2.8x faster than the A100, except for AI workloads where it falls behind. Launching the 7990x at half that power would manage to edge out a 3090, probably the 3090 Ti as well, but I'm not sure how that would stack up against a 4090.
The 6900 XT is already equal to the 3090 at less power. So no improvement then? IPC is said to be much better than RDNA 2, plus chiplets, second-gen Infinity Cache, and a node shrink. The whole point of that is that AMD, just like on the CPU side, can compete more efficiently with monolithic dies (CPU/GPU) at a far better TDP. No?
 
The 6900 XT is already equal to the 3090 at less power. So no improvement then? IPC is said to be much better than RDNA 2, plus chiplets, second-gen Infinity Cache, and a node shrink. The whole point of that is that AMD, just like on the CPU side, can compete more efficiently with monolithic dies (CPU/GPU) at a far better TDP. No?
The MI250 is their CDNA 2 MCM chiplet workstation GPU, and in the best case it is 3.05x faster than an A100 when doing multi-grid algebra. Most workloads are only 1.6 - 2.2x faster than the A100. The 6900 XT is 330W vs a 3090’s 360W, so not much less power.

The 6900 XT was basically the consumer MI100; the MI100 performed against the A100 about the same as the 6900 XT does against the 3090. So it would stand to reason that the same would hold true for the MI200/MI250 and the subsequent consumer versions of those cards, both of which are 500 to 600W parts. IPC may be better, but the MI250 tops out at a 1700 MHz boost clock with a 1000 MHz base clock, while the 6900 XT is 1865 MHz base and 2215 MHz boost. So drastically improved IPC, but significantly lower clock speeds.
 
You would hope, but if they use 2x the juice while being 3x faster, then per unit of work that's technically a decrease in electricity and an increase in profit.
Latest rumor is AMD sticks with GDDR6 on a 256-bit bus. Incrementally more bandwidth from higher memory clocks, but not keeping pace with a power increase from 350 to 600-ish watts.
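To put rough numbers on that, here's a quick napkin-math sketch in Python. The 16/18 Gbps GDDR6 speeds, the 64 MH/s Ethash figure for Navi 21, and the 350 W / 600 W power figures are assumptions pulled from the rumors in this thread, not confirmed specs:

```python
# Napkin math for the rumored Navi 31 memory setup vs Navi 21 (all inputs are assumptions).

def gddr6_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: (bus width in bits / 8) * per-pin data rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps

navi21_bw = gddr6_bandwidth_gb_s(256, 16.0)  # 6900 XT: 256-bit @ 16 Gbps ~= 512 GB/s
navi31_bw = gddr6_bandwidth_gb_s(256, 18.0)  # rumor: same bus width, faster GDDR6 (assumed 18 Gbps)

# Ethash is roughly memory-bandwidth bound, so scale a ballpark 6900 XT hashrate by the ratio.
navi21_mh = 64.0                             # assumed ~64 MH/s for Navi 21
navi31_mh = navi21_mh * (navi31_bw / navi21_bw)

navi21_w, navi31_w = 350.0, 600.0            # rumored board power jump

print(f"Bandwidth: {navi21_bw:.0f} -> {navi31_bw:.0f} GB/s (+{navi31_bw / navi21_bw - 1:.0%})")
print(f"Est. hashrate: {navi21_mh:.0f} -> {navi31_mh:.0f} MH/s")
print(f"MH/s per watt: {navi21_mh / navi21_w:.3f} -> {navi31_mh / navi31_w:.3f}")
```

If those assumptions are anywhere close, hashrate only creeps up around 12% while power nearly doubles, which is why it might end up unattractive to miners.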
 
If Raytracing wasn't here to stay, AMD wouldn't be spending money trying to improve theirs.

And it's the intergenerational improvement that's commendable, not where they stand in relation to Nvidia. Raytracing isn't zero sum, everyone benefits when there's competition to do it better and faster. RT has always been the graphical holy grail.

Raja Koduri of Intel was recently spotted watching raytracing videos on YouTube, and then commenting "whoa, that's so cool, how does it do that?" So Intel is probably thinking about RT now as well.

I'm not saying it's not here to stay. I'm saying that the majority of games that are coming out don't have it. And I was responding to someone knocking AMD's effort in relation to Nvidia's as if that in and of itself is going to be a decision factor when purchasing a new card.

The last two games I played were AC: Valhalla and FC6. Neither one was raytraced, and both were AAA titles released in the post-RT era, for example.
 
I'm not saying it's not here to stay. I'm saying that the majority of games that are coming out don't have it. And I was responding to someone knocking AMD's effort in relation to Nvidia's as if that in and of itself is going to be a decision factor when purchasing a new card.

The last two games I played were AC: Valhalla and FC6. Neither one was raytraced, and both were AAA titles released in the post-RT era, for example.
Just pointing out that FC6 does have raytracing...
 
Just pointing out that FC6 does have raytracing...

Well fair enough...



I was watching that, and I guess I just didn't notice in an FPS, as the non-RT version is pretty well done. It's close enough that I wouldn't pick Nvidia over AMD in that title (currently playing with a 3080). In fact, I'd probably rather have the HD textures (which won't run on a 3080) with a 6800/6900 XT than the RT implementation.
 
My 6600 XT is 10 MH/s slower than my 5600 XT was :ROFLMAO:
Perhaps... but my 6600 XT benchmarked ~10% faster in every score than my prior 5700 XT, so I kept the 6600 XT and eBay'd the 5700 XT for stupid money.
The 6600 XT also had lower thermals and burned fewer watts.
 
So they are both two chiplets, each running 80 CUs, both with similar base clocks, but somehow the consumer one is going to pull half the juice. I just don’t see it.



The latest is 60 WGPs, or Work Group Processors, which I think is 120 CUs total. The base chiplets have 40 WGPs at a minimum, which means they're disabling at least 1/4 of the WGPs across the two dies. They've got to be cherry-picking individual work groups for efficiency. With that, plus a tweaked architecture and a new process, it seems realistic. Also, the rumor is that they're running at 2.5 GHz, up from 2.25. I believe 3x performance is attainable.

The big implication here is that there's a possibility of up to an 80 WGP part, with all 40 WGPs per die working, for the "insaner" part. Even if that's not possible for yield or, more realistically, power reasons, I could see them doing something like 36 WGPs on both dies for maybe 15 percent gains at 20 percent increased power consumption? Something in there.
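For a sense of scale, here's the napkin math behind that 3x claim as a small Python sketch; the WGP counts and clocks are the rumored figures from this thread and should be treated as assumptions:

```python
# Naive scaling check for "3x over the 6900 XT" using rumored figures (assumptions, not specs).

navi21_wgps, navi21_clock_ghz = 40, 2.25   # 6900 XT: 80 CUs = 40 WGPs at roughly 2.25 GHz
navi31_wgps, navi31_clock_ghz = 60, 2.50   # rumor: 60 WGPs (120 CUs) at roughly 2.5 GHz

# If performance scaled perfectly with unit count and clock, with no architectural changes:
raw_scaling = (navi31_wgps / navi21_wgps) * (navi31_clock_ghz / navi21_clock_ghz)

# Per-WGP (IPC/architecture) uplift that would still be needed to actually reach 3x:
target = 3.0
needed_per_wgp_uplift = target / raw_scaling

print(f"Unit count x clock scaling alone: {raw_scaling:.2f}x")                         # ~1.67x
print(f"Per-WGP uplift still needed for {target:.0f}x: {needed_per_wgp_uplift:.2f}x")  # ~1.8x
```

Units and clocks alone only get you to roughly 1.67x, so the remaining ~1.8x would have to come from RDNA 3's architectural changes (the "tweaked architecture" mentioned above).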
 
So they'll still be one generation behind NVIDIA with ray tracing performance.
I think that's a pretty good assumption considering NV started out a generation ahead on RT and seems to be putting more of an emphasis on it beyond "our cards are capable of doing it in an API-compliant way"

I think my predicted specs are already debunked, so that part of the post did not age well, but I stand behind the pessimistic RT performance prediction.


I expect they'll brute-force their way to decent ray tracing with the 7000 series and add accelerators in the 8000 series. By that time, Nvidia will also be on chiplets, at which point I kind of expect both companies to handle ray tracing with dedicated parts. Not dual GPUs, but rather discrete chips for different functions, all under the same heatspreader.

That's just a guess, but it lines up with all the rumors and patent filings.
I'm curious what you meant by that first sentence...
Do you mean add more accelerators in the 8000 series? I thought the Ray Accelerators AMD has been talking about in the 6000 series were dedicated BVH intersection-solving IP inside the CUs, offering a similar kind of acceleration to NV's RT Cores, even if the actual performance/efficiency is lower. Feel like I'm missing something here...
 


The latest is 60 WGPs, or Work Group Processors, which I think is 120 CUs total. The base chiplets have 40 WGPs at a minimum, which means they're disabling at least 1/4 of the WGPs across the two dies. They've got to be cherry-picking individual work groups for efficiency. With that, plus a tweaked architecture and a new process, it seems realistic. Also, the rumor is that they're running at 2.5 GHz, up from 2.25. I believe 3x performance is attainable.

The big implication here is that there's a possibility of up to an 80 WGP part, with all 40 WGPs per die working, for the "insaner" part. Even if that's not possible for yield or, more realistically, power reasons, I could see them doing something like 36 WGPs on both dies for maybe 15 percent gains at 20 percent increased power consumption? Something in there.

We'll see, but if the Instinct MI250 is only doing 1700 MHz at 550 watts, then I dread what that would pull at 3 GHz.
 
I'm curious what you meant by that first sentence...

I mean a completely separate die dedicated to ray tracing.

Navi 31 will have at least 3 dies, maybe 4. Two GPU dies, and some kind of interconnect die that may or may not hold the cache. With Navi 31, ray tracing is handled by the GPU dies.

AMD is already saying that with Zen 5 they will have "accelerator" dies, for things like machine learning, in addition to their core chiplets and an interconnect die or interconnect/cache die. I'm saying I think there's a chance that they're planning to do the same thing with their GPUs, and hand off ray tracing to a separate part on the same board. Nvidia has patents for an off-die ray tracing chip, and some people expected to see it with the RTX 3000 series (because of the wonky cooler).

By the time we get to Hopper and RDNA 4, ray tracing performance could be powered by a completely independent part of the GPU.

As far as Navi 31 goes, by "brute force" I mean these parts are just clocked crazy high and they are piling on the WGPs, so even if ray tracing efficiency is, say, 80 percent of Nvidia's, they just have more cores and more clocks and will get the ray tracing done because of the amount of power they're able to dedicate to it.
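A toy illustration of that brute-force argument in Python; the efficiency, unit-count, and clock ratios here are made-up placeholders, not estimates of the actual parts:

```python
# Toy model: weaker per-unit RT can still win on total throughput with enough units and clocks.

per_unit_rt_efficiency = 0.80  # assume each AMD unit does 80% of the RT work of an Nvidia unit
unit_count_ratio = 1.5         # assume 50% more RT-capable units than the competing part
clock_ratio = 1.1              # assume 10% higher clocks

relative_rt_throughput = per_unit_rt_efficiency * unit_count_ratio * clock_ratio
print(f"Relative RT throughput vs the competing part: {relative_rt_throughput:.2f}x")  # ~1.32x
```

With those made-up ratios the total still comes out ahead (~1.32x) despite the per-unit deficit, at the cost of the power it takes to feed all of those units.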
 
I mean a completely separate die dedicated to ray tracing.

Navi 31 will have at least 3 dies, maybe 4. Two GPU dies, and some kind of interconnect die that may or may not hold the cache. With Navi 31, ray tracing is handled by the GPU dies.

AMD is already saying that with Zen 5 they will have "accelerator" dies, for things like machine learning, in addition to their core chiplets and an interconnect die or interconnect/cache die. I'm saying I think there's a chance that they're planning to do the same thing with their GPUs, and hand off ray tracing to a separate part on the same board. Nvidia has patents for an off-die ray tracing chip, and some people expected to see it with the RTX 3000 series (because of the wonky cooler).

By the time we get to Hopper and RDNA 4, ray tracing performance could be powered by a completely independent part of the GPU.

As far as Navi 31 goes, by "brute force" I mean these parts are just clocked crazy high and they are piling on the WGPs, so even if ray tracing efficiency is, say, 80 percent of Nvidia's, they just have more cores and more clocks and will get the ray tracing done because of the amount of power they're able to dedicate to it.
Ah, got it. I have also been following the "dedicated RT hardware on a separate die / chip" rumors but I misinterpreted your original comment as saying that current and next-gen Radeon don't have any RT acceleration at all- thanks for the clarification!
 
Latest rumor is AMD sticks with GDDR6 on a 256-bit bus. Incrementally more bandwidth from higher memory clocks, but not keeping pace with a power increase from 350 to 600-ish watts.
I don’t see how swapping out the HBM2e for GDDR6 is going to shave 200+ watts out of the power profile, especially if the consumer version of this card is supposed to hit a boost clock of 3 GHz, up from the 1.7 GHz of the MI250.

Just saying there’s a lot of rumours going around about how RDNA 3 is going to be the greatest thing since Jesus conjured sliced bread, and based on what they’ve announced for workstations and data centres, a lot of those rumours don’t hold water.
 
I don’t see how swapping out the HBM2e for GDDR6 is going to shave 200+ watts out of the power profile, especially if the consumer version of this card is supposed to hit a boost clock of 3 GHz, up from the 1.7 GHz of the MI250.

Just saying there’s a lot of rumours going around about how RDNA 3 is going to be the greatest thing since Jesus conjured sliced bread, and based on what they’ve announced for workstations and data centres, a lot of those rumours don’t hold water.
Yeah, I'm speculating on the rumors that Navi 31 will be a 600-ish watt card with a similar memory bus to Navi 21. Incrementally higher MH/s from higher memory clocks but much higher power draw should hopefully make it unattractive to miners.
 
The MI250 is their CDNA 2 MCM chiplet workstation GPU, and in the best case it is 3.05x faster than an A100 when doing multi-grid algebra

<snip>

I don't know how much we can really compare RDNA vs CDNA/GCN, they're two different product lines with different features. It's been that way since MI60 (GCN)/MI100 (CDNA1) and RX5000 (RDNA1)/RX6000 (RDNA2).
 
Yeah, I'm speculating on the rumors that Navi 31 will be a 600-ish watt card with a similar memory bus to Navi 21. Incrementally higher MH/s from higher memory clocks but much higher power draw should hopefully make it unattractive to miners.
Winning!
 
I’d be happy if my next graphics card used less than the 400+W my 3090 does. If it ended up using more, I’m gonna have to do some thinking.

Sick of how hot my study gets
 
I’d be happy if my next graphics card used less than the 400+W my 3090 does. If it ended up using more, I’m gonna have to do some thinking.

Sick of how hot my study gets
Yeah, my next build is gonna be something I can do on a 650W PSU with room to spare. Summers get too hot in here, and with new windows and siding (which are on my spring to-do list) I shouldn't have to heat my office in the winter.

I'm happy at 1440p, currently handled by a 2080 Ti; I figure the 60-class of the 2022/2023 parts should handle that with ease.
 
For now. Currently this is AMD's magic sauce that is finally seeing action. Soon it will be the standard for all. 600W AMD? Not even close. Might hit 350W, but they won't need to go pedal to the metal like Nvidia will to hold the crown. Nvidia will show well this next round, but like Intel's 12th gen they will need to tear a page from the Fermi playbook to do so. And it will sell very well of course :D
I thought there was some EU plug-in GPU power limit of ~350W.
 
For now. Currently this is AMD's magic sauce that is finally seeing action. Soon it will be the standard for all. 600W AMD? Not even close. Might hit 350W, but they won't need to go pedal to the metal like Nvidia will to hold the crown. Nvidia will show well this next round
The 6900 XT is already equal to the 3090 at less power. So no improvement then? IPC is said to be much better than RDNA 2, plus chiplets, second-gen Infinity Cache, and a node shrink. The whole point of that is that AMD, just like on the CPU side, can compete more efficiently with monolithic dies (CPU/GPU) at a far better TDP. No?

You are forgetting something. Nvidia's Ampere cards are on what is basically a 10nm process, while AMD's RDNA 2 cards are on a 7nm+ process. Even being a whole node ahead of Nvidia and on a much more power-efficient process, the difference in power consumption isn't that massive. If both Ampere and RDNA 2 were on the same process, I wonder which would be using more power? And since, going by rumours, they are both using TSMC's 5nm process for their next-gen cards, it will be interesting to see which one is better power-wise.
 
Wake me up when MSRP is back to reality and they are 3x more available than Nvidia.
 
Wake me up when MSRP is back to reality and they are 3x more available than Nvidia.
AMD and high availability are not two things that will be in the same sentence for a long time. They are committed to Sony, Microsoft, Valve, Tesla, and at least two supercomputer projects. That, paired with their further attempts to get in on lucrative OEM contracts, and compounded by TSMC's limited 5nm fab capacity (half of their 7nm), means their ability to get parts to the retail channel is decreased, not increased.
 
AMD and high availability are not two things that will be in the same sentence for a long time. They are committed to Sony, Microsoft, Valve, Tesla, and at least two supercomputer projects. That, paired with their further attempts to get in on lucrative OEM contracts, and compounded by TSMC's limited 5nm fab capacity (half of their 7nm), means their ability to get parts to the retail channel is decreased, not increased.


Disagree. Not every product they currently have on 7nm will move to 5nm. All the Sony/Microsoft/Valve parts, plus the Navi 23 Tesla parts, will stay on 7nm.

High-end CPU chiplets and the GPU chiplets will move to 5nm. The I/O chiplets will either stay on 14nm or, more likely, move to 7nm.

The rumors are 3D-stacked Infinity Cache for the high-end parts; if that's correct, it will be done on the same 7nm as the stacked cache on Milan.

Very few parts will need to move to 5nm, but they'll be discontinuing large 7nm chips like Navi 21, allowing them to build more I/O and cache chiplets.

In addition, TSMC is still ramping up 5nm, 7nm, and even 28nm capacity.

CPU availability has been great for a while; the move to MCM for GPUs will likely have a similar effect.
 
There are too many ways to twist the data.

The "3 times faster" could be referring solely to RT performance, which would take it from ~12 fps at 4K RT Ultra in Cyberpunk on a highly overclocked 6900 XT to 36 fps. Certainly an improvement, but nowhere near as exciting as 3x overall...

Like everyone else, I view early leaks and releases with a great deal of skepticism.

We will see when the time comes.
 
There are too many ways to twist the data.

I agree, but I still think AMD is going to take the crown this round. Remember, their flagship is rumored to use two cut-down GPU dies. If that's the case, they've still got a quarter of the silicon left to light up if they need to, even if their solution ends up requiring dual PSUs.

Hey, they could call it the new Crossfire!
 
I agree, but I still think AMD is going to take the crown this round. Remember, their flagship is rumored to use two cut-down GPU dies. If that's the case, they've still got a quarter of the silicon left to light up if they need to, even if their solution ends up requiring dual PSUs.

Hey, they could call it the new Crossfire!

It really depends. If it really is a fancy new technology allowing multiple dies to present themselves to the OS as a single GPU, using some sort of new low-latency interconnect similar to what is done with chiplets on the CPU side, it could truly revolutionize things.

If it's just another AFR Crossfire/SLI implementation, then it will be mostly useless, just like SLI and Crossfire have been in the past.
 
The thing in which I remain very interested: how they allocate the dies.
Traditional raster goodies, tensor/AI/compute, raytracing.

NV has been pushing all the goodies, so they have to support that, and improve it. But then Game X comes out, doesn't support DLSS or raytracing (or does so aggressively minimally, like Far Cry 6) - and now you look like a chump.

So very happy this isn't my problem to solve.
<TF2 Engineer "I solve practical problems" speech>
 
You are forgetting something. Nvidia's Ampere cards are on what is basically a 10nm process, while AMD's RDNA 2 cards are on a 7nm+ process. Even being a whole node ahead of Nvidia and on a much more power-efficient process, the difference in power consumption isn't that massive. If both Ampere and RDNA 2 were on the same process, I wonder which would be using more power? And since, going by rumours, they are both using TSMC's 5nm process for their next-gen cards, it will be interesting to see which one is better power-wise.
https://www.techpowerup.com/269347/...gpus-built-on-samsung-8nm-instead-of-tsmc-7nm

https://www.techpowerup.com/266351/...w-density-metric-for-semiconductor-technology

Comparing apples to oranges to bananas, Samsung 8nm and TSMC 7nm are roughly in the same density class as Intel 10nm. Navi 2x and Ampere are at rough process parity.
 
Yeah, I saw some electron microscope shots comparing TSMC 7nm against Intel 10nm. The sizes of similar features were almost identical.
I guess that's why Intel has started naming their processes with ambiguous names.
Kinda like how the Socket 939 Athlons stopped using the actual MHz values.
 
If it really is a fancy new technology allowing multiple dies to present themselves to the OS as a single GPU using some sort of fancy new low latency interconnect, similar to what is done with chiplets on the CPU side, it could truly revolutionize things.
My assumption, based on rumors and my own speculation, is that it is transparent to the software. So from an API perspective it IS one GPU, but the hardware and driver do the magic under the hood.

Which, if true, would be a huge revolution, as big as multi-core CPUs. We could essentially double (or more, depending on how many chips they pack in there) framerates without any developer work or the traditional problems of AFR rendering.
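Purely to illustrate how that differs from AFR, here's a hypothetical sketch of a driver splitting one frame's screen tiles across dies; this is speculation about the general idea, not how AMD's hardware or driver actually works:

```python
# Hypothetical: split one frame's screen tiles across GPU dies so they share the same frame,
# instead of AFR, where whole frames alternate between GPUs (and add latency/pacing issues).

def split_tiles_across_dies(tiles_x: int, tiles_y: int, num_dies: int) -> dict[int, list[tuple[int, int]]]:
    """Assign screen tiles to dies round-robin; the API/application still sees a single GPU."""
    assignment: dict[int, list[tuple[int, int]]] = {die: [] for die in range(num_dies)}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            die = (ty * tiles_x + tx) % num_dies
            assignment[die].append((tx, ty))
    return assignment

# A 4K frame cut into 64x64 tiles is roughly a 60x34 grid; two dies each take half the tiles.
work = split_tiles_across_dies(60, 34, num_dies=2)
print({die: len(tiles) for die, tiles in work.items()})  # {0: 1020, 1: 1020}
```

The real scheduling would obviously be far smarter than round-robin, but the point is that both dies contribute to every frame, which is what would make it look like one GPU to games.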
 
My assumption, based on rumors and my own speculation, is that it is transparent to the software. So from an API perspective it IS one GPU, but the hardware and driver do the magic under the hood.

Which, if true, would be a huge revolution, as big as multi-core CPUs. We could essentially double (or more, depending on how many chips they pack in there) framerates without any developer work or the traditional problems of AFR rendering.
This could be a step in the direction of lifelike VR... imagine UE5 detail in a VR experience (given the displays are capable).
 
Disagree. Not every product they currently have on 7nm will move to 5nm. All the Sony/Microsoft/Valve parts, plus the Navi 23 Tesla parts, will stay on 7nm.

High-end CPU chiplets and the GPU chiplets will move to 5nm. The I/O chiplets will either stay on 14nm or, more likely, move to 7nm.

The rumors are 3D-stacked Infinity Cache for the high-end parts; if that's correct, it will be done on the same 7nm as the stacked cache on Milan.

Very few parts will need to move to 5nm, but they'll be discontinuing large 7nm chips like Navi 21, allowing them to build more I/O and cache chiplets.

In addition, TSMC is still ramping up 5nm, 7nm, and even 28nm capacity.

CPU availability has been great for a while; the move to MCM for GPUs will likely have a similar effect.
Apple alone has something like 50% of TSMC's 5nm output, Intel has a significant portion as well, and then pile in Nvidia, AMD, and the numerous Chinese companies launching their own GPUs on TSMC 5nm next year: TSMC's 5nm is far more constrained than their 7nm, regardless of AMD not moving everything over. Yes, TSMC is ramping up 5nm, with an expected completion date of 2025 for the Arizona and Germany facilities. The transition to MCM is not the silver bullet you think it is. MCM simplifies things and allows for easier-to-build designs with fewer failures, and that's great, but you still need silicon to print them on, and that is where AMD is going to be hurting for supply.

AMD has settled very comfortably into the position of a boutique hardware designer: they make limited-run, high-performance parts at maximum margin. It's a good thing for them; they will be making more parts than ever, but don't expect them to be overly present in retail channels.
 
AMD and high availability are not two things that will be in the same sentence for a long time. They are committed to Sony, Microsoft, Valve, Tesla, and at least two supercomputer projects. That, paired with their further attempts to get in on lucrative OEM contracts, and compounded by TSMC's limited 5nm fab capacity (half of their 7nm), means their ability to get parts to the retail channel is decreased, not increased.
no problem. back to sleep
 