Rumor: 3080 31% faster than 2080 Ti

"The founder of NVIDIA, clarified in an interview with the media that most orders for 7nm process products will still be handed over to TSMC in the future, and Samsung will only get a small number of orders."

Makes me wonder if we will see a debacle similar to the one Apple had with their A9 chips in the iPhone 6S where people were complaining because they got the phones with the Samsung manufactured chips which didn't perform as well in battery life tests.

(And of course Apple responded by pulling the battery life test app from the App Store :p typical Apple. The less the customer knows the better)

In that generation it appeared as if TSMC's process was better, but I have no idea where we stand this gen.

If Nvidia plans on using both in the same spec GPU, whichever chip overclocks better on average is certainly going to get a fan following, and whichever performs worse, buyers are going to feel snubbed.
 
Makes me wonder if we will see a debacle similar to the one Apple had with their A9 chips in the iPhone 6S where people were complaining because they got the phones with the Samsung manufactured chips which didn't perform as well in battery life tests.

(And of course Apple responded by pulling the battery life test app from the App Store :p typical Apple. The less the customer knows the better)

In that generation it appeared as if TSMC's process was better, but I have no idea where we stand this gen.

If Nvidia plans on using both in the same spec GPU, whichever chip overclocks better on average is certainly going to get a fan following, and whichever performs worse, buyers are going to feel snubbed.

It's extremely unlikely that Nvidia will tape out the same chip at two different fabs. There's no reason to do so.
 
Lol, what? So because some people cannot afford something due to covid, they should just lower prices??? So we can expect to see $10,000 cars too?
Actually, if supply far exceeds demand for whatever reason, prices tend to drop; it's just a matter of whether or not the manufacturers want to wait it out and then cash in. For instance, used car prices did in fact plummet; new car prices not so much, mostly because there is a limit to how much they can cut the price. Back to video cards: I don't see these being cheap due to COVID-19. Manufacturers will wait it out until they can sell at full MSRP. Plus there's always the secondary-market factor, where greedy fuckers buy at MSRP and, when there's no supply available, end up selling over MSRP for a profit.
 
There's no chance the 3090/3080 Ti will be made by Samsung; that's going to be a TSMC exclusive. The rest of the line, however, is up for grabs. But AMD is getting preferential allocation from TSMC, last I heard, so Navi will be 100% TSMC, which may play out in AMD's favor this generation.
 
You're not putting any stock into this somewhat official report from a year ago? Of course we also have word directly from the horse's mouth that most of the business is still going to TSMC.

http://www.koreaherald.com/view.php?ud=20190702000692

“It is meaningful that Samsung Electronics’ 7-nanometer process would be used in manufacturing our next-generation GPU,” said Yoo. “Until recently, Samsung has been working really hard to secure (partners) for foundry cooperation.”
While Yoo did not reveal the exact amount of foundry production by Samsung, he acknowledged that production would be “substantial.” He declined to comment on whether Nvidia is planning to make further requests to Samsung for foundry production.


https://finance.technews.tw/2019/12/20/nvidia-7nm-for-tsmc/

"The founder of NVIDIA, clarified in an interview with the media that most orders for 7nm process products will still be handed over to TSMC in the future, and Samsung will only get a small number of orders."
Samsung's 7nm process is still not operational; they are still in the fault-testing phases and aren't expecting to take orders until 2021. And the designs developed for TSMC will not work on Samsung's process, so they would have to have distinctly different product stacks built by each.
 
There's no chance the 3090/3080 Ti will be made by Samsung; that's going to be a TSMC exclusive. The rest of the line, however, is up for grabs. But AMD is getting preferential allocation from TSMC, last I heard, so Navi will be 100% TSMC, which may play out in AMD's favor this generation.
Not exactly preferential allocation, but AMD is physically producing more chips there between their GPUs, APUs, and CPUs, and they have agreements with Microsoft and Sony to meet quotas and demand, so TSMC is bound to them as well. So while not exactly preferential, TSMC is under the gun from people with large legal teams and deep pockets, and has every incentive to make sure the chips for the Xbox and PS5 are rolling off the lines like mad. Depending on popularity, that could actually hurt the availability of AMD's other products, since TSMC still has to meet its other customers' needs while only having something like a 1,000-wafer-per-day production limit.

This makes me think that the 3000 series is going to have smaller dies compared to the 2000 series, simply because they will need more good dies per wafer to even have a hope of meeting demand.
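As a rough illustration of that yield argument, here's a sketch using the common gross-dies-per-wafer approximation and a simple Poisson yield model. The die areas and defect density below are placeholders chosen for illustration, not actual Turing or Ampere figures.

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Common approximation for gross dies on a round wafer, accounting for edge loss."""
    d = wafer_diameter_mm
    return int(math.pi * (d / 2) ** 2 / die_area_mm2
               - math.pi * d / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Simple Poisson yield model: fraction of dies that land with zero defects."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)

# Placeholder sizes: a huge Turing-class die vs. a hypothetical smaller 7nm die.
for name, area in [("~750 mm2 die", 750), ("~500 mm2 die", 500)]:
    gross = dies_per_wafer(area)
    good = gross * poisson_yield(area, defects_per_cm2=0.1)
    print(f"{name}: ~{gross} gross dies, ~{good:.0f} good dies per wafer")
```

With these made-up numbers the smaller die already fits roughly 60% more candidates per wafer, and the yield gap compounds that, which is the "more good dies per wafer" point above.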
 
This makes me think that the 3000 series is going to have smaller dies compared to the 2000 series, simply because they will need more good dies per wafer to even have a hope of meeting demand.


Would be interesting if we get to see the GPU version of chiplets.

You know, small compute units with like 8 or 16 cores per chiplet, a memory controller on a cheaper older process (like GloFo 12nm).

Would be even cooler if AMD just let board partners mix and match: add as many CU chiplets as you want for scalable performance to meet the needs of the design.

You could see some cool specialty low-power models, as well as some really crazy one-off extreme 500 W designs meant only for water cooling.

It would be lots of fun to have a few standard recommended configurations, and then give board partners the flexibility to make their own and see what they come up with.
 
Were Turing cards reliable, or were they prone to malfunction?

If a new card is so powerful that it's capped at ~80 fps at 1080p, would the fans not spin, or barely run? Maybe not in RDR2 or Control, but in some games.
 
Would be interesting if we get to see the GPU version of chiplets.

You know, small compute units with like 8 or 16 cores per chiplet, a memory controller on a cheaper older process (like GloFo 12nm).

Would be even cooler if AMD just let board partners mix and match: add as many CU chiplets as you want for scalable performance to meet the needs of the design.

You could see some cool specialty low-power models, as well as some really crazy one-off extreme 500 W designs meant only for water cooling.

It would be lots of fun to have a few standard recommended configurations, and then give board partners the flexibility to make their own and see what they come up with.
All three have GPU versions in the works; the problem is that the graphics pipeline is not multi-thread friendly. So with current engines you get the same problem early multi-core CPUs had: one core at 100% and the rest in the 10-15% range. Nvidia has some great patents on how to fix this and uses them in their Tesla cards, but it's still at least two generations out before it is ready for the consumer level.
 
All three have GPU versions in the works; the problem is that the graphics pipeline is not multi-thread friendly. So with current engines you get the same problem early multi-core CPUs had: one core at 100% and the rest in the 10-15% range. Nvidia has some great patents on how to fix this and uses them in their Tesla cards, but it's still at least two generations out before it is ready for the consumer level.

Not quite right.

GPU pipelines are the very definition of highly parallelized multithreaded loads. That's how they work.

The problem is that multigpu setups to date have presented the GPU's as two separate GPU's to the operating system.

If - instead - they were linked with high speed interconnects - like what AMD is now doing in their CPU's - multiple chiplets could present themselves to the OS as a single larger GPU with more cores.

There would be no need for multi-GPU game/OS tricks because you would essentially have a single GPU, just spread out over many chiplets.
 
I'd be very happy if there was a 30% increase in the x80 category, as the 2080 ($699) had no performance increase whatsoever over the equivalently priced 1080 Ti ($699). Here I am 4 years after my initial purchase, and pretty much the only better "consumer" GPU is the 2080 Ti, which is 50% more expensive for 30% more performance.

When I look back over my GPUs over the past 4 years, this is precisely what depresses me as well. 1080 ti to 2080 to Radeon VII to 2080 Super... all within 10% of one another, all $700-800.

The price/performance needle hasn't moved one bit and I strongly suspect it's not going to budge with Ampere. nV can layer all the semi-useless features they want, 1080 ti-level performance will still cost $700 in November.
 
Will be interesting to see what the cost premium is going to be between the two new mainstream Ampere offerings, the 3080 and the 3090 (aka 3080Ti). I hope I am wrong, but I predict the 3080 will be $1200 (exceeding the 2080Ti by +30%), and the 3090 will be $1500 (exceeding the 2080Ti by +45%). I doubt they'll come in much cheaper than that, but if they do, that'll be a nice surprise. And I see a 3070 (matching the 2080Ti in performance) coming in at roughly $900 soon after. The next-gen Titan offering will probably come in at something absurd like $2199.
 
What the hell is wrong with some of you people? 31% would be absolutely pathetic. Even Turing did that and that was pretty much a joke. I guess some of you are just absolutely forgetting the jump the 980ti had over the 780ti and then the 1080ti jump over the 980ti. Both of those were 75% or better.

Please enlighten us with performance numbers?
Pascal was an anomaly performance-wise, but suddenly everyone has forgotten the past and makes the same false claims.
(Hint: The numbers have been posted several times on this forum...)
 
From all indications, this generation's leap will be large for Nvidia, and necessarily so. Last gen had no competition; this gen, like the 1080 Ti's gen, has to worry about AMD. Nvidia will be balls to the wall for sure, and AMD will be as well. Nvidia takes the crown for sure, but the battle for $500-800 will be strong!
 
What the hell is wrong with some of you people? 31% would be absolutely pathetic. Even Turing did that and that was pretty much a joke. I guess some of you are just absolutely forgetting the jump the 980ti had over the 780ti and then the 1080ti jump over the 980ti. Both of those were 75% or better.

I know people keep telling you that NVIDIA had small jumps traditionally but if I recall Kepler to Maxwell was also a pretty huge jump:
https://static.techspot.com/articles-info/1191/bench/Comparison_01.png

Going from the 780 Ti to the 980 Ti yielded nearly a 51% performance gain there, and the 680 to the 780 Ti was still a respectable 44% gain, so they were definitely above the 30% threshold. Although it's not represented in that graph, I think 980 Ti to 1080 Ti was a >50% jump. So yeah, a 30% gain would be very anemic for a 7nm die shrink plus more cores. Surely they didn't dump all that extra space into RT, at least I hope they didn't.

I still think we're going to see a 40%+ average gain going from the 2080 Ti to the 3090 or whatever the top-end card is, and maybe even more at 4K. Because if NVIDIA can't pull that off, then going by RDNA 2 performance in the Xbox, the desktop version should be an absolute beast and NVIDIA will have some serious competition. If they're neck and neck and NVIDIA costs $100-$150 more, I'll unfortunately have to pay the NVIDIA tax because I don't want to deal with AMD driver headaches. Plus things like Shadowplay and NVIDIA filters have spoiled me, so there's no way I could switch and lose those.
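Just to make the percentage arithmetic explicit, here's a trivial helper; the FPS figures are placeholders picked to land in that ballpark, not the actual TechSpot averages.

```python
def gain(new_fps: float, old_fps: float) -> float:
    """Relative performance gain of a new card over an old one."""
    return new_fps / old_fps - 1

# Placeholder average-FPS figures, purely to show how chart percentages are derived.
gtx680, gtx780ti, gtx980ti = 50.0, 72.0, 108.5
print(f"680    -> 780 Ti: {gain(gtx780ti, gtx680):+.0%}")    # ~+44%
print(f"780 Ti -> 980 Ti: {gain(gtx980ti, gtx780ti):+.0%}")  # ~+51%
# Two consecutive 30% jumps compound to +69% over two generations:
print(f"Two 30% jumps back to back: {1.3 * 1.3 - 1:+.0%}")
```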
 
I never thought I'd be living in a world where 31% performance increase across generations for the same cost would be considered fantastic, but compared to the last gen this would be true. Forget the model naming schemes that nV tries to confound us with, forget the unsupported "gaming" features, forget about the novelty of having "new" technology... we got near zero improvement in price/performance with Turing. If 3080 gives us 31% better performance than 2080 ti for $900, that would be (relatively) fantastic.

What we'll get, undoubtedly, is more along the lines of a 10% price/performance improvement with Ampere. I'm guessing 3080 will equal 2080 ti in performance and will be MSRP (for whatever MSRP is worth anymore) for $900.
 
I think I paid $400'ish for a Geforce 2 Ultra back in the day .. if that's any consolation o_O
 
Call me skeptical, but I think the marketing fluff and fanboy magical thinking about the console's APU isn't going to translate into the assumed 2080 levels of performance.

Otherwise why would AMD give them away at cost to cheapskate Sony/MS - who are trying to shave every last dollar from their BOMs - when instead they could fast track the chip into the discrete/retail market and completely disrupt the Nvidia party? Doesn't add up.
I was purely going on the specs, but sure, fanboy magical thinking... I very rarely touch consoles unless I'm playing Minecraft with my daughter split-screen, which hasn't happened for a while since I set up our own local Minecraft server so we can all jump on and play together on our PCs.

The 5700 XT is about 15% slower than a standard 2080 (it varies heavily by game, though), so I don't think a 15% gain in performance would be unreasonable. Of course I could be completely wrong, but at this point it's all speculation. I don't think it'll hit 2080 Ti levels, and it will fall well short of Ampere's upper-end products.

Just as an example, the 5700 XT has 40 compute units. The Xbox Series X is going to have 52 CUs of a newer architecture running at similar clocks (-100 MHz?). If the architecture alone brought zero increase in compute/raster performance, it still has a 23% increase in CUs. As long as RDNA 2 is at least on par with RDNA, that should be enough to make up the 15% difference between the 5700 XT and the 2080 (given that adding CUs isn't perfect scaling). Again, lots of unknowns, but it's within reason if you actually look at the specs and break things down; no marketing magic or fanboyism needed.

The PS5 will have fewer CUs (36 vs 40) but will run at a higher frequency. With a 0% architectural improvement it may well fall a little short; with anything like the difference between Vega and RDNA it'll easily surpass it. In reality it'll probably land somewhere between those two numbers, so I think 10-15% is reasonable.

The jump from Vega to Navi was pretty substantial: they went from 64/56 CUs down to 40 and gained performance, partly from higher clocks and partly from a better architecture. They both have the same memory bandwidth, so that was purely frequency and architecture. So I still think the 2080 is about where these will fall, give or take a few percent.
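A minimal sketch of that back-of-envelope estimate, assuming throughput scales with CU count raised to a sub-linear exponent times clock; the exponent, architecture factor, and clock figures here are my own rough guesses, not official numbers.

```python
def relative_throughput(cus: int, clock_mhz: float, arch_factor: float = 1.0,
                        scaling_exponent: float = 0.9) -> float:
    """Back-of-envelope GPU throughput estimate.

    cus              -- compute unit count
    clock_mhz        -- sustained clock in MHz
    arch_factor      -- assumed per-CU gain of the newer architecture (1.0 = none)
    scaling_exponent -- < 1.0 models imperfect scaling when adding CUs
    """
    return (cus ** scaling_exponent) * clock_mhz * arch_factor

# Rough, unofficial numbers: 5700 XT at 40 CUs / ~1900 MHz vs. a 52 CU console
# GPU at ~1825 MHz, with no assumed per-CU architectural gain.
rx5700xt = relative_throughput(40, 1900)
console  = relative_throughput(52, 1825)
print(f"Estimated console GPU vs 5700 XT: {console / rx5700xt - 1:+.0%}")
# Prints roughly +22% with these guesses, i.e. around 2080 territory if the
# 5700 XT sits ~15% behind a 2080.
```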

It does add up: they make money from the consoles, maybe not as much as they'd like, but they aren't going to turn down free publicity and extra R&D that's paid for by someone else. Consoles are also high volume, with very large initial purchase orders, and AMD knows they can produce this part for the next 4 years and sell millions of them. The PS4 had sold 106 million units (total sales) as of the beginning of this year; the Xbox One was 'only' 46 million. It's very hard to find sales figures for regular GPUs, but I imagine they don't sell that many of any single generation (and even if they can or do, why not do both?). AMD wants to be front and center in the consoles because it helps them in a few ways: sales, name recognition, and optimizations (developers optimize for the console's AMD GPU, which helps them understand how to get the most out of those GPUs), to name a few. Sure, I'd have loved RDNA 2 to drop earlier this year, but I'd also have loved Ampere to come quicker.

All this said, I won't even be buying a console (at least not at release; I tend to end up getting them once they go on sale and/or hit the used market). I haven't bought a console at release since... ever. I will be upgrading 3-4 of my desktops later this year (and possibly my home server) when RDNA 2, Ampere, and Zen 3 come out. Not sure what I'll be putting in them yet, because we have no clue what performance or prices will be, and with as many desktops as I upgrade at a time it's normally not the highest-end parts, so perf/$ reigns supreme in my house. Anyway, I understand being skeptical; things get hyped and then let down over and over. I still think it's reasonable that going from 40 CUs of an older architecture up to 52 CUs of a newer architecture should give a 10-15% improvement, which would make the 2080 a reasonable guesstimate. Nobody has a crystal ball; I'm just going on rough math, but it definitely seems within the realm of reason.
 
Not quite right.

GPU pipelines are the very definition of highly parallelized multithreaded loads. That's how they work.

The problem is that multigpu setups to date have presented the GPU's as two separate GPU's to the operating system.

If - instead - they were linked with high speed interconnects - like what AMD is now doing in their CPU's - multiple chiplets could present themselves to the OS as a single larger GPU with more cores.

There would be no need for multi-GPU game/OS tricks because you would essentially have a single GPU, just spread out over many chiplets.
It's more about contention for memory bandwidth than about how the OS views them. In multi-GPU systems, each GPU must hold the entire scene in its own board's memory (it needs all the vertex and texture data loaded even if it's only rendering half the scene). If you were to go the chiplet route and share memory, there would be a lot of contention for the memory bus, so each chiplet added would give less and less return. It has nothing to do with how it's presented to the OS: you can see close to 100% scaling in many games with two GPUs when it's implemented properly. It's just that with so few people running dual GPUs, the added work wasn't a great investment, and done wrong you end up with little extra performance or really bad frame-time jitter.

I guess as long as you keep the chiplets small and end up with a similar number of CUs contending for memory, performance would be similar. If this were a simple thing to solve, AMD, having been using chiplets for years now, would have figured it out by now. I'd bet they do have some designs and R&D going in that direction, though, so it may just be a matter of time before it comes over to GPUs. A quick look at the 3100 vs the 3300X gives a good example of the downside of chiplets: the extra latency. There are easily measurable gains at the same frequency just due to the layout.
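A toy model of that diminishing-returns argument, assuming compute scales linearly with chiplet count until a shared memory bus saturates; the bandwidth numbers are purely illustrative.

```python
def chiplet_speedup(n_chiplets: int, bw_need_per_chiplet_gbs: float,
                    shared_bw_gbs: float) -> float:
    """Toy roofline-style model: scaling is linear in chiplet count until the
    shared memory bus saturates, after which extra chiplets add nothing."""
    return min(n_chiplets, shared_bw_gbs / bw_need_per_chiplet_gbs)

# Illustrative numbers only: each chiplet "wants" 200 GB/s, the shared bus delivers 600 GB/s.
for n in range(1, 7):
    print(f"{n} chiplet(s) -> {chiplet_speedup(n, 200, 600):.1f}x")
```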
 
Will be interesting to see what the cost premium is going to be between the two new mainstream Ampere offerings, the 3080 and the 3090 (aka 3080Ti). I hope I am wrong, but I predict the 3080 will be $1200 (exceeding the 2080Ti by +30%), and the 3090 will be $1500 (exceeding the 2080Ti by +45%). I doubt they'll come in much cheaper than that, but if they do, that'll be a nice surprise. And I see a 3070 (matching the 2080Ti in performance) coming in at roughly $900 soon after. The next-gen Titan offering will probably come in at something absurd like $2199.

I can't see Nvidia pricing them that high. There are factors you are forgetting that caused the high prices of Turing cards: it was the first generation of new hardware in consumer GPUs, and the mining bubble had just burst, leaving Nvidia with stockpiles of unsold Pascal cards.

I am expecting the prices to be something similar to what the Supers launched at.

3060 - $399
3070 - $549
3080 - $749
3090 - $999

The Titan isn't a consumer card anymore. It's aimed at the professional market so it will be priced like the Titan V and the RTX Titan probably around $2499.
 
I can't see Nvidia pricing them that high. There are factors you are forgetting that caused the high prices of Turing cards: it was the first generation of new hardware in consumer GPUs, and the mining bubble had just burst, leaving Nvidia with stockpiles of unsold Pascal cards.

I am expecting the prices to be something similar to what the Supers launched at.

3060 - $399
3070 - $549
3080 - $749
3090 - $999

The Titan isn't a consumer card anymore. It's aimed at the professional market so it will be priced like the Titan V and the RTX Titan probably around $2499.
This is my guess as well. If the AMD GPU rumors are true they are gonna have competition at the high end as well, so the same thing that happened with Intel CPUs will hopefully happen to Nvidia GPUs.
 
I can't see Nvidia pricing them that high. There are factors you are forgetting that caused the high prices of Turing cards: it was the first generation of new hardware in consumer GPUs, and the mining bubble had just burst, leaving Nvidia with stockpiles of unsold Pascal cards.

I am expecting the prices to be something similar to what the Supers launched at.

3060 - $399
3070 - $549
3080 - $749
3090 - $999

The Titan isn't a consumer card anymore. It's aimed at the professional market so it will be priced like the Titan V and the RTX Titan probably around $2499.

My WAG (Using performance targets instead of meaningless names)

~ RTX 2070 Super performance: $400
~ RTX 2080 Super performance: $600
~ RTX 2080 Ti performance: $800
~ RTX 2080Ti + 30%: $1000
~ RTX 2080Ti + 45%: $1200
 
My WAG (Using performance targets instead of meaningless names)

~ RTX 2070 Super performance: $400
~ RTX 2080 Super performance: $600
~ RTX 2080 Ti performance: $800
~ RTX 2080Ti + 30%: $1000
~ RTX 2080Ti + 45%: $1200

I'd edge those up $50-100, but I'm guessing MSRP will be in that ballpark. Very little price/performance gain over Turing (or Pascal, for that matter), and nV will once again be touting new features/tech that will be virtually meaningless to most gamers for this next gen.
 
I'd edge those up $50-100, but I'm guessing MSRP will be in that ballpark. Very little price/performance gain over Turing (or Pascal, for that matter), and nV will once again be touting new features/tech that will be virtually meaningless to most gamers for this next gen.

If anything they are probably more likely to edge down depending on what AMD delivers. AMD looks to have pretty good tech this time, based on claims and what we see in Next Gen consoles. Though when AMD has something good, they price it up, and then there is the question of when they deliver. Lots of factors.
 
Funny, I had the same numbers while playing with prices in a text file, but I later upped the 3090 by $100.

RTX 3090 $1100
RTX 3080 $750
RTX 3070 $550
RTX 3060 $400
 
It's more about contention for memory bandwidth than about how the OS views them. In multi-GPU systems, each GPU must hold the entire scene in its own board's memory (it needs all the vertex and texture data loaded even if it's only rendering half the scene). If you were to go the chiplet route and share memory, there would be a lot of contention for the memory bus, so each chiplet added would give less and less return. It has nothing to do with how it's presented to the OS: you can see close to 100% scaling in many games with two GPUs when it's implemented properly. It's just that with so few people running dual GPUs, the added work wasn't a great investment, and done wrong you end up with little extra performance or really bad frame-time jitter.

I guess as long as you keep the chiplets small and end up with a similar number of CUs contending for memory, performance would be similar. If this were a simple thing to solve, AMD, having been using chiplets for years now, would have figured it out by now. I'd bet they do have some designs and R&D going in that direction, though, so it may just be a matter of time before it comes over to GPUs. A quick look at the 3100 vs the 3300X gives a good example of the downside of chiplets: the extra latency. There are easily measurable gains at the same frequency just due to the layout.

Nope. Traditional multi GPU even when implemented properly is fundamentally flawed due to the alternate frame rendering process. It causes input lag and all sorts of problems.

Even if you get 100% scaling (which I have never seen) it is still pretty bad.

The way 3DFX originally did it, by alternating scan lines, was a much better approach, but they never solved the scaling issue. Later development of Split Frame Rendering got closer to solving this, but because people didn't demand it and AFR was easier to program, that's the crap we got.

With a chiplet design, as long as you could get good fast interconnects you wouldn't have any more contention for the memory than you would by adding more cores to a very large single GPU, because it would be a very large single GPU, just split across multiple dies.

We wouldn't have to worry about multi-GPU at all in game, because it would be one GPU.
 
RTX 3060 $449
RTX 3070 $599
RTX 3080 $799
RTX 3090 $1199

I don't see them lowering prices on these at all; they will adjust if any serious competition comes from AMD's products, specifically in ray tracing performance. I feel that if they have close shader performance but AMD trails severely in ray tracing, they will leave the prices as-is and market AMD as the knockoff.
 
Nope. Traditional multi GPU even when implemented properly is fundamentally flawed due to the alternate frame rendering process. It causes input lag and all sorts of problems.
Well, I'd consider the AFR method to be the 'hack' to get around having to actually support multi-GPU.

Should be SFR (split-frame rendering) with each GPU either doing every other line, doing a 'tile' consisting of some number of pixels, or just having one GPU start at the top and the other start at the bottom. Really just want to pick the solution that requires the least amount of reconciliation between pixels rendered adjacently by different GPUs while ensuring that each GPU finishes its assigned workload at about the same time.

This was easy back when 'SLI' meant 'Scan-line Interleave'...
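To illustrate how differently the work gets divided, here's a minimal sketch of AFR versus a tiled SFR split; it's a scheduling illustration only, not how any real driver implements it.

```python
from typing import List, Tuple

def afr_assignment(frame_index: int, n_gpus: int) -> int:
    """Alternate Frame Rendering: whole frames handed out round-robin, one GPU per frame."""
    return frame_index % n_gpus

def tiled_sfr_assignment(width: int, height: int, tile_w: int, tile_h: int,
                         n_gpus: int) -> List[Tuple[int, int, int]]:
    """Split Frame Rendering with tiles: returns (x, y, gpu) for each tile, handed
    out round-robin so work from every screen region is spread over all GPUs."""
    tiles, idx = [], 0
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            tiles.append((x, y, idx % n_gpus))
            idx += 1
    return tiles

# Example: a 1920x1080 frame split into 960x540 quadrants across 2 GPUs.
print(tiled_sfr_assignment(1920, 1080, tile_w=960, tile_h=540, n_gpus=2))
# -> [(0, 0, 0), (960, 0, 1), (0, 540, 0), (960, 540, 1)]
```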
 
My WAG (Using performance targets instead of meaningless names)

~ RTX 2070 Super performance: $400
~ RTX 2080 Super performance: $600
~ RTX 2080 Ti performance: $800
~ RTX 2080Ti + 30%: $1000
~ RTX 2080Ti + 45%: $1200

It will be interesting to see how they market 4K. For many years now 1080p has been the target for mainstream cards < $300. That’s not likely to change with Ampere but we can all hope. The mainstream segment is overdue for a bandwidth bump to deal with higher resolutions and refresh rates. Those suck significant bandwidth even at lower quality settings.
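For a sense of scale on that bandwidth point, here's a quick pixel-rate comparison; it only captures how raw framebuffer work grows with resolution and refresh rate, not texture or geometry traffic.

```python
def pixel_rate(width: int, height: int, hz: int) -> float:
    """Pixels per second at a given resolution and refresh rate. Raw framebuffer
    bandwidth demand (writes, blending, post effects) scales roughly with this."""
    return width * height * hz

base = pixel_rate(1920, 1080, 60)
for name, w, h, hz in [("1080p144", 1920, 1080, 144),
                       ("1440p144", 2560, 1440, 144),
                       ("4K120",    3840, 2160, 120)]:
    print(f"{name}: {pixel_rate(w, h, hz) / base:.1f}x the pixel rate of 1080p60")
```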
 
Nope. Traditional multi GPU even when implemented properly is fundamentally flawed due to the alternate frame rendering process. It causes input lag and all sorts of problems.

Even if you get 100% scaling (which I have never seen) it is still pretty bad.

The way 3DFX originally did it, by alternating scan lines, was a much better approach, but they never solved the scaling issue. Later development of Split Frame Rendering got closer to solving this, but because people didn't demand it and AFR was easier to program, that's the crap we got.

With a chiplet design, as long as you could get good fast interconnects you wouldn't have any more contention for the memory than you would by adding more cores to a very large single GPU, because it would be a very large single GPU, just split across multiple dies.

We wouldn't have to worry about multi-GPU at all in game, because it would be one GPU.

Interesting interview with Bill Dally, Nvidia’s chief scientist:

https://www.pcgamesn.com/nvidia/graphics-card-chiplet-designs

In essence, the point is that Nvidia has “de-risked” creating GPUs from AMD Zen 2-style multiple chiplets, but its engineers say it’s still not at a point where the technology makes sense in terms of dropping it into a next-gen graphics card design.

The interviewer asked where the crossover point is with the industry moving down to 7nm and then onto 5nm… where is the crossover point for GPU chiplets to actually become worthwhile? To which Alben replied, “We haven’t hit it yet.”
 
Well, I'd consider the AFR method to be the 'hack' to get around having to actually support multi-GPU.

Should be SFR (split-frame rendering) with each GPU either doing every other line, doing a 'tile' consisting of some number of pixels, or just having one GPU start at the top and the other start at the bottom. Really just want to pick the solution that requires the least amount of reconciliation between pixels rendered adjacently by different GPUs while ensuring that each GPU finishes its assigned workload at about the same time.

This was easy back when 'SLI' meant 'Scan-line Interleave'...

SFR didn’t work back then due to different parts of the screen having very different workloads and it won’t work now. Most of the heavy lifting typically happens in the bottom half of the screen. Tiling is really the only practical approach.

The hard problem to solve with any multi GPU solution including chiplets is how do you share the intermediate render buffers generated by each chip. Pixel adjacency is one factor but often a pixel may require a lookup into a buffer at a different location in screen space (e.g. shadow maps).

Asking developers to figure this out was never going to happen. It’s just not worth their time and effort.

The only way this works is if AMD/nvidia figure out a universal way to present multiple chips as one GPU to any application without the need for extensive per-application tweaking. But given that DirectX 12 gives developers free rein to do all sorts of funky stuff, I don't know if it's feasible for a driver to be able to handle every possible scenario.
 
Interesting interview with Bill Dally, Nvidia’s chief scientist:

https://www.pcgamesn.com/nvidia/graphics-card-chiplet-designs

In essence, the point is that Nvidia has “de-risked” creating GPUs from AMD Zen 2-style multiple chiplets, but its engineers say it’s still not at a point where the technology makes sense in terms of dropping it into a next-gen graphics card design.

The interviewer asked where the crossover point is with the industry moving down to 7nm and then onto 5nm… where is the crossover point for GPU chiplets to actually become worthwhile? To which Alben replied, “We haven’t hit it yet.”

Yeah, it really depends on yields and bins. Chiplets make a ton of sense if you are struggling with lots of yield loss on larger dies; they allow you to be much more efficient at saving silicon.

Considering how huge some of their recent high-end GPU dies are, I am really surprised they haven't hit it yet. Maybe the margins just aren't tight enough on those high-end GPUs; at $1200 per GPU they are not being forced into more creativity.

For the cheaper parts, where they are forced to be more competitive on price, the dies are also smaller, making this less of an issue.
 
Well, I'd consider the AFR method to be the 'hack' to get around having to actually support multi-GPU.

Should be SFR (split-frame rendering) with each GPU either doing every other line, doing a 'tile' consisting of some number of pixels, or just having one GPU start at the top and the other start at the bottom. Really just want to pick the solution that requires the least amount of reconciliation between pixels rendered adjacently by different GPUs while ensuring that each GPU finishes its assigned workload at about the same time.

This was easy back when 'SLI' meant 'Scan-line Interleave'...

Definitely. As long as you can get a high-speed chiplet interconnect to work (and this is a huge challenge, since you need on-chip levels of signaling across a package), and considering how GPUs are already vastly parallelized, it would seem the most effective solution is just to not do anything with multi-GPU at all: let all CUs across chips work together like one large GPU. Then you don't need to duplicate memory or use fancy algorithms to split things between GPUs; the game and the OS are none the wiser, and it should "just work".

The interconnect is the rub, but AMD is already doing it on the CPU side, so maybe they aren't too far off. I don't know if this presents more of a challenge in a GPU than it does in a CPU, though. I'm curious to see if they go down this route, because from all indications big Navi is going to be a huge die, and it could probably benefit yield-wise from this kind of treatment.
 
SFR didn’t work back then due to different parts of the screen having very different workloads and it won’t work now. Most of the heavy lifting typically happens in the bottom half of the screen. Tiling is really the only practical approach.
Yeah, they still half-assed it, doing a 50/50 split instead of a 'meet in the middle' approach. Looking at how Cinebench renders tiles, as an accessible example, would be the likely method to use as it is more likely to split the work up in a practical way, and should scale not only to two GPUs but many more.

The hard problem to solve with any multi GPU solution including chiplets is how do you share the intermediate render buffers generated by each chip. Pixel adjacency is one factor but often a pixel may require a lookup into a buffer at a different location in screen space (e.g. shadow maps).

Asking developers to figure this out was never going to happen. It’s just not worth their time and effort.
Watching developers essentially shit the bed with their DX12 'ports' was somewhat eye-opening. I knew it was possible, I just didn't expect them to fail so consistently. In retrospect, I was giving them way too much credit. At least ground-up DX12 implementations are promising.

And yeah, this is going to have to be solved from the driver level on; perhaps GPU makers will be able to share optimizations with developers, but they themselves are going to have to do the work first.
 
The interconnect is the rub, but AMD is already doing it on the CPU side, so maybe they aren't too far off. I don't know if this presents more of a challenge in a GPU than it does in a CPU, though. I'm curious to see if they go down this route, because from all indications big Navi is going to be a huge die, and it could probably benefit yield-wise from this kind of treatment.
They'll have to use a silicon interposer. What they may do is put some HBM on the interposer with the logic and then continue to use external VRAM as well; the additional complexity cost may be mitigated by not needing to have as much HBM, as it'd be serving as a local framebuffer to service the logic dies, and not needing to have hot-clocked VRAM as it won't need to serve as said framebuffer but rather as a cache for the HBM.
 
Asking developers to figure this out was never going to happen. It’s just not worth their time and effort.

The only way this works is if AMD/nvidia figure out a universal way to present multiple chips as one GPU to any application without the need for extensive per-application tweaking.

Agreed.

IMO, it has to be a more advanced chiplet design than what is used for Ryzen; Ryzen just uses package traces for interconnects.

It's a must that GPU chiplets effectively work as a single chip, using either EMIB or a silicon interposer for connections with prodigious bandwidth and minuscule latency.

The often-referenced Nvidia research paper shows an individual memory pool for each "chiplet". I don't think this scales well to 4 chips for a gaming GPU; it requires multiple memory hops to share memory.

I think the design will have to be multiple chips with a central controller chip to scale beyond 2 chips.

Though it is possible to just have a dual-chip design, where you only use multi-chip for the biggest product. The memory hop between the two chips should be no worse than with a central memory controller, and you can skip the expense of that controller chip. This might be a good first step before moving on to the central memory controller.
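A toy hop-count model of that trade-off, assuming memory accesses are spread evenly across all pools and every chiplet has a direct link to either the hub or each peer; it's only meant to show why a two-chip direct link isn't worse, hop-wise, than routing everything through a central controller chip.

```python
def avg_extra_hops(n_chiplets: int, central_controller: bool) -> float:
    """Toy model: average die-to-die hops per memory access, assuming accesses are
    spread evenly over all memory pools and any remote pool is exactly one hop away."""
    if central_controller:
        return 1.0                   # all memory hangs off the hub chip
    local = 1.0 / n_chiplets         # fraction of memory attached to the requester
    return 1.0 - local               # the rest costs one hop over a peer link

print("dual chip, peer link:   ", avg_extra_hops(2, central_controller=False))  # 0.5
print("4 chiplets, central hub:", avg_extra_hops(4, central_controller=True))   # 1.0
```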
 