
RTX 3xxx performance speculation

Thread starter Nebell
Start date Oct 25, 2019

Jun 18, 2020

#1,121

Snowdog

[H]F Junkie

IdiotInCharge said:
There's an argument for spreading out the heat generation a bit; supposing everything else is lined up properly (see AMD failing at this with their first HBM implementations), cooling is perhaps both less complicated and more efficient. AMD has also shown this with the Ryzen 3000 (Zen 2) series, with a large cache/uncore die and up to two eight-core CPU dies in a package.

Current 10th gen Intel desktops, using significantly more power, still run at cooler temperatures than Ryzens using less power with the same cooler. So I am not seeing this advantage. Here the OC 10900K is consuming ~70 watts more than the Ryzen 9, and is running a few degrees cooler. Single versus multi-chip makes a nearly irrelevant difference in cooling.

Temperature-Testing.png.webp

Last edited: Jun 18, 2020

Jun 18, 2020

#1,122

Factum

2[H]4U

IdiotInCharge said:
There's an argument for spreading out the heat generation a bit; supposing everything else is lined up properly (see AMD failing at this with their first HBM implementations), cooling is perhaps both less complicated and more efficient. AMD has also shown this with the Ryzen 3000 (Zen 2) series, with a large cache/uncore die and up to two eight-core CPU dies in a package.

Years ago, when Kirk was the main man planning NVIDIA's architecture, I watched a webcast about the G80.
In that talk, an engineer from NVIDIA talked about what consumes power in a chip:
Compute = cheap.
Moving data around = EXPENSIVE.

Moving data between chip(let)s is the worst case... the notion that a compute node (involved in rendering frames) would be separate from the main GPU goes not only against NVIDIA's way of designing things, but also against industry knowledge about data transfer.

It is indeed Silly Season once again *sigh*

Jun 18, 2020

#1,123

BrotherMichigan

Limp Gawd

Snowdog said:
Current 10th gen Intel desktops, using significantly more power, still run at cooler temperatures than Ryzens using less power with the same cooler. So I am not seeing this advantage. Here the OC 10900K is consuming ~70 watts more than the Ryzen 9, and is running a few degrees cooler. Single versus multi-chip makes a nearly irrelevant difference in cooling.

You have to also consider that Intel is physically shaving down the dies to reduce the thermal resistance of these chips, as well as the fact that they're quite a bit less dense. It's not simply a "chiplet vs monolithic" comparison.

Jun 18, 2020

#1,124

Snowdog

[H]F Junkie

BrotherMichigan said:
It's not simply a "chiplet vs monolithic" comparison.

It never will be that simple. The other factors will always swamp the single- vs multi-chip difference at the same power, because that difference is inconsequential.

There is no real case for there being a significant difference between one chip and two under the same heat spreader. 120 W of cooling is still needed to cool one 120 W die or two 60 W dies.

Furthermore, splitting the die increases the actual power dissipation, because off-chip communication uses more power than on-chip.

Jun 18, 2020

#1,125

IdiotInCharge

NVIDIA SHILL

Snowdog said:
Here the OC 10900K is consuming ~70 watts more than the Ryzen 9, and is running a few degrees cooler. Single versus multi-chip makes a nearly irrelevant difference in cooling.

There's... a lot of assumptions being made here in order to make a comparison. The first is that there is no way to verify that temperatures are being recorded in the same way, which is highlighted by the increased power draw of the Intel CPU with its monolithic die, but with a lower reported temperature. Temperature, at least as reported by CPUs, is an inaccurate method of comparing heat generation and removal between architectures, unfortunately. Then you add in the different packaging styles...

The main point with chiplets is that there's a physical separation of 'heat centers'. The chiplet approach spreads the main energy consumers apart, enlarging the package overall, and thus enlarging the size of the needed heatspreader and enlarging the contact area for a cooling solution. Obviously there's now some interconnect power needed where it wasn't before, but when using a silicon interposer this is minimized relative to spreading the dies about a PCB as AMD is currently doing with Ryzen.

Jun 18, 2020

#1,126

IdiotInCharge

NVIDIA SHILL

Factum said:
Compute = cheap.
Moving data around = EXPENSIVE.

Moving data between chip(let)s is the worst case... the notion that a compute node (involved in rendering frames) would be separate from the main GPU goes not only against NVIDIA's way of designing things, but also against industry knowledge about data transfer.

That's true whether you're inside the GPU die or trying to pipe stuff around a datacenter, yes. I don't really see a means for a separate RT-focused part to be efficiently integrated with an RTX GPU. Nothing stopping them from trying, of course, and I believe that's really all there is to substantiate an argument for such an approach, but I agree that it's just not very likely for interframe rendering. Much more likely (and yet still highly unlikely) would be some form of multi-GPU arrangement where the second die consists of most of the compute blocks found on the main GPU and works in parallel on the same data.

Jun 18, 2020

#1,127

Snowdog

[H]F Junkie

IdiotInCharge said:
The main point with chiplets is that there's a physical separation of 'heat centers'. The chiplet approach spreads the main energy consumers apart, enlarging the package overall, and thus enlarging the size of the needed heatspreader and enlarging the contact area for a cooling solution. Obviously there's now some interconnect power needed where it wasn't before, but when using a silicon interposer this is minimized relative to spreading the dies about a PCB as AMD is currently doing with Ryzen.

What evidence do you have that having two dies under the same heat spreader is better than one? Also, AFAIK AMD doesn't use a silicon interposer for Ryzen.

Also, when interposers are used, they typically butt chips directly against each other, meaning there is no significant separation.

MCM is all about yields, and pretty much nothing else.
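
As a rough illustration of the yield argument, here is a minimal sketch using the simple Poisson yield model, Y = exp(-A * D). The defect density and die area below are made-up numbers for illustration only, not figures for any real process or product, and the sketch ignores interconnect area overhead and packaging losses.

```python
import math

def poisson_yield(area_mm2, defects_per_mm2):
    """Fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_mm2 * defects_per_mm2)

def silicon_per_good_product(total_area_mm2, n_dies, defects_per_mm2):
    """Wafer area spent per working product when the design is split
    into n_dies equal dies and only known-good dies are packaged."""
    die_area = total_area_mm2 / n_dies
    return total_area_mm2 / poisson_yield(die_area, defects_per_mm2)

# Hypothetical numbers: 600 mm^2 of logic, 0.001 defects per mm^2.
TOTAL_AREA = 600.0
DEFECT_DENSITY = 0.001

for n in (1, 2, 4):
    cost = silicon_per_good_product(TOTAL_AREA, n, DEFECT_DENSITY)
    print(f"{n} die(s): {cost:.0f} mm^2 of silicon per good product")
```

Under this model, with overhead ignored, the split design spends less silicon per working product, and the gap widens as total area or defect density grows.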

Last edited: Jun 18, 2020

Jun 18, 2020

#1,128

noko

Supreme [H]ardness

2FA

Chip-to-chip communication using photonics would solve the bandwidth and power issues, but it's just not there yet.

https://semiengineering.com/making-silicon-photonics-chips-more-reliable/
https://www.nextplatform.com/2019/01/29/first-silicon-for-photonics-startup-with-darpa-roots/

Jun 18, 2020

#1,129

noko

Supreme [H]ardness

2FA

Snowdog said:
What evidence do you have that having two dies under the same heat spreader is better than one? Also, AFAIK AMD doesn't use a silicon interposer for Ryzen.

Also, when interposers are used, they typically butt chips directly against each other, meaning there is no significant separation.

MCM is all about yields, and pretty much nothing else.

Yes, it is about yields, but also about the market combination of products: Zen uses one basic building block from the low end up to data centers. It's maybe a type of Lego, but it is very effective at avoiding multiple different chip designs, fab production runs and so on, which in itself also increases yields.

Last edited: Jun 18, 2020

Jun 18, 2020

#1,130

noko

Supreme [H]ardness

2FA

Factum said:
NVIDIA is going chiplet with Hopper (next generation), not Ampere.

Factum said:
Years ago, when Kirk was the main man planning NVIDIA's architecture, I watched a webcast about the G80.
In that talk, an engineer from NVIDIA talked about what consumes power in a chip:
Compute = cheap.
Moving data around = EXPENSIVE.

Moving data between chip(let)s is the worst case... the notion that a compute node (involved in rendering frames) would be separate from the main GPU goes not only against NVIDIA's way of designing things, but also against industry knowledge about data transfer.

It is indeed Silly Season once again *sigh*

WAT???

Can you please reconcile Nvidia going chiplet with Hopper (rumor) with the claim that it goes against NVIDIA's way of designing things?

"It is indeed Silly Season once again *sigh*" is fitting.

Jun 18, 2020

#1,131

IdiotInCharge

NVIDIA SHILL

Snowdog said:
What evidence do you have that having two dies under the same heat spreader is better than one? Also, AFAIK AMD doesn't use a silicon interposer for Ryzen.

It's better if the solution is unworkable without it. In this case, it's just different. However, there is an argument to be made that pushing the power consumers apart allows for a greater contact surface and eases cooling (at the cost of a larger heatspreader). That's what I'm getting at, whether on interposer or just a tightly-coupled package like Zen 2.

Snowdog said:
Also, when interposers are used, they typically butt chips directly against each other, meaning there is no significant separation.

Perhaps; if the chiplet arrangement is simply the result of a monolithic design being 'chopped' into smaller dies with interconnects added through the interposer, then there's likely very little tangible benefit and perhaps even an increase in cooling needed.

Snowdog said:
MCM is all about yields, and pretty much nothing else.

Well, yes. Whether that's what AMD is doing with Zen, where a monolithic version would still be manufacturable just at greater difficulty and cost, or if it's a product that would simply be too large to manufacture (i.e. impossible, so yields would otherwise be zero), which is where we expect Nvidia to be heading soon.

Jun 18, 2020

#1,132

noko

Supreme [H]ardness

2FA

IdiotInCharge said:
It's better if the solution is unworkable without it. In this case, it's just different. However, there is an argument to be made that pushing the power consumers apart allows for a greater contact surface and eases cooling (at the cost of a larger heatspreader). That's what I'm getting at, whether on interposer or just a tightly-coupled package like Zen 2.

Perhaps; if the chiplet arrangement is simply the result of a monolithic design being 'chopped' into smaller dies with interconnects added through the interposer, then there's likely very little tangible benefit and perhaps even an increase in cooling needed.

Well, yes. Whether that's what AMD is doing with Zen, where a monolithic version would still be manufacturable just at greater difficulty and cost, or if it's a product that would simply be too large to manufacture (i.e. impossible, so yields would otherwise be zero), which is where we expect Nvidia to be heading soon.

It is hard to compare temperature across two different nodes, fab processes and chip densities. It is meaningless, in other words; there are too many other factors to consider. If one had the same node and chip design, with a big chip vs two smaller chiplets in the same power envelope, then yeah, that would be good grounds to determine whether a monolithic chip is better temperature-wise than breaking it up into chiplets. In the end it doesn't matter as long as it works as expected. Temperature has little bearing as long as it works.

AMD has one chiplet design that can be combined: one design, one test cycle, one round of revisions, and then it goes into desktop, HEDT and data center/server parts. Intel has a separate chip, with all the R&D, testing and revisions, then repeats that for HEDT chips, then repeats it again for desktop chips, which in itself has more monolithic designs. AMD has basically two designs, APU and chiplet, and APUs may be chiplets in the future. Then of course there are the Xbox and PS5 designs, which are joint ventures.

Look at it this way: Zen 3 will update everything from the low end all the way to HPC/data centers/servers virtually overnight, while Intel would have to do each segment separately. Speed of implementation and cost to design and produce are all dramatically reduced, including much better yields and lower costs at the fabs, since there is only one design versus many.

Last edited: Jun 18, 2020

Jun 18, 2020

#1,133

Snowdog

[H]F Junkie

IdiotInCharge said:
Well, yes. Whether that's what AMD is doing with Zen, where a monolithic version would still be manufacturable just at greater difficulty and cost, or if it's a product that would simply be too large to manufacture (i.e. impossible, so yields would otherwise be zero), which is where we expect Nvidia to be heading soon.

I don't think we are there that soon, and NVidia has shown they are willing to go very big on GPU dies.

GPU MCM also has a lot more negative trade-offs than CPU MCM. We have had years of multi-socket CPUs without much issue, but multi-card setups, or even multiple GPU chips on the same card, are stuck with needing individual and exclusive memory pools for each chip and some kind of CF/SLI software kludge.

It could happen with Hopper as rumored, but the MCM part of that rumor might just be for data center GPUs, where MCM is more viable.

Even if we get real MCM gaming GPUs (shared memory pool, no CF/SLI SW), I wouldn't expect a big drop in the price of GPUs.

Jun 18, 2020

#1,134

IdiotInCharge

NVIDIA SHILL

Snowdog said:
I don't think we are there that soon, and NVidia has shown they are willing to go very big on GPU dies.

This is true. Not sure what the 'limit' is, but if they plan to exceed it, they'll need to break the GPU apart.

Snowdog said:
GPU MCM also has a lot more negative trade-offs than CPU MCM. We have had years of multi-socket CPUs without much issue, but multi-card setups, or even multiple GPU chips on the same card, are stuck with needing individual and exclusive memory pools for each chip and some kind of CF/SLI software kludge.

These are... incomparable. CPUs are relatively low bandwidth, latency-sensitive devices, and GPUs are quite the opposite. As seen on modern CPUs, keeping multiple cores going requires fucktons of cache (Imperial fucktons, not the new-age metric ones). GPUs make up for this by not having such random workloads, such that bandwidth latency may be accounted for. CPUs can't do that; see Zen and Zen+. Zen2 is little more than Zen with enough cache and fewer barriers.

Further, we're not talking about SLI/CF or some other form of linking two complete GPUs. We're talking about splitting GPU work up between dies, in hardware, with software transparency (the driver will probably need to be aware to do some data massaging to account for developer laziness / stupidity, as always). Where external multi-GPU solutions could provide a full 2x performance jump with proper optimization through the software stack, a hardware split could just run as a monolithic die with the same resources would, without developer tuning, for most common use cases (games among them).

Snowdog said:
It could happen with Hopper as rumored, but the MCM part of that rumor might just be for data center GPUs, where MCM is more viable.

Cost is a pretty big factor here, so you're probably right. Nvidia will likely target compute SKUs first just to hide any BOM increases due to production problems behind the higher MSRPs that compute SKUs command for the first go around.

Snowdog said:
Even if we get real MCM gaming GPUs (shared memory pool, no CF/SLI SW), I wouldn't expect a big drop in the price of GPUs.

Oh no, I don't expect pricing to go down. I expect it to go up!

But I also expect the performance ceiling to go up too.

Jun 18, 2020

#1,135

Snowdog

[H]F Junkie

IdiotInCharge said:
These are... incomparable. CPUs are relatively low bandwidth, latency-sensitive devices, and GPUs are quite the opposite. As seen on modern CPUs, keeping multiple cores going requires fucktons of cache (Imperial fucktons, not the new-age metric ones). GPUs make up for this by not having such random workloads, such that bandwidth latency may be accounted for. CPUs can't do that; see Zen and Zen+. Zen2 is little more than Zen with enough cache and fewer barriers.

Further, we're not talking about SLI/CF or some other form of linking two complete GPUs. We're talking about splitting GPU work up between dies, in hardware, with software transparency (the driver will probably need to be aware to do some data massaging to account for developer laziness / stupidity, as always). Where external multi-GPU solutions could provide a full 2x performance jump with proper optimization through the software stack, a hardware split could just run as a monolithic die with the same resources would, without developer tuning, for most common use cases (games among them).

Workload determines latency sensitivity. In the NVidia MCM GPU paper, they tested a variety of compute workloads and saw varying impacts from latency. I expect greater latency sensitivity from gaming workloads. I also expect ray tracing will increase latency sensitivity, since rays can bounce anywhere in the scene, so you need fast access to everything in the scene (you can see RT drastically increases memory usage), not just a little local slice.

Cost is a pretty big factor here, so you're probably right. Nvidia will likely target compute SKUs first just to hide any BOM increases due to production problems behind the higher MSRPs that compute SKUs command for the first go around.

At the very least, data center will be first. Ampere data center is monolithic, so IMO there is zero chance gaming Ampere will be MCM.

If Hopper data center is monolithic, then the same applies. But even if Hopper data center is MCM, there is still no guarantee that Hopper gaming is MCM.

Data Center work won't be as latency sensitive. In general this is work that is easily split across multiple cards, without issue, and it also isn't real-time, so again the kind of load that won't suffer from a bit more MCM latency.

Jun 19, 2020

#1,136

noko

Supreme [H]ardness

2FA

Another factor is TSMC's rapid success, not only at 7nm but at 5nm and, it looks like, even smaller nodes, which was not predictable several years back when Nvidia and AMD were seriously looking at MCM options. Price constraints may still win out in favor of MCM, though.

Jun 19, 2020

#1,137

Snowdog

[H]F Junkie

The KkatCorgi Twitter account, source of many of the Ampere rumors, has a minor update again:
https://twitter.com/KkatCorgi/status/1273889616282521603

2nd Gen NVIDIA TITAN
GA102-400-A1, 5376 cores, 24GB, 17Gbps

GeForce RTX 3090
GA102-300-A1, 5248 cores, 12GB, 21Gbps

GeForce RTX 3080
GA102-200-Kx-A1, 4352 cores, 10GB, 19Gbps

It makes more sense that the 24GB card is a Titan, but at such a huge drop in memory speed, it would likely perform worse.

But I am very skeptical of the 21 Gbps memory on the 3090.
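
To put the memory speed gap in numbers, here is a minimal sketch of the usual peak bandwidth arithmetic (bus width times per-pin data rate, divided by 8). The rumor lists only capacity and data rate, so the bus widths below are my assumptions (384-bit for the 24GB and 12GB cards, 320-bit for the 10GB card) and are not part of the leak.

```python
def bandwidth_gb_per_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8

# (name, ASSUMED bus width in bits, rumored data rate in Gbps)
rumored_cards = [
    ("2nd Gen TITAN", 384, 17),
    ("RTX 3090", 384, 21),
    ("RTX 3080", 320, 19),
]

for name, bus_bits, rate in rumored_cards:
    print(f"{name}: {bandwidth_gb_per_s(bus_bits, rate):.0f} GB/s")
```

If both GA102 cards really shared the same bus width, the 17 Gbps Titan would have roughly 19% less bandwidth than the 21 Gbps 3090, whatever that width turns out to be.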

Jun 19, 2020

#1,138

MangoSeed

[H]ard|Gawd

Snowdog said:
But I am very skeptical of the 21 Gbps memory on the 3090.

I suppose there's nothing stopping Nvidia from using non-standard memory speeds but at what cost? Surely if you could run GDDR6 at 20Gbps+ with reasonable power consumption and yields we would have heard something from JEDEC and the memory manufacturers by now. I would keep this in the nonsense rumor bucket.

Jun 19, 2020

#1,139

Snowdog

[H]F Junkie

MangoSeed said:
I suppose there's nothing stopping Nvidia from using non-standard memory speeds but at what cost? Surely if you could run GDDR6 at 20Gbps+ with reasonable power consumption and yields we would have heard something from JEDEC and the memory manufacturers by now. I would keep this in the nonsense rumor bucket.

AFAIK, JEDEC only specifies up to 16 Gbps, and only Samsung has announced anything faster at 18 Gbps. NVidia OCing to 21 Gbps seems like quite a stretch. I don't think they have a history of doing this. They need their products to be reliable.

Jun 21, 2020

#1,140

Snowdog

[H]F Junkie

A supposed 3DMark database entry for Ampere is ~30% faster than a 2080 Ti.
https://videocardz.com/newz/nvidia-geforce-rtx-3080-ti-3090-ampere-3dmark-time-spy-score-leaks

If this is real and represents a 3080 Ti/3090, then this is right around where I expect performance to be: ~30% more. That will disappoint those thinking this should be another Pascal-type leap.

Jun 21, 2020

#1,141

Auer

[H]ard|Gawd

Snowdog said:
A supposed 3DMark database entry for Ampere is ~30% faster than a 2080 Ti.
https://videocardz.com/newz/nvidia-geforce-rtx-3080-ti-3090-ampere-3dmark-time-spy-score-leaks

If this is real and represents a 3080 Ti/3090, then this is right around where I expect performance to be: ~30% more. That will disappoint those thinking this should be another Pascal-type leap.

In my pretend universe, this is the $500 RTX3070.
IRL it's probably the $1000 RTX3080.

Jun 21, 2020

#1,142

Snowdog

[H]F Junkie

I don't know about naming, but yeah, I pretty much guarantee this will be $1000+.

Jun 21, 2020

#1,143

pippenainteasy

[H]ard|Gawd

Well, if the rumors of a GA102 3080 are true, this could be a $699 3080 that's 30% faster than the 2080 Ti, and you could have a $999 3090 that's 45% faster. Maybe with $799/$1199 launch prices again for the first year. Full uncut Titan for $2499 with 5376 CUDA cores.

This would more or less give you the same gap as 2080 Super vs 2080 Ti.

Jun 21, 2020

#1,144

Nebell

2[H]4U

According to that article, 2080Ti Kingpin is 33% faster than 2080Ti FE.
What? Am I reading that wrong? That's like, one generation leap.

Jun 21, 2020

#1,145

Dayaks

[H]F Junkie

Nebell said:
According to that article, 2080Ti Kingpin is 33% faster than 2080Ti FE.
What? Am I reading that wrong? That's like, one generation leap.

I think that was kingpin with LN2... notice the crazy high clocks.

Jun 21, 2020

#1,146

Snowdog

[H]F Junkie

Dayaks said:
I think that was kingpin with LN2... notice the crazy high clocks.

Yeah, it even says LN2 in the text for the OC Titan, so you can bet the top score is LN2 as well.

Jun 21, 2020

#1,147

German Muscle

Supreme [H]ardness

Nebell said:
According to that article, 2080Ti Kingpin is 33% faster than 2080Ti FE.
What? Am I reading that wrong? That's like, one generation leap.

That is correct. The KP 2080 Ti even beat the RTX Titan.

Jun 21, 2020

#1,148

Lastan010

Limp Gawd

what about LN2 on the upcoming 3080, another 30% perf increase?

Jun 22, 2020

#1,149

German Muscle

Supreme [H]ardness

Lastan010 said:
what about LN2 on the upcoming 3080, another 30% perf increase?

We'll see when the Kingpin version of Ampere comes out.

Jun 22, 2020

#1,150

Nebell

2[H]4U

I need to get myself some LN2.

Jun 22, 2020

#1,151

Nenu

[H]ardened

Nebell said:
I need to get myself some LN2.

You might need speed as well, short but fast gaming lol.

Jun 22, 2020

#1,152

Nebell

2[H]4U

Nenu said:
You might need speed as well, short but fast gaming lol.

This guy seems to enjoy it

Jun 26, 2020

#1,153

Geezus

Limp Gawd

Huge grain of salt but very powerful if true.
https://wccftech.com/rumor-alleged-...-up-to-23-tflops-of-peak-graphics-horsepower/

Jun 26, 2020

#1,154

BrotherMichigan

Limp Gawd

Hopefully that's fake, because a 320W TDP means even higher actual power draw.

Jun 26, 2020

#1,155

German Muscle

Supreme [H]ardness

I'm pretty sure it is fake.

Jun 26, 2020

#1,156

RamonGTP

Supreme [H]ardness

BrotherMichigan said:
Hopefully that's fake, because a 320W TDP means even higher actual power draw.

It can draw all the power it wants if it has the performance to back it up as far as I’m concerned. Go with an APU if you want low power.

Jun 26, 2020

#1,157

Dayaks

[H]F Junkie

Geezus said:
Huge grain of salt but very powerful if true.
https://wccftech.com/rumor-alleged-...-up-to-23-tflops-of-peak-graphics-horsepower/

48% faster with 28% more power draw compared to a 2080 Ti.

With a 29% higher boost clock (1700 -> 2200?) and 24% more cores, it should be 60% faster with a 0% increase in IPC.
With a 10% higher boost clock (2000 -> 2200) and 24% more cores, it'd be 36% faster; with an 8% increase in IPC it matches the numbers on the chart.

It's not out of the realm of possibility that they have an 8% IPC increase, slightly higher MHz, and more cores...
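
The arithmetic above can be sketched as a simple product of ratios. This assumes performance scales linearly with clock, core count and IPC, which real GPUs rarely manage; the function name is just for illustration.

```python
def relative_performance(clock_ratio, core_ratio, ipc_ratio=1.0):
    """Naive speedup estimate: clock x cores x IPC, all as ratios vs baseline."""
    return clock_ratio * core_ratio * ipc_ratio

CORE_RATIO = 1.24  # 24% more cores than the 2080 Ti, per the rumor

# Case 1: 1700 -> 2200 MHz boost, no IPC change.
case_1 = relative_performance(2200 / 1700, CORE_RATIO)

# Case 2: 2000 -> 2200 MHz boost, no IPC change.
case_2 = relative_performance(2200 / 2000, CORE_RATIO)

# Case 3: same as case 2, plus an 8% IPC gain.
case_3 = relative_performance(2200 / 2000, CORE_RATIO, 1.08)

# Performance per watt implied by "48% faster with 28% more power".
perf_per_watt_gain = 1.48 / 1.28

for label, value in [("case 1", case_1), ("case 2", case_2),
                     ("case 3", case_3), ("perf/W", perf_per_watt_gain)]:
    print(f"{label}: {value:.2f}x")
```

Case 3 lands at about 1.47x, close to the 48% on the chart, and the rumored figures imply only around a 16% gain in performance per watt.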

Jul 8, 2020

#1,158

pippenainteasy

[H]ard|Gawd

627mm2 is the new rumor for GA102. With 25% less density on 8nm Samsung vs 7nm TSMC it would be about 501mm2 on 7nm. Big Navi will be 505mm2 on 7nm TSMC. So it sounds like next gen flagships will both come in about the same size. Reminds me of the Vega 64 vs 980 Ti days.
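
The 627 -> 501 conversion depends on how "25% less density" is read. Here is a quick sketch of both readings; the function names are mine, and real node-to-node scaling is not a single ratio across logic, SRAM and analog, so treat this as arithmetic only.

```python
def area_if_target_is_denser(area_mm2, density_gain):
    """Target node packs (1 + density_gain) times as many transistors per mm^2."""
    return area_mm2 / (1 + density_gain)

def area_if_source_is_less_dense(area_mm2, density_loss):
    """Source node has (1 - density_loss) times the target node's density."""
    return area_mm2 * (1 - density_loss)

GA102_AREA_8NM = 627.0  # rumored die size in mm^2

# Reading A: 7nm is 25% denser than 8nm.
print(area_if_target_is_denser(GA102_AREA_8NM, 0.25))

# Reading B: 8nm is 25% less dense than 7nm.
print(area_if_source_is_less_dense(GA102_AREA_8NM, 0.25))
```

Reading A gives the ~501 mm2 figure quoted above; reading B would give about 470 mm2, so the comparison with a 505 mm2 Big Navi hinges on which ratio is meant.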

Jul 8, 2020

#1,159

Bankie

2[H]4U

2FA

Nebell said:
This guy seems to enjoy it

Kinda off topic but this guy was great and my son was always excited to show me vids of some of the cool projects he would do. It's too bad the guy died unexpectedly last year.

Jul 8, 2020

#1,160

Armenius

Extremely [H]

pippenainteasy said:
627mm2 is the new rumor for GA102. With 25% less density on 8nm Samsung vs 7nm TSMC it would be about 501mm2 on 7nm. Big Navi will be 505mm2 on 7nm TSMC. So it sounds like next gen flagships will both come in about the same size. Reminds me of the Vega 64 vs 980 Ti days.

Is this a new rumor I haven't heard about? We had actual industry insiders back in May say that low-end products will be using 8LPP or 8LPU while mid- to high-end products will be using N7P.
