390X coming soon few weeks

FYI- Early GDDR5 allowed AMD to compete with a massive G200 using only half the die size and half the bus width.


I think he is talking about total available bandwidth, not how much of it actually gets used, which depends on the chip and bus size.

But you are right that it was advantageous to AMD to go to GDDR5 because of their bus size at the time.

Depends on the application.
With gaming, yes there is a diminishing return for bandwidth.
With everything else, you want all the bandwidth you can get. GPUs are inherently good at latency hiding, which makes them starving for bandwidth.
Diminishing returns depend not only on the application but also on the GPU. Newer programs tend to push the ALUs more, so the bottleneck will shift.

GPUs aren't that starved of bandwidth right now. When designing a GPU this is taken into consideration. I still don't see how Fury can use double the bandwidth with only 45% more ALU throughput; it's not like the 290X was bandwidth starved.
 
Why are people viewing this as something bad? I read it as saying 4GB is plenty. How come no one posted the good news from that tweet?

It's not something bad, per se.

4GB being plenty is good news (and honestly what I expected all along).

But it IS a rebuke to all those who were saying that the added bandwidth would somehow make the RAM behave as if it were more than 4GB of GDDR5.
 
Zarathustra[H];1041666970 said:
But it IS a rebuke to all those who were saying that the added bandwidth would somehow make the RAM behave as if it were more than 4GB of GDDR5.
The filling/draining water bucket analogy has been pretty convincing so far, but without having a technological understanding of GPU behavior I guess it's hard to say. Especially since it's a new kind of memory we've never seen before.
 
I won't read too much into a vague statement on twitter and will wait for real benchmarks, because I'm guessing there are some usage scenarios (4K+) that HBM excels at, but I was hoping AMD would demonstrate how/why HBM was necessary and GDDR5 was suddenly bandwidth constrained. Because I wasn't really aware it was - GPU is still the bottleneck even on a 980 Ti OC'd to absolute limits.
I think it's a good forward-facing tech that should bear fruit eventually, but existing GDDR5 speeds seem appropriate for existing GPU power.

TL;DR ever notice how putting a big overclock on your GDDR5 doesn't make a whole hell of a lot of difference in FPS? Or underclocking it for that matter.

The importance of HBM for AMD's Fiji resides in the fact that Fiji, with 4096SPs, required 45% greater bandwidth to enable proper scaling. Without it, AMD's Fiji would scale very poorly.

HBM adds 60% more bandwidth just when the GPU needs 45% more bandwidth to maintain its scaling. What's more is that the GPU has texture/color compression so the effective bandwidth should be even higher (provided decompression is fast enough, of course:rolleyes:).
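For anyone who wants to check the 60% and 45% figures, here is a rough back-of-the-envelope sketch. The bus widths, per-pin data rates, and shader counts are the commonly published specs for the R9 290X and Fiji, not numbers from this thread, so treat them as assumptions. It also ignores clock speed differences and compression.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps_per_pin):
    """Peak memory bandwidth in GB/s: bus width (bits) * per-pin rate (Gbit/s) / 8."""
    return bus_width_bits * data_rate_gbps_per_pin / 8

# Assumed specs (published figures, not taken from this thread)
hawaii_bw = peak_bandwidth_gbs(512, 5.0)   # R9 290X: 512-bit GDDR5 at 5 Gbps per pin
fiji_bw = peak_bandwidth_gbs(4096, 1.0)    # Fiji: 4096-bit HBM at 1 Gbps per pin
hawaii_sps = 2816                          # stream processors on the R9 290X
fiji_sps = 4096                            # stream processors on Fiji

bw_gain = fiji_bw / hawaii_bw - 1          # relative bandwidth increase
sp_gain = fiji_sps / hawaii_sps - 1        # relative shader count increase

print(f"290X: {hawaii_bw:.0f} GB/s, Fiji: {fiji_bw:.0f} GB/s")
print(f"Bandwidth gain: {bw_gain:.0%}, shader gain: {sp_gain:.0%}")
```

Under those assumptions the raw bandwidth gain and the shader count gain come out close to the figures quoted above.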
 
I think the hope was that HBM could somehow swap shit in and out so fast that you wouldn't fill the 4GB of VRAM, making the 4GB limitation a non-issue. Sounding like it's still going to be an issue though.

Well, and the fact that Fiji uses memory compression, so should act like about 5GB of RAM (or more).
 
GPUs aren't that starved of bandwidth right now. When designing a GPU this is taken into consideration. I still don't see how Fury can use double the bandwidth with only 45% more ALU throughput; it's not like the 290X was bandwidth starved.

Bandwidth is only up 60% with HBM if you don't consider compression. If throughput has increased with GCN1.3 (likely), then that 45% increased need for bandwidth could easily be quite a bit greater.
 
Bandwidth is only up 60% with HBM if you don't consider compression. If throughput has increased with GCN1.3 (likely), then that 45% increased need for bandwidth could easily be quite a bit greater.


It's 100% over the R9 290X.....

You flipped your math around, I think.

And no, I don't even think it's going to be GCN 1.3, given that it seems to be very Tonga-like. We won't see GCN 1.3 until the gen after this.
 
The biggest benefit of HBM is that the 30-50 watts saved by the memory technology compared to GDDR5 can be fed back into having a more powerful GPU. And a more powerful GPU will need more bandwidth. I would also expect certain bandwidth-intensive features to be closer to "free", or rather limited by something other than bandwidth.
 
It's 100% over the R9 290X.....

You flipped your math around, I think.

And no, I don't even think it's going to be GCN 1.3, given that it seems to be very Tonga-like. We won't see GCN 1.3 until the gen after this.

Same for me... I'm thinking more and more that Fury = full Tonga...
The R9 285 was a test for its bigger brother Fury....

So I'm going for Fury = GCN 1.2
 
Diminishing returns depend not only on the application but also on the GPU. Newer programs tend to push the ALUs more, so the bottleneck will shift.

GPUs aren't that starved of bandwidth right now. When designing a GPU this is taken into consideration. I still don't see how Fury can use double the bandwidth with only 45% more ALU throughput; it's not like the 290X was bandwidth starved.

In an ideal world you optimize the GPU for bandwidth. So if you go in knowing you will have 500+ GB/s of bandwidth, you design your GPU around that, not the other way around... like designing the GPU first and then figuring out how much bandwidth you need to keep it fed.
 
Same for me... I'm thinking more and more that Fury = full Tonga...
The R9 285 was a test for its bigger brother Fury....

So I'm going for Fury = GCN 1.2
I would think HBM wouldn't be compatible with GCN 1.2 unless AMD has been preparing for it.
 
In an ideal world you optimize the GPU for bandwidth. So if you go in knowing you will have 500+ GB/s of bandwidth, you design your GPU around that, not the other way around... like designing the GPU first and then figuring out how much bandwidth you need to keep it fed.

Actually, the GPU is designed around the process node they are on and the transistor budget, based on the performance you are looking for and the bandwidth that is available. It's all one package ;)
 
Well, and the fact that Fiji uses memory compression, so should act like about 5GB of RAM (or more).


All GPUs use compression. It's not memory compression, btw; it's delta color compression. AMD's Tonga actually has a much better color compression scheme than Hawaii; if I remember correctly, it has a 25% advantage. I don't know how much further they can push that, or whether they will even change it, because there wasn't much talk about this when they discussed their next-gen card in an interview. The question was asked but not really answered. And this saves bandwidth, not physical memory.

Texture compression is what saves memory space, and this has been fairly standard for a few generations now, and is usually API driven.
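For anyone unfamiliar with the idea, here is a toy sketch of delta encoding. This is not AMD's actual hardware scheme; the function names and the bit-counting rule are made up purely for illustration. It only shows why a base value plus small differences takes fewer bits to move across the bus, while the buffer still has to be allocated at full size.

```python
def delta_encode(tile):
    """Keep the first value, then store each value as a difference from the previous one."""
    deltas = [tile[0]]
    for prev, cur in zip(tile, tile[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

def bits_to_transfer(deltas):
    """Hypothetical cost model: 8 bits for the base value, then a fixed
    signed width per delta, wide enough for the largest delta."""
    max_mag = max((abs(d) for d in deltas[1:]), default=0)
    width = max_mag.bit_length() + 1  # +1 for the sign bit
    return 8 + width * (len(deltas) - 1)

# One 8-bit color channel across a smooth gradient (made-up values)
tile = [200, 201, 201, 202, 203, 203, 204, 204]
encoded = delta_encode(tile)
raw_bits = 8 * len(tile)
packed_bits = bits_to_transfer(encoded)

print(f"raw: {raw_bits} bits, delta-encoded: {packed_bits} bits")
```

Noisy tiles with large differences would not shrink like this, which fits with the savings varying from case to case.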
 
Actually, the GPU is designed around the process node they are on and the transistor budget, based on the performance you are looking for and the bandwidth that is available. It's all one package ;)

Well, yes you have to take everything into account but one of the first things you consider when doing a rough layout is the targeted die size and interface size due to pad limitations.

All GPUs use compression. It's not memory compression, btw; it's delta color compression. AMD's Tonga actually has a much better color compression scheme than Hawaii; if I remember correctly, it has a 25% advantage. I don't know how much further they can push that, or whether they will even change it, because there wasn't much talk about this when they discussed their next-gen card in an interview. The question was asked but not really answered. And this saves bandwidth, not physical memory.

Texture compression is what saves memory space, and this has been fairly standard for a few generations now, and is usually API driven.

Up to ~40% was the marketing number, which was accurate in some cases. But yes, 20-30% would be an accurate average over Hawaii.

New texture compression tools have been made available in an AMD SDK since May I think.
http://developer.amd.com/tools-and-sdks/graphics-development/amdcompress/
 
I would think HBM wouldn't be compatible with GCN 1.2 unless AMD has been preparing for it.

It wouldn't be "locked out" of GCN 1.2 per se... they would "only" need to remove the GDDR5 "hardware" and put in the HBM "hardware"... yeah I know, I make it sound easy and it's not... unless they built Tonga with this in mind from the start... hence the weird configuration of the chip compared to the other ones in its family...

...look I'm having fun, less than a day and we won't have anything else to theorize about until Pascal from Nvidia and/or next GPU from AMD....
:D
 
Up to ~40% was the marketing number, which was accurate in some cases. But yes, 20-30% would be an accurate average over Hawaii.

New texture compression tools have been made available in an AMD SDK since May I think.
http://developer.amd.com/tools-and-sdks/graphics-development/amdcompress/

Ah cool, yeah, I think in the real world it's around 25%.


The texture compression tools, I've used 'em. They're pretty much the same formats as before; there are some new ones specific to certain types of textures, but it's the same 75% savings from a 32-bit BMP or TGA. (Not really new: it was made 4 or more years ago and they've just been updating it since then.)
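The 75% figure lines up with standard block compression. A quick sketch, assuming the usual block sizes (BC1/DXT1 stores a 4x4 pixel block in 8 bytes, BC3/DXT5 and BC7 in 16 bytes) and ignoring mipmaps; the helper names are just for illustration:

```python
# Bytes per 4x4 pixel block for common block-compressed formats
BLOCK_BYTES = {"BC1": 8, "BC3": 16, "BC7": 16}

def uncompressed_bytes(width, height, bytes_per_pixel=4):
    """Size of a 32-bit (4 bytes per pixel) texture, no mipmaps."""
    return width * height * bytes_per_pixel

def block_compressed_bytes(width, height, fmt):
    """Size of a block-compressed texture; dimensions round up to whole 4x4 blocks."""
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BLOCK_BYTES[fmt]

def savings(width, height, fmt):
    """Fraction of memory saved versus the 32-bit uncompressed texture."""
    return 1 - block_compressed_bytes(width, height, fmt) / uncompressed_bytes(width, height)

for fmt in BLOCK_BYTES:
    print(f"{fmt}: {savings(2048, 2048, fmt):.1%} saved on a 2048x2048 texture")
```

The 16-byte formats give the 75% number; BC1 saves more because it drops the separate alpha data.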
 
Ah cool, yeah, I think in the real world it's around 25%.


The texture compression tools, I've used 'em. They're pretty much the same formats as before; there are some new ones specific to certain types of textures, but it's the same 75% savings from a 32-bit BMP or TGA. (Not really new: it was made 4 or more years ago and they've just been updating it since then.)

There was a developer blog about them shortly after the release, it went into detail about some of the changes, but I can't seem to find it now.
 
Texture compression has been in DirectX API for over a decade, IIRC it was S3 that introduced it, originally called "S3TC" back in like 1999 with the Savage4. Delta Color compression is what is newer with Tonga.
 
I was wondering whether the fact that HBM is closer to the die adds to performance, even slightly, over GDDR5? And what about the fact that there are fewer of them, 4 compared to 8, as physical stacks on the PCB or die?
 
I was wondering whether the fact that HBM is closer to the die adds to performance, even slightly, over GDDR5? And what about the fact that there are fewer of them, 4 compared to 8, as physical stacks on the PCB or die?
We already know the width & bandwidth, what else do you want?
 
Why are people viewing this as something bad? I read it as saying 4GB is plenty. How come no one posted the good news from that tweet?

Rinaldo, it's AMD; everything AMD does is bad. The strawmanning in this thread is ridiculous. Like liberals, they are all looking for something to whine and complain about. Typical Americans.
 
You're still going to be bottlenecked by PCIe speed and system RAM.

only so much it will be able to do.

Supposedly it decreases load and unload times, so perhaps there is some performance to be had there.
 
I was wondering whether the fact that HBM is closer to the die adds to performance, even slightly, over GDDR5? And what about the fact that there are fewer of them, 4 compared to 8, as physical stacks on the PCB or die?

Okay, since no one else seems to be answering the question you are actually asking, I'll explain. :D

There are three main reasons why HBM memory is located adjacent to the GPU:

1. Over 1,000 connections per memory chip.
-- Creating even 1,000 connections from a GPU on a PCB is expensive, imagine going four times that for HBM!

2. Shorter traces (wires) require less power.

3. Shorter traces require less management and routing.

The end result is that HBM will probably never be placed on a DIMM-type module for expansion and will almost exclusively be used in 2.5D/3D arrangements.
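To put rough numbers on the wide-and-slow trade-off behind those points, here is a small sketch. It assumes 1,024 data lines per HBM stack (the post above only says "over 1,000"), four stacks, and a 512-bit GDDR5 bus for comparison; the 512 GB/s target is also an assumption, not a figure from this thread.

```python
def required_pin_rate_gbps(target_gbs, bus_width_bits):
    """Per-pin data rate (Gbit/s) needed to hit a target bandwidth (GB/s)
    on a bus of the given width."""
    return target_gbs * 8 / bus_width_bits

HBM_BITS_PER_STACK = 1024   # assumed data width per stack
HBM_STACKS = 4              # "four times that", per the post above
hbm_bus_bits = HBM_BITS_PER_STACK * HBM_STACKS
gddr5_bus_bits = 512        # assumed wide GDDR5 bus for comparison

target = 512                # GB/s, assumed target bandwidth

hbm_rate = required_pin_rate_gbps(target, hbm_bus_bits)
gddr5_rate = required_pin_rate_gbps(target, gddr5_bus_bits)

print(f"HBM: {hbm_bus_bits} data lines at {hbm_rate:.1f} Gbps each")
print(f"GDDR5: {gddr5_bus_bits} data lines at {gddr5_rate:.1f} Gbps each")
```

Thousands of slow lines are only practical over very short distances, which is why the stacks have to sit right next to the GPU.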
 
Okay, since no one else seems to be answering the question you are actually asking, I'll explain. :D

There are three main reasons why HBM memory is located adjacent to the GPU:

1. Over 1,000 connections per memory chip.
-- Creating even 1,000 connections from a GPU on a PCB is expensive, imagine going four times that for HBM!

2. Shorter traces (wires) require less power.

3. Shorter traces require less management and routing.

The end result is that HBM will probably never be placed on a DIMM-type module for expansion and will almost exclusively be used in 2.5D/3D arrangements.

GDDR5 also needs older analog regulators, which take up a lot more space.
 
Just gonna leave this here

EGxN0TK.jpg


Source

I think the table should be pretty self explanatory. These are all supposed to be XFX DD cards btw
 
Just gonna leave this here

EGxN0TK.jpg


Source

I think the table should be pretty self explanatory. These are all supposed to be XFX DD cards btw

If the 390X is 5% slower than the 980 and it's priced almost 100 below it, then I really don't think it's a bad price by any stretch of the imagination, and on top of that it doubles the memory of the 980. I will leave overclocking aside, as I am sure they can both be overclocked, so that argument is a waste. Stock vs. stock, I don't see anything wrong with the price there. I am sure you will find these around 350 on sale here and there.
 
there's some cherry picking going on if they got R9 390 being +7% over a GTX 970

That is probably overall among all games, and if the 390 beats the 970 in one game by a big margin it might have shifted the result. But I don't see why it couldn't match the 970 at high resolutions at least. These probably don't throttle from heat and power draw like the original 290s did, either.
 