AMD Radeon R9 Fury X Video Card Review @ [H]

At least you called a spade a spade and admitted that it's a VRAM issue. I commend you for that!
 
https://youtu.be/8hnuj1OZAJs?t=91
https://youtu.be/8hnuj1OZAJs?t=136

I was just watching some gameplay footage with this card at 4K. Look at what happens at the 1:39 and 2:20 marks: massive stutter. It's like the card is running out of VRAM, because the usage drops a lot when this happens; it seems like it's swapping new assets in and out.

Someone on Reddit talked about the differences between GDDR5 and HBM in terms of how data is fed to the GPU:

https://www.reddit.com/r/Amd/comments/3b6n1c/absolutely_nobody_is_talking_about_fury_x_and/

This was the first thing that popped into my head when I learned that HBM would achieve its high bandwidth by being extremely wide.

Now it seems that the Fury X has 8x 512-bit wide memory controllers, totalling 4096 bits. It's also dual issue.

For comparison, the 980 Ti has 12x 32-bit memory controllers (the vanilla 980 has 8).

GDDR5 chips are 32 bits wide (fitting nicely) and transfer 2 lots of data per clock. So a minimum of 64 bits of data is written per clock to one area of memory.

In the case of the Ti, that means each controller writes a minimum of 64 bits per clock to one page of memory. It cannot write to multiple pages simultaneously, so in order to get the most out of its controller it has to coalesce memory accesses into contiguous 64-bit lumps. That's obviously not so difficult, but memory granularity is the precise reason we have 12x 32-bit controllers instead of 1x 384-bit controller (because those controllers cost die space and transistors!). It's not so difficult because everything is 32-bit nowadays, and when you are writing a framebuffer and Z-buffer pixel you are writing 64 bits of data (without compression).

However, whack the bus width up to 512 bits, remaining dual issue, and suddenly every time you write to memory you need to write 1024 bits (128 bytes) in one go, or else you are wasting bandwidth. If you have a ton of stuff to write, and it's all over different areas of memory, you're boned, because you can only write to one memory page at a time. GDDR5 has a page size of 2kb (no idea about the HBM implementation). Changing your memory page incurs a latency penalty.
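To put rough numbers on that granularity argument, here's a quick back-of-the-envelope C sketch using the figures speculated above (a 512-bit dual-issue controller, so a 128-byte minimum burst). None of these values are confirmed HBM specs; it just shows how much of each burst goes to waste when writes are small and scattered.

```c
/* Back-of-the-envelope model of bus efficiency vs. write granularity.
 * The 128-byte minimum burst is the figure speculated above
 * (512-bit bus, dual issue -> 1024 bits per access), not a confirmed spec. */
#include <stdio.h>

#define MIN_BURST_BYTES 128u

/* Fraction of one burst that carries useful data when a scattered write
 * of 'write_bytes' has to occupy a whole burst slot on its own. */
static double burst_efficiency(unsigned write_bytes)
{
    unsigned useful = write_bytes < MIN_BURST_BYTES ? write_bytes : MIN_BURST_BYTES;
    return (double)useful / (double)MIN_BURST_BYTES;
}

int main(void)
{
    /* 8 bytes = one uncompressed 32-bit colour + 32-bit Z write,
     * the "64 bits per pixel" case from the post above. */
    unsigned sizes[] = { 8, 32, 64, 128 };
    for (unsigned i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        printf("%3u-byte scattered write -> %5.1f%% of burst used\n",
               sizes[i], 100.0 * burst_efficiency(sizes[i]));
    return 0;
}
```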

Quite clearly, the memory granularity issue with HBM just got a whole lot more severe. I wanted to know what AMD has done to alleviate this. How much extra cache does Fiji have to handle this?

Can somebody do some testing with single-pixel polygons? This is how you'd show up memory granularity tanking the efficiency of a memory bus: throw millions of single-pixel polygons at it that aren't connected to each other, so they are separate objects with separate draw calls.

I'm willing to bet that doing a test like this would hurt both cards a lot (because single pixel polygons are bastards for efficiency everywhere, they're a worst case scenario I suppose) but I think it might absolutely kill the Fury.

If it does, then AMD really need to work on their caching to make best use of the memory bus.

Or... I could be talking a whole lot of hot air. ;)

I'm inclined to think it's a driver issue and AMD still has a lot of work to do in optimizing for HBM usage; otherwise I assume we'd have heard complaints about stutter from 2x GTX 980 or 2x 290X setups by now, with their 4GB frame buffers.
 
The normal Fury might prove to be a lot better for those reasons as well, because it would probably have better cooling on the VRMs.

Assuming first of all that Air Fury won't be a cut down chip, if Air Fury was in any way superior to Water Fury, why wouldn't they have put Air Fury out first - especially because it would be the cheaper card to produce without the watercooler on the BOM? No. Water Fury is their best foot forward, it's highly unlikely that Air Fury has better stock performance or OC'ing headroom.
 
Makes you wonder, but I'm sure NVIDIA has run some weird test to boost things.

Which is why we shouldn't pay any attention to the performance graphs these companies show. Instead we'll have to rely on reputable review sites such as [H] to determine the actual performance figures, the improvements over previous hardware releases, etc.

Our focus was not entirely spent at 4K either.

I for one will not ignore 1440p gamers; that is the resolution where you are able to mostly maximize settings in most games with new cards, and pretty much the resolution to use if you want great-looking PC gaming with maximum game settings at acceptable performance. At 4K you cannot max out games; the GPUs aren't powerful enough yet. 1080p is mostly a given, every game will perform great at 1080p on high-end cards, but 1440p still presents a challenge for some, and is IMO the best test of a graphics card. 4K gaming, as I said, is growing, but it is nowhere near the saturation levels of 1440p and 1080p PC gaming.

Thanks, and please keep up the good work. Those lower resolutions are certainly still very relevant to many of us. And when making a purchase decision, I would certainly want to see how a piece of hardware performs at the resolution I'm gaming at, regardless of whether that resolution is the most challenging scenario or not.
 
Someone on Reddit talked about the differences between GDDR5 and HBM in terms of how data is fed to the GPU:

https://www.reddit.com/r/Amd/comments/3b6n1c/absolutely_nobody_is_talking_about_fury_x_and/



I'm inclined to think it's a driver issue and AMD still has a lot of work to do in optimizing for HBM usage; otherwise I assume we'd have heard complaints about stutter from 2x GTX 980 or 2x 290X setups by now, with their 4GB frame buffers.

Frame buffer size is distinctly different from memory granularity and how writes work at that level. This has to do with optimizing transfer sizes, latency between changing memory pages, and cache lookups of where said page happens to be. Drivers can help to a point, but it's also in silicon, and an overall design decision that could be good ~or~ bad in the long run, depending on how games develop over the next few years.
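As a purely illustrative example of that "transfer size vs. page-change latency" point (all numbers made up; nothing here is a measured HBM or GDDR5 figure), a toy model might look like this:

```c
/* Toy model: effective bandwidth when every transfer lands on a new memory
 * page and pays a fixed page-switch penalty.  All values are illustrative. */
#include <stdio.h>

int main(void)
{
    double peak_gbps        = 512.0;  /* 512 GB/s == 512 bytes per nanosecond */
    double page_switch_ns   = 30.0;   /* assumed page-change latency */
    double transfer_bytes[] = { 128, 1024, 4096, 65536 };

    for (int i = 0; i < 4; i++) {
        double bytes   = transfer_bytes[i];
        double xfer_ns = bytes / peak_gbps;             /* time spent moving data */
        double eff     = xfer_ns / (xfer_ns + page_switch_ns);
        printf("%6.0f-byte transfer per page switch -> %5.1f%% of peak\n",
               bytes, 100.0 * eff);
    }
    return 0;
}
```

The bigger each contiguous transfer, the better the switch penalty is amortised, which is the whole coalescing argument in a nutshell.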
 
Different price and they could have simply touted the performance at silent levels for SFFs.
Even just being close to the 980 Ti at $575 or $600 would have been acceptable.
That's my biggest issue with this release.
I'm still going 980Ti though.
 
https://youtu.be/8hnuj1OZAJs?t=91
https://youtu.be/8hnuj1OZAJs?t=136

I was just watching some gameplay footage with this card at 4K. Look at what happens at the 1:39 and 2:20 marks: massive stutter. It's like the card is running out of VRAM, because the usage drops a lot when this happens; it seems like it's swapping new assets in and out.

I don't think that has to do with VRAM, personally. Looks like very unoptimized drivers to me. My cards are a paltry 2GB and I've run settings on games that well exceed that, and it never stutters like that. I get slowdowns to be sure, but it never actually pauses like it did in that video.
 
That guy in the video says he's running 4K, so the VRAM is very likely the cause of those pauses. It's swapping textures etc. with system RAM over the PCIe bus. It doesn't matter how fast the HBM bus is; if you've got to go to system RAM, shit will bog down...

8GB HBM will hopefully do better, for both AMD and NVIDIA. Both would do well to take note and not even bother releasing anything HBM with less than 8GB of VRAM (unless it's aimed at a very low-end market segment, or mobile).

For now, this is a 1080p, 1200p, or 1440p card. At 1440p you will not be able to max everything in all games, but you will be able to in some.

I think the silicon's performance/speed is hitting a wall. No matter how cool you get it, at some point it just cannot go any faster. Voltage limits of the card design could also play a part in that. But who wants a 100C card in their case? The back side will melt PCIe slots if any significant overclocking happens. The NVIDIA silicon, however, is using a bit less power but overclocks way better, i.e. it's a better design at its core.

The only thing that's going to improve upon this GCN 1.3 offering is a massive die shrink. That will lower power consumption and temps, which allows for higher clocks. It likely needs more ROPs as well. If I had to guess, they've already learned from this thing; in the December/January timeframe a GCN 1.4 will come out with 128 ROPs. They had to watercool this thing just to get to the performance we are seeing. It barely overclocks. To me, the added watercooler isn't a feature I want. And when (if) they come out with an air-cooled version, clocks are likely to be lowered on the GPU.

Who has any data on how much drivers for GCN have improved performance, on a per-GCN-version basis? Like, take a GCN 1.0 card (7900 or 280X), run it with release-day drivers, and again with current drivers, to see the performance improvement delta? Then do the same with GCN 1.1 (290X), and again with GCN 1.2 (285). [fortune teller mode] I would wager that the improvements on GCN 1.2 are small, but on 1.0 they are likely more significant. If the 1.2 improvements are small, 1.3's will likely be as small or even smaller [/fortune teller mode]. Educated guess from extrapolated data. If every GCN release has seen big performance improvements, then Fury might see the same amount of improvement, and from that you can also extrapolate how much possible improvement we could see. Then, comparing the performance improvement delta for NVIDIA's GPUs in the same way, a future fortune-telling comparison could be made.

Somehow, I think the NVIDIA GPU would wind up still in the lead. Throw in overclocking, and the lead would be even greater.
 
I don't think it's likely at all. At least, not strictly a VRAM issue. Even if you look at [H]'s own data, Fury is not as far behind a 980 Ti at 4K as it is at 1440p. If it were strictly a VRAM issue it would tank harder at 4K, but instead it's narrowing the gap.


I can run 200% resolution scale in BF4, which is for all intents and purposes 4K (actually a bit higher, since I'm at 1200p), and it won't pause like it did in that video, and that's with 2GB cards. I have NEVER seen a game pause like that because VRAM was saturated.
 
I don't think it's likely at all. At least, not strictly a VRAM issue. Even if you look at [H]'s own data, Fury is not as far behind a 980 Ti at 4K as it is at 1440p. If it were strictly a VRAM issue it would tank harder at 4K, but instead it's narrowing the gap.


I can run 200% resolution scale in BF4, which is for all intents and purposes 4K (actually a bit higher, since I'm at 1200p), and it won't pause like it did in that video, and that's with 2GB cards. I have NEVER seen a game pause like that because VRAM was saturated.

It's strange, because you can see his VRAM usage tank by around 1GB when it happens. Maybe NVIDIA's drivers are better at handling these situations due to the 970 fiasco? (bahaha) His VRAM gets up to around 3.6GB, then dumps down to around 2.7GB when the stuttering happens.
 
It's strange, because you can see his VRAM usage tank by around 1GB when it happens. Maybe NVIDIA's drivers are better at handling these situations due to the 970 fiasco? (bahaha) His VRAM gets up to around 3.6GB, then dumps down to around 2.7GB when the stuttering happens.

Yeah, it could certainly be that NVIDIA's drivers cope better with swapping assets in and out of VRAM than AMD's do.
 
Is there any info on what AMD meant about tuning specific games to make sure that 4K never means more than 4GB of RAM?

Because that sounds to me like custom profiles to increase texture compression and lower shader precision in ways which may not be noticeable without close scrutiny of screenshots.
 
I so wanted this to be my next card. Skipped out on some great deals and sold off parts in anticipation of it. What a let down. And the smugness of AMD with all the waiting and that press conference only to lay an egg a few days later. At $499 it would have been a sure purchase though. Ugh, I think I will now just wait until the fall for some black Friday deal.
 
Thought I saw an AMD slide saying that Fury X was the "gateway to 5K gaming." 5K would be quad 1440p. They must have been referring to 5K solitaire. From what I understand, the hierarchy will be Fury X, Fury, Fury Nano, then 390X. I also recall them saying that the Fury Nano will be substantially more powerful than the 390X. The problem is that the Fury X is not much better than the 390X. It will be interesting to see how the other two cards thread their way in there.
 
AMD knew the card couldn't sell itself for $649, much less the original rumored price of what... $800 or so? So they relied on hype and misinformation to do it for them.
 
Hard-punching review, but good. The sampling of games may be a little on the light side, but that has to be due to the rather small window to get the initial review out. Thanks for the continued great work.

4K, CFX, and multi-monitor gaming coverage would be great, plus Windows 10 (too early, but very applicable) and OCing.

I like the cooling, size and HBM memory allowing for a radical change to the design - that is really cool.

4GB? It is just too little for long-haul high-resolution gaming. The price is too high, but since they sold them all out without even a game bundle or hint of one, the initial price is probably OK for the first-in-line crowd willing to pay (some will no doubt feel like suckers later on). The real price, I think, should be less than $600, mostly due to only having 4GB of memory. $599 would be OK; $549 would probably be hard not to buy. So memory and price are the biggest issues I have with the card.

At $649 the Titan X is starting to look reasonable :D. The 980 Ti at this time is the better performance/buck.
 
Being sold out means literally nothing unless you know how plentiful the supply was to begin with.
 
Thought I saw an AMD slide saying that Fury X was the "gateway to 5K gaming." 5K would be quad 1440p. They must have been referring to 5K solitaire. From what I understand, the hierarchy will be Fury X, Fury, Fury Nano, then 390X. I also recall them saying that the Fury Nano will be substantially more powerful than the 390X. The problem is that the Fury X is not much better than the 390X. It will be interesting to see how the other two cards thread their way in there.

THIS.

I'm really curious to see how this plays out.
 
Does HBM show any kind of improvement on anything? Like less of a hit with SSAA, or anything else? Has anyone done memory benchmarks with HBM on the Fury? How about compute-type programs (which would also be limited by the 4GB)?
 
HBM is an improvement on just about everything. It's important not to confuse Fury's performance with HBM; the only reason it's even performing as well as it is, is because of HBM. A lot more bandwidth, less power consumption, and a much smaller package. It's definitely a superior technology to GDDR5.
 
I wonder what the latency on the HBM is. It's obviously not streaming into the chip on 4096 dedicated i/o pads.
 
Yes, and who needs that superiority when GDDR5 still bests it today?

Except it doesn't. 980 Ti > Fury X isn't the same thing. Otherwise NVIDIA wouldn't bother with it for Pascal. If the Fury X were using GDDR5 it would be slower and consume way more power.
 
HBM is an improvement on just about everything. It's important not to confuse Fury's performance with HBM; the only reason it's even performing as well as it is, is because of HBM. A lot more bandwidth, less power consumption, and a much smaller package. It's definitely a superior technology to GDDR5.

There is a downside that may be the reason for some of the performance issues: memory granularity, which was mentioned before.
The size of a data write is much, much larger on HBM, and that can lead to not being able to complete a write in one pass, because when data has to be placed in more than one page, only one page can be written to at a time.
This can lower maximum bandwidth and increase latency.
More info:
https://www.reddit.com/r/Amd/comments/3b6n1c/absolutely_nobody_is_talking_about_fury_x_and/
 
Except it doesn't. 980 Ti > Fury X isn't the same thing. Otherwise NVIDIA wouldn't bother with it for Pascal. If the Fury X were using GDDR5 it would be slower and consume way more power.

Yes, it's a shame that AMD has to resort to HBM to keep up with NVIDIA's GDDR5. That being said, I can't wait till NVIDIA starts their round of HBM; it's actually going to be groundbreaking, methinks.
 
I'm tired of the drivers excuse. [H] has to review the card as it is, not as it could be in 1, 3, 6, or 12 months. It's not like AMD just got cards to use. They've had cards for many months. They've had plenty of time to work on drivers. Unless it's a really small team which would be a shame.
 
If it's going to be improved on substantially, AMD needs to explicitly state why and how, with an official statement they will stand by; otherwise there's little point in having hope.
 
I'm tired of the drivers excuse. [H] has to review the card as it is, not as it could be in 1, 3, 6, or 12 months. It's not like AMD just got cards to use. They've had cards for many months. They've had plenty of time to work on drivers. Unless it's a really small team which would be a shame.
A delayed game is eventually good, but a rushed game is forever bad.

Drivers are about as good as they will ever be. Maybe an extra 5% can be coaxed out of them, but the card isn't going to magically gain 25% more performance. It's not like GCN is a new arch; the only thing new is HBM. The drivers are as good as AMD is willing to invest in them; further optimization is fine-tuning, not a major redesign with major performance gains.
 
There is a downside that may be the reason for some of the performance issues: memory granularity, which was mentioned before.
The size of a data write is much, much larger on HBM, and that can lead to not being able to complete a write in one pass, because when data has to be placed in more than one page, only one page can be written to at a time.
This can lower maximum bandwidth and increase latency.
More info:
https://www.reddit.com/r/Amd/comments/3b6n1c/absolutely_nobody_is_talking_about_fury_x_and/

That's all dependent on the instruction widths supported and whether there's a delayed-write cache implemented.

I can byte-align, word-align, DWORD-align, or QWORD-align my code. DWORD is the most efficient in terms of speed, but then I waste memory if I'm writing single-byte structures.
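For anyone who hasn't run into this, here's a tiny C illustration of that trade-off (the struct names are just examples). On typical ABIs the compiler pads fields to their natural alignment, so byte-sized data living in DWORD-aligned structures burns space:

```c
/* Rough illustration of the alignment trade-off described above:
 * fields are padded to their natural boundaries, so byte-sized data
 * in a DWORD-aligned structure wastes memory. */
#include <stdio.h>
#include <stdint.h>

struct byte_packed {      /* three single-byte fields, byte-aligned: 3 bytes */
    uint8_t a, b, c;
};

struct dword_aligned {    /* a byte followed by a DWORD */
    uint8_t  tag;         /* 1 byte */
                          /* 3 padding bytes so 'value' starts on a 4-byte boundary */
    uint32_t value;       /* 4 bytes -> sizeof is 8, not 5, on typical ABIs */
};

int main(void)
{
    printf("byte-packed struct  : %zu bytes\n", sizeof(struct byte_packed));
    printf("DWORD-aligned struct: %zu bytes\n", sizeof(struct dword_aligned));
    return 0;
}
```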
 
I'm tired of the drivers excuse. [H] has to review the card as it is, not as it could be in 1, 3, 6, or 12 months. It's not like AMD just got cards to use. They've had cards for many months. They've had plenty of time to work on drivers. Unless it's a really small team which would be a shame.

Same. IF it were true, which I doubt, it'd actually be a good reason not to buy the card, based on AMD's driver release cycle.
 
That's all dependent on the instruction widths supported and whether there's a delayed-write cache implemented.

I can byte-align, word-align, DWORD-align, or QWORD-align my code. DWORD is the most efficient in terms of speed, but then I waste memory if I'm writing single-byte structures.

It's also dependent on whether the driver can always write all the data in one pass.
That depends on the data coming in consecutive blocks large enough for the cache to cope with; otherwise the data rate will drop substantially.
The cache can't hold onto data that has to be written in a timely fashion, and it has limited size.
When the data is too fragmented, it could be a real issue for latency and bandwidth.
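Something like the sketch below is what I picture: a deliberately simplified write-combining buffer, with the page and burst sizes taken from the speculation earlier in the thread rather than any real spec. Small writes to the same page get merged into one burst, while writes scattered across pages each force their own flush.

```c
/* Minimal sketch of a write-combining buffer of the kind described above.
 * Entirely illustrative; the 2 KB page and 128-byte burst are the figures
 * speculated earlier in the thread, not confirmed hardware values. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE   2048u
#define BURST_SIZE  128u

struct wc_buffer {
    uint64_t page;             /* page currently being accumulated */
    unsigned filled;           /* bytes buffered so far */
    unsigned flushes;          /* how many bursts actually hit memory */
};

static void flush(struct wc_buffer *b)
{
    if (b->filled) {
        b->flushes++;          /* one burst goes out, fully or partially used */
        b->filled = 0;
    }
}

static void wc_write(struct wc_buffer *b, uint64_t addr, unsigned len)
{
    uint64_t page = addr / PAGE_SIZE;
    if (page != b->page || b->filled + len > BURST_SIZE)
        flush(b);              /* new page or buffer full: send what we have */
    b->page    = page;
    b->filled += len;
}

int main(void)
{
    struct wc_buffer b = {0};

    /* 64 contiguous 8-byte writes coalesce into a handful of bursts... */
    for (unsigned i = 0; i < 64; i++) wc_write(&b, i * 8, 8);
    flush(&b);
    printf("contiguous: %u bursts\n", b.flushes);

    /* ...while 64 writes scattered across different pages each need their own. */
    b.flushes = 0;
    for (unsigned i = 0; i < 64; i++) wc_write(&b, i * PAGE_SIZE, 8);
    flush(&b);
    printf("scattered : %u bursts\n", b.flushes);
    return 0;
}
```

Run it and the contiguous case comes out to 4 bursts for 512 bytes, while the scattered case needs 64, which is basically the fragmentation problem in miniature.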
 
I'm tired of the drivers excuse. [H] has to review the card as it is, not as it could be in 1, 3, 6, or 12 months. It's not like AMD just got cards to use. They've had cards for many months. They've had plenty of time to work on drivers. Unless it's a really small team which would be a shame.

How does the driver excuse affect you? You are using an NVIDIA card till it dies (according to your signature). Go about your day and don't let some company whose products you've probably never used bother you.
 
How does the driver excuse affect you? You are using an NVIDIA card till it dies (according to your signature). Go about your day and don't let some company whose products you've probably never used bother you.


I have used both AMD and NVIDIA; I'm just going with the 980 Ti this time, coming from my current card, a GTX 680. I waited a long time before upgrading.

AMD has always been good. I think the expectations for this card, in conjunction with current events, NVIDIA's market share, and their previous outings, just made it a long road for AMD to go down.

NVIDIA releasing the GTX 980 Ti at this time, and it being a really good card, made things even more difficult for AMD.
 
How does the driver excuse affect you? You are using an NVIDIA card till it dies (according to your signature). Go about your day and don't let some company whose products you've probably never used bother you.

But I have used products from this company, even before it was this company. It's funny that ATI can be bought out by another company and still retain its allegedly bad qualities. People here continuously make the same driver excuse for AMD, but now they can't use the "it's cheaper" excuse.
 
I'm tired of the drivers excuse. [H] has to review the card as it is, not as it could be in 1, 3, 6, or 12 months. It's not like AMD just got cards to use. They've had cards for many months. They've had plenty of time to work on drivers. Unless it's a really small team which would be a shame.

Well, in reality there is not that much time between final silicon and launch... they do the best they can, and unfortunately for BOTH NVIDIA and AMD there is no way their launch drivers will ever be truly polished.

AMD 290X shows a HUGE improvement between launch and now in terms of driver performance

Jan 14, HOCP reviews the 290X and it does only 1080p in the Dying Light game. Now (as of last week) it runs 2560x1440 with the same settings and frame rates (well within 2 fps).

doubt it?

http://hardforum.com/showpost.php?p=1041694207&postcount=74

2,073,600 pixels (1920x1080) * 59.3 fps = 122,964,480 pixels a second

3,686,400 pixels (2560x1440) * 58.1 fps = 214,179,840 pixels a second

so the launch drivers only exposed 57.41% of the card's actual performance as compared to today.
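Quick sanity check of that arithmetic in C, if anyone wants to rerun it with their own numbers (the frame rates are the ones from the linked post):

```c
/* Verify the pixels-per-second comparison above. */
#include <stdio.h>

int main(void)
{
    double launch = 1920.0 * 1080.0 * 59.3;   /* 1080p at launch-driver frame rate */
    double today  = 2560.0 * 1440.0 * 58.1;   /* 1440p on current drivers */
    printf("launch: %.0f px/s\n", launch);
    printf("today : %.0f px/s\n", today);
    printf("ratio : %.2f%%\n", 100.0 * launch / today);   /* ~57.41% */
    return 0;
}
```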
 
so the launch drivers only exposed 57.41% of the card's actual performance as compared to today.

That comparison doesn't even make sense, since it appears there are different settings being used, a different version of the game, different drivers, and, having not played the game, possibly even a different play area. That's far too many variables to simply say "the drivers did it".
 
Well, in reality there is not that much time between final silicon and launch... they do the best they can, and unfortunately for BOTH NVIDIA and AMD there is no way their launch drivers will ever be truly polished.

AMD 290X shows a HUGE improvement between launch and now in terms of driver performance

Jan 14, HOCP reviews the 290X and it does only 1080p in the Dying Light game. Now (as of last week) it runs 2560x1440 with the same settings and frame rates (well within 2 fps).

doubt it?

http://hardforum.com/showpost.php?p=1041694207&postcount=74

2,073,600 pixels (1920x1080) * 59.3 fps = 122,964,480 pixels a second

3,686,400 pixels (2560x1440) * 58.1 fps = 214,179,840 pixels a second

so the launch drivers only exposed 57.41% of the card's actual performance as compared to today.


Hmm, can't compare the reviews; different level, different settings.
 