AMD Polaris GCN 4.0 Macau China Event

[H] reviews have ALWAYS had a small sample size of games tested.

You people don't understand basic maths if you think a small handful of games can really be used as a reliable metric to judge GPU hardware. I merely raised an example: take DX12 Quantum Break (where a 390 = 980 Ti), Hitman DX12 etc., and now your small sample size is totally AMD-biased and NV GPUs would look shit in comparison.

Do you people not see how having a handful of games, and selecting mostly titles sponsored by NVIDIA, skews the results in their favor? Really? Is it that hard a concept?
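A minimal sketch of the sampling argument, assuming some made-up per-game deltas (none of these numbers are real benchmark results): when most of the small pool of tested titles leans toward one vendor, the reported average shifts with the selection rather than the hardware.

```python
import random

# Hypothetical per-game performance deltas (vendor A vs vendor B, in %).
# Positive favours A, negative favours B. Invented numbers, purely to
# illustrate selection bias, not real benchmark data.
a_sponsored = [12, 9, 15, 8, 11]   # titles that lean toward vendor A
b_sponsored = [-10, -6, -14, -8]   # titles that lean toward vendor B

def average(sample):
    return sum(sample) / len(sample)

# A "review" that picks 5 of its 6 games from vendor A's sponsored pool:
skewed = a_sponsored + b_sponsored[:1]
print(f"Skewed 6-game sample:   A ahead by {average(skewed):+.1f}% on average")

# A balanced draw from both pools tells a different story:
balanced = random.sample(a_sponsored, 3) + random.sample(b_sponsored, 3)
print(f"Balanced 6-game sample: A ahead by {average(balanced):+.1f}% on average")
```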

Just to give you another example.

Fury X = 1080 in DX12 Warhammer.



The current game selection, again: 5 NV-sponsored titles out of the 6 tested.

Good to see how AMD GPUs perform in NV-sponsored games, for sure. AMD doesn't appreciate it, and I can see why they won't engage with this site.


Love your bias

Total War: Warhammer benchmarks strike fear into CPUs | PC Gamer

[benchmark charts from the linked article]


Showing us the CPU-bottlenecked settings of a game now?
 
Yep, that's why it's always better to see real-world gameplay benchmarks; they give you more data to compare against.
 
Quite a spread. Wonder how that will sit with those who threw a rod over nVidia having little GPU-decorated cakes?
AMD is handing out little die-cakes that are glazed and sprinkled with chocolate chips asynchronously and concurrently.

Nvidia can glaze+sprinkle concurrently, but only at the tray-of-cakes level.
 
LOL, you do realize that graph is CPU bottlenecked, right?

But you guys used to go on about AMD drivers being inefficient on CPU usage, so surely that makes you wrong in this case? It's faster than the 1080, which is usually 10-25% faster everywhere else, lol.
 
I don't have a problem with these events. They get a chance to wine and dine their reviewers, and hand out cards at the end, making sure there are no early leaks. They also get guaranteed in-depth reporting from those they flew out.

And the most expensive part is the airfare. The hotels are a lot cheaper than the W Nvidia used in Austin. They probably end up costing about the same.
 
But you guys used to go on about AMD drivers being inefficient on CPU usage, so surely that makes you wrong in this case? It's faster than the 1080, which is usually 10-25% faster everywhere else, lol.

Lol. Really didn't put any thought into this, did you?

[benchmark chart]
 
Lol. Really didn't put any thought into this, did you?

It's slower at 4k but faster in CPU limited benchmarks.

Razor1: It's a CPU limited benchmark.
AMDDriversucks crowd (incl you sometimes and razor1) for the last while: Any API/Driver overhead sucks cus AMD lelz

That's not what I'm seeing here. If it were CPU limited and 'AMD drivers suck' as you guys always go on about, then surely it would be even slower again with the overhead, yes?
 
It's slower at 4k but faster in CPU limited benchmarks.

Razor1: It's a CPU limited benchmark.
AMDDriversucks crowd (incl you sometimes and razor1) for the last while: Any API/Driver overhead sucks cus AMD lelz

That's not what I'm seeing here. If it were CPU limited and 'AMD drivers suck' as you guys always go on about, then surely it would be even slower again with the overhead, yes?

Man, this comment made my head spin.

What do you mean slower with overhead? I'm so lost, man, what are you on about...

Faster in CPU limited benchmarks? So it's faster when it's not being used? I'm genuinely at a loss for words here.
 
[two benchmark charts]


Do these two images, in this sequence, mean anything to you?
[pcgameshardware.de benchmark screenshot]
 
I don't have a problem with these events. They get a chance to wine and dine their reviewers, and hand out cards at the end, making sure there are no early leaks. They also get guaranteed in-depth reporting from those they flew out.

And the most expensive part is the airfare. The hotels are a lot cheaper than the W Nvidia used in Austin. They probably end up costing about the same.
I really do. It's scummy for journalists to accept all-expenses-paid trips in that way, and while I'm sure everybody swears up, down, left and right that it doesn't impact their coverage, we all know it often gives them the benefit of the doubt.

That's the only way to get embargoed hardware, so they all go. But I'm sure a lot of these guys feel like hookers doing it, and wish it wasn't necessary.
 
But you guys used to go on about AMD drivers being inefficient on CPU usage, so surely that makes you wrong in this case? It's faster than the 1080, which is usually 10-25% faster everywhere else, lol.


That was an inherent problem with AMD's DX11 drivers. This benchmark was DX12, where that shouldn't happen, and doesn't happen: nV's and AMD's cards land where they should in this game's DX12 version, just like in DX11 when the GPU is the bottleneck (this is why I posted the DX11 and DX12 GPU-limited results, just for comparison's sake). Yet this guy shows us a graph and expects us to believe that the 1080 and Fury X are equal, when the graph clearly states they were testing CPU limitations? Are we fools? Looks like some people are, the ones who liked his post without bothering to understand what he showed. He is trying to pull the rug out from under us.
 
Man, this comment made my head spin.

What do you mean slower with overhead? I'm so lost, man, what are you on about...

Faster in CPU limited benchmarks? So it's faster when it's not being used? I'm genuinely at a loss for words here.

FFs lol.

Visual example. CPU 0-8 has five bars of utilisation each representing 20% of total power available - |||||

Nvidia drivers 'supa wow faultless WHGQLZ' ever only use one bar of CPU - |
AMD drivers 'supa shit alwayz top kek amdsuxx' uses two bars of CPU - ||

Now we have a CPU limited benchmark which wants to steal all five ||||| bars for doing cpu shit and not gpu overhead.
This means that bitch ain't got any more utilisation left. Zero bars unless it defers some resources to the gpu overhead.


So along comes Nvidia superwow mega drivers, oh shit, it's only 20% of the total cpu usage required or | one bar of five ||||| used for GPU shit - result is a slight slowdown in overall performance due to GPU overhead vs no overhead (perfect case).

Along come AMD drivers 'topkeksuxlolz24/7 beta version', using two bars || of five ||||| for GPU overhead. This results in a noticeably larger slowdown vs the Nvidia drivers, and heaps more than a perfect 'no overhead' driver/API.

So, after all you guys bitching forever about AMD drivers sucking/mega overhead, we have a scenario where instead the AMD overhead appears to be far less, as it's faster at 1080p when CPU limited and slower everywhere else when GPU limited?

So, is AMD using less CPU than Nvidia? It's practically the only way that the Fury X and other AMD cards, which usually are slower in GPU-limited examples, can be so close in CPU-limited examples.
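A rough numeric version of the bars argument above, assuming invented figures (the 120 fps baseline and the driver shares are mine, not measurements): whatever fraction of the CPU the driver eats comes straight off the frame rate once you are CPU bound, so the fatter driver should lose the CPU-limited run, not win it.

```python
# Toy model of the "bars" argument. All numbers are invented for illustration;
# none are measured driver overheads.

CPU_FPS_IDEAL = 120.0  # fps the game logic could hit with a zero-overhead driver

def cpu_limited_fps(driver_cpu_share):
    """FPS when fully CPU bound: the driver's CPU share is lost to the game."""
    return CPU_FPS_IDEAL * (1.0 - driver_cpu_share)

lean_driver = 0.20  # "one bar of five" spent on GPU work submission
fat_driver = 0.40   # "two bars of five"

print(f"lean driver: {cpu_limited_fps(lean_driver):.0f} fps")  # 96 fps
print(f"fat driver:  {cpu_limited_fps(fat_driver):.0f} fps")   # 72 fps

# The point being argued in the thread: if the card with the supposedly fat
# driver wins the CPU-limited 1080p run, its driver can't be the one eating
# the extra bars in that API.
```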
 
That was an inherent problem with AMD's DX11 drivers. This benchmark was DX12, where that shouldn't happen, and doesn't happen: nV's and AMD's cards land where they should in this game's DX12 version, just like in DX11 when the GPU is the bottleneck. Yet this guy shows us a graph and expects us to believe that the 1080 and Fury X are equal, when the graph clearly states they were testing CPU limitations? Are we fools? Looks like some people are, the ones who liked his post without bothering to understand what he showed. He is trying to pull the rug out from under us.

Bahanime: master manipulator.
 
That was an inherent problem with AMD's DX11 drivers. This benchmark was DX12, where that shouldn't happen, and doesn't happen: nV's and AMD's cards land where they should in this game's DX12 version, just like in DX11 when the GPU is the bottleneck (this is why I posted the DX11 and DX12 GPU-limited results, just for comparison's sake). Yet this guy shows us a graph and expects us to believe that the 1080 and Fury X are equal, when the graph clearly states they were testing CPU limitations? Are we fools? Looks like some people are, the ones who liked his post without bothering to understand what he showed. He is trying to pull the rug out from under us.

Thank you for clarifying this. I thought I was going insane (leldra.. ;))

So the DX11 drivers are the overhead issue, but not DX12, due to the API changes. This is great to hear - it also potentially shows, from that dataset, that AMD has less overhead at this stage... more testing on this would be quite interesting indeed.

And yes, posting CPU-limited benches at peasant resolutions an FX is almost never run at is a bit of a rug pull indeed! Cheers
 
FFs lol.

Visual example. CPU 0-8 has five bars of utilisation each representing 20% of total power available - |||||

Nvidia drivers 'supa wow faultless WHGQLZ' ever only use one bar of CPU - |
AMD drivers 'supa shit alwayz top kek amdsuxx' uses two bars of CPU - ||

Now we have a CPU limited benchmark which wants to steal all five ||||| bars for doing cpu shit and not gpu overhead.
This means that bitch ain't got any more utilisation left. Zero bars unless it defers some resources to the gpu overhead.


So along comes Nvidia superwow mega drivers, oh shit, it's only 20% of the total cpu usage required or | one bar of five ||||| used for GPU shit - result is a slight slowdown in overall performance due to GPU overhead vs no overhead (perfect case).

Along come AMD drivers 'topkeksuxlolz24/7 beta version', using two bars || of five ||||| for GPU overhead. This results in a noticeably larger slowdown vs the Nvidia drivers, and heaps more than a perfect 'no overhead' driver/API.

So, after all you guys bitching forever about AMD drivers sucking/mega overhead, we have a scenario where instead the AMD overhead appears to be far less, as it's faster at 1080p when CPU limited and slower everywhere else when GPU limited?

So, is AMD using less CPU than Nvidia? It's practically the only way that the Fury X and other AMD cards, which usually are slower in GPU-limited examples, can be so close in CPU-limited examples.


DX12 doesn't have AMD's driver overhead issues.

You can clearly see there is a problem with the Fury X, as it should be equal to the 1080 in CPU-limited situations. But it isn't (DX11 CPU-limited graph).
 
Thank you for clarifying this. I thought I was going insane (leldra.. ;))

So the DX11 drivers are the overhead issue, but not DX12, due to the API changes. This is great to hear - it also potentially shows, from that dataset, that AMD has less overhead at this stage... more testing on this would be quite interesting indeed.

And yes, posting CPU-limited benches at peasant resolutions an FX is almost never run at is a bit of a rug pull indeed! Cheers

Hawaii behaves very differently to Fiji in DX11 here btw, anyway...

The overhead in DX11 is actually a consequence of GCN's graphics command processor - something about the command buffer that limits it to a single thread, I believe. Will google and post sauce.

DX12 doesn't have AMD's driver overhead issues.

You can clearly see there is a problem with the Fury X, as it should be equal to the 1080 in CPU-limited situations. But it isn't (DX11 CPU-limited graph).


I'm still confused about the last few posts; I genuinely have no idea what was being said, lol. A CPU bottleneck means the 970 and 1080 perform similarly - it really says nothing about the GPUs.
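That point boils down to one line of arithmetic. A minimal sketch with made-up throughput numbers (the 90 fps CPU cap and the GPU figures are assumptions, not benchmark data): the delivered frame rate is capped by whichever side is slower, so once the CPU is the cap, very different GPUs report the same fps.

```python
# Toy bottleneck model: delivered fps is limited by the slower of the two
# sides. All figures below are invented for illustration.

def delivered_fps(cpu_fps, gpu_fps):
    return min(cpu_fps, gpu_fps)

cpu_cap = 90.0  # what the CPU/driver side can feed at these settings

for name, gpu_fps in [("mid-range GPU", 100.0), ("high-end GPU", 160.0)]:
    print(f"{name}: {delivered_fps(cpu_cap, gpu_fps):.0f} fps")
# Both print 90 fps: the chart tells you about the CPU cap, not the GPUs.
```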
 
Yeah, AMD did state something like that, but I don't think that is the real cause. nV was able to use more threads for their driver, and AMD just wasn't, or doesn't want to create a multithreaded version of their driver for DX11. It is a lot of work - they pretty much have to rewrite their driver from scratch - and with DX12 being around, what is the real use?
 
Yeah, AMD did state something like that, but I don't think that is the real cause. nV was able to use more threads for their driver, and AMD just wasn't, or doesn't want to create a multithreaded version of their driver for DX11. It is a lot of work - they pretty much have to rewrite their driver from scratch.

I thought they did that, thought they rewrote the whole driver stack for Omega or Crimson or whatever it is they call it. I mean, FFS AMD. You give your fkin drivers NAMES but your GPU architectures get version numbers? Are you fucking serious!?

Yeah, but NV only started with multithreading when they implemented GigaThread in hardware - think it was Kepler?
 
So AMD gamed out because of a threading issue... FFS AMD. Guess they figured to just move to DX12 and flag DX11 - not worth the investment for the last few games on it where it'll maybe make some sort of worthwhile difference. Most major titles are likely to be DX12/Vulkan/OGL from now on.

This graph shows me they're pretty much neck and neck for overhead, if not sometimes a slight advantage in each class to the AMD setup overall - if we are mostly seeing differences in overhead here - with the 390 always ahead of the 970, too, when CPU limited.

My bad for the confusion, sorry!
 
I don't know if AMD can use more than one thread for their drivers in DX11 though. It could be a hardware problem, but I don't see why it would be limited to that, as MS gives explicit extensions for it.
 
So AMD gamed out because of a threading issue... FFS AMD. Guess they figured to just move to DX12 and flag DX11 - not worth the investment for the last few games on it where it'll maybe make some sort of worthwhile difference. Most major titles are likely to be DX12/Vulkan/OGL from now on.

This graph shows me they're pretty much neck and neck for overhead, if not sometimes a slight advantage in each class to the AMD setup overall - if we are mostly seeing differences in overhead here - with the 390 always ahead of the 970, too, when CPU limited.

My bad for the confusion, sorry!
NP dude!
 
I don't have a problem with these events. They get a chance to wine and dine their reviewers, and hand out cards at the end, making sure there are no early leaks. They also get guaranteed in-depth reporting from those they flew out.

And the most expensive part is the airfare. The hotels are a lot cheaper than the W Nvidia used in Austin. They probably end up costing about the same.


Fattens them up ;) for the slaughter.
 
I don't know if AMD can use more than one thread for their drivers in DX11 though. It could be a hardware problem, but I don't see why it would be limited to that, as MS gives explicit extensions for it.

Every threaded title in DX11 produced extremely poor performance results. In fact, they usually ended up slower (for both Nvidia and AMD). It was a flawed implementation.
 
We are not talking about applications here, we are talking about the driver being able to use more than one core ;) The application will always be limited to one core (execution-wise in DX11), but as long as the driver isn't on the same core it works out better.
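A minimal conceptual sketch of that split, assuming an invented command-queue setup (the draw-call tuple, queue and worker names are mine, not anything from a real driver): the game's render thread just records draw calls, while a separate driver worker translates and submits them, so the driver's CPU cost can land on another core instead of stealing time from the game thread.

```python
import queue
import threading

# Hypothetical stand-in for what a multithreaded driver does: the app thread
# records draw calls, a driver worker thread translates/submits them.
draw_call_queue = queue.Queue()
STOP = object()  # sentinel to shut the worker down

def driver_worker():
    """Ideally runs on another core: translates and submits recorded calls."""
    while True:
        call = draw_call_queue.get()
        if call is STOP:
            break
        # ... validate state, build the hardware command buffer, submit ...
        draw_call_queue.task_done()

def game_render_thread(num_calls):
    """The game thread only records work; it doesn't pay the translation cost."""
    for i in range(num_calls):
        draw_call_queue.put(("draw", i))
    draw_call_queue.put(STOP)

worker = threading.Thread(target=driver_worker)
worker.start()
game_render_thread(10_000)  # main thread plays the role of the game thread
worker.join()
```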
 
Gigathread

In this era of multi-core processors, I'm sure that the idea of threading isn't a new one to most of you. In CPU terms, multi-threading is a word often bandied about these days, and involves splitting an application's workload into various 'threads' to take advantage of the parallelism of dual- and quad-core CPUs - in a game, for example, you could have one core for physics and one for A.I.

Although threading in the context of 3D rendering isn't exactly the same, the basic premise is similar - To make use of the incredible parallelism inherent in a modern GPU by effectively splitting up the workload. However, rather than splitting the workload into particular tasks or threads at a developer level, GPUs handle threading by themselves, by splitting the data to be rendered into batches of pixels which can then be sent to wherever they need to go in the GPU.

We've already mentioned the scheduler present alongside every cluster of sixteen Stream Processors - As well as this there is also a global scheduler present in G80, which oversees the graphics core as a whole. These schedulers combined together come under NVIDIA's 'Gigathread' banner, and as a whole make sure that thousands of threads are 'in flight' on the graphics core at any one time to keep everything well fed. In a sense, this is similar to the concept first seen in the PC space courtesy of ATI's Radeon X1000 series 'Ultra Threaded Despatch Processor', although of course the implementation is different largely due to the presence of a Unified Shader Architecture. As part of the drive for efficiency running through Gigathread, threads can be moved between Stream Processors and clusters as required without any performance penalty, as well as elsewhere in the core as needed.

Also somewhat fitting under this section is the improved branching capabilities on show in G80. In the last generation of GPUs, we saw ATI quite often touting their large dynamic branching advantage over NVIDIA's competing hardware. Although this had little effect in even the most demanding game titles (although of course you can make a chicken and egg argument here - Was dynamic branching not used because of its poor performance on GeForce 7 series boards?), it did allow ATI to make real strides forward in the GPGPU (General Purpose GPU) segment, where the horsepower of a modern graphics board can be put to more CPU-like uses, with Stanford University's Folding@Home GPU client perhaps the most notable example of GPGPU in action. Good dynamic branching performance is a must-have for such purposes, which is why ATI have reigned supreme thus far here.

However, G80 has put NVIDIA well and truly back in the race on the dynamic branching front, thanks to a mixture of dedicated branching units in hardware to take this workload away from the Stream Processors, as well as much improved granularity - In other words, the number of pixels batched together when a branch is traversed. Granularity is important because dynamic branching is all about going through different permutations (or branches) to reach the correct result - The more pixels you send down a branch at any one time, the longer those pixels potentially have to travel to reach the correct result and thus the lower your dynamic branching performance. So, the aim is to work with as small a batch of pixels as possible when branching in this way.

For reference, ATI's R520 architecture had a dynamic branching granularity of sixteen pixels, although this was raised to forty-eight in R580 - In comparison, NVIDIA's G7x architecture had a granularity of 1024 pixels, hence its far slower performance. So, how much better is granularity in G80, I hear you ask? 16 objects for vertex data and 32 pixels for pixel data is the answer. NVIDIA are now well and truly ready to start pushing GPU-based physics and GPGPU applications on their new architecture, which should be enough to tell you that they now know they have a decent branching implementation.

Nvidia's GigaThread engine, the global scheduler, intelligently ties together all these threads and pipes data around to use this wealth of processing power. We are in a world of out-of-order thread block execution and application context switching here.

I'm confused because it says it was introduced with Fermi, but this article talks about Tesla
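A rough way to see why the granularity numbers in that quote matter, under a simplified model of my own (the "boundary every 256 pixels" coherence assumption and the both-paths cost are invented, not from the article): branch outcomes tend to be spatially coherent, so small batches usually land entirely on one side of a branch, while huge batches straddle both sides and pay for both paths.

```python
# Simplified divergence model. Assumption (mine, not the article's): branch
# outcomes change only at region boundaries occurring every `region_size`
# pixels along the batching order. A batch that straddles a boundary has to
# execute both sides of the branch.

def divergent_batch_fraction(batch_size, region_size=256):
    """Rough fraction of batches that straddle a branch boundary."""
    return min(1.0, batch_size / region_size)

# Granularities quoted above: R520 = 16, G80 = 32 (pixels), R580 = 48, G7x = 1024.
for granularity in (16, 32, 48, 1024):
    frac = divergent_batch_fraction(granularity)
    print(f"granularity {granularity:4d}: ~{frac:.0%} of batches pay for both branch paths")
```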
 
I really do. It's scummy for journalists to accept all-expenses-paid trips in that way, and while I'm sure everybody swears up, down, left and right that it doesn't impact their coverage, we all know it often gives them the benefit of the doubt.

That's the only way to get embargoed hardware, so they all go. But I'm sure a lot of these guys feel like hookers doing it, and wish it wasn't necessary.

Right, but look at the alternative: If AMD/Nvidia just do press releases, they're liable to get buried by "something more important." Just look at how late your average tech website is at turning around press releases. But taking someone to a destination and shoving the product in their face for three days gets them interested permanently.

Hell, people keep reporting about your event even if they didn't get an invitation :D

From ATI to AMD back to ATI - From ATI to AMD back to ATI? A Journey in Futility

And after the press release you have to add a marketing campaign to that, because the press release route is so uncoordinated and inconsistent. So that adds cost. Might as well get the community involved and spread the gospel direct to their biggest fans (i.e. Social Media before it was Twitter).

This is why I still have hope for AMD: they know that without these events, the fanboys and critics stop caring so much. If that ends, then nobody knows you exist without a massive marketing campaign (think on the level of Intel Inside).
 
I should stay off the forums with 4.5 hours sleep in 3 days >__<

That NDA lift would time well with Ctex... that said, I'm really wondering if they're going to talk much about anything at Ctex with this sneaky Macau event so close. Or perhaps we'll just get a launch date. They seem sure on the Vega launch window, so why not Polaris... something is not adding up, or they're just trying to get Nvidia to price the 1070 high.
 
Well, nV already priced the 1070 and it's not going to change. All the naysayers about the 1080 FE prices, claiming AIB partners were going to overprice their boards, are definitely wrong: 1080 board prices are starting at $600.

I think AMD planned for a Computex launch and pushing it up just didn't make any sense for them; one month isn't going to make a big difference... I hope the 29th of June is a hard launch date...
 
Where? First run was $699 across the board (except for price gougers) from what I saw.

Not quite $600, $20 more, but it's the first one I found with a price.

Expect the ROG Strix GeForce GTX 1080 to be available starting June 4. You can choose between the hot-clocked OC version for $639.99 or a stock-clocked variant with a $619.99 MSRP. Stay tuned for more coverage of the card at PCDIY—we have exciting plans for this beautiful beast.

This is the ASUS Strix GeForce GTX 1080—Pascal on another level - ASUS PC DiY
 
Ok, so nothing at launch below $699, and nothing for the advertised $599 for the foreseeable future. Those Strix cards are par for the course in terms of a mild bump over MSRP for a custom/OC variant. It's just that with the reference boards going $100 over MSRP, we likely won't see MSRP for a while. And while I know most 1080 customers won't care (price isn't an issue for you guys), this does give me an idea of what to expect for the 1070. Suddenly, that $379 card ("only" $50 over the 970 launch price) is $449.
 
Check OcUK; they are in pounds, but there's no price gouging going on over there.
 
Ok, so nothing at launch below $699, and nothing for the advertised $599 for the foreseeable future. Those Strix cards are par for the course in terms of a mild bump over MSRP for a custom/OC variant. It's just that with the reference boards going $100 over MSRP, we likely won't see MSRP for a while. And while I know most 1080 customers won't care (price isn't an issue for you guys), this does give me an idea of what to expect for the 1070. Suddenly, that $379 card ("only" $50 over the 970 launch price) is $449.

The cards launched 12 hours ago; the custom cards will sell for under $699. Pricing has already been revealed, and if people have to wait a few weeks for the card they want, it will be fine (and less than $699, unless they want a water-cooled variant or something extreme).
 
I guess you have to wait a week to save $80, but probably get better cooling for the wait. And like I said, this was just the first one I came across, there might be others out there before 6-4, and maybe cheaper than $620.
 
Any estimates for Vega? I have R9 290s at 1200/1500 in 2-way CF, and apparently CrossFire doesn't work well with FreeSync.
 