[WCCFTECH] AMD Rolling Out New Polaris GPU Revisions With 50% Better Perf/Watt

Coordinating multiple threads would, I presume, add some latency, since dependencies between them can cause stalls. Or can that be hidden?

DX 12 starts to shine when you exceed DX 11 draw call limitations: even more objects and shaders, plus even more compute operations, with multiple CPU cores driving the GPU. At this time I do not see developers wanting to push beyond DX 11 boundaries yet. In most of the BF1 benchmarks AMD does appear to improve over DX 11 with DX 12. Does that mean it will give a better gaming experience than Nvidia? At this time it does not look like it, but it does look competitive nonetheless. Why does Nvidia not do as well in DX 12, or more exactly, why does it do worse? That I do not understand. Is it a lack of threading ability in the Nvidia GPU, meaning that for DX 12 workloads with multiple threads it will always have limitations?


Yeah, that's where I'm thinking the problem is too. It's hard to hide that latency, especially when you want your graphics and compute queues to run concurrently, because the dependencies and the things that need to be synced up take priority. I can't see this being solely a driver issue (possibly more so on nV than on AMD), and the reason I say that is that it's happening on both IHVs' cards.

The draw call limitations are only part of the benefits of DX12, albeit a major part lol, but devs really can't go past the limits the consoles put on that, because draw calls are, as you know, highly dependent on the CPU: threads and IPC. Consoles might have more threads on average than 4-core desktop CPUs, but their IPC, well lol, is quite weak. nV's utilization and dynamic load balancing are different from AMD's, so both architectures just have different needs when it comes to programming, that's all. Devs can use different draw call counts for PCs vs. consoles, but that would take quite a bit of work.
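Just to make the queue-dependency point concrete, here is a bare-bones D3D12 sketch of the kind of cross-queue fencing I mean: one direct (graphics) queue, one compute queue, and a fence so the graphics queue waits for the compute results. This is my own toy example, not from any shipping engine; error handling is stripped and it assumes Windows with d3d12.lib linked.

```cpp
// Toy example: two queues plus a fence expressing a compute -> graphics dependency.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device))))
        return 1;

    // One DIRECT (graphics) queue and one COMPUTE queue.
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    D3D12_COMMAND_QUEUE_DESC cmpDesc = {};
    cmpDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

    ComPtr<ID3D12CommandQueue> gfxQueue, computeQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
    device->CreateCommandQueue(&cmpDesc, IID_PPV_ARGS(&computeQueue));

    // A single fence shared by both queues carries the dependency.
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // ... record and execute compute command lists on computeQueue here ...

    // The compute queue signals when its work is done; the graphics queue stalls until then.
    // This GPU-side wait is exactly the kind of cross-queue dependency that can eat the win
    // from running the two queues concurrently if it happens too often per frame.
    const UINT64 computeDone = 1;
    computeQueue->Signal(fence.Get(), computeDone);
    gfxQueue->Wait(fence.Get(), computeDone);

    // ... record and execute graphics command lists that consume the compute results ...

    std::printf("queues created, compute -> graphics dependency expressed\n");
    return 0;
}
```

Every Signal/Wait pair like that is a potential bubble on one of the queues, so the more sync points per frame, the less "async" the async compute actually ends up being.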
 
lol, you are so right about consoles! That is actually kinda funny: build a GPU for the PC that can handle a massive number of threads asynchronously, but be limited to basically DX 11 workloads by your own console CPU/GPU designs. :ROFLMAO: (AMD knows how to spin its wheels rather well.)
 

It's not the first time we've seen a performance regression when the CPU is the limiting factor (in DX12 vs DX11), and even in games that should in theory benefit immensely (hello Ashes, my old friend) NV doesn't do any better in DX12 than in DX11.

Whatever is going on with threaded command submission in the Nvidia driver, it appears to be doing a better job than DICE, Eidos (Deus Ex), or Oxide.

The one exception seems to be Tomb Raider, funnily enough.
 


True. Actually, because DX12 has the capability to use more threads, the counterpoint is that DX12 as an API needs more CPU to begin with, which is a common side effect of threaded programs. Just look at Vista, the first Windows OS that properly handled multithreaded programs: it was taxing on CPUs, not just GPUs. Windows 7 remedied the GPU side, but the CPU side kinda remained the same as people started upgrading their PCs for the new OS anyway.
 

Xbox One uses DX 12.
 


We are talking about workloads and what a CPU is capable of, not the actual APIs.

AMD's Jaguar cores are weak in IPC compared to Intel's CPUs, so by using double the core count on the console CPUs they can get to roughly a 4-core Intel from a few gens ago, and that is your limit for draw calls too, because if you push more than that, console performance suffers.
 

Yeah, overhead due to multithreading is natural (communication and synchronization overhead), but you'd expect an overall improvement considering the core counts on some of these review test rigs.

If the communication overhead from multithreaded command list compilation/submission outweighs the benefit of unlocking 16 logical cores, then something is seriously wrong lol.
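As a sanity check on that, here is a tiny CPU-only toy of mine (nothing to do with any real driver or engine; ToyCommand and record_batch are made-up names) that mimics the DX12 pattern: each thread records its own command list with no locking, and only the final submit is serialized.

```cpp
// CPU-side analogue of threaded command-list recording: independent per-thread recording,
// one serial "submit" at the end. The serial parts (join + submit) are the overhead.
#include <cstdio>
#include <thread>
#include <vector>

struct ToyCommand { int draw_id; };

// Each worker fills its own private list, so no locks are needed while recording,
// which is the whole point of DX12-style per-thread command lists.
static void record_batch(int thread_id, int draws_per_thread, std::vector<ToyCommand>& out) {
    out.reserve(draws_per_thread);
    for (int i = 0; i < draws_per_thread; ++i)
        out.push_back({thread_id * draws_per_thread + i});
}

int main() {
    const int kThreads = 8;            // pretend 8 logical cores are available
    const int kDrawsPerThread = 10000; // per-thread "draw calls"

    std::vector<std::vector<ToyCommand>> lists(kThreads);
    std::vector<std::thread> workers;

    for (int t = 0; t < kThreads; ++t)
        workers.emplace_back(record_batch, t, kDrawsPerThread, std::ref(lists[t]));
    for (auto& w : workers)
        w.join();                      // serial point #1

    // Single-threaded "submission", analogous to ExecuteCommandLists on one queue.
    size_t submitted = 0;              // serial point #2
    for (const auto& list : lists)
        submitted += list.size();

    std::printf("submitted %zu commands recorded by %d threads\n", submitted, kThreads);
    return 0;
}
```

The recording part scales with cores essentially for free; only the join and the single submit are serial. If a pattern like that still loses to the single-threaded DX11 path on a 16-thread CPU, the overhead is not coming from the threading model itself.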
 
Yeah, and most likely that is also happening.
Looks like a better way to force synchronization of threads at the hardware level is needed. I am sure Nvidia has been studying the problem, and with the API now set in stone for a while it should be easier to design hardware around it to allow efficient threading (just forward looking here; my prediction is Volta will do DX 12 right).

AMD does DX 12 better than DX 11 with a lot more CPU cores or threads involved, but that does not mean that if you now double or triple the draw calls and everything that goes with them, current AMD hardware will automatically run the game at some great fps. In other words, take BF1 and double the draw calls with a huge number of objects, and all current GPUs will probably struggle and fail to give playable frame rates.

So really, I believe AMD does not have much choice but to improve Polaris or replace it. They need to improve perf/W as much as possible for it to win out over Nvidia.
 
Basically, the current crop of consoles maxed out graphically will still be within DX 11's ability to handle. When Microsoft releases their next generation, that will be a good starting point, and hopefully it will be using Vega and not Polaris.
 
Again, you have Ashes of the Singularity, which is arguably the most threaded game so far, and it runs just fine in DX12, identical to DX11 in fact. I do not think there is a hardware problem to solve at all.
 


It's not a hardware issue on nV's side; the two IHVs just need different programming techniques, that's all.

We can see it in GoW4.

It's the same as before: different techniques work better for each.
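For what it's worth, the unglamorous starting point of "different techniques per IHV" is usually just this: look at the adapter's PCI vendor ID and branch. The DXGI calls and vendor IDs below are real; the enum and the example "paths" are placeholders I made up for illustration (assumes Windows with dxgi.lib linked).

```cpp
// Sketch: detect the primary adapter's vendor and pick a tuning path.
#include <windows.h>
#include <dxgi.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

enum class GpuVendor { Amd, Nvidia, Intel, Other };

static GpuVendor detect_primary_gpu_vendor() {
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return GpuVendor::Other;

    ComPtr<IDXGIAdapter1> adapter;
    if (factory->EnumAdapters1(0, &adapter) == DXGI_ERROR_NOT_FOUND)  // adapter 0 = primary
        return GpuVendor::Other;

    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);
    switch (desc.VendorId) {           // PCI vendor IDs
        case 0x1002: return GpuVendor::Amd;
        case 0x10DE: return GpuVendor::Nvidia;
        case 0x8086: return GpuVendor::Intel;
        default:     return GpuVendor::Other;
    }
}

int main() {
    // A real engine would use this to decide things like how aggressively to split work
    // across queues, how to batch barriers, which shader variants to load, and so on.
    switch (detect_primary_gpu_vendor()) {
        case GpuVendor::Amd:    std::puts("AMD path (e.g. lean harder on async compute)"); break;
        case GpuVendor::Nvidia: std::puts("NVIDIA path (e.g. keep work on one queue)");    break;
        default:                std::puts("generic path");                                 break;
    }
    return 0;
}
```

From there, the actual "technique" part is the per-vendor tuning being argued about in this thread; the detection bit is the trivial part.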
 

I dunno, but I suspect game development is too complex for devs to make coding to the metal work anymore; besides, I really doubt that is a high-priority area for them either. This isn't the pre-PS2 days anymore.
 


Coding close to the metal just isn't really done anymore. Even with DX12 or LLAPIs, it's not as close to the metal as, let's say, Quake 3 was back in the day. That knowledge and experience has been kinda lost on most programmers. With games being more complex, it's not that programmers aren't capable of doing it; it's the time to do it that is more of an issue. When we had engines that were 100k lines of code vs. now, where you are talking millions of lines of code, close-to-the-metal optimizations are going to take quite a bit longer, especially since the programmers have to learn the intricacies of new APIs and hardware too. With big money behind these games and pretty much the same development timelines, 3 to 5 years for a game, well, something has to give, and that is exactly what we are seeing.
 

Whoever coined the term LL-API in reference to DX12 and Vulkan should be shot, along with whoever first mystified async compute. It took like six months to get over that hurdle.
 

I cannot say I agree with that, considering AMD was able to get bare-metal gaming with Mantle working well (at least until they decided to drop support for it). DX 12 will become the de facto standard, but that will simply take time, and Nvidia does not have any hardware really ready for it either.
 

All Nvidia hardware since Kepler is really ready for it.

Like, really *really* ready.
 
AMD hardware does have features that support asynchronous threading better, but if you start dumping a massive number of threads onto the GPU and it does not have the performance to handle them, the point becomes moot.

You balance the architecture for the kind of loading. AMD does do DX 12 better in general, but it does not necessarily give massive gains over Nvidia. The largest impact I've seen has been with Doom on Vulkan using shader intrinsics for AMD hardware; that makes my Nano basically equal to my 1070 in Doom. Yeah, pretty cool, but someone with just a 1070 would not be left behind, and the extra 4 GB of RAM allows the 1070 to use all the Nightmare settings, which I cannot do on my Nano.

Nvidia does DX 12, but in general has poorer performance than in DX 11, which I hope will improve as developers understand the hardware better, with better drivers and with updates to DX 12. Plus, Nvidia helping developers out with good tools and libraries, which I expect they will do, will also help.
 
MDolenc said:
I don't think you appreciate just what's going on.

A 1.6 GHz-ish 8-core APU in a console vs. a 3 GHz-ish 4-core CPU in a PC is a whole different ballpark, for starters.

There simply are no games out there that push into the realm where DX12 truly shines, and there are very good reasons for that! Developers have been dealing with the "batch, batch, batch" philosophy for literally decades. They are used to it, and it's not like anyone is going to throw away the toolchain behind it just because they can (for no benefit in DX12 and huge cost in DX11). You can push seriously in this direction, but then no sane developer will even make a DX11 version, because it simply won't be usable. Say the DX12 version runs at 60 FPS and the DX11 version runs at 20 FPS. That's entirely doable, but what would be the point in spending resources on a DX11 port of such a game? To make a few forum enthusiasts happy? Developers already know this; there are tech demos for it. A good example of this direction is Ashes of the Singularity. How playable is it in DX11? And you can check its development from way back when DX12 was not even on the drawing board. It's still not close to the limits of what DX12 allows.

When talking about this, you also need to consider what we are supposed to use all those draw calls for. A 10,000-player first-person shooter? Yeah, graphically I think we could handle that; everything else will crumble. Space Invaders with 500,000 independently moving, uniquely shaped, uniquely textured asteroids on screen at once? You could probably still do it with some effort on DX11.

If you're reading what developers are talking about around here, I'd say it's pretty obvious that's not the direction that's going to be taken. There's a lot more talk about graphics pipelines where the GPU basically feeds itself, or advanced on-GPU culling techniques, for example. These approaches are again something that fits DX12 much, much better, and again something we probably are not going to see back-ported to DX11 just so that we could compare the two APIs and be amazed by the awesome speedup.

I'd also like to point out that calling out immature drivers is a seriously slippery slope here... There's sort of a pinky promise that DX12 drivers should be light! They should not do a whole bunch of background analysis about what the application is trying to do and then rearrange stuff to better fit the hardware, like they do in DX11, which then causes headaches for game developers when the pipeline stalls out of the blue. It's not guaranteed to stay this way.

Basically, if DX11 runs better than DX12, the driver is doing a better job than the game developers, which, seven years after DX11 was publicly released, I'm freaking amazed anyone finds surprising. If it's the other way around, the game developers are doing a better job than the drivers. There are no magic "DX12 instructions". There is async compute, which can help if the hardware can make use of it, but it's in no way magic.
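On the asteroid example above: the "some effort on DX11" is mostly instancing, i.e. not spending a draw call per object in the first place. A rough D3D11 fragment of the idea is below; it assumes the pipeline, instance buffer and ctx are already set up, and it ignores the "uniquely shaped" part, which would need merged buffers or indirect draws on top.

```cpp
// Fragment only (not a full sample): one draw per asteroid vs. one instanced draw.
#include <windows.h>
#include <d3d11.h>

void draw_asteroids_naive(ID3D11DeviceContext* ctx, UINT indexCount, UINT asteroidCount) {
    // One draw call per asteroid: CPU submission cost scales with asteroidCount,
    // which is where the DX11 draw-call ceiling actually bites.
    for (UINT i = 0; i < asteroidCount; ++i)
        ctx->DrawIndexed(indexCount, 0, 0);  // per-asteroid constants would be rebound each iteration
}

void draw_asteroids_instanced(ID3D11DeviceContext* ctx, UINT indexCount, UINT asteroidCount) {
    // One call for the whole field; per-asteroid transform / texture index comes from
    // an instance buffer the vertex shader reads via SV_InstanceID.
    ctx->DrawIndexedInstanced(indexCount, asteroidCount, 0, 0, 0);
}
```

That is the "batch, batch, batch" habit being described: DX12's higher draw-call ceiling only starts to matter once you have run out of tricks like this one.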
 

The problem with Nvidia's present-day hardware is that it is really nothing more than a die-shrunk and overclocked version of essentially the same architecture; nothing has really changed. Therefore, DX 12 is not going to work well on Nvidia cards at present, since they made their hardware work more specifically with DX 11 and their GameWorks software.
 

You can pretty much say the same about GCN 1.0 to GCN 1.3. Maxwell to Pascal may even have changed more than GCN 1.0 to GCN 1.3 ever did.

But that doesn't make your claim about DX11 and DX12 right.
 


Look, man, you don't know the differences between Maxwell and Pascal; making statements like that just shows it. How many discussions about Pascal, Maxwell, and GCN have you read here by me and others? Did any of that sink in? What I see is a person with over 6,000 posts who doesn't understand the basic fundamentals of what these chips are about. And you are an enthusiast? I expect enthusiasts to at least pick up a few tidbits of knowledge on their own, let alone from reading someone else's posts.

People who post like you are not enthusiasts; they are just here for the marketing and fanfare. It's like the difference between a person who can afford a $100k+ car and a person who wants one but can never have one because they don't have the discipline to earn enough money for it. The person who can get that type of vehicle knows everything about it, more than the salesman. The guy who doesn't have the money just looks at it and wishes he could get one because it looks cool.

An enthusiast should be a person who wants to know everything about a product, inside and out, not a person who reads marketing crap and talks about that and only that. Just because someone said something doesn't make it true. Yeah, the whole "Pascal is the same as Maxwell" thing comes from one point, which I will talk about later in this post.

If you don't care to understand things that are so basic, why even bother posting about them? Are you an enthusiast or one of the sheep?

GCN 1.2 to 1.3 is much closer in features than Maxwell is to Pascal. You do know that, right?

Pascal's shader array is very similar to Maxwell's, but that is it; everything else has changes that address features Maxwell had issues with, and then some. The whole "Pascal = Maxwell" thing came about because of the shader array: there were slides from nV for a Japanese tech conference where nV stated this, close to 6 months before Pascal's release. Everyone knew the shader array wasn't going to change much at that point, but the shader array isn't the whole chip!

This is probably also why AMD might have relaxed in their speculation about what Pascal might be: like you, they thought it was just Maxwell on 16nm. Well, that didn't happen, and because of AMD's inability to really fix their power issues so far, they are getting hurt by it. They are further behind with Pascal now than they were with Maxwell; things like that don't happen unless nV made changes going from Maxwell to Pascal!
 
So why has Nvidia often lost ground in DX12 then? Game Ready!
Will Nvidia still be stuck on DX11 in 10 years' time?
 


If DX12 paths don't even come up to what nV's DX11 paths can do, what does that tell us lol. Keep in mind that DX11 submission is essentially limited to one CPU core. Something is terribly wrong, because the easy, low-hanging fruit of getting more cores involved isn't even helping... That should be a boost on every piece of hardware out there, even Intel IGPs.

And come to think of it, the only games where nV has ever lost ground in DX12 are Gaming Evolved titles, and when they are already losing in DX11 in many of those titles, well, it's easy to see the possibility. It's a Gaming Evolved title, not a biggie; AMD is up to the same tricks nV used in the past with GameWorks. This is nothing new.

People will not accept that both of these companies do the same things; no company is better than another, it's just business and how you get ahead. AMD should have done more of it sooner, but for whatever reason, most likely money wasn't put into dev rel's hands to support it.

Keep this in mind: BF1 developer DICE was one of the first studios to come forward and push LLAPIs (and they have some of the most experienced developers when it comes to LLAPIs), yet we see horrible results from their DX12 path in frame times. BF1 should not be a port either, from my understanding. The Frostbite engine, as I stated before, doesn't compete with the other major engines on the market because it's not as easy to work with; there are features that quite frankly make the engine hard to work with.
 

They still need to build a new GPU architecture that is fully compatible with DX 12. What they have now is not it, but then again, they also cannot use the GameWorks stuff to get there, which makes for a more level playing field.
 


Oh, you would be surprised: they can still increase tessellation and it will still hurt AMD's architecture, even Polaris. As we know, Polaris only got to GTX 960 levels of polygon throughput, which is still a gen behind (more than a gen behind, because P10 is the one I'm talking about that reaches 960 levels of polygon throughput, and it is supposed to be a card that performs like a 970 or a bit above).

There is no such thing as a lack of DX12 compatibility on anything from Kepler onward; Fermi shouldn't even have a problem, actually. So again, are you an enthusiast or one of the sheep? These concepts are basic to understand.

Can you give me reasons why you think they aren't "fully compatible"? I want explanations for what you write, in depth, with concepts that actually make sense; go from a programmer's concepts (high level, so you don't need to know programming to do this) and simplify it so a layman can understand. You can go through the many posts about LLAPIs and DX12 concerning GCN, Maxwell, and Pascal if you like, and paraphrase if you don't want to do the legwork. And please, not just links to others' posts; I would like it in your own words with concepts you think fit what you are trying to explain. Links are welcome as supporting evidence, but they must be from reputable people in the industry and/or programmers who can easily be recognized. That way there will be no misunderstanding of what you are talking about.
 
So are we all casually just saying GameWorks was nothing but Nvidia bullying the market? I remember when everyone was trying to deny it had an impact. Oh, how times have changed.
 

I don't think you know what you're talking about, again. It is fully compatible with DX12, and DX12.1 for that matter. If you think Maxwell and Pascal, for example, aren't fully DX12 compatible, then Polaris and previous versions aren't either.
 


Hey, if someone has an advantage they should take that advantage. I have always stated AMD should get off their ass and do the same things nV did. That is business; you are in it to win it, and if you can't see and take advantage of opportunities you shouldn't be in business in the first place.

GameWorks was never used to "bully" the market; the devs decided to use it, and it could be turned off by end users. That is all that matters. It's not like anyone twisted the devs' arms to use it...

Ironically, AMD has gone one step further than nV, where you can't turn off or change paths in these APIs, so go figure, they learned! Good for them.
 