Looking forward to the comparison between the 8370 and similarly priced Intel chips. Looks like the Skylake equivalent price-wise is the i5 6500. The 6600K looks to be about $50.00 more.
 
Looking forward to the comparison between the 8370 and similarly priced Intel chips. Looks like the Skylake equivalent price-wise is the i5 6500. The 6600K looks to be about $50.00 more.

I would actually like to see how it performs across the last 3 generations of Intel chips ;).
 
What I mean is: test any dual-GPU setup with multi-adapter and compare CPU-bound scenarios, and also, with a single GPU, compare whether asynchronous shaders have some benefit in CPU-bound scenarios vs. GPU-bound ones.
As everyone said. Read the article.
Article said:
It would seem to me that while AotS is being held up as the go-to Async Compute game, it is far from it. However, going back to what the developer said, it would seem that Asynch Compute advantages might be much better realized under CrossFire and possibly SLI on the PC. We will test that in the future hopefully.
...This testing has been somewhat harrowing for me over the last couple of weeks. I have spent literally a week dealing with issues with the AotS benchmark. After I had put this article together, I felt as though it would benefit from some added GPUs. Going back and building on my benchmarks, the AotS benchmark started crashing. New driver loads, new OS images, new OS installs, new game installs would NOT fix my issues. I changed all hardware and still had issues. When I moved the system over to an Intel CPU based system, all my issues went away. I do not have an explanation for this, I am just explaining that I wanted to have more AMD CPU based data on this, but it ended up being impossible for me. I will have a follow up article using a Haswell-E system with more GPUs as well.

Since you probably won't read that quote.... This is an AMD CPU only article. He wanted to do it, but couldn't due to technical issues that didn't get resolved. These issues didn't happen with the Intel setup that's being tested for a follow-up article. The information you want is coming soon.
 
This is an idle curiosity question triggered by the article, but not one I expect to have been covered in the article:

Is there any intelligence in the DX12 API that recognizes whether or not a DX12-compliant GPU is present?
In other words, does DX12 use different sub-routines/program calls when it comes to the CPU-related programming if it sees a DX12 GPU, or do features like async compute rely on program calls alone and ignore any GPU-related information?

As I am pretty ignorant in the world of programming, please excuse me if my terminology is not accurate.
 
This is an idle curiosity question triggered by the article, but not one I expect to have been covered in the article:

Is there any intelligence in the DX12 API that recognizes whether or not a DX12-compliant GPU is present?
In other words, does DX12 use different sub-routines/program calls when it comes to the CPU-related programming if it sees a DX12 GPU, or do features like async compute rely on program calls alone and ignore any GPU-related information?

As I am pretty ignorant in the world of programming, please excuse me if my terminology is not accurate.
If I remember my DX programming correctly, the game needs to be programmed to follow a separate target rendering path based on the video card's capabilities, but this was with DX11 and earlier. DirectX has "get" functions that will pull a standardized description of the hardware capabilities of the present video card, and then you would branch into the appropriate rendering path, or flag certain methods to be used or not. I have not delved into DX12 to see how it works in this regard.
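To make that concrete, here's a minimal sketch of what such a capability check looks like against D3D12, assuming the Windows SDK headers and d3d12.lib; the particular options queried are just illustrative of the mechanism, not anything the article itself exercised:

// Hedged sketch: detecting a DX12-capable GPU and querying its optional features.
// Assumes the Windows 10 SDK (d3d12.h) and linking against d3d12.lib.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main()
{
    ComPtr<ID3D12Device> device;

    // Try to create a device at the minimum feature level the engine requires.
    // Failure here is how an engine would know to fall back to a DX11 path.
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device))))
    {
        std::printf("No DX12-capable GPU found; fall back to a DX11 render path.\n");
        return 0;
    }

    // Ask the driver which optional DX12 features this GPU actually exposes.
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS,
                                              &options, sizeof(options))))
    {
        std::printf("Resource binding tier: %d\n", (int)options.ResourceBindingTier);
        std::printf("Tiled resources tier:  %d\n", (int)options.TiledResourcesTier);
    }

    // A real engine would branch into different rendering paths (or toggle
    // individual techniques) based on results like these.
    return 0;
}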
 
I like how a feature that gives us a boost in FPS at no expense in quality is looked at as trivial...
 
Really liked the article and I agree, it has potential; now the question is whether developers will be able to extract the performance from it.

The part where you explain that async isn't the main performance generator for AMD is quite interesting; it makes me wonder whether AMD wouldn't have shown us such strong performance on DX11 with a better driver development budget than they had, but well, hindsight is 20/20.

So AMD had no budget on DX11, but does on DX12. And Nvidia had a huge budget on DX11, but does not on DX12?
Give your head a shake would ya.
How about AMD hardware at this time is better suited to DX12, and Nvidia's better suited to DX11.
Make sense?
 
So AMD had no budget on DX11, but does on DX12. And Nvidia had a huge budget on DX11, but does not on DX12?
Give your head a shake would ya.
How about AMD hardware at this time is better suited to DX12, and Nvidia's better suited to DX11.
Make sense?

It is simple: Nvidia might not want to (or be able to) do anything about the current situation, seeing that most games are still DX11. DX12, however, is not the same as DX11: with DX11 you can get a huge difference if you optimize drivers, while under DX12 the game developers are doing the optimizations.

That allows bugs to be fixed by the developers instead of having to wait for a new driver that interprets whichever developer's code better...
 
So AMD had no budget on DX11, but does on DX12. And Nvidia had a huge budget on DX11, but does not on DX12?
Give your head a shake would ya.
How about AMD hardware at this time is better suited to DX12, and Nvidia's better suited to DX11.
Make sense?

Don't be so aggressive; I am pro-AMD, but things are exactly like that.

DX11 had lots of baggage, since it had to support all the older versions of the API; that baggage makes for a code hell that is almost impossible to clean up without massive resources.

DX12's clean slate, based on AMD's Mantle, lets them do what wasn't economically viable on DX11, simple as that.

I am not saying anything against their hardware, but it does show that now they can finally use it efficiently; it was widely known that in DX11, under CPU constraints, they weren't even close to Nvidia. I didn't like it, since I have been using AMD for quite a bit, but that is how it was... This also explains why new driver iterations kept improving AMD performance well beyond what new drivers usually did for team green.
 
It is simple: Nvidia might not want to (or be able to) do anything about the current situation, seeing that most games are still DX11. DX12, however, is not the same as DX11: with DX11 you can get a huge difference if you optimize drivers, while under DX12 the game developers are doing the optimizations.

That allows bugs to be fixed by the developers instead of having to wait for a new driver that interprets whichever developer's code better...


Yeah, and the repeated posts in the past that nV pays devs to code for their cards align with this sentiment? Flipping things to fit your needs when you post doesn't make an argument correct.
 
Yeah, and the repeated posts in the past that nV pays devs to code for their cards align with this sentiment? Flipping things to fit your needs when you post doesn't make an argument correct.

You mean Nvidia can't pay developers to put in code that will struggle on AMD cards?
 
Since AoTS is part of the article, I'll post this here. It can be moved to its own thread if needed.

I think this is an absolutely great set of tweets from Dan Baker on the subject of drivers; I'll post a couple of them here:

Dan Baker – @dankbaker
I'd like to go on the record that if you need a new driver for launch day, the driver and API are busted. #SpecialDriverNotNeeded

And

Dan Baker
Interesting, I have never shipped a game w/o driver workarounds

Rest of the big tweet chain:

Twitter
 
The whole problem with Dan's statement is that nothing ever just works on all hardware. How many times have you coded things that are equally optimized for AMD and Intel CPUs? I can tell you from my experience: NEVER!

Things don't always just work, because the architectures are different. Pieter3dnow can shove shit one way or another to make his argument about who pays whom and what is being done, but that is all it is: paint over two pigs' faces. Both AMD and nV do the same shit over and over; to ignore that fact is just being blind.

And Dan Baker has a very skewed view of things, if he thinks that. Yeah, I agree that if things aren't programmed right, that is the only time there are issues with the API and driver intervention has to fix them. Again, that is on the developer's side...

And this article clearly shows a huge driver overhead for nV in DX12 AotS, and a minimal performance increase from async shaders. That driver overhead seems to be there in DX11 too for nV; what does that tell us? Something is screwed up with the program, as traditionally nV's DX11 path shouldn't have that much driver overhead.
 
First, since we are looking at a semi-close-to-metal API (DX12), driver involvement is far smaller, as the game is communicating directly with the hardware.

Second, if it works/runs, that means it was coded for that hardware, but that doesn't necessarily speak to how efficiently or optimally.

Third, that Nvidia driver overhead in DX12 is likely the driver code working to make async function; it is not necessarily an optimization issue, nor does it mean driver involvement is as necessary or as important as it is in DX11.
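For what it's worth, the "driver coding to make async work" point maps onto how DX12 exposes work submission: the application creates its own command queues, including a dedicated compute queue, and whether work on that queue actually overlaps with graphics is up to the GPU and driver. A minimal sketch of that setup, assuming a valid ID3D12Device and the Windows SDK (error handling omitted), just to show the mechanism being argued about:

// Hedged sketch: creating a graphics queue and a separate compute queue in D3D12.
// Work submitted to the compute queue *may* run concurrently with graphics
// ("async shaders") if the hardware/driver can schedule it that way.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    // Direct (graphics) queue: accepts draw, copy, and compute work -- the
    // traditional single-queue path.
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&graphicsQueue));

    // Dedicated compute queue: the application schedules work here itself;
    // the driver's job is mostly to map it onto whatever the GPU can overlap.
    D3D12_COMMAND_QUEUE_DESC compDesc = {};
    compDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&compDesc, IID_PPV_ARGS(&computeQueue));
}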
 
As for your second sentence, that is shown with this game: it shows driver overhead in DX12 when async isn't active on nV drivers. It's not optimally programmed for nV.

But your third sentence is not true, because when async shaders aren't active we still see the driver (CPU) overhead in both the DX11 and DX12 paths.
 
As for your second sentence, that is shown with this game: it shows driver overhead in DX12 when async isn't active on nV drivers. It's not optimally programmed for nV.
But your third sentence is not true, because when async shaders aren't active we still see the driver (CPU) overhead in both the DX11 and DX12 paths.

Is Nvidia still working on programming the async driver for the 900 series? Nvidia was quick to ask the devs not to allow async when they released the demo, weren't they, or do you need a link for that?
I'm somewhat baffled by the time it's taking Nvidia to get it done...
 
First, since we are looking at a semi-close-to-metal API (DX12), driver involvement is far smaller, as the game is communicating directly with the hardware.

Second, if it works/runs, that means it was coded for that hardware, but that doesn't necessarily speak to how efficiently or optimally.

Third, that Nvidia driver overhead in DX12 is likely the driver code working to make async function; it is not necessarily an optimization issue, nor does it mean driver involvement is as necessary or as important as it is in DX11.

Sounds like Nvidia just needs to write a better driver. I bet that they have a better driver in the works and it will get released exclusively with their new video cards. The current cards match up to the AMD offerings closely enough, so it doesn't matter if they get a boost or not. To be exact, it would be better for sales if the new cards exclusively received the performance boost.
 
Is Nvidia still working on programming the async driver for the 900 series? Nvidia was quick to ask the devs not to allow async when they released the demo, weren't they, or do you need a link for that?
I'm somewhat baffled by the time it's taking Nvidia to get it done...

And you are an expert at driver development? I'm not, and I'm not going to guess at the difficulty of writing a good driver. But I will say we haven't seen that type of driver overhead in any other DX12 game (without async, so we can leave async out of this conversation), and definitely not in any DX11 game thus far. AotS is the only game that exhibits this phenomenon.
 
As for your second sentence, that is shown with this game: it shows driver overhead in DX12 when async isn't active on nV drivers. It's not optimally programmed for nV.

But your third sentence is not true, because when async shaders aren't active we still see the driver (CPU) overhead in both the DX11 and DX12 paths.
Then it is more likely proof of architectural limitations, serial as opposed to parallel.
 
What? This is without async; what are you smoking? We aren't talking about that. What do serial and parallel have to do with CPU overhead?

Let's go back to graphics 101 on parallel path execution, shall we?
 
I'm really interested to see the X99 results, especially at lower clocks. I just put together an E5-2670 system, and DX12 could really give it a longer life than it ever should have had. Shame you have to get on W10 for it, though...
 
Fewer cores will erase the advantage of DX12, at least the main advantage reported by whoever in that FPS comparison, 42 to 60, a 43% jump.

The one thing that scenario might point out, if tested on a Fury-X since those are scaling up with DX12, is how much of the performance is coming from the extra cores, and how much of the performance is coming from elsewhere in DX12.

That actually could be interesting.
 
Thanks for linking, that was interesting.
Also worth noting are Andrew Lauritzen's responses; he is heavily involved on the graphics side at Intel and has great knowledge of DX12.
Cheers
 
I'm really interested to see the X99 results, especially at lower clocks. I just put together an E5-2670 system, and DX12 could really give it a longer life than it ever should have had. Shame you have to get on W10 for it, though...

Win 10 is really cheap though. You can upgrade from Win 7 or 8 for free, or buy a copy cheap from Kinguin. My Canadian buddies, who won't change socks because they hate change so much, just got new Win 10 PCs and love them and the OS.
 
Thanks for linking, that was interesting.
Also worth noting are Andrew Lauritzen's responses; he is heavily involved on the graphics side at Intel and has great knowledge of DX12.
Cheers

Yes, there's actually a lot more to that thread than what I mentioned, but I figured only those really interested would bother to read all of it and understand what is being talked about ;)
 
I like how a feature that gives us a boost in FPS at no expense in quality is looked at as trivial...

You do make a valid point. One thing to also take into account is that the small FPS boost you are seeing may get better, but for now, with only a few DX12 titles, it's difficult to tell; it will probably change. It's going to be a popcorn fest the day we start to see some real benchmarks on these new cards!

Great article, btw; lots of things I wondered about were answered.
 
Got around to throwing my FX-4100 back into my system, so here's a comparison between an FX-4100 @ 4560 and an FX-8120 @ 4560.
Note: due to the glitching with the built-in AA, I have it disabled and 2x MSAA enabled in the Crimson settings for all tests.

FX-4100 DX11 1440p high
== Hardware Configuration ================================================
GPU 0: AMD Radeon R9 200 Series 280x @ 1175/1750
CPU: AuthenticAMD
AMD FX(tm)-4100 Quad-Core Processor @ 4560
Physical Cores: 2
Logical Cores: 4
Physical Memory: 16300 MB
Allocatable Memory: 134217727 MB
==========================================================================


== Configuration =========================================================
API: DirectX
==========================================================================
Quality Preset: Custom
==========================================================================

Resolution: 2560x1440
Fullscreen: True
Bloom Quality: High
PointLight Quality: High
Glare Quality: Low
Shading Samples: 8 million
Terrain Shading Samples: 8 million
Shadow Quality: Mid
Temporal AA Duration: 0
Temporal AA Time Slice: 0
Multisample Anti-Aliasing: 1x note: 2x enabled in crimson settings
Texture Rank : 1


== Total Avg Results =================================================
Total Time: 60.003075 ms per frame
Avg Framerate: 23.357775 FPS (42.812298 ms)
Weighted Framerate: 22.989410 FPS (43.498287 ms)
Average Batches per frame: 12919.730469 Batches
==========================================================================


== Results ===============================================================
BenchMark 0
TestType: Full System Test
== Sub Mark Normal Batch =================================================
Total Time: 70.875992 ms per frame
Avg Framerate: 36.020657 FPS (27.761847 ms)
Weighted Framerate: 35.321518 FPS (28.311354 ms)
Average Batches per frame: 4511.345215 Batches
== Sub Mark Medium Batch =================================================
Total Time: 55.932816 ms per frame
Avg Framerate: 26.513950 FPS (37.715992 ms)
Weighted Framerate: 26.056526 FPS (38.378101 ms)
Average Batches per frame: 9463.808594 Batches
== Sub Mark Heavy Batch =================================================
Total Time: 53.200409 ms per frame
Avg Framerate: 15.883338 FPS (62.959061 ms)
Weighted Framerate: 15.672650 FPS (63.805416 ms)
Average Batches per frame: 24784.037109 Batches
=========================================================================

FX-4100 DX12 1440p high
== Hardware Configuration ================================================
GPU 0: AMD Radeon R9 200 Series 280x @ 1175/1750
CPU: AuthenticAMD
AMD FX(tm)-4100 Quad-Core Processor @ 4560
Physical Cores: 2
Logical Cores: 4
Physical Memory: 16300 MB
Allocatable Memory: 134217727 MB
==========================================================================


== Configuration =========================================================
API: DirectX 12
==========================================================================
Quality Preset: Custom
==========================================================================

Resolution: 2560x1440
Fullscreen: True
Bloom Quality: High
PointLight Quality: High
Glare Quality: Low
Shading Samples: 8 million
Terrain Shading Samples: 8 million
Shadow Quality: Mid
Temporal AA Duration: 0
Temporal AA Time Slice: 0
Multisample Anti-Aliasing: 1x note: 2x enabled in crimson settings
Texture Rank : 1


== Total Avg Results =================================================
Total Time: 60.008785 ms per frame
Avg Framerate: 37.756954 FPS (26.485186 ms)
Weighted Framerate: 37.429989 FPS (26.716545 ms)
CPU frame rate (estimated if not GPU bound): 39.071167 FPS (25.594322 ms)
Percent GPU Bound: 18.742697 %
Driver throughput (Batches per ms): 4707.592773 Batches
Average Batches per frame: 13673.628906 Batches
==========================================================================


== Results ===============================================================
BenchMark 0
TestType: Full System Test
== Sub Mark Normal Batch =================================================
Total Time: 70.924339 ms per frame
Avg Framerate: 41.086037 FPS (24.339169 ms)
Weighted Framerate: 40.758305 FPS (24.534878 ms)
CPU frame rate (estimated if not GPU bound): 42.475876 FPS (23.542774 ms)
Percent GPU Bound: 13.546737 %
Driver throughput (Batches per ms): 3250.619385 Batches
Average Batches per frame: 4897.917480 Batches
== Sub Mark Medium Batch =================================================
Total Time: 56.031715 ms per frame
Avg Framerate: 38.620983 FPS (25.892660 ms)
Weighted Framerate: 38.299553 FPS (26.109966 ms)
CPU frame rate (estimated if not GPU bound): 39.926579 FPS (25.045973 ms)
Percent GPU Bound: 19.844965 %
Driver throughput (Batches per ms): 4441.711914 Batches
Average Batches per frame: 9828.972656 Batches
== Sub Mark Heavy Batch =================================================
Total Time: 53.070293 ms per frame
Avg Framerate: 34.218769 FPS (29.223728 ms)
Weighted Framerate: 33.892799 FPS (29.504793 ms)
CPU frame rate (estimated if not GPU bound): 35.468262 FPS (28.194221 ms)
Percent GPU Bound: 22.836386 %
Driver throughput (Batches per ms): 5608.524902 Batches
Average Batches per frame: 26293.996094 Batches
=========================================================================

FX-8120 DX11 1440p high
== Hardware Configuration ================================================
GPU 0: AMD Radeon R9 200 Series 280x @ 1175/1750
CPU: AuthenticAMD
AMD FX(tm)-8120 Eight-Core Processor @ 4560
Physical Cores: 4
Logical Cores: 8
Physical Memory: 16300 MB
Allocatable Memory: 134217727 MB
==========================================================================


== Configuration =========================================================
API: DirectX
==========================================================================
Quality Preset: Custom
==========================================================================

Resolution: 2560x1440
Fullscreen: True
Bloom Quality: High
PointLight Quality: High
Glare Quality: Low
Shading Samples: 8 million
Terrain Shading Samples: 8 million
Shadow Quality: Mid
Temporal AA Duration: 0
Temporal AA Time Slice: 0
Multisample Anti-Aliasing: 1x 2x enabled in crimson settings
Texture Rank : 1


== Total Avg Results =================================================
Total Time: 60.010567 ms per frame
Avg Framerate: 26.606812 FPS (37.584362 ms)
Weighted Framerate: 26.064182 FPS (38.366829 ms)
Average Batches per frame: 12753.724609 Batches
==========================================================================


== Results ===============================================================
BenchMark 0
TestType: Full System Test
== Sub Mark Normal Batch =================================================
Total Time: 70.858635 ms per frame
Avg Framerate: 44.638176 FPS (22.402349 ms)
Weighted Framerate: 43.311378 FPS (23.088621 ms)
Average Batches per frame: 4438.385742 Batches
== Sub Mark Medium Batch =================================================
Total Time: 56.008324 ms per frame
Avg Framerate: 30.709721 FPS (32.562977 ms)
Weighted Framerate: 30.103165 FPS (33.219101 ms)
Average Batches per frame: 9447.793945 Batches
== Sub Mark Heavy Batch =================================================
Total Time: 53.164730 ms per frame
Avg Framerate: 17.304707 FPS (57.787746 ms)
Weighted Framerate: 17.008896 FPS (58.792763 ms)
Average Batches per frame: 24374.990234 Batches
=========================================================================

FX-8120 DX12 1440p high
== Hardware Configuration ================================================
GPU 0: AMD Radeon R9 200 Series 280x @ 1175/1750
CPU: AuthenticAMD
AMD FX(tm)-8120 Eight-Core Processor @ 4560
Physical Cores: 4
Logical Cores: 8
Physical Memory: 16300 MB
Allocatable Memory: 134217727 MB
==========================================================================


== Configuration =========================================================
API: DirectX 12
==========================================================================
Quality Preset: Custom
==========================================================================

Resolution: 2560x1440
Fullscreen: True
Bloom Quality: High
PointLight Quality: High
Glare Quality: Low
Shading Samples: 8 million
Terrain Shading Samples: 8 million
Shadow Quality: Mid
Temporal AA Duration: 0
Temporal AA Time Slice: 0
Multisample Anti-Aliasing: 1x 2x enabled in crimson settings
Texture Rank : 1


== Total Avg Results =================================================
Total Time: 60.007462 ms per frame
Avg Framerate: 41.143223 FPS (24.305340 ms)
Weighted Framerate: 40.456188 FPS (24.718096 ms)
CPU frame rate (estimated if not GPU bound): 55.626465 FPS (17.977055 ms)
Percent GPU Bound: 88.015099 %
Driver throughput (Batches per ms): 4276.850586 Batches
Average Batches per frame: 13625.298828 Batches
==========================================================================


== Results ===============================================================
BenchMark 0
TestType: Full System Test
== Sub Mark Normal Batch =================================================
Total Time: 70.998451 ms per frame
Avg Framerate: 48.747543 FPS (20.513855 ms)
Weighted Framerate: 47.691422 FPS (20.968132 ms)
CPU frame rate (estimated if not GPU bound): 62.852825 FPS (15.910184 ms)
Percent GPU Bound: 75.486008 %
Driver throughput (Batches per ms): 3377.241455 Batches
Average Batches per frame: 4823.318359 Batches
== Sub Mark Medium Batch =================================================
Total Time: 55.999081 ms per frame
Avg Framerate: 42.714985 FPS (23.410988 ms)
Weighted Framerate: 41.889503 FPS (23.872330 ms)
CPU frame rate (estimated if not GPU bound): 59.831779 FPS (16.713526 ms)
Percent GPU Bound: 90.587875 %
Driver throughput (Batches per ms): 4247.091797 Batches
Average Batches per frame: 9874.906250 Batches
== Sub Mark Heavy Batch =================================================
Total Time: 53.024853 ms per frame
Avg Framerate: 34.493259 FPS (28.991171 ms)
Weighted Framerate: 34.113594 FPS (29.313828 ms)
CPU frame rate (estimated if not GPU bound): 46.931927 FPS (21.307457 ms)
Percent GPU Bound: 97.971405 %
Driver throughput (Batches per ms): 4732.761719 Batches
Average Batches per frame: 26177.669922 Batches
=========================================================================

It'll be sweet if this type of improvement can be had in other DX12 games!
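For anyone skimming the dumps, here's a quick back-of-the-envelope check of the Total Avg numbers above (values copied straight from the output; nothing measured beyond what is already posted):

// Sanity check of the DX11 -> DX12 uplift from the "Total Avg Results" blocks above.
#include <cstdio>

int main()
{
    const double fx4100_dx11 = 23.357775, fx4100_dx12 = 37.756954;  // avg FPS
    const double fx8120_dx11 = 26.606812, fx8120_dx12 = 41.143223;  // avg FPS

    std::printf("FX-4100: %.1f%% faster under DX12\n",
                (fx4100_dx12 / fx4100_dx11 - 1.0) * 100.0);  // ~61.6%
    std::printf("FX-8120: %.1f%% faster under DX12\n",
                (fx8120_dx12 / fx8120_dx11 - 1.0) * 100.0);  // ~54.6%
    return 0;
}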
 
I can see that going parallel on the CPU, using multiple cores/threads, to feed a serial GPU design like Nvidia's will become limiting. That also does not mean that AMD's ability to run graphics and compute in parallel on the GPU is necessarily very effective either. In the long run, as the CPU is taken full advantage of, the GPU will become more and more limiting. Hence CFX and SLI under DX12 should do better, using async compute on the GPU to prevent stalls.
 
Those are some scary numbers, Pendragon1. The gap in the medium results between DX12 on 4 cores vs. 8 cores is bigger than in the heavy benchmarks, where both CPUs have almost the same numbers...
 
Why is that scary? That was the point I was trying to make: DX12 is allowing lower-end processors to perform better, much better. These numbers are not much lower than another user's system running a Xeon X5670 with a 280X. So if the FX-4100 is almost equal to the FX-8120, which is almost equal to the X5670... am I wrong/confused, or is that not good?
 
Why is that scary? That was the point I was trying to make: DX12 is allowing lower-end processors to perform better, much better. These numbers are not much lower than another user's system running a Xeon X5670 with a 280X. So if the FX-4100 is almost equal to the FX-8120, which is almost equal to the X5670... am I wrong/confused, or is that not good?

Because if you asked anyone which processor to get, they would still prefer almost anything over an FX-4100, and with DX12 that changes (which is a pretty huge leap).
 
Oh, so you mean scary as in good... it gives budget-minded people and those with aging systems hope!
 
So really, in this instance it is showing that there is little to be gained over four cores in this scenario. That is a great improvement for sure, as there were obviously great bottlenecks with DX11.
But why the lack of advantage above four cores? Could it be that this game's programming limited it? Could it be the first-gen GCN card you are using for testing? It is obviously the weakest of all GCN cards as far as DX12 features go.
 
It depends a little on how you are willing to view this. This is the complete engine; when it was in demo form on Mantle, it did allow scaling with extra cores at higher batch counts. Now that they have limited it, it does not scale any more. In the presentation from several years ago they let it go up to 100K batches. I'm sure that they went for something more mainstream rather than catering to the extreme high end and the people who already own high-end hardware. The same engine still functions under DX11.

You would have to ask Oxide for the exact details of where they wanted the Nitrous engine to be; that there is more room is obvious, seeing that when talking about Mantle they estimated that 300K batches in the next few years would not be anything special.

In the DX12 playground we still have to see what DICE is doing with Frostbite. If DICE pushes DX12, you will get to see something that might even dwarf the Nitrous engine.
 