
Async Compute!

Status
Not open for further replies.
I don't own RotTR, but DOOM runs at mid 90's at 2560 x 1440 on my 5930K stock clocks + Fury X with latest drivers at the same settings. :)
 
So are the devs lying? That's the only answer I really want to know.
In straight talk, like a yes or a no.
 
No, it's just that those gains are probably on console.

Async compute ALSO relieves CPU bottlenecks by the way, especially on shitty multi-core CPUs like the ones on consoles.

Generally speaking, you need to tweak the async compute load on a per-GPU basis; depending on per-GPU characteristics (memory bandwidth per unit of compute throughput, GB/s per FLOP/s, for example), various combinations of overlapping operations may or may not provide performance gains.

Imo, and you can grill me on this, the most general performance gain scenario for GCN is asynchronous compute while doing geometry work
 
So there is a gain on PC (as stated in one of the tweets posted by the OP) and a gain on consoles. So it's a feature we should embrace, but we hate on it because?
 
No, it's just that those gains are probably on console.

Async compute ALSO relieves CPU bottlenecks by the way, especially on shitty multi-core CPUs like the ones on consoles.

Generally speaking, you need to tweak the async compute load on a per-GPU basis; depending on per-GPU characteristics (memory bandwidth per unit of compute throughput, GB/s per FLOP/s, for example), various combinations of overlapping operations may or may not provide performance gains.

Imo, and you can grill me on this, the most general performance gain scenario for GCN is asynchronous compute while doing geometry work

There are asynchronous draw calls and there's async compute. Draw calls allow you to process more things on the screen at once across multiple cores, allowing an increase in the number of objects or processing effects. Compute takes a portion of the GPU stack which is typically used for rendering and uses it for raw math. How much they benefit the game in question depends on how the game is coded. But as seen with early Mantle titles, already powerful CPUs and video cards didn't benefit much on early software stacks. The most benefit came when the card was powerful but the CPU was not, which was an imbalance of system design IMHO.
 
So there is a gain on PC (as stated in one of the tweets posted by the OP) and a gain on consoles. So it's a feature we should embrace, but we hate on it because?
Someone hates on it? I mean, it's probably a nice feature to extract performance out of the poverty hardware that is the current generation of consoles.

It's overrated on PCs, however.
 
So there is a gain on PC (as stated in one of the tweets posted by the OP) and a gain on consoles. So it's a feature we should embrace, but we hate on it because?

1). Async compute is a programming paradigm, async shaders is an AMD-specific implementation, not the same thing
2). Async compute should only ever be used when there's an effective performance gain, if there isn't one and the developer insists on keeping it on, something is wrong
3). It's not the biggest thing since sliced bread

There are asynchronous draw calls and there's async compute. Draw calls allow you to process more things on the screen at once across multiple cores, allowing an increase in the number of objects or processing effects. Compute takes a portion of the GPU stack which is typically used for rendering and uses it for raw math. How much they benefit the game in question depends on how the game is coded. But as seen with early Mantle titles, already powerful CPUs and video cards didn't benefit much on early software stacks. The most benefit came when the card was powerful but the CPU was not, which was an imbalance of system design IMHO.


What do you mean by asynchronous drawcalls ? If there's a drawcall involved it's not on the compute queue at all

Are you talking about multithreading drawcalls at the driver level ?
 
So there is a gain on PC (as stated in one of the tweets posted by the OP) and a gain on consoles. So it's a feature we should embrace, but we hate on it because?
We r being childish. We lack objectivity. We will argue just for the sake of arguing and on and on....:D
 
1). Async compute is a programming paradigm, async shaders is an AMD-specific implementation, not the same thing
2). Async compute should only ever be used when there's an effective performance gain, if there isn't one and the developer insists on keeping it on, something is wrong
3). It's not the biggest thing since sliced bread

Ok so I did not claim anything near what you implied in bullet #3.
You still have yet to tell me with a yes or no if those Devs are just lying about gaining perf on PCs.
Also, why focus on mocking it so much? You guys consistently talk condescendingly about a feature that could help in a decent number of scenarios according to these actual game Devs.
 
They aren't lying, but the work needed to get full utilization to gain that 10% or 15% is enormous. And you have to do that work for 3, actually 4, generations of AMD cards, 2 generations of nV cards, and the consoles lol. That multiplies the dev's workload 7 times if they want to get the best out of each hardware generation. And each generation's utilization rates are different based on bottleneck shifts; Pascal and GCN 4.0 will see even less return because both have improved throughput.

Also, you have to break down the gens by their different ROP, TMU, and ALU ratios; it gets pretty complex.
 
Ok so I did not claim anything near what you implied in bullet #3.
Never said you did.
You still have yet to tell me with a yes or no if those Devs are just lying about gaining perf on PCs.
No, it's just that those gains are probably on console.

Async compute ALSO relieves CPU bottlenecks by the way, especially on shitty multi-core CPUs like the ones on consoles.

Generally speaking, you need to tweak the async compute load on a per-GPU basis; depending on per-GPU characteristics (memory bandwidth per unit of compute throughput, GB/s per FLOP/s, for example), various combinations of overlapping operations may or may not provide performance gains.

Imo, and you can grill me on this, the most general performance gain scenario for GCN is asynchronous compute while doing geometry work

Also, why focus on mocking it so much? You guys consistently talk condescendingly about a feature that could help in a decent number of scenarios according to these actual game Devs.

I don't see how my posts can be misconstrued as mocking; it makes no sense to mock a computing paradigm. If anything, I may be mocking the view that the performance gains from async compute come from "beyond the hardware limitations", like the hardware is magically outputting more than it should be capable of.

If your code is 95% efficient on hardware X, then I will shit in my hat and eat it if you get more than 5% gains from async.

If Dan Baker (Oxide, AotS developer) says he got 20% performance gain from async, that means, it was only 83.33% efficient without async.

Here is the problem.

People see the async performance gains, don't see them replicated on nvidia hardware, and decide that nvidia hardware is inferior.

If a 980Ti in DX11(or DX12 without async) matches a Fury X in DX12+async, all that says to me is that with async enabled the Fury X matches a 980Ti's efficiency in this particular title.
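
For anyone who wants to check that arithmetic, here is a rough sketch. It assumes the idealized model used above, where async compute can at best fill the idle fraction of the GPU and nothing else changes. The function names are made up for illustration.

```python
def implied_utilization(async_gain):
    """Utilization before async, given the fractional speedup async produced.

    Assumes async brings the GPU to 100% utilization (the best case).
    """
    return 1.0 / (1.0 + async_gain)


def max_async_gain(utilization):
    """Best-case fractional speedup if async fills all idle GPU time."""
    return 1.0 / utilization - 1.0


# A reported 20% gain implies the code was about 83.33% efficient without async.
baker_utilization = implied_utilization(0.20)

# Code that is already 95% efficient has only a little headroom left.
headroom_at_95 = max_async_gain(0.95)

print(f"{baker_utilization:.2%} utilized before async")
print(f"{headroom_at_95:.2%} best-case gain at 95% utilization")
```

Under this model the ceiling at 95% utilization works out to slightly over 5%, which is the point: the gain can never exceed the idle time that was there to begin with.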
 
Hey, if AMD and nV can go to every game and give them dev support for async compute, GREAT! Let them do it and spend the money on it. Otherwise it's going to be a wash: you will have some games working better on one vendor's hardware than the other's, and/or devs not paying particular attention to what they are doing.
 
What do you mean by asynchronous drawcalls ? If there's a drawcall involved it's not on the compute queue at all

Are you talking about multithreading drawcalls at the driver level ?

Yes, but it's part of the asynchronous behavior supported by Mantle and Vulkan and DX12. People need to make a distinction where performance increases are coming from and some of these performance increases are from the threaded draw calls.
 
Hey, if AMD and nV can go to every game and give them dev support for async compute, GREAT! Let them do it and spend the money on it. Otherwise it's going to be a wash: you will have some games working better on one vendor's hardware than the other's, and/or devs not paying particular attention to what they are doing.

This is another thing worth thinking about NGFidel; the Hitman developer themselves stated that the work they had to put in to extract async compute gains from a range of hardware simply wasn't worth it.

They could have improved performance by the same amount, or more, universally; just by optimizing their shaders or something

Yes, but it's part of the asynchronous behavior supported by Mantle and Vulkan and DX12. People need to make a distinction where performance increases are coming from and some of these performance increases are from the threaded draw calls.

Oh. Yeah definitely. That's why AotS is the only game in which we can test, because we can do A/B testing of async on vs async off
 
Yes, but it's part of the asynchronous behavior supported by Mantle and Vulkan and DX12. People need to make a distinction where performance increases are coming from and some of these performance increases are from the threaded draw calls.
Well, that's why the max increase from async is around 10%. If GPUs were more than 10% underutilized, something is very, very wrong, either with the program or with the GPU design. Even with fixed-function shaders back in the day, we didn't see more than 20% underutilization unless the program was fubared.
 
They don't lie about async being useful on current console generation. As for Oxide guy, he slightly overrates the gain from Async in AotS, IMO. Nvidia GeForce GTX 1080 im Test (Seite 11)
Considering AotS has the ability to toggle Async now, I wish more sites would test it... It seems like an obvious thing to do. More people need to see this.
The whole shitstorm we've been following for nearly a year now is all based on... 10%. Which is also (roughly) the same amount of OC advantage Nvidia has over AMD.
 
Ok so I did not claim anything near what you implied in bullet #3.
You still have yet to tell me with a yes or no if those Devs are just lying about gaining perf on PCs.
Also, why focus on mocking it so much? You guys consistently talk condescendingly about a feature that could help in a decent number of scenarios according to these actual game Devs.
Could be a number of reasons they mock it. For one, they own Nvidia hardware and therefore have seen no benefit. It could also be that seeing no benefit while the competition does makes them fear any traction it may gain, since that would devalue their cards' performance rank.

The sad part is that the grand picture of DX12 multithreading and async is nothing but positive for the community, but because the greater share of the market cannot presently benefit, they lash out and mock the very thing that could move gaming forward.
 
Never said you did.





I don't see how my posts can be misconstrued as mocking; it makes no sense to mock a computing paradigm. If anything, I may be mocking the view that the performance gains from async compute come from "beyond the hardware limitations", like the hardware is magically outputting more than it should be capable of.

If your code is 95% efficient on hardware X, then I will shit in my hat and eat it if you get more than 5% gains from async.

If Dan Baker (Oxide, AotS developer) says he got 20% performance gain from async, that means, it was only 83.33% efficient without async.

Here is the problem.

People see the async performance gains, don't see them replicated on nvidia hardware, and decide that nvidia hardware is inferior.

If a 980Ti in DX11(or DX12 without async) matches a Fury X in DX12+async, all that says to me is that with async enabled the Fury X matches a 980Ti's efficiency in this particular title.
Your theory would be true if all hardware were the same and all code was 95% efficient. None of that is true in the real world. Stop making an ass out of yourself.
 
Could be a number of reasons they mock it. For one, they own Nvidia hardware and therefore have seen no benefit. It could also be that seeing no benefit while the competition does makes them fear any traction it may gain, since that would devalue their cards' performance rank.

The sad part is that the grand picture of DX12 multithreading and async is nothing but positive for the community, but because the greater share of the market cannot presently benefit, they lash out and mock the very thing that could move gaming forward.

Everything about DX12 is great outside of async. Async creates a divergence of code. I can see it helping out in specific instances, but it isn't a cure-all; it's a band-aid for the underutilization problem. The best approach is a hardware solution where the developers don't have to worry about it. And we have already seen the divergence.
 
So because Nvidia doesn't gain, or because Nvidia can OC to match a vendor-neutral feature, it's suddenly not worth praise?
 
Considering AotS has the ability to toggle Async now, I wish more sites would test it... It seems like an obvious thing to do. More people need to see this.
The whole shitstorm we've been following for nearly a year now is all based on... 10%. Which is also (roughly) the same amount of OC advantage Nvidia has over AMD.

Reviewers/tech websites have been particularly annoying with regard to this. Most reviews don't even mention whether async is enabled or not, and the default behavior has changed across game versions. The release version had async disabled by default when it detected nvidia hardware; later versions don't.

And still the variance in results is just perplexing.

Look at this
[Attachment: screenshot of computerbase.de, 2016-06-03]


42.6 fps with no async

I get 59 @ 1480/7800

There's no way in hell a 280mhz overclock(23%) makes a 39% performance difference

I got 59 fps with 8xMSAA. Default is 4x for crazy preset.
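
To make the mismatch concrete, here is a quick sketch of the percentages. It uses only the numbers quoted above; the 1200 MHz stock clock is not stated anywhere and is simply inferred as 1480 minus the 280 MHz overclock.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


review_fps = 42.6      # review result with no async
my_fps = 59.0          # my result at 1480/7800
oc_clock_mhz = 1480
oc_delta_mhz = 280
stock_clock_mhz = oc_clock_mhz - oc_delta_mhz  # inferred, not a quoted spec

clock_gain = percent_increase(stock_clock_mhz, oc_clock_mhz)
fps_gain = percent_increase(review_fps, my_fps)

print(f"clock: +{clock_gain:.1f}%  fps: +{fps_gain:.1f}%")
```

Even with perfect scaling, the clock bump alone would only account for a bit more than half of the fps gap, so something else in the test setups has to differ.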

So because Nvidia doesn't gain, or because Nvidia can OC to match a vendor-neutral feature, it's suddenly not worth praise?

It's not about OCing to match a feature NGFidel. It's about OCing to match the compute throughput on the higher specced AMD cards.

The Fury X destroy(ed) a stock 980Ti in AotS and people said it was because of async.

It's actually because of 6tflops vs 8.6tflops

The only meaningful comparison you can make, as far as async goes, is when both cards are rated at similar compute throughput
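
Here is where figures like those come from, as a back-of-the-envelope sketch. It assumes the usual peak FP32 estimate of 2 operations (one fused multiply-add) per shader per clock. The shader counts and stock clocks are spec-sheet values filled in from memory, not numbers from this thread, so treat them as assumptions.

```python
def peak_tflops(shaders, clock_mhz):
    """Theoretical peak FP32 throughput: 2 ops (one FMA) per shader per clock."""
    return 2 * shaders * clock_mhz * 1e6 / 1e12


# Assumed specs: 980 Ti with 2816 shaders at ~1075 MHz boost,
# Fury X with 4096 shaders at 1050 MHz.
gtx_980ti_stock = peak_tflops(2816, 1075)
fury_x_stock = peak_tflops(4096, 1050)

# The same 980 Ti at a 1480 MHz overclock.
gtx_980ti_oc = peak_tflops(2816, 1480)

for name, value in [("980 Ti stock", gtx_980ti_stock),
                    ("Fury X stock", fury_x_stock),
                    ("980 Ti @ 1480", gtx_980ti_oc)]:
    print(f"{name}: {value:.2f} TFLOPS")
```

On paper the overclocked 980 Ti lands close to the Fury X, which is the only configuration where comparing async on vs off between the two says anything about async itself.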
 
Could be a number of reasons they mock it. For one, they own Nvidia hardware and therefore have seen no benefit. It could also be that seeing no benefit while the competition does makes them fear any traction it may gain, since that would devalue their cards' performance rank.

The sad part is that the grand picture of DX12 multithreading and async is nothing but positive for the community, but because the greater share of the market cannot presently benefit, they lash out and mock the very thing that could move gaming forward.


If it wasn't for me and Razor breaking your balls you wouldn't have a clue what async actually is.

Are you following my posts around the forum, waiting for someone to post something that is critical of me, or that disagrees with me, so you can reply to them with your usual "nvidia fanboys are just trying to ruin amd" tripe, pretending you're subtle, as if you're not actually replying to me directly?

I don't understand it. It's very infantile. Just reply to me directly. It's not like I'm some kind of monster that gobbles poor forum-goers up. I am perfectly reasonable when the people I'm talking to aren't talking 100% bullshit, and they know it.

Stop with your 'subtle', passive aggressive provocation justreason. If push comes to shove I'm just going to start bombarding you with your own forum posts, what are you going to do ? Start denying what you wrote in them ?
 
Reviewers/tech websites have been particularly annoying with regard to this. Most reviews don't even mention whether async is enabled or not, and the default behavior has changed across game versions. The release version had async disabled by default when it detected nvidia hardware; later versions don't.

And still the variance in results is just perplexing.

Look at this View attachment 3650

42.6 fps with no async

I get 59 @ 1480/7800

There's no way in hell a 280mhz overclock(23%) makes a 39% performance difference

I got 59 fps with 8xMSAA. Default is 4x for crazy preset.



It's not about OCing to match a feature NGFidel. It's about OCing to match the compute throughput on the higher specced AMD cards.

The Fury X destroy(ed) a stock 980Ti in AotS and people said it was because of async.

It's actually because of 6tflops vs 8.6tflops

The only meaningful comparison you can make, as far as async goes, is when both cards are rated at similar compute throughput

This simply points right back to driver maturity, which you tend not to believe in whenever it suits your argument. How does one explain this huge disparity in performance? We fucking need Sherlock Holmes on that one ....lol
 
This simply points right back to driver maturity, which you tend not to believe in whenever it suits your argument. How does one explain this huge disparity in performance? We fucking need Sherlock Holmes on that one ....lol

This guy too. He's so focused on breaking my balls that he doesn't even think about what he's typing.

That's a GTX 1080 review. From like, a week ago or something. Same fucking drivers.
 
This simply points right back to driver maturity, which you tend not to believe in whenever it suits your argument. How does one explain this huge disparity in performance? We fucking need Sherlock Holmes on that one ....lol
Easy: 5820k@4.5Ghz against 6700k@4.5Ghz in CPU-heavy game. Seriously, all jokes about "playing AotS" aside, that game is CPU-heavy as all hell.
 
Easy: 5820k@4.5Ghz against 6700k@4.5Ghz in CPU-heavy game.

Possibly, [H] did two AotS reviews. The first one was the release day one, it was on a 6700k system.

Second one was on haswell-e

Fury X performance receded on haswell-e. 980Ti improved.

God knows why
 
I get you guys now. I still think it's a benefit, but I can see where it may be overhyped.
 
I get you guys now. I still think it's a benefit, but I can see where it may be overhyped.
We think it's a benefit too ! Just needs a lot of work for the benefit to translate into meaningful performance gains across a range of hardware.

Even in AotS different AMD cards benefit differently from async. Some get no benefit, some get 5%, some 15% etc etc.

The overclocking thing was a huge point of contention when I initially started posting my AotS results.
Tons of people said "you can't just OC to make up for the lack of a feature", but I'm not. It's a feature that's supposed to maximize the utilization of the shader array.

If your hardware can do 10 trillion operations per second, then async compute will help you maximize the use of the 10 trillion op/s

So if you start comparing a 980Ti at 6 trillion op/s with a Fury X at 8.6 trillion op/s, you're not talking about async compute at all. You're just "proving" that 8.6 is more than 6. No offense to anyone, but that's a pretty dull conclusion :p
 
Possibly, [H] did two AotS reviews. The first one was the release day one, it was on a 6700k system.

Second one was on haswell-e

Fury X performance receded on haswell-e. 980Ti improved.

God knows why
Easy: 5820k@4.5Ghz against 6700k@4.5Ghz in CPU-heavy game. Seriously, all jokes about "playing AotS" aside, that game is CPU-heavy as all hell.
You may be right. From like a week ago. You don't know how long it took to write. So stop acting like a little school girl. We all know what you are doing. This explains why some of the more mature and objective guys here have you on ignore.
 
The crazy thing is, now IHVs can create third-party libraries so tuned to their architecture that the other IHV can't do anything about it. This opens a can of worms. You think tessellation with GameWorks was bad? What if nV starts doing specific async routines and locks them up behind GameWorks? Yeah, the dev could have access to the code, but AMD sure won't. It's a double-edged sword, which is very dangerous. There will be no driver options to turn down async like they have with tessellation this time around.

Doesn't this sound like Intel's compiler game they played with AMD?
 
If you didn't care about GameWorks, then why care now? That mentality is already deep-seated in this industry and has been for almost a decade now.
 
You may be right. From like a week ago. You don't know how long it took to write. So stop acting like a little school girl. We all know what you are doing. This explains why some of the more mature and objective guys here have you on ignore.

More stupidity imhotep, you act like the driver version isn't included in the info of the test system

I don't speak German, but this was pretty easy to understand for me.

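Als kleiner Hinweis an dieser Stelle sei erwähnt, dass ab sofort neben den Referenzkarten ebenso jeweils stellvertretend ein Partnerdesign mit einem anderen Kühler und höheren Taktraten zum Einsatz kommt. Als Treiber werden der GeForce 368.13 für die GeForce GTX 1080, der GeForce 364.72 für die restlichen Nvidia-Grafikkarten und der Crimson 16.4.2 für die AMD-Modelle verwendet. Sämtliche Benchmarks wurden nach dem Aufwärmen der Grafikkarte durchgeführt, um einem nur kurzzeitig sehr hohen Boost-Takt entgegen zu wirken.

[Translation: As a small note, from now on each reference card is accompanied by one representative partner design with a different cooler and higher clock speeds. The drivers used are GeForce 368.13 for the GeForce GTX 1080, GeForce 364.72 for the remaining Nvidia graphics cards, and Crimson 16.4.2 for the AMD models. All benchmarks were run after the graphics card had warmed up, to counteract a very high boost clock that only holds briefly.]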

See what I mean imhotep? I get a lot of flak all the time, people calling me an nvidia shill, a troll, etc etc...

Yet I'm the one sourcing my information, I'm the one explaining async in simple terms, I'm the one posting screenshots of reddit threads. I'm the one quoting Robert Hallock when JustReason decides to post outdated comments instead of just accepting that he was misinformed.

Basically I get a ton of shit, yet in the end I'm usually right. Not because I'm some kind of genius, but because I think before I type. You guys should try that sometime
 
The crazy thing is, now IHVs can create third-party libraries so tuned to their architecture that the other IHV can't do anything about it. This opens a can of worms. You think tessellation with GameWorks was bad? What if nV starts doing specific async routines and locks them up behind GameWorks? Yeah, the dev could have access to the code, but AMD sure won't. It's a double-edged sword, which is very dangerous. There will be no driver options to turn down async like they have with tessellation this time around.

Doesn't this sound like Intel's compiler game they played with AMD?
Uhm, it is literally Intel's compiler game except played by both IHVs this time.
 
We think it's a benefit too ! Just needs a lot of work for the benefit to translate into meaningful performance gains across a range of hardware.

Even in AotS different AMD cards benefit differently from async. Some get no benefit, some get 5%, some 15% etc etc.

The overclocking thing was a huge point of contention when I initially started posting my AotS results.
Tons of people said "you can't just OC to make up for the lack of a feature", but I'm not. It's a feature that's supposed to maximize the utilization of the shader array.

If your hardware can do 10 trillion operations per second, then async compute will help you maximize the use of the 10 trillion op/s

So if you start comparing a 980Ti at 6 trillion op/s with a Fury X at 8.6 trillion op/s, you're not talking about async compute at all. You're just "proving" that 8.6 is more than 6. No offense to anyone, but that's a pretty dull conclusion :p
The bigger the # of operations in hardware, the more chance for inefficiencies. That's part of why AMD was pushing for low-level APIs like DX12 and Vulkan, starting with Mantle.
 
You may be right. From like a week ago. You don't know how long it took to write. So stop acting like a little school girl. We all know what you are doing. This explains why some of the more mature and objective guys here have you on ignore.


Actually, some of the guys that have him on ignore have me on ignore too. Some of them don't like it when you show them solid evidence against what they say, because they know they can't say anything back. Then they go down the road of name calling, and when we return the favor they put us on ignore lol. They have put CSI PC on ignore as well. Ironically, all of us actually have first-hand knowledge of how these things work!
 
The crazy thing is, now IHVs can create third-party libraries so tuned to their architecture that the other IHV can't do anything about it. This opens a can of worms. You think tessellation with GameWorks was bad? What if nV starts doing specific async routines and locks them up behind GameWorks? Yeah, the dev could have access to the code, but AMD sure won't. It's a double-edged sword, which is very dangerous. There will be no driver options to turn down async like they have with tessellation this time around.

Doesn't this sound like Intel's compiler game they played with AMD?

Uhm, it is literally Intel's compiler game except played by both IHVs this time.

lol. true

Async is indeed dangerous from this point of view. One of the main reasons I was so outspoken against Oxide initially is because they proudly proclaimed they had NO IHV-SPECIFIC CODE. That is just insane to me. DX12 game, no ihv-specific code ? And they're happy about it ? wtf is going on

Then Dan Baker said there IS ihv-specific code; async is disabled by default when nvidia hardware is detected.
That's it.

Actually, some of the guys that have him on ignore have me on ignore too. Some of them don't like it when you show them solid evidence against what they say, because they know they can't say anything back. Then they go down the road of name calling, and when we return the favor they put us on ignore lol. They have put CSI PC on ignore as well. Ironically, all of us actually have first-hand knowledge of how these things work!


lol then they tell me they have me on ignore like it's some kind of punishment for me. Like, now I don't have to keep explaining shit to them I'm supposed to be upset.


The bigger the # of operations in hardware, the more chance for inefficiencies. That's part of why AMD was pushing for low-level APIs like DX12 and Vulkan, starting with Mantle.

Okay. But then why is it a 1500mhz 980Ti (8.4tflops) doesn't suffer from the same inefficiencies the Fury X does ? Fury X with async is outperformed by 1500mhz 980Ti without async.

That means the performance gain (from async) brings the Fury X to parity with 980Ti in terms of utilization

So why has it become the norm to claim Nv's lack of async is an inherent handicap in terms of dx12 performance ?

The way I see it, AMD took charge of the async issue, their marketing material married async compute and async shaders and presented them as one.

NVidia architectures cannot reproduce 'async shaders'.

They can however 'do' async compute.

They aren't the same thing.

One is a paradigm, the other is a hardware specific implementation.
 
No one is trying to discredit you in any way. The general mentality of some of us here tends to be open-minded about what both teams have been bringing to the table, even though we have our own preferences.
Async compute looks promising, and even if it does not take off, I for one like to be optimistic and objective about it :)
 
No one is trying to discredit you in any way. The general mentality of some of us here tends to be open-minded about what both teams have been bringing to the table, even though we have our own preferences.
Async compute looks promising, and even if it does not take off, I for one like to be optimistic and objective about it :)

Man I don't understand why the focus has been on using async compute for rendering. Rendering ties you to frames. If you're tied to frames then you need synchronization with other related rendering tasks. If your work is synchronized it ain't fucking asynchronous, at least not by the dictionary definition of the word lol.

Interesting uses of async compute should be things that are not tied to rendering, like AI, or in some cases physics.

Like, engine renders a frame, frame is ready X ms before vsync, spend X ms doing async work before moving onto next frame
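
A toy sketch of that idea, just to show the scheduling logic. This is a plain CPU-side simulation with made-up job names and costs, and an assumed 60 Hz vsync interval. It does not use any real graphics API and says nothing about how a driver or GPU would actually schedule the work.

```python
from collections import deque

VSYNC_INTERVAL_MS = 1000.0 / 60.0  # assumed 60 Hz display


def run_frame(render_ms, background, interval_ms=VSYNC_INTERVAL_MS):
    """Spend the slack between frame completion and vsync on background jobs.

    `background` is a deque of (name, estimated_cost_ms) pairs, consumed in
    order. Returns the names of the jobs that fit into this frame's slack.
    """
    slack = max(0.0, interval_ms - render_ms)
    done = []
    # Only start a job if its estimated cost fits in the remaining slack.
    while background and background[0][1] <= slack:
        name, cost = background.popleft()
        slack -= cost
        done.append(name)
    return done


# Hypothetical non-rendering work that is not tied to any particular frame.
jobs = deque([("ai_pathfinding", 2.0), ("physics_step", 3.0), ("ai_planning", 4.0)])

first = run_frame(render_ms=10.0, background=jobs)   # about 6.7 ms of slack
second = run_frame(render_ms=16.0, background=jobs)  # under 1 ms of slack

print(first, second, list(jobs))
```

A fast frame gets two jobs done in its leftover time, and a slow frame does none and leaves the queue for later. Nothing here has to synchronize with the frame it happens to run alongside.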
 