Async Compute!

A perfect example of hyperbole would be anyone who believed nV when they stated Pascal was 10 times faster than Maxwell (that slide was about neural nets) and took that to apply to gaming. If someone came on this forum and stated that, I would say they are crazy, even if neural nets weren't explicitly called out in nV's presentation.

This is why, when people started talking about performance increases of 30% and above from async, and claiming nV was losing performance in AOTS because of it, it was unbelievable to any programmer who has written shaders and looked at shader utilization figures for both vendors' cards. Programmers should be well versed in what works for each IHV if they want to get decent performance out of their software.

PS: good programmers will look at shader utilization profiles to optimize their code. Not doing so is not just bad practice, it's foolish; profiling gives you far more information, far more quickly, than measuring frame rates.

I said as much at B3D about the Oxide dev; I was surprised they didn't look into the AOTS shader profile, lol.
'Oxide is an ethical company, no IHV-specific code in our DX12 build.'
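For anyone wondering what "looking at shader utilization" means in practice, here's a minimal, untested D3D12 sketch: wrap a pass in a pipeline statistics query and read back the invocation counters. It assumes you already have a device, a command list, and a readback buffer set up; error handling and the fence/wait are omitted.

```cpp
#include <d3d12.h>

// Rough sketch, not production code: bracket one pass with a pipeline
// statistics query so you can read back VS/PS/CS invocation counts.
void ProfilePass(ID3D12Device* device,
                 ID3D12GraphicsCommandList* cmdList,
                 ID3D12Resource* readbackBuffer)
{
    // One query slot is enough for a single pass.
    D3D12_QUERY_HEAP_DESC heapDesc = {};
    heapDesc.Type  = D3D12_QUERY_HEAP_TYPE_PIPELINE_STATISTICS;
    heapDesc.Count = 1;
    ID3D12QueryHeap* queryHeap = nullptr;
    device->CreateQueryHeap(&heapDesc, IID_PPV_ARGS(&queryHeap));

    cmdList->BeginQuery(queryHeap, D3D12_QUERY_TYPE_PIPELINE_STATISTICS, 0);
    // ... record the draws/dispatches you want to measure here ...
    cmdList->EndQuery(queryHeap, D3D12_QUERY_TYPE_PIPELINE_STATISTICS, 0);

    // Copy the counters out; once the GPU has finished (fence + wait),
    // map readbackBuffer and read a D3D12_QUERY_DATA_PIPELINE_STATISTICS
    // struct: VSInvocations, PSInvocations, CSInvocations, and so on.
    cmdList->ResolveQueryData(queryHeap, D3D12_QUERY_TYPE_PIPELINE_STATISTICS,
                              0, 1, readbackBuffer, 0);
}
```

Comparing those counters pass by pass shows where the shader cores are sitting idle far faster than staring at a frame counter ever will.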
 
'Oxide is an ethical company, no IHV-specific code in our DX12 build.'
So it's only OK for Nvidia to include GameWorks? Why are you annoyed by something implemented by a dev that works closely with AMD? You downplay anything and everything you see concerning AMD.
You do this in every possible topic.
 
Humor me: AMD is open about everything they do, async compute or not. A realistic approach to reality makes me a straw man? lol. Please enlighten me :)
 
Humor me: AMD is open about everything they do, async compute or not. A realistic approach to reality makes me a straw man? lol. Please enlighten me :)

Lol. I'm not calling you a straw man. Google 'straw man'.

Rofl. This humored me

Although, if I were to call you a straw man, it would only be so I can say I'm the big bad wolf that's going to topple you with a huff and a puff.
 
0_o wow that $199 really has people all riled up.

Async compute is a part of DX12, but it does not need to be used for a game to be a DX12 game.

End of discussion
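To illustrate the "does not need to be used" part, here's a rough, untested sketch of how a D3D12 title opts in: you create a second command queue of type COMPUTE next to the usual DIRECT (graphics) queue. Skip this call and you still have a perfectly valid DX12 game. Assumes an existing device; error handling omitted.

```cpp
#include <d3d12.h>

// Sketch: async compute in D3D12 is strictly opt-in.
ID3D12CommandQueue* CreateAsyncComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type     = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute-only queue
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;

    ID3D12CommandQueue* computeQueue = nullptr;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));

    // Work submitted here *may* overlap with graphics work on the DIRECT
    // queue. Whether it actually runs concurrently is up to the driver and
    // hardware, which is exactly why the same code can help one IHV and do
    // nothing (or worse) on another.
    return computeQueue;
}
```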
 
Humor me: AMD is open about everything they do, async compute or not. A realistic approach to reality makes me a straw man? lol. Please enlighten me :)


Do yourself a favor and just put Leldra on ignore. You'll be happier, because all he does is flamebait and taunt; he never contributes to any thread in a positive manner, ever.

You won't be missing anything, and you'll be happier for it.
 
0_o wow that $199 really has people all riled up.

Async compute is a part of DX12, but it does not need to be used for a game to be a DX12 game.

End of discussion

Where's that Einstein quote when I need it?

Meh.

In my own words:

'things need to be as simple as they can be without dumbing down the very essence of what is being discussed'
 
Humor me: AMD is open about everything they do, async compute or not. A realistic approach to reality makes me a straw man? lol. Please enlighten me :)


If you believe that... shit. AMD is a company, nV is a company, Intel is a company; they have all done shady things when they were capable of doing them (capability is really important here). Just because AMD/ATi has been at the short end of the stick and hasn't been able to pull those tactics for financial reasons doesn't mean they are holier than thou.
 
Where's that Einstein quote when I need it?

Meh.

In my own words:

'things need to be as simple as they can be without dumbing down the very essence of what is being discussed'

“Everything should be made as simple as possible, but not simpler.” ;)
 
Lol. I'm not calling you a straw man. Google 'straw man'.

Rofl. This humored me

Although, if I were to call you a straw man, it would only be so I can say I'm the big bad wolf that's going to topple you with a huff and a puff.
0_o wow that $199 really has people all riled up.

Async compute is a part of DX12, but it does not need to be used for a game to be a DX12 game.

End of discussion
It's not async in particular. I think it's very clear by now that in DX12 AMD is catching up, even with something as old as the Fury X. Pascal is a bit of a flop in comparison. $700? No thank you. The numbers speak for themselves.
 
It's not async in particular. I think it's very clear by now that in DX12 AMD is catching up, even with something as old as the Fury X. Pascal is a bit of a flop in comparison. $700? No thank you. The numbers speak for themselves.

Dude, it's $600; the only Pascal at $700 right now is the FE version. You can go to EVGA's site right now and get overclocked versions for less than $650, less than the cost of a 980 Ti and with better performance.
 
It's not async in particular. I think it's very clear by now that in DX12 AMD is catching up, even with something as old as the Fury X. Pascal is a bit of a flop in comparison. $700? No thank you. The numbers speak for themselves.

Nobody asked you what you think offers the best value; nobody cares what you do or don't buy.

You entered a discussion about async and made a bunch of accusations. I feel like Illidan Stormrage...

You are not prepared.
 
It's not async in particular. I think it's very clear by now that in DX12 AMD is catching up, even with something as old as the Fury X. Pascal is a bit of a flop in comparison. $700? No thank you. The numbers speak for themselves.
Your Einstein quote means nothing here. By this logic we could go back to calculators and never think about the complexities of GPUs and the tech behind making them do things faster and more efficiently.
 
Your Einstein quote means nothing here. By this logic we could go back to calculators and never think about the complexities of GPUs and the tech behind making them do things faster and more efficiently.


You didn't understand the quote. What Einstein was talking about is that when explaining things, you should make the explanation as simple as possible without losing the context of what you are trying to explain.
 
Your Einstein quote means nothing here. By this logic we could go back to calculators and never think about the complexities of GPUs and the tech behind making them do things faster and more efficiently.
[Image: Paris Tuileries Garden facepalm statue]


Some battles just can't be won

Also, a calculator is still a computer
 
[Image: Paris Tuileries Garden facepalm statue]


Some battles just can't be won

Also, a calculator is still a computer
This is not about winning. It's about the facts, facts you and razor1 refuse to accept. That is something you have proven not only to me, but to the majority of posters here. Linking all the sources you can dig up does not make you a winner or correct. Yes, a calculator is a computer, one of the most basic ones.
Too bad you can't make the connection here... lol
 
0_o wow that $199 really has people all riled up.

Async compute is a part of DX12, but it does not need to be used for a game to be a DX12 game.

End of discussion

And if developers find it's too hard to implement, or not enough juice for the squeeze, we can just blame nVidia for buying them off. Win-win!
 
This is not about winning. It's about the facts, facts you and razor1 refuse to accept. That is something you have proven not only to me, but to the majority of posters here. Linking all the sources you can dig up does not make you a winner or correct.

Well that's patently untrue, buddy.

You and a few others can't accept the facts.

You admitted you don't know or care how async works.

See ya.
 
Well that's patently untrue, buddy.

You and a few others can't accept the facts.

You admitted you don't know or care how async works.

See ya.
I don't have to understand it in the same detail developers do. You don't seem to understand it that way either. So, yes, have a great day and come again as you please. lol
 
This is not about winning. It's about the facts, facts you and razor1 refuse to accept. That is something you have proven not only to me, but to the majority of posters here. Linking all the sources you can dig up does not make you a winner or correct.


It's about winning an argument? That's what you think this is? Dude, if you feel that way, that's why you are posting the way you are right now.

'Cause you don't understand the fundamentals of GPU utilization and why async compute is the way it is; there are benefits and pitfalls to it.

I don't have to understand it in the same detail developers do. You don't seem to understand it that way either. So, yes, have a great day and come again as you please. lol

You don't need to be a dev to understand the high-level stuff.

The concepts are easy to understand, but to grasp them you need to understand many other things first; that's what is lacking.

It's like doing calculus: without algebra you aren't going to be able to do calculus, but once you know algebra, even if you've never done calculus before, the concepts are easy to understand.
 
It's about winning an argument? That's what you think this is? Dude, if you feel that way, that's why you are posting the way you are right now.

'Cause you don't understand the fundamentals of GPU utilization and why async compute is the way it is; there are benefits and pitfalls to it.



You don't need to be a dev to understand the high-level stuff.

The concepts are easy to understand, but to grasp them you need to understand many other things first; that's what is lacking.

It's like doing calculus: without algebra you aren't going to be able to do calculus, but once you know algebra, even if you've never done calculus before, the concepts are easy to understand.
Man, if I were wrong about all my understanding, then the forum is the least of my problems; I'm going to fail my exams, lol.

Not to mention that what the devs are saying aligns with what we are saying.

Hell, hard evidence from AOTS performance shows that different cards benefit from async differently even within the GCN family.
 
Man, if I were wrong about all my understanding, then the forum is the least of my problems; I'm going to fail my exams, lol.

Not to mention that what the devs are saying aligns with what we are saying.

Hell, hard evidence from AOTS performance shows that different cards benefit from async differently even within the GCN family.
Did it cross your mind that the cards with the most transistors benefit the most? This is why this tech was developed in the first place. Sort of like Intel's Hyper-Threading.
 
7 pages and still bickering; damn, Crosshairs should just lock this thread.
 
And if developers find it's too hard to implement, or not enough juice for the squeeze, we can just blame nVidia for buying them off. Win-win!

No, we just call them lazy developers. Just like multi-GPU, it's all on the developer.

If a developer does not use async compute or multi-GPU, to me that is being a lazy developer. DX12 now puts the responsibility on the game companies.
 
Did it cross your mind that the cards with the most transistors benefit the most? This is why this tech was developed in the first place. Sort of like Intel's Hyper-Threading.

No, it didn't cross my mind at all; transistor count is of no relevance to this discussion, at all.

I have no idea why they would lock the thread. They've locked threads before, and while I understand the logic behind the decision, it's neither just nor effective.

The conversation shouldn't descend into anger and insults, I get that. On the other hand, you are just arguing with me; you don't have a point. The discussion flits from subject to subject, and I don't think there's an actual point behind what you're saying.

*If* you had a point, we would be discussing the differences between GCN and CUDA in terms of scheduling, dispatching, and queuing.

We would be discussing how GCN operates differently and has more granular control over each of its 16-wide SIMD vector units.

But no, instead you're talking to me about Hyper-Threading, abstaining from technology, and using calculators.

Please.
 
No, we just call them lazy developers. Just like multi-GPU, it's all on the developer.

If a developer does not use async compute or multi-GPU, to me that is being a lazy developer. DX12 now puts the responsibility on the game companies.

I also think the fundamental problem with low-level programming is that it is very architecture-specific; one can program very well for one GPU vendor but not the other, due to either time constraints or a contractual agreement with a vendor. I don't even want to think about the headaches a major architectural change will bring.
 
"Did it cross your mind that those with most transistors benefit the most. This is why this tech has been developed in the first place. Sort of like Intel's hyperthreading."

Not at all. This is the explanation Koduri gave in one of his interviews. He must be wrong, according to razor1 and Leldra... lol
 
I also think the fundamental problem with low-level programming is that it is very architecture-specific; one can program very well for one GPU vendor but not the other, due to either time constraints or a contractual agreement with a vendor. I don't even want to think about the headaches a major architectural change will bring.

If they program for DX12, they can build in async compute and multi-GPU without giving any GPU vendor special treatment.

Make a game with async compute and multi-GPU, and let the benchmarks show who does well in DX12.
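For what it's worth, here's a rough sketch of what that vendor-neutral code looks like. The API only exposes queues and fences, and the same calls run on every IHV; how much actually overlaps is the hardware's business. Function and parameter names here are just illustrative.

```cpp
#include <d3d12.h>

// Sketch: cross-queue synchronization in D3D12. The graphics queue
// consumes results produced on the compute queue; identical code runs
// on AMD, nV, and Intel -- only the amount of overlap differs.
void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                 ID3D12CommandQueue* computeQueue,
                 ID3D12CommandList*  computeWork,
                 ID3D12CommandList*  graphicsWork,
                 ID3D12Fence*        fence,
                 UINT64              fenceValue)
{
    // Kick off compute work (e.g. particle sim, light culling) early.
    computeQueue->ExecuteCommandLists(1, &computeWork);
    computeQueue->Signal(fence, fenceValue);

    // Graphics proceeds independently until it needs the compute results.
    graphicsQueue->Wait(fence, fenceValue); // GPU-side wait; CPU not blocked
    graphicsQueue->ExecuteCommandLists(1, &graphicsWork);
}
```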
 
This topic is about async compute, not about CUDA or whatever else you want to talk about. So, yes, don't try to derail it. :)
 
Did it cross your mind that the cards with the most transistors benefit the most? This is why this tech was developed in the first place. Sort of like Intel's Hyper-Threading.

Hyper-Threading and branch prediction are what made Intel processors much more attractive for multithreading. Yes, that does take up transistor space.

In GPU terms, this affects async too, but it's not purely about transistor count, as the control silicon tends to be the same size within a generation. From GCN 1 to 3 this hasn't changed much either; we don't know much about GCN 4 yet, though. The bigger factor is that different unit counts and ratios create different bottlenecks within the GPU, which affect utilization rates, and keep in mind you might not be bottlenecked by the same thing throughout a frame.

CUDA is ancillary to this, as that stack is completely controlled by nV. But even the CUDA documents specifically tell you to look at your utilization rate when writing code.
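For reference, the first-order utilization number the CUDA docs point you at is occupancy, and there's an API to query it. A minimal, untested driver-API sketch in host C++; the .ptx path and kernel name are made up for illustration.

```cpp
#include <cuda.h>
#include <cstdio>

// Sketch: ask the CUDA driver how many blocks of a kernel can be resident
// per SM at a given block size -- the basic occupancy check.
int main()
{
    cuInit(0);
    CUdevice  dev; cuDeviceGet(&dev, 0);
    CUcontext ctx; cuCtxCreate(&ctx, 0, dev);

    // Hypothetical module and kernel names, purely for illustration.
    CUmodule   mod; cuModuleLoad(&mod, "my_kernels.ptx");
    CUfunction fn;  cuModuleGetFunction(&fn, mod, "my_kernel");

    int blocksPerSM = 0;
    const int blockSize = 256; // threads per block
    cuOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, fn,
                                                blockSize, /*dynSmem=*/0);
    printf("%d blocks of %d threads resident per SM\n",
           blocksPerSM, blockSize);
    return 0;
}
```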
 
Hyper-Threading and branch prediction are what made Intel processors much more attractive for multithreading. Yes, that does take up transistor space.

In GPU terms, this affects async too, but it's not purely about transistor count, as the control silicon tends to be the same size within a generation. From GCN 1 to 3 this hasn't changed much either; we don't know much about GCN 4 yet, though. The bigger factor is that different unit counts and ratios create different bottlenecks within the GPU, which affect utilization rates, and keep in mind you might not be bottlenecked by the same thing throughout a frame.

CUDA is ancillary to this, as that stack is completely controlled by nV.

Yeah, but there's no need to talk about transistor counts. The more independent and independently addressable execution units there are, the more you can benefit from this kind of scheme, particularly considering how work is distributed across the various stages of the pipeline in GPUs. As far as GCN is concerned it is a very good idea; there's no reason to say anything to the contrary.

We are just being drawn into another discussion that is quickly spiraling off course.

For example, one thing that contributes to GCN being particularly ALU-dense, other than the higher transistor density of their designs, is the use of vector units as opposed to the scalar cores in CUDA.

Async compute does not operate at the transistor level, and talking about transistor counts is basically misdirection on his part.
 
Yeah, transistor counts don't matter much for async in GPUs, 'cause the control silicon tends to be relatively the same size across the entire line. If you are capable of doing async compute, you have already put down the transistors to do it.

Now, if we are talking about dynamic load balancing vs. async shaders, there will be some transistors used for those operations, though the amount is pretty much unknown. What you need in silicon is enough register space to unroll the shaders and then schedule them dynamically from that. The amount needed is anyone's guess, but both AMD and nV would have to use close to a similar amount to get similar results.
 