Nvidia showed off "Pascal" w/ Maxwell GPUs, later removed photos.

TaintedSquirrel

[H]F Junkie
Joined
Aug 5, 2013
Messages
12,691
http://www.extremetech.com/gaming/2...otype-allegedly-powered-by-maxwell-not-pascal

https://twitter.com/wayneborean/status/687179443572441088

As Anandtech readers noted, the hardware Jen-Hsun showed was nearly identical to the GTX 980 in an MXM configuration. The new Drive PX 2 is shown above, the GTX 980 MXM is shown below. The hardware isn’t just similar — the chips appear to be identical. Some readers have also claimed they can read the date code on the die as 1503A1 — which would mean the GPUs were produced in the third week of 2015.

@SemiAccurate FYI, Nvidia has pulled the picture https://www.flickr.com/photos/21075872@N02/23556454063a … used in this article http://semiaccurate.com/2016/01/11/nvidia-pascal-over-a-year-ahead-of-1416nm-competition/ … hope you kept a copy!

 
IDK, the angle of the pic just seems off; it makes me think it's actually fake..
 
I was wondering, why does it have MXM modules in the first place?

Never mind, I got it
 
How is Nvidia stupid enough to walk their craft projects around in front of people with cameras :p
 
This explains to some extent why Pascal wasn't shown at CES. They don't have a good enough working GPU to even demo, let alone GPUs to put in a Pascal Drive PX 2 and show people.

Really kind of shocked to hear about this.
 
The reason is simple and not nefarious: they don't want to actually show Pascal hardware to the competition before a time of their choosing.
 
How much trouble will they get into in the auto industry when they try to pull this stuff off?

The government is after VW for $50 billion (>_<)
 
How much trouble will they get into in the auto industry when they try to pull this stuff off?

The auto industry puts fake geometry on their prototype cars during outdoor testing on private tracks all the time so the paparazzi don't get pictures of the real look. They never get in trouble for it.

The government is after VW for $50 billion (>_<)

Because vehicles sold with software that lies to emissions tests are totally the same thing as not showing what a prototype graphics card looks like. Yeah, no.
 
I wonder how long Nvidia buyers will wait for Async Compute? One year? Two? Maybe more?
 
I wonder how long Nvidia buyers will wait for Async Compute? One year? Two? Maybe more?

Not sure what you are talking about; Nvidia can run Async Compute, it is already there on Maxwell.

Now how efficiently it can run Async Compute is another story.

But yeah, 1-2 years? How about right now?
 
The reason is simple and not nefarious: they don't want to actually show Pascal hardware to the competition before a time of their choosing.

Your sig is just so fitting with this line.

I may have just lied to you.
 
Nvidia's CEO flat out lied to a room full of journalists and investors... oooooook. Not that surprising, he's done it before, but I would guess the SEC might have a few questions for him.

And I'm also curious why Audi brought Qualcomm onboard so quickly...
 
I'll give a damn when reviews show a difference.

That may take a while as DX12 seems to be indefinitely delayed for "driver issues" and Vulkan is undergoing an extensive legal review despite being ready. It's almost like one of the hardware developers is delaying things until they can support a multi-threaded API similar to Mantle.
 
That may take a while as DX12 seems to be indefinitely delayed for "driver issues" and Vulkan is undergoing an extensive legal review despite being ready. It's almost like one of the hardware developers is delaying things until they can support a multi-threaded API similar to Mantle.

Current reviews, the ones currently running DX12 (as in, ACTUAL DATA), show that Async Compute is not that big of a deal. DX12 is not being held back, it's out in the open right now, and you can try it out.
 
I have to agree, they do this all the time, although in most instances it's just the markings scratched off of the GPU or IC. Heck, we even received a bunch of customer parts one time that had the identification shaved off... some kind of high-end headphone amplifier pass-through system. Kind of weird, but they wanted it that way, the board being made up almost entirely of off-the-shelf parts.
 
Current reviews, the ones currently running DX12 (as in, ACTUAL DATA), show that Async Compute is not that big of a deal. DX12 is not being held back, it's out in the open right now, and you can try it out.
Not a big deal, yet developers had to explicitly code paths to prevent Nvidia performance from tanking, followed up by a "Do's and Don'ts" from Nvidia saying not to do it. There is a high-level and a low-level aspect to the async. One of those absolutely crushes Nvidia performance, and the other gives a relatively minor boost. The part that doesn't work is the one that lets you use lots of threads to render. Not having to worry about thread safety is rather significant for a developer.
 
Not a big deal, yet developers had to explicitly code paths to prevent Nvidia performance from tanking, followed up by a "Do's and Don'ts" from Nvidia saying not to do it. There is a high-level and a low-level aspect to the async. One of those absolutely crushes Nvidia performance, and the other gives a relatively minor boost. The part that doesn't work is the one that lets you use lots of threads to render. Not having to worry about thread safety is rather significant for a developer.


Async compute is part of DX12. nV cards are capable of doing async compute; without it they can't get DX12 certification (this is why even Fermi can get DX12 certification, as can Kepler, and both previous gens are good at async compute). What you are really posting about is concurrent kernel execution, so get your terms right.

The Do's and Don'ts is not only for nV hardware, it's for all hardware; there is only one section of that paper that is explicit to nV hardware, it's near the end, and it doesn't pertain to concurrent execution.

PS: there is no such thing as a multithreaded API. Multithreading is exposed by the API and realized through how the end code is written.
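To make the terminology concrete, here's a minimal CUDA sketch (CUDA rather than DX12, and both kernels are made-up stand-ins, not anything Nvidia ships). Launching work asynchronously only means the host doesn't wait; whether two kernels actually overlap on the GPU comes down to putting them in separate streams and the hardware having a free queue for them.

[code]
// Hypothetical sketch: asynchronous launches vs. concurrent kernel execution.
#include <cuda_runtime.h>

__global__ void computeKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f + 1.0f;   // stand-in for compute work
}

__global__ void graphicsLikeKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 0.5f;          // stand-in for the "rendering" workload
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    cudaStream_t sCompute, sGraphics;
    cudaStreamCreate(&sCompute);
    cudaStreamCreate(&sGraphics);

    // Same stream = launches queue up one after another (asynchronous to the host,
    // but serialized on the device). Different streams = the driver MAY run them
    // concurrently, if the hardware has free queues and SM resources.
    computeKernel<<<(n + 255) / 256, 256, 0, sCompute>>>(a, n);
    graphicsLikeKernel<<<(n + 255) / 256, 256, 0, sGraphics>>>(b, n);

    cudaStreamSynchronize(sCompute);
    cudaStreamSynchronize(sGraphics);

    cudaStreamDestroy(sCompute);
    cudaStreamDestroy(sGraphics);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
[/code]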
 
Excellent way to defuse the arguments, despite their truths.
How much you get paid bro?
 
Excellent way to defuse the arguments, despite their truths.
How much you get paid bro?


Where I work, high six figures, and that's without my bonuses from any film or TV show I've worked on over the past 6 years; if a show goes into syndication, usually after 6 seasons, then I get a piece of that too.

How much do you get paid?
 
You better ask your mother ;)
Just yell, she will hear you from the basement!


Well I do chat with her at times, since I live around three and a half hours away from my family. Their house has no basement, but I just bought them a house in Frisco, Texas, so I guess I won't be seeing them as often as I used to :p. But that's ok, I plan to move down there in 5 years once I retire.

You want to know something: if these guys knew what they were talking about, it would be great, but they don't. It's the blind leading the blind, and that's the problem these forums have been having for the past 10 years. Very few are willing to really learn what their computer actually does. It's just crap talking and posting back and forth that goes nowhere.
 
Async compute is part of DX12. nV cards are capable of doing async compute; without it they can't get DX12 certification (this is why even Fermi can get DX12 certification, as can Kepler, and both previous gens are good at async compute). What you are really posting about is concurrent kernel execution, so get your terms right.

The Do's and Don'ts is not only for nV hardware, it's for all hardware; there is only one section of that paper that is explicit to nV hardware, it's near the end, and it doesn't pertain to concurrent execution.

PS: there is no such thing as a multithreaded API. Multithreading is exposed by the API and realized through how the end code is written.
Just because it's capable doesn't mean it's a good idea.

As for the threading and the Do's and Don'ts, I'm talking about a more robust architecture than just "Consider a 'Master Render Thread' for work submission". I'm looking at it more from the aspect of physics acceleration, rendering, AI, and maybe even reflecting sounds. Tasks more similar to several different processes sharing data.
 
Just because it's capable doesn't mean it's a good idea.

As for the threading and the Do's and Don'ts, I'm talking about a more robust architecture than just "Consider a 'Master Render Thread' for work submission". I'm looking at it more from the aspect of physics acceleration, rendering, AI, and maybe even reflecting sounds. Tasks more similar to several different processes sharing data.


Async compute is just two or more compute threads executing together.

Concurrent execution, with compute instructions interleaving the master rendering thread, is not the same thing. Maxwell 2 is capable of having more than one rendering thread and more than one compute thread. The driver and architecture are made for reducing latency when doing compute shaders; if they are forced into concurrent execution, in some cases they don't like it, because it will actually increase latency, not decrease it beyond what it's already doing another way.

Edit: Simply look at it like this. The reduction of latency is better when:

A) one architecture is better at Ops/ cycle
Or

B) one architecture is better at thread throughput

This goes for AMD hardware too, and for Intel: multithreaded code is specific to the hardware.

Take a look at CPUs: Hyper-Threading on Intel works the same way. Code tuned for Hyper-Threading on Intel CPUs might not work as well on AMD CPUs, and vice versa.

So if you take an architecture that has more ops/cycle and force it to run heavily multithreaded code, at some point it will break.
Likewise, if you take an architecture that has more thread throughput and force it to do more ops/cycle, at some point it will break.

Which way is better? A mixture of both would be great, but this generation of GPUs, nV went one way and AMD another. In today's games it seems like it doesn't matter for nV; whether it will make an impact in the future is anyone's guess, because it might take long enough for games to use that much concurrent execution that this generation of GPUs is no longer really viable for the games out at that time.
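For what it's worth, whether a given card even advertises concurrent kernel execution is something you can query from the CUDA runtime. A quick sketch of my own (it only reports what the hardware exposes, not how well concurrency actually performs):

[code]
// Rough sketch: ask the CUDA runtime what each device exposes.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("%s\n", prop.name);
        printf("  concurrent kernels supported: %s\n", prop.concurrentKernels ? "yes" : "no");
        printf("  async copy engines:           %d\n", prop.asyncEngineCount);
        printf("  multiprocessors:              %d\n", prop.multiProcessorCount);
    }
    return 0;
}
[/code]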
 
can someone fill me in on what this "wooden screw" thing means? :(


Fermi was shown off 6 months before it launched, right around the time AMD launched their new generation of hardware, and nV stated it was ready with Fermi. But the card was a fake; it literally had wood screws on the back plate.
 
You want to know something: if these guys knew what they were talking about, it would be great, but they don't. It's the blind leading the blind, and that's the problem these forums have been having for the past 10 years. Very few are willing to really learn what their computer actually does. It's just crap talking and posting back and forth that goes nowhere.

Alright, BS aside.
Are you seriously stating that current Nvidia hardware isn't lacking compared to AMD's as far as DX12/Vulkan is concerned? Not from a "yeah, it supports it" standpoint, but from a "yeah, it allows for better low-level API performance that better utilizes more threads than the competition when properly coded" standpoint?
 
Not sure what you are talking about; Nvidia can run Async Compute, it is already there on Maxwell.

Now how efficiently it can run Async Compute is another story.

But yeah, 1-2 years? How about right now?

Async compute and graphics at the same time, then?
 
Alright, BS aside.
Are you seriously stating that current Nvidia hardware isn't lacking compared to AMD's as far as DX12/Vulkan is concerned? Not from a "yeah, it supports it" standpoint, but from a "yeah, it allows for better low-level API performance that better utilizes more threads than the competition when properly coded" standpoint?


Async compute? nV's hardware is damn good at it. CUDA has had this type of feature for 3 generations of graphics cards.

It depends: does the use of concurrent kernel execution in the same thread lead to better latency reduction on nV hardware vs. running async compute in only the compute thread and leaving the rendering threads alone?

It's not as simple as "it will" or "it won't"; it's highly dependent on the code. The coder doesn't have control over which instructions will interleave with which kernels at what time. They can to some degree, but doing so introduces more latency than would be advisable, through context switching, on either AMD or nV hardware. AMD does have less of a penalty because they have fine-grained context switching, but nonetheless it's still a penalty.
 
Sweet. Would love to hear you at a DX12/Vulkan programmers' conference! LMAO :)


Programming ain't my forte, not anymore; it hasn't been for 15 years. But I still understand what goes on under the hood, and I still program when I have to, mostly just fx stuff, and that is my forte: vfx and production.

http://www.training.prace-ri.eu/uploads/tx_pracetmo/CUDA22.pdf

Page 9 of this programming guide for CUDA covers asynchronous calls...

https://opus4.kobv.de/opus4-zib/files/5036/ZR-14-19.pdf

Here is a good paper that shows the performance benefits of doing so.

This is the thing: compute threads usually are faster than the rendering thread, so if the compute threads finish well before the master rendering thread, the advantage of doing concurrent kernel execution in the master rendering thread is moot, because the total time at the end will be the same. There will be no perceived latency reduction.

Executing single small-scale workloads on GPU is insufficient to fully utilize the GPU's compute performance. A viable approach to take advantage of accelerators, however, is to execute multiple kernels in a concurrent way. For a synthetic benchmarking kernel, we found using Nvidia's Hyper-Q feature -- a mechanism for concurrent offload computations on a shared GPU within multi-threaded/-process applications -- gives significant performance gains over a serial execution of the same amount of work on GPU. However, we encountered some deficiencies when using Hyper-Q from within multi-threaded programs, causing the overall program performance to be up to a factor 3 behind the expectations. We were able to recover the performance by using a kernel reordering scheme we introduced earlier for Nvidia Fermi GPUs. Using the Nvidia Visual Profiler, we found that seemingly not all hardware queues offered by Kepler GPUs are equally favored, but some of them are processed to completion before the others. Executions using Hyper-Q within multiple processes behave close to optimal with respect to concurrency on GPU.
This is straight from the second link.
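If anyone wants to poke at that claim on their own card, here's a rough CUDA timing sketch of mine (kernel names, launch sizes, and stream counts are all made up for illustration): it launches the same number of tiny kernels once back-to-back in a single stream and once spread across eight streams, and times both with CUDA events. On hardware with multiple usable queues the second run should come in noticeably faster.

[code]
// Sketch in the spirit of the paper's point (my own code, not theirs): many small
// kernels, serial in one stream vs. spread across several streams.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void smallKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = sqrtf(data[i]) + 1.0f;   // deliberately tiny workload
}

static float timeLaunches(float *buf, int n, int launches, int numStreams) {
    cudaStream_t streams[8];
    for (int s = 0; s < numStreams; ++s) cudaStreamCreate(&streams[s]);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    // All launches write the same small buffer; fine for a timing demo, not for real work.
    for (int k = 0; k < launches; ++k)
        smallKernel<<<4, 128, 0, streams[k % numStreams]>>>(buf, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    for (int s = 0; s < numStreams; ++s) cudaStreamDestroy(streams[s]);
    return ms;
}

int main() {
    const int n = 4 * 128;                 // small enough that one launch can't fill the GPU
    float *buf;
    cudaMalloc(&buf, n * sizeof(float));
    cudaMemset(buf, 0, n * sizeof(float));

    printf("serial (1 stream):      %.2f ms\n", timeLaunches(buf, n, 256, 1));
    printf("concurrent (8 streams): %.2f ms\n", timeLaunches(buf, n, 256, 8));

    cudaFree(buf);
    return 0;
}
[/code]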
 
Fermi was shown off 6 months before it launch right around the time AMD launched their new generation hardware and nV stated it was ready with Fermi, but the card was a fake it literally had wood screws on the back plate.

Whoa... if you really don't know, this one's a classic! You owe it to yourself to read this one slowly:

https://semiaccurate.com/2009/10/01/nvidia-fakes-fermi-boards-gtc/

Ah thanks guys! I really didn't know haha!
 
Ah, back to your incredibly unrelated day job. You're quite the pro at this.

Good job at updating your post. LOL
 