Lorien
Supreme [H]ardness
- Joined: Aug 19, 2004
- Messages: 5,197
So much hype and no release date. NV30 repeat.
Yup. This pretty much sums it up.
So much hype and no release date. NV30 repeat.
This wasn't a launch, you tool...
I am going to disagree with you there. DX11 does actually bring something to the table that justifies it (I'll agree with you on 10, and 10.1 could have if everyone had gone with it). I agree that a lot of companies are going to focus on the larger game markets, but trends come and go and console platforms do not stay stagnant. Tessellation alone (I think some already use it?) makes it a candidate for the next Xbox. Also, DirectCompute, OpenCL and the still-viable PC game market are going to be pushing this. Probably not this coming year, but with Larrabee and other factors I think it is coming. OpenCL is awesome, but if it ends up like OpenGL, MS will be setting the standard I think.
Also, I think right now we are looking at a case where PC gaming is going to distinguish itself from console gaming. Hell, look at Eyefinity: there is no console that can do that. And Nvidia's up-and-coming card may well be able to do all kinds of special effects (I am hoping for a working AO driver override). To say that PC gaming is dead is short-sighted, I think.
JM2C
111.9GB/s to be exact. (Both GPUs have their own memory banks, but the data in those banks is always identical; data sent through their own memory channels is therefore identical.)

Thanks for the link BTW. GTX295 = 223.8GB/s
To all the people who don't want to wait: nobody is asking you to. Go buy what you need now. So what's the point of the same 2 or 3 people polluting every Fermi thread with their noise and BS? Is it out of fear? Do you fear Fermi?
DirectX 11 is 100% meaningless; less than meaningless.
LOL. Or... then some, and then a GPU.
Not sure what to make of your first post Ron; pretty much everyone on this site knows specs alone do not make an excellent product. And to me it looks like the Nvidia fans are worried, because they seem to be referring less and less to games and focusing on other areas of the market. If I were them I would be worried: within a year or two Intel will be up and running with their video cards, and NV has no x86 license to speak of.

If NVIDIA fans should be worried, then we should all be worried.
Nvidia wants to replace the CPU, not become the new CPU.
I think the numbers NVIDIA's provided so far are intriguing. It's impossible to gauge whether GF100 will make a great gaming chip (or at least as good a gaming chip as RV870, which would be fine with me), but the supposed increase in compute performance is pretty exciting, to say the least.
What I want to know is: what can the GF100 bring us (consumers) with its new cGPU/GPGPU design? What can it do that cannot be achieved with the CPUs and GPUs we already have (for example via OpenCL and DX11)?
Well, I have a feeling something like FLACuda might run up to five times faster on GF100 than it does on an i5 or i7. If that's the case, then there may be the capability to speed up more FP-heavy tasks (like video encoding) by a factor of maybe four or five over the current generation. Developers really aren't scratching the surface of what CUDA can provide, because the hardware we have today isn't usually faster than high-end CPUs, so there's not much incentive to bother with it. If that changes, and we see a two-, three- or four-fold increase in computational performance out of GF100, then the whole computing environment itself is likely to change. Not immediately, but probably pretty quickly.
We'll do the same stuff we're doing on CPUs, only we'll be doing it quicker.
111.9GB/s to be exact. (Both GPUs have their own memory banks, but the data in those banks is always identical; data sent through their own memory channels is therefore identical.)
--->
111.9GB/s is the effective memory bandwidth for the GTX 295.
You're looking at a heatspreader.
If those memory channels carry that identical data, then how does that scaling happen? Those channels can transfer 223.8GB/s of data, but only 111.9GB/s of different data.

Sigh, no. Bandwidth scales. It's capacity that doesn't.
True, but it's got to scale. You're telling me a bigger, hotter chip =/= a larger physical heatspreader?
If those memory channels carry that identical data, then how does that scaling happen? Those channels can transfer 223.8GB/s of data, but only 111.9GB/s of different data.
GPU1 has 896MB on a 448-bit bus.
GPU2 also has 896MB on a 448-bit bus.
The data is replicated; they are rendering the same textures and shit, so everything gets loaded into both memories... but GPU1 only has access to its own memory pool, and the same goes for GPU2.
And each memory pool can only communicate at 111.9GB/s with the GPU it's associated with.
(This is as I understand it.)
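The 111.9GB/s figure falls straight out of the bus width and memory clock. A quick sanity check, assuming the commonly quoted 999MHz (1998MT/s effective) GDDR3 clock for the GTX 295:

```python
# Per-GPU memory bandwidth of the GTX 295, from the figures in this thread.
# Assumption: 999 MHz GDDR3, double data rate -> 1998 MT/s effective.
bus_width_bits = 448
effective_rate_gtps = 2 * 0.999            # giga-transfers per second

per_gpu_gbps = (bus_width_bits / 8) * effective_rate_gtps
print(round(per_gpu_gbps, 1))              # -> 111.9 (GB/s per GPU)
print(round(2 * per_gpu_gbps, 1))          # -> 223.8 (combined, but the data is duplicated)
```

So the combined 223.8GB/s number is real bandwidth, it just moves two copies of the same data.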
You mean the ATI fanboys are starting to sweat a little? The 5870 isn't even the fastest card out, and this card looks like it's going to eat it for breakfast. Maybe not, but with those specs... wow...
Then again, I might just have one of each card... Eyefinity is really, really amazing.
Do you understand GPU architectures? Because if you did you wouldn't hold this opinion.
GF100 has a peak theoretical computational performance figure of ~1.25TFLOPs for Single Precision and 624GFLOPs for Double Precision. (How did I get to this number, you ask? GT200 had a rate of 78GFLOPs for DP, and nVIDIA's CEO was quoted saying GF100 has 8x that peak figure, for a total of 624GFLOPs. If you understand that GF100 does DP at half the SP rate, then you understand that doubling the 624GFLOP DP figure gives you the SP figure. You can catch that here: http://www.pcper.com/article.php?aid=789 at the bottom.) This means the clock speed is likely to be ~610MHz with a shader clock of ~1220MHz.
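Written out as a quick sketch (the 78GFLOPs GT200 DP figure and the "8x" quote are from the pcper link; the 512-core count and 2 ops per clock via FMA are the published GF100 specs):

```python
# Working backwards from the quotes above: 78 GFLOPs GT200 DP, "8x" for GF100,
# DP at half the SP rate, 512 cores doing 2 ops per clock (FMA).
gt200_dp_gflops = 78
gf100_dp_gflops = 8 * gt200_dp_gflops          # -> 624
gf100_sp_gflops = 2 * gf100_dp_gflops          # -> 1248, i.e. ~1.25 TFLOPs

cores = 512
shader_clock_ghz = gf100_sp_gflops / (cores * 2)
print(gf100_dp_gflops, gf100_sp_gflops, round(shader_clock_ghz, 2))  # 624 1248 1.22
```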
Most folks do not know this, but GT200 (GTX 280) actually has a peak theoretical computational performance figure of 622GFLOPs (not 933GFLOPs, which was derived under the false assumption of a missing MUL that has now been pulled out of the SFU; see here: http://www.anandtech.com/video/showdoc.aspx?i=3651&p=3. Quote: "In addition to the cores, each SM has a Special Function Unit (SFU) used for transcendental math and interpolation. In GT200 this SFU had two pipelines, in Fermi it has four. While NVIDIA increased general math horsepower by 4x per SM, SFU resources only doubled. The infamous missing MUL has been pulled out of the SFU, we shouldn't have to quote peak single and dual-issue arithmetic rates any longer for NVIDIA GPUs.").
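Sanity-checking that 622GFLOPs figure, assuming the GTX 280's published 1296MHz shader clock and counting only the MAD (2 ops per core per clock):

```python
# GT200 peak with the "missing MUL" no longer counted: cores * 2 (MAD) * clock.
# Assumption: the GTX 280's published 1296 MHz shader clock.
cores = 240
gtx280_shader_ghz = 1.296
peak_gflops = cores * 2 * gtx280_shader_ghz
print(round(peak_gflops))  # -> 622, not the old 933 dual-issue figure
```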
RV870 has a peak computational performance rate of 2.72TFLOPs for Single Precision and 544GFLOPs for Double Precision.
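Those RV870 numbers fall straight out of the shipping HD 5870 specs (1600 stream processors at 850MHz, with DP at 1/5 the SP rate):

```python
# HD 5870 shipping specs: 1600 stream processors at 850 MHz, 2 FLOPs/cycle (MAD),
# with double precision running at 1/5 the single-precision rate.
sps = 1600
core_clock_ghz = 0.85
sp_gflops = sps * 2 * core_clock_ghz   # ~2720, the "2.72 TFLOPs" figure
dp_gflops = sp_gflops / 5              # ~544
print(round(sp_gflops), round(dp_gflops))  # -> 2720 544
```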
Therefore for Double Precision workloads GF100 has the upper hand. Now when it comes to games you're relying on SP loads as well as RBE, Memory Bandwidth and TMU performance mainly.
The 384-bit GDDR5 bus of GF100 is wider than the 256-bit GDDR5 bus of RV870. This is an area where GF100 has a clear advantage.
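To put numbers on what bus width buys you: bandwidth is bus width times effective data rate, so a wider bus only wins if the memory clock holds up. The 4.8GT/s RV870 figure is the shipping HD 5870 GDDR5 spec; the 4.0GT/s GF100 figure below is purely an illustrative assumption, since its memory clock hadn't been announced:

```python
# Bandwidth = (bus width / 8 bytes) * effective data rate in GT/s.
# The 4.0 GT/s GF100 figure is a made-up placeholder for illustration.
def bandwidth_gbps(bus_bits, gtps):
    return bus_bits / 8 * gtps

print(bandwidth_gbps(256, 4.8))   # RV870 -> 153.6 GB/s
print(bandwidth_gbps(384, 4.0))   # GF100 at the assumed clock -> 192.0 GB/s
```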
All that is left is the TMU and RBE designs for us to get a clear picture of how things will pan out. Games are moving towards more compute-heavy loads (DirectX 11, DirectCompute 11 and OpenCL). It's fair to say that things will likely end up quite close if there aren't any other large architectural changes in the Texture Mapping Unit and Render Back End. If things are close, price/performance will likely be the deciding factor. GF100 is an enormous design (my memory isn't the best, but I think I read something to the tune of 1 billion more transistors than RV870). This means that, just as before, AMD will be able to compete quite easily on pricing while nVIDIA will struggle in that area.
Look at Folding@Home.
There, 2 AMD FLOPS = 1 NVIDIA FLOP in effective performance.
The real question is then:
Why does AMD need 2x the FLOPS to match NVIDIA's performance?
http://forum.beyond3d.com/showthread.php?t=50539
Do we need to educate you that pure FLOP numbers are a joke, like you need education in not making deceitful YouTube videos claiming CPU = GPU in performance?
Try running the in-game benchmark with "your" CPU hack and post the numbers...
Taken from: http://www.xbitlabs.com/articles/video/display/radeon-hd5870_3.html

According to the developer, the peak computing power of the RV870 is as high as 2.7 teraflops in single-precision mode (FP32) and 544 gigaflops in double-precision mode (FP64), which is used for most serious computing tasks. A special mention must be made of the ability to execute threads in protected memory sections, which makes it easier to transfer code originally developed for the classic CPU to the GPGPU platform. All these innovations in the RV870's computing section make it a perfect choice for GPGPU, especially in comparison with Nvidia's solutions, whose double-precision performance is far from ideal.
No.
I've explained that to you before but you have a thick head.
The performance difference in Folding@Home had to do with GT200's ability to access protected memory as a temporary software cache. This is something RV770 lacked, and it just so happens to be something that helped in that particular application/instance (F@H).
In fact you posted the links which proved my assertion right here: http://foldingforum.org/viewtopic.php?f=51&t=10442
and here: http://www3.interscience.wiley.com/cgi-bin/fulltext/121677402/HTMLSTART
I'll explain. GT200 has the ability to access protected memory in software, which is used as a temporary cache. When an error occurs, GT200 can revert back to the previous calculations and continue from there. RV770 did not have this ability, so when an error occurred, RV770 had to simply start all over again. This explains why RV770 was doing more FLOPs than GT200 despite having a lower output.
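The effect can be sketched with a toy work-unit model (hypothetical, not actual Folding@Home code): when an error forces a restart from zero instead of a checkpoint, the card burns far more raw steps (i.e. FLOPS) for the same finished output.

```python
import random

# Toy model: count raw compute steps needed to finish one work unit when each
# step has a small error chance, resuming from a checkpoint (GT200-style) or
# restarting the whole unit from scratch (RV770-style).
def steps_to_finish(wu_len, err_rate, can_checkpoint):
    done, spent = 0, 0
    while done < wu_len:
        spent += 1
        if random.random() < err_rate:
            done = done if can_checkpoint else 0  # checkpoint vs. start over
        else:
            done += 1
    return spent

random.seed(1)
with_checkpoint = steps_to_finish(300, 0.02, True)
random.seed(1)
without_checkpoint = steps_to_finish(300, 0.02, False)
print(with_checkpoint, without_checkpoint)  # restarting typically burns far more raw steps
```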
Here is the quote again, taken from http://www.xbitlabs.com/articles/video/display/radeon-hd5870_3.html (posted above).
This is a software option therefore a new AMD Folding@Home GPU client would need to be written to take advantage of the changes with RV870.
Are you still in denial?
That's actually not as comically, outlandishly large as I thought it would be...
Yeah, AMD's GPGPU tools are generations behind NVIDIA's... and again, are you still trying to use raw FLOPS numbers?
What is next... the consoles are "supercomputers" because they have l33t "FLOPS"?
And you "forgot" to post your in-game benchmark numbers?
What are you talking about? You lose an argument (because you clearly have no clue what you're talking about) and now you want to discuss the PhysX hack? Why are you attempting to change the topic?
GPGPU has yet to take off. The deciding factors are now in play (Direct Compute 11 and OpenCL).
Do you even know what a FLOP is? FLoating point Operations Per Second: "The FLOPS is a measure of a computer's performance, especially in fields of scientific calculations that make heavy use of floating point calculations, similar to the older, simpler, instructions per second." It is THE theoretical performance measurement figure for hardware. We're not talking MHz here. Damn, I'll link you to Wikipedia to make it easy on you: http://en.wikipedia.org/wiki/FLOPS. The peak amount of work that can be done each second. RV870 displays an astounding 2.72TFLOPs in Single Precision.
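To make the figure concrete: a dense NxN matrix multiply costs about 2*N^3 floating-point operations, so a peak rating puts a hard lower bound on runtime (real code never actually reaches peak):

```python
# A dense N x N matrix multiply needs about 2*N**3 floating point operations,
# so a peak-FLOPS rating gives a hard lower bound on runtime.
n = 4096
flops_needed = 2 * n**3                 # ~1.37e11 FLOPs
rv870_peak_flops = 2.72e12              # the 2.72 TFLOPs SP figure
print(round(flops_needed / rv870_peak_flops * 1000, 1), "ms minimum")  # -> 50.5 ms minimum
```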
You have no argument here. RV870 is a computational monster. You could argue that AMD hasn't placed many resources into the development of their GPGPU tool-set (and you would be correct), but to mock higher Floating Point Operations Per Second is ridiculous and shows that you really haven't got a clue.
Are there other limiting factors? Yes, there are. Cache is one of them, and GF100 comes with a large shared L2 cache (768KB, I believe). There is also how the software is written and how it utilizes the computational performance (Folding@Home being the prime example of that). RV870 is a superscalar design; that's a fancy way of saying each of its units can issue several operations per clock, but only if the code exposes enough independent work. With anything like that, you need software compiled to take advantage of it, and any performance limitations you may be insinuating are primarily caused by this. nVIDIA, on the other hand, chose a scalar design. Scalar designs are simpler and easier to keep fully utilized, but deliver fewer peak FLOPS for the same hardware. When running lots of simple, dependent calculations they can be superior to superscalar designs, which need more independent operations to shine.
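A toy cycle-count model of that scalar vs. superscalar point (illustrative only; the widths and op counts are made up):

```python
# Toy cycle-count model: a wide unit only hits peak when the code exposes
# enough independent ops per bundle; otherwise lanes sit idle.
def cycles(total_ops, issue_width, independent_ops_per_bundle):
    per_cycle = min(issue_width, independent_ops_per_bundle)
    return -(-total_ops // per_cycle)  # ceiling division

print(cycles(1000, 5, 5))  # well-packed code on a 5-wide unit -> 200 cycles
print(cycles(1000, 5, 1))  # dependent code on the same unit -> 1000 cycles
print(cycles(1000, 1, 1))  # a scalar unit -> also 1000 cycles, with far less hardware
```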
AMD has fixed several of their GPGPU shortcomings with RV870 (one of which I highlighted in my post above). They also seem to be slowly retiring Brook+ in exchange for OpenCL and DirectCompute 11 (a wise move, IMHO).
Before you post, make sure you know what you're talking about.