why do nvidia cards fold so much better than ati?

Discussion in 'Distributed Computing' started by .:Burn:., Oct 10, 2009.

  1. .:Burn:.

    .:Burn:. Limp Gawd

    Messages:
    360
    Joined:
    Sep 22, 2009
    as the title says... why does nvidia fold better than ati?
     
  2. Vaulter98c

    Vaulter98c [H]ard|DCer of the Month - October 2009

    Messages:
    5,494
    Joined:
    May 21, 2008
    NVidia and ATI cards both have what we refer to as shaders; however, NV cards do their computing differently than ATI cards do, which is why ATI cards show such large shader counts compared to Nvidia cards.

    The client for the NV cards is simply more efficient. It's better written, and at the end of the day, despite all the differences between the two architectures, it comes down to the NV client being better written than the ATI one.

    We can only hope that the rumored release of the next GPU core in the coming months fixes this, but until then, NV will stay on top.

    That's the layman's version; I'm sure someone could go more in depth.
     
  3. Atech

    Atech [H]ardness Supreme

    Messages:
    4,851
    Joined:
    Jul 14, 2007
    Much better than your flawed guess ;) :
    http://forum.beyond3d.com/showthread.php?t=50539
     
  4. Zero82z

    Zero82z Pick your own.....you deserve it.

    Messages:
    28,105
    Joined:
    Jan 20, 2004
    It's not a flawed guess at all. He is right in saying that the difference is due to the ATI client being poorly optimized. It was designed for the HD2000 and HD3000 series and hasn't been updated to take advantage of architectural improvements in the 4000 and 5000 series that could lead to significant increases in performance. They are also still using Brook+ and haven't transitioned to OpenCL yet; that should give another boost in performance once OpenCL is fully supported on ATI cards, simply because Brook+ seems to be much more difficult to work with.

    So to make a long story short, it's basically because the necessary resources and manpower to properly optimize the ATI client haven't been dedicated to the task, either because the project doesn't have the resources or because they are needed elsewhere.
     
  5. jeremyshaw

    jeremyshaw [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Aug 26, 2009
    Here's why:
    ATi has several big traditional brute-force processing units (for lack of a better word) and lots of tiny MADD (math) units. All are counted as shaders.

    nVidia has many brute-force processing units, and its extra math units just aren't counted as shader cores.
    That is why you always have to divide the ATi SP count by 5 to get a better reading of actual usable power (the way nVidia counts it).
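
    For example (a quick Python sketch; the shader counts are the published specs for those cards, and the divide-by-5 rule is just the rough heuristic above, not a benchmark):

    Code:
    # An ATi "SP" is one lane of a 5-wide VLIW unit, while an nVidia
    # "SP" is a full scalar core. Dividing the ATi count by 5 gives a
    # number closer to nVidia's way of counting.
    cards = {
        "HD 4870 (ATi)": {"sps": 800, "lanes_per_unit": 5},
        "GTX 280 (nVidia)": {"sps": 240, "lanes_per_unit": 1},
    }

    for name, c in cards.items():
        units = c["sps"] // c["lanes_per_unit"]
        print(f"{name}: {c['sps']} SPs -> ~{units} comparable shader units")

    # HD 4870 (ATi): 800 SPs -> ~160 comparable shader units
    # GTX 280 (nVidia): 240 SPs -> ~240 comparable shader units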

    Simply put, the nVidia architecture is just faster. The ATi cards (dunno about the current generation) are slower.

    How DX11/DirectCompute may change this, we don't know. Right now, the new ATi architecture is actually limited by the current GPU client, and DX11 adds two new shader stages (on top of Geometry, Pixel, and Vertex), which may change things - however, I'm not betting on it. Based on the limited performance gain so far and ATi's stated concentration on graphics, I still believe any current nVidia GPU will probably outdo any optimised ATi client, unless F@H can run like a video game (resource-wise) rather than as a parallel processing application.

    Dunno if that made sense.
     
  6. jeremyshaw

    jeremyshaw [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Aug 26, 2009
    Also, didn't ATi shift off of Brook+ a while back? I thought it died with the R600 generation.
     
  7. Atech

    Atech [H]ardness Supreme

    Messages:
    4,851
    Joined:
    Jul 14, 2007
    That is just a lame excuse for an architectural problem:
    http://theovalich.wordpress.com/2008/11/04/amd-folding-explained-future-reveale/
    http://foldingforum.org/viewtopic.php?f=51&t=10442&start=0#p103025

    Like I have said before...NVIDIA is generations ahead of AMD on GPGPU.
     
  8. jeremyshaw

    jeremyshaw [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Aug 26, 2009
    You didn't say that; you linked to a flame war on B3D (Charlie, that a-hole, is believed to get most of his info from there).

    Besides, I think my earlier post describes it clearly (no lost love ;)).
     
  9. Atech

    Atech [H]ardness Supreme

    Messages:
    4,851
    Joined:
    Jul 14, 2007
    The links can be found in my first link...but I guess people are getting more and more lazy...
     
  10. jeremyshaw

    jeremyshaw [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Aug 26, 2009
    Yeah, I found it on the second page, after the only Folder stepped in. :rolleyes:
     
    Last edited: Oct 10, 2009
  11. Zero82z

    Zero82z Pick your own.....you deserve it.

    Messages:
    28,105
    Joined:
    Jan 20, 2004
    I think it's funny that you would even imagine that you have read more about this subject than I have and that you assume that there are any links you can post that I haven't already seen and read long ago.

    The fact is that nothing you have posted contradicts my statements in any way. You are misinterpreting the currently flawed implementation of the F@H GPU client on ATI video cards as an issue with the architecture, when in reality it is merely a matter of the client not being properly designed to take advantage of the strengths of ATI's different GPU design.

    ATI's architecture actually has more brute-force power than nVidia's, as evidenced by the fact that ATI GPUs take less of a performance hit when performing calculations involving larger proteins. Most of the calculations in question are simple MADD operations rather than the transcendental operations that only 20% of ATI's SPs are capable of handling. The problem is that the ATI cores are still stuck in "R600 mode", as it is called by some, and the client doesn't make use of the LDS that was added to RV770, which would mitigate many of the "calculate twice" issues that currently plague it.
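
    To put rough numbers on that (a toy Python model with an idealized issue rule and made-up instruction mixes, not real profiler data):

    Code:
    # Idealized model of one ATi VLIW5 unit: all 5 lanes can issue a
    # simple op (MADD etc.) each cycle, but transcendentals (sin, exp,
    # rsqrt, ...) only run on the single T-unit lane.
    def ops_per_cycle(transcendental_fraction, lanes=5):
        # Throughput is bound by either the total issue width or the
        # one-per-cycle transcendental lane, whichever is tighter.
        return 1.0 / max(transcendental_fraction, 1.0 / lanes)

    for frac in (0.0, 0.05, 0.50):
        print(f"{frac:.0%} transcendental -> {ops_per_cycle(frac):.1f} ops/cycle")

    # 0% transcendental -> 5.0 ops/cycle
    # 5% transcendental -> 5.0 ops/cycle
    # 50% transcendental -> 2.0 ops/cycle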

    Another issue is that the system used to benchmark ATI workunits uses an RV670 GPU, so the points allocation is also geared towards people with those cards and not the newer RV770 and RV870 GPUs, which are much improved when it comes to GPGPU applications.
     
  12. jeremyshaw

    jeremyshaw [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Aug 26, 2009

    Oops!

    I really need to get data on the new ATi GPUs. Guess I'll hold off buying that GTS 250, for now.
     
  13. Atech

    Atech [H]ardness Supreme

    Messages:
    4,851
    Joined:
    Jul 14, 2007
    No amount of "optimizing" drivers/software can create the cache that AMD GPUs lack and NVIDIA GPUs have... which means that AMD GPUs have to do more work (due to being unable to store the data) compared to NVIDIA GPUs, like it or not.
     
  14. Zero82z

    Zero82z Pick your own.....you deserve it.

    Messages:
    28,105
    Joined:
    Jan 20, 2004
    The problem isn't that ATI GPUs can't store "enough" data; it's that they aren't storing "any" data at all right now, since F@H doesn't use the LDS. And a single step of a single GPU workunit doesn't require a particularly large amount of data storage, especially not with the small proteins that are currently being used for most of the workunits in the wild right now. Each SIMD core (a group of shader units, each a set of four standard FPUs and one special-function unit) has a 16KB LDS in RV770 and a 32KB LDS in RV870, which is more than enough to give a significant boost to overall work production speed.
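
    Just to put those sizes in perspective (a quick Python sketch; the per-atom layout is a made-up illustration, not the actual F@H data format):

    Code:
    # How many single-precision values fit in the LDS of each chip.
    BYTES_PER_FLOAT = 4
    FLOATS_PER_ATOM = 4  # e.g. x, y, z, charge (illustrative only)

    for chip, lds_bytes in (("RV770", 16 * 1024), ("RV870", 32 * 1024)):
        floats = lds_bytes // BYTES_PER_FLOAT
        atoms = floats // FLOATS_PER_ATOM
        print(f"{chip}: {lds_bytes // 1024}KB LDS = {floats} floats "
              f"(~{atoms} atoms' worth of shared data)")

    # RV770: 16KB LDS = 4096 floats (~1024 atoms' worth of shared data)
    # RV870: 32KB LDS = 8192 floats (~2048 atoms' worth of shared data)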
     
  15. Vaulter98c

    Vaulter98c [H]ard|DCer of the Month - October 2009

    Messages:
    5,494
    Joined:
    May 21, 2008
    Thank you for defending me, Zero. I couldn't think of anyone I'd rather have on my side :)

    /hiding now