Infinity Fabric: Why Crossfire needs to die and MultiGPU can be reborn

Discussion in 'AMD Flavor' started by vjhawk, Sep 12, 2018.

  1. vjhawk

    vjhawk n00bie

    Messages: 60
    Joined: Sep 2, 2016

    AMD shocked the CPU world when it released the epic Threadripper 2990WX. This 32-core, 64-thread beast crushed Intel's competing CPUs in heavily multithreaded applications, most notably Cinebench.

    What makes the 2990WX possible is the technology known as Infinity Fabric.

    Infinity Fabric links the Threadripper's four dies so that all 32 cores can reach system memory, even though only two of those dies have memory channels attached directly.

    If AMD wants to make a mark on the GPU world, they should use Infinity Fabric to build true multi-GPU solutions and just let Crossfire die.

    Crossfire doesn't succeed because developers are too lazy to dedicate coding resources to make it work when they know that 99% of the market uses single GPU solutions.

    So to get around this, leverage the power of Infinity Fabric to create seamless multi-GPU solutions.

    Create multi-GPU video cards that use Infinity Fabric to link two GPUs into a single virtual GPU that shares memory access across the fabric. Then design your own drivers to access and utilize this 'Virtual GPU' as a single unit.

    Nvidia's push into higher price brackets actually opens a gap that AMD can exploit if it can successfully leverage new technology to create a compelling counter-solution.

    For example, the RTX 2080 Ti runs about $1,200.

    If AMD can create a VX 1090 with two smaller GPUs (450-500 mm²) that outperforms the Nvidia solution (750 mm²) at $999, then they are winning the game.

    At $699 they can again undercut the 2080 with a card built from two smaller-core GPUs; let's dub it the VX 1080.

    As for single-GPU cards, the RX 1070 and RX 1060 answer the call, bringing Vega-level performance to mainstream consumers at around $299 and $199 respectively.

    With 12 GB on the VX line and 8 and 6 GB on the RX line, AMD has achieved market segmentation. Of course, skip HBM2 for these consumer lines and go with GDDR6 to keep memory affordable. You can even step down to GDDR5 or GDDR5X for lower-end models such as the RX 1050, 1030, etc. Meanwhile, removing Crossfire support from its video cards reduces manufacturing costs and increases profit margins for AMD.

    How does Nvidia counter? They either need to cut prices or come out with faster products at reasonable prices. A win for all consumers.

    Oh, and before I go, what could be AMD's Titan killer? That would be the AMD Cosmos VX 1099, featuring four GPUs configured just like the 2990WX along with 24-32 GB of HBM2 memory. It would be aimed at the enterprise, but if you price it at $2499 for the 24 GB version and $2799 for the 32 GB version, it would undercut and potentially crush its competition with sheer brute-force compute power.
     
  2. KazeoHin

    KazeoHin [H]ardness Supreme

    Messages: 7,452
    Joined: Sep 7, 2011

    Just a heads up: 'pooling' GPU memory across two different cards will never be a thing unless you are working with massive datasets. The latency introduced by even Infinity Fabric (IF) would be detrimental to performance.

    Also, people believe that memory on SLI/CFX is 'mirrored', when in truth it isn't. Each GPU controls its own, individual memory bank with no regard for what the other GPU is doing with its memory. It's just that each GPU is usually rendering the exact same thing, so the memory usually ends up holding the same data.
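
    To put a number on that point, here is a minimal sketch in Python. The function name and the `pooled` flag are made up for illustration and do not come from any driver or API. It assumes that when every card has to hold the same scene data, the usable working set is capped by the smallest card, and that only a hypothetical pooled design could add the capacities together.

```python
from typing import Sequence


def usable_vram_gb(per_card_gb: Sequence[float], pooled: bool = False) -> float:
    """Rough upper bound on the working set a multi-GPU setup can hold.

    Hypothetical helper: 'pooled' models an imagined shared-memory design,
    not something SLI/CFX actually offers.
    """
    if not per_card_gb:
        raise ValueError("need at least one card")
    if pooled:
        # Imagined pooled design: capacities add up (latency cost ignored).
        return float(sum(per_card_gb))
    # Each GPU keeps its own copy of the data, so the smallest card is the limit.
    return float(min(per_card_gb))
```

    Two 8 GB cards would report 8 GB under this model, and 16 GB only with `pooled=True`.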

    I think, if AMD (or even Nvidia) really put their back into it, they could create a driver that can pool multiple GPUs into one "Virtual GPU" and have it be completely invisible to the software layer. No, I don't think this will be easy, or even likely, but it can be done. I would love to see a firmware/driver level approach to mGPU.

    That said, AFR is literally the best way to do mGPU; any other technique is less efficient. In well-optimized scenarios, AFR can hit nearly 99% scaling, meaning a second card adds almost the full performance of the first. This is rare due to a lack of software optimization, but no other method can reach that level of scaling even in theory. Think about it: you'd violate the laws of physics if you somehow combined two chips and got the power of more than two chips. With dual-GPU AFR able to reach 199% of the performance of a single card, it's achieving close to the maximum theoretical performance possible. SFR is a close candidate; however, there are reasons why it isn't ideal for dual-GPU configs. Using an SFAFR (Split-Frame Alternate-Frame Rendering) method with quad GPUs is probably the best way to get good performance while keeping latency to a minimum (two groups of two cards alternate AFR-style, each group rendering its frames with SFR), but AFR would achieve an overall higher FPS given the same resources.
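
    The throughput-versus-latency trade-off above can be sketched with a toy model in Python. Everything here is an assumption for illustration: the function names, the 0.99 scaling efficiency (taken from the "99% scaling" figure), and the fixed per-frame SFR overhead of 2 ms are placeholders, not measurements of any real hardware or driver.

```python
from dataclasses import dataclass


@dataclass
class Result:
    fps: float         # frames delivered per second
    latency_ms: float  # time to render one frame from start to finish


def afr(frame_ms: float, gpus: int, efficiency: float = 0.99) -> Result:
    """Alternate-frame rendering: each GPU renders whole frames in turn."""
    base_fps = 1000.0 / frame_ms
    # Every extra GPU adds 'efficiency' of one card's throughput (assumed).
    fps = base_fps * (1 + (gpus - 1) * efficiency)
    # A single frame still takes one GPU the full frame time.
    return Result(fps, frame_ms)


def sfr(frame_ms: float, gpus: int, overhead_ms: float = 2.0) -> Result:
    """Split-frame rendering: all GPUs share the work of every frame."""
    # The work divides across GPUs, plus an assumed fixed sync/duplication cost.
    per_frame = frame_ms / gpus + overhead_ms
    return Result(1000.0 / per_frame, per_frame)


def sfafr(frame_ms: float, groups: int, gpus_per_group: int,
          overhead_ms: float = 2.0) -> Result:
    """Hybrid: groups alternate frames (AFR), each group splits its frame (SFR)."""
    per_frame = frame_ms / gpus_per_group + overhead_ms
    return Result(groups * 1000.0 / per_frame, per_frame)


if __name__ == "__main__":
    single_card_ms = 20.0  # hypothetical card that manages 50 FPS on its own
    for name, result in [
        ("AFR x4", afr(single_card_ms, 4)),
        ("SFR x4", sfr(single_card_ms, 4)),
        ("SFAFR 2x2", sfafr(single_card_ms, 2, 2)),
    ]:
        print(f"{name}: {result.fps:.1f} FPS, {result.latency_ms:.1f} ms per frame")
```

    With these made-up numbers, quad AFR comes out ahead on FPS while the 2x2 hybrid delivers each frame sooner. Change the overhead and the gap moves, so treat it as a way to reason about the shape of the trade-off rather than a prediction.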
     