Is nVidia trying to bring back SLI in OpenGL?

Not for games, though; there are too many variables, and any game developer should be using Vulkan instead. But as a means of further accelerating specific programs or workloads, sure.
I know, wishful thinking on my part.
 
SLI and Crossfire are dead. Now, mGPU support for VR or something, well, we will see.
I think it has a strong place in datacenters for projects like Google's Stadia and others. The ability to tie a vGPU back to a resource pool of linked GPUs, as opposed to divvying up workloads to individual cards, would seem like a large win to me.
 
I think it has a strong place in datacenters for projects like Google's Stadia and others. The ability to tie a vGPU back to a resource pool of linked GPUs, as opposed to divvying up workloads to individual cards, would seem like a large win to me.

Oh, I did not know that, thanks. (Well, it sure is mostly dead for our local computer use, though.) Even the 5700 and XT do not support it at all.
 
Oh, I did not know that, thanks. (Well, it sure is mostly dead for our local computer use, though.) Even the 5700 and XT do not support it at all.
Even if they don't do it at all, the Vulkan and DX12 APIs paired with current GPU prices make it an unattractive solution. The problem with SLI for games is that, when all is said and done, one GPU actually has to be the one that takes the finished frame and outputs it to the screen. Modern systems and displays are at a point where the delay in transferring that finished frame is noticeable, and most methods for mitigating that delay now result in situations where frame tearing is apparent. In virtual systems the GPU doesn't actually output anything; it all goes to a virtualized buffer, so you can fill that buffer from multiple sources and then present it, far more easily than you can combine outputs through a physical link.
 
I do wish that Nvidia would give us more serious mGPU options. Memory pooling would be huge outside of gaming. My rig has 96GB of VRAM in it, but that ends up being more like 24GB in a RAID 1 since every GPU has redundant data loaded into it.

One huge gap, IMO, is that even our most basic tools like GPU-Z still aren't mGPU-friendly. Even our OS and motherboards make it difficult to figure out which GPU in software matches up with which GPU in the machine.
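One thing that does help a bit: the NVML library that ships with the driver will at least give you each GPU's PCI bus ID, which you can match against the physical slots. A minimal sketch, illustrative only, assuming nvml.h and the driver's NVML library are installed, with error handling mostly omitted:

```cpp
// Minimal NVML sketch: list each GPU's index, name, PCI bus ID, and memory.
// Build (Linux, roughly): g++ list_gpus.cpp -o list_gpus -lnvidia-ml
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) {
        std::fprintf(stderr, "NVML init failed\n");
        return 1;
    }

    unsigned int count = 0;
    nvmlDeviceGetCount(&count);

    for (unsigned int i = 0; i < count; ++i) {
        nvmlDevice_t dev;
        if (nvmlDeviceGetHandleByIndex(i, &dev) != NVML_SUCCESS) continue;

        char name[NVML_DEVICE_NAME_BUFFER_SIZE] = {0};
        nvmlDeviceGetName(dev, name, sizeof(name));

        nvmlPciInfo_t pci;
        nvmlDeviceGetPciInfo(dev, &pci);     // pci.busId identifies the physical slot

        nvmlMemory_t mem;
        nvmlDeviceGetMemoryInfo(dev, &mem);  // per-board memory, not pooled

        std::printf("GPU %u: %s  busId=%s  vram=%llu MiB\n",
                    i, name, pci.busId,
                    (unsigned long long)(mem.total >> 20));
    }

    nvmlShutdown();
    return 0;
}
```

The busId string is the same PCI address you see in Device Manager or lspci, which is usually the easiest way to tell which board is which.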
 
I do wish that Nvidia would give us more serious mGPU options. Memory pooling would be huge outside of gaming. My rig has 96GB of VRAM in it, but that ends up being more like 24GB in a RAID 1 since every GPU has redundant data loaded into it.

One huge gap, IMO, is that even our most basic tools like GPU-Z still aren't mGPU-friendly. Even our OS and motherboards make it difficult to figure out which GPU in software matches up with which GPU in the machine.
Memory is pooled in NVLink.
 
Memory can be pooled with NVLink, but it isn't pooled for all NVLink uses. Plus NVLink only allows two cards to be linked at the moment.
NVLink currently supports up to 16 GPUs, and the memory mode depends on how the developer implements support when they build it in.
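To illustrate the "up to the developer" part, here is a rough CUDA sketch of opting in to peer-to-peer access between two GPUs. This is not an NVLink-specific API; peer copies simply ride over NVLink when the link is there, and error handling is omitted:

```cpp
// Sketch of explicit peer-to-peer access between two GPUs with the CUDA runtime.
// Nothing is shared or pooled unless the application sets this up itself.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);  // can GPU 0 reach GPU 1's memory?
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    if (!canAccess01 || !canAccess10) {
        std::printf("No peer access between GPU 0 and GPU 1\n");
        return 0;
    }

    // Opt in on both devices.
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    const size_t bytes = 256 << 20;  // 256 MiB test buffer
    void *buf0 = nullptr, *buf1 = nullptr;
    cudaSetDevice(0); cudaMalloc(&buf0, bytes);
    cudaSetDevice(1); cudaMalloc(&buf1, bytes);

    // Copy GPU 1 -> GPU 0 directly, without bouncing through host memory.
    cudaMemcpyPeer(buf0, 0, buf1, 1, bytes);
    cudaDeviceSynchronize();

    std::printf("Peer copy done\n");
    cudaFree(buf1);
    cudaSetDevice(0); cudaFree(buf0);
    return 0;
}
```

Whether that traffic actually goes over NVLink or PCIe depends on the topology, and the application still has to manage data placement itself.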
 
I am not entirely sure why VR doesn't just about require two cards, one for each eye.

Instead, it is completely incompatible with multiple GPUs.
 
NVLink currently supports up to 16 GPUs, and the memory mode depends on how the developer implements support when they build it in.

I don't think Nvidia enterprise solutions are relevant here. The Tesla V100 can handle either four or six connections (depending on which set of documentation you read), but it does so with the slower 50GB/s implementation. To expand to 16 GPUs, you also need to throw in the NVSwitch - but that too uses the slower 50GB/s implementation rather than the 100GB/s in the Turing GPUs.

Outside of those niche enterprise products, NVLink supports precisely two GPUs and no more than that. Even the mighty Quadro 8000 is limited to just 2 GPUs paired with NVLink.
 
I don't think Nvidia enterprise solutions are relevant here. The Tesla V100 can handle either four or six connections (depending on which set of documentation you read), but it does so with the slower 50GB/s implementation. To expand to 16 GPUs, you also need to throw in the NVSwitch - but that too uses the slower 50GB/s implementation rather than the 100GB/s in the Turing GPUs.

Outside of those niche enterprise products, NVLink supports precisely two GPUs and no more than that. Even the mighty Quadro 8000 is limited to just 2 GPUs paired with NVLink.
Normally I would agree, but of the 100+ pages that nVidia posted on this, the overwhelming majority are aimed at DGX Stations for large/complex data sets with 4 or more Tesla-class GPUs... so I have to assume that is what they are gearing these new extensions toward.
 
I am not entirely sure why VR doesn't just about require two cards, one for each eye.

Instead, it is completely incompatible with multiple GPUs.
While cool, it would probably require a large change to how video is processed at the OS level, and I would think an engine would have to be built from the ground up to support the feature. I think it is simply a case of there not currently being enough money in VR systems to justify the expenditure.
 
mGPU for gaming is a horrible idea. Generally, we want more frames as that means less latency. With SLI, you do get more frames, but they arrive too late, rendering the whole thing meaningless. Unless you're the type of gamer that's only in for the pretty pictures, that is.
 
mGPU for gaming is a horrible idea. Generally, we want more frames as that means less latency. With SLI, you do get more frames, but they arrive too late, rendering the whole thing meaningless. Unless you're the type of gamer that's only in for the pretty pictures, that is.

When it was done correctly, Crossfire worked extremely well and was far from meaningless.
 
I think it has a strong place in datacenters for projects like Google's Stadia and others. The ability to tie a vGPU back to a resource pool of linked GPUs, as opposed to divvying up workloads to individual cards, would seem like a large win to me.


Their enterprise cards are effectively mGPU solutions. They have cards with 4 GPUs in them and 32+ GB of VRAM that you can slice up for VMs.
 
When it was done correctly, Crossfire worked extremely well and was far from meaningless.

And SLI was even better because Nvidia cared about frametimes.

But it takes software to really make that push, and that just isn't there right now, despite engine developers having included support.


I'm still wondering why VR hasn't pushed that point yet.
 
And SLI was even better because Nvidia cared about frametimes.

But it takes software to really make that push, and that just isn't there right now, despite engine developers having included support.


I'm still wondering why VR hasn't pushed that point yet.
VR hasn’t pushed for it mostly because of the costs required to implement it correctly.
 
You don't even really want SLI, which only gives a marginal boost; you want each eye rendered separately for twice the frame rate and half the latency. But you have to keep the two eyes in perfect sync. Nvidia Quadro cards have a genlock option that allows you to sync outputs from different machines.

Just saying, if I were Nvidia and wanted to sell twice as many cards and push VR, I wouldn't be trying to get 2x90Hz @ 4K out of a single card.
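Just to sketch what that sync problem looks like: here is a toy CUDA example where two GPUs each "render" one eye and a cross-device event gates the present. renderEye is a hypothetical placeholder kernel; a real title would do this through a graphics API and the headset runtime.

```cpp
#include <cuda_runtime.h>

// Hypothetical stand-in for per-eye rendering: just fills a per-eye buffer.
__global__ void renderEye(uchar4* eye, int w, int h, int frame) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        eye[y * w + x] = make_uchar4((x + frame) & 0xff, y & 0xff, 0, 255);
}

int main() {
    const int w = 1080, h = 1200;  // per-eye resolution, Rift/Vive class
    uchar4 *left = nullptr, *right = nullptr;
    cudaStream_t sLeft, sRight;
    cudaEvent_t leftDone;

    cudaSetDevice(0);                                   // GPU 0 renders the left eye
    cudaMalloc(&left, (size_t)w * h * sizeof(uchar4));
    cudaStreamCreate(&sLeft);
    cudaEventCreate(&leftDone);

    cudaSetDevice(1);                                   // GPU 1 renders the right eye
    cudaMalloc(&right, (size_t)w * h * sizeof(uchar4));
    cudaStreamCreate(&sRight);

    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    for (int frame = 0; frame < 3; ++frame) {
        cudaSetDevice(0);
        renderEye<<<grid, block, 0, sLeft>>>(left, w, h, frame);
        cudaEventRecord(leftDone, sLeft);

        cudaSetDevice(1);
        renderEye<<<grid, block, 0, sRight>>>(right, w, h, frame);

        // The right-eye stream waits on the left eye's event (cross-device wait),
        // so once sRight drains, both eyes for this frame are finished.
        cudaStreamWaitEvent(sRight, leftDone, 0);
        cudaStreamSynchronize(sRight);
        // ...hand both buffers to the compositor/HMD runtime here...
    }
    return 0;
}
```

The hard parts a real engine has to solve are the ones this toy skips: identical scene state on both GPUs, and getting both finished eyes to the compositor inside the frame budget.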
 
The other reason VR hasn't pushed hard for SLI is that the compute requirements don't really justify it. It's just not that hard to push 1080p frames down the pipe. The upper end of displays, like the Rift, is still only 2560x1440. A single 2080 Ti can handle that pretty easily.
 
The other reason VR hasn't pushed hard for SLI is that the compute requirements don't really justify it. It's just not that hard to push 1080p frames down the pipe. The upper end of displays, like the Rift, is still only 2560x1440. A single 2080 Ti can handle that pretty easily.

Two cheaper cards for VR could possibly see wider adoption than people buying 2080 Tis just for VR.
 
Two cheaper cards for VR could possibly see wider adoption than people buying 2080 Tis just for VR.
With current GPU pricing this doesn't work. By the time you've paid for two cards, a similarly priced single card would outperform them.
 
With current GPU pricing this doesn't work. By the time you've paid for two cards, a similarly priced single card would outperform them.

I dunno, two 2060s at $350 apiece vs. a 2080 Ti for $1200, both driving dual 2560x1440 displays, the 2060s might come out on top. Only in the context of VR, I imagine, assuming whatever solution is implemented could actually render each eye completely on one card and keep them in sync.

All hypothetical
 
With current GPU pricing this doesn't work. By the time you've paid for two cards, a similarly priced single card would outperform them.
I don't think so myself. If I could use my two 1070s to push each lens (I paid $300 each), they would seemingly outperform a $1200 2080 Ti dollar for dollar.
 
I don't think so myself. If I could use my two 1070s to push each lens (I paid $300 each), they would seemingly outperform a $1200 2080 Ti dollar for dollar.
But you would need a special-edition VR helmet that had separate inputs for both eyes. It gets pretty niche-on-niche. But yeah, if you are scoring parts on the cheap then it doesn't apply; retail on retail, it doesn't play out.
 
When it was done correctly, Crossfire worked extremely well and was far from meaningless.

Was that on a 60Hz panel? Sorry, but I am having a really hard time believing it worked extremely well in terms of latency.
 
Was that on a 60Hz panel? Sorry, but I am having a really hard time believing it worked extremely well in terms of latency.
Probably 30-45 Hz, and I am thinking pre-2010... because I also remember it working very well on my Dell 2005 24" 1080p screen, which still works for word processing and is connected to the wife's machine.
 
For everybody asking why SLI is not used more for VR with each GPU rendering one eye: it turns out this is already a thing, with a full support package available from nVidia as part of their VRWorks VR SLI APIs.

https://developer.nvidia.com/vrworks/graphics/vrsli

Supposedly this functionality is already built into the latest Unity and Unreal engines; it is just up to developers to enable it and program for it accordingly. So take that however you wish.
 
The other reason VR hasn't pushed hard for SLI is because the compute requirements don't really justify it. It's just not that hard to push 1080p frames down the pipe. The upper end of displays, like the Rift, are still only 2560x1440. A single 2080 Ti can handle that pretty easily.
Yeah, looking at, say, the Oculus Rift or the HTC Vive, they are running two screens at 1080x1200, which is only 2,592,000 pixels; standard 1080p is 2,073,600, so we are only talking an extra 518,400 pixels here. That's an extra 25% over 1080p but still only about 70% of 1440p, so any card that can handle gaming at 1440p can crush the vast majority of VR sets that are out there. Also, start tossing in the extra rendering features that many VR titles use, like rendering fast-moving static images at lower resolutions or using eye tracking to render clearly only the parts you are looking at while using lower resolutions for the areas you aren't, and you are rendering even less. So you don't need a card that will crush out 200 fps in VR; in many cases that will actually make you sick.
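A quick sanity check on that pixel math, assuming 1080x1200 per eye, 1920x1080 for 1080p, and 2560x1440 for 1440p:

```cpp
#include <cstdio>

int main() {
    const long long vr    = 2LL * 1080 * 1200;   // Rift/Vive: two 1080x1200 panels
    const long long p1080 = 1920LL * 1080;
    const long long p1440 = 2560LL * 1440;
    std::printf("VR: %lld px, 1080p: %lld px, 1440p: %lld px\n", vr, p1080, p1440);
    std::printf("VR vs 1080p: %+.0f%%   VR as share of 1440p: %.0f%%\n",
                100.0 * vr / p1080 - 100.0, 100.0 * vr / p1440);
    return 0;
}
// Prints 2,592,000 vs 2,073,600 vs 3,686,400: +25% over 1080p, ~70% of 1440p.
```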
 
I picked up a pair of RX 570 8GB cards for about $170, so any talk of multi-GPU interests me.
 
Did SLI ever really "work"? Getting support for "current" games was always a hassle... this popular game supports it, but this other popular game doesn't. Works a little in this game, works a lot in this other game, and totally wrecks a third game. Did the memory halving ever get fixed? Two 4GB cards only means 4GB.

Too much hassle.
 
Did SLI ever really "work"? Getting support for "current" games was always a hassle....

Absolutely. Nvidia invested heavily in SLI with DX10 hardware to keep frametimes in order. AMD had to be called on the carpet for their 'solution' being so bad that average frametimes went up and observable performance tanked. I ran both.

Support started dying off with DX12 and Vulkan, and while it's been firmed back up by major engine developers, game developers haven't really made an industry-wide effort to shore up support.

A lot of that likely comes from PC gaming still leaning toward lower resolutions and older or less demanding titles as a whole. Despite revenue of AAA-games increasing on the PC side, the Fortnites and Overwatches that will run on a potato have increased even more. One can still have a great time with a GTX970 / GTX1060 / RX470 etc.


The counter-point here is that stuff like 4k120 outputs (and higher), and higher-end VR sets with great FOV and high pixel density will start demanding more performance than a single GPU can deliver, cost no object.

Did the memory halving ever get fixed? Two 4GB cards only means 4GB.

This was never 'broken'- you're not getting pooled memory between GPUs in different slots for real-time (and responsive) rendering at consumer prices. Even the best enterprise solutions aren't going to be optimal for gaming workloads.


What could happen is an industry-wide push for split rendering, be that split-frame rendering as opposed to alternate-frame rendering for traditional displays, rendering separate viewports for VR or perhaps even surround, or, in the case of 8k displays, a modern surround-style viewport stitching solution.
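As a rough sketch of what split-frame rendering across GPUs looks like: each GPU fills one horizontal strip and the strips are gathered onto the GPU that owns the display output. fillStrip is a hypothetical stand-in for the real rendering work, and the example assumes the frame height divides evenly across the GPUs.

```cpp
#include <cuda_runtime.h>
#include <vector>

// Hypothetical "render" kernel: fills one GPU's horizontal strip of the frame.
__global__ void fillStrip(uchar4* strip, int width, int rows, int gpu) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < rows)
        strip[y * width + x] = make_uchar4((gpu * 64) & 0xff, x & 0xff, y & 0xff, 255);
}

int main() {
    int gpuCount = 0;
    cudaGetDeviceCount(&gpuCount);
    if (gpuCount < 1) return 0;

    const int width = 3840, height = 2160;   // 4K target frame
    const int rows = height / gpuCount;      // strip height per GPU

    // The full frame lives on GPU 0, which would own the display output.
    cudaSetDevice(0);
    uchar4* frame = nullptr;
    cudaMalloc(&frame, (size_t)width * height * sizeof(uchar4));

    std::vector<uchar4*> strips(gpuCount, nullptr);
    dim3 block(16, 16), grid((width + 15) / 16, (rows + 15) / 16);

    for (int g = 0; g < gpuCount; ++g) {
        cudaSetDevice(g);
        cudaMalloc(&strips[g], (size_t)width * rows * sizeof(uchar4));
        fillStrip<<<grid, block>>>(strips[g], width, rows, g);  // "render" this strip
    }

    for (int g = 0; g < gpuCount; ++g) {
        cudaSetDevice(g);
        cudaDeviceSynchronize();
        // Gather strip g into its slice of the final frame on GPU 0.
        cudaMemcpyPeer(frame + (size_t)g * rows * width, 0,
                       strips[g], g,
                       (size_t)width * rows * sizeof(uchar4));
    }

    cudaSetDevice(0);
    cudaDeviceSynchronize();
    // ...present 'frame' from GPU 0 here...
    return 0;
}
```

The gather step is the weak point: those peer copies have to land on the display GPU inside the frame budget, which is exactly where multi-GPU setups have historically struggled.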
 
Absolutely. Nvidia invested heavily in SLI with DX10 hardware to keep frametimes in order. AMD had to be called on the carpet for their 'solution' being so bad that average frametimes went up and observable performance tanked. I ran both.

Support started dying off with DX12 and Vulkan, and while it's been firmed back up by major engine developers, game developers haven't really made an industry-wide effort to shore up support.

A lot of that likely comes from PC gaming still leaning toward lower resolutions and older or less demanding titles as a whole. Despite revenue of AAA-games increasing on the PC side, the Fortnites and Overwatches that will run on a potato have increased even more. One can still have a great time with a GTX970 / GTX1060 / RX470 etc.


The counter-point here is that stuff like 4k120 outputs (and higher), and higher-end VR sets with great FOV and high pixel density will start demanding more performance than a single GPU can deliver, cost no object.



This was never 'broken'- you're not getting pooled memory between GPUs in different slots for real-time (and responsive) rendering at consumer prices. Even the best enterprise solutions aren't going to be optimal for gaming workloads.


What could happen is an industry-wide push for split rendering, be that split-frame rendering as opposed to alternate-frame rendering for traditional displays, rendering separate viewports for VR or perhaps even surround, or, in the case of 8k displays, a modern surround-style viewport stitching solution.
Yeah, I think in the coming years they are going to have to do something dramatic to the rendering pipeline; I just can't imagine how the existing methods are going to keep pace at 4K and 8K.
 
Yeah, I think in the coming years they are going to have to do something dramatic to the rendering pipeline; I just can't imagine how the existing methods are going to keep pace at 4K and 8K.

Well, you've noticed how well GPUs respond to having more cores to process with. Same reason they can do cryptomining so well. If you look at AMD's chiplet solution, it ramps up easily. The problem will be getting the chiplets fed with memory and getting the results out to the monitor fast enough.
 
Well, you've noticed how well GPUs respond to having more cores to process with. Same reason they can do cryptomining so well. If you look at AMD's chiplet solution, it ramps up easily. The problem will be getting the chiplets fed with memory and getting the results out to the monitor fast enough.

Memory bandwidth is solvable, with money. HBM lives for this, and if AMD is able to crank up their interconnect speeds, they should be able to pull it off. Granted, there's nothing stopping Nvidia or Intel from doing the same; sort of a 'tree' architecture.

Imagine the Threadripper / Epyc topology, but with the 'hub' chip being on an HBM package adjacent to one or more chiplets perhaps stacked on one or more separate interposers.

So you'd have two interposer packages, one with HBM and controller / PCIe interface / SLI/CFX interface / output bus for DP/HDMI/etc. / super high-speed bus to GPU chiplet interposers, with the chiplet interposers having a receiving 'controller' and multiple chiplets. The controller dies on both interposers would likely also be seeded with low-latency cache.


Anyway, theorizing. All of the above is possible with current technology and likely presents more of an integration challenge with respect to separate parts working together alongside drivers than a design challenge with respect to arranging the different parts.

I suspect that while branch processing (what CPUs are good at) will be capped on a per-thread basis, throughput-based processing is going to continue to skyrocket in capability. Price will likely skyrocket alongside...
 
Well, you've noticed how well GPUs respond to having more cores to process with. Same reason they can do cryptomining so well. If you look at AMD's chiplet solution, it ramps up easily. The problem will be getting the chiplets fed with memory and getting the results out to the monitor fast enough.
In the future I honestly see a time when the monitor will connect to a card that is basically fast cache, and the GPUs will be wired to it along with the PCIe slot. Especially if cloud gaming takes off. Then the GPUs are just in a render pool, and the connected card works like a virtual GPU buffer, much like vGPU setups work now.
 