TechSpot Tests a Multi-GPU Setup in 2018

AlphaAtlas

Multi-GPU setups have fallen out of favor in the past few years. Games rarely advertise support for it anymore, and AMD has even deprecated the "Crossfire" brand. TechSpot noted that the last multi-GPU test they ran was in 2016, so they decided to grab a pair of RX 590s and see what the experience is like today, and the results were mixed. Battlefield V, for example, doesn't seem to make use of the second 590 at all, while the second card boosted Battlefield 1 performance by 46%. Scaling was about as perfect as it can be in Strange Brigade's DX12 mode, while Battlefront II performance was significantly worse with 2 GPUs.

As for the RX 590s in Crossfire, we'd much rather have a single Vega 64 graphics card. It's extremely rare that two 590s will provide higher frame rates than a single Vega 64 while also offering stutter-free gaming. If you're only ever going to play a game like F1 2018 that supports Crossfire really well, then getting two RX 570s for $300 will be a hard combo to beat. But who buys a graphics card to only ever play one or two games? Other drawbacks that are also part of this conversation include heat and power consumption. Those two RX 590s were dumping so much heat into the Corsair Crystal 570X case that you could justify spending more money on case fans, and even then you'll still be running hotter due to the way the cards are stacked. You'll also take a hit on the power supply. The RTX 2070 works without an issue on a 500W unit, and 600W would be more than enough. The Crossfire 590s, though, will want an 800W unit; 750W would be the minimum.
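For a rough sanity check on those PSU numbers, here's the kind of back-of-the-envelope math involved. The wattages below are my own ballpark board-power assumptions, not TechSpot's measurements:

[CODE]
// Rough PSU sizing sketch. The board-power figures are assumptions
// (RX 590 ~225 W each, RTX 2070 ~175 W, rest of system ~150 W),
// not numbers taken from the TechSpot article.
#include <cstdio>

int main() {
    const int rest_of_system_w = 150;        // CPU, board, drives, fans (assumed)
    const int rtx2070_w        = 175;        // single-card option (assumed board power)
    const int two_rx590_w      = 2 * 225;    // Crossfire option (assumed board power each)
    const double headroom      = 1.4;        // ~40% margin for transients and PSU efficiency

    std::printf("RTX 2070 build:  ~%d W load -> ~%.0f W PSU\n",
                rest_of_system_w + rtx2070_w,
                (rest_of_system_w + rtx2070_w) * headroom);
    std::printf("2x RX 590 build: ~%d W load -> ~%.0f W PSU\n",
                rest_of_system_w + two_rx590_w,
                (rest_of_system_w + two_rx590_w) * headroom);
    return 0;
}
[/CODE]

With those assumptions you land at roughly 455 W for the 2070 build and roughly 840 W for the Crossfire build, which is in the same ballpark as the 500 W and 750-800 W recommendations above.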
 
Multi GPU setups have fallen out of favor in the past few years. Games rarely advertise support for it anymore, and AMD has even depreciated the "Crossfire" brand.


Did you mean "deprecated"? Autocorrect can be a bitch :p

This matches the experience I've had every time I've tried multi-GPU, though the last time I did it with AMD was a while ago.

In 2011 I got a set of two monster triple-slot Asus DirectCU II 6970's to handle my new 2560x1600 screen, a resolution which was difficult to drive at the time. Crossfire was more trouble than it was worth in my opinion. Sure, I got scaling in average framerate, but I found the minimum framerates pretty much ruined the experience. Most of the time they were the same as, or only very slightly improved over, the single GPU (sometimes they were even lower). This resulted in performance dropping to shit during the most hectic scenes, when you needed that performance the most. Added input lag was also a huge issue. And this was when the games would even run. Lots of them just wouldn't with Crossfire enabled. They'd be buggy as all hell or outright crash. I never suffered from the microstutter issues others complained about, though. In the end I switched to a single 7970 that was weaker on paper, but I still liked the experience better than dual 6970's.

In the summer of 2015 I made the same mistake (guess I never learn) and upgraded to a 4K screen before the GPU market was ready. I tried to remedy this by going with dual 980 Ti's. I was apprehensive based on my previous multi-GPU experience, but people assured me that Nvidia did SLI better than AMD did Crossfire. This was partially right. Nvidia suffered from less crashing and fewer bugs, and their drivers were better at applying the multi-GPU tech to titles, but they still suffered from the same minimum-FPS and input lag issues. Eventually I got a Pascal Titan, and while it still wasn't fast enough in most cases, I was much happier.


Now I know that with DX12 the game engine handles the mGPU implementation rather than the GPU drivers, but I don't think this will help. In fact, it will likely make things worse. (The same people who can't ship a product that works for months after launch are now responsible for writing the multi-GPU rendering code?)

To me it is the fundamental approach that is flawed, and that approach is AFR. Rendering alternating frames on each GPU inherently results in a boatload of problems: input lag is unavoidable, and minimum framerate problems will always exist. The last time SLI worked well was when 3dfx was doing it, and the reason it worked well was that SLI stood for Scan-Line Interleave, a form of SFR that split each frame between the GPUs, not AFR.
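Here's a toy model of that argument, with made-up render times just to illustrate (my own simplification, not anything from the article): AFR raises the rate at which frames come out, but each frame is still a full single-GPU render old by the time it's displayed, while an ideally scaling SFR split shrinks both numbers.

[CODE]
// Toy AFR vs. SFR model with an assumed 25 ms single-GPU render time.
// AFR: frames are delivered more often, but each one still took a full
// single-GPU frame time, so input-to-display lag barely improves.
// Ideal SFR: each frame is split across the GPUs, so the frame interval
// and the lag both shrink (ignoring the overhead that makes real SFR hard).
#include <cstdio>

int main() {
    const double single_gpu_ms = 25.0;  // assumed render time for one frame on one GPU
    for (int gpus = 1; gpus <= 2; ++gpus) {
        double afr_interval = single_gpu_ms / gpus;
        double afr_latency  = single_gpu_ms;
        double sfr_interval = single_gpu_ms / gpus;
        double sfr_latency  = single_gpu_ms / gpus;
        std::printf("%d GPU(s): AFR %3.0f fps / %4.1f ms lag | ideal SFR %3.0f fps / %4.1f ms lag\n",
                    gpus, 1000.0 / afr_interval, afr_latency,
                    1000.0 / sfr_interval, sfr_latency);
    }
    return 0;
}
[/CODE]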


Personally I don't understand why this seems to be such a big challenge. GPUs are already highly parallelized, with hundreds of cores per GPU die. Instead of having the driver treat them as two 128-core GPUs, GPU0 and GPU1, why not address them as a single 256-core rendering unit? I understand there will be interconnect concerns, but AMD's chiplet implementation should help here...

Or at least I thought it would, but AMD engineers are still talking about treating chiplet GPU's as some form of single board crossfire implementation, which is very frustrating.

What we need are chiplets with fast interconnects such that it does not matter which die a single core is located on, and you can throw together any number of chiplets with lots of cores and have the OS still see them as a single large GPU with lots of cores. Only when this happens will multi-GPU be effective.

edit:

Goddamned typos...
 
Same mixed-bag experience with SLI & XFiah over the years. When it works.. hot damn. When it don't, ur just boned.
 
No frametime analysis = no point in reading article. Even with the gains on some of the cards, if it's a choppy mess, what's the point? Regardless, CFX is DOA right now.
 
No frametime analysis = no point in reading article.

Looks like they report the average of the bottom 1% of framerates, which is just the inverse of the worst 1% of frametimes.

So technically you're right, but it's basically the same metric.
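For anyone wondering what the metric actually is, here's a quick sketch with hypothetical frame times (not TechSpot's data): the "1% low" FPS figure is just the average of the worst 1% of frame times flipped into frames per second.

[CODE]
// "1% low" from a frame-time log, using made-up numbers.
#include <algorithm>
#include <cstdio>
#include <functional>
#include <vector>

int main() {
    // hypothetical frame times (ms) captured during a benchmark run
    std::vector<double> frame_ms = {16.7, 16.9, 17.1, 16.8, 33.4, 16.6, 41.2, 16.7, 17.0, 16.8};

    std::sort(frame_ms.begin(), frame_ms.end(), std::greater<double>());  // worst frames first
    size_t n = std::max<size_t>(1, frame_ms.size() / 100);                // worst 1% of frames

    double worst_sum = 0.0;
    for (size_t i = 0; i < n; ++i) worst_sum += frame_ms[i];
    double worst_avg_ms = worst_sum / n;

    std::printf("average of worst 1%% frame times: %.1f ms\n", worst_avg_ms);
    std::printf("equivalent '1%% low' FPS:         %.1f\n", 1000.0 / worst_avg_ms);
    return 0;
}
[/CODE]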
 
I find it interesting that it's been, what, 12-13 years since Nvidia and ATI released mGPU solutions, and we are no better off now than we were then.
 
Having run SLI and Crossfire in every single gaming PC I have ever built, I say meh.

2 cards are better than one because empty PCI Express slots are sad PCI Express slots.

Also, wasn't DX12 supposed to be the saving grace for mGPU setups? I think MS touted that somewhere.
 
In the best case they saw a 96% improvement, so the potential is there, though the power draw is quite large. I would be interested to see how much development effort is involved in using Vulkan's multi-GPU distribution feature, and how much improvement a game would see from that. I am not aware of any program making use of it yet.
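For reference, the hook for this in Vulkan 1.1 is device groups: linked GPUs get reported as one group, and you can create a single logical device across them, but splitting work and memory between them is still entirely on the engine. A minimal sketch of the enumeration side (my own example; it assumes queue family 0 exists and is usable, and skips validation):

[CODE]
// Enumerate Vulkan 1.1 device groups and create one logical device across a
// linked group. Sketch only: no validation layers, no real queue selection.
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main() {
    VkApplicationInfo app{VK_STRUCTURE_TYPE_APPLICATION_INFO};
    app.apiVersion = VK_API_VERSION_1_1;                      // device groups are core in 1.1
    VkInstanceCreateInfo ici{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    ici.pApplicationInfo = &app;
    VkInstance instance;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) return 1;

    uint32_t groupCount = 0;
    vkEnumeratePhysicalDeviceGroups(instance, &groupCount, nullptr);
    std::vector<VkPhysicalDeviceGroupProperties> groups(groupCount);
    for (auto& g : groups) g.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_GROUP_PROPERTIES;
    if (groupCount)
        vkEnumeratePhysicalDeviceGroups(instance, &groupCount, groups.data());

    for (uint32_t i = 0; i < groupCount; ++i)
        std::printf("group %u: %u physical device(s)%s\n", i, groups[i].physicalDeviceCount,
                    groups[i].physicalDeviceCount > 1 ? "  <-- linked (CFX/SLI-style) group" : "");

    if (groupCount && groups[0].physicalDeviceCount > 1) {
        float prio = 1.0f;
        VkDeviceQueueCreateInfo q{VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO};
        q.queueFamilyIndex = 0;                               // assumption: family 0 is usable
        q.queueCount = 1;
        q.pQueuePriorities = &prio;

        VkDeviceGroupDeviceCreateInfo grp{VK_STRUCTURE_TYPE_DEVICE_GROUP_DEVICE_CREATE_INFO};
        grp.physicalDeviceCount = groups[0].physicalDeviceCount;
        grp.pPhysicalDevices    = groups[0].physicalDevices;

        VkDeviceCreateInfo dci{VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO};
        dci.pNext = &grp;                                     // one VkDevice spanning the group
        dci.queueCreateInfoCount = 1;
        dci.pQueueCreateInfos = &q;

        VkDevice device;
        if (vkCreateDevice(groups[0].physicalDevices[0], &dci, nullptr, &device) == VK_SUCCESS) {
            // From here the app decides which GPU does what via per-command device masks;
            // that is the part nobody seems to want to write.
            vkDestroyDevice(device, nullptr);
        }
    }
    vkDestroyInstance(instance, nullptr);
    return 0;
}
[/CODE]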
 
Luckily SLI/crossfire are almost dead now. What good are these frames if they're late? The whole point is to look smooth and eliminate input lag.
 
I would be interested to see how much development effort is involved in using Vulkan's multi-GPU distribution feature, and how much improvement a game would see from that. I am not aware of any program making use of it yet.


It's exactly the same as DX12 mGPU: not easy, and not likely.

I'd be amazed if developers actually take the time to support Shader Intrinsics correctly. It's asking way too much for anything but the largest teams to support at-the-metal mGPU as well.
 
Meh
And I still run into gamers that absolutely need as many PCIe lanes as possible because they want to build a quad SLI system :rolleyes:
 
Also, wasn't DX12 supposed to be the saving grace for mGPU setups? I think MS touted that somewhere.

No. It just shifts the burden from the GPU drivers to the game engine.

Who would you rather have coding your multi-GPU implementations? AMD's and Nvidia's driver teams, or the people who can't launch a game without initial crippling bugs and 6 months of hectic patching?
 
I ran SLI since the 3dfx era, and Nvidia SLI as soon as they started. I ran as many as 4 cards in SLI, and even a 5-card setup with 1 dedicated to PhysX. I had enormous problems over the years with them, and then I switched to AMD during the 290X era. I didn't have nearly as many issues. My drivers were more stable and more consistent. I was pleasantly surprised after using so many generations of Nvidia cards. I run Crossfire right now on twin watercooled Vega Frontiers. It still takes some tinkering once in a while. If a game says it doesn't support it, look around and find out how it actually works anyway; you can usually get it working if you care. Some of us like to run 3 huge monitors, others don't care. I have dreamed of the day single cards could drive the insanity I was using, and now... I am down to 1 34-inch ultrawide. Suddenly I could run 1 card if I wanted. It's an odd day. But I will wait.
 
Instead of having the driver treat them as two 128-core GPUs, GPU0 and GPU1, why not address them as a single 256-core rendering unit?

This seems to be where Nvidia is heading with NVLink. We only get a crippled version with the RTX gaming cards, but the groundwork is there. The link is now fast enough that applications [which have support for it] can address two NVLinked 2080 Tis as one card with a single, double-sized pool of memory. The real missing link at this point is for games to support the updated tech.
 
The link is now fast enough that applications [which have support for it] can address two NVLinked 2080 Tis as one card with a single, double-sized pool of memory.


The games shouldn't have to do anything different if the NVLink system presents itself as one giant GPU with many cores and one giant pool of VRAM.

There may be some NUMA-like considerations, but that should be about it.
 
The games shouldn't have to do anything different if the NVLink system presents itself as one giant GPU with many cores and one giant pool of VRAM.

The conspiracy theorist in me thinks that by requiring games to support it, they get better lock-in from developers due to the commitment required to implement it.
 
This seems to be where Nvidia is heading with NVLink. We only get a crippled version with the RTX gaming cards, but the groundwork is there. The link is now fast enough that applications [which have support for it] can now address two NVLinked 2080Tis as one card with a single double-sized pool of memory. The real missing link at this point is for games to support the updated tech.

I don't see how NVlink can possibly be as fast as VRAM. It's not a matter of technology, it's a matter of physics. Physically, two chips directly connected across a trace that measures millimeters in distance is going to transfer data much faster than two chips that are separated by centimeters with an I/O controller in between, not to mention the latency introduced by the I/O controller. Pooled VRAM isn't going to happen, but split frame rendering might become faster with NVLink.
 
I always did Crossfire, from the HD 2900's to my Sapphire Furies (non-X versions). (I tried 290's in Crossfire as well as 390's, but they got too hot for the case I was using at the time.) I am on a Vega 56 in one machine and an RX 570 8GB in another, and although I am tempted to Crossfire, I do not need to and it would be a waste of money at this point. (Yes, a fun waste, but a waste nonetheless.)
 
Seriously, they used TAA, which depends on multiple frames to calculate pixel color and does not work well with mGPU or CFX? It leads to stutter, worse frame times and corruption. FAIL

Some of the shadow settings and other settings in games can also have a big impact. Anyway, two 1080 Ti's in Shadow of the Tomb Raider give an experience (a good one) that no single card could deliver. So yes, some or even many titles suck, but some you cannot play better any other way.

The Serious Sam VR series (all of them) works perfectly in VR in CFX, toasting any single card out there; two Vegas blow away my single 1080 Ti in the same title, while SLI makes it worse.
 
I don't see how NVlink can possibly be as fast as VRAM. It's not a matter of technology, it's a matter of physics.

They just need to quadruple the lanes. Make NVlink the length of the card. The 0.01% that will purchase dual 2080tis demand it. ;)

Dual GPU card with extra traces could be cool. Imagine the fires it could start.
 
Just recently ended a nearly 10-year run of SLI in my desktops. Went from 2 G1 OC 1080's in SLI to a Strix OG RTX 2080 Ti for 4K gaming. As [H]ard's review showed, it's not a completely true 4K/60fps card, but it is really close.

Pro's: Don't have to worry about support any more or look for custom bits for NV Inspector. Freed up some space on the MB for potential NVMe cards. Also gained ~10-20% in performance. Cost (I'll explain below).

Con's: Heat in the case and power draw are about the same. Perhaps a little less, but nearly the same. Size (thankfully I went with a HAF 932 when I started this rig 7-8 years ago). Q/A on the RTX's (so far so good for me, but that doesn't mean I'm not concerned). Cost.


I listed cost in both pro's and con's because ultimately the one RTX cost nearly as much as the 2 1080's plus the HB bridge I had to order due to my MB spacing. Bottom line is that kind of performance is going to cost.

My history with multi GPU:
1. PNY 560TI's
2. Gigabyte G1 OC 970's-The best of times! At one point they were even paired with a dedicated physX card.
3. MSI 980m's-A MSI GT80 laptop that I still have.
4. Gigabyte G1 OC 1080's-The downward spiral.
 
I had enormous problems over the years with them, and then I switched to AMD during the 290X era. I didn't have nearly as many issues. My drivers were more stable and more consistent. I was pleasantly surprised after using so many generations of Nvidia cards.

I've had almost the opposite experience. I've run multi-GPU configurations from both camps since the beginning, and overall I've had more trouble from Crossfire and AMD drivers than NVIDIA's. However, one of the most problem-free configurations I had was with ATI's X1950 XTX's in Crossfire. That setup just worked. I had built another system with dual X1950 Pro's and it worked just as well. I ended up selling that off to someone who isn't technically adept. They had zero trouble from it either. Also, there were times where the problems with Crossfire were insurmountable and I had to switch to NVIDIA cards because AMD's drivers / cards were basically shit. The Radeon HD 4870 X2 Crossfire setup was unusable. Dual Radeon HD 7970's were also a no-go, but I eventually figured out why. Even with a semi-direct line to AMD via HardOCP, my issues were ignored. Crossfire was redesigned to eliminate the problems I had soon after, with its next generation of graphics cards.

I've had some testing here and there with its newer products, and I've even played some games on Vega and RX 580 systems. They've worked flawlessly, but the performance isn't on par with NVIDIA's, so I've not put any AMD cards in my system in quite some time. I think that dual Radeon HD 7970 setup was the last one I had. I've still got one of the 7970's in use. As a single card it has been fantastic. Two of them together would have worked fine, but not in my particular situation. Basically, Crossfire was a shit show.
 
They just need to quadruple the lanes. Make NVlink the length of the card. The 0.01% that will purchase dual 2080tis demand it. ;)

Dual GPU card with extra traces could be cool. Imagine the fires it could start.

Yeah, you can address bandwidth, but you still run into the problem of latency, and that is probably the most important factor in a fast-paced game.
 
SLI/Crossfire is by its very existence a testament to current technology's shortcomings. After all, in a perfect world, a single cheap and commonly available chip would be enough to provide all the horsepower required to deliver graphics indistinguishable from reality (if sci-fi is any guide of where we want to go).

Unless all those perfect holograms which spring up in said sci-fi films and books are being delivered by server farms via the cloud. Still, the principle is the same. The idea is to arrive at a place where photorealistic 3D graphics are ubiquitous, and that won't happen via a pair of disgustingly expensive GPUs in every home.
 
No. It just shifts the burden from the GPU drivers to the game engine.

Who would you rather have coding your multi-GPU implementations? AMD's and Nvidia's driver teams, or the people who can't launch a game without initial crippling bugs and 6 months of hectic patching?

Ooh ooh, I'll take crippling bugs and 6 months of hectic patching for $1,000!
 
I don't see how NVlink can possibly be as fast as VRAM. It's not a matter of technology, it's a matter of physics. Physically, two chips directly connected across a trace that measures millimeters in distance is going to transfer data much faster than two chips that are separated by centimeters with an I/O controller in between, not to mention the latency introduced by the I/O controller. Pooled VRAM isn't going to happen, but split frame rendering might become faster with NVLink.

Pooled VRAM literally already happened.

Nvidia is claiming 100GB/s from the neutered 2080 Ti NVLink. The Quadros have 4x that. 400GB/s of link bandwidth seems like plenty on a card that has only 616GB/s of total memory bandwidth to begin with. If there is anything less than perfect conditions, that 616GB/s can go all the way down to 1.75GB/s (the speed of a single GDDR6 module). If the data the other card is trying to access is being hosted on only two of the other card's VRAM modules, then you're looking at a paltry 3.5GB/s. If the target data is spread perfectly across all modules, but two of the cards in the cluster need to hit the same block of data concurrently, then the effective memory bandwidth drops to only 308GB/s - below the max link speed.

When Nvidia has talked about the need for the link lately, they emphasize that the issue with latency is not the physical distance but simply the number of links in the chain. Without the link (relying only on PCI-E), the path would be GPU1 -> PCI-E controller -> CPU -> PCI-E controller -> GPU2. Then double that for anything requiring a round trip.

Anecdotally, you can get a good idea of just how much link bandwidth is actually required simply by watching the memory controller load while you're gaming. Firstly, the game is probably going to use half of the VRAM on the card. Unless the card is mirroring data across modules, RAID 1 style, the total available memory bandwidth is going to be on the order of half of the max, so 308GB/s. Then watch to see what the controller is doing. My guess? It's maxing out around 50%. It's not a perfect extrapolation, but this is a good confirmation that the bandwidth being used is around half of the max.


So yes, 100GB/s isn't perfect. But at the same time, it is more than enough to give meaningful performance gains - far beyond the SLI schemes of old.
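For what it's worth, the compute side already exposes exactly this kind of cross-card access: the CUDA runtime lets one GPU map and read/write a peer GPU's memory directly (over NVLink when it's there, otherwise PCIe). A minimal host-side sketch, assuming a 2-GPU box with the CUDA toolkit installed; whether a game could live with the latency of those remote reads is the real argument:

[CODE]
// CUDA peer-to-peer access sketch (host code, needs cudart). One GPU's memory
// becomes directly addressable from the other, which is the compute-world
// version of "pooled VRAM".
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    if (count < 2) { std::printf("need two GPUs for this sketch\n"); return 0; }

    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);
    std::printf("GPU0 can access GPU1 memory directly: %s\n", canAccess ? "yes" : "no");
    if (!canAccess) return 0;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);        // map GPU1's memory into GPU0's address space

    const size_t bytes = 64 << 20;           // 64 MB test buffer
    void *buf0 = nullptr, *buf1 = nullptr;
    cudaMalloc(&buf0, bytes);                // on GPU0
    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);                // on GPU1
    cudaMemcpyPeer(buf1, 1, buf0, 0, bytes); // card-to-card copy, no trip through system RAM

    cudaFree(buf1);
    cudaSetDevice(0);
    cudaFree(buf0);
    return 0;
}
[/CODE]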
 
But who buys a graphics card to only ever play one or two games?

This is me...All day.

I only play Civilization 6 and Star Wars: The Old Republic. I spent a lot of money and bought a 1070 just to do so, and it feels like so much overkill for both. But I can say I also needed a card with better 4K performance than my older AMD card, as I use a 4K TV as a second monitor.
 
Meh
And I still run into gamers that absolutely need as many PCIe lanes as possible because they want to build a quad SLI system :rolleyes:
You have to get with the times. We want the extra lanes for our RAID 5 NVME SSD arrays :cool:.
 
Seriously, they used TAA, which depends on multiple frames to calculate pixel color and does not work well with mGPU or CFX? It leads to stutter, worse frame times and corruption. FAIL

Another useless mGPU article, because it doesn't provide any new information. They really should have looked at all graphics settings that are inter-dependent on sequential frames and disabled them. Some might be simple toggles... others will need a ReShade injector to disable.

Water can only be so wet.
 
Pooled VRAM literally already happened.

When Nvidia has talked about the need for the link lately, they emphasize that the issue with latency is not the physical distance but simply the number of links in the chain.

Professional applications and gaming applications are entirely different beasts in terms of requirements. It is easy to pool RAM in a professional application where delivering results at the right millisecond doesn't matter, just average throughput matters. In gaming, the right result at the right millisecond is what matters, and that's where latency (and to a lesser extent bandwidth) matters. They can claim that latency won't be affected by physical distance, but simple physics cannot be ignored unless you start applying quantum physics computing, which we aren't even close to yet.

Besides, even with NVLink, it will still have to go GPU1 memory controller -> GPU1 NVLink -> GPU2 NVLink -> GPU2. It will still be far easier to extract overall performance gains by not pooling VRAM. BTW, pooled VRAM will affect AFR and SFR the same way.
 
They can claim that latency won't be affected by physical distance, but simple physics cannot be ignored unless you start applying quantum physics computing, which we aren't even close to yet..

Perhaps brushing up on the laws of physics would be useful here. To travel the 6" between the two memory controllers at the speed of light, we're looking at about half of a nanosecond (2GHz - see the quick arithmetic after the list). What game are you running at over 2,000,000,000 fps? I'd love to see a screen cap of this one.

If we skip the objective science and just look at some real world examples that illustrate your confusion, let's think about a few instances where your speed of light assertion falls apart:
  1. The distance between motherboard DIMMs and the CPU exceeds the distance between neighboring GPU memory controllers. I guess that means CPUs can't run calculations, right?
  2. The distance between the CPU and the GPUs is over 6". I guess that breaks the entire machine, right?
  3. The length of cable from the GPU to the screen could be 6ft. With your 2,000,000,000 fps game, that means 12 frames are sitting inside of the HDMI cable at any given moment! How can you play with so much lag??
  4. Actually, does it always feel like you're drunk any time you look at the screen? If it's one of those ultrawides that's 30" wide, the right side is going to be 5 frames behind the left side.
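The quick arithmetic behind those examples (assuming signals at the speed of light; real traces are slower, which only strengthens the point that distance alone isn't the dominant latency term):

[CODE]
// Propagation delay for the distances mentioned above.
#include <cstdio>

int main() {
    const double c_m_per_s = 3.0e8;       // speed of light
    const double inch_m    = 0.0254;

    const double distances_in[] = {6.0, 30.0, 72.0};
    const char*  labels[] = {"6\" between memory controllers",
                             "30\" across an ultrawide panel",
                             "6 ft of display cable"};

    for (int i = 0; i < 3; ++i) {
        double seconds = distances_in[i] * inch_m / c_m_per_s;
        std::printf("%-32s ~%.2f ns (%.1f GHz equivalent)\n",
                    labels[i], seconds * 1e9, 1.0 / (seconds * 1e9));
    }
    return 0;
}
[/CODE]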
 
Perhaps brushing up on the laws of physics would be useful here. To travel the 6" between the two memory controllers at the speed of light, we're looking at about half of a nanosecond (2GHz). What game are you running at over 2,000,000,000 fps? I'd love to see a screen cap of this one.

:rolleyes:

Let's see here. We have silicon transistors capable of 10+ GHz, but in the real CPU world we can barely break 5 GHz. The same principle applies. Sure, theoretically we can do 2 GHz across 6" of space, but real-world applications do not work that way. Do not confuse what is theoretically possible with what is practically possible.
 
Sure, theoretically we can do 2 GHz across 6" of space, but real-world applications do not work that way. Do not confuse what is theoretically possible with what is practically possible.

Oh man. You're gonna crap your pants when you learn about wifi. WILL THE WONDERS NEVER CEASE??
 
Oh man. You're gonna crap your pants when you learn about wifi. WILL THE WONDERS NEVER CEASE??

Point being? I get better and more consistent ping on ethernet than wifi. Limitations are limitations. NVLink can get better, but so will IMC performance. NVLink (or whatever they want to call it in the future) will always be behind direct IMC access.
 
SLI and Crossfire are both pretty useless anymore, and neither company is putting any effort into it. Best to just avoid the headache. Also, I have used both, on 290X's and 8800 GT's. Both have driver issues at times, or some damn feature that won't work right with it enabled.
 
https://www.corsair.com/us/en/Categ...NCE-RGB-PRO-Light-Enhancement-Kit/p/CMWLEKIT2

Dang it. I wasted all of that money filling the extra RAM slots with... RAM. :-(

just waiting for the GPU version, lol.

That one has been out for a few years. They're sold by AMD and are kind of expensive for what you get. [ducking]

I keed, I keed!
 
Point being? I get better and more consistent ping on ethernet than wifi. Limitations are limitations. NVLink can get better, but so will IMC performance. NVLink (or whatever they want to call it in the future) will always be behind direct IMC access.

The question is what is good enough to make it appear as one GPU. They already store textures on each card, which is a lot of the bandwidth.

I don’t know the answer but what I know for certain is I won’t touch mGPU until it appears as one GPU and “just works”.
 