I've been thinking about going wild and building another 3x portrait 4-Way SLI system, but first I wanted to run some tests. Everyone knows that with X99 (PLX boards aside), once you go over two video cards, the third and fourth drop down to PCI-E 3.0 8x. I wanted to know how much of a detriment that would be. I also wanted to know why so many online reviews show such poor scaling in 3-Way and 4-Way SLI.
I really don't put much stock in those reviews, as they usually just toss 3 or 4 air-cooled GPUs next to each other on a stock system with no tweaking and go "Ohh 3-4 way SLI scaling sucks". That's why I always test for myself. So to start off, if I wanted to go 4-Way SLI on the system in my sig, I would have to move my Intel 750 PCI-E SSD somewhere else.
Here you can see I moved the SSD from the last PCI-E 3.0 8x slot to the RV Extreme's black/grey PCI-E 2.0 4x slot, which goes through the PCH/DMI rather than the CPU's direct PCI-E lanes:
Before (PCI-E 3.0 8x slot):
After (PCI-E 2.0 4x slot):
I was surprised that I didn't lose more overall score. If 4-Way SLI were worth it, I'd sacrifice that bit of SSD speed. I'd have to get creative with a PCI-E riser, though, to clear the sandwiched EK water-cooled Titan X cards and their annoying DVI ports.
The tests below were run on my sig computer. All games were first set up in 2-Way SLI with graphics maxed out (2560x1440), even using up to 200% SMAA in some cases, to make sure the GPUs were fully loaded in the fastest GPU configuration. All game settings were then locked down and nothing was changed between tests. The only variable between each set of tests was either enabling or disabling SLI in the NVIDIA control panel, or switching from PCI-E 3.0 to 2.0 in the ASUS UEFI. After any change, I did a reboot or a GPU driver reset to ensure a stable platform. CPU-Z and GPU-Z were used to verify each configuration.
I have come to a few conclusions. SLI scaling is good, but not great: the average was 80%. PCI-E 3.0 16x versus PCI-E 2.0 16x (roughly the same bandwidth as 3.0 8x) does make a difference with demanding setups. I averaged a 14% increase with 3.0 16x.
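For anyone wondering why PCI-E 2.0 16x can stand in for 3.0 8x, the two have nearly the same theoretical bandwidth. Here is a quick sketch of the arithmetic using the published line rates and encoding overheads. The function name is just something I made up for illustration, and these are theoretical per-direction numbers, not measured throughput.

```python
# Theoretical per-direction PCI-E bandwidth from line rate and encoding overhead.
# Gen 2.0: 5 GT/s per lane with 8b/10b encoding.
# Gen 3.0: 8 GT/s per lane with 128b/130b encoding.
GENERATIONS = {
    "2.0": {"gt_per_s": 5.0, "encoding": 8 / 10},
    "3.0": {"gt_per_s": 8.0, "encoding": 128 / 130},
}


def link_bandwidth_gbs(gen, lanes):
    """Return theoretical one-way bandwidth in GB/s for a link (hypothetical helper)."""
    g = GENERATIONS[gen]
    # GT/s * encoding efficiency = usable Gbit/s per lane; divide by 8 for GB/s.
    per_lane_gbs = g["gt_per_s"] * g["encoding"] / 8
    return per_lane_gbs * lanes


if __name__ == "__main__":
    for gen, lanes in [("3.0", 16), ("2.0", 16), ("3.0", 8)]:
        print(f"PCI-E {gen} {lanes}x: {link_bandwidth_gbs(gen, lanes):.2f} GB/s")
```

So 2.0 16x works out to 8 GB/s and 3.0 8x to a hair under that, which is why flipping the UEFI to 2.0 is a reasonable way to approximate an 8x 3.0 slot without moving cards.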
So that raises the question: what happens to 3-Way and 4-Way SLI when you drop down to a 16x/8x/8x/8x setup? With SLI AFR, you are only as fast as your bottleneck, or slowest, GPU. I theorize that the 14% loss from dropping your high-end GPUs from 3.0 16x to 3.0 8x could have a trickle-down effect, especially at very demanding resolutions. This could also explain the irregular frame times seen with 3-4 GPU setups: as the bandwidth routinely brushes up against the limit, hiccups are likely to occur.
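To make the "only as fast as your slowest GPU" idea concrete, here is a toy model of it. This is only a sketch of my theory, not how the driver actually schedules frames: it assumes evenly paced AFR where each GPU renders one frame per cycle and the cycle is gated by the slowest card. The 20 ms frame time is a made-up number, and the 14% penalty is the average from my tests applied as a frame-time multiplier.

```python
def afr_fps(frame_times_ms):
    """Toy AFR model (assumption): one frame per GPU per cycle,
    with the cycle length set by the slowest GPU."""
    cycle_ms = max(frame_times_ms)
    return len(frame_times_ms) * 1000.0 / cycle_ms


if __name__ == "__main__":
    base_ms = 20.0            # hypothetical frame time on a 16x 3.0 slot
    slow_ms = base_ms * 1.14  # same card with the 14% penalty on an 8x slot

    ideal = afr_fps([base_ms] * 4)                 # all four cards at 16x
    real = afr_fps([base_ms] + [slow_ms] * 3)      # 16x/8x/8x/8x
    print(f"all 16x:       {ideal:.1f} fps")
    print(f"16x/8x/8x/8x:  {real:.1f} fps")
```

In this model the one card sitting in a 16x slot gains nothing; the whole set runs at the pace of the 8x cards.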
Now, whether the dual PLX chip boards (Extreme 11 or X99-E WS) help is unclear. While each PLX chip allows 16x communication between its two GPUs, there is only one 16x link from that chip to the CPU, and traffic for the other pair has to go through the CPU and back out to the other PLX chip. You have two of those groups, so essentially you have two 16x links with two GPUs on each, talking to the CPU over a total of 32 lanes. Now add in the latency of the PLX chip itself. With the PLX setup, you essentially have four GPUs talking to the CPU via 32 lanes, versus 40 native lanes.
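The lane math here is simple enough to write down. This sketch just takes the topology as I described it (two PLX chips, one 16x uplink each, two GPUs behind each chip) and compares it with the native 16x/8x/8x/8x split. The names are made up for illustration, and the "worst case" assumes both GPUs behind a chip are talking to the CPU at the same time.

```python
# Native X99 split for four cards, as lanes per GPU straight to the CPU.
NATIVE_LANES = [16, 8, 8, 8]

# Dual-PLX layout as described above (assumption): each PLX chip has one
# 16x uplink to the CPU and two GPUs hanging off it.
PLX_UPLINK_LANES = [16, 16]
GPUS_PER_PLX = 2


def total_cpu_lanes(lanes):
    """Total lanes between the GPU side and the CPU."""
    return sum(lanes)


def worst_case_lanes_per_gpu(uplinks, gpus_per_plx):
    """Uplink share per GPU when every GPU behind a chip hits the CPU at once."""
    return [uplink / gpus_per_plx for uplink in uplinks]


if __name__ == "__main__":
    print("native lanes to CPU:", total_cpu_lanes(NATIVE_LANES))
    print("PLX lanes to CPU:   ", total_cpu_lanes(PLX_UPLINK_LANES))
    print("PLX worst-case share per GPU:",
          worst_case_lanes_per_gpu(PLX_UPLINK_LANES, GPUS_PER_PLX))
```

In the worst case every card behind a PLX chip ends up with an 8x share of the uplink, so even the first card loses the 16x path it would have had natively.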
As for 3-Way SLI, once again you fall into a lowest-common-denominator setup, as any third card would be forced down to 3.0 8x, becoming the bottleneck and dragging the whole setup down. According to my tests, that would be on the order of 14% on average, which is quite significant.
My thought is that by far the best bang for your buck is to stick with two really fast video cards running on PCI-E 3.0 16x slots/speed.