PCI-E Speed tests/ramblings for video cards

Vega
Supreme [H]ardness | Joined: Oct 12, 2004 | Messages: 7,143
I've been thinking about going wild and doing another 3x portrait 4-Way SLI system, but first I wanted to do some tests. Everyone knows that with X99 (PLX boards aside), once you go over two video cards, the third and fourth drop down to PCI-E 3.0 8x. I wanted to know how much of a detriment that would be. I also wanted to know why so many online reviews show such bad scaling in 3-Way and 4-Way SLI.

I really don't put much stock in those reviews, as they usually just toss 3 or 4 air-cooled GPUs next to each other on stock systems with no tweaking and go "Ohh, 3-4 way SLI scaling sucks." That's why I always test myself. So to start off: if I wanted to go 4-Way SLI on the system in my sig, I would have to move my Intel 750 PCI-E SSD somewhere else.

Here you can see I moved the SSD from the end PCI-E 3.0 8x slot to the RV Extreme's black/grey PCI-E 2.0 4x slot, which hangs off the PCH/DMI rather than the CPU's direct PCI-E lanes:

[screenshot: SSD moved to the PCI-E 2.0 4x slot]



Before (PCI-E 3.0 8x slot):

[benchmark screenshot]




After (PCI-E 2.0 4x slot):

[benchmark screenshot]




I was surprised I didn't lose more overall score. If 4-Way SLI were worth it, I'd sacrifice that bit of SSD speed. Although I'd have to get creative with a PCI-E riser to clear the sandwiched EK water-cooled Titan X's and their annoying DVI ports.


So the test below is with my sig computer. All games were first tested in 2x SLI with maxed-out graphics (2560x1440), even using up to 200% SMAA in some cases, to ensure the GPUs were maxed out in the fastest GPU configuration. All game settings were then locked down and nothing changed between tests. The only variable between each set of tests was disabling or enabling SLI in the NVIDIA control panel, or switching from PCI-E 3.0 to 2.0 in the ASUS UEFI. After any change, a reboot or GPU driver reset was performed to ensure a stable platform. CPU-Z and GPU-Z were used to verify configurations.


[benchmark results table]



I have come to a few conclusions. SLI scaling is good, but not great; the average was 80%. Going from PCI-E 2.0 16x (equivalent bandwidth to 3.0 8x) up to PCI-E 3.0 16x does make a difference with demanding setups: I averaged a 14% increase.

So that begs the question: what happens to 3-Way and 4-Way SLI when you drop down to a 16x/8x/8x/8x setup? With SLI AFR, you are only as fast as your bottleneck, i.e. your slowest GPU. I theorize that the 14% loss from dropping high-end GPUs from 16x 3.0 to 8x 3.0 could have a trickle-down effect, especially at very demanding resolutions. It could also explain the irregular frame times shown with 3-4 GPU setups: as the bandwidth routinely brushes up against the limit, hiccups will undoubtedly occur.
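For reference, a quick sketch of the raw per-direction link bandwidths from the PCI-E spec values (my own back-of-envelope math, not measured throughput), which shows why 2.0 16x and 3.0 8x land in the same place:

```python
# Per-direction PCI-E link bandwidth from the spec transfer rates:
# 2.0 signals at 5 GT/s with 8b/10b encoding, 3.0 at 8 GT/s with 128b/130b.

def pcie_bandwidth_gbs(lanes, gen):
    """Usable GB/s per direction for a PCI-E link of a given width and generation."""
    rate_gts, efficiency = {2: (5.0, 8 / 10), 3: (8.0, 128 / 130)}[gen]
    return lanes * rate_gts * efficiency / 8  # bits -> bytes after encoding overhead

print(pcie_bandwidth_gbs(16, 3))  # x16 3.0: ~15.75 GB/s
print(pcie_bandwidth_gbs(16, 2))  # x16 2.0: 8.0 GB/s
print(pcie_bandwidth_gbs(8, 3))   # x8 3.0: ~7.88 GB/s, nearly identical to x16 2.0
```

So switching the UEFI to 2.0 is a close stand-in for halving the lane count, which is what makes that test a useful proxy for the 16x/8x/8x/8x case.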

Now, whether the dual-PLX-chip boards (Extreme 11 or X99-E WS) help is unclear. Each PLX chip allows 16x communication between the two GPUs attached to it, but there is only one 16x uplink from that chip to the CPU, and the same again for the other PLX chip and its two GPUs. So essentially you have two 16x uplinks, each shared by two GPUs, for a total of 32 lanes to the CPU. Then add in the latency of the PLX chip itself. With the PLX setup, four GPUs talk to the CPU over 32 lanes, versus 40 native lanes.
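The lane bookkeeping above can be written out (this is just my reading of the dual-PLX topology, not a measured result):

```python
# Four GPUs on a native 40-lane X99 CPU vs. behind two PLX switches.
# Each PLX switch gives its two GPUs full x16 links downstream, but those
# two GPUs share a single x16 uplink to the CPU.

native_links = [16, 8, 8, 8]   # direct CPU lanes per GPU on a non-PLX board
plx_uplinks = [16, 16]         # one x16 uplink per PLX switch (two GPUs each)

print(sum(native_links))  # 40 lanes of direct CPU bandwidth
print(sum(plx_uplinks))   # 32 lanes total, plus switch latency on every transfer
```

The upside of the PLX route is the fast GPU-to-GPU path within each switch; the downside is the narrower, shared path back to the CPU.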

As for 3-Way SLI, you once again fall into a lowest-common-denominator setup: the third card is forced down to 8x 3.0 and becomes the bottleneck, dragging the whole setup down. According to my tests, that would be on the order of 14% on average. Quite significant.

My takeaway is that by far the best bang for your buck is to stick with two really fast video cards, each running at PCI-E 3.0 16x.
 
Great write-up, Vega... While it saddens me not to see you doing another insane 4-way rig, it clearly isn't worth it right now with Intel being so stingy with PCI-E lanes... Perhaps with the launch of the 4.0 spec things will become viable again, especially considering we are trending ever more toward 5K and then 8K resolutions.
 
Nice work Vega. I do have a question though. If you're playing these games on a 4K TV why are you running them only at 1440p? Are you having to make a lot of graphical setting concessions to maintain a certain frame rate?

I see the loss in frame rate from dropping from PCI-E 3.0 to 2.0, and of course nobody wants that. But if you're using a 60 Hz monitor, it shouldn't make that much of a difference in enjoyment. From what I have seen of 3-way and 4-way systems, the 3-way systems scale in only a few titles, and the fourth card in a 4-way system is pretty much only there for looks.

But in some titles there is perfect scaling from 1 to 2-way to 3-way to 4-way SLI. That gives me the impression it's more of a game-development/driver issue than heat or PCI-E bus speed. Check out the graph for Sniper Elite on DGLee's blog: sometimes the scaling is really good and sometimes it just fails to impress.

Once again thanks for taking the time to run those tests.
 
Interesting. Thanks for putting all the time into that. I once went from Titan X SLI + 750 Ti (PhysX) to just Titan X SLI because I wanted to ensure the PCIe bus wasn't causing a bottleneck.
 
Some nice research you did there! It seems we have finally hit the 10% (noticeable) performance drop for PCIe 3!

This article from Guru3D earlier in the year shows the GTX 980 SLI is already hitting a 5% drop comparing 2.0 versus 3.0:

http://www.guru3d.com/articles_pages/pci_express_scaling_game_performance_analysis_review,17.html

So it was only a matter of time before we hit 10%. But it seems you already managed that!

So, when should we expect PCIe 4.0 on Intel chipsets?

EDIT: looks like we have to wait a bit :(

http://www.kitguru.net/components/g...es-and-new-connector-to-be-finalized-by-2017/

2017 is so far away!
 
Yep, changing from 2.0 to 3.0 was one of the big reasons for me to move on from my old 2500K.
 
Everyone knows that with X99 (PLX boards aside), once you go over two video cards, the third and fourth drop down to PCI-E 3.0 8x.
As for 3-Way SLI, you once again fall into a lowest-common-denominator setup: the third card is forced down to 8x 3.0 and becomes the bottleneck, dragging the whole setup down. According to my tests, that would be on the order of 14% on average. Quite significant.

Well, *by far* the most popular CPU for X99 is the 5820K. No one with a 5820K is doing 16x/16x 2-way SLI to begin with. Interesting data, though.
 
This makes me wonder if next gen cards are going to be just completely bottlenecked by PCI-E 3.0

It seems like PCI-E 4.0 not coming out until sometime in 2017 (at the earliest) is really going to hold back everything.

If next gen is around 2x the performance of current gen, PCI-E 4.0 is going to be outdated before it even gets here (by the next-gen refresh).

I am hoping some sort of data compression/offloading/rerouting can be implemented that avoids this issue.

Otherwise we might start getting graphics cards with 2 or more 16x PCI-E interfaces for a single card.
 
This makes me wonder if next gen cards are going to be just completely bottlenecked by PCI-E 3.0

It seems like PCI-E 4.0 not coming out until sometime in 2017 (at the earliest) is really going to hold back everything.

If next gen is around 2x the performance of current gen, PCI-E 4.0 is going to be outdated before it even gets here (by the next-gen refresh).

I am hoping some sort of data compression/offloading/rerouting can be implemented that avoids this issue.

Otherwise we might start getting graphics cards with 2 or more 16x PCI-E interfaces for a single card.

Well, Nvidia is supposed to fix SLI scaling with Pascal. We'll see if they go with their own version of the XDMA engine, or if they beef up the bandwidth of the interconnect by replacing it with NVLink.

http://www.kitguru.net/components/g...-nvlink-to-enable-8-way-multi-gpu-capability/

I know what my money is on :D

Why create this great card-to-card interconnect if you're only going to use it in servers? They introduced their proprietary SLI connector a decade ago, so nothing is stopping them from doing the same thing again with NVLink!
 
Nice work Vega. I do have a question though. If you're playing these games on a 4K TV why are you running them only at 1440p? Are you having to make a lot of graphical setting concessions to maintain a certain frame rate?

I see the loss in frame rate from dropping from PCI-E 3.0 to 2.0, and of course nobody wants that. But if you're using a 60 Hz monitor, it shouldn't make that much of a difference in enjoyment. From what I have seen of 3-way and 4-way systems, the 3-way systems scale in only a few titles, and the fourth card in a 4-way system is pretty much only there for looks.

But in some titles there is perfect scaling from 1 to 2-way to 3-way to 4-way SLI. That gives me the impression it's more of a game-development/driver issue than heat or PCI-E bus speed. Check out the graph for Sniper Elite on DGLee's blog: sometimes the scaling is really good and sometimes it just fails to impress.

Once again thanks for taking the time to run those tests.

I have both a 4K monitor and a 1440p one. I thought about also doing 4K runs, but I deemed the 1440p numbers sufficient for the data I was after. Honestly, I was expecting maybe a ~5% increase going from 2.0 to 3.0. Quite surprised at how large the difference was.

Granted, running 1440p with graphics maxed and SSAA at these high frame rates is very taxing on the PCI-E bus; both GPUs were pegged at 99% pretty much at all times. Also remember these are Titan X's running at 1563 MHz core / 8400 MHz memory, fed by a really fast CPU and memory.
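To put a rough number on that bus load, here's a back-of-envelope estimate (my own assumption about AFR traffic, namely the slave GPU's finished frames crossing PCI-E to the master card for display; actual driver traffic will differ and includes much more than framebuffers):

```python
def frame_traffic_gbs(width, height, fps, bytes_per_pixel=4):
    """Raw framebuffer data rate in GB/s at a given resolution and frame rate."""
    return width * height * bytes_per_pixel * fps / 1e9

# In 2-way AFR, roughly half the displayed frames come from the slave card:
slave_fps = 150 / 2
print(frame_traffic_gbs(2560, 1440, slave_fps))  # ~1.1 GB/s of frame data alone
```

That is a meaningful slice of an x16 2.0 link's ~8 GB/s before counting texture uploads, geometry, and driver overhead, which fits the observation that high-FPS 1440p is genuinely bus-hungry.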

I think game/engine design also has an effect. Look at Metro: LL and how there was not much of a gain, or at how some benchmarks like 3DMark show really good scaling in 3/4-way SLI. IMO some of these games/engines aren't sending as much traffic over the PCI-E bus as others; something like 3DMark is pretty simplistic.

PCI-E bus speed, though, almost always makes a difference, as shown in these results. Think about how that average 14% drop affects the rest of the system. Not only does it affect the third and fourth cards' performance, but with the way SLI works it affects the performance of cards 1 and 2 as well! Now you can see why 3- and 4-way SLI scaling tapers off so significantly in benchmarks: these PCI-E restrictions combine with lackluster 3/4-way SLI drivers. By adding that 3rd or 4th card you are essentially hobbling the base 1/2-GPU performance that the 3rd and 4th cards are referenced off of.
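A toy model of that trickle-down effect (my own sketch of idealized AFR behavior, not NVIDIA's actual scheduler): in AFR each GPU renders every n-th frame, so steady-state FPS is capped by the slowest card's frame time.

```python
def afr_fps(frame_times_ms):
    """Ideal alternate-frame-rendering FPS given each GPU's frame time in ms."""
    return len(frame_times_ms) * 1000 / max(frame_times_ms)

# Three identical cards: perfect 3x scaling over a single card's 50 FPS.
print(afr_fps([20.0, 20.0, 20.0]))         # 150 FPS
# Make the third card 14% slower because it sits in an x8 slot:
print(afr_fps([20.0, 20.0, 20.0 * 1.14]))  # ~131.6 FPS -- the whole setup slows
```

And that is before frame pacing enters the picture: the fast cards finish early and wait, so the delivered frames arrive unevenly even when the average FPS looks acceptable.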

An analogy is adding a second fuel tank to a truck. The first tank holds 200 gallons; the second adds another 200. You would think you'd get exactly double the range, but once you add in the weight of carrying the extra fuel before it's burned off, your range is only extended by 80% instead of 100%. Same principle here.

Well, *by far* the most popular CPU for X99 is the 5820K. No one with a 5820K is doing 16x/16x 2-way SLI to begin with. Interesting data, though.

While true, I do think the majority of users running two high-end GPUs in SLI will have a 40-lane PCI-E CPU.

Well, Nvidia is supposed to fix SLI scaling with Pascal. We'll see if they go with their own version of the XDMA engine, or if they beef up the bandwidth of the interconnect by replacing it with NVLink.

http://www.kitguru.net/components/g...-nvlink-to-enable-8-way-multi-gpu-capability/

I know what my money is on :D

Why create this great card-to-card interconnect if you're only going to use it in servers? They introduced their proprietary SLI connector a decade ago, so nothing is stopping them from doing the same thing again with NVLink!

Yes, it will be interesting to see what they do with an SLI bridge interconnect revamp. They used to publish numbers on the amount of data that went over the SLI bridge, but that data is quite outdated these days.
 
Yep, changing from 2.0 to 3.0 was one of the big reasons for me to move on from my old 2500K.

Same conclusion I drew when I retired my 2600K/Z68 build running a pair of 780 Ti cards. At the time I was using three VG248QEs in portrait surround, and my bus usage constantly spiked to 99-100 percent, resulting in low GPU usage and shitty framerates.
 
Same conclusion I drew when I retired my 2600K/Z68 build running a pair of 780 Ti cards. At the time I was using three VG248QEs in portrait surround, and my bus usage constantly spiked to 99-100 percent, resulting in low GPU usage and shitty framerates.

I will never run my 1500€ cards into a bottleneck, even a small one.
If PCI-E 3.0 16x is faster than 8x, then I want to go 16x.
No reason to spend money on high-end cards if you're limited by the rest of the hardware.
 
Jesus!

The difference between PCI-E 3.0 and PCI-E 2.0 is that PCI-E 2.0 runs in English, whereas 3.0 runs in German....

I had no idea the differences were THAT noticeable!
 