PCIe Bifurcation

Got the new Gen 3 splitter today and it's quite a bit different from the previous one that was Gen 2. Hopefully this one will work with the 970 GTX. :)

Thanks for the images btw, but because they are small and the compression is high I can't make out any real details.
Could you upload the full resolution somewhere?
Preferably also for the new card.

I'm being a pain in the ass I know ;)
 
Got the new Gen 3 splitter today and it's quite a bit different from the previous one that was Gen 2. Hopefully this one will work with the 970 GTX. :)

Man, this is awesome. Come on PCIe 3.0!

 
Was exploring the bios today and it seems that with proper hardware support the bifurcation can support x4/x4/x4/x4, x8/x4/x4, x8/x8 or x16. This makes it possible to have a 3 way splitter or even a 4 way splitter.

Quad SLI on a mini-itx anyone?
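
For my own sanity, here's a quick sketch of what those modes amount to (Python, names are mine, nothing from the BIOS itself): the CPU's 16 lanes just get carved into link widths that always add up to 16.

# Hypothetical sketch: each bifurcation mode splits the CPU's x16 link
# into narrower links whose widths must sum to 16 lanes.
MODES = {
    "x16":         [16],
    "x8/x8":       [8, 8],
    "x8/x4/x4":    [8, 4, 4],
    "x4/x4/x4/x4": [4, 4, 4, 4],
}

for name, widths in MODES.items():
    assert sum(widths) == 16, name
    print(f"{name}: {len(widths)} device(s), widths {widths}")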
 
Quad SLI on a mini-itx anyone?
SLI throws a wobbler if you try and run it without all the cards on 8x or above. Which is a bit silly; 4x 3.0 is effectively the same speed as 8x 2.0, and while SLI will work with all cards on 8x 2.0, it won't with them on 4x 3.0.
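
Back-of-the-envelope, using the usual approximate per-lane throughput after encoding overhead (textbook approximations, not measurements):

# Approximate usable throughput per lane in GB/s, after 8b/10b (Gen 2)
# and 128b/130b (Gen 3) encoding overhead.
PER_LANE_GBPS = {"gen2": 0.5, "gen3": 0.985}

def link_bandwidth(gen: str, lanes: int) -> float:
    """Rough usable bandwidth of a PCIe link in GB/s."""
    return PER_LANE_GBPS[gen] * lanes

print(f"x8 Gen 2: {link_bandwidth('gen2', 8):.1f} GB/s")  # ~4.0 GB/s
print(f"x4 Gen 3: {link_bandwidth('gen3', 4):.1f} GB/s")  # ~3.9 GB/s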
 
@chemist_slime

Any update on the Gen 3 splitter? Also, do you have a link to the page where you bought it?

While SLI won't work at x4, CrossFireX will! I bet you could stuff 4 Fury Nano's into an Ncase M1.
 
@chemist_slime

Any update on the Gen 3 splitter? Also, do you have a link to the page where you bought it?

While SLI won't work at x4, CrossFireX will! I bet you could stuff 4 Fury Nano's into an Ncase M1.

175w TDP x4 would need quite a beefy psu and cooling surface area :D
 
175w TDP x4 would need quite a beefy psu and cooling surface area :D

Haha true true... well, with the silverstone 700W SFX-L coming out, 3 R9 Nano's might be more feasible... Assuming a 140W CPU, you'd hit 665W.

I imagine you could mount two R9's on the bottom of the M1, and then one on the bracket where you would normally place a radiator.
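
Napkin math on that (the TDP and PSU figures are the assumptions from above):

# Hypothetical power budget for a triple R9 Nano build in the M1.
GPU_TDP_W = 175      # per R9 Nano (typical board power)
GPU_COUNT = 3
CPU_TDP_W = 140
PSU_W = 700          # the upcoming SilverStone SFX-L

load_w = GPU_TDP_W * GPU_COUNT + CPU_TDP_W
print(f"GPU + CPU load: {load_w} W")                # 665 W
print(f"Headroom on a {PSU_W} W PSU: {PSU_W - load_w} W")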
 
I'm pretty sure you can't fit 2 dual slot cards in the bottom area on the M1.

Might be more feasible to just have the 3 expansion slots in the M1 taken up by 3 video cards w/ single slot waterblocks, and the pump under the sfx-l psu. 3 GPUs on a 240mm rad, and the cpu left to fend for itself with rad exhaust air and whatever room is left over after the 240mm rad. Otherwise you'll need a crazy flexible pcie riser setup w/ non-uniform lane lengths and a lot more potential for EMI to come into play.

A very interesting proposition indeed, but I think at this point it may be a bit premature - at least until we start seeing GPUs with < 125w tdp in the upper tiers of performance, and high end CPUs with more than 20 PCIe lanes while staying under 65w tdp. I'd be satisfied with a dual card solution and cpu under watercooling in the M1, as that was something I'd never thought it would be capable of beyond a dual-gpu card like a 295x or gtx690.

There's gotta be a rule of thumb somewhere for how much TDP you can push through a given amount of rad surface area.
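
If we pencil in something like 100W of heat per 120mm of radiator at sane fan speeds (a figure I've seen floated around watercooling forums, definitely not gospel), the math gets ugly fast:

import math

# Assumed rule of thumb: ~100 W of sustained heat per 120 mm of radiator
# at moderate fan speeds. Treat the number as a placeholder, not a spec.
W_PER_120MM_SECTION = 100

def sections_needed(total_heat_w: float) -> int:
    """How many 120 mm radiator sections a loop would want for a heat load."""
    return math.ceil(total_heat_w / W_PER_120MM_SECTION)

print(sections_needed(3 * 175))        # three Nanos alone -> 6 sections (720 mm)
print(sections_needed(3 * 175 + 140))  # add a 140 W CPU   -> 7 sections (840 mm)

So three Nanos on a single 240mm rad would be running at well over double what that rule of thumb suggests.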
 
I like the idea of doing 3 single slot cards on water blocks!

And yes, may be getting ahead of ourselves, but I do think this will become a possible thing.
 
Possible but money blown out the window. Scaling above two GPUs is very bad and with VR not wanting more cards but a single fast card, you'd probably just stick to two R9 Fury (non-X) cards. Unless you just want to throw cash at it without reason, then go right ahead!
 
Gotta throw cash at it.
Gotta have the best, the most unique of things.
Gotta show the plebs who's king.
 
Possible but money blown out the window. Scaling above two GPUs is very bad and with VR not wanting more cards but a single fast card, you'd probably just stick to two R9 Fury (non-X) cards. Unless you just want to throw cash at it without reason, then go right ahead!

GPUs can be used for a variety of things outside of VR! Rendering, computation, etc.

And yeah, I agree this is probably an overkill idea. But this is about what is possible, not necessarily practical!
 
Was exploring the bios today and it seems that with proper hardware support the bifurcation can support x4/x4/x4/x4, x8/x4/x4, x8/x8 or x16. This makes it possible to have a 3 way splitter or even a 4 way splitter. Quad SLI on a mini-itx anyone?

It's a noble goal, and note that with the DirectX 12 era now starting it is a perfect target for what's coming. 3- and 4-way SLI have negligible performance gains now, but DirectX 12 eliminates this problem completely. Of course one needs DirectX 12 games, but those will be coming. Having a machine ready for the new era is a good investment. I believe such a 4-way SLI setup should be made of single-slot graphics cards with watercooling, but that requires a very special radiator and cooling fan design.

Possible but money blown out the window. Scaling above two GPUs is very bad ...
That is old DirectX11 thinking ;-).
 
I'll take the wait-and-see approach w/ DX12, as there were other projects and technologies that were supposed to be the holy grail and turned out to be absolute turds. Looking at you, Games for Windows Live
 
3- and 4-way SLI have negligible performance gains now, but DirectX 12 eliminates this problem completely
I'm not sure where this idea comes from. DX12 is not magic, it just takes the job of partitioning work between GPUs away from the driver developer and puts that task in the hands of the engine and game developer. If you do not have a workload that scales well to multiple GPUs, it will not scale well regardless of API.
The issues of data sharing between GPUs, synchronisation of buffers, duplication of workloads (if internal GPU RAM is not sufficient but swapping via system RAM is too slow), inter-card DMA (if that ever makes its way to consumer cards) etc, all still exist, and will continue to exist with DX12 and Vulkan.
If the game/engine developer does not have the time (or experience) to dedicate to optimising for multi-GPU rendering, DX12 could end up offering worse performance.
 
I'm not sure where this idea comes from. DX12 is not magic, it just takes the job of partitioning work between GPUs away from the driver developer and puts that task in the hands of the engine and game developer. If you do not have a workload that scales well to multiple GPUs, it will not scale well regardless of API. The issues of data sharing between GPUs, synchronisation of buffers, duplication of workloads (if internal GPU RAM is not sufficient but swapping via system RAM is too slow), inter-card DMA (if that ever makes its way to consumer cards) etc, all still exist, and will continue to exist with DX12 and Vulkan. If the game/engine developer does not have the time (or experience) to dedicate to optimising for multi-GPU rendering, DX12 could end up offering worse performance.

Contrary to what you say, DX12 is magic compared to what came before. To see this you have to understand that previously the basic computing model for graphics was based on a single core and a single thread, meaning that no matter how many cards were in the system, all processing was done by a single core in a single thread. This is the reason why scaling up to 3 and 4 cards was so poor. In DX12 the basic model is multicore and multithreaded, and it has been tested and performs well up to 8 cores/threads and scales with SLI (there are rumors that DX12 allows SLI with up to 8 GPUs). This is a giant step forward. Now, what you say about RAM and GPU data sharing is more of a hardware issue; it will be solved quite soon, since the AMD Fury has special RAM and the new NVIDIA link will solve the communication.

The issue of how much effort a game developer needs to put into utilizing DX12 is more complicated. On a basic level no effort is needed, since it will be like present-day SLI except that the system will allocate multiple cores and threads. At the optimization level there is effort required, which depends on the range of DX12 capabilities used.
 
the basic computing model for graphics was based on a single core and a single thread, meaning that no matter how many cards were in the system, all processing was done by a single core in a single thread.
This is a MASSIVE oversimplification. For one thing DX11 already HAS multithreading implemented!:
We are especially hopeful about a faster shift to DX11 because of the added advantages it will bring even to DX10 hardware. The major benefit I'm talking about here is multi-threading. Yes, eventually everything will need to be drawn, rasterized, and displayed (linearly and synchronously), but DX11 adds multi-threading support that allows applications to simultaneously create resources or manage state and issue draw commands, all from an arbitrary number of threads. This may not significantly speed up the graphics subsystem (especially if we are already very GPU limited), but this does increase the ability to more easily explicitly massively thread a game and take advantage of the increasing number of CPU cores on the desktop.
Some benefit will come from job dispatch having reduced overhead, but this will mainly be a benefit in scenarios which are currently CPU bottlenecked (not the case with most games) or where driver optimisation has lagged behind (e.g. recent AMD drivers).
This is the reason why scaling up to 3 and 4 cards was so poor.
No, the reason scaling is at best "OK" with 2 cards and progressively worse with more cards is because, while GPU rendering tasks are internally embarrassingly parallel (particularly pixel-level operations), GPU jobs are not. Not only are they sequential (you need to know where your polygons will be before you can determine if they are visible, you need to know if they are visible before you can shade them, etc) but also interdependent (e.g. the leaf geometry of a procedurally generated tree is dependent on the position of the tree 'base') meaning that data must be shared between GPUs, and in the correct order, or rendering will break.
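A toy Amdahl-style sketch of how quickly that bites: if even a modest slice of each frame's work can't be split across cards (the 10% figure below is purely illustrative, not a measurement), speedup flattens out no matter how cleverly jobs are dispatched.

# Toy Amdahl's-law illustration: serial_fraction is the share of per-frame
# work that cannot be parallelised across GPUs (sync, dependent passes, etc).
def multi_gpu_speedup(gpus: int, serial_fraction: float) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / gpus)

for n in (1, 2, 3, 4):
    print(f"{n} GPU(s): {multi_gpu_speedup(n, 0.10):.2f}x")
# 1.00x, 1.82x, 2.50x, 3.08x -- diminishing returns with each extra card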
In DX12 the basic model is multicore and multithreaded, and it has been tested and performs well up to 8 cores/threads and scales with SLI (there are rumors that DX12 allows SLI with up to 8 GPUs). This is a giant step forward.
It also requires all the work that is currently done by Nvidia (and AMD) to distribute jobs between GPUs to be done by the engine or game developers. This is not something that they necessarily have any experience with, even if they have been working very closely with GPU vendors. Threading is a Hard Problem, and for many tasks will result in minimal gain even with perfect optimisation. Developers are ALREADY pushing threading as hard as they can to take advantage of modern quad-core CPUs (and to get the most out of consoles), and the scaling benefits are FAR from dramatic.
Now, what you say about RAM and GPU data sharing is more of a hardware issue; it will be solved quite soon, since the AMD Fury has special RAM and the new NVIDIA link will solve the communication.
HBM and HBM2 are going to do nothing for data sharing between cards, as the bottleneck is the shared card bus. NVLink remains to be seen: all Nvidia's presentation materials have focussed on NVLink with PowerPC for HPC applications, though a handful of slides show NVLink in use as an auxiliary GPU-GPU (i.e. NOT CPU-GPU) link for x86 systems. And what we've yet to see is the physical form-factor of NVLink; it may be something like the SLI bridge where it could be compatible with existing PCIe specs for consumer devices, or it could have such rigid electrical specification that it needs an auxiliary slot and routing through the motherboard (in which case the chances of it turning up in consumer x86 boards is slim-to-none).
On a basic level no effort is needed, since it will be like present-day SLI except that the system will allocate multiple cores and threads
This is not the case. DX12 Implicit Multiadapter is NOT the same as DX11 SLI/crossfire. SLI/crossfire is doing some degree of per-card job dispatching, whereas Implicit Multiadapter is a much more basic mirroring of memory and jobs between cards for AFR.


tl;dr DX12 will not suddenly make things run better with no effort. WDDM 2.0 will alleviate some degree of driverside CPU bottlenecking in certain scenarios. DX12 will not magically make multi-GPU scaling better without a LOT of work on the part of engine and game developers.
 
I love it when a random wise-crackin' leads to a very informative and educational ass-spankin'.
 
I already lost it at "single core and single thread". GPUs have effing hundreds of cores and are dedicated SIMD units for highly parallel applications, you can't tell me that is single core or single thread in any way.
 
Does anyone know if the new ASRock Z170 Gaming ITX supports bifurcation? It has some benefits over the X99.
 
Does anyone know if the new ASRock Z170 Gaming ITX supports bifurcation? It has some benefits over the X99.

From the Asrock support techs it sounds like it will support bifurcation. Benefits of Z170 seem to be possible inclusion of Alpine Ridge/Thunderbolt 3 over USB type C. However, it's unclear if they're using the Alpine Ridge controller. What benefits do you see over the X99 (besides price)?
 
From the Asrock support techs it sounds like it will support bifurcation. Benefits of Z170 seem to be possible inclusion of Alpine Ridge/Thunderbolt 3 over USB type C. However, it's unclear if they're using the Alpine Ridge controller. What benefits do you see over the X99 (besides price)?

Personally: M.2 on the back, the layout, and greater support for coolers. It seems the performance difference between the 5820K and 6700K is not that big even in multithreading (based on Tom's Hardware testing), and the price of an X99 board is much higher (by around 50%: 300 EUR vs 200 EUR).
 
Any news from chemist_slime if his new riser was able to achieve bifurcation at PCIe Gen 3.0 speeds?
 
@Runamok81: the new splitter worked! No PLX chip required, I was able to get the 970 GTX and 610 GT running simultaneously on OS X.
 
This is a MASSIVE oversimplification. For one thing DX11 already HAS multithreading implemented!

We are talking here about SLI/CrossFire, for which there is no multithreading/multicore in DX11. DX12 provides this with several modes of operation: Alternate Frame Rendering, Split Frame Rendering and Multiadapter. This means that frames can be drawn alternately by the GPUs, or multiple GPUs can draw a single frame, and graphics adapters from different manufacturers can operate together. This is all about multithreading jobs and distributing them to different cores.
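
Conceptually (a pure Python illustration of the frame-distribution idea, not the actual D3D12 API):

# Illustration only: how AFR and SFR divide work between two GPUs.
GPUS = ["GPU0", "GPU1"]

def afr_owner(frame: int) -> str:
    """Alternate Frame Rendering: whole frames round-robin across GPUs."""
    return GPUS[frame % len(GPUS)]

def sfr_slices(frame: int) -> dict:
    """Split Frame Rendering: each GPU renders a slice of the same frame."""
    return {gpu: f"frame {frame}, slice {i}" for i, gpu in enumerate(GPUS)}

print([afr_owner(f) for f in range(4)])  # ['GPU0', 'GPU1', 'GPU0', 'GPU1']
print(sfr_slices(0))                     # both GPUs busy on frame 0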

I already lost it at "single core and single thread". GPUs have effing hundreds of cores and are dedicated SIMD units for highly parallel applications, you can't tell me that is single core or single thread in any way.

This is not about the internal workings of graphics cards but about how the system manages operations when there are several graphics cards (SLI). In the present model the cards are managed sequentially by a single thread on a single core, which leads to bottlenecks. In DX12 the cards can be managed by multiple threads on multiple cores.
 
Any news from chemist_slime if his new riser was able to achieve bifurcation at PCIe Gen 3.0 speeds?

@Runamok81: the new splitter worked! No PLX chip required, I was able to get the 970 GTX and 610 GT running simultaneously on OS X.

So you got a new PLX splitter which is different from the old PLX splitter and it works fine? Maybe the new one does not even require special BIOS support? (I know it is unlikely but maybe?)
 
@Wirk. Did you see a PLX chip on the Ameri-rack splitter that Chemist_Slime is using?

[Image: Ameri-rack ARC2-PELY423-C7 splitter]
 
@Wirk. Did you see a PLX chip on the Ameri-rack splitter that Chemist_Slime is using?
This is obviously a passive splitter, but I got confused with so many splitters that Slime is juggling. What I would like to know is whether there is a method to do the splitting without a special BIOS. An equivalent question is how the splitting is done with the various expander boxes where just a special cable is connected to a PCIe slot.
 
I got confused with so many splitters
In theory:
- No BIOS support for PCIe bifurcation: Only splitters with PLX chips will work
- BIOS support for PCIe bifurcation: Passive splitters will also work

In practice:
This is an edge case most manufacturers won't test for, and definitely not something consumer GPUs are tested with. What works may be complete pot-luck depending on components used.
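
Spelled out as a tiny sketch (this is just the statement above in code form, nothing vendor-specific):

# In theory: a PLX splitter brings its own PCIe switch, a passive splitter
# relies on the platform/BIOS being willing to bifurcate the slot.
def splitter_should_work(bios_bifurcation: bool, has_plx_chip: bool) -> bool:
    return has_plx_chip or bios_bifurcation

for bios in (False, True):
    for plx in (False, True):
        kind = "PLX" if plx else "passive"
        print(f"BIOS bifurcation={bios}, {kind} splitter -> {splitter_should_work(bios, plx)}")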
 
In theory:
- No BIOS support for PCIe bifurcation: Only splitters with PLX chips will work
- BIOS support for PCIe bifurcation: Passive splitters will also work

In practice:
This is an edge case most manufacturers won't test for, and definitely not something consumer GPUs are tested with. What works may be complete pot-luck depending on components used.


Correct, though the PLX chip route should work in any system.

Take any dual-GPU card such as the GTX690/Titan Z/295X2; these are all PLX-chip cards and they work without any fuss.

It could be that that is because of the always-present GPUs (some kind of handshake that is done), but these cards should work in any PCIe slot, regardless of BIOS. If only someone could test one of these cards on an earlier BIOS for the X99E.
 
What I would like to know is whether there is a method to do the splitting without a special BIOS.

Short answer: Without a special BIOS, you'll need a PLX chip.

Long answer: PCIe Bifurcation (the ability to split a 16x to 8x/8x without the use of a PLX chip) is less of a BIOS feature, and more of a chipset feature. It is INTEL who enables the bifurcation through their chipsets, and it is the motherboard manufacturers who in turn must enable it in their BIOS. The Z87, Z97, X99, and Z170 (rumored) chipsets support Bifurcation. It's built-in. If you don't see a Bifurcation option in your BIOS, then you may need to ask your manufacturer. Chemist_Slime did just that and it was a one-day turnaround for ASRock to enable it.
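
Put another way, the support is layered; a small hypothetical sketch of the decision (my own helper names, just to illustrate the layering):

# Chipset/CPU must be able to split the link, AND the board vendor must
# expose the option in the BIOS, before a passive splitter can work.
def bifurcation_outlook(chipset_supports: bool, bios_exposes_option: bool) -> str:
    if not chipset_supports:
        return "No: the platform itself cannot split the x16 link"
    if not bios_exposes_option:
        return "Ask the board vendor: the silicon can do it, the BIOS just hides it"
    return "Yes: set the slot to x8/x8 (or x4/x4/x4/x4) and use a passive splitter"

print(bifurcation_outlook(True, False))  # e.g. a board whose BIOS lacks the toggle
print(bifurcation_outlook(True, True))   # e.g. the X99E-ITX/ac after the BIOS update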
 
Short answer: Without a special BIOS, you'll need a PLX chip.

Long answer: PCIe Bifurcation (the ability to split a 16x to 8x/8x without the use of a PLX chip) is less of a BIOS feature, and more of a chipset feature. It is INTEL who enables the bifurcation through their chipsets, and it is the motherboard manufacturers who in turn must enable it in their BIOS. The Z87, Z97, X99, and Z170 (rumored) chipsets support Bifurcation. It's built-in. If you don't see a Bifurcation option in your BIOS, then you may need to ask your manufacturer. Chemist_Slime did just that and it was a one-day turnaround for ASRock to enable it.

Not entirely correct. Before his request on the ASRock forum there was another thread asking whether the z97e-itx/ac had bifurcation support; in that thread ASRock_TSD responded no, and said the X99e-ITX/AC had bifurcation support, as would future 100 series chipsets.

Weirdly enough they deleted the thread, but Google cache still has it; see a couple of posts back.

Edit: seems that the cached page also isn't available anymore.
 
Not entirely correct. Before his request on the ASRock forum there was another thread asking whether the z97e-itx/ac had bifurcation support; in that thread ASRock_TSD responded no, and said the X99e-ITX/AC had bifurcation support, as would future 100 series chipsets.

Weirdly enough they deleted the thread, but Google cache still has it; see a couple of posts back.

Edit: seems that the cached page also isn't available anymore.

That's really strange they deleted this thread. I hope it doesn't mean they're actually not planning on bringing bifurcation support to 100 series motherboards.
 
Not entirely correct. Before his request on the ASRock forum there was another thread asking whether the z97e-itx/ac had bifurcation support; in that thread ASRock_TSD responded no, and said the X99e-ITX/AC had bifurcation support, as would future 100 series chipsets.

Weirdly enough they deleted the thread, but Google cache still has it; see a couple of posts back.

Edit: seems that the cached page also isn't available anymore.

I stand by the accuracy of my statements. Bifurcation support starts at the chipset level, and Intel makes the chipsets, not ASRock. According to Intel, the z87, z97, X99, and z170 chipsets support bifurcation. ASRock not choosing to support it in their BIOS is their decision. However, that decision doesn't condemn the entire platform.

Intel ARK datasheets showing chipset bifurcation support.
z97 (page 52)
z170 (page 21)

To me, this looks like a BIOS support question, right?
 
I stand by the accuracy of my statements. Bifurcation support starts at the chipset level, and Intel makes the chipsets, not ASRock. According to Intel, the z87, z97, X99, and z170 chipsets support bifurcation. ASRock not choosing to support it in their BIOS is their decision. However, that decision doesn't condemn the entire platform.

Intel ARK datasheets showing chipset bifurcation support.
z97 (page 52)
z170 (page 21)

To me, this looks like a BIOS support question, right?

Definitely seems like a BIOS issue to me as well, from my experience.

The 1.20E BIOS + the Ameri-rack Gen3 passive PCIe splitter + ASRock X99E-ITX is currently the only confirmed reliable solution that I know of that will support Gen 3.0 graphics cards. The only exception to this is that the Supermicro active PLX splitter also works, but it requires the 1.20E BIOS as well. The other passive Supermicro splitter and the Ameri-rack Gen2 only work with PCIe 2.0 cards; the 970 GTX wouldn't post.
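
Restating that as a quick lookup of what I've actually seen work on the X99E-ITX (anything not listed here is simply untested by me):

# My results so far, splitter -> outcome.
RESULTS = {
    "Ameri-rack Gen3 passive (1.20E BIOS)": "Gen 3.0 cards work",
    "Supermicro active PLX (1.20E BIOS)":   "Gen 3.0 cards work",
    "Supermicro passive":                   "PCIe 2.0 cards only; 970 GTX wouldn't post",
    "Ameri-rack Gen2":                      "PCIe 2.0 cards only; 970 GTX wouldn't post",
}

for splitter, outcome in RESULTS.items():
    print(f"{splitter}: {outcome}")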
 
From the Asrock support techs it sounds like it will support bifurcation. Benefits of Z170 seem to be possible inclusion of Alpine Ridge/Thunderbolt 3 over USB type C. However, it's unclear if they're using the Alpine Ridge controller. What benefits do you see over the X99 (besides price)?

I don't think it uses that usb 3.1 chip. From the driver page it uses: Asmedia USB 3.0/3.1 XHCI Driver ver:1.16.23.0

http://www.asrock.com/mb/Intel/Fatal1ty Z170 Gaming-ITXac/?cat=Download&os=All
 