I can see what you mean. ATI doesn't exactly have the best bandwidth with their products, and using more than 2 GPUs can quickly eat away at that. NVIDIA scales better because they have a massive amount of bandwidth with their nForce 200 chipset, but I don't know what ATI uses for their PCI-E controller.
I wonder if NVIDIA's scaling is better because they only need to focus on 3-way and same-card SLI instead of 4-way and multiple-card Crossfire.
I thought the bandwidth "thing" was debunked last month by Kyle? The way I took it was that even the bandwidth at x4 was enough for each card.
I can see what you mean. ATI doesn't exactly have the best bandwidth with their products, and using more than 2 GPUs can quickly eat away at that. NVIDIA scales better because they have a massive amount of bandwidth with their nForce 200 chipset, but I don't know what ATI uses for their PCI-E controller.
This isn't a bandwidth issue. Neither NVIDIA's nor AMD's cards need more than 4 PCI-Express 2.0 lanes. (That's equivalent to PCI-Express 1.0/1.0a with 8 lanes.) Both of the recent articles on this here and here refute that as a possibility. Now, the 5970 wasn't specifically tested, and given that it is a dual GPU card it may need more bandwidth than a PCI-Express x4 connection can deliver. However, 8 lanes is more than sufficient.
Also, it is a myth that the nForce 200 MCPs provide more bandwidth. They multiplex the connection. The choke point is the PCI-Express lanes into the chipset, which the nForce 200 MCPs take all of. So if you've got 4 PCI-Express x16 cards installed, then you are eating 64 lanes' worth of bandwidth. Two nForce 200 MCPs supposedly provide this, but the fact is they have a 36-lane bridge into the X58 chipset. So again it turns into a choke point. We don't see this have any negative impact because PCI-Express cards have yet to saturate the bandwidth. When they do, the nForce 200 MCP equipped boards may be in trouble. Then again, they'll be no worse off than regular non-nForce 200 boards, which won't have the necessary bandwidth either. By that time PCI-Express 3.0 will be out and we'll be using that.
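The lane arithmetic behind that choke-point argument is easy to sketch. A rough back-of-the-envelope, assuming the usual PCIe 2.0 figure of ~500 MB/s per lane each way (5 GT/s with 8b/10b encoding) and the 36-lane uplink mentioned above:

```python
# Rough arithmetic behind the nForce 200 choke-point argument.
# Assumes PCIe 2.0: 5 GT/s per lane with 8b/10b encoding, ~500 MB/s
# per lane in each direction.
PCIE2_MB_PER_LANE = 500

def bandwidth_mb_s(lanes: int) -> int:
    """Peak one-way bandwidth of a PCIe 2.0 link with the given lane count."""
    return lanes * PCIE2_MB_PER_LANE

demanded_lanes = 4 * 16   # four x16 slots hung off the nForce 200 MCPs
uplink_lanes = 36         # lanes actually going into the X58 chipset

demanded = bandwidth_mb_s(demanded_lanes)    # 32000 MB/s
available = bandwidth_mb_s(uplink_lanes)     # 18000 MB/s
print(f"slots want {demanded} MB/s, uplink offers {available} MB/s "
      f"({demanded / available:.2f}x oversubscribed)")
```

So even if every slot could be saturated at once, the bridge into the chipset would be oversubscribed by nearly 2x; the only reason it doesn't matter today is that no current card comes close to filling its slot.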
Boards with the nForce 200 MCP have been tested again and again, and in each case I've seen, the nForce 200 MCP did nothing for performance and in fact actually hurt it to a very slight degree. All these chips do is add latency, heat and complexity to the motherboard. In fact, the only functionality they provide that is of any value is "load balancing" of the PCI-Express bus. By that I mean that if you've got nForce 200 MCPs on your board, you can have 4 or more PCI-Express x16 (physical) slots and each slot can have the same amount of bandwidth. With non-nForce 200 boards you can't really do that. Each slot is going to have a finite amount of bandwidth, so a 16x16x4 configuration is about the most you are going to get. With nForce 200 equipped boards you could do 16x16x16x16. However, again you've got a maximum of 36 lanes going into the chipset, so you don't really get any more performance out of the setup. You will however get the same performance (in theory) out of all three or four of your graphics cards.
And largely I think it's a marketing thing. People were hung up on having "true PCI-Express x16 slots" back in the NVIDIA chipset days. This allows manufacturers to say that again on their printed spec sheets, because it is technically true: each slot will have 16 lanes or whatever. What the motherboard manufacturers don't tell you is that it really doesn't mean anything, as the current boards are still constrained by the chipsets themselves and the nForce 200 MCPs do nothing for actual performance.
I doubt it. I think NVIDIA simply puts more focus on performance. It may also be that their GPU designs tend to benefit more from SLI than AMD's benefit from Crossfire, which may be due to their different approaches in design. It's really hard to say, but it seems like a focus issue to me. NVIDIA works hard at making sure multiple card setups work properly so customers will buy more than one card per generation. AMD has had problems with Crossfire scaling forever and has only started to address them in any meaningful way recently. This is just my perception of the situation, but it seems to make sense.
It was. Generally speaking PCI-Express 2.0 x4 slots are enough for today's graphics cards. This is no surprise to me because PCI-Express 1.0/1.0a x8 slots were enough for SLI configurations in the past including the 7950GX2. PCI-Express 2.0 x4 slots provide the same amount of bandwidth. We are still quite a ways from saturating PCI-Express in general. I expect this to continue for several more years.
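That equivalence is just per-lane arithmetic. A quick sketch, with per-lane figures assumed from the PCIe specs (250 MB/s for 1.0/1.0a, 500 MB/s for 2.0, both one-way):

```python
# Approximate one-way bandwidth per lane, in MB/s, by PCIe generation
# (2.5 GT/s and 5 GT/s respectively, both with 8b/10b encoding).
MB_PER_LANE = {"1.0": 250, "2.0": 500}

def link_bandwidth(gen: str, lanes: int) -> int:
    """Peak one-way bandwidth of a link of the given generation and width."""
    return MB_PER_LANE[gen] * lanes

# A PCIe 2.0 x4 slot delivers the same ~2 GB/s a PCIe 1.0/1.0a x8 slot did,
# which was already enough for SLI setups like the 7950GX2.
assert link_bandwidth("2.0", 4) == link_bandwidth("1.0", 8) == 2000
print("PCIe 2.0 x4 ==", link_bandwidth("2.0", 4), "MB/s == PCIe 1.0 x8")
```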
You've missed the point of having an NF200. The point is to provide bandwidth between the GPUs, not from the chipset to the CPU. There is no need for that. So the bottleneck argument doesn't really make sense.
If you don't have enough bandwidth between GPUs, the resource transfers that happen in AFR mode (when NVIDIA either can't get rid of them with an SLI profile, or chooses not to get rid of certain transfers because removing them would cause corruption due to how the game relies on data persisting between frames) will not transfer as fast as they could, leading to somewhat reduced performance. Of course, that's assuming there is enough data being transferred that bandwidth makes a difference. The transfers between the GPUs are usually direct, so all they do is go through the NF200 to the other board in another PCIe slot.
You would notice a huge difference if a game is not profiled for SLI (or you renamed the exe) and you set it to use "AFR1" mode in the NVIDIA control panel; the bandwidth between GPUs can actually help then. If you rename an app like GTA4 and set it to AFR1, you might see a difference in performance there because there is so much data being transferred between GPUs each frame.
But since NVIDIA is on top of providing profiles for games, this benefit is nullified: they have already killed most of the transfers that would happen anyway, so the extra bandwidth goes to waste.
My point is simply that in theory it isn't pointless. It should help for games that are not profiled for SLI where you just set AFR mode (though even then you could just make your own profile through trial and error, or rename to another EXE and hope performance is good and you don't get corruption). But obviously that's not much of a selling point given the amount of SLI profiling that NVIDIA does. I'm just making a technical point.
And that theory is completely debunked by:
http://hardocp.com/article/2010/08/25/gtx_480_sli_pcie_bandwidth_perf_x16x16_vs_x4x4/
http://hardocp.com/article/2010/08/23/gtx_480_sli_pcie_bandwidth_perf_x16x16_vs_x8x8/
For x8/x8, they even did an NVIDIA Surround (ATI Eyefinity equivalent) resolution test, where massive amounts of data should be output to the monitors and transmitted between the cards.
And once you get to dual NF200s, the two bridges are still sharing the same connection between each other (through the X58 chipset, normally). So 3+ card setups just take on the added latency of a PCIe bridge (which is effectively what the NF200 is) without any actual increase in bandwidth between the cards, meaning there isn't a benefit.
I was reading the article about CFX improvements with the 10.8a drivers and was wondering if this driver affects the 5970 as well. Even though it is still a single card, shouldn't it also receive a performance boost?
It would benefit from the Crossfire application profiles. Unfortunately, I don't think it does too much for CrossfireX scaling (dual Radeon HD 5970s). ATI is still boning us on 3 and 4 GPU scaling. When you get down to it, the 5970 is really a pair of underclocked Radeon HD 5870s on a single PCB. Overclocked to 5870 speeds, the thing performs the same as dual 5870s. So if Radeon HD 5870 Crossfire scaling is improved, the Radeon HD 5970's will be too.
I'm using two Sapphire 4GB 5970s and I still don't see any gains worth writing home about with 10.8a.