Hi everyone,
I don't usually post to this forum, but after hearing many statements like the one in the title, I thought I'd finally comment on the issue. From the perspective of a gamer, you are correct: the additional bandwidth of successive AGP modes is mostly irrelevant. In fact, dropping down to AGP 4x likely won't hinder your performance either. I can tell you from the perspective of a 3D renderer developer, however, that PCIe is far more than a double-bandwidth AGP slot.
The bottleneck in AGP is not the bandwidth; there's plenty of that. The problem is the lack of full-duplex operation: the entire bus must operate in either 'read' mode or 'write' mode. If I draw a single primitive on the screen, read the backbuffer, then draw another primitive, I incur a massive pipeline stall while the bus 'flips'. My system effectively pauses for entire milliseconds while this happens. To put it another way, it's like sitting at my oven waiting for my turkey to cook instead of moving on to some other task.
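To make the stall concrete, here's roughly what that read-modify-write pattern looks like in OpenGL. This is only a minimal sketch: I'm assuming a valid GL context already exists, and drawPrimitive()/drawNextPrimitive() are hypothetical stand-ins for real scene code, not anything from an actual engine.

```cpp
// Minimal sketch of the read-modify-write pattern that stalls an AGP bus.
// Assumes an existing OpenGL context; drawPrimitive()/drawNextPrimitive()
// are hypothetical placeholders for real rendering code.
#include <GL/gl.h>
#include <vector>

extern void drawPrimitive();
extern void drawNextPrimitive();

void readModifyWriteFrame(int width, int height)
{
    drawPrimitive();                         // bus streaming CPU -> GPU ('write')

    // Reading the backbuffer forces the pipeline to drain and the bus
    // to reverse direction; this is where the milliseconds disappear.
    std::vector<unsigned char> pixels(width * height * 4);
    glReadBuffer(GL_BACK);
    glReadPixels(0, 0, width, height,
                 GL_RGBA, GL_UNSIGNED_BYTE, &pixels[0]);  // bus 'flips' to read

    drawNextPrimitive();                     // and 'flips' back to write again
}
```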
Because of this issue, we cannot read the backbuffer per-frame in a real-time application (even drawing one triangle at 640x480, we're talking 20 fps or less). Why would you want to do something like this? You could, for example, perform a myriad of post-processing effects: depth of field, motion blur, glow effects, etc. You could also perform extremely fast occlusion testing. (In the absence of this, I have to fall back to a low-resolution software rasterizer, run entirely on the CPU, to achieve this in my code path. I could have saved months of programming and achieved far more precision and performance if I could run this on the GPU.) Countless other applications are possible, including those too technical to go into on this forum, and those I simply don't know about.
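For the curious, the CPU-side occlusion fallback I mentioned boils down to something like the sketch below. Everything here is illustrative (the coarse buffer layout and the names are made up for this post, not lifted from my actual code):

```cpp
// Minimal sketch of a CPU-side occlusion test against a low-resolution
// software depth buffer. All names are illustrative, not from any real
// engine. Convention: smaller depth values are nearer to the camera.
#include <vector>

struct CoarseDepthBuffer {
    int width, height;
    std::vector<float> depth;  // farthest occluder depth per cell (conservative)
};

// Returns true only if every cell covered by the object's screen-space
// rectangle already holds geometry nearer than the object itself, i.e.
// the object is definitely hidden and its draw calls can be skipped.
bool isRectOccluded(const CoarseDepthBuffer& buf,
                    int x0, int y0, int x1, int y1, float objectNearZ)
{
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            if (buf.depth[y * buf.width + x] >= objectNearZ)
                return false;  // a cell is open behind the object: draw it
    return true;               // fully covered: safe to cull
}
```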
Q: Don't games like Prince of Persia and Halo 2 have blur/glow effects in them already?
A: Yes they do. For the curious, they achieve this by rendering to a texture (in video memory, of course), manipulating that texture with GPU operations, and then drawing a single textured primitive to act as a virtual 'screen'. Since all the computation is performed on the GPU, there is no readback across the AGP bus. Glow, for example, can be realized with an interesting mip-map trick, but I won't go into these techniques here. I'm sure you realize, however, that though these clever workarounds do work, the whole situation vastly limits what you can do.
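In OpenGL terms, the workaround looks roughly like this. Again just a sketch: I'm assuming an existing GL context and a screen-sized texture already created with glTexImage2D, and drawScene()/drawFullscreenQuad() are placeholders for real engine code.

```cpp
// Rough sketch of the render-to-texture workaround: everything stays in
// video memory, so nothing crosses the AGP bus. Assumes an existing GL
// context and a screen-sized texture created beforehand with glTexImage2D.
#include <GL/gl.h>

extern void drawScene();
extern void drawFullscreenQuad();  // one textured primitive as the virtual 'screen'

void postProcessedFrame(GLuint sceneTexture, int width, int height)
{
    drawScene();  // render the frame as usual

    // Copy the framebuffer into a texture -- a video-memory-to-video-memory
    // operation, so no readback over AGP.
    glBindTexture(GL_TEXTURE_2D, sceneTexture);
    glCopyTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height);

    // Manipulate the texture on the GPU (blur passes, mip tricks, etc.),
    // then draw the result back as a single textured quad.
    glEnable(GL_TEXTURE_2D);
    drawFullscreenQuad();
}
```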
Hence the reason AGP 8x's extra bandwidth goes largely unused in current games. It's not that we can't find a way to put it to use; it's that these advanced techniques would cripple our framerate. And there would be no difference between a GeForce2 MX and a Radeon 9800 XT: both would run in the 'tens' of frames per second.
So while you are correct that PCIe will mean relatively little to the gamer in the near future, don't make it out to be a useless feature; it is actually quite exciting ;-)
Here's another advantage that won't be realized for a while. Current games must use video memory very carefully, to ensure that all the data for a level fits on the card. If I try to display 160 MB of textures on a 128 MB card, I incur a massive (or minor, depending on the scene) performance hit as textures keep swapping between video and system RAM. The reason the 128 MB Radeon and the 256 MB Radeon Pro perform identically in Unreal Tournament is that the game was carefully tuned to fit inside that 128 MB budget. You can be sure Epic would have no trouble filling the 256 MB card... PCIe should greatly reduce the cost of texture thrashing, but once again we can't rely on this functionality for years, not until the 'base' target has upgraded from AGP.
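If you're wondering where a figure like "160 MB of textures" comes from, the bookkeeping is plain arithmetic. Here's a toy calculation (assumed sizes, not from any particular game): a 1024x1024 32-bit texture with a full mip chain costs about 5.3 MB, so thirty of them already blows past a 128 MB card before framebuffers are even counted.

```cpp
// Toy texture-budget arithmetic: bytes for one texture including its
// full mip chain. Sizes are assumptions for illustration only.
#include <cstdio>

unsigned long mipChainBytes(unsigned width, unsigned height,
                            unsigned bytesPerTexel)
{
    unsigned long total = 0;
    while (width >= 1 && height >= 1) {
        total += (unsigned long)width * height * bytesPerTexel;
        if (width == 1 && height == 1) break;   // reached the 1x1 mip
        if (width > 1)  width  /= 2;
        if (height > 1) height /= 2;
    }
    return total;
}

int main()
{
    unsigned long one = mipChainBytes(1024, 1024, 4);  // 32-bit RGBA
    std::printf("one 1024x1024 texture + mips: %.1f MB\n",
                one / (1024.0 * 1024.0));               // ~5.3 MB
    std::printf("thirty of them: %.1f MB\n",
                30 * one / (1024.0 * 1024.0));          // ~160 MB
    return 0;
}
```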
This whole issue, by the way, is similar to statements like "I see no difference between my 32-bit Pentium and my 64-bit Athlon" and "Why do we need pixel shader 2 (or even 3 now!) when barely anything out there even uses 1!". Trust me, there are good things being made with shader versions 2 and even 3, but developers get the hardware at the same time you do. We are not all Carmacks and Sweeneys. And even then, the value PC dictates what a game can or cannot do. It is not as simple as flipping a switch; these kinds of issues have to be settled at the very beginning of a long development cycle.
Hope that clears up a few misconceptions!