Kyle: "Heck, AGP8X is still not fully utilized."

lodi_a

Hi everyone,

I don't usually post to this forum, but after hearing many statements such as the one in the title, I thought I'd finally comment on the issue. From the perspective of a gamer, you are correct: the additional bandwidth of successive agp modes is mostly irrelevant. In fact, dropping down to agp4x likely won't hinder your performance either. I can tell you from the perspective of a 3d renderer developer, however, that pci-e is far more than a double-bandwidth agp slot.

The bottleneck in agp is not the bandwidth; there's plenty of that. The problem is the lack of full-duplex operation: the entire bus must operate in either 'read' mode or 'write' mode. If I draw a single primitive on the screen, read the backbuffer, then draw another primitive, I incur a massive pipeline stall while the bus 'flips'. My system effectively pauses for entire milliseconds while this happens. To put it another way, this is similar to sitting at my oven, waiting for my turkey to cook instead of moving on to some other task.
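
To make that concrete, here's roughly what the offending pattern looks like in OpenGL terms (just a sketch; assume a rendering context already exists, and the triangle is a stand-in for real scene work):

    #include <GL/gl.h>
    #include <vector>

    // Stand-in for real scene code: one lone triangle.
    static void drawTriangle()
    {
        glBegin(GL_TRIANGLES);
        glVertex2f(-0.5f, -0.5f);
        glVertex2f( 0.5f, -0.5f);
        glVertex2f( 0.0f,  0.5f);
        glEnd();
    }

    // Draw, read the backbuffer, draw again. On AGP the glReadPixels call
    // forces everything queued so far to finish and the bus to 'flip' into
    // read mode, then flip back to write mode for the next draw.
    void drawWithReadback(int width, int height)
    {
        std::vector<unsigned char> pixels(width * height * 4);

        drawTriangle();                              // bus in write mode

        glReadPixels(0, 0, width, height, GL_RGBA,   // bus flips to read;
                     GL_UNSIGNED_BYTE, &pixels[0]);  // pipeline drains, CPU waits

        drawTriangle();                              // bus flips back to write
    }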

Because of this issue, we cannot read the backbuffer every frame in a real-time application (even drawing one triangle at 640x480, we're talking 20fps or less). Why would you want to do something like this? You could, for example, apply a myriad of post-processing effects: depth of field, motion blur, glow effects, etc. You could also perform extremely fast occlusion testing (in the absence of this, I have to fall back to a low-resolution software rasterizer, run entirely on the cpu, to achieve this in my code path. I could have saved months of programming and achieved far more precision and performance if I could have run this on the gpu). Countless other applications are possible, including those that are too technical to go into on this forum, and those that I simply don't know of yet.
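
To give the curious a flavour of what that cpu fallback involves, here's a toy sketch. It is nothing like my actual code: the names and sizes are made up, and a real implementation rasterizes arbitrary occluder triangles rather than rectangles.

    #include <algorithm>
    #include <vector>

    // Keep a tiny software depth buffer (say 64x48 cells), splat the big
    // occluders into it on the cpu, then test an object's screen-space
    // bounding rectangle against it. If every covered cell already holds a
    // nearer occluder, the object can be skipped without drawing it.
    struct CoarseDepthBuffer
    {
        int w, h;
        std::vector<float> depth;   // nearest occluder depth per cell (1.0 = far)

        CoarseDepthBuffer(int width, int height)
            : w(width), h(height), depth(width * height, 1.0f) {}

        // record an occluder covering cells [x0,x1) x [y0,y1) at depth z
        void addOccluder(int x0, int y0, int x1, int y1, float z)
        {
            for (int y = y0; y < y1; ++y)
                for (int x = x0; x < x1; ++x)
                    depth[y * w + x] = std::min(depth[y * w + x], z);
        }

        // true if a box covering those cells at depth z is hidden everywhere
        bool isOccluded(int x0, int y0, int x1, int y1, float z) const
        {
            for (int y = y0; y < y1; ++y)
                for (int x = x0; x < x1; ++x)
                    if (depth[y * w + x] > z)   // this cell can see past the occluders
                        return false;
            return true;
        }
    };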

Q: Don't games like prince-of-persia and halo-2 have blur/glow effects in them already?

A: Yes they do. For the curious, they achieve this by rendering to a texture (in video memory of course), manipulating that texture with gpu operations, and then drawing a single textured primitive to act as a virtual 'screen'. Since all the computation is performed on the gpu, there is no readback across the agp bus. Glow for example can be realized by an interesting mip-map trick, but I won't go into these techniques here. I'm sure you will realize however that though these clever workarounds do work, the whole situation vastly limits what you can do.
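
For those who want to picture it, the workaround has roughly this shape (an OpenGL sketch of the general idea only, not how any particular game implements it; the important part is that the copy stays entirely in video memory, so nothing crosses the agp bus):

    #include <GL/gl.h>

    // Copy the freshly rendered backbuffer into a texture, then draw it back
    // as one big textured quad -- the virtual 'screen'. glCopyTexImage2D is a
    // card-memory-to-card-memory copy, so there is no readback over the bus.
    // Assumes a GL context, an identity projection, and power-of-two
    // width/height (a restriction on most of today's hardware).
    void postProcessOnGpu(GLuint screenTex, int width, int height)
    {
        glBindTexture(GL_TEXTURE_2D, screenTex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 0, 0, width, height, 0);

        // any further passes (blur, glow mip-map tricks, etc.) happen here,
        // entirely on the gpu; then show the result on a fullscreen quad
        glEnable(GL_TEXTURE_2D);
        glBegin(GL_QUADS);
        glTexCoord2f(0, 0); glVertex2f(-1, -1);
        glTexCoord2f(1, 0); glVertex2f( 1, -1);
        glTexCoord2f(1, 1); glVertex2f( 1,  1);
        glTexCoord2f(0, 1); glVertex2f(-1,  1);
        glEnd();
        glDisable(GL_TEXTURE_2D);
    }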

Hence the reason agp8x is not fully utilized in current games. It's not that we can't find a way to put it to use; it's that these advanced techniques would cripple our framerate. And there would be no difference between a gf2mx and a r9800xt: both would run in the 'tens' of frames per second.

So while you are correct that PCIe will mean relatively little to the gamer in the near future, don't make it out to be a useless feature; it is actually quite exciting ;-)

Here's another advantage that won't be realized for a while. Current games must use video memory very carefully, to ensure that all the data for a level will fit onto the card. If I try to display 160mb of textures on a 128mb card, I will incur a massive (or minor, depending on the scene) performance hit as textures keep swapping between video and system ram. So the reason the 128mb radeon and the 256mb radeon pro perform identically in unreal tournament is that the game was carefully tuned to fit inside that 128mb slot. You can be sure that Epic would have no trouble filling the 256mb card... PCIe should greatly minimize the effects of texture thrashing, but once again we can't rely on this functionality for years, until the 'base' target has upgraded from AGP.
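
If you want to see where numbers like that come from, it's just back-of-the-envelope arithmetic, something like this (the texture counts here are made up purely for illustration, and real games complicate things with compression):

    #include <cstdio>

    // Rough vram footprint of one uncompressed 32-bit texture: width * height
    // * 4 bytes, plus about a third extra for the mip-map chain.
    double textureMB(int width, int height)
    {
        double base = double(width) * height * 4.0;
        return (base * 4.0 / 3.0) / (1024.0 * 1024.0);
    }

    int main()
    {
        // made-up level: 20 textures at 1024x1024 plus 40 at 512x512
        double total = 20 * textureMB(1024, 1024) + 40 * textureMB(512, 512);
        std::printf("approx. texture footprint: %.0f MB\n", total);   // ~160 MB
        return 0;
    }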

This whole issue, by the way, is similar to statements like "I see no difference between my 32-bit pentium and 64-bit athlon" and "Why do we need pixel shader 2 (or even 3 now!) when barely anything out there even uses 1!". Trust me, there are good things being made with shader versions 2 and even 3, but developers get them at the same time you do. We are not all Carmacks and Sweeneys. And even then, the value pc dictates what a game can or cannot do. It is not as simple as flipping a switch; these kinds of issues have to be settled at the very beginning of a long development cycle.

Hope that clears up a few misconceptions!
 
man, that has to be the most cogent statement I have read on what is good about pci-e.
 
Sadly, knowing the truth hurts more because I can't afford one right now. DAMN YOU!!! I was happy in my ignorance.
 
Originally posted by NeoNemesis
Sadly, knowing the truth hurts more because I can't afford one right now. DAMN YOU!!! I was happy in my ignorance.
From reading his post though, the good thing is you don't have to, as the advantages of PCI-E won't be realized for a while.
 
great post man. you actually gave me some thoughts about a 512mb video card (yeah i know it sounds insane) and pci-e (which i know i cant afford to upgrade every other part in my computer for this)

you sure cleared my doubts about pci-e and high amounts of memory on a video card.

if only i had the $$$ :(
 
While most of the information is true, what was left out is the current generation of cards. They may be released with a PCIe option, but by the time games come around that actually require that much bandwidth on the interface, these cards will be grossly underpowered, so there's not much point in getting one of these PCIe cards for gaming in this family of GPUs.
 
So let me get this right.
When Lodi_a says UT, are you talking about UT2K4 for fitting textures in the 128mb slot?
And what about D3 and HL2? Do they have high textures that will use up more than 128mb if you have it?
 
Originally posted by NoGodForMe
So let me get this right.
When Lodi_a says UT, are you talking about UT2K4 for fitting textures in the 128mb slot?
And what about D3 and HL2? Do they have high textures that will use up more than 128mb if you have it?

According to carmack, with all the doom 3 settings cranked up it will use approx 98megs of memory for textures.
 
I read something on the net a few months ago that showed a roadmap of PCI-E bus speeds and how they won't actually ramp up and show real improvements over 8X AGP until next year. Even with lodi_a's points, how long will it be before we (consumers/gamers) actually notice performance gains by moving to PCI-E graphics with 512MB onboard memory? It would seem to me that we are at least 2 years away from seeing games that would benefit from the extras being offered.
 
Originally posted by Chris_B
According to carmack, with all the doom 3 settings cranked up it will use approx 98megs of memory for textures.

for some reason, it doesn't feel like doomIII will be big on textures, just from looking at screen shots. hl2 seems like the texture hog.

but i could be wrong :p
 
intercollector:

I agree. The type of gamer that visits hardocp (that is, one who upgrades their video card at least once every two years) won't really benefit from these bleeding-edge technologies. I just wanted to explain why PCIe (and other up-and-coming technologies like pixel shader 3.0 and 64-bit computing) aren't immediately applicable to games _right now_. That's not to say that they aren't applicable to games in general; it's more an issue of a young market.

NoGodForMe:

Don't quote me on this, but I believe face3 from ut2k3 at high resolutions with everything maxed will use about 160mb of vram. Remember that this video memory also includes two framebuffers, which can grow fairly large at high settings. This isn't a huge performance penalty, mind you, but this is one scenario that should show up in benchmarks (usually an equivalent 256mb card will bench a tad slower, due to looser memory timings). I of course would have no idea what D3 and HL2 will require ;-)
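
To put rough numbers on the framebuffer part (back-of-the-envelope only): at 1600x1200 with 32-bit colour, each buffer is 1600 x 1200 x 4 bytes, roughly 7.3mb, so the front buffer, back buffer, and a 32-bit depth/stencil buffer already eat around 22mb before a single texture is loaded, and antialiasing multiplies the back and depth buffers on top of that.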
 
In response to:

Q: Don't games like prince-of-persia and halo-2 have blur/glow effects in them already?

A: Yes they do. For the curious, they achieve this by rendering to a texture (in video memory of course), manipulating that texture with gpu operations, and then drawing a single textured primitive to act as a virtual 'screen'. Since all the computation is performed on the gpu, there is no readback across the agp bus. Glow for example can be realized by an interesting mip-map trick, but I won't go into these techniques here. I'm sure you will realize however that though these clever workarounds do work, the whole situation vastly limits what you can do.

Pulling video memory down to the CPU is a waste except for video capture/screenshots. The CPU is so much slower than a GPU (by something like an order of 30x), that even something not exactly efficient on the GPU is still going to be a whole lot faster, PLUS it will leave the CPU free to do real tasks (i.e. non-rendering things such as AI and physics). What the newer bus will do is make it easier for future cards to page resources out of video memory and back into system ram efficiently while continuing to render normally. Think: GPU memory becomes virtual memory, and its swapfile will be normal system RAM.
 
Amazingly informative post, lodi_a. Does anybody else feel that it's a shame that the bar is set by the weakest available hardware and not the most potent? These technologies will bring about very important steps forward, but buying a PCI-Express card with 256mb of RAM at a time like this would be like buying a gun that nobody manufactures bullets for.
 
That was probably the most coherent post i have seen in a while

wow, that has changed my mind about PCI-E greatly, and now i hope ATI's x880 will lay down the whooping on the 6900U while requiring less power
 
you sure cleared my doubts about pci-e and high amounts of memory on a video card.

You guys are missing his point, though.

In order for the change to be relevant, the game must NEED more than 128mb of video memory. If the game only NEEDS 98mb, then it won't matter if you have a 256mb card or a 512mb card or PCIe. It doesn't need it, so it just won't matter.

Games won't *start* being designed to require that much memory until EVERYONE has it. Or at least, a significant part of the target audience.

That is still years away.

So, while lots of ram or PCIe are fun and dandy features now... they shouldn't be a purchasing decision *yet*.
 
Thanks for the info. I agree with everyone this is a very informative post.
 
Zoner:

"The CPU is so much slower than a GPU (by something like an order of 30x), that even something not exactly efficient on the GPU is still going to be a whole lot faster..."

This assumes that the operation I want to perform is _possible_ entirely on the gpu, and that the gpu can perform it more efficiently in the first place. Post-processing "filters" such as blur and depth of field are highly parallel, therefore it's a good idea to keep them on the gpu. However, while gpu's are becoming flexible enough to allow for more general-purpose computing, they still have some very obvious limits. My example above demonstrates why AGP prevents me from efficiently using the gpu in parallel with the cpu in a high-level occlusion calculation. This also has huge digital video implications.

"Think: GPU Memory becomes virtual memory, its swapfile will be normal system RAM."

This is how vram already operates in modern video cards; PCIe will make the process more efficient. The really interesting thing to watch out for here is vram 'paging'. That will really allow developers to bump up the size of textures in games.

Thanks for the compliments all. :)
 
DAMMMMNN!!! This is one of the better posts that I have seen on this forum. Very informative! Now I actually understand some of this jajrgha hardenadad... (whatever that is) :)
 
Originally posted by lodi_a
Zoner:

"The CPU is so much slower than a GPU (by something like an order of 30x), that even something not exactly efficient on the GPU is still going to be a whole lot faster..."


now this i don't understand... why are they so much faster than cpus? 30X!!!!!???
2 or 3 i can believe/understand, but 30?
does it have something to do with the gpu's fewer pipelines or what?
 
This assumes that the operation I want to perform is _possible_ entirely on the gpu, and that the gpu can perform it more efficiently in the first place. Post-processing "filters" such as blur and depth of field are highly parallel, therefore it's a good idea to keep them on the gpu. However, while gpu's are becoming flexible enough to allow for more general-purpose computing, they still have some very obvious limits.

Really important point to keep in mind when people start trumpeting teh pow4h of their GPU in doing a limited subset of the total types of calculations done by a CPU.
 
They're not, exactly; it's just that GPUs have a more specific architecture that's optimized to handle video/3D-related ops. CPUs by nature have to handle a wider range of ops and hence have a more general-purpose architecture (with corresponding tradeoffs in various situations).

Exact same deal with digital signal processors.
 
Originally posted by dderidex
You guys are missing his point, though.

In order for the change to be relevant, the game must NEED more than 128mb of video memory. If the game only NEEDS 98mb, then it won't matter if you have a 256mb card or a 512mb card or PCIe. It doesn't need it, so it just won't matter.

Games won't *start* being designed to require that much memory until EVERYONE has it. Or at least, a significant part of the target audience.

That is still years away.

So, while lots of ram or PCIe are fun and dandy features now... they shouldn't be a purchasing decision *yet*.

... and games won't NEED more video ram if they continue to cater to the lowest common denominator. Maybe if the games moved a little faster, Skippy with his gf2 or mx400 would get a real card.
 
Originally posted by CleanSlate
I'm wondering too why cpus are so much less powerful than gpus

~Adam
Think of GPU vs CPU more in terms of parallel processing: GPU vs SMP.
The GPU has multiple pipes, multiple specific parts to do specific tasks, all in unison. A CPU can't render the next image filter in Paint Shop Pro until the first one is finished, to have a base for the next step to build on. It can't start working on step 2 until step 1 is done.

The GPU is built more on step 1a, 1b, 1c, 1d, etc... in unison, then step 2a, 2b, 2c, all on top of the previous rendering. Most things are done in parallel on GPUs, versus the linear nature of standard applications.
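
A toy example in code, if it helps (purely illustrative, not how a driver actually schedules anything):

    // Toy per-pixel brightness filter. Every iteration is independent:
    // pixel i never needs the result of pixel i-1. A CPU walks this loop
    // serially; a GPU effectively runs the loop body on many pixels at once
    // across its parallel pipelines, which is where the big speedup comes from.
    void brighten(unsigned char* pixels, int count, int amount)
    {
        for (int i = 0; i < count; ++i)
        {
            int v = pixels[i] + amount;
            pixels[i] = (v > 255) ? 255 : (unsigned char)v;
        }
    }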
 
Originally posted by Chris_B
According to carmack, with all the doom 3 settings cranked up it will use approx 98megs of memory for textures.

Let me get this straight: Carmack himself says that D3 in full-tilt-boogie mode will use 98 MB for textures.

The typical ATI or nVidia graphics accelerator on the BOTTOM END (aka "Value Leader") has 128 MB, while most middle and high-end cards have 256! (Further note that the DX9 AIWs all have 128 MB with the sole exception of the 256 MB AIW 9800 Pro).

That pretty much means that while D3 will tax *some* GPUs (mostly those in the GF4/R25x camp), D3 isn't even going to come *close* to taxing the current crop of DX 9 VPUs (GeForce FX, ATI's R3xx).
Heck, look at the current reigning champs of PC gaming: Battlefield: Vietnam, UT 2004, and FarCry. How taxing are *any* of those on a typical VPU sold today? And we don't have the *9x vs. NT* dilemma that was present when Unreal Tournament shipped (I designed and built the UT Surprise Box based on Intel752 integrated graphics, Windows 2000 Professional, a Celeron 500, and a Sound Blaster Live! X-Gamer 5.1, and it would regularly *demolish* faster PCs running 98 SE.)

The last game that seriously taxed modern graphics hardware was (I hate bringing this up!) *HALO*, and it was roundly cursed for it. The same applied (in their day) to AquaNox and Ballistics.

And Kyle is right: AGP 8X isn't (and never will be) fully utilized. In fact, name *any* game that fully utilized the AGP specification available at the time. (I can only think of *one* that even remotely came close.)
 
Originally posted by PGHammer
And Kyle is right: AGP 8X isn't (and never will be) fully utilized. In fact, name *any* game that fully utilized the AGP specification available at the time. (I can only think of *one* that even remotely came close.)

What's this one game?
 
There is more to taxing the card than filling up its memory; I think that stencil ops for shadow calculations will slow the game down plenty.
 
Originally posted by 101
... and games won't NEED more video ram if they continue to cater to the lowest common denominator. Maybe if the games moved a little faster, Skippy with his gf2 or mx400 would get a real card.

The LCD in graphics cards these days *is* 128 MB (if you're not talking integrated graphics). It's only after you reach the middle that 256 MB cards start to appear. And even with that, except for your more demanding titles (and not even most of those) how many are going to take advantage (real advantage) of what DX 9 has to offer?

Let's look at one game that's been around for a while that is seriously accused of being a system pig (HALO: Combat Evolved).
For quite a while, those of us with higher-end ATI VPUs (9700/9800 series) thought there was a problem with the ATI drivers...but then someone realized that the game was using the old DX 8 optimizations for pixel shaders by default! To fully utilize the power we had bought our cards *for* we had to add the "/use20" switch to our timedemos and game starts.

Then it became "OhMyGAWWWWD... I didn't think Halo looked THAT good!" You wound up with a MUCH more graphically rich game (that in fact put the original on XBOX to complete and utter *shame*) and got even better performance in the bargain.

Do we even know what Doom3 is slated to use for a graphics engine? (UT 2004 uses the slick DX9-based UnrealWarfare II engine on the Win32 side.)
 
there are almost no shaders in ut2k4, it's all fixed function, and doom3 will use the doom3 engine
 
Originally posted by lodi_a
My system effectively pauses for entire milliseconds while this happens.

Gee, to think of a time when 'entire milliseconds' was still a short period of time... :p

Very good explanation/clarification by the way :)
 
Originally posted by PGHammer
Do we even know what Doom3 is slated to use for a graphics engine?

ID usually makes engines that other games license, and I am certain this is a new engine that they made...
 
Originally posted by PGHammer

Do we even know what Doom3 is slated to use for a graphics engine? (UT 2004 uses the slick DX9-based UnrealWarfare II engine on the Win32 side.)

It uses the Doom3 engine. John Carmack writes his own engines.
 
Thread Of The Year

Still though, I like putting that 6800Ultra sticker on my case...
it's still gonna be better than my Ti4600 :D
 
Very good post, but you didn't touch upon the actual reason why they're upping the bandwidth so much, and that's HDTV encoding. When PCI-E is released this will allow for much faster encoding, since it won't be held back by the bandwidth anymore (I don't know much about the topic, so I don't know if it will effectively fill the bandwidth).

And I remember reading something about some Stanford students running specially designed code on their GPU instead of the CPU. Supposedly it ran at an equivalent of a 10GHz Intel. The only problem was, the code was 10x harder to write and much more limited.
 
Originally posted by obyj34
Very good post, but you didn't touch upon the actual reason why they're upping the bandwidth so much, and that's HDTV encoding. When PCI-E is released this will allow for much faster encoding, since it won't be held back by the bandwidth anymore (I don't know much about the topic, so I don't know if it will effectively fill the bandwidth).

And I remember reading something about some Stanford students running specially designed code on their GPU instead of the CPU. Supposedly it ran at an equivalent of a 10GHz Intel. The only problem was, the code was 10x harder to write and much more limited.

whoa...crazy stuff
try to find that article for us man
that is some interesting stuff :D
 
Originally posted by PGHammer
The LCD in graphics cards these days *is* 128 MB (if you're not talking integrated graphics). It's only after you reach the middle that 256 MB cards start to appear. And even with that, except for your more demanding titles (and not even most of those) how many are going to take advantage (real advantage) of what DX 9 has to offer?

Let's look at one game that's been around for a while that is seriously accused of being a system pig (HALO: Combat Evolved).
For quite a while, those of us with higher-end ATI VPUs (9700/9800 series) thought there was a problem with the ATI drivers...but then someone realized that the game was using the old DX 8 optimizations for pixel shaders by default! To fully utilize the power we had bought our cards *for* we had to add the "/use20" switch to our timedemos and game starts.

Then it became "OhMyGAWWWWD... I didn't think Halo looked THAT good!" You wound up with a MUCH more graphically rich game (that in fact put the original on XBOX to complete and utter *shame*) and got even better performance in the bargain.

Do we even know what Doom3 is slated to use for a graphics engine? (UT 2004 uses the slick DX9-based UnrealWarfare II engine on the Win32 side.)
so, um, what exactly do I need to do to get Halo to run better?
 