Why don't we have video cards that offer free FSAA yet?

Liekomg

I mean seriously, the quality difference between no AA and 4xAA is huge, especially in games that have a lot of small details/use a lot of polygons like Oblivion. Having 4xAA on in Oblivion makes things look almost photorealistic indoors. 3dfx started the AA bandwagon like 6 years ago, yet we still don't have a card that can offer FSAA without a performance hit!

Why doesn't ATI or Nvidia create some special circuitry in their chips for the sole purpose of doing FSAA so that there is no performance penalty?
 
because the FSAA patent requires payment of $0.0001/pixel.
 
Quite frankly, the latest cards from Nvidia and ATI are quite capable of putting out 4xAA in all the latest games, including Oblivion. Hell, the ATI cards can do HDR + AA at decent performance levels. Don't see your problem?
 
pandora's box said:
Quite frankly, the latest cards from Nvidia and ATI are quite capable of putting out 4xAA in all the latest games, including Oblivion. Hell, the ATI cards can do HDR + AA at decent performance levels. Don't see your problem?
QFT... unless he wants 4xAA built in plus 4xAA in the drivers... giving us 16xAA!
 
pandora's box said:
Quite frankly, the latest cards from Nvidia and ATI are quite capable of putting out 4xAA in all the latest games, including Oblivion. Hell, the ATI cards can do HDR + AA at decent performance levels. Don't see your problem?

Obviously they can DO FSAA in the latest games. But I'm talking about having zero performance hit for turning it on, not 30% slower with 4xAA, especially at higher resolutions. Considering some of these video cards cost $400+, you'd think by now they'd let us turn on FSAA without a performance penalty.
 
Yeah...why not have cars that get free mileage as well?

If you can design a card that can do it... then you can complain about the card manufacturers.
 
Honestly, 30% slower for effectively increasing the quality of the image four times is pretty damn impressive.

"Back in the 3dfx days", super-sample anti-aliasing would take an 80% (or worse) performance hit to enable 4xAA.
 
dderidex said:
Honestly, 30% slower for effectively increasing the quality of the image four times is pretty damn impressive.

"Back in the 3dfx days", super-sample anti-aliasing would take an 80% (or worse) performance hit to enable 4xAA.


Dejagging polygon edges makes the image 4x as good?

What asshole did you pull that load of shit from?
 
Coldtronius said:
Dejagging polygon edges makes the image 4x as good?

What asshole did you pull that load of shit from?
I agree; seeing smooth edges makes the image more than 4x as good.
 
Free AA can be had, but it will cost you. They'd need to use embedded memory, and that stuff ain't cheap. You won't see that anytime soon on a desktop video card. So until then there will always be a performance hit for enabling AA; it requires memory bandwidth to local memory.
 
Liekomg said:
I mean seriously, the quality difference between no AA and 4xAA is huge, especially in games that have a lot of small details/use a lot of polygons like Oblivion. Having 4xAA on in Oblivion makes things look almost photorealistic indoors. 3dfx started the AA bandwagon like 6 years ago, yet we still don't have a card that can offer FSAA without a performance hit!

Why doesn't ATI or Nvidia create some special circuitry in their chips for the sole purpose of doing FSAA so that there is no performance penalty?

Am I the only one who thinks this is sort of a dumb question? It's like trying to defy the laws of mathematics...
 
Brent_Justice said:
Free AA can be had, but it will cost you. They'd need to use embedded memory, and that stuff ain't cheap. You won't see that anytime soon on a desktop video card. So until then there will always be a performance hit for enabling AA; it requires memory bandwidth to local memory.

Alright, so I'm confused. How/why would AA require extra local memory bandwidth? By local memory do you mean system memory or the memory on the gfx card? Shouldn't the gfx card already have the meshes, textures, etc in its memory?

I thought the performance hit was due to supersampling/multisampling as the pixel shaders render each pixel multiple times at different locations within the pixel.

Oh... one more. What do you mean by "embedded" memory? SRAM? Memory on the GPU die itself?
 
Xenos (the Xbox 360 GPU) allegedly had free 2xAA and virtually free 4xAA (~5% performance hit) due to its 10 MB of EDRAM.

In practice though, many X360 games still don't use AA, or not enough AA. You see, a 4xAA 720p framebuffer won't fit, so the screen has to be split into two or three tiles using a technique called predicated tiling that basically has to be built into the game engine from the ground up. In turn, this introduces a certain overhead related to geometry that straddles tile boundaries, so it's some sort of non-trivial performance hit, though I have no idea how much.

In other words, it's a pain in the ass. But as the tiling engines get better, it might get better.

Also, PGR3 on X360 supposedly rendered at 600p, which was then upscaled to 720p, and used 2xAA. The advantage of this is that the framebuffer fits entirely in the 10 MB EDRAM without tiling. It's the only game I know of that resorted to this, though.

The ideal would of course be to have so much EDRAM you didn't have to mess with tiling, but as high as PC resolutions go, this would be an absurd amount. I think one of the higher PC resolutions uses 28 MB for the AA'd framebuffer, so you'd need more than that.
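Just to put some rough numbers on that (my own back-of-the-napkin math, assuming 8 bytes per sample for 32-bit color plus 32-bit depth/stencil, and guessing 1024x600 for the PGR3 case):

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024   # Xenos daughter die: 10 MB of EDRAM
BYTES_PER_SAMPLE = 4 + 4         # assumed: 32-bit color + 32-bit depth/stencil per sample

def framebuffer_fit(width, height, samples):
    """Return (framebuffer size in MB, tiles needed to fit it in the EDRAM)."""
    size = width * height * samples * BYTES_PER_SAMPLE
    return size / (1024 * 1024), math.ceil(size / EDRAM_BYTES)

cases = [("720p, no AA",             1280, 720,  1),
         ("720p, 2xAA",              1280, 720,  2),
         ("720p, 4xAA",              1280, 720,  4),
         ("PGR3-ish 1024x600, 2xAA", 1024, 600,  2),
         ("1600x1200, 4xAA (PC)",    1600, 1200, 4)]

for label, w, h, s in cases:
    mb, tiles = framebuffer_fit(w, h, s)
    print(f"{label}: {mb:.1f} MB -> {tiles} tile(s)")
```

That's why 720p with 4xAA lands in two-to-three-tile territory, while a PGR3-style buffer just squeaks in under 10 MB without tiling, and a high PC resolution blows way past it.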

It's also a huge transistor cost. On Xenos, ~80 million of its 338 million transistors go to the 10 MB EDRAM. The problem then becomes that this leaves less transistor budget for everything else, such as shading power.

I kinda think EDRAM is a bad idea in Xenos, because they could have added more shader power instead. But it sort of makes sense in that space for a couple of reasons. One, both Sony and MS were limited to 128-bit buses for their next-gen consoles, because 256-bit buses apparently limit the future die shrinks you can do. And that's a big deal for consoles that are gonna be $99 one day. So they basically had a starting point of 128-bit buses in both cases. Sony dealt with it by basically using two 128-bit buses and in turn two 256 MB banks of RAM, one for the CPU and one for the GPU. It'll help some, but it's nowhere near as good as one 256-bit bus, because the framebuffer uses most of the bandwidth and it can't be split.

MS dealt with it by using the EDRAM.

The other reason is that in PCs, EDRAM is apparently a chicken-and-egg problem: if you had a PC GPU that used predicated tiling, it would have to be supported in the game engines, and game developers might not support it, and even if they did, it would take a long time for it to be fully supported. The X360, being a console, can of course target things like that much more narrowly.

And even if it WAS supported, I'm not sure you'd want it. There's no 128-bit bus limitation on high-end graphics cards, so it's not really necessary. It's a massive investment of transistors that could go to more power instead.

Also, it's worth noting the AA performance hit on current ATI cards is pretty small. A lot smaller than on Nvidia cards, anyway. Of course, this is one of the reasons ATI's dies are almost twice as big...
 
Coldtronius said:
Dejagging polygon edges makes the image 4x as good?

What asshole did you pull that load of shit from?

Actually, it does. I didn't do much AA myself, preferring to go with higher res instead for my FPS games (you seriously don't need AA for them). But for anything slower paced, it's a must! If you play RTS, RPG, or even racing games (especially the ones with replays), the jagged lines are crap! The moving stair-steps on gently swaying leaves are horrible, especially if you have a screen full of tiny trees.

When I saw the difference after trying Colin McRae 2K4 at 4x, I never went back.

Alright, so I'm confused. How/why would AA require extra local memory bandwidth? By local memory do you mean system memory or the memory on the gfx card? Shouldn't the gfx card already have the meshes, textures, etc in its memory?

Because AA is sort of like a post process. You have to average several samples of the original (higher-resolution) image to determine each final pixel's color, and you have to store all that extra data somewhere.
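Here's a tiny sketch of that averaging (resolve) step for old-school ordered-grid supersampling, in Python/NumPy. This is just my own illustration, assuming "4xAA" means a 2x2 grid of subsamples and a plain box filter; real hardware multisample resolves are fancier than this:

```python
import numpy as np

def resolve_ssaa(samples: np.ndarray, factor: int = 2) -> np.ndarray:
    """Box-filter resolve for ordered-grid supersampling.

    `samples` is an (H*factor, W*factor, 3) float array rendered at `factor`
    times the target resolution in each axis; the result is the (H, W, 3)
    image where each output pixel is the average of its factor*factor
    subsamples.
    """
    h, w, c = samples.shape
    assert h % factor == 0 and w % factor == 0
    # Group each factor x factor block of subsamples, then average the block.
    blocks = samples.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

# Example: "4xAA" as a 2x2 grid, so a 1280x720 target needs a 2560x1440 render.
hi_res = np.random.rand(1440, 2560, 3).astype(np.float32)
final = resolve_ssaa(hi_res, factor=2)
print(final.shape)  # (720, 1280, 3)
```

Point being: for 4x you're shading and storing four times the samples before this averaging ever runs, which is where the performance and memory cost comes from.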
 
Sharky974 said:
You see, a 4xAA 720p framebuffer won't fit, so the screen has to be split into two or three tiles using a technique called predicated tiling that basically has to be built into the game engine from the ground up.
Seems a bit strange to go with the EDRAM (10MB of it) when the 12x7 framebuffer won't fit into the embedded memory. I suppose ATi/Microsoft went this route under the assumption that developers would support it. Woops.

So what use are those 80 million transistors when there's no antialiasing used at all? That's a pretty big transistor paperweight if they're utterly unused.
 
Alright, so I'm confused. How/why would AA require extra local memory bandwidth? By local memory do you mean system memory or the memory on the gfx card? Shouldn't the gfx card already have the meshes, textures, etc in its memory?

local memory = memory on the board

I thought the performance hit was due to supersampling/multisampling as the pixel shaders render each pixel multiple times at different locations within the pixel.

and storing it as well

Oh... one more. What do you mean by "embedded" memory? SRAM? Memory on the GPU die itself?

Embedded RAM on the die itself: short, fast, works much like cache sorta does, only instead of having to go over the memory bus to store the data, it's on the die instead.

So what use are those 80 million transistors when there's no antialiasing used at all? That's a pretty big transistor paperweight if they're utterly unused.

This is the case with just about every piece of hardware. Are the other two CPU cores being used? What about the PS3's SPEs when it's released? The fact is, it's there to be used when developers want to use it. I'd rather have it and have it go unused than not have it and have developers want it later down the road.

Also take note, we did have a technology available to us, Quincunx, which was supposedly free 4xAA supported by the GF3 and GF4 series, but I don't think anything ever officially supported it, possibly because it didn't give the results Nvidia wanted it to.

And asking for something for free just because it's been out for so long doesn't mean it should have been implemented. AF has been an idea forever, and it still gives us performance hits. And seeing how these cards can support resolutions up to 1920x1200, I doubt AA would play that big of a role in your overall experience at those resolutions.
 
phide said:
Seems a bit strange to go with the EDRAM (10MB of it) when the 12x7 framebuffer won't fit into the embedded memory. I suppose ATi/Microsoft went this route under the assumption that developers would support it. Woops.

They will; they almost have to. It's a console, and Microsoft can require a lot of things. I believe MS requires some type of AA, but it's open to interpretation; a lot of motion blur might qualify, for instance. Console games have to pass detailed checklists called TRCs, or Sony or Microsoft have the right to not even approve them.

I'm not sure what the AA situation is across X360 games, but I suppose most use AA. I mean, I don't check every game.

And again, it does seem odd that they went with exactly 10 MB, but the amount needed to fit 720p with 4xAA without tiling is a lot more, so again it comes down to cost.

It's not that developers don't or won't support it... it's more that it's apparently more of a pain in the ass than MS first let on when they talked up "free" 2x and 4xAA on the Xbox 360. It's more like "well, it's kinda free, but you have to do this, this, and this first." Mainly you have to use predicated tiling, and even beyond that, there is some sort of geometry hit with predicated tiling.

But as I understand it, as the tools get better, this will get better...

So what use are those 80 million transistors when there's no antialiasing used at all? That's a pretty big transistor paperweight if they're utterly unused.

True, but you have to understand the context. They're limited to a 128-bit bus, just like Sony is. But the EDRAM kinda removes that limitation, because it has 256 GB/s of internal bandwidth or something. I'm not that up on it, but all the framebuffer work that eats bandwidth gets done in there. Then they write the result back, and the writeback is very small.

Basically the chip in the PS3 is a 7900 GTX with a 128-bit bus like a 7600 GT. So you can see why MS wanted to avoid that situation...

You could probably say the Xbox 360 will be less bandwidth-limited than the PS3 because of the EDRAM. The tradeoff is that the PS3 will likely have more shading power...

Whose call was correct? Who knows. So far it seems to me the 360 is keeping up pretty well with the PS3 graphically...
 
Hmm, I can play most games at 4xAA with no problem :D

AA is nice, I hate jaggies, but I can't stand blurry textures even more, so don't forget about anisotropic filtering while you're at it. (Looks like MS forgot about it...)

It would be nice to have 1024x AA in games (I think that's what CGI movies use) plus infinite anisotropic filtering on the highly detailed micro-textures that make up what we currently think of as single pixels on massive textures.

Hopefully it will happen sometime before I kick the bucket!
 
My question comes down to this: How does adding more memory (even if it is on the GPU die) give you AA for free?

No matter what you add to the card (or even the GPU die itself), I still see a performance hit. Add more memory to speed up AA modes, the card would use it to speed up non-AA modes (or store more mesh or texture detail). Add more pixel shaders to speed up AA modes, the card would use them to speed up non-AA modes. There is still a performance hit for AA.

The Xbox360 may have "free" AA, but it can only output 30fps max to a television. I suspect pixel shaders are going unused and there's some free memory on that GPU if AA is turned off. Might as well turn on AA and use that silicon.

This would be an interesting experiment: run an older game on your 7900GTX at 1280x720 with no AA and lock the framerate below 30fps. I bet you'll have unused resources on the GPU, too. Turn on AA, and watch the framerate stay right where it was. Poof! Free AA!
 
Think of it this way...

Never turn off 4xAA and instantly, it's free. You will never have a benchmark to compare non-AA performance to, so it gives you the impression of "free" AA.
 
Sharky974 said:
It's not that developers don't or won't support it... it's more that it's apparently more of a pain in the ass than MS first let on when they talked up "free" 2x and 4xAA on the Xbox 360.
Typical developer laziness :) Seems to be an upward trend these days. Implementing predicated tiling as you've described is no doubt a Richard Nixon (tricky), but there seems to be little real motivation in the industry to attempt to improve engine/renderer capabilities and optimize on a per-platform basis. The trend seems to be "take an off-the-shelf solution and get by". More than a little disconcerting, but do console gamers really care? I don't think I've ever read an article or console publication that mentions antialiasing, so I'm not sure it's a concept that really exists in the consumer mindset right now.

I think Sony probably made the right decisions with the PS3, but one look at the price tag is enough to make any hardcore console gamer faint. The question is, what kind of AA are we going to be looking at with that platform, are we getting aniso, and do console gamers care?
 
bassman said:
My question comes down to this: How does adding more memory (even if it is on the GPU die) give you AA for free?

No matter what you add to the card (or even the GPU die itself), I still see a performance hit. Add more memory to speed up AA modes, the card would use it to speed up non-AA modes (or store more mesh or texture detail). Add more pixel shaders to speed up AA modes, the card would use them to speed up non-AA modes. There is still a performance hit for AA.

The Xbox360 may have "free" AA, but it can only output 30fps max to a television. I suspect pixel shaders are going unused and there's some free memory on that GPU if AA is turned off. Might as well turn on AA and use that silicon.

This would be an interesting experiment: run an older game on your 7900GTX at 1280x720 with no AA and lock the framerate below 30fps. I bet you'll have unused resources on the GPU, too. Turn on AA, and watch the framerate stay right where it was. Poof! Free AA!

The 360 is not limited to 30 FPS... how would that even work? I know Forza 2, I think, is said to be 60 FPS... in development.

The 360 is no different from any other GPU in that respect. You could make a game run at 60 or 120... but of course the graphics will be scaled back.

The whole deal is that the EDRAM and ROPs are on a daughter die... if you've ever seen a pic of Xenos, it has two dies, a big one and a little one... the little one looks about like an old AMD Barton CPU or something. All the shaders are on the big die; the EDRAM, ROPs, and I guess some logic to do "free" AA are on the little die. And they're connected by a 32 GB/s bus.

So I don't really know how it works, but everything needed to do "free" AA is supposed to be built into the EDRAM.

It's kind of an awesome design, really... the only glitch IMO is the tiling.

But where I learned all of this, so I can pretend I'm an expert, is here:

http://www.beyond3d.com/articles/xenos/

I don't see this happening in PC GPUs...

Brent said it best... you can have free AA, but it'll cost you.
 
Sharky974 said:
The 360 is not limited to 30 FPS... how would that even work? I know Forza 2, I think, is said to be 60 FPS... in development.

I was wrong in saying 30fps... HDTV allows 60fps progressive. It's a limitation of the output format. Hook your console up to a standard-def TV and you won't get any higher than 59.94fps interlaced (the same pixel rate as progressive frames at 29.97fps). Hook the console up to your HDTV and you won't get any higher than 60fps progressive. I don't know if HDTVs with DVI would support faster framerates, so the limit may not apply in that case.


Sharky974 said:
So I don't really know how it works, but everything needed to do "free" AA is supposed to be built into the EDRAM.

That's the question I'm asking. I want to know *how* more memory would give free AA. Brent's post sounded like you could just add memory or up the memory bandwidth and you magically get AA with no performance penalty.

I don't see adding regular EDRAM doing it - it's just DRAM with a fast SRAM cache. The link you provided gives the answer: the EDRAM module in Xbox360 is more than just EDRAM. It contains extra logic for performing alpha blending, z-buffering, and MSAA. It's not the EDRAM itself that speeds up AA - it's the dedicated MSAA logic in the EDRAM module (which must be idle in non-AA modes). That makes sense now. Thanks for the link, Shark.
 
bassman said:
My question comes down to this: How does adding more memory (even if it is on the GPU die) give you AA for free?

Let me put it to you this way. On the Xbox 360, which uses eDRAM:

eDRAM = 256 GB/sec of bandwidth

Desktop video card local memory bandwidth for an X1900 XTX = 49 GB/sec
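A rough back-of-the-envelope in Python to put those two numbers side by side. The bytes-per-sample, overdraw, and read/write factors are purely my guesses, so treat the output as an order-of-magnitude illustration, not real measurements:

```python
def fb_traffic_gb_per_s(width, height, samples, fps,
                        bytes_per_sample=8,  # guess: 32-bit color + 32-bit Z per sample
                        overdraw=3,          # guess: each pixel touched ~3 times per frame
                        rw_factor=2):        # guess: roughly one read + one write per touch
    """Estimate per-second framebuffer traffic for multisampled rendering, in GB/s."""
    bytes_per_frame = width * height * samples * bytes_per_sample * overdraw * rw_factor
    return bytes_per_frame * fps / 1e9

for label, w, h in [("720p (console)", 1280, 720), ("1920x1200 (PC)", 1920, 1200)]:
    need = fb_traffic_gb_per_s(w, h, samples=4, fps=60)
    print(f"{label}, 4xAA @ 60fps: ~{need:.0f} GB/s of framebuffer traffic "
          "(vs 256 GB/s eDRAM, 49 GB/s X1900 XTX)")
```

Even before texture traffic, the eDRAM barely notices those AA samples, while on a desktop card the same traffic takes a serious bite out of the memory bus.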
 
chiablo said:
Think of it this way...

Never turn off 4xAA and instantly, it's free. You will never have a benchmark to compare non-AA performance to, so it gives you the impression of "free" AA.

LOL... that's kinda what I feel people are saying here.
 
Brent_Justice said:
Let me put it to you this way. On the Xbox 360, which uses eDRAM:

eDRAM = 256 GB/sec of bandwidth

Desktop video card local memory bandwidth for an X1900 XTX = 49 GB/sec

Again, I don't see how memory bandwidth alone gives "free" AA. Sharky's link clearly states that the chip on the Xbox 360 that contains the EDRAM also has dedicated MSAA logic.

The extra bandwidth helps AA performance, but it helps rendering in non-AA modes, too. You'd still have a performance difference between AA and non-AA if the other settings are the same.
 
Here's a good article on how multisampling works (supersampling is easier to understand, though):

http://www.3dcenter.org/artikel/multisampling_anti-aliasing/index_e.php

With multisampling, each sample position within a pixel gets its own color value, and those samples get averaged together to soften jaggies. So each extra sample level adds another color buffer's worth of video memory (width*height*bpp), and it eats memory bandwidth to move those samples back and forth. So, having a huge amount of fast memory == free AA.
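To make that width*height*bpp arithmetic concrete, here's a quick sketch (assuming 32-bit color, i.e. 4 bytes per pixel per sample; the extra depth/stencil samples would add even more on top, which I'm leaving out):

```python
def extra_color_mb(width, height, samples, bytes_per_pixel=4):
    """Extra color-buffer memory (MB) that `samples`x AA needs beyond the 1x buffer."""
    return width * height * (samples - 1) * bytes_per_pixel / (1024 * 1024)

for w, h in [(1024, 768), (1600, 1200), (1920, 1200)]:
    costs = ", ".join(f"{s}x: +{extra_color_mb(w, h, s):.1f} MB" for s in (2, 4, 6))
    print(f"{w}x{h}: {costs}")
```

And all of that has to be written out (and read back for the resolve) every frame, which is where the bandwidth part comes in.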
 
Why can't they put little fairies inside the GPU that sprinkle pixie dust on the screen to make it photorealistic without slowing down the fps?

2xAA is practically free. I can enable 2xAA in every single game I own (except Oblivion, but that's an Nvidia limitation), and that by itself makes a huge, huge difference. 4xAA is where a performance hit starts, but I can still enable that in most games.
 
bassman said:
I don't see adding regular EDRAM doing it - it's just DRAM with a fast SRAM cache. The link you provided gives the answer: the EDRAM module in Xbox360 is more than just EDRAM. It contains extra logic for performing alpha blending, z-buffering, and MSAA. It's not the EDRAM itself that speeds up AA - it's the dedicated MSAA logic in the EDRAM module (which must be idle in non-AA modes). That makes sense now. Thanks for the link, Shark.

So you basically have a co-processor that does a second pass on the image while the main GPU is rendering the next frame?
 
Sly said:
So you basically have a co-processor that does a second pass on the image while the main GPU is rendering the next frame?

That's what I understood from the article, although it wasn't clear to me if the dedicated AA logic is a frame behind the main GPU or working on the same frame.
 
bassman said:
Again, I don't see how memory bandwidth alone gives "free" AA. Sharky's link clearly states that the chip on the Xbox 360 that contains the EDRAM also has dedicated MSAA logic.

The extra bandwidth helps AA performance, but it helps rendering in non-AA modes, too. You'd still have a performance difference between AA and non-AA if the other settings are the same.

But at 256 GB/s I don't think you would be worrying about your frame rate. More than likely 100+ fps. That's assuming the GPU itself can keep up with the memory.
 
OK, I know the solution: they are now going to start making separate AA/AF cards to offload the processing from your main GPU. This is to go along with your physics card, your AI card, and your shader processing card... :p
 
Liekomg said:
I mean seriously, the quality difference between no AA and 4xAA is huge, especially in games that have a lot of small details/use a lot of polygons like Oblivion. Having 4xAA on in Oblivion makes things look almost photorealistic indoors. 3dfx started the AA bandwagon like 6 years ago, yet we still don't have a card that can offer FSAA without a performance hit!

Why doesn't ATI or Nvidia create some special circuitry in their chips for the sole purpose of doing FSAA so that there is no performance penalty?

First of all, you're rendering at subpixel accuracy (i.e., effectively rendering a higher-res image), so that's why it isn't "free." As for "special circuitry", it wouldn't be free either, now would it? You have to use transistors that would otherwise go into other things. Besides, I don't think this makes any sense... the green and red monsters would have done it already.
 
pandora's box said:
But at 256 GB/s I don't think you would be worrying about your frame rate. More than likely 100+ fps.

I agree with you - if framerate is above whatever I consider acceptable, I don't worry about it. That's the practical approach. I'm being a little more academic with my question because we're talking about "free" AA.


pandora's box said:
That's assuming the GPU itself can keep up with the memory.

That's a good way to put it. Instantaneous memory access would do nothing for performance if some other task (pixel shading, texture mapping, etc.) couldn't keep up. And on our desktop cards, since the pixel shaders are used to generate subsamples for AA, they could wind up as a bottleneck.

The Xbox360 differs because it has dedicated AA hardware. Maybe that's the route desktop cards will take. Who knows...
 
Techx said:
OK, I know the solution: they are now going to start making separate AA/AF cards to offload the processing from your main GPU. This is to go along with your physics card, your AI card, and your shader processing card... :p
My understanding is that the EDRAM needs to be either right next to the die, on top of it (a little unrealistic, but possible someday) or somewhere very close to it. If ATi could place the EDRAM somewhere on the logic board and shave the 80 million transistors from the die, they would have done so.
 
You can get "free" AA in plenty of games.....that are old.
The thing you are failing to see in the quest for "free" AA, even after years of it being available as an option, is everything else in rendering technology and features didn't just stand still and let AA take over. We would have plenty of free AA if all our current games were based on the Quake3 engine.
Oh yeah, also add in the fact people dont seem to like to play games at 1024x768 anymore.. It's gotta be 9999x9999 to look good these days.
 
-freon- said:
You can get "free" AA in plenty of games.....that are old.
Isn't AA a post-processing effect? So the amount of work required for four AA passes on a 1600x1200x32 frame would be identical whether you're running Wolf3D or FEAR?
 