HardUp4HardWare
HEXUS.beans :: ATI produce tool to increase Doom3 scores 'up to 35%' with AA enabled
Posted on Thursday, 13 October, 2005 by Ryszard
ATI Technologies have this morning made a tool available to HEXUS which supposedly improves scores in Doom3 when antialiasing is enabled. Improvements 'of up to 35%' are explicitly mentioned by sources within ATI. The tool seemingly changes the way the graphics card maps its own memory to better deal with handling AA sample data in Doom3's case.
The fix will shortly be rolled into CATALYST 5.11 according to ATI sources and a beta drop of that driver will be made available for testing in due course, before the final WHQL driver from Terry Makedon's CATALYST team is made available for public download in November.
HEXUS are in the process of testing the fix and we'll bring you some scores using the executable tool very shortly. It's unclear if the fix affects other games or whether it solely affects Doom3 via ATI's application detection scheme and their ability to reprogram the GPU's memory controller per application using CATALYST A.I.
More on the tool and the fix as we get it.
Trimlock said:is this ATI starting to utilize their brand new and very flexible memory controller? or is this going to be for all generations?
Rori said:a 35% boost in a game as opposed to a benchmark could be a serious boost to ATi. This is especially true if that boost carries on to all games which later license the engine as Kyle notes.
D3 is a nice high tech demo and okay for benchmarking. Nothing more than that. Quake 4 will be out in a week and might have some replay value so I can imagine that Ati wants to tweak its drivers for this game and other upcoming games (??) based on the D3 engine.
jebo_4jc said:However, if this is purely a Doom3 optimization, ATi shouldn't have wasted their time. D3, when compared to upcoming games, wasn't much more than a fancy playable tech demo, and will be forgotten. (My copy of D3 hasn't even been installed since I first bought it and beat it--a stark contrast to HL2, UT2k4, etc).
You mean the original CS? What video card are you using? Any of today's video cards should be able to run CS at about 200fps at any quality settings.
Rori said:Oooh... Don't tease me with thoughts of OpenGL optimization. By and large, 90% of my time on my LAN machine is spent playing CS. 35% bump in CS would make me cry for joy, I swear.
I seem to remember hearing rumors that ATi was working on a "from the ground up" OpenGL re-write. Oh how I wish it could be true!
jebo_4jc said:However, if this is purely a Doom3 optimization, ATi shouldn't have wasted their time. Let's hope ATi can manage to get this working for future D3 engine titles as well.
wizzackr said:"As well as supporting Doom3, the tool is said to increase performance in all OpenGL titles when using antialiasing." - Hexus
...clever release just in time before Quake4 comes out, ATI
As well as supporting Doom3, the tool is said to increase performance in all OpenGL titles when using antialiasing.
The tool seemingly changes the way the graphics card maps its own memory to better deal with handling AA sample data in Doom3's case.
Rori said:I stand by what I said earlier, a 35% boost in a game as opposed to a benchmark could be a serious boost to ATi. This is especially true if that boost carries on to all games which later license the engine as Kyle notes. I also stand by my statement that this will have to occur at very little or no loss to IQ. Optimizations which reduce IQ are not IMO optimizations, they are "work arounds". In the end, we will have to see benchmarking from reputable sites that include both raw FPS and IQ from released drivers before we can get too excited.
That being said, I would love to see this happen and that it includes older cards. I was already looking at buying a cheap AGP upgrade to replace my 9700Pro on my LAN system so I could be ready for Q4. Generally, I don't care about IQ too much in my LAN machine since I turn off shadows, flares, reflections, etc. in order to maximize frame rates and help find people camping. If this works out, it would save me some jingle until I finally build a PCI-E LAN machine.
John Reynolds said:An ATI employee just posted something very interesting over at B3D: http://www.beyond3d.com/forum/showpost.php?p=595463&postcount=155
This change is for the X1K family. The X1Ks have a new programmable memory controller and gfx subsystem mapping. A simple set of new memory controller programs gave a huge boost to memory BW limited cases, such as AA (need to test AF). We measured 36% performance improvements on D3 @ 4xAA/high res. This has nothing to do with the rendering (which is identical to before). X800's also have partially programmable MC's, so we might be able to do better there too (basically, discovering such a large jump, we want to revisit our previous decisions).
But It's still not optimal. The work space we have to optimize memory settings and gfx mappings is immense. It will take us some time to really get the performance closer to maximum. But that's why we designed a new programmable MC. We are only at the beginning of the tuning for the X1K's.
As well, we are determined to focus a lot more energy into OGL tuning in the coming year; shame on us for not doing it earlier.
Interesting. It appears the X1k family has some long legs after all.
About time I say! It seems much of their effort has been put into optimizing for only one of the past year's biggest graphics engines, and they needed to change that.
I got the impression this tool only helps the X1k cards from the Hexus article.
Trimlock said:so, brent, does that mean this tool only applies to cards with the new ring bus?
it would be nice if this tool affected x800's
Rori said:ps - Educate the old and out of touch with net lingo, what exactly is QFT?
magoo said:35% from what???? That's a lot.
?emailthatguy said:now ati has sites doing pr releases for them lmao too funny.
jebo_4jc said:Interesting. It appears the X1k family has some long legs after all. About time I say! It seems much of their effort has been put into optimizing for only one of the past year's biggest graphics engines, and they needed to change that.
Rori said:No bump for my old 9700? Imma cry...
As for the quote that John Reynolds posted, I don't see anywhere in there that these optimizations would be limited to OpenGL titles. If any title with bandwidth limitations could be improved by this type of fix that could SERIOUSLY change the way people are looking at the x1000s. I know that right now this software "tweak" is only a rumor and I am naturally leery of claims to gain 35% performance overnight, but this is getting VERY interesting.
ps - Educate the old and out of touch with net lingo, what exactly is QFT?
Looks promising. Also it appears that ATi can improve their performance even more than 35% after polishing off the drivers. Good stuff.
sireric said:There's some slides given to the press that explain some of what we do. Our new MC has a view of all the requests for all the clients over time. The "longer" the time view, the greater the latency the clients see but the higher the BW is (due to more efficient requests re-ordering). The MC also looks at the DRAM activity and settings, and since it can "look" into the future for all clients, it can be told different algorithms and parameters to help it decide how to best make use of the available BW. As well, the MC gets direct feedback from all clients as to their "urgency" level (which refers to different things for different clients, but, simplifying, tells the MC how well they are doing and how much they need their data back), and adjusts things dynamically (following programmed algorithms) to deal with this. It also gets feedback from the DRAM interface to see how well it's doing too.
We are able to download new parameters and new programs to tell the MC how to service the requests, which clients' urgency is more important, basically how to arbitrate between over 50 clients to the dram requests. The amount of programming available is very high, and it will take us some time to tune things. In fact, we can see that per application (or groups of applications), we might want different algorithms and parameters. We can change these all in the driver updates. The idea is that we, generally, want to maximize BW from the DRAM and maximize shader usage. If we find an app that does not do that, we can change things.
You can imagine that AA, for example, changes significantly the pattern of access and the type of requests that the different clients make (for example, Z requests jump up drastically, so do rops). We need to re-tune for different configs. In this case, the OGL was just not tuning AA performance well at all. We did a simple fix (it's just a registry change) to improve this significantly. In future drivers, we will do a much more proper job.
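To make the latency versus bandwidth trade-off sireric describes a bit more concrete, here is a toy sketch in Python. It is not ATI's algorithm: the function name `simulate`, the single-bank "open row" model and the `window` parameter are all assumptions made up for illustration. The only idea taken from the post is that a controller which can see further ahead in the request queue can reorder requests for better DRAM efficiency, at the cost of some requests waiting longer.

```python
def simulate(rows, window):
    """Toy model of a reordering memory controller (hypothetical, for illustration).

    rows   -- DRAM row touched by each request, in arrival order
    window -- how many of the oldest pending requests the controller can see
    Returns (row_switches, max_delay), where max_delay is the largest number
    of service slots any request was pushed back past its arrival position.
    """
    pending = list(enumerate(rows))  # (arrival index, row)
    open_row = None
    switches = 0
    max_delay = 0
    slot = 0
    while pending:
        visible = pending[:window]
        # Prefer a visible request that hits the currently open row;
        # otherwise fall back to the oldest request.
        pick = next((i for i, (_, r) in enumerate(visible) if r == open_row), 0)
        arrival, row = pending.pop(pick)
        if row != open_row:
            switches += 1  # opening a new row costs bandwidth
            open_row = row
        max_delay = max(max_delay, slot - arrival)
        slot += 1
    return switches, max_delay


# Two clients alternating between two rows, a worst case for in-order service.
requests = [0, 1, 0, 1, 0, 1, 0, 1]
print(simulate(requests, window=1))  # (8, 0): no reordering, a row switch every time
print(simulate(requests, window=4))  # (2, 3): far fewer switches, but one request waits 3 slots
```

With a window of 1 the controller services requests strictly in order and pays for a row switch on every access. With a window of 4 it batches accesses to the same row, so row switches drop from 8 to 2 while the worst-case wait grows from 0 to 3 slots. That is the same shape of trade-off as "the longer the time view, the greater the latency but the higher the BW", just in miniature.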
Moofasa~ said:For those who are wondering exactly how ATi is doing this, sireric who works at ATi posted this little blurb at Beyond3D.
Looks promising. Also it appears that ATi can improve their performance even more than 35% after polishing off the drivers. Good stuff.
John Reynolds said: