Benchmarking the Benchmarks @ [H]

FrgMstr

Benchmarking the Benchmarks - It is time to put your money where your mouth is, or maybe where your keyboard is. HardOCP sets out to prove that real world video card testing is where it’s at. Beware, we may make you feel dirty every time you run a benchmark from here on out!

 
Awesome. This needed to be done after the controversy when you tested the 3870 x2.
 
Awesome. This needed to be done after the controversy when you tested the 3870 x2.

Yep, this really helps ppl who DON'T RTFA shut up, heh ;).
But seriously, I think it was helpful that you (Kyle and Brent) put in the extra work on this article to clarify your testing methodology, not to mention the comparisons of the canned vs real world scores.

Thumbs up.
 
*grabs lawn chair, tasty beverage, and sun glasses.*

Let the flames start!

I've agreed with Kyle for years on this. The only thing a time demo is good for is E-penis comparisons. :p

 
Yep, this really helps ppl who DON'T RTFA shut up, heh ;).
But seriously, I think it was helpful that you (Kyle and Brent) put in the extra work on this article to clarify your testing methodology, not to mention the comparisons of the canned vs real world scores.

Thumbs up.
Ditto. Wow, on a Sunday night! This will be an interesting thread to read tomorrow afternoon.
 
Good article...especially for those who still don't understand the [H] methodology even after all this time. However, the cynic in me says that when the next big video card comes out and there are differences in test results, history will repeat itself.
 
Oh man, this should be a good one.
No kidding. I can't wait!

I'm going to pore over this article tomorrow, but what I've read so far looks solid. Particularly interesting is the delta between 'FRAPing' the canned demo and running it in timedemo mode. It leads me to wonder what exactly Crysis is doing differently between the two modes of demo playback, if anything. Is there any sound playback during regular demo playback? Are 'effects physics' being calculated in one of the modes, both, or neither?
 
Excellent article. This could silence detractors... or egg them on.

*Runs away from impending flame war*
 
This will be a fun intra-forum war.

I'm in for some Anon. pwning, [H] style.

Great read anyway.
 
Nice read. I prefer your guys' reviews because I know what the card will do in game. Canned benchmarks do a good job of showing which card is faster than the other, but that really means nothing outside of bragging rights. Like you said in the article, canned benchmarks can tell you that Card A is 18% faster than Card B. OK, that's grand, but what does that really mean, unless you want to get into a geek fight over who's got the fastest computer?
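
For what it's worth, here is a rough sketch of why a single "18% faster" figure says so little about playability. The FPS samples are made up purely for illustration (they are not from the article), and the 30 FPS "playable" floor and the function names are my own assumptions:

```python
from statistics import mean


def summarize(fps_samples, playable_floor=30):
    """Return average FPS, minimum FPS, and the percent of samples below a floor.

    playable_floor is an assumed threshold, not a number from the article.
    """
    below = sum(1 for fps in fps_samples if fps < playable_floor)
    return {
        "avg": mean(fps_samples),
        "min": min(fps_samples),
        "pct_below_floor": 100.0 * below / len(fps_samples),
    }


def percent_faster(a_avg, b_avg):
    """How much faster A's average is than B's, as a percentage of B."""
    return 100.0 * (a_avg - b_avg) / b_avg


# Hypothetical per-second FPS samples, invented for this example only.
card_a = [70, 68, 25, 72, 24, 71, 69, 73]
card_b = [50, 49, 48, 51, 50, 49, 50, 53]

stats_a = summarize(card_a)
stats_b = summarize(card_b)
headline = percent_faster(stats_a["avg"], stats_b["avg"])
```

With these invented numbers Card A "wins" on the average, yet it is the only card that dips under the assumed floor, which is the kind of thing an average alone hides.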
 
That last page really puts it into perspective. A couple more examples would have been even sweeter :)

Thanks, typo fixed. Kyle
 
And this is why I trust [H] to guide multi-$100 purchases.
I want to know how it works, not how it tests.
 
Awesome job guys. Brent, you and Kyle didn't have to do this but I'm glad you guys took the time to do so. Much laud and praise to your staff.

I was kinda surprised the increase was as much as it was between your FRAPS run-throughs and the canned time demos. That's kinda staggering.

P.S. Wonder how many pages this will be tomorrow night?
 
Wow, great read. This review showed me a whole bunch. I don't usually comment on articles, but this one was great and I think it showed A LOT of information a lot of people were wondering about. It opened my eyes the rest of the way. I already read your reviews first, but I admit I would more or less average them with whatever Anandtech put in their review of the same product. (Never doing that ever again.) Nice article, keep up the good work.
 
And this is why I trust [H] to guide multi-$100 purchases.
I want to know how it works, not how it tests.

Exactly. With a hobby as expensive as this I can NOT afford to waste money due to misleading reviews. That's why [H]ardOCP will always be the best and only site for real reviews.

Plus, Kyle is sexy
 
Good article.

Curious - why do you guys use only 2GB of ram with a 64-bit OS? seems like a weird pairing to me.
 
I think your statement "Timedemo benchmarking of video cards is broken" says it all. Benchmark optimizing is unfortunately being abused by the OEMs. I have to say that AnandTech video card reviews are the last ones I read or breeze thru, though I like the rest of the work they do. Benchmarking without some level of AA & AF is just wrong when those settings are usable and show what a GPU is capable of.
 
Now just if they would have done these kinda tests before I got my $280 8800GTS 320.... and wasted my $ on a P.O.S. that was supposed to be good.

Anyways can't wait to see the results.:cool: I will be reading here and here only when in the market for a new card
 
Nice article, and I agree that the benchmarks only show how well something runs the benchmark, and that doesn't necessarily correlate with how well it performs when you actually use it in a game. It's sad, but drivers do have optimizations that are released just to make cards perform better in benchmarks, which pretty much invalidates the built-in benchmarks.
 
excellent article, confirms my feelings all along with the reviews here always being much more accurate than any other site.

Keep up the great work!
 
Thanks for the article Kyle and Brent. It was a good read. While I'll still read traditional reviews along with your own articles, it sheds some more light on what you do and why you do it. The biggest hurdle I have with your evaluations is simply the fact that you don't show results from 1440x900, which is the highest res my monitor can support. It makes it harder to figure out how your numbers apply to the lower resolution. Obviously the higher resolutions offer better stress testing to the GPU, but until I get a larger monitor I will need to look at the reviewing sites to get some kind of an idea on how well new cards and hardware run on my set-up.

There are still a lot of unavoidable variables on these types of things simply due to how hardware works together depending on which parts. As you said in your article, there never will be an end all be all for hardware sites.

I have a question regarding said variables, actually. When testing these cards, do you ever notice a difference depending on the chipset used? Like, do the Nvidia cards have a noticeable (in terms of real gameplay, not frames) improvement when running on nForce chipsets versus Intel or AMD ones? I think this was answered before, but I can't remember.
 
u know kyle, I wondered...

u say that it is hard to find spots that are easy to reproduce results in, and spots that are stressful for the video card.... so why not make your own custom map?

It could be a big island in Crysis. There is a straight path that the player walks through. As the player walks down the path, various things happen at certain points that you can script, in the player's view to the left or right of the path: At one point, 20 humvees explode. At another point, a huge firefight between 10 Koreans and 10 marines breaks out. At another point 1000 boxes fall from the sky, and on and on until the player gets to the end of the line.

It is easy to reproduce results, since all the reviewer has to do is walk down a straight line, and it IS real gameplay, since AI/collision detection/all other elements are actually in play.... then you guys can distribute your custom map to forum members and anyone else that asks for it so they can test their hardware in a similar manner. Crysis has a *ridiculously easy* editor too, making a map like that couldn't take more than a day or two to get right....

I'm sure you guys have thought about this, so what are the drawbacks? Let me know where my bright idea falls flat on its face :)

edit: the thing about this method I'm talking about is: it takes almost all the variability out of the process (in terms of the player's input... since physics and AI engines will always change the experience.. so there is still some!). The people who say that the review is too subjective would lose a lot of their ammunition, since most of that is taken away by the simple map and extremely easily reproduced results....
 
I really enjoy the [H] reviews a lot. I believe they tell a lot more than the old method. Even as a little kid reading reviews of the TNT2 vs the Voodoo 3 I always wondered why they benchmarked games with the sound turned off. I don't play my games with the sound off. It never made sense to me. It's the same with flyby demos built into games and 3DMark, no game I played was ever like any of that so what good is it?
 
This confirms what I already believed to be true. At least in my eyes anyway. Still it won't take long for the criers to do what they do best. Operate on the "don't trouble me with the facts" approach.

Nevertheless thanks for the article, interesting read and validation of my own long held opinion.
 
The biggest hurdle I have with your evaluations is simply the fact that you don't show results from 1440x900, which is the highest res my monitor can support.

Well I think it would be an obvious decision, if Card A performs better at the higher resolutions tested, why would it not be equally as well performing at your lower resolution?

Great article guys.

I love the final run-through with Anand's settings :D
 
I have been a HardOCP-er for years...have seen the site get a face-lift 3 times. When I re-build my PC and re-install the OS - it is THE first Favorite I put on / in my Browser. You guys have been my principal provider of tech info and I owe you guys lots - my pc thanks you too for all the cool gizmos inside of it!

Don't change a frappin thing - you all got / are getting it done. TH and AT blow. Haven't even visited those sites in YEARS - OCP is my only stop...what else do you need?

that is all - keep it up guys.

S
 
Great job. It was needed. Hope it helps other sites see the value in actually using the card instead of only benching it and giving it a report card grade.


You guys moved up a rung on my ladder. (Even though you're in Texas :D )
 
Excellent article to show the differences, there are going to be a lot of people eating pie! :D I've been reading through that thread at B3D about [H]'s testing methodology, and for a bunch of people over there who are supposedly familiar with graphics hardware, they sure are unable to grasp the differences between the testing methods that this article brings to light. Sad. It would seem a great many sites are relying on the fastest way to present information for the least amount of work, with consumers and their readers be damned when it comes to the actual playability of a video card when it is used for what it was built for: gaming, not demonstrating.
 
First off, "TimEdemos Tell Me What?" (http://enthusiast.hardocp.com/article.html?art=MTQ2MSw0LCxoZW50aHVzaWFzdA==)
Curious - why do you guys use only 2GB of ram with a 64-bit OS? seems like a weird pairing to me.
Vista 64-bit can read over 4GB, yes, but games don't get any performance benefit from more than 2GB of RAM today. If they did, they'd use 4GB.

Secondly, Kyle for president! Wooo! :D

Very nice article pointing out what we already knew.

u know kyle, I wondered...[...] why not make your own custom map?
That would defeat the whole point of actual gaming performance, wouldn't it? If you make a map that's too demanding, it'll be unlike anything you'll actually experience when playing the game. If it's not demanding enough, same thing. It would allow you to compare 2 video cards when rendering the same scene, but it would not be a scene found in the game...therefore it wouldn't give you an idea of how the game would perform for you.
 
Nicely done. Now I will just sit back, relax and wait for the flames to start. I have some hot dogs and marshmallows to roast. :D
 
Oh and I don't know if you guys had any influence or not, but I noticed Tom's has started using in-game FPS numbers for Crysis rather than the built-in demo.
 
Well I think it would be an obvious decision, if Card A performs better at the higher resolutions tested, why would it not be equally as well performing at your lower resolution?

Great article guys.

I love the final run-through with Anand's settings :D

You'd think so, but in the past some cards have actually shown better performance at lower resolutions. It makes it weird. It also brings more of the price vs performance metric into question. Let's say someone is running at only 1440x900 and they are looking for a good card to buy for their system. Now they see these evals and many others on the net showing the GTX and Ultra at the top. The thing is that at the lower resolutions those cards simply are not worth the money for the performance boost they will give in most games, unless the person buying only cares about getting the highest frame rate possible. At 1440x900 my 3870 maxes out everything thrown at it, besides Crysis. At higher resolutions these evals are great. At lower ones it just makes it a more complex decision. I do understand why they don't test at those resolutions though.
 