Benchmarking the Benchmarks @ [H]

What has not been addressed here is FRAPS. The program typically has an overhead of 3%~15% depending on the game engine and settings. Also, replaying the same "demo" several times with FRAPS measuring the framerate will usually generate another 5% variance (on average) between each test. In fact, some websites have reported variances anywhere from 3% to 12% on the same demo, which is a major reason why it is not used extensively by those wanting "scientific" results.
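For anyone who wants to quantify that spread themselves, here's a rough sketch of the arithmetic (the FPS numbers are made up, not from any actual FRAPS logs):

```python
# A quick way to see the run-to-run spread: average FPS from several FRAPS
# passes of the same demo. These numbers are made up for illustration.
avg_fps_per_run = [42.1, 44.0, 40.8, 43.5, 41.9]

mean_fps = sum(avg_fps_per_run) / len(avg_fps_per_run)
spread_pct = (max(avg_fps_per_run) - min(avg_fps_per_run)) / mean_fps * 100

print(f"mean: {mean_fps:.1f} FPS, run-to-run spread: {spread_pct:.1f}%")
```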

To put "blind faith trust" into a program that has to be updated for each major game engine or DX release is a little risky in my opinion. In fact, compare each FRAPs revision over the last 18 months with a game engine that was released during that time and you will find the numbers improve as the program is optimized to work "properly" with the game engine or DX updates. This is not to harp on FRAPS, but like all software, as the code changes, so will the results.

Not to mention, expecting the reviewer to "perfectly" walk through the same scenario each time is asking a lot. While you might become efficient at it, a change in the walk path, turning the head a few degrees, or increasing or decreasing speed even slightly will introduce a whole new set of variables into the numbers.

Not saying any of this FRAPS testing is wrong (as long as it is consistent), just that people need to take a look at FRAPS results in the same way they do "canned" benchmarks; the truth about performance probably lies somewhere in the middle. ;)

As far as this article goes, what I would really like to see is a technical explanation as to why FRAPS and r_displayinfo 1 numbers are so far off in Crysis. Try it yourself... ;)
 
I think canned benchmarks are obviously a pile of crap. I am glad that you guys stick by real world benchmarks, because they obviously have different results than canned ones and they can tell me what is ACTUALLY PLAYABLE in a game. Way more meaningful. I found it funny that you said "not to pick on anandtech...." in the article, though... and later continued to pick on them :)
 
Props gentlemen. This stands as testament as to why I personally have stuck with the [H] through the years.

Keep up the great work. We look forward to more of it. :)
 
...FRAPS can only give you rough numbers. It can be off by anywhere from 1% to 30% (the human factor). Your mistake is to take it too seriously; that is why you have turned this site into a joke. What makes it even funnier is that you seem to be very proud of that "real world" thing. :D

The syllogism you set up is completely false. Please read the article again but this time, don't drink any caffeine, take a deep breath and actually read it.

Let me help you with a quote that you should remember while looking at any of [H] video card evaluations...

The “run-throughs” and framerate data you see represented on our pages is not used to draw conclusions on by the author. We take a video card, play through our suite of games and make notes indicating the highest playable settings we could muster with the card while checking image quality along the way,

That's right, the numbers don't matter, actual game play does. The FRAPS numbers are just there to validate their observations. Honestly, I think Kyle's first instinct was right, the graphs are a distraction. That said, tech junkies get wet for them so they stay. :D
 
The reason I completely and utterly ignore HardOCP's video card reviews is simple: I run my games at a fixed resolution, and I want to know what card is fastest at that resolution with a given feature set. I don't give a tin shit to hear about how Card A runs Game X acceptably at 1280x1024 on high with 2xAA/16xAF, and Card B runs Game X acceptably at 1600x1200 on medium at 4xAA/4xAF. What I want to know is what card is fastest across a consistent feature set (graphics settings, AA, AF) at a given resolution? What card will, all else being equal, give me the best performance at (in my case) 1680x1050 under a given feature set? You don't need to exhaustively test every possible combination of settings. Just set the bar at a given point, it doesn't really matter where, and test. This lets people actually come to a conclusion regarding relative performance, instead of having a bunch of cards at a bunch of settings.

I admire HardOCP for their forward-looking attitude, but they need to understand that in order to perform a meaningful test, they need to decide what variables they are actually testing. Controlling extraneous variables is the most important part of any real test.
Saying that you are testing "gameplay" is all well and good, but what does that mean? Apparently it doesn't mean "max framerate." It also doesn't mean "best image quality" or "highest resolution." Instead, it's a nebulous, weighted average that changes depending on the reviewer, the cards reviewed and their relative performance. Thus, we get cards that receive meaningless awards because they managed to wheeze along at 1920x1600 with no AA at medium settings, while the card that does 1280x1024 at high settings loses out.

If HardOCP reviewed cars, they would say things like "the Mustang is the best car because it has four seats compared to the Corvette's 11 second quarter mile." Meanwhile, Anandtech would report top speed, skidpad tests, and acceleration/braking tests, and get mocked for "not testing real-world driving experiences." Anandtech's review method is simply better; it provides more information and lets the reader draw their own conclusion from the collected data. It also lets us compare cards at a fixed selection of settings, allowing real comparisons to be made.

I'm not asking for factorial ANOVAs or anything, or even any real statistical analysis. I'm just saying that it's impossible to reliably draw any sort of conclusion about the relative rank of all these video cards when you cherrypick what settings to run at, what settings are important and what is acceptable gameplay, and then report back something that is essentially based on the reviewer's gut feeling.
 
why not create a benchmark that throws random events into the mix... so that videocard companies cannot optimize their drivers for benchmarks... since they know the exact events/stress/loads etc...

like.... random explosions, quantities of objects, random effects, random flashes etc... that are not static... set in stone... maybe even use random ai for living npcs in the benchmark so they too can put different stresses into the benchmark like a real world experience?
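Something along these lines, purely as a hypothetical sketch (the event names, timings and intensities are invented, not from any real benchmark):

```python
# Hypothetical sketch of a "randomized" benchmark run: the event schedule is
# derived from a fresh random seed each run, so a driver can't be tuned for
# one fixed sequence of explosions/effects/NPC waves.
import random

def build_run_script(seed=None):
    rng = random.Random(seed)  # pass a seed only if you want a repeatable run
    events = ["explosion", "smoke_volume", "particle_burst", "npc_wave", "lens_flare"]
    schedule = []
    t = 0.0
    while t < 60.0:                     # one minute of scripted chaos
        t += rng.uniform(0.5, 3.0)      # random gap between events
        schedule.append((round(t, 2), rng.choice(events), rng.randint(1, 20)))
    return schedule                     # (time in seconds, event type, intensity)

for entry in build_run_script()[:5]:
    print(entry)
```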

ps: anandtech still sucks
 
If HardOCP reviewed cars, they would say things like "the Mustang is the best car because it has four seats compared to the Corvette's 11 second quarter mile." Meanwhile, Anandtech would report top speed, skidpad tests, and acceleration/braking tests, and get mocked for "not testing real-world driving experiences." Anandtech's review method is simply better; it provides more information and lets the reader draw their own conclusion from the collected data. It also lets us compare cards at a fixed selection of settings, allowing real comparisons to be made.

I'm not asking for factorial ANOVAs or anything, or even any real statistical analysis. I'm just saying that it's impossible to reliably draw any sort of conclusion about the relative rank of all these video cards when you cherrypick what settings to run at, what settings are important and what is acceptable gameplay, and then report back something that is essentially based on the reviewer's gut feeling.
Oh man, ANOVA, that goes back.... With cards like the Nvidia 8800 series and ATi's HD38xx series, doing tests at 1280x1024 is pretty pointless (note my 19" monitor). All cards at these price levels are going to be just fine, and produce framerates well above the point where you can notice a difference. In addition, those resolutions just don't stress the card. Sure, if you turn on 16x AA, you might fill up the frame buffer, but the GPU will be yawning.

By running at the highest available resolutions you max out the GPU. Then, by gradually turning up (or down) the eye candy features (like AA, AF, shaders, etc.) you are able to find the highest playable settings.
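In rough pseudo-code, that search looks something like the sketch below; the settings ladder, the FPS numbers, and the 30 FPS floor are all invented for illustration, not [H]'s actual procedure:

```python
# Rough sketch of that search, working down from the heaviest settings.
PLAYABLE_FLOOR = 30  # minimum acceptable FPS during the run-through

# (settings label, minimum FPS observed in a hypothetical play-through)
measured_runs = [
    ("4xAA/16xAF, high shaders", 22),
    ("2xAA/16xAF, high shaders", 27),
    ("noAA/8xAF, high shaders", 33),
    ("noAA/8xAF, medium shaders", 41),
]

def highest_playable(runs):
    # First rung on the ladder that stays above the floor wins.
    for label, min_fps in runs:
        if min_fps >= PLAYABLE_FLOOR:
            return label
    return None

print(highest_playable(measured_runs))  # -> "noAA/8xAF, high shaders"
```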

To use your analogy, a benchmark is like recommending a Corvette over a Hummer because it's faster on the skidpad, in the quarter mile, 0-60, etc. But what about real life? How many people ACTUALLY USE those capabilities? What about people who live in Alaska, or here in Michigan where there's snow on the ground most of the time?

Benchmarks are nice if you are someone whose real-life usage is benchmarks.
I know I just LOVE running benchmarks all day, lol (friendly sarcasm).

So it comes down to how applicable are the numbers? Benchmark numbers are great if you are all about bragging rights and running benchmarks. Real-world gameplay tests are for those who use their video cards for gaming.

You pick.

why not create a benchmark that throws random events into the mix... so that videocard companies cannot optimize their drivers for benchmarks... since they know the exact events/stress/loads etc...

like.... random explosions, quantities of objects, random effects, random flashes etc... that are not static... set in stone... maybe even use random ai for living npcs in the benchmark so they too can put different stresses into the benchmark like a real world experience?

ps: anandtech still sucks
Because it removes the randomness of the USER. I don't play the same level the same way every run-through. Look at [H]'s graphs. They are remarkably consistent when it comes to highs and lows.
 
Well, the GPU evaluations along with the power supply stress testing are the major reason I come to this site. I really trust this site's methods of testing hardware, and I feel comfortable making purchases partially based on the results I find here. Thanks for going the extra mile, guys.
 
AbsintheCommando makes a good point, but there's something else [H] screwed up in this article.

When comparing your results to Anandtech's numbers, you leave out one glaring inconsistency: You're running your benchmarks on an inferior system. Anandtech is running their rig with twice the RAM and a better processor than you are. How much of an effect is that going to have on the framerates of a game that is so stressful on your system resources?

Trying to compare your benchmark numbers to those given by Anandtech is utterly meaningless unless you can configure the systems used in the tests at least somewhat similarly.

N.B. I know one of the resident [H] minions (actually, probably Kyle himself) is going to try and jump down my throat stating that the main emphasis of the article was not comparing numbers with Anandtech over canned benchmarks, but rather the difference between timedemos and actual gameplay, so let me just go ahead and tell you that I don't care; that's not the part of the article I'm referencing.

Also, speaking on the topic of your comparisons between the timedemo numbers and real-time numbers, Anandtech shows performance deltas that are much greater than the difference between the performance increases each of the cards saw when moving to the timedemos in your comparison.
 
It took a few years of dragging me kicking and screaming about "real world gaming experience" but finally, you win Kyle & Brent.
Your argument is really tough to poke holes in.

Cheers!
 
Good article.

Curious - why do you guys use only 2GB of RAM with a 64-bit OS? Seems like a weird pairing to me.

I agree, I think it's time to start using 4GB of RAM for benchmarking with Vista 64. It made a big difference over 2GB in games like The Witcher for me.
 
I agree, I think it's time to start using 4GB of RAM for benchmarking with Vista 64. It made a big difference over 2GB in games like The Witcher for me.

Not a large user base has Vista 64 with 4GB of RAM, but when it does, I bet you will see it added to the evaluation (not benchmark ;0).
 
Benchmarking the Benchmarks Article said:
Do you want video card reviews that suggest “relative performance of a graphics card” based on timedemo benchmarks when some cards benchmark better than others, or do you want an evaluation of those video cards' in-game performance in the latest and greatest computer games that you are going to be playing with it?

I think this question was asked as if it was meant to be rhetorical; but yes, in fact, I do want a review which indicates the “relative performance of a graphics card” and I want the results to be reproducible, otherwise I just have to have faith in the reviewer's methodology and honesty; and that's bad science.

Ironically, H's "real time" vs. "traditional timedemo" results seem to reinforce their validity as benchmarking tools. Both cards are ~35% faster in traditonal timedemos over their real-time counterparts, which shows that the results are comparable.

I'll assert that people are more interested in knowing the relative performance of a card, with results that they can verify and reproduce, than they are in knowing what settings someone else would play the game with.
 
I agree, I think it's time to start using 4GB of RAM for benchmarking with Vista 64. It made a big difference over 2GB in games like The Witcher for me.

AbsintheCommando makes a good point, but there's something else [H] screwed up in this article.

When comparing your results to Anandtech's numbers, you leave out one glaring inconsistency: You're running your benchmarks on an inferior system. Anandtech is running their rig with twice the RAM and a better processor than you are. How much of an effect is that going to have on the framerates of a game that is so stressful on your system resources?

Trying to compare your benchmark numbers to those given by Anandtech is utterly meaningless unless you can configure the systems used in the tests at least somewhat similarly.

If the game can only address 2GB of RAM anyway (even if the OS is 64-bit, the app still has to be written to allow for more than 2GB), why add more? Also, NAME a game other than Supreme Commander that is CPU limited...

I'll assert that people are more interested in knowing the relative performance of a card, with results that they can verify and reproduce, than they are in knowing what settings someone else would play the game with.

But if it's applicable to the games you will be actively playing, who cares?
 
AbsintheCommando makes a good point, but there's something else [H] screwed up in this article.

When comparing your results to Anandtech's numbers, you leave out one glaring inconsistency: You're running your benchmarks on an inferior system. Anandtech is running their rig with twice the RAM and a better processor than you are. How much of an effect is that going to have on the framerates of a game that is so stressful on your system resources?

Trying to compare your benchmark numbers to those given by Anandtech is utterly meaningless unless you can configure the systems used in the tests at least somewhat similarly.

N.B. I know one of the resident [H] minions (actually, probably Kyle himself) is going to try and jump down my throat stating that the main emphasis of the article was not comparing numbers with Anandtech over canned benchmarks, but rather the difference between timedemos and actual gameplay, so let me just go ahead and tell you that I don't care; that's not the part of the article I'm referencing.

Also, speaking on the topic of your comparisons between the timedemo numbers and real-time numbers, Anandtech shows performance deltas that are much greater than the difference between the performance increases each of the cards saw when moving to the timedemos in your comparison.

True. Kyle would probably argue that they have little benefit on gaming at the settings they test at, but still, who knows how much of an effect it has.
 
Maybe it's because I just woke up an hour ago, but this article really confuses me. The only thing it tells me is that Crysis timedemos don't equate to real world performance. I like HardOCP's testing methodology, but I've said all along that I, or anyone, would be just as much of a dumb fool for blindly following HardOCP's testing results as they would blindly following Tom's, or Anandtech, or whomever.

Here is my biggest beef with the article: for the results, you only provided either one card or the other. Yes, you have tests that show either just the GTX's performance (2), or just the X2's performance (1). This bothers me. This bothers me a lot, because you make claims based on those tests. You might be right, and that's fine, but I'm not convinced with only half of the picture:

(Island Test Results) So it would seem that depending on the settings used, it is quite possible for the 3870 X2 to “benchmark” much better than the 8800 GTX in this example.

(Rescue Map) So, had we used even our custom demos to “timedemo benchmark” our two cards in our 3870 X2 evaluation, the 3870 X2 would have enjoyed a “benchmark” advantage over the 8800 GTX when compared to real world gameplay.

This tells me nothing. You didn't provide X2 benchmarks for the Island Test and Rescue maps with shaders set to high. You only did for the GTX. If what you're saying is that an X2 would have enjoyed a benchmark advantage on medium shaders over a GTX using high shaders, I would have to reply back with, "Duh". Hopefully that's not the case and I'm the one confused here.

And for the final test, did you provide the GTX results for the Crysis Harbor real world demo with all settings set to high? No, you didn't. What exactly are you trying to prove to me then?

*********************************************************************

The only thing this article tells me is that Crysis is not accurate in its timedemo performance. Good, because I think we're on the same level here. I'm not disagreeing with you on that. But you're using just Crysis as an example to convince your readers of your viewpoint that on a general scale, timedemos are not the way to test video cards. Hey, you might be right there, and I'm not campaigning in the defense of timedemos, but that's not enough evidence to support that conclusive of a statement.

Blame Crysis here, not the X2.
 
Now we're at a bit of a sticking point. Which is more accurate? Testing with accurate tools in situations where the results can't be "exactly" reproduced, or testing with known flawed tools whose results can be reproduced exactly?

Those are the questions you have to ask yourself.

Easy for me to answer - testing with accurate tools where the results can't be exactly reproduced. Why? Because no one plays any game the exact same way. The exact same steps in the exact same spots on the exact same map do not exactly happen at the same exact time! You do not turn, run, and jump the exact same pathways. So with that, it would be exactly how a real person would play.

As Chris B points out here in post #91, even turning just ever so slightly in a video game can alter the framerates of a game in gameplay. This is a prime example of something a player would do. They may take the same path, but they will not face the exact same direction, and yet the framerate is significantly different. So how do we know that these canned demos aren't done on a semi-optimized pathway so that the numbers are skewed to begin with, with occasional rough spots to make it look like it wasn't completely BS'ed? We don't.

Till a better method comes out, this is the best we can expect. Until companies stop working together to bullshit their numbers, we've only got real world gameplay to determine things. And also? For all those screaming about scientific methods...I want to know if any of them actually go by Doctor or Professor. Because if they did, then I could see why they go by timedemos and not by actual gameplay. And to be honest, I would love to know if any of these so called scientists have actually PLAYED these fucking games with these cards that they tout as being the bigger badder card, proven by their 'science'.
 
If the game can only address 2GB of RAM anyway (even if the OS is 64-bit, the app still has to be written to allow for more than 2GB), why add more? Also, NAME a game other than Supreme Commander that is CPU limited...



But if it's applicable to the games you will be actively playing, who cares?

Ok, Lost Planet, and that's just off the top of my head.

Also, are you going to tell me that you're going to see absolutely NO increase in performance when doubling the amount of RAM in the system? If so, then you're going to need to turn over that forum title, because you have no idea what you're talking about.
 
Great article. I always liked the [H] method the most. Hopefully we'll see some converts.

Can we disable forum registration for today?
 
I don't see how you can question the accuracy of a benchmark timedemo over real world experience. The answer is simple - the real world experience.

We aren't at any sticking point. In fact, I believe we are at more of an enlightening point, than anything stagnant.

While you can't EXACTLY reproduce what a player will do in the real game, I believe you can get a MUCH better idea of what kind of performance you will get out of a card based on you actually playing the game, not running a timedemo that was optimized. How can this even be debated? It's about having an enjoyable gaming experience, isn't it? Who cares if the fps is 30 or 33, if you see no difference? Everyone is too stuck on numbers.

THIS is the way all cards should be 'benchmarked', as the results of said tests can actually give an accurate representation of a card's ability during games.
[H]'s methodology is insightful, and I hope more will take a look after this article.
 
Ok, Lost Planet, and that's just off the top of my head.

Also, are you going to tell me that you're going to see absolutely NO increase in performance when doubling the amount of RAM in the system? If so, then you're going to need to turn over that forum title, because you have no idea what you're talking about.
Maybe I'm missing something, but your argument seems irrelevant. Yes, if you're comparing Anandtech's review with 4GB and [H]'s review with 2GB of RAM there might be some FPS differences. However, when you look at a review and see a 3870 X2 not performing as well as a GTX, how is having 4GB of RAM going to make the X2 perform better? If you have a system with 2GB of RAM and you test an X2 and a GTX on the same system and the GTX performs better, then it shouldn't matter if you upgrade the RAM to 4GB and do the same tests again. You should still have the GTX performing better.

Edit: Thanks for the article! IMHO, even one broken benchmarking tool is one too many.
 
In every industry the purveyor and manufacturer of the product will fudge the numbers in any way they can to make their product shine. It's been going on forever.

The H takes a product and puts it through a subjective torture test. I will take this approach any day over a bunch of white coats with tape measures and cones telling me that if this thing stops in 100 feet it must be more fun to drive all the time.

I think the gaming industry is probably the only place where you can still find purely canned testing "reviews". Look at review sites or magazines for Home Theater, Audio, Wine, Cigars, Tech Gadgets, etc. They all give you a bit of stats and then devote most of the time to telling you their "thoughts" based on using the product in a realistic scenario. Or actually consuming the product. That's exactly what real world testing is about.

I don't care how long your tape measure is, you still can't tell me if you liked the way the game played at some setting by just throwing charts and graphs in my face.

That is what most of the public wants to know anyway. If I have a system close to X specs what kind of experience can I expect to have? We read these sites because these guys have loads of time devoted to playing and testing games. We value their "subjective" input over others because of this. I bet if the guys over at the other sites spent a bit more time hands on, they would have figured out that adding some real world flavor would really help.

I have a feeling that Kyle is right. They just pump out the reviews like a job. There are only a handful of sites that devote as much effort to the testing process as H.

Fun is a subjective measure folks, get over it.

Keep it [H]ard.
 
The people complaining about 2GB of ram used in this article are really grabbing at straws... :rolleyes:
 
I just wanted to say THANKS!!!!

This was a GREAT article and I really got a lot out of it.

No flames from me, just a thank you. Sorry so boring :p
 
There is not one shred or semblance of scientific testing with HardOCP. You are merely taking Brent or Kyle at their word; it's that simple.

I'm ok with that and so are many others because as readers we trust them. But let's stop petting our asses on this one; their testing is in no way scientific because there simply is no definite control.

I think it really comes down to what it always has: read everyone's reviews and make your own choices.
 
Not a large user base has Vista 64 with 4GB of RAM, but when it does, I bet you will see it added to the evaluation (not benchmark ;0).

I agree, I think it's time to start using 4GB of RAM for benchmarking with Vista 64. It made a big difference over 2GB in games like The Witcher for me.

I bet there are a lot more people with Vista 64 and 4GB of RAM than there are running their display at 2560x1600, which is what HardOCP states are the "highest playable settings" for UTIII on the 3870 X2. What good is testing at that resolution? I'm far more interested in how it runs at a more common resolution with more eye candy features turned on.

I think this question was asked as if it was meant to be rhetorical; but yes, in fact, I do want a review which indicates the “relative performance of a graphics card” and I want the results to be reproducible, otherwise I just have to have faith in the reviewer's methodology and honesty; and that's bad science.

Ironically, H's "real time" vs. "traditional timedemo" results seem to reinforce their validity as benchmarking tools. Both cards are ~35% faster in traditonal timedemos over their real-time counterparts, which shows that the results are comparable.

I'll assert that people are more interested in knowing the relative performance of a card, with results that they can verify and reproduce, than they are in knowing what settings someone else would play the game with.

Your assertion is correct, at least in my case. The essence of science is replication and verification (or falsification if you are a Popperian :) ) I've no idea what the reviewer thinks are acceptable scores, and I am unwilling to take their recommendations on faith. Problems with H methodology include lack of third-variable control, willful introduction of uncontrolled variables, and irreproducibility/unfalsifiability.

Kuhn (1962) would claim that this represents a failed attempt to create a new paradigm (of reviewing). Unfortunately, the HardOCP paradigm is flawed; it is inconsistent with any form of useful inquiry, and thus will not succeed the current paradigm.

I am not interested in benchmarks because of the numbers they provide in and of themselves, but the numbers they provide relative to other cards. If I get 100 FPS on Card A on Benchmark X, and 130 FPS on Card B on Benchmark X, then all else being equal, I can assume that Card B is faster. I am not interested in the bench scores in and of themselves, but in how they stack up against other scores. I use them to provide a rank order of scores. I don't use the raw scores themselves.
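In other words, something like this (hypothetical cards and scores, just to show the rank-order idea):

```python
# Tiny illustration of using benchmark numbers for rank order only.
# The cards and FPS values are hypothetical.
benchmark_x = {"Card A": 100, "Card B": 130, "Card C": 115}

ranking = sorted(benchmark_x.items(), key=lambda kv: kv[1], reverse=True)
fastest_fps = ranking[0][1]
for rank, (card, fps) in enumerate(ranking, start=1):
    print(f"#{rank} {card}: {fps / fastest_fps * 100:.0f}% of the fastest card")
```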
 
Well, it proved there is a variance between timed and actual.

That's what I was looking for....

I wonder how ANAND will respond....
 
But if it's applicable to the games you will be actively playing, who cares?
You probably wouldn't. However, for those that play other games (or plan on using the card to play currently unreleased games in the future), "best playable settings" is significantly less useful than establishing an overall relative performance of the cards. That aside, it still ignores that issue of the subjectivity of "best playable settings" which we're made painfully aware of every time H reviews something.

You might also care that there's no way to scientifically verify the results.
 
The reason you don't see 4GB in the reviews is that if you look at most people's sigs on the forums, you see 2GB.

Simple concept: [H] tests stuff for what we use, not what we could have if we spent a crapton of money. I'm sure they could load up systems with 16GB and run benchmarks for us. I bet they have the resources, but why bother when the target base of [H] doesn't have it?
 
There is not one shred or semblance of scientific testing with HardOCP. You are merely taking Brent or Kyle at their word; it's that simple.

I'm ok with that and so are many others because as readers we trust them. But let's stop petting our asses on this one; their testing is in no way scientific because there simply is no definite control.

I think it really comes down to what it always has: read everyone's reviews and make your own choices.

Scientific...?

You perceive timedemos as scientific? These are optimized benchmarks, don't you get that? The 'definite control' you speak of is FLAWED.

There is more 'control' in these real world fraps tests than anything demoed. This is real data, however controlled, which paints a much better picture of what one can expect from their video card's performance on a game.
 
Ok, Lost Planet, and that's just off the top of my head.

Also, are you going to tell me that you're going to see absolutely NO increase in performance when doubling the amount of RAM in the system? If so, then you're going to need to turn over that forum title, because you have no idea what you're talking about.
You will see some improvement, but not as much as a video card upgrade, by far. My experience has been that once you hit 2GB of RAM, you see a very quick series of diminishing returns with more RAM.
On the CPU side: when I upgraded my system from an E6300 to my current Q6700 (assuming stock clocks here, post-4GB upgrade), I saw very little improvement in most games. Crysis definitely did. Source-based games MAY have snatched a 2-3 fps gain.

And Lost Planet is a bad example, as it is a console port.
 
You probably wouldn't. However, for those that play other games (or plan on using the card to play currently unreleased games in the future), "best playable settings" is significantly less useful than establishing an overall relative performance of the cards. That aside, it still ignores that issue of the subjectivity of "best playable settings" which we're made painfully aware of every time H reviews something.

You might also care that there's no way to scientifically verify the results.
But it's a subjective industry! Everything about gaming is based on personal preferences and individual hardware and software settings.
 
There is not one shred or semblance of scientific testing with HardOCP. You are merely taking Brent or Kyle at their word; it's that simple.

I'm ok with that and so are many others because as readers we trust them. But let's stop petting our asses on this one; their testing is in no way scientific because there simply is no definite control.

I think it really comes down to what it always has: read everyone's reviews and make your own choices.

Science experiments belong in a lab, not in a home. Scientists typically go by doctor, not 3dMaRkS_Ownz_J00.

What Brent and Kyle have to say is what they intend to share. How you interpret that or how you consider its weight is purely up to you. It doesn't matter if you've proven it by science or by subjective performance. If you're purely going to rely on one method alone, let alone one source, you're already misleading yourself. Hell, how often do we see other reviews posted on the front page when new hardware comes out? To me that says that even the [H] wants you to not just rely on their information alone.

But as someone else pointed out, no one's explained why the results vary so much, or even investigated it. The scientists scream that their method is more accurate - prove it. Find those exact results to happen in real gameplay. Post your proof that it's consistent with your findings. Show us that the real world game play performance is reflected in those canned benchmarks. That's all you'd have to do to prove Kyle's method wrong, once and for all.
 
There is not one shred or semblance of scientific testing with HardOCP. You are merely taking Brent or Kyle at their word; it's that simple.

I'm ok with that and so are many others because as readers we trust them. But let's stop petting our asses on this one; their testing is in no way scientific because there simply is no definite control.

I think it really comes down to what it always has: read everyone's reviews and make your own choices.
You simply don't get it? It's not just [H] making these claims anymore. There are other sites that have tested with similar methods and guess what? They get similar results! That's what makes it more interesting.
 
I am not interested in benchmarks because of the numbers they provide in and of themselves, but the numbers they provide relative to other cards. If I get 100 FPS on Card A on Benchmark X, and 130 FPS on Card B on Benchmark X, then all else being equal, I can assume that Card B is faster. I am not interested in the bench scores in and of themselves, but in how they stack up against other scores. I use them to provide a rank order of scores. I don't use the raw scores themselves.

And there are those like myself, who are interested in knowing whether or not the card is worth a crap when it comes to playing games that I am very well interested in, and whether or not that card can even do anything for the game.

I don't find the excitement in comparing video cards. I find excitement in playing video games, however. And finding a card that will let me play GPU-intensive games is what I value. I could care less how one performs in comparison to another. And even if I did, generally in the [H] articles they point out whether or not the card holds its own against other cards of its class. And they typically link to the articles of those cards so you can compare the results anyway. So I guess I fail to see where there's a flaw in that.

Still, I guess it comes down to what we value.
 
Kyle, what exactly were the FRAPS settings used? As 67shelby noted, FRAPS incurs significant CPU overhead and this would go part of the way in explaining the discrepancy between the timedemos and the real-time demos. If all that FRAPS is doing is logging the frame buffer flips to a file, the overhead should be 1-3% at most. However, if FRAPS is used to show a real-time frame counter then the hit will be larger. And if FRAPS is recording the demo as a movie file, the hit can be monstrous depending on the speed of the hard drive etc.

At any rate, the FRAPS overhead should at least be mentioned in the review and an appropriate slowdown percentage assumed and documented.
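Documenting it could be as simple as the sketch below, with invented numbers:

```python
# One way to document it: run the same pass with and without FRAPS logging
# and report the delta. The numbers are invented; the real overhead depends
# on the FRAPS mode (frametime logging vs. on-screen counter vs. movie capture).
fps_without_fraps = 58.0   # average FPS from the game's own counter
fps_with_fraps    = 55.5   # average FPS with FRAPS logging enabled

overhead_pct = (1 - fps_with_fraps / fps_without_fraps) * 100
print(f"FRAPS overhead for this run: ~{overhead_pct:.1f}%")
```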
 