How Accurate are Canned PC Benchmarks?

rgMekanic

DigitalFoundry has posted a new video on YouTube, asking the question "Just How Useful/Accurate Are PC Benchmarks Anyway?" In the video, Alex looks at canned benchmarks like those found in Tomb Raider and Metro Last Light to see how they compare with actual in-game performance.

This is quite a problem, where canned benchmarks do not necessarily represent actual performance while playing the title. I can't believe it's taken someone so long to think that canned benchmarks may misrepresent gameplay. Oh wait... Kyle first pointed it out in 2001 with the Quake 3 Arena benchmark.... And again in 2003 Kyle and our GPU Editor Brent Justice looked at the validity of 3DMark03... Culminating in Kyle and Brent working together once again in "Benchmarking the Benchmarks" in 2008, when they completely dissected real world gameplay vs Timedemo benchmarking in Crysis.

Check out the video

Anyone want to bet if Digital Foundry keeps using built-in benchmarks? I won't, but hey, maybe it will make more people aware of the problems with built-in demos, and hopefully they will stop giving them so much weight when predicting video card performance. But with anything in the PC or PC gaming world, don't forget. Thanks to cageymaru for the story.
 
I stopped using benchmarks decades ago. I only used 3dmark vantage because I got a free key and then only to watch the demos.

Actually, I tried 3dmark 2000 and 2001 a few months ago just for kicks.
 
I remember when Kyle started writing about benchmarking and ultimately playing the games to find the true performance all those years ago. I was an angry pc gamer that spent years trying to figure out why I could run a time demo and get great fps but then when I played the game I had to turn down settings and still not get the fps that a time demo gave me. When Kyle flipped the benchmark world on its head, I finally knew what was going on.
I used to hit a dozen websites for benchmarks, with no one in particular first, but when HardOCP started real world benchmarks this was the first place I came to get the information I needed. And most of the other sites I just eventually stopped going to; they just kept those canned, mostly useless benchmarks.

Now I live here.
 
So you didn't use logic?

[I always thought it was quite clear that a canned benchmark would be crap at reproducing the actual gameplay experience...]

Live and learn. Every video card review back then pounded canned down our throats. No one said anything about performance other than canned. Of course I was aware that my games ran worse when playing, but no one seemed to talk about real game play in reviews. I didn't visit forums much back then. I read articles and played games. It is what it was for me.
 
I'll use them initially when I first fire up a game or make some significant hardware or setting changes, but generally speaking I'll try to keep a save on hand from a spot where I notice pretty major drops in fps. Canned sucks in terms of accuracy; it should only be considered a generic baseline, by no means the definitive gauge of a game's needs.
 
I install a new game, the first place I go are the options before I even start playing the game. I look at the auto detect settings and then hit play. Play for a few minutes. If the game looks good and plays well, I play. If it plays really smooth I go back and bump up some settings until I either max out or the game reaches my threshold of playability.
I like to read the reviews here (both official and on the forum) to get an idea of what I SHOULD be getting performance wise. It is a good review process (probably the best), but it is not the bible. What my machine does is what my machine does.
 
the only thing i find canned benchmarks good for is comparing one card directly to another to get a consistent idea of how they compare to one another, but i don't think they transition to real world with a pinch of poo

also good for testing overclocks

bout all
 
I3 @ 5.5 ghz will outperform my Ryzen 1700 in benchmarks for games, I know which I like living with.
The one where a youtube playlist of music can be playing in background, installation of a steam game, my 70 other chrome tabs, discord voice chat in background.
maybe some coding applications are running in background.

I've never closed anything on my pc since my 1700 due to performance related issues, I just load it and use it.

a mate with i5 7600K is already having issues with closing background tasks, funny thing is that two mates have near identical systems; 7700K vs 7600K is the only difference cause one of them looked at gaming benchmarks ONLY.
he's planning on upgrading already

This is however impossible to benchmark, but it's something we all do to some extent.
 
Yeah, canned benchmarks are bullshit if you want to tell game performance, and have been for a VERY long time. I only use 3DMark as part of stress testing, as it can load the processor as well as the GPU(s). It's good for SLI stress testing and for determining full-load temps, but in the gaming world it's unlikely you will ever see full load across the board. My machines are dual use, gaming and workstation, so I do find stress testing useful; otherwise real game frame time is the way to go.

Regarding your friend having to close background tasks: it isn't a processor issue but something with his software, or he built the wrong machine for his use. He probably needs to learn more about hardware and software rather than upgrade his machine (otherwise it sounds almost like people who buy a new computer when they pick up malware and it slows down, and then their new computer gets the same shit and slows down as well). I sometimes have hundreds of tabs open and the browser taking over 10GB of RAM with no issues. I pretty much only reboot for Windows or other big updates.
 
I liked how some games had in-game benchmarks. I know some still do, and that really helps with tuning the in-game settings to where I want them. I like that Forza 7 has one, since there are so many video settings.
 
He looked at gaming benchmarks.
I tried to convince him to buy a 7700K instead, but he said there was no difference.
He regrets it...

I try to tell people that having more cores than the target task requires is always good.
I installed both systems and I know that is the only difference; it didn't take long before that difference was noticed.

They are both simple users, but simple users browse, chat on discord, play music through spotify... the general stuff but still amazing what just HT alone can do.
 
I thought it was common knowledge that canned benchmarks are for comparing the performance of different hardware under the same circumstances only. And they're not indicative of the game's actual real world performance.
 
I use benchmarks when overclocking my graphics card, as tools like 3DMark are still some of the more stressful applications around.
And as others mention, they provide a consistent means of comparison when you're tuning your hardware.

Not very useful for real gaming experience, but if you enjoy overclocking your hardware just for the fun of it, these tools are probably more useful than actual games.
 
I'm sorry for necro, but I was googling the term "canned benchmark" because someone called CrystalDiskMark that and this thread came up.
What does it mean? It seems to have negative meaning.
 
In short, a canned benchmark means the user input and/or the workload is simulated. It is the only way to compare different systems. Its purpose is to exclude any variables other than the system performance itself.

You can do non-canned real world benchmarks but it will only be representative for that one run, you cannot compare it to any other result, not even on the same system.
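
The repeatability point above can be put into numbers with a small sketch. It takes the average FPS from several runs of the same test and reports how much the runs disagree, as a percentage of the mean. The function name `run_to_run_spread` and the FPS figures are made up for illustration; they are not measurements from any real game or card.

```python
from statistics import mean, stdev


def run_to_run_spread(avg_fps_per_run):
    """Return (mean FPS, spread across runs as a percentage of the mean)."""
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to measure spread")
    m = mean(avg_fps_per_run)
    # Sample standard deviation divided by the mean (coefficient of variation).
    cv_percent = stdev(avg_fps_per_run) / m * 100.0
    return m, cv_percent


# Hypothetical numbers, chosen only to illustrate the idea.
scripted_runs = [120.0, 120.5, 119.5]  # same scripted scene every time
manual_runs = [110.0, 95.0, 125.0]     # hand-played, different path each run

for label, runs in (("scripted", scripted_runs), ("manual", manual_runs)):
    m, cv = run_to_run_spread(runs)
    print(f"{label}: mean {m:.1f} FPS, run-to-run spread {cv:.1f}%")
```

A small spread means one run can stand in for the others. A large spread means a single hand-played run says little on its own, and you would want several runs over the same stretch of the game before comparing anything.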
 
This is quite a problem, where canned benchmarks do not necessarily represent actual performance while playing the title

I view videos on YouTube that show the hardware used when making the video and that run the benchmark utility built into the games I play, such as the Metro and Tomb Raider series.

Even though my hardware is often older than the hardware used in the video, as long as I have the same GPU series (970, 1060 6GB, etc) I find that I get fps that are close to or sometimes better than theirs. So I don't know what a "canned benchmark" is, but why view them when there's a much better way to spend your time?
 
For hard drive benching, CrystalDisk is fine; it's not exactly "canned", just a set of reads/writes. CPU benchmarks are typically ok as well, although some "overall system" measurement apps, which factor memory performance into a "score" along with other things, vary in how much weight they assign to each metric measured, and that doesn't reflect real world performance. Doing an encode, a WinZip operation, or playing a game gives you a real world metric. The first two can be easily reproduced, so a score compared between CPUs, or before and after overclocking, accurately shows their performance in relation to each other.

Where canned benchmarks cannot be trusted is for video games/GPUs. Both GPU manufacturers have been known to cheat in the past by detecting that a benchmark is running and taking shortcuts to improve performance in those scenarios. But this isn't something to really worry about for hard drive, memory, or even CPU benchmarks. Just note that any particular benchmark is only a piece of information showing a piece, not the whole, of the overall picture.
 
I miss the old benchmark.

~
timedemo 1
demo four


But it doesn't work well anymore since the game caps at 1000fps now.


Where was this 1000fps back when I actually NEEDED it?

You stole my post but it's ok, brother
 
Canned benchmarks are just to establish a baseline when comparing systems and settings. It's a reproducible test time and time again. Real world experiences in gaming or applications will always vary.
 
And, importantly, vary in ways that intrinsically affect user experience. This is why I miss the [H]'s gameplay benchmarks with frametime analysis -- literally the best information one could get.
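
As a rough illustration of why frametime analysis says more than an average, here is a minimal sketch that summarizes a list of per-frame times in milliseconds. The function name `frametime_summary`, the nearest-rank percentile, and the "average of the slowest 1% of frames" definition of the 1% low are assumptions made for this example, not the method [H] used in its reviews. The two captures are invented.

```python
from statistics import mean


def frametime_summary(frame_times_ms):
    """Summarize per-frame render times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    # Average FPS: frames rendered divided by total seconds elapsed.
    avg_fps = 1000.0 * n / sum(ordered)
    # 99th percentile frame time, nearest-rank method: ceil(0.99 * n).
    rank = max(1, -(-99 * n // 100))
    p99_ms = ordered[rank - 1]
    # "1% low": FPS equivalent of the mean of the slowest 1% of frames.
    worst = ordered[-max(1, n // 100):]
    low_1pct_fps = 1000.0 / mean(worst)
    return {"avg_fps": avg_fps, "p99_ms": p99_ms, "low_1pct_fps": low_1pct_fps}


# Two invented captures with the same total time over 100 frames.
smooth = [10.0] * 100
stutter = [8.0] * 95 + [48.0] * 5

for label, capture in (("smooth", smooth), ("stutter", stutter)):
    print(label, frametime_summary(capture))
```

Both captures cover 100 frames in one second, so an average-FPS bar chart would show them as identical, while the percentile and 1% low figures separate the run with five long frames from the even one.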
 
I can't find the exact phrase from the [H] reviews, but it went something like "canned benchmarks are used to tell if something is wrong, not to confirm the results are right" when testing components against each other.

I prolly botched that, but that "phrase/line of thinking" always stuck out to me.
 
You've gotten it about right, and if you think about it as an enthusiast, you're going to run some stress tests to see if anything breaks and then some synthetics to ensure that performance is lining up with expectations, and so on too.
 
Exactly, I need a system stable under the worst conditions at work so it can "relatively" have an easier and sustainable life in the racks crunching away on real life workloads :D
 