Half Life 2 Benchmarks @ HardOCP.com

Status
Not open for further replies.
spyderz said:
Min is important, but I don't think that a 2 second benchmark would do any good. Besides, all 2 second effects would have to be benched just to make it fair, as some cards are more efficient at rendering some effects than others. Also, there's some hitching going on in this game when certain effects are on screen, and that will hinder the result obtained. I say the game runs well, but there are a few bugs that need to be dealt with before a two second blast effect is taken into consideration.

sigh
I see my attempt at an explanation didn't work.

OK, what if I changed the hypothetical scenario to 1 minute instead of 2 seconds?

I was just giving an illustrative example.

Here, let me try it this way. You have a 10 minute demo vs a 1 minute demo.

Average method

product 1: 90,90,90,90,90,....... (600 times in total)....,90
product 2: 60,90,90,90,60,60,90,90,60,90 .....(some mix and match of 60's and 90's)....90

Graphically challenging method

product 1: 90,90,90 (60 times in total)
product 2: 60,90,90,90.... (some mix and match 60 times in total)....


Anyways, sorry I can't explain it better. I know that most of you know what I'm talking about, though.


Enough for me now, back to testing. ;-)
 
Brent_Justice said:
as you point out the problem with really short timedemos like that is that you may be benchmarking only one type of shader effect

maybe its a water shader for example, and maybe that water shader does render faster on one card

but

that doesn't take into account other aspects of the game, HL2 is not just one water shader

so you have to bench a long timedemo encompassing lots of gameplay to get a good feel how the game feels between cards

that's why we did very long timedemos, to try and represent what end users really experience when they play this game, and see who comes out on top overall

i still wish the timedemo mode outputted the Min FPS though, if it had I would have put that in the graphs, or a table, cause i think it's important

I agree, it has to be a more complete gameplay benchmark: same settings and recent drivers on both cards, all of this must be taken into consideration. I don't want to bring DH up again, but if what Catalyst Maker said is true and they did make their own benches, why would they remove the article from their site? On another note, according to Anand, NVIDIA only had two days with the game and ATI had a week. That can make a big difference when it comes to driver development for that game.
 
spyderz said:
I don't want to bring DH up again, but if what Catalyst Maker said is true and they did make their own benches, why would they remove the article from their site?

No, they didn't. It's still there.
 
This is a nice "stealth" war being waged in here, can we just forget about who has the bigger **** and go back to just playing the game??

it seems this game runs flawlessly on both brands so just enjoy it mmmmmmmk thanks ;)
 
Any opinions on the difference in performance and playability between 256MB and 128MB cards? I read on one site that in HL2 there was some texture thrashing with the 128MB cards that didn't happen with the 256MB cards.
 
CATALYST MAKER said:
sigh
I see my attempt at an explanation didn't work.

OK, what if I changed the hypothetical scenario to 1 minute instead of 2 seconds?

I was just giving an illustrative example.

Here, let me try it this way. You have a 10 minute demo vs a 1 minute demo.

Average method

product 1: 90,90,90,90,90,....... (600 times in total)....,90
product 2: 60,90,90,90,60,60,90,90,60,90 .....(some mix and match of 60's and 90's)....90

Graphically challenging method

product 1: 90,90,90 (60 times in total)
product 2: 60,90,90,90.... (some mix and match 60 times in total)....


Anyways, sorry I can't explain it better. I know that most of you know what I'm talking about, though.


Enough for me now, back to testing. ;-)

I didn't mean to sound like an arse (no, I'm not a pirate). I understood what you meant, I was just questioning the logic behind it. Not only does it make for a good conversation, but we, the non Catalyst makers, can learn a little of what's going on, how things are done, and why.
 
I have not read all this thread [HL2 JUST finished installing, I plan on disappearing for the rest of the day] but Kyle, the reason that 6xS takes a greater hit than 8xS is because it simply looks better. Take a close look at the images, it's something myself and rewt [creator of coolbits] have been saying for a while.
 
Netrat33 said:
I'm at work so this is all I got! This is all I got!

I feel your pain, I'm at work as well and my shipment just got delivered @ home (thanks fedex online tracking)

ah man, one more hour to get my hands on this game :cool: (+ 5 more getting STEAMING PILE O- i mean my game up and running)
 
CATALYST MAKER said:
sigh
I see my attempt at an explanation didn't work.

OK, what if I changed the hypothetical scenario to 1 minute instead of 2 seconds?

I was just giving an illustrative example.

Here, let me try it this way. You have a 10 minute demo vs a 1 minute demo.

Average method

product 1: 90,90,90,90,90,....... (600 times in total)....,90
product 2: 60,90,90,90,60,60,90,90,60,90 .....(some mix and match of 60's and 90's)....90

Graphically challenging method

product 1: 90,90,90 (60 times in total)
product 2: 60,90,90,90.... (some mix and match 60 times in total)....


Anyways, sorry I can't explain it better. I know that most of you know what I'm talking about, though.


Enough for me now, back to testing. ;-)

I think this is where standard deviations/standard errors come in. Given a large enough sample size (the 10 minute example you gave would be sufficient IMO), you'll be able to compare with ANOVA. I've found this statistical aspect of benchmarking glaringly absent in a number of hardware tests. Averages are nice, and understandable by many, but a true statistical comparison would, more often than not, show that these 1-3fps that product 1 has over product 2 are essentially meaningless. Furthermore, how often were the timedemos run? Once? More?

It is understandable that manpower and time constraints affect the level of testing, and averages are infinitely better than nothing. Minimum FPS is also a great thing to show. Let me say that my comments are meant only to be constructive; I am grateful for any amount of effort reviewers put in to bring data to the masses :)
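
To make the ANOVA suggestion concrete, here is a minimal sketch in Python. The run results are invented for illustration (they are not from any review), and the helper name `one_way_anova_f` is hypothetical. It computes the one-way ANOVA F statistic by hand for two cards, each benchmarked five times; with only two groups this is equivalent to a pooled two-sample t-test.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)                      # number of groups (cards)
    n = sum(len(g) for g in groups)      # total number of runs
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Run-to-run variation inside each group
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical average FPS from five repeated timedemo runs per card
card_a = [91.0, 89.0, 90.0, 92.0, 88.0]   # mean 90
card_b = [89.0, 87.0, 90.0, 88.0, 86.0]   # mean 88

f_stat = one_way_anova_f([card_a, card_b])

# Critical value of F with (1, 8) degrees of freedom at the 5% level, about 5.32
F_CRIT_1_8 = 5.32
significant = f_stat > F_CRIT_1_8
print(f_stat, significant)
```

With these made-up numbers the 2 fps gap between the means does not clear the 5% threshold, which is the kind of result the post is hinting at. Real runs could of course be much tighter or much noisier than this.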
 
Oh hell let me get in the timedemo posting action as well then.

Here are the ones that I referred to earlier. These will stress cards big time.

FTP1.ATI.COM
Userid - beta0098
Password - 3WWWWvxp


Sorry about putting them on an FTP as opposed to direct download. But feel free to pass them around as needed.
 
Hey guys, guess what? I have the bigger E-pen0s ;)

go play the game (I would but I can't until I actually get home after work, so I'll stick around and watch the show) :p
 
Brent_Justice said:
If you go back through the article now I have put in Maximum Setting comparison graphs on page 2 and 3 at the bottom.

So you can see how they stack up with 6XAA/16XAF, 4xSAA/16XAF, 6xSAA/16XAF and 8xSAA/16XAF.
I have updated the conclusion as well.

Update: We have gone back and updated our pages with a couple of graphs that show Maximum IQ settings in terms of AA and AF. Without a doubt the ATI Radeon X800XT-PE did by far the best job at delivering a playable gaming experience. Of course it is up to the end user to determine if turning these options on gives you any tangible gaming returns, but without a doubt if you want to run "ultra high quality settings," the ATI Radeon X800XT-PE gives a much better return than NVIDIA's solution.
 
seanmcd said:
Why do you use an FX-53 cpu that maybe just a handful of readers have? (i know that the CPU is not the point of the article but a lot of us still won't pay of $200 for a processor)

The purpose of the benchies is to test GPUs, so you get the fastest CPU to keep the tests from being CPU limited as much as possible.
 
CATALYST MAKER said:
Sure. I can give it a shot.

The main reason why there are such differences between the Hard benchmarks and DH benchmarks is because of the timedemos used.

In summary, [H] created timedemos that were indicative of average gameplay. In other words, [H] used a full level as a timedemo. Showing the average framerate of a full level will cause all cards in the same category to have near identical performance. There will be areas in that level that are not graphically complex at all, and including them in the average will benefit the inferior product.

Example: below are 2 hypothetical products that are benchmarked for 10 seconds. Each sample point is the FPS at that second.

90,90,90,90,90,90,90,90,90,90 = average is 90 FPS
60,60,90,90,90,90,90,90,90,90 = average is 84 FPS

Take a large enough sample (i.e. extend the 90s for another 50 seconds) and you will see that the averages trend toward the same number.

The other alternative to the above method is to benchmark only complex graphic situations. This in the above example would be a 2 second time demo.

90,90 = average of 90 FPS
60,60 = average of 60 FPS

The difference between the two hypothetical products is much more evident in this situation.
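
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the example above. The sample lists are the hypothetical per-second FPS values from the post; the function and variable names are made up for illustration.

```python
from statistics import mean

# Hypothetical per-second FPS samples from the example above
product_1 = [90] * 10
product_2 = [60, 60] + [90] * 8

def average_fps(samples):
    """Plain arithmetic mean of per-second FPS samples."""
    return mean(samples)

# 10 second demo: 90 vs 84
short_gap = average_fps(product_1) - average_fps(product_2)

# Same demo with 50 more easy seconds at 90 FPS for both products
long_gap = average_fps(product_1 + [90] * 50) - average_fps(product_2 + [90] * 50)

# Only the 2 graphically complex seconds: 90 vs 60
stress_gap = average_fps(product_1[:2]) - average_fps(product_2[:2])

print(short_gap, long_gap, stress_gap)
```

The gap between the two products is 6 FPS over 10 seconds, shrinks to 1 FPS over the full minute, and is 30 FPS when only the two demanding seconds are measured.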

We have four timedemos created that are graphically complex. They were used by DH and 2 of them were even used by TheTechLounge.com.

I am prepared to put these demos on an FTP site for anyone who wants them. Interested?
These demos will show complex situations and will show that one hypothetical product is at times 90% faster than the second hypothetical product.

One such case will be using the flashlight. Record a timedemo of your own if you like using the flashlight a lot. You will see what I am talking about.

:)


And that concludes my lesson for the day. Cheers guys.

Edit: And I see that as soon as I posted that, Brent commented right below me with a good way of putting what I am trying to say. Average is one thing but for some Min is more important.
This is nothing new to us. This is why we primarily moved away from timedemo benchmarks about 18 months ago. They just simply do not show the full picture.

Then again, we try to give our readers what they want, and this is part of it.
 
Hey [H], Vampire: Bloodlines would make for a good benchmark also, since it uses the Source engine. I wonder how many changes have been made to the engine, but regardless it would still make for a good comparison between HL2 and Vampire and their use of the Source engine.
 
seanmcd said:
Why do you use an FX-53 cpu that maybe just a handful of readers have? (i know that the CPU is not the point of the article but a lot of us still won't pay of $200 for a processor)
It was an effort to remove CPU limitations for the benchmarking. Hard to benchmark video cards (like you see in our 1024x768 results) when the CPU is holding you back.
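
A rough way to picture the CPU limitation is a toy model in which the slower of the two components sets the frame rate. This is a deliberate simplification, and the numbers and the `delivered_fps` helper below are hypothetical, not measurements from the article.

```python
def delivered_fps(cpu_fps_cap, gpu_fps_cap):
    """Toy model: the slower component sets the frame rate."""
    return min(cpu_fps_cap, gpu_fps_cap)

# Hypothetical frame rates each part could sustain on its own
gpu_a, gpu_b = 120, 90
slow_cpu, fast_cpu = 70, 150

# With the slow CPU both cards look identical
slow_results = (delivered_fps(slow_cpu, gpu_a), delivered_fps(slow_cpu, gpu_b))

# With the fast CPU the difference between the cards becomes visible
fast_results = (delivered_fps(fast_cpu, gpu_a), delivered_fps(fast_cpu, gpu_b))

print(slow_results, fast_results)
```

In this sketch the slow CPU hides the 30 FPS difference between the two cards entirely, which is the effect a top-end CPU is meant to avoid.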
 
Brent_Justice said:
Hey all, I just created another timedemo if y'all want to pass this around. This was made in Canals 09, which is what one of the ATI timedemos is in, except this one was created by me just now and goes a little bit further than the ATI one. It's pretty intense.

Feel free to pass it around, it's small.

http://home.comcast.net/~b.justice/hardocptimedemo_canals09.zip

Whoo hoo! More chances to see Brent pwn! Or... lack thereof... :)

Seriously, props to [H] and Catalyst for actually caring and getting benches out to the [H]ungry [H]oarde. I speak for all of us in saying that we really appreciate it!
 
CATALYST MAKER said:
I know man. I think you did the right thing by moving away.

I was just responding to the guy who wanted me to explain the differences between sites.

Kyle you agree with my explanation right?

Your explanation makes perfect sense, the longer the Timedemo the less the Avg will reflect how the card will perform in the important, high action, bits, by virtue of the fact that "lull noise"* will influence the overall result.
I don't think anyone's that interested in which card gets the best frame rates while staring at a blank wall (possibly useful for CPU comparisons though...).

Min FPS would be a much better indicator of performance under stress, indeed, it's the figure I pay most attention to in [H] reviews.


*Btw, if no one's ever used the term "Lull noise" before, I claim it, it's mine, sorted.
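
The "lull noise" effect can be sketched in a few lines of Python. The frame times below are invented for illustration, and `fps_summary` is a hypothetical helper; it reports the average FPS over the whole run next to the lowest instantaneous FPS, taken from the single slowest frame.

```python
def fps_summary(frame_times_ms):
    """Return (average FPS, minimum instantaneous FPS) for frame times in milliseconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    average_fps = len(frame_times_ms) / total_seconds
    # The slowest single frame gives the lowest instantaneous frame rate
    min_fps = 1000.0 / max(frame_times_ms)
    return average_fps, min_fps

# Hypothetical run: 90 quick frames (10 ms) during a lull, then 10 slow frames (40 ms) in a fight
frame_times = [10.0] * 90 + [40.0] * 10
avg, low = fps_summary(frame_times)
print(round(avg, 1), low)
```

Here the average comes out around 77 FPS while the fight itself runs at 25 FPS. A minimum taken from one frame can be skewed by a single hitch, so a real tool would probably want to average over a short window instead.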
 
CATALYST MAKER said:
At home I am using a 9800XT and at work I just popped in a X800 PCI-E board.

That's it? I would think you'd be using prototype dually R720 boards voltmodded by ViperJohn or something... Weak sauce :rolleyes:
 
Gavinni said:
Why use an overclocked ultra and GT and not an overclocked pro and xt?
To answer this for the third or fourth time, that is the stock clock for the BFGTech card we used and is the best selling brand of 6800 in North America retail.
 
Gavinni said:
Why use an overclocked ultra and GT and not an overclocked pro and xt?

Because the BFG 6800GT OC is the most popular video card right now and is a measly 20MHz overclock right out of the box, which may give a 1 or 2% difference. Just look at it as being another product in the lineup. If you think about it, the 6800 Ultra is nothing but an overclocked 6800GT, but that doesn't make it void ;)

edit: beaten like a Volkswagen in an H2 offroad race :mad:
 
It may be the #1 seller, but set against all of the other GTs combined it does not make a majority.
 
Gavinni said:
Why use an overclocked ultra and GT and not an overclocked pro and xt?
That one has been explained several times, you need to actually read the posts in this thread.
 
spyderz said:
Hey [H], Vampire: Bloodlines would make for a good benchmark also, since it uses the Source engine. I wonder how many changes have been made to the engine, but regardless it would still make for a good comparison between HL2 and Vampire and their use of the Source engine.

I'll check it out, thanks.
 
Didn't read the whole thread, but unless Brent turned on +r_fastzreject 1, the benchmarks are not good enough for me, as the NVIDIA cards have this enabled by default and the ATI ones don't. On CPU limited levels it could be slower, but the gains from said tweak are 40+ fps.
 
Moloch said:
Didn't read the whole thread, but unless Brent turned on +r_fastzreject 1, the benchmarks are not good enough for me, as the NVIDIA cards have this enabled by default and the ATI ones don't. On CPU limited levels it could be slower, but the gains from said tweak are 40+ fps.

that's not how the game was shipped

we benchmarked an "out of box" experience of HL2

the average gamer is not going to know about this command, they will just fire up the game, and play it
 
CATALYST MAKER said:
I did. And I still do. But I guess some would call me biased :)

I wont get into it in here though. Just wanted to pop in and say I love this game.

At home I am using a 9800XT and at work I just popped in a X800 PCI-E board.

I wonder if we could get a poll as to what settings people are playing this game with? Im curious how much everyone is cranking it.
Hey Catalyst Maker! Great job with CAT 4.12 beta drivers! :) I am playing at 1280x1024 with 6X AA and 16X ANISO with geometry instancing on. No lagging in gameplay, but I do get a minor half second loading delay in some parts of the game.
 
Brent_Justice said:
that's not how the game was shipped

we benchmarked an "out of box" experience of HL2

the average gamer is not going to know about this command, they will just fire up the game, and play it
But it's supposed to be head to head.
I agree if you're going for an out of box experience, but I think it would be worthwhile to see what happens when you enable that command, since users are reporting such large gains.
It seems kinda silly not to do it, if you even knew about the command, to satisfy curiosity.
 
Brent_Justice said:
that's not how the game was shipped

we benchmarked an "out of box" experience of HL2

the average gamer is not going to know about this command, they will just fire up the game, and play it
True, but that don't really mean the cards were exactly "apples to apples" compared. ;)

Next comes the image quality thing, I just got done comparing in-game AA/AF to forced thru control panel/radlinker and the forced one is just drop-dead gorgeous to me....and I'm getting amazing framerates at 6xAAt2 16xAF! :D

I LOVE THIS GAME!!!!

Next up, I gotta somehow tear myself away from playing on me X800 to see how it looks on a GT....but I probably won't be able to until I finish the game. :LOL:

Good benchies, sorry about me earlier confusion about v-sync....I was still pre-coffee/my head in HL2 and I thought the graphs all topped out at 100. :oops:

(BTW-HI TERRY!!!! :D )
 
Thanks to Terry, Brent, [K]yle and Anand for the demos.

benchies here. hope others post too. any recommendations, terry, for further improvement?
 
Brent_Justice said:
These were the Settings I used under the Advanced Video Options:

Model Detail = High
Texture Detail = High
Water Detail = Reflect World
Shadow Detail = High
Anti-Aliasing Mode = (None, 2X, 4X)
Filtering Mode = Anisotropic 8X
Shader Detail = High
Wait for Vertical Sync = Disabled
Hey Brent did you try using the driver forced AA/ANISO and disabling the ingame AA? Are the results the same?
 
[RIP]Zeus said:
I just want to know something here.

Why is it that all the ATI fans out there are complaining that [H] used an overclocked 6800 and compared it to a stock X800?

do these people not know these cards come OCed out of the box?
if ATI did the same thing there would be no bitching. but when NVIDIA does anything, it's looked down on?
wtf is that shit?
can someone explain, in their opinion, why this is?
I am not doing that :p I know the difference between retail product and advertised product. Most of them are grasping at straws, that's all. They didn't get their 40% lead like they hoped :p
 