NV40´s 1st benchmarks...

AMDoido · Jan 17, 2004

http://www.ttecx.de/files/news/article.php?article_file=1074033583.txt

Not bad...

Brent_Justice · Jan 17, 2004

A um, extremely rough translation: http://translate.google.com/transla...&hl=en&ie=UTF-8&oe=UTF-8&prev=/language_tools

Did I see Int16 in there?

SnakEyez187 · Jan 17, 2004

Sorta pointless without the map names and system setup...

Mikey20 · Jan 17, 2004

OMG it's going to run dual 8 pipelines!?

Brent_Justice · Jan 17, 2004

Originally posted by Mikey20
OMG it's going to run dual 8 pipelines!?

not dual 8 pipelines, that would be 16

8x2 indicates 8 pipelines with 2 tmu's each

keep in mind these are rumors, nothing has been released from nvidia as to the specs of the nv40, i don't even know them yet
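
To make the "8x2" notation concrete, here is a rough sketch of the per-clock arithmetic. The function name and the 500 MHz clock are made up purely for illustration; no NV40 clock speed is given anywhere in the thread.

```python
def fill_rates(pipelines, tmus_per_pipe, core_mhz):
    """Return theoretical (pixel, texel) fill rates in millions per second."""
    # One pixel per pipeline per clock.
    pixel_rate = pipelines * core_mhz
    # Each pipeline can sample one texel per TMU per clock.
    texel_rate = pipelines * tmus_per_pipe * core_mhz
    return pixel_rate, texel_rate


# Hypothetical 500 MHz clock, chosen only to keep the numbers readable.
print(fill_rates(8, 2, 500))   # 8x2  -> (4000, 8000)
print(fill_rates(16, 1, 500))  # 16x1 -> (8000, 8000)
```

Under these assumptions an 8x2 part and a 16x1 part have the same peak texel rate, but the 8x2 part writes half as many pixels per clock, which is why "dual 8" and "8x2" are not the same thing.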

John Reynolds · Jan 17, 2004

8x2 is very strongly rumored, though, for both NV40 and R420.

defiant · Jan 17, 2004

wuh? At the bottom of the article it states that theinquirer is the source for the info, but ive never seen any article on the inquirer containing benchmark info regarding the NV40.

AMDoido · Jan 17, 2004

Originally posted by defiant
wuh? At the bottom of the article it states that theinquirer is the source for the info, but ive never seen any article on the inquirer containing benchmark info regarding the NV40.

read it again...

My english sux and i understand it...

leukotriene · Jan 17, 2004

Did I see Int16 in there?

Yup.
FX16 is a definite improvement over FX12 - they mention a register combiner in association with this statement, though I'm not entirely clear what that has to do with it.

Good to see nvidia finally put out an 8 pipe (always) core. Pixel Shading numbers better be drastically improved if they want to be competitive. I bet they are

CleanSlate · Jan 17, 2004

sounds like complete bull shit

pakotlar · Jan 17, 2004

seems accurate... I'm interested in the 8x AA.

101 · Jan 17, 2004

Looks like all the marketing morons are going to spin the FUD between now and CeBit. Lets hope nVidia doesn't do a repeat of last year.

leukotriene · Jan 17, 2004

sounds like complete bull shit

Really now?

It sounds a lot like the most reputable rumors I've heard (Humus) circulated for some time. It could certainly be a conglomeration of such rumors thrown together with some fake benchmarks, but the NV40 has almost certainly been taped out in its original form for at least a month.

The Halo 16x12 benches do look almost too good to be true, but we have no information on what settings were used.

Brent_Justice · Jan 17, 2004

Originally posted by leukotriene
Yup.
FX16 is a definite improvement over FX12 - they mention a register combiner in association with this statement, though I'm not entirely clear what that has to do with it.

Good to see nvidia finally put out an 8 pipe (always) core. Pixel Shading numbers better be drastically improved if they want to be competitive. I bet they are

While Int16 IS better than Int12 its still Integer and not FP

I'd like to see FP24 and or FP32 only.

Pongi · Jan 18, 2004

Originally posted by Brent
I'd like to see FP24 and or FP32 only.

Why?

Brent_Justice · Jan 18, 2004

Originally posted by Yiffy
Why?

i thought it would be obvious

better precision and image quality

the r3xx already does FP24 no matter what, it doesn't ever go below that

nst6563 · Jan 18, 2004

I like the fact that they compare it to the 9800pro...which isn't even ati's fastest offering. If they're going to compare top end offerings, then at least show the benches from the 9800xt.

That's like saying "my Corvette with an intercooled turbo is faster than your hopped up v6 Camaro". ummm.... DUH!

Brent_Justice · Jan 18, 2004

Originally posted by nst6563
I like the fact that they compare it to the 9800pro...which isn't even ati's fastest offering. If they're going to compare top end offerings, then at least show the benches from the 9800xt.

That's like saying "my Corvette with an intercooled turbo is faster than your hopped up v6 Camaro". ummm.... DUH!

Well, at this point we don't know "Who" compared "What" and what the specs on the games and machines were. We don't even know if the numbers are true. So take this as rumor since there are no hard facts.

leukotriene · Jan 18, 2004

While Int16 IS better than Int12 its still Integer and not FP

Int16 could be actually preferable to FP16 depending on the number involved. As I recall, the larger the number the more likely FP16 will not be able to represent an INT16 number perfectly (because it has to use bits for the exponent).

Of course it's probably unlikely you'd see a situation in which this would be a noticeable problem on your monitor, except maybe with fog effects, and since DX9 hardware and presumably DX Next generation stuff uses floating point precision (AFAIK) exclusively, this is probably irrelevant anyway.
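
For anyone who wants to check the "large numbers" point, here is a minimal sketch. It assumes the FP16 in question uses the IEEE 754 half-precision layout (1 sign bit, 5 exponent bits, 10 mantissa bits), which is what Python's `struct` format code `e` implements; the helper names are mine.

```python
import struct


def through_fp16(value):
    """Round-trip a number through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", float(value)))[0]


def first_inexact_integer(limit=4096):
    """Return the smallest non-negative integer FP16 cannot store exactly."""
    for n in range(limit):
        if through_fp16(n) != n:
            return n
    return None


print(first_inexact_integer())  # 2049
print(through_fp16(2048))       # 2048.0
```

Every integer up to 2048 survives the round trip, but from 2049 onward some integers get rounded to a neighbor, whereas a 16-bit integer format holds every value in its range exactly.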

I'd like to see FP24 and or FP32 only

Having FP16 around so developers can drop rendering precision when they really don't need it is a good idea, but the die space it wastes encourages abuse by (ahem!) overeager driver teams and management, so I agree.

I've read at a couple places that the Deltachrome S8 has the capability to do both FP16 and FP24, but it has such a small die that this seems somehow unlikely unless you could use the FP24 units as FP16, and I dont know why you'd want to do that because it doesnt seem like anything would be faster (particularly with their 128bit chunk-only bus).

Can you think of any reason why that might be? I'm clueless.

Brent_Justice · Jan 18, 2004

Originally posted by leukotriene
Int16 could be actually preferable to FP16 depending on the number involved. As I recall, the larger the number the more likely FP16 will not be able to represent an INT16 number perfectly (because it has to use bits for the exponent).

Of course it's probably unlikely you'd see a situation in which this would be a noticeable problem on your monitor, except maybe with fog effects, and since DX9 hardware and presumably DX Next generation stuff uses floating point precision (AFAIK) exclusively, this is probably irrelevant anyway.

Having FP16 around so developers can drop rendering precision when they really don't need it is a good idea, but the die space it wastes encourages abuse by (ahem!) overeager driver teams and management, so I agree.

I've read at a couple places that the Deltachrome S8 has the capability to do both FP16 and FP24, but it has such a small die that this seems somehow unlikely unless you could use the FP24 units as FP16, and I dont know why you'd want to do that because it doesnt seem like anything would be faster (particularly with their 128bit chunk-only bus).

Can you think of any reason why that might be? I'm clueless.

the r3xx line seems to be able to do fp24 just fine in all situations, it performs great at that level, it doesn't even know what partial precision is, all i'm sayin is ATI has shown its possible to run at full precision (fp24) all the time and still have great performance, so that is kinda what we expect now, high precision and high performance

here is another rumor, i emphasize rumor, a future update to DX9 may set FP32 as full precision and anything else below being partial precision, thats just a rumor I heard the other day

but anyways, the future is towards realistic 3d images right? in order to achieve that FP at a high bit rate is going to be needed

it'll all come in time eventually though

about the DeltaChrome i dunno, i haven't used one yet, and i haven't read much about them

Pongi · Jan 18, 2004

Originally posted by Brent
i thought it would be obvious

better precision and image quality

the r3xx already does FP24 no matter what, it doesn't ever go below that

I doubt Nvidia would implement it if it didn't have a use. If it can look the same under FP24 and Int16, why take the speed hit? I honestly don't think Nvidia would bother with it if it was so horribly inadequate for anything. (I know there are demos that can show the huge differences, but I'm sure it's possible to make some demos that could demonstrate the opposite.) Back in the day you could pimp the benefits of 32-bit color in Quake3, but run Half-life in 32-bit color and it looks 99% the same, it just runs slower. (OK, it's a rough analogy, I admit.)

Not to say Int16 is suited for much more complex shader functions but for some simple ones its probably enough.

But you'll get your wish, Nvidia will probably drop Int12/16 when shader speed is sufficiently fast enough to do everything in FP24/32.

[edited - thought HL was a better comparison in the analogy]

Brent_Justice · Jan 18, 2004

Originally posted by Yiffy
I doubt Nvidia would implement it if it didn't have a use. If it can look the same under FP24 and Int16, why take the speed hit? I honestly don't think Nvidia would bother with it if it was so horribly inadequate for anything. (I know there are demos that can show the huge differences, but I'm sure it's possible to make some demos that could demonstrate the opposite.) Back in the day you could pimp the benefits of 32-bit color in Quake3, but run Quake1 in 32-bit color and it looks the same, it just runs slower. (OK, it's a rough analogy, I admit.)

Not to say Int16 is suited for much more complex shader functions but for some simple ones its probably enough.

But you'll get your wish, Nvidia will probably drop Int12/16 when shader speed is sufficiently fast enough to do everything in FP24/32.

but see, thats the thing, the r3xx line already does shader speed and everything else very fast in fp24, that is all the hardware knows, and internally I think it does do some internal precision in the pipelines at fp32

Lazier_Said · Jan 18, 2004

Originally posted by nst6563
I like the fact that they compare it to the 9800pro...which isn't even ati's fastest offering. If they're going to compare top end offerings, then at least show the benches from the 9800xt.

The 9800P was running at 466/732 which is somewhat faster than a stock 9800XT.

WalteRr · Jan 18, 2004

Originally posted by Lazier_Said
The 9800P was running at 466/732 which is somewhat faster than a stock 9800XT.

Some what? the 9800 xt is 412mhz..huge difference.

panzerAmd · Jan 18, 2004

We know what they are testing the " NV 40 " at from this.

UT2003 1600 x of 1200 pixels|4xAA/8xAF = 100 fps

But we don't know what map it's being run on, and we don't know what settings the Radeon 9800 is running with, because it doesn't state FSAA or AF.

ATi Radeon 9800 pro (466/366)UT2003 flyby 1600 x of 1200 pixels|ÂA/ÅF = 70 fps

We also don't know what in-game settings they are using or if the "NV 40" is filtering the way UT tells it to. It doesn't prove much to run a game that does not use pixel shaders very much (I think the highest it uses is PS 1.4), since it's not very forward-looking. By the way, where would they get the benches from? If it were an Nvidia employee, wouldn't they get fired if they were caught (or it could be a PR move)?

Princess_Frosty · Jan 18, 2004

The numbers are too rounded (nearest 5 fps). It seems more of a guess than a benchmark, based on the system specs of the card. That's not to say it can't be accurate, but I certainly wouldn't rely on it as info.

As for FP16 and FP32, as Nvidia stated, it's the natural progression of numbers in base 2. 24 is like a stop in between, and it just happens to perform well for DX9 because it only just meets the minimum requirements. Nvidia's approach of using a mix of FP32 and FP16 is far more sensible, and they have shown that when optimised correctly it can beat the Radeon's architecture (although it lacks in shader performance due to its pipeline structure). It's something ATI will have to get used to themselves: they will have to make the jump to FP32 sooner or later, and when they do, they're going to have to learn how to mix precision. Their current system of doing everything at one precision won't carry on well when they're dealing with shaders needing to be run at 32, 24 and/or 16. Running them all at 32 would be a severe waste, hence Nvidia's problems now.

Having tackled this problem first i feel Nvidia will probably do better in the next range of cards, they already have far faster core and memory speed technology with a pipeline increase they're going to be catching up on ATI real quick.

*edit*

it was on the inquirer

http://www.theinquirer.net/?article=13598

John Reynolds · Jan 18, 2004

Originally posted by Princess_Frosty
Having tackled this problem first i feel Nvidia will probably do better in the next range of cards, they already have far faster core and memory speed technology with a pipeline increase they're going to be catching up on ATI real quick.

ATI is also rumored to be doubling their units in R420 (texture, shader), and if that's the case and they can keep their clock speeds up, that means they will have a significant PS 2.0 performance lead on NVIDIA in the next generation. Of course it's going to be hard to accurately determine just which chip is faster with all the benchmark cheating going on.

But time will tell.

goomba_1 · Jan 18, 2004

Originally posted by WalteRr
Some what? the 9800 xt is 412mhz..huge difference.

Core clock speed increases mean almost nothing for 9800s, even huge ones. I took my pro from 380 to 420 and it didn't change anything.

nst6563 · Jan 18, 2004

Originally posted by Lazier_Said
The 9800P was running at 466/732 which is somewhat faster than a stock 9800XT.

sorry, didn't pay attention to their o/c'd numbers. still, it's a crappy comparison if it's even true.

show some real numbers, from real games, and for sh*t's sake show the hardware on which the tests were run.

Almost sounds like an info sheet on the Phantom...some magical numbers/specs that were gotten from somewhere but no one can find anything to back it up....

ZenOps · Jan 18, 2004

I can see INT16 looking better than FP16 in some situations. That makes quite a bit of sense to me.

It depends on how skilled the programmer is, and what tools he is using. If he is still using a 3D system and tools based on integers, then it can be very tough to lineup FP textures. IE: it would probably be like trying to fix a watch with a wrench instead of a pair of tweezers.

But the best scenario is of course to get programmers some floating point precision tools to start with, so they can make some very detailed and precise (otherwise beautiful) 3D scenes far and above what they could ever make with an integer based system. With the proper tools there is no question that FP looks better than INT every time.

As with some of the INQ's reports, take these NV40 benchmarks with a grain of salt. Anyone remember the PCpop scores that showed up about two months before the NV30?

zandor · Jan 18, 2004

FP32 should not produce still clear drops of performance, there complete interpretation of architecture on FP32 precision (comma computations with 32 bits precision) very transistoraufwaendig and at present is necessary.

This is a bit garbled & I don't speak German, but my understanding of this is that it's saying the NV40 is fully 32-bit.

Pixelshader is to control in the meantime HP of 1,4 instructions with 16 bits Integerpraezision. This points on an extension of the Registercombiner, which should clearly increase the HP achievement in the Int16 range, as this was to be observed range in the Int12 with the NV30/35.

I think the 16-bit vs. 12-bit integer handling is when dealing with PS 1.4 instructions. I think it means the card can store either an FP32 pixel or 2 Int 16 pixels in a register, which makes sense since the FP32 pixel needs twice as many bits.

ZenOps · Jan 18, 2004

BTW, IF the NV40 ends up being slower than the R420/423 (and I'm not saying its going to be) or just barely beats out a 9800XT in DX9, I'm going to start looking at the floating point calculation speed.

DX8 games seem "ok" with the FX 5700/5900, but DX9 speed still seems to be a bit lacking (and very lacking on the 5600/5800). Nvidia has already addressed its insatiable memory appetite by mating them with the fastest (and most expensive) memory available. If the NV40 has DX9 speed issues, I'm thinking it might be a function of the floating point core architecture itself.

John Reynolds · Jan 18, 2004

Originally posted by ZenOps

DX8 games seem "ok" with the FX 5700/5900, but DX9 speed still seems to be a bit lacking (and very lacking on the 5600/5800). Nvidia has already addressed its insatiable memory appetite by mating them with the fastest (and most expensive) memory available. If the NV40 has DX9 speed issues, I'm thinking it might be a function of the floating point core architecture itself.

The speed differences are already a result of the core architecture differences. For example, NV3x's performance hit with register usage is well documented. As for your statement on INT16 vs. FP16, that's a non-issue if you're comparing against ATI's FP24. Basically I think it's a waste of die space to support all these precision levels instead of just spending the logic and transistors on getting your float performance up to snuff. It also makes doing a fair and accurate comparison between the two products a massive headache, because it's hard to tell when NVIDIA is lowering IQ.

tazdevl · Jan 18, 2004

Originally posted by goomba_1
Core clock speed increases mean almost nothing for 9800s, even huge ones. I took my pro from 380 to 420 and it didn't change anything.

Then you have a system or are using an app where the CPU is the limiting factor, not the GPU.

ZenOps · Jan 18, 2004

I'm not sure how much extra die space would be required to support the intermediate modes, I can't imagine it would be a whole lot. Really I consider FP16 to be the intermediate mode and FP24 as the native mode, just like I consider a 24-bit colour JPG to be a native mode and 16 or 32 bit colour Bitmaps to be intermediate modes.

RGB each with 8-bits fits easily into 24. XYZ precision fits easier into 24. 16 and 32 are actually oddball sizes.

Like color depths. There are some good reasons to have monochrome 80x40 modes, 8-bit 256 colour, and 16 bit when every card out there supports 32-bit colour. For backwards compatibility and standards' sake, I think it's pretty important.

zandor · Jan 18, 2004

Originally posted by ZenOps
RGB each with 8-bits fits easily into 24. XYZ precision fits easier into 24. 16 and 32 are actually oddball sizes.

The catch is the numbers in this case are per channel, so FP24 has a 24-bit floating point number to represent each of red, green, blue, and alpha, resulting in a 96-bit pixel. FP16 gives you a 64-bit pixel and FP32 gives a 128-bit pixel.
The latter two make a lot more sense if you're trying to write out full-precision data to memory.
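
The per-pixel sizes above are just bits-per-channel times four. A quick sketch of that arithmetic follows; the power-of-two check is my own guess at why 64 and 128 bits "make more sense" for memory writes, not something stated in the article.

```python
CHANNELS = ("red", "green", "blue", "alpha")


def pixel_bits(bits_per_channel, channels=CHANNELS):
    """Total bits for one pixel when every channel uses the same precision."""
    return bits_per_channel * len(channels)


def is_power_of_two(n):
    """True when n is a positive power of two (assumed here to align nicely in memory)."""
    return n > 0 and n & (n - 1) == 0


for precision in (16, 24, 32):
    bits = pixel_bits(precision)
    print(f"FP{precision}: {bits} bits/pixel, power of two: {is_power_of_two(bits)}")
# FP16: 64 bits/pixel, power of two: True
# FP24: 96 bits/pixel, power of two: False
# FP32: 128 bits/pixel, power of two: True
```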

ZenOps · Jan 18, 2004

I'm not so sure Alpha is necessary.

Sort of like 32-bit colour for DX8 games, its RGBA 8-8-8-8. You can always move the Alpha effects to a specialty renderer. For 16-bit colour its usually RGB 5-5-5 or 5-6-5 with no alpha.
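
For reference, this is roughly what those two layouts look like when packed into an integer. It is only a sketch: the function names are made up, and the channel order (red in the high bits) is an assumption, since real formats differ in ordering.

```python
def pack_rgba8888(r, g, b, a):
    """Pack four 8-bit channels into one 32-bit value (8-8-8-8)."""
    return (r << 24) | (g << 16) | (b << 8) | a


def pack_rgb565(r, g, b):
    """Pack 8-bit channels into 16 bits (5-6-5), dropping the low bits and alpha."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


print(hex(pack_rgba8888(255, 0, 0, 255)))  # 0xff0000ff
print(hex(pack_rgb565(255, 0, 0)))         # 0xf800
print(hex(pack_rgb565(255, 255, 255)))     # 0xffff
```

In the 5-6-5 layout all 16 bits go to color, so there is simply no room left for an alpha channel.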

With a proper DX9 pixel shader for alpha effects/fog, you wouldn't need alpha. You could just plop in a transparent/translucent shader just before it hits the screen (sort of like spot lighting, and all the other DX9 special effects.)

I think this is how ATi has done it in the past, the fog is actually put in later and not a part of the major rendering pipeline (which might explain why a lot of early boards & games were slow with fog on)

I've never believed that Alpha has ever been very important. In the real world, there are a rare few things that are in any way "see-through" with perhaps the exception of fire, water, fog, and sexy underwear. It should be a specialized instruction.
