Pixel-, Shader-, Vertex- and ROP-pipe numbers?

Terra

I got presented with this in another thread:
(link to thread: http://www.hardforum.com/showthread.php?p=1029456787#post1029456787 )

That's what I have been saying. You and the other guy seem to not understand how newer video cards work. The 7800GS and GS+ Bliss use only 16 of the 20/24 pipes for output pixel fill rate... do you understand what this means? In order to FULLY utilize the 20/24 as pixel pipes, ROPs would have to = 20/24. They don't; they are both 16.

This is why a 6800 Ultra (16 pixel pipes/16 ROPs) at a 450MHz core has the EXACT same PIXEL fillrate as a 7800 GS+ at a 450MHz core... which is 7.2 billion pixels/sec. If the GS+ had 24 REAL and USABLE pixel pipelines, it would also need to have 24 ROPs, and only then would its PIXEL fill rate be 10.8 billion pixels/sec. So it is WRONG to call these cards 20/24-pixel-pipe cards when ONLY 16 of those pipes can be USED for pixel output.
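For reference, the fillrate arithmetic in that quote does check out. A minimal sketch (peak pixel fillrate = pixels written per clock, i.e. the ROP count, times core clock, with the clocks as stated in the quote):

# Peak pixel fillrate = pixels written per clock (the ROP count) x core clock.
def pixel_fillrate_gpix(rops, core_mhz):
    """Peak pixel fillrate in billions of pixels per second."""
    return rops * core_mhz / 1000

print(pixel_fillrate_gpix(16, 450))  # 6800 Ultra @ 450 MHz -> 7.2
print(pixel_fillrate_gpix(16, 450))  # 7800 GS+  @ 450 MHz -> 7.2 (same 16 ROPs)
print(pixel_fillrate_gpix(24, 450))  # hypothetical 24-ROP part -> 10.8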

Anyone care to lay out, in an educated way, the pixel-, shader-, vertex-, and ROP-pipelines of the following cards:
Any X1800XT
Any X1900XT
Any 6800GT
Any 7800GT
Any 7900GT
Any regular 7800GS(AGP)
And the Gainward 7800GS and 7800GS+

Terra - I (and others) have tried and failed...
 
Terra said:
Anyone care to lay out, in an educated way, the pixel-, shader-, vertex-, and ROP-pipelines of the following cards? X1800XT, X1900XT, 6800GT, 7800GT, 7900GT, any regular 7800GS (AGP), and the Gainward 7800GS and 7800GS+.

Terra - I (and others) have tried and failed...


Umm these are all easy and freely available..

X1800XT: 16 texture, 16 pixel shader, 16 ROP
X1900XT: 16 texture, 48 pixel shader, 16 ROP
6800 GT: 16 texture/pixel shader, 16 ROP
7800 GT: 20 texture/pixel shader, 16 ROP
7900 GT: 24 texture/pixel shader, 16 ROP

Etc.

Basically ATI has separate texture units; on Nvidia they are part of the pixel shader pipeline. So however many pipes an Nvidia card has is its texture unit count.

ATI has separate TMUs. They all have separate ROPs..

Vertex pipes are whatever is in the spec..easy to look up

Basically the problem is bias toward Nvidia..people continue to use pipelines in reference to Nvidia, but no longer ATI..when neither or both should have pipelines.

Some people will say X1900XTX is a 16 pipe card..which is true in a way, yet it has much more raw shading power than a 7900GTX..

I just think people should be consistent: they call Nvidia cards 24-pipeline cards when they are not, but they call ATI cards 16-pipeline cards.

Bottom line all the current high end cards can only write 16 pixels at a time I think..hence they are all 16 pipe cards..but that's irrelevant to performance now anyway..

Basically ATI's R580 has a crapload more pixel shader power than Nvidia's G71, but it's highly texture limited, so it ends up about as fast, maybe a little faster.
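To put rough numbers on that last point, a sketch (650 MHz clocks for both the X1900XTX and 7900GTX, counting one op per ALU per clock and one fetch per TMU per clock, which is a big simplification of both architectures):

# Simplified per-second throughput: one shader op per ALU per clock,
# one texture fetch per TMU per clock (real ALUs issue more than this).
parts = {
    "X1900XTX (R580)": {"alus": 48, "tmus": 16, "mhz": 650},
    "7900GTX (G71)":   {"alus": 24, "tmus": 24, "mhz": 650},
}
for name, p in parts.items():
    print(f'{name}: ~{p["alus"] * p["mhz"] / 1000:.1f}G shader ops/s, '
          f'~{p["tmus"] * p["mhz"] / 1000:.1f}G texels/s')
# R580 comes out with ~2x the raw shader rate, but G71 has ~1.5x the
# texture rate, which is why texture-heavy games end up roughly even.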
 
"We know many people get caught up in pipeline counts debate lately. The GeForce 7800 GTX does have 24 pixel pipelines and 16 ROPs, while the GeForce 7800 GT has 20 pixel pipelines. We are starting to move into a time now however where pipeline count really isn’t as important as a performance indicator as it used to be. In the past, you could simply look at the fillrate and guess the performance. With video cards today, I think we need to start moving beyond this kind of thinking and not get so caught up in the details that may sway our purchasing decisions. Definitions and architectures are changing; how a pipeline is defined is changing; and how performance is derived from fillrate is changing. We should simply look at the game play performance and image quality delivered by each card and make an informed purchase decision based on that criteria."

This is a quote from a review here at HardOCP. Post some facts to prove us wrong, lawgiver. So far a general chart from 3DMark is little information. I do know my system got significantly higher marks than other 7800GS cards at Futuremark. I read the comparison charts, and it did quite well. lol
 
Sharky974 said:
ATI has separate TMUs. They all have separate ROPs..

ATI actually decoupled the ALU. That's why the X1900 can have 48 of them yet only 16 TMUs/ROPs. ATI's ROPs are NOT separate - they are coupled to the TMUs.

nV decoupled the ROP, which is why they can have 24 TMUs/ALUs, yet only 16 ROPs.
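A minimal way to picture that difference, using the unit counts given in this thread (which pair of units is tied 1:1 is the only structural difference described):

# R580: ROPs are coupled to TMUs, ALUs float free (3 per TMU).
# G71:  ALUs are coupled to TMUs, ROPs float free (1.5 TMUs per ROP).
r580 = {"alu": 48, "tmu": 16, "rop": 16}
g71  = {"alu": 24, "tmu": 24, "rop": 16}

assert r580["tmu"] == r580["rop"]   # ATI: ROP coupled to TMU
assert g71["alu"] == g71["tmu"]     # NV:  ALU coupled to TMU
print("R580 ALU:TMU =", r580["alu"] / r580["tmu"])  # 3.0
print("G71  TMU:ROP =", g71["tmu"] / g71["rop"])    # 1.5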
 
Fwiw, the 3D chip tables over at Beyond3D may be helpful for future reference: http://www.beyond3d.com/misc/chipcomp/

Sharky974 said:
Basically the problem is bias toward Nvidia..people continue to use pipelines in reference to Nvidia, but no longer ATI..when neither or both should have pipelines.

Some people will say X1900XTX is a 16 pipe card..which is true in a way, yet it has much more raw shading power than a 7900GTX..

Iirc, the 'official' lexicon has revised the definition of pipelines to mean TMUs, the units that access textures.

Bottom line all the current high end cards can only write 16 pixels at a time I think..hence they are all 16 pipe cards..but that's irrelevant to performance now anyway..

It's still a bit relevant, I think. For fairly trivial operations like shadow map generation, I wouldn't mind a bit more power in the ROPs, since SM generation can just chew through fillrate like nobody's business, without stressing the pixel shaders much at all.
 
Brent_Justice said:
I'm gonna go out on a limb and say it has the 7800 GS GPU.

Which version of a 7800GS GPU then? :p

Terra - That is the answer I am seeking ;)
 
You do not need to have the same number of texture units / pixel shaders / z-test units as ROPs. The industry has been doing this for years, going back as far as the Voodoo 2.

The Voodoo 2 had 2 texture units and 1 ROP, allowing you to apply two textures to one pixel in a single pass.

The GeForce 2 extended this idea, having 8 texture units and 4 ROPs. Same for the Radeon 8500.

The GeForce FX 5800/5900 series had a similar setup: 8 pixel shaders, 8 texture units, 4 ROPs. So did the GeForce 6600.

Now Nvidia has moved from a 2:1 ratio to a 1.5:1 ratio with the 6800 GS and the 7600/7800/7900. This allows for better AA performance than the 2:1 ratio (more ROPs relative to the shaders), while recognizing that games are starting to need more pixel shading power.

ATI's 3:1 ratio on the x1900 series is just a more radical approach, but as most reviews will tell you, those cards aren't even using HALF their theoretical shader power. According to what I've read, this is due to a lack of shader-heavy loads in today's games, plus inefficiency in the architecture. Even in PURE PS benchmarks, the x1900 series only reaches about 75% of its theoretical performance.

Why did I mention this? Because the x1600 series is hobbled, even though it uses the exact same ratios as the x1900 series. It also suffers from inefficiencies in the architecture, and when you combine this with TOO FEW texture, z-test and ROP units (4 / 8 / 4), it gets stomped by the 7600 GT (a card it should be able to match).
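A quick tabulation of the texture-to-ROP ratios named in this post (counts as given above):

# TMU and ROP counts for the parts mentioned in this post.
cards = [
    ("Voodoo 2",        2,  1),
    ("GeForce 2",       8,  4),
    ("Radeon 8500",     8,  4),
    ("GeForce FX 5900", 8,  4),
    ("GeForce 6600",    8,  4),
    ("GeForce 6800 GS", 12, 8),
    ("GeForce 7900 GT", 24, 16),
]
for name, tmus, rops in cards:
    print(f"{name}: {tmus} TMU / {rops} ROP = {tmus / rops:g}:1")
# Everything up to the FX/6600 era prints 2:1; the 6800 GS and
# 7600/7800/7900 generation prints 1.5:1, matching the shift described.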
 
ATI's 3:1 ratio on the x1900 series is just a more radical approach, but as most reviews will tell you, those cards aren't even using HALF their theoretical shader power. According to what I've read, this is due to a lack of shader-heavy loads in today's games, plus inefficiency in the architecture. Even in PURE PS benchmarks, the x1900 series only reaches about 75% of its theoretical performance.

What pixel shader benches have YOU been looking at?! In raw, arithmetic-driven pixel shader execution (which the X1600 and X1900 perform very well at, and which is going to be the common case in the near future), the X1900XTX shows a massive performance jump over the X1800XT (see the "X1900 is % faster than X1800" column in the tables). I mean, 200% FASTER (i.e. about 3x the performance) in pure pixel shader execution matches up perfectly with the hardware specs containing 3x the ALUs. When you see games using shaders that actually have a large amount of arithmetic going on, you will likely see a fairly substantial jump in performance when comparing the 7900GTX to the X1900XTX.
 
G70, GeForce 7800 GT PCIe core
How many pipes would this core have if bridged to AGP?
 
AMD Fan said:
G70, GeForce 7800 GT PCIe core
How many pipes would this core have if bridged to AGP?

The number of pipes wouldn't change. The core is independent of PCIe/AGP, and that's why it's typically up to 3rd parties like Asus, Sapphire, BFG, etc. to decide whether a card is PCIe or AGP.
 
Cypher19 said:
What pixel shader benches have YOU been looking at?! In raw, arithmetic-driven pixel shader execution (which the X1600 and X1900 perform very well at, and which is going to be the common case in the near future), the X1900XTX shows a massive performance jump over the X1800XT (see the "X1900 is % faster than X1800" column in the tables). I mean, 200% FASTER (i.e. about 3x the performance) in pure pixel shader execution matches up perfectly with the hardware specs containing 3x the ALUs. When you see games using shaders that actually have a large amount of arithmetic going on, you will likely see a fairly substantial jump in performance when comparing the 7900GTX to the X1900XTX.

Let's have a closer look at those numbers:

First of all, let's get this straight: the x1900 XTX is clocked 25 MHz faster (650 MHz vs. 625 MHz) than the x1800 XT. If you wanted a real clock-for-clock comparison, you'd want the x1900 XT. That said, we can scale the numbers for clock speed.

That means the LARGEST improvement over the x1800 XT is actually only 187%. Also, the AVERAGE improvement over the x1800 XT in those "pure pixel shader" tests (ignoring the irrelevant PS 1.1 and 1.4, which only make the x1900 XT look bad) is only 150%!

Don't like those numbers? How about you take a look at the ShaderMark results on the same page? After adjusting for clock speed, only ONE "pure shader" test managed to top a 100% improvement, let alone the 150% seen in the previous tests.
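The clock adjustment being applied here, as a sketch (X1900 XTX at 650 MHz vs. X1800 XT at 625 MHz; the exact result depends on rounding, which is likely why the post above lands on ~187%):

# Normalize a measured "X% faster" figure for a clock-speed difference.
def clock_adjusted_gain(percent_faster, clk_new_mhz, clk_old_mhz):
    multiplier = 1 + percent_faster / 100          # 200% faster -> 3.0x
    adjusted = multiplier * clk_old_mhz / clk_new_mhz
    return (adjusted - 1) * 100                    # back to "percent faster"

# X1900 XTX (650 MHz) measured 200% faster than the X1800 XT (625 MHz):
print(clock_adjusted_gain(200, 650, 625))  # ~188% faster clock-for-clock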

Sure, it's a hell of an improvement over the x1800 series, and I'm not saying the x1900 series is BAD. It outperforms its competition in the majority of games, and is priced well to reflect its performance. What I'm trying to state is WHY the x1600 series is so terrible: because the 3:1 ratio is WASTEFUL. If you don't think it is, tell me: how can the 7600 GT, a card with practically the same core and memory clocks as the x1600 XT, stomp the x1600 XT into the ground even in the shader-fest that is Oblivion?

The ONLY REASON I mentioned the x1900 series as a lead-in for the x1600 series is because EVERYONE AND THEIR DOG has done an in-depth performance analysis of the x1900 series. For the x1600 series, on the other hand, reviewers have simply blamed the poor performance on the 128-bit memory bus.

Well, now we know that much can be accomplished with fast 128-bit memory, but still no one analyzes the POS that is the x1600 XT. So I mention the x1900 to point to the x1600's REAL fault: pixel shader inefficiency combined with fewer texture units than it needs. Why else would the 7600 GT be 30-40% faster in most games, even shader-intensive ones?
 
defaultluser said:
That means the LARGEST improvement over the x1800 XT is actually only 187%. Also, the AVERAGE improvement over the x1800 XT in those "pure pixel shader" tests (ignoring the irrelevant PS 1.1 and 1.4, which only make the x1900 XT look bad) is only 150%!

For what it's worth, you DO realize that the "150%" increase means that the performance is being multiplied not by 1.5, but by 2.5, right?
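In other words:

# "N% faster" means a multiplier of (1 + N/100), not N/100.
for pct in (100, 150, 200):
    print(f"{pct}% faster = {1 + pct / 100}x the performance")
# 100% faster = 2.0x, 150% faster = 2.5x, 200% faster = 3.0x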

Don't like those numbers? How about you take a look at the ShaderMark results on the same page? After adjusting for clock speed, only ONE "pure shader" test managed to top a 100% improvement, let alone the 150% seen in the previous tests.

I'd have to look more carefully at what results/code ShaderMark is using to get a better opinion on this. It's possible that those shaders have a few texture lookups going on. Still, double the performance on the same generation of cards is REALLY damn impressive, and nothing at all to scoff at.

What I'm trying to state is WHY the x1600 series is so terrible: because the 3:1 ratio is WASTEFUL.

No, it's because the X1600 has 1/4 of the shading power of the X1900 (i.e. an 800x600 fullscreen shader on an X1600 will run at about the same speed as the same shader at 1600x1200 on an X1900) and 5 VS units instead of 8. Plus, a year or two from now, it WON'T be wasteful, as I've said.
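That resolution claim is just pixel-count arithmetic (the 12-vs-48 ALU counts are the commonly listed figures for RV530 vs. R580):

# 1/4 the shading power on 1/4 the pixels gives roughly the same frame time.
pixels_low  = 800 * 600     # 480,000 pixels on the X1600
pixels_high = 1600 * 1200   # 1,920,000 pixels on the X1900
print(pixels_high / pixels_low)  # 4.0, matching the 48/12 ALU ratio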

If you don't think it is, tell me: how can the 7600 GT, a card with practically the same core and memory clocks as the x1600 XT, stomp the x1600 XT into the ground even in the shader-fest that is Oblivion?

Chances are Oblivion is still texture-bottlenecked and uses a lot of texture lookups in its shaders. I don't think it's wasteful at all, because that texture bottleneck will be gone in the future, and the X1600 will be able to perform relatively better in future games. People buying mainstream cards like the *600 cards aren't as concerned with how they play games now; they're more concerned with how they'll play games 2, 3, or 4 years down the road.

And btw, this isn't even including the to-be-greater dependency on branching in pixel shaders, which the entire X1000 series just hauls ass at compared to the 7000 series.
 
Ok, this is what the Gainward 7800GS Bliss 512MB AGP reports:

$ffffffffff ----------------------------------------------------------------
$ffffffffff NVIDIA specific display adapter information
$ffffffffff ----------------------------------------------------------------
$0100000000 Graphics core : NV47/G70 revision A1 (20pp,7vp)
$0100000001 Hardwired ID : 0090 (ROM strapped to 0093)
$0100000002 Memory bus : 256-bit
$0100000003 Memory type : DDR3 (RAM configuration 05)
$0100000004 Memory amount : 524288KB
$0100000100 Core clock domain 0 : 275.400MHz
$0100000101 Core clock domain 1 : 275.400MHz
$0100000102 Core clock domain 2 : 275.400MHz
$0100000006 Memory clock : 627.750MHz (1255.500MHz effective)
$0100000007 Reference clock : 27.000MHz
$010000000c HSI bridge : BR02 PCIE-to-AGP
 
And this is what you get on a Gainward Bliss 7800GS+ AGP (output from RivaTuner 2 RC 16 with ForceWare 91.31):

$ffffffffff ----------------------------------------------------------------
$ffffffffff NVIDIA specific display adapter information
$ffffffffff ----------------------------------------------------------------
$0100000000 Graphics core : NV49/G71 revision A2 (24pp,8vp)
$0100000001 Hardwired ID : 0290 (ROM strapped to 0293)
$0100000002 Memory bus : 256-bit
$0100000003 Memory type : DDR3 (RAM configuration 07)
$0100000004 Memory amount : 524288KB
$0100000100 Core clock domain 0 : 450.000MHz
$0100000101 Core clock domain 1 : 450.000MHz
$0100000102 Core clock domain 2 : 450.000MHz
$0100000006 Memory clock : 627.750MHz (1255.500MHz effective)
$0100000007 Reference clock : 27.000MHz
$010000000c HSI bridge : BR02 PCIE-to-AGP

$ffffffffff ----------------------------------------------------------------
$ffffffffff NVIDIA VGA BIOS information
$ffffffffff ----------------------------------------------------------------
$1100000000 Title : GeForce 7800 GS VGA BIOS
$1100000002 Version : 5.71.22.12.03
$1100000100 BIT version : 1.00
$1100000200 Core clock : 450MHz
$1100000201 Memory clock : 660MHz
$1100010000 Performance level 0 : 400(+20)MHz/625MHz/1.20V/20%
$1100010001 Performance level 1 : 450MHz/625MHz/1.20V/50%
$1100020000 VID bitmask : 00000001b
$1100020100 Voltage level 0 : 1.20V, VID 00000001b
$1100030001 Core thermal compensation : 8°C
$1100030002 Core thermal threshold : 125°C
$1100030004 Thermal diode gain : 0.046°C
$1100030005 Thermal diode offset : -250.510°C

$ffffffffff ----------------------------------------------------------------
$ffffffffff Display adapter information
$ffffffffff ----------------------------------------------------------------
$0000000000 Description : NVIDIA GeForce 7800 GS
$0000000001 Vendor ID : 10de (NVIDIA)
$0000000002 Device ID : 00f5
$0000000003 Location : bus 1, device 0, function 0
$0000000004 Bus type : AGP revision 3.0
$0000000005 AGP status : enabled
$0000000006 AGP rate : 4x 8x supported, 8x selected
$0000000007 AGP SBA : hardwired, enabled
$0000000008 AGP FW : supported, enabled
$0000000009 Base address 0 : f8000000 (memory range)
$000000000a Base address 1 : e0000000 (memory range)
$000000000b Base address 2 : f9000000 (memory range)
$000000000c Base address 3 : none
$000000000d Base address 4 : none
$000000000e Base address 5 : none

So in the NVIDIA Control Center the board is named "7800 GS", but the chip is really a G71 (i.e. a 7900) with 24 pixel / 8 vertex pipelines.

Greetings, Karsten
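If anyone wants to pull those pipe counts out of a RivaTuner dump automatically, here's a minimal sketch (assuming the "(NNpp,NNvp)" field format shown in the dumps above):

import re

# Parse the "Graphics core" line of a RivaTuner dump, e.g.:
#   $0100000000 Graphics core : NV49/G71 revision A2 (24pp,8vp)
line = "$0100000000 Graphics core : NV49/G71 revision A2 (24pp,8vp)"
m = re.search(r"Graphics core\s*:\s*(\S+).*\((\d+)pp,(\d+)vp\)", line)
if m:
    core, pp, vp = m.group(1), int(m.group(2)), int(m.group(3))
    print(f"{core}: {pp} pixel pipes, {vp} vertex pipes")
    # -> NV49/G71: 24 pixel pipes, 8 vertex pipes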
 