Something that is pipelined can still have 1 CPI?

Cheetoz

Can someone convince me why a CPU that is pipelined would still be considered as having 1 Clock Per Instruction? Why is it based on a filled pipeline and not the 7 cycles you would need to fill it?
 
The pipeline consists of many stages, and each stage takes 1 clock cycle.

I am not sure what you mean by a filled pipeline, or by having to fill it. No matter what, each stage takes 1 clock cycle, whether or not the pipeline is "full".

Maybe you can rephrase your question?
 
With a pipelined processor you aspire to have one instruction occupying each stage of the pipeline at any given moment. At the end of every clock cycle, each stage finishes its portion of work on its instruction and passes it on to the next stage. The final stage, each clock, does the last segment of work and retires that instruction. Each time a stage passes its instruction on to the next stage, it grabs the instruction from the previous stage (most of the time), so normally every stage is always working on something. For the most part programs are large enough that the vast majority of the time you will keep the pipeline full and maintain that throughput.

Each instruction would have a latency of 7 cycles sure, but the CPU overall retires 1 instruction per clock under typical operation.
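To make the latency vs. throughput point concrete, here is a minimal sketch of a clock-by-clock simulation. It assumes an idealized 7-stage in-order pipeline with no stalls and one instruction issued per clock; the function name `simulate_pipeline` and its parameters are made up for illustration.

```python
def simulate_pipeline(num_instructions, depth=7):
    """Step an idealized pipeline one clock at a time.

    Returns a dict mapping instruction index -> clock cycle on which it retires.
    Assumes no stalls and one new instruction available every clock.
    """
    stages = [None] * depth   # stages[0] is the first stage, stages[-1] the last
    retire_cycle = {}
    next_instr = 0
    cycle = 0
    while len(retire_cycle) < num_instructions:
        cycle += 1
        # A new instruction enters the first stage if any are left.
        incoming = next_instr if next_instr < num_instructions else None
        if incoming is not None:
            next_instr += 1
        # Every instruction already in flight moves forward one stage.
        stages = [incoming] + stages[:-1]
        # Whatever sits in the last stage finishes at the end of this clock.
        done = stages[-1]
        if done is not None:
            retire_cycle[done] = cycle
    return retire_cycle


retired = simulate_pipeline(1000)
total_cycles = max(retired.values())
print(retired[0])               # latency of the first instruction, in cycles
print(retired[1] - retired[0])  # gap between consecutive retirements
print(total_cycles / 1000)      # average cycles per instruction
```

Under these assumptions the first instruction takes 7 cycles to come out, every later one comes out 1 cycle after the one before it, and the average over 1000 instructions is just over 1 cycle per instruction.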
 
Because a pipelined CPU is capable of completing 1 instruction every clock cycle.
 
With a pipelined processor you aspire to have one instruction occupying each stage of the pipeline at any given moment. At the end of every clock cycle, each stage finishes its portion of work on its instruction and passes it on to the next stage. The final stage, each clock, does the last segment of work and retires that instruction. Each time a stage passes its instruction on to the next stage, it grabs the instruction from the previous stage (most of the time), so normally every stage is always working on something. For the most part programs are large enough that the vast majority of the time you will keep the pipeline full and maintain that throughput.

Each instruction would have a latency of 7 cycles sure, but the CPU overall retires 1 instruction per clock under typical operation.

This is what you're looking for. A pipelined processor has a latency, but that doesn't necessarily mean the CPU completes fewer than 1 instruction per clock cycle. The 1 instruction per clock cycle figure is a measure of throughput.

The difference can be described with these two cases:

1.) Say your program only sends the CPU 1 instruction every second. That instruction takes 7 clock cycles to process (assuming a 7-stage pipeline), so you could possibly say that your CPU runs at 1/7 instructions per clock, since latency dominates.

2.) Now your program is sending 1 instruction to the CPU every clock cycle. After the first 7 cycles the pipeline is full, and from then on the CPU completes one instruction every clock cycle, so you would say that your CPU is performing at 1 instruction per clock cycle.

In almost all applications, we see case 2.
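The two cases can be put into numbers with a rough back-of-the-envelope calculation. This is only a sketch: `average_cpi` and `issue_gap` are names made up here, and case 1 is modeled by assuming each instruction is issued only after the previous one has left the pipeline (a gap of 7 cycles), which ignores any extra idle time between instructions.

```python
def average_cpi(num_instructions, issue_gap, depth=7):
    """Average clock cycles per instruction for an idealized pipeline.

    issue_gap is the number of clocks between issuing consecutive
    instructions (1 means back-to-back). No stalls are modeled.
    """
    # The last instruction enters (n - 1) * issue_gap clocks after the first
    # and then needs `depth` clocks to pass through every stage.
    total_cycles = (num_instructions - 1) * issue_gap + depth
    return total_cycles / num_instructions


case1 = average_cpi(1000, issue_gap=7)  # one instruction in flight at a time
case2 = average_cpi(1000, issue_gap=1)  # a new instruction every clock
print(case1)  # 7.0
print(case2)  # 1.006
```

With the pipeline kept full, the 7-cycle fill cost is paid once and spread over every instruction, which is why the quoted figure approaches 1 as the program gets longer.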
 