<architecture> A sequence of functional units ("stages") which performs a
task in several steps, like an assembly line in a factory. Each functional unit
takes inputs and produces outputs which are stored in its output buffer. One
stage's output buffer is the next stage's input buffer. This arrangement allows
all the stages to work in parallel, thus giving greater throughput than if each
input had to pass through the whole pipeline before the next input could enter.
The costs are greater latency and complexity due to the need to synchronise the
stages in some way so that different inputs do not interfere. The pipeline will
only work at full efficiency if it can be filled and emptied at the same rate
that it can process.
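The stage-and-buffer arrangement above can be sketched in software. The following is a minimal illustration (not from the entry itself): each stage is a thread that reads from its input buffer, applies a function, and writes to its output buffer, which is the next stage's input buffer, so all three stages work on different inputs at once.

```python
import queue
import threading

def stage(func, inp, out):
    # A pipeline stage: take inputs from the input buffer, apply func,
    # store outputs in the output buffer (the next stage's input buffer).
    while True:
        item = inp.get()
        if item is None:          # sentinel: propagate end-of-stream
            out.put(None)
            return
        out.put(func(item))

# Three stages wired together by interstage buffers (queues).
buffers = [queue.Queue() for _ in range(4)]
funcs = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
threads = [threading.Thread(target=stage, args=(f, buffers[i], buffers[i + 1]))
           for i, f in enumerate(funcs)]
for t in threads:
    t.start()

# Feed inputs; once the pipeline is full, the stages run in parallel.
for item in [1, 2, 3]:
    buffers[0].put(item)
buffers[0].put(None)

results = []
while (r := buffers[-1].get()) is not None:
    results.append(r)
for t in threads:
    t.join()

print(results)   # each input x yields (x + 1) * 2 - 3, i.e. [1, 3, 5]
```

The queues here play the role of the interstage buffers, and their blocking `get`/`put` provides the synchronisation that stops different inputs interfering.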
Pipelines may be synchronous or asynchronous. A synchronous pipeline has a
master clock and each stage must complete its work within one cycle. The minimum
clock period is thus determined by the slowest stage. An asynchronous pipeline
requires handshaking between stages so that a new output is not written to the
interstage buffer before the previous one has been used.
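The timing consequence for a synchronous pipeline can be made concrete with a small back-of-envelope calculation, using hypothetical stage delays (the figures below are illustrative, not from the entry):

```python
# Hypothetical per-stage delays in nanoseconds.
stage_delays_ns = [2.0, 5.0, 3.0]

# The master clock must give every stage a full cycle to finish,
# so the minimum clock period is set by the slowest stage.
clock_period_ns = max(stage_delays_ns)                 # 5.0 ns

# Latency: one input spends one cycle in each stage.
latency_ns = clock_period_ns * len(stage_delays_ns)    # 15.0 ns

# Throughput once the pipeline is full: one result per cycle,
# even though each individual result took three cycles to produce.
results_per_us = 1000.0 / clock_period_ns              # 200 per microsecond
```

Note that the 2 ns and 3 ns stages sit idle for most of each cycle; this is the cost of clocking every stage at the slowest stage's rate, which an asynchronous, handshaking pipeline avoids.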
Many CPUs are arranged as one or more pipelines, with different stages
performing tasks such as instruction fetch, instruction decode, operand fetch,
arithmetic operation, and result store. For maximum performance, these rely on a
continuous stream of instructions fetched from sequential locations in memory.
Pipelining is often combined with instruction prefetch in an attempt to keep the
pipeline full.
When a branch is taken, the contents of early stages will contain instructions
from locations after the branch which should not be executed. The pipeline then
has to be flushed and reloaded. This is known as a pipeline break.
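The cost of a pipeline break can be sketched with a toy cycle count. This is an illustrative model, not a description of any real CPU: it assumes a three-stage pipeline in which a taken branch flushes the earlier stages, costing one refill cycle per flushed stage.

```python
STAGES = 3  # fetch, decode, execute

def pipeline_cycles(n_instructions, taken_branches):
    # Cycles to run n instructions through the toy pipeline.
    # Filling the pipeline costs STAGES - 1 cycles; thereafter one
    # instruction completes per cycle, except that each taken branch
    # flushes the stages behind it and costs STAGES - 1 bubble
    # cycles to reload: a pipeline break.
    fill = STAGES - 1
    breaks = taken_branches * (STAGES - 1)
    return fill + n_instructions + breaks

print(pipeline_cycles(10, 0))   # 12: one result per cycle once full
print(pipeline_cycles(10, 2))   # 16: each break adds 2 bubble cycles
```

This is why the entry stresses a continuous stream of sequential instructions: every taken branch forfeits the parallelism the pipeline was built to exploit.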