Improved branch prediction: Uses eight predictors in two pipeline stages.
The number of pipeline stages was increased from 12 to 16, to allow for continued increases in clock speed.
Access to the register file required three pipeline stages due to the physical size of the circuit.
During operation, each pipeline stage would work on one instruction at a time.
The pipeline stages illustrated with a round box are fully programmable.
So, for a given clock speed, there's a limit to how "thick" you can make each pipeline stage.
"Each pipeline stage takes one clock cycle to complete." This isn't quite true, is it?
Sometimes, instructions get hung up in one pipeline stage for multiple cycles.
The resources being used, as well as the settings of other pipeline stages, still have a great influence upon the final result.
Also, if latches are added to a pipeline stage, they might change the critical path and hence increase the cycle time.
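
As a rough illustration of the critical-path point, here is a minimal sketch under a simplified model: the cycle time is the delay of the slowest stage plus a fixed latch overhead. The function names, the delay figures, and the model itself are assumptions made for illustration and do not describe any particular processor.

```python
def critical_stage(stage_delays_ps):
    """Return the index of the slowest stage (the critical path in this model)."""
    return max(range(len(stage_delays_ps)), key=lambda i: stage_delays_ps[i])


def cycle_time_ps(stage_delays_ps, latch_overhead_ps):
    """Cycle time = slowest stage delay + latch overhead (assumed model)."""
    return max(stage_delays_ps) + latch_overhead_ps


def add_delay(stage_delays_ps, index, extra_ps):
    """Return a new list with extra delay (e.g. from added latches) on one stage."""
    updated = list(stage_delays_ps)
    updated[index] += extra_ps
    return updated


# Hypothetical three-stage pipeline, delays in picoseconds.
stages = [1000, 1200, 900]
before = cycle_time_ps(stages, latch_overhead_ps=100)

# Adding 300 ps to stage 0 makes it the new slowest stage.
slower = add_delay(stages, index=0, extra_ps=300)
after = cycle_time_ps(slower, latch_overhead_ps=100)

print(critical_stage(stages), before)   # critical stage and cycle time before
print(critical_stage(slower), after)    # critical stage and cycle time after
```

In this made-up example the extra delay moves the critical path from stage 1 to stage 0, and the cycle time grows even though the previously slowest stage is unchanged.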