4.1 Hardware Implementation Overview
The PowerPC architecture requires a sequential execution model in which, from the programmer's perspective, each instruction appears to complete before the next instruction starts. Because only the appearance of sequential execution is required, implementations are free to process instructions using any technique, so long as the programmer can observe only sequential execution. Figure 4-1 shows a series of progressively more complex processor implementations.
Figure 4-1. Processor Implementations
The sequential execution implementation fetches, decodes, and executes one instruction at a time in program order so that a program modifies the processor and memory state one instruction at a time in program order. This implementation represents the sequential execution model that a programmer expects.
The pipelined implementation divides instruction processing into a series of pipeline stages so that the processing of multiple instructions overlaps. In principle, pipelining can increase the average number of instructions executed per unit time by a factor approaching the number of pipeline stages. Because an instruction often starts before the previous one completes, certain situations that could violate the sequential execution model, called hazards, may develop. To eliminate these hazards, the processor must implement various checking mechanisms, which in practice reduce the average number of instructions executed per cycle.
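The throughput claim above can be made concrete with a small back-of-the-envelope model. The sketch below is illustrative only: it assumes every stage takes exactly one cycle and that the sequential implementation spends one cycle per stage on each instruction. The function names and the `stall_cycles` parameter are hypothetical and are not taken from any PowerPC documentation.

```python
def sequential_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles when each instruction finishes every stage before the next starts.

    Assumption: one cycle per stage.
    """
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> int:
    """Cycles for a pipeline that starts one instruction per cycle.

    The first instruction needs n_stages cycles to drain through the pipe;
    each later instruction completes one cycle after its predecessor.
    stall_cycles lumps together all cycles lost to hazards (an assumption).
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


def speedup(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> float:
    """Ratio of sequential to pipelined cycle counts."""
    return sequential_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages, stall_cycles
    )


if __name__ == "__main__":
    # Ideal case: the speedup approaches the stage count (5) from below.
    print(speedup(1000, 5))
    # With 250 stall cycles spread over the run, the benefit shrinks.
    print(speedup(1000, 5, stall_cycles=250))
```

Under these assumptions the speedup stays strictly below the number of stages and drops further as hazard-induced stalls accumulate, which matches the qualitative statement in the text.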
The superscalar implementation introduces parallel pipelines in the execution stage to take advantage of instruction parallelism in the instruction sequence. The fetch and decode stages are modified to handle multiple instructions in parallel. A completion stage following the finish of execution updates the processor and memory state in program order. Parallel execution can increase the average number of instructions executed per cycle beyond that possible in a pipelined model, but hazards again reduce the benefits of parallel execution in practice.
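To illustrate how data dependences limit the benefit of parallel pipelines, the following sketch models in-order issue on a machine that can start up to `width` instructions per cycle. It is a deliberately simplified toy, not a description of any real PowerPC implementation: every instruction is assumed to have a one-cycle latency, only read-after-write dependences are tracked, and the program representation (a destination register name plus a tuple of source register names) is hypothetical.

```python
def issue_cycles(program, width):
    """Return the issue cycle of each instruction under in-order issue.

    program: list of (dest, sources) pairs, e.g. ("r3", ("r1", "r2")).
    width:   maximum number of instructions issued per cycle.

    Assumptions (not from the source text): one-cycle latency, results
    readable in the cycle after issue, and only read-after-write
    dependences are modeled.
    """
    ready = {}      # register name -> first cycle its new value can be read
    cycle = 0       # cycle currently being filled
    slots = 0       # instructions already issued in this cycle
    schedule = []
    for dest, sources in program:
        earliest = max((ready.get(r, 0) for r in sources), default=0)
        if slots == width:          # issue slots exhausted: go to next cycle
            cycle += 1
            slots = 0
        if earliest > cycle:        # operand not ready: wait (in order)
            cycle = earliest
            slots = 0
        schedule.append(cycle)
        slots += 1
        ready[dest] = cycle + 1
    return schedule


def instructions_per_cycle(program, width):
    """Average issue rate over the whole program."""
    schedule = issue_cycles(program, width)
    return len(program) / (schedule[-1] + 1) if schedule else 0.0


# Four independent adds versus a four-instruction dependent chain.
independent = [("r3", ("r1", "r2")), ("r4", ("r1", "r2")),
               ("r5", ("r1", "r2")), ("r6", ("r1", "r2"))]
chain = [("r3", ("r1", "r2")), ("r4", ("r3", "r1")),
         ("r5", ("r4", "r1")), ("r6", ("r5", "r1"))]
```

With `width=2`, the independent sequence issues two instructions per cycle in this model, while the dependent chain issues only one per cycle, the same rate a single pipeline would achieve.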
The superscalar implementation also illustrates forwarding (feedback). The General-Purpose Register result calculated by a fixed-point operation is forwarded to the input latches of the fixed-point execution stage, where the result is available for a subsequent instruction during the update of the General-Purpose Register. For fixed-point compares and recording instructions, the Condition Register result is forwarded to the input latches of the branch execution stage, where the result is available for a subsequent conditional branch during the update of the Condition Register. Section 4.2.1 describes forwarding in greater detail.
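The timing benefit of forwarding can be sketched with a minimal model. The code below assumes that a producer updates its target register in the cycle after it executes, that a forwarded result reaches a dependent instruction during that update cycle, and that without forwarding the dependent instruction must wait one further cycle to read the updated register. These timings and the function names are illustrative assumptions; actual latencies are implementation specific.

```python
def earliest_consumer_execute(producer_execute_cycle: int, forwarding: bool) -> int:
    """Earliest cycle in which a dependent instruction can execute.

    Assumptions: the register update happens one cycle after the producer
    executes; forwarding delivers the result during that update cycle;
    without forwarding the consumer waits until the cycle after the update.
    """
    update_cycle = producer_execute_cycle + 1
    return update_cycle if forwarding else update_cycle + 1


def dependent_chain_execute_cycles(length: int, forwarding: bool) -> list:
    """Execute cycles for a chain in which each instruction needs the
    result of the one immediately before it."""
    cycles = []
    for _ in range(length):
        if not cycles:
            cycles.append(0)  # first instruction has no producer to wait for
        else:
            cycles.append(earliest_consumer_execute(cycles[-1], forwarding))
    return cycles


if __name__ == "__main__":
    print(dependent_chain_execute_cycles(4, forwarding=True))
    print(dependent_chain_execute_cycles(4, forwarding=False))
```

In this model, a chain of dependent fixed-point operations executes back to back when results are forwarded, and takes roughly twice as long when each instruction must wait for the register update to finish.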
The PowerPC instruction set architecture has been designed to facilitate pipelined and superscalar (or other parallel) implementations. All PowerPC implementations incorporate multiple execution units and some out-of-order execution capability.
For descriptive purposes, the generic pipeline stages of instruction processing are given as follows:
The user manual for each implementation describes its particular pipeline stages.