[ Table of Contents | Index ]
Chapter 2

2. Overview of the PowerPC Architecture


Books I through III of The PowerPC Architecture describe the instruction set, virtual environment, and operating environment, respectively. The user manual for each processor specifies the implementation features of that processor. In this book, the term PowerPC architecture refers to the contents of Books I through III. The compiler writer is concerned principally with the contents of Book I: PowerPC User Instruction Set Architecture.


2.1 Application Environment

The application environment consists of resources accessible from the problem state, which is the user mode (the PR bit in the Machine State Register is set). The PowerPC architecture is a load-store architecture that defines specifications for both 32-bit and 64-bit implementations. The instruction set is partitioned into three functional classes: branch, fixed-point, and floating-point. The registers are also partitioned into groups corresponding to these classes; that is, there are condition code and branch target registers for branches, Floating-Point Registers for floating-point operations, and General-Purpose Registers for fixed-point operations. This partition benefits superscalar implementations by reducing the interlocking necessary for dependency checking. The explicit indication of all operands in the instructions, combined with the partitioning of the PowerPC architecture into functional classes, exposes dependences to the compiler. Although instructions must be word (32-bit) aligned, data can be misaligned within certain implementation-dependent constraints. The floating-point facilities support compliance with the IEEE 754 Standard for Binary Floating-Point Arithmetic (IEEE 754).


2.1.1 32-Bit and 64-Bit Implementations and Modes

The PowerPC architecture includes specifications for both 32-bit and 64-bit implementations. In 32-bit implementations, all application registers have 32 bits, except for the 64-bit Floating-Point Registers, and effective addresses have 32 bits. In 64-bit implementations, all application registers are 64 bits long (except for the 32-bit Condition Register, FPSCR, and XER), and effective addresses have 64 bits. Figure 2-1 shows the application register sizes in 32-bit and 64-bit implementations.



Figure 2-1. Application Register Sizes

Registers                                    32-Bit Implementation   64-Bit Implementation
                                             Size (Bits)             Size (Bits)
Condition Register                           32                      32
Link Register and Count Register             32                      64
General-Purpose Registers                    32                      64
Fixed-Point Exception Register               32                      32
Floating-Point Registers                     64                      64
Floating-Point Status and Control Register   32                      32

Both 32-bit and 64-bit implementations support most of the instructions defined by the PowerPC architecture. The 64-bit implementations support all the application instructions supported by 32-bit implementations as well as the following application instructions: load doubleword, store doubleword, load word algebraic, multiply doubleword, divide doubleword, rotate doubleword, shift doubleword, count leading zeros doubleword, sign extend word, and convert doubleword integer to a floating-point value.

The 64-bit implementations have two modes of operation determined by the 64-bit mode (SF) bit in the Machine State Register: 64-bit mode (SF set to 1) and 32-bit mode (SF cleared to 0), for compatibility with 32-bit implementations. Application code for 32-bit implementations executes without modification on 64-bit implementations running in 32-bit mode, yielding identical results. All 64-bit implementation instructions are available in both modes. Identical instructions, however, may produce different results in 32-bit and 64-bit modes:


2.1.2 Register Resources

The PowerPC architecture identifies each register with a functional class, and most instructions within a class use only the registers identified with that class. Only a small number of instructions transfer data between functional classes. This separation of processor functionality reduces the hardware interlocking needed for parallel execution and exposes register dependences to the compiler.


2.1.2.1 Branch

The Branch-Processing Unit includes the Condition Register, Link Register (LR) and Count Register (CTR):


2.1.2.2 Fixed-Point

The Fixed-Point Unit includes the General-Purpose Register file and the Fixed-Point Exception Register (XER):


2.1.2.3 Floating-Point

The Floating-Point Unit includes the Floating-Point Register file and the Floating-Point Status and Control Register (FPSCR):


2.1.3 Memory Models

Memory is considered to be a linear array of bytes indexed from 0 to 2^32 - 1 in 32-bit implementations, and from 0 to 2^64 - 1 in 64-bit implementations. Each byte is identified by its index, called an address, and each byte contains a value. For the uniprocessor systems considered in this book, one storage access occurs at a time and all accesses appear to occur in program order. The main considerations for the compiler writer are the addressing modes, alignment, and endian orientation. Although these considerations alone suffice for the correct execution of a program, code modifications that better utilize the caches and translation-lookaside buffers may improve performance (see Section 4.4 on page 133).


2.1.3.1 Memory Addressing

The PowerPC architecture implements three addressing modes for instructions and three for data. The address of either an instruction or a multiple-byte data value is its lowest-numbered byte. This address points to the most-significant end in big-endian mode, and the least-significant end in little-endian mode.
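As a rough illustration of what "lowest-numbered byte" means in each orientation, the following sketch lays out one 32-bit value both ways. It is a generic Python model, not PowerPC code; the value `VALUE` is arbitrary and chosen only for the example.

```python
import struct

VALUE = 0x0A0B0C0D  # arbitrary 32-bit example value (an assumption for illustration)

big = struct.pack(">I", VALUE)     # big-endian byte layout
little = struct.pack("<I", VALUE)  # little-endian byte layout

# Index 0 of each byte string stands for the lowest-numbered byte,
# that is, the byte whose address is the address of the whole value.
print(f"big-endian:    {big.hex(' ')}")     # most-significant byte first
print(f"little-endian: {little.hex(' ')}")  # least-significant byte first
```

In the big-endian layout the address selects the byte 0x0A (the most-significant end); in the little-endian layout it selects 0x0D (the least-significant end).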

Instructions
Branches are the only instructions that specify the address of the next instruction; all others rely on incrementing a program counter. A branch instruction indicates the effective address of the target in one of the following ways:

Data
All PowerPC load and store instructions specify an address register, which is indicated in the RA field of the instruction. If RA is 0, the value zero is used instead of the contents of R0. The effective byte address in memory for a data value is calculated relative to the base register in one of three ways:

The update forms reload the register with the computed address, unless RA is 0 or RA is the target register of the load.

Arithmetic for address computation is unsigned and ignores any carry out of bit 0. In 32-bit mode of a 64-bit implementation, the processor ignores the high-order 32 bits, but includes them when the address is loaded into a General-Purpose Register, such as during a load or store with update.
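The address rules in this section can be modeled in a few lines. The sketch below assumes a 64-bit implementation and a register file represented as a Python list; the names `base_value`, `effective_address`, and `mode64` are hypothetical and exist only for this illustration.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def base_value(gpr, ra):
    """Return the base operand: zero when RA is 0, otherwise the contents of GPR[RA]."""
    return 0 if ra == 0 else gpr[ra]

def effective_address(gpr, ra, offset, mode64=True):
    """Model base + offset on a 64-bit implementation.

    Returns a pair: the address used for the storage access, and the
    value an update form would write back to the base register.
    """
    # Any carry out of the most-significant bit (bit 0) is discarded.
    total = (base_value(gpr, ra) + offset) & MASK64
    # In 32-bit mode the access ignores the high-order 32 bits,
    # but the full 64-bit sum is what reaches a register.
    access = total if mode64 else total & MASK32
    return access, total

gpr = [0] * 32
gpr[0] = 0x1000                  # ignored as a base, because RA = 0 means zero
gpr[4] = 0xFFFF_FFFF_FFFF_FFF8
gpr[5] = 0x0000_0001_0000_0010

print(effective_address(gpr, 0, 0x20))             # base is zero, not GPR[0]
print(effective_address(gpr, 4, 0x10))             # carry out of bit 0 is dropped
print(effective_address(gpr, 5, 4, mode64=False))  # access uses the low 32 bits only
```

The sketch does not model the restriction on update forms (RA equal to 0, or RA equal to the load target); a real code generator would need to avoid those encodings.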


2.1.3.2 Endian Orientation

The address of a multi-byte value in memory can refer to the most-significant end (big-endian) or the least-significant end (little-endian). By default, the PowerPC architecture assumes that multi-byte values have a big-endian orientation in memory, but values stored in little-endian orientation may be accessed by setting the Little-Endian (LE) bit in the Machine State Register. In PowerPC Little-Endian mode, the memory image is not true little-endian, but rather the ordering obtained by the address modification scheme specified in Appendix D of Book I of The PowerPC Architecture. In Little-Endian mode, load multiple, store multiple, load string, and store string operations generate an Alignment interrupt. Other little-endian misaligned load and store operations may also generate an Alignment interrupt, depending on the implementation. In most cases, the load and store with byte reversal instructions offer the simplest way to convert data from one endian orientation to the other in either endian mode.
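The effect of a load with byte reversal can be sketched as follows. This is a behavioral model in Python, assuming Big-Endian mode and a word-sized access (in the style of an instruction such as lwbrx); the function names and the `memory` contents are hypothetical.

```python
def load_word(memory, addr):
    """Ordinary big-endian word load: the byte at addr is the most-significant byte."""
    return int.from_bytes(memory[addr:addr + 4], "big")

def load_word_byte_reversed(memory, addr):
    """Byte-reversed word load: the byte at addr becomes the least-significant byte."""
    return int.from_bytes(memory[addr:addr + 4], "little")

# The value 0x12345678 as a little-endian producer would have written it.
memory = bytes([0x78, 0x56, 0x34, 0x12])

print(hex(load_word(memory, 0)))                # bytes taken in storage order
print(hex(load_word_byte_reversed(memory, 0)))  # recovers the intended value
```

A single byte-reversed load therefore replaces an ordinary load followed by a sequence of rotates and inserts to swap the bytes in a register.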


2.1.3.3 Alignment

The alignment of instruction and storage operands affects the result and performance of instruction fetching and storage accesses, respectively.

Instructions
PowerPC instructions must be aligned on word (32-bit) boundaries. There is no way to generate an instruction address that is not divisible by 4.

Data
Although the best performance results from the use of aligned accesses, the PowerPC architecture is unusual among RISC architectures in that it permits misaligned data accesses. Different PowerPC implementations respond differently to misaligned accesses. The processor hardware may handle the access or may generate an Alignment interrupt. The Alignment interrupt handler may handle the access or indicate that a program error has occurred. Load-and-reserve and store-conditional instructions to misaligned effective addresses are considered program errors. Alignment interrupt handling may require on the order of hundreds of cycles, so every effort should be made to avoid misaligned memory values.

In Big-Endian mode, the PowerPC architecture requires implementations to automatically handle misaligned integer halfword and word accesses, word-aligned integer doubleword accesses, and word-aligned floating-point accesses. Other accesses may or may not generate an Alignment interrupt depending on the implementation.
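A compiler might encode the Big-Endian guarantees as a small predicate when deciding whether a misaligned access is safe to emit. The sketch below is one possible reading of the paragraph above; the function names are hypothetical, sizes are in bytes, and it treats naturally aligned accesses as always handled, which is an assumption of this example.

```python
def is_aligned(addr, size):
    """True if addr is a multiple of size (both in bytes)."""
    return addr % size == 0

def hardware_must_handle(addr, size, floating_point=False):
    """True if the Big-Endian rules above guarantee the access needs no Alignment interrupt.

    A False result does not mean an interrupt will occur, only that
    the outcome depends on the implementation.
    """
    if is_aligned(addr, size):
        return True                  # naturally aligned (assumed always handled)
    if floating_point:
        return is_aligned(addr, 4)   # word-aligned floating-point accesses
    if size in (2, 4):
        return True                  # misaligned integer halfword and word
    if size == 8:
        return is_aligned(addr, 4)   # word-aligned integer doubleword
    return False

print(hardware_must_handle(0x1001, 4))                       # misaligned integer word
print(hardware_must_handle(0x1004, 8))                       # word-aligned doubleword
print(hardware_must_handle(0x1002, 8))                       # implementation-dependent
print(hardware_must_handle(0x1001, 8, floating_point=True))  # implementation-dependent
```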

In Little-Endian mode, the PowerPC architecture does not require implementation hardware to handle any misaligned accesses automatically, so any misaligned access may generate an Alignment interrupt. Load multiple, store multiple, load string, and store string instructions always generate an Alignment interrupt in Little-Endian mode.

A misaligned access, a load multiple access, a store multiple access, a load string access, or a store string access that crosses a page, Block Address Translation (BAT) block, or segment boundary in an ordinary segment may be restarted by the implementation or the operating system. Restarting the operation may load or store some bytes at the target location for a second time. To ensure that the access is not restarted, the data should be placed in either a BAT or a direct-store segment, neither of which permits a restarted access.
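Whether an access crosses a boundary is a simple range check. The sketch below assumes a 4 KB page size and a hypothetical helper name `crosses_boundary`; the same test applies to BAT block or segment boundaries by passing a different `boundary` value.

```python
PAGE_SIZE = 4096  # assumption for this example: 4 KB pages

def crosses_boundary(addr, length, boundary=PAGE_SIZE):
    """True if the bytes [addr, addr + length) span more than one boundary-sized block."""
    first_block = addr // boundary
    last_block = (addr + length - 1) // boundary
    return first_block != last_block

print(crosses_boundary(0x0FFE, 4))  # misaligned word straddling a page boundary
print(crosses_boundary(0x0FFC, 4))  # aligned word ending exactly at the boundary
print(crosses_boundary(0x1000, 8))  # doubleword starting on the boundary
```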


2.1.4 Floating-Point

The PowerPC floating-point formats, operations, interrupts, and special-value handling conform to IEEE 754. The remainder operation and some conversion operations required by IEEE 754 must be implemented in software (in the run-time library).

A Floating-Point Register may contain four different data types: single-precision floating-point, double-precision floating-point, 32-bit integer, and 64-bit integer. The integer data types can be stored to memory or converted to a floating-point value for computation. The frsp instruction rounds double-precision values to single-precision. The precision of the result of an operation is encoded in the instruction. Single-precision operations should act only on single-precision operands.
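The numerical effect of rounding a double-precision value to single precision can be approximated in Python by packing the value as a 32-bit float and unpacking it again. This sketch models only round-to-nearest; on a PowerPC processor the rounding performed by frsp follows the rounding mode selected in the FPSCR. The function name `round_to_single` is hypothetical.

```python
import struct

def round_to_single(value):
    """Round a double to the nearest single-precision value, returned as a double."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

print(round_to_single(0.5))  # exactly representable in single precision, so unchanged
print(round_to_single(0.1))  # not representable, so the result differs slightly from 0.1
```

Values whose magnitude exceeds the single-precision range make `struct.pack` raise `OverflowError`, so the sketch is limited to values within that range.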

The floating-point operating environment for applications is determined by bit settings in the Floating-Point Status and Control Register (FPSCR) and the Machine State Register (MSR). Figure 2-2 shows the bit fields and their functions. Floating-point interrupts may be disabled by clearing FE0 and FE1. If either FE0 or FE1 is set, individual IEEE 754 exception types are enabled with the bits in the FPSCR indicated in Figure 2-2.

The non-IEEE mode implemented by some implementations may be used to obtain deterministic performance (avoiding traps and interrupts) in certain applications. See Section 3.3.7.1 on page 79 for further details.



Figure 2-2. Floating-Point Application Control Fields

Register  Field *  Name  Function

FPSCR  24  VE  Floating-Point Invalid Operation Exception Enable
    0   Invalid operation exception handled with the IEEE 754 default response.
    1   Invalid operation exception causes a Program interrupt.

FPSCR  25  OE  Floating-Point Overflow Exception Enable
    0   Overflow exception handled with the IEEE 754 default response.
    1   Overflow exception causes a Program interrupt.

FPSCR  26  UE  Floating-Point Underflow Exception Enable
    0   Underflow exception handled with the IEEE 754 default response.
    1   Underflow exception causes a Program interrupt.

FPSCR  27  ZE  Floating-Point Zero-Divide Exception Enable
    0   Zero divide exception handled with the IEEE 754 default response.
    1   Zero divide exception causes a Program interrupt.

FPSCR  28  XE  Floating-Point Inexact Exception Enable
    0   Inexact exception handled with the IEEE 754 default response.
    1   Inexact exception causes a Program interrupt.

FPSCR  29  NI  Floating-Point Non-IEEE Mode
    0   The processor executes in an IEEE 754 compatible manner.
    1   The processor produces some results that do not conform with IEEE 754.

FPSCR  30:31  RN  Floating-Point Rounding Control
    00  Round to Nearest
    01  Round toward 0
    10  Round toward +infinity
    11  Round toward -infinity

MSR  64-bit: 52, 32-bit: 20  FE0
MSR  64-bit: 55, 32-bit: 23  FE1
    Floating-Point Exception Modes 0 and 1 (values listed as FE0 FE1)
    00  Ignore exceptions mode. Floating-point exceptions do not cause interrupts.
    01  Imprecise nonrecoverable mode. The processor interrupts the program at some point beyond the instruction that caused the enabled exception, and the interrupt handler may not be able to identify this instruction.
    10  Imprecise recoverable mode. The processor interrupts the program at some point beyond the instruction that caused the enabled exception, but the interrupt handler can identify this instruction.
    11  Precise mode. The program interrupt is generated precisely at the floating-point instruction that caused the enabled exception.

* 64-bit and 32-bit refer to the type of implementation.
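The bit positions in Figure 2-2 use the PowerPC numbering in which bit 0 is the most-significant bit. The sketch below decodes the control fields from raw register images under that convention. The helper names (`bit`, `decode_fpscr`, `decode_fe_mode`) and the sample register values are hypothetical, chosen only for illustration.

```python
FPSCR_CONTROL_BITS = {"VE": 24, "OE": 25, "UE": 26, "ZE": 27, "XE": 28, "NI": 29}

ROUNDING_MODES = {
    0b00: "nearest",
    0b01: "toward zero",
    0b10: "toward +infinity",
    0b11: "toward -infinity",
}

EXCEPTION_MODES = {
    (0, 0): "ignore exceptions",
    (0, 1): "imprecise nonrecoverable",
    (1, 0): "imprecise recoverable",
    (1, 1): "precise",
}

def bit(value, index, width=32):
    """Extract one bit, counting bit 0 as the most-significant bit of the register."""
    return (value >> (width - 1 - index)) & 1

def decode_fpscr(fpscr):
    """Return the control fields of a 32-bit FPSCR image as a dictionary."""
    fields = {name: bit(fpscr, index) for name, index in FPSCR_CONTROL_BITS.items()}
    fields["RN"] = ROUNDING_MODES[fpscr & 0b11]  # bits 30:31 are the two low-order bits
    return fields

def decode_fe_mode(msr, implementation_bits=32):
    """Return the floating-point exception mode selected by FE0 and FE1 in the MSR."""
    fe0_index, fe1_index = (20, 23) if implementation_bits == 32 else (52, 55)
    fe0 = bit(msr, fe0_index, implementation_bits)
    fe1 = bit(msr, fe1_index, implementation_bits)
    return EXCEPTION_MODES[(fe0, fe1)]

print(decode_fpscr(0x0000_0091))   # VE and ZE enabled, rounding toward zero
print(decode_fe_mode(0x0000_0900)) # FE0 and FE1 both set
```

Because bit 52 of a 64-bit register and bit 20 of a 32-bit register are the same distance from the least-significant end, the same mask value selects FE0 on both kinds of implementation; the same holds for FE1.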


2.2 Instruction Set

All instructions are 32 bits in length. Most computational instructions specify two source register operands and a destination register operand. Only load and store instructions access memory. Furthermore, most instructions access only the registers of the same functional class. Branch instructions permit control transfers either unconditionally, or conditionally based on the test of a bit in the Condition Register. The branch targets can be immediate values given in the branches or the contents of the Link or Count Register. The fixed-point instructions include the storage access, arithmetic, compare, logical, rotate and shift, and move to/from system register instructions. The floating-point instructions include storage access, move, arithmetic, rounding and conversion, compare, and FPSCR instructions.


2.2.1 Optional Instructions

The PowerPC architecture includes a set of optional instructions:

If an implementation supports any instruction in a group, it must support all of the instructions in the group. Check the documentation for a specific implementation to determine which, if any, of the groups are supported.


2.2.2 Preferred Instruction Forms

Some instructions have a preferred form, which may execute significantly faster than other forms. Instructions having preferred forms include:


2.2.3 Communication Between Functional Classes

A special group of instructions manages the communication between the resources of different functional classes. No branch instructions can use resources of the non-branch classes. The communication always occurs through a fixed-point or floating-point instruction. The execution of these instructions may cause substantial implementation-dependent delays because both execution units must be available simultaneously as both are involved in the execution of the instruction.


2.2.3.1 Fixed-Point and Branch Resources

The fixed-point instructions manage the following transfers between fixed-point and branch registers:


2.2.3.2 Fixed-Point and Floating-Point Resources

No direct connection exists between General-Purpose Registers and Floating-Point Registers. A transfer between these registers must be done by storing the register value in memory from one functional class and then loading the value into a register of the other class.
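The store-then-load path moves a bit pattern unchanged; it does not convert an integer to a floating-point number. The sketch below models this with a scratch buffer standing in for memory, assuming a 64-bit register image and big-endian byte order. The function names are hypothetical.

```python
import struct

def gpr_to_fpr(gpr_value):
    """Store a 64-bit fixed-point register image, then load the same bytes as a double."""
    scratch = struct.pack(">Q", gpr_value & 0xFFFF_FFFF_FFFF_FFFF)  # fixed-point store
    return struct.unpack(">d", scratch)[0]                          # floating-point load

def fpr_to_gpr(fpr_value):
    """Store a double, then load the same bytes as a 64-bit unsigned integer."""
    scratch = struct.pack(">d", fpr_value)      # floating-point store
    return struct.unpack(">Q", scratch)[0]      # fixed-point load

print(gpr_to_fpr(0x3FF0_0000_0000_0000))  # the bit pattern of the double 1.0
print(hex(fpr_to_gpr(-2.0)))              # sign bit set, biased exponent 0x400
```

On a 32-bit implementation the General-Purpose Registers hold only 32 bits, so the fixed-point side of the transfer would take two word accesses rather than one doubleword access.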


2.2.3.3 Floating-Point and Branch Resources

The floating-point instructions manage the following transfers between Floating-Point Registers and Branch Unit registers:


Copyright 1998 IBMchips