Appendix A

A. ABI Considerations


A compiler converts the source code into an object module. The linker resolves cross-references between one or more object modules to form an executable module. The loader converts an executable module into an executable memory image.

An Application Binary Interface (ABI) includes a set of conventions that allows a linker to combine separately compiled and assembled elements of a program so that they can be treated as a unit. The ABI defines the binary interfaces between compiled units and the overall layout of application components comprising a single task within an operating system. Therefore, most compilers target an ABI. The requirements and constraints of the ABI relevant to the compiler extend only to the interfaces between shared system elements. For those interfaces totally under the control of the compiler, the compiler writer is free to choose any convention desired, and the proper choice can significantly improve performance.

IBM has defined three ABIs for the PowerPC architecture: the AIX ABI for big-endian 32-bit PowerPC processors and the Windows NT and Workplace ABIs for little-endian 32-bit PowerPC processors. Other PowerPC users have defined other ABIs. As a practical matter, ABIs tend to be associated with a particular operating system or family of operating systems. Programs compiled for one ABI are frequently incompatible with programs compiled for another ABI because of the low-level strategic decisions required by an ABI. As a framework for the description of ABI issues in this book, we describe the AIX ABI for big-endian, 32-bit systems. For further details, check relevant AIX documentation, especially the Assembler Language Reference manual (IBM Corporation [1993a]). The AIX ABI is nearly identical to what was previously published as the PowerOpen ABI.

The AIX ABI supports dynamic linking in order to provide efficient support for shared libraries. Dynamic linking permits an executable module to link functions in a shared library module during loading.


A.1 Procedure Interfaces

Compiled code exposes interfaces to procedures and global data. The program model for the AIX ABI consists of a code segment, a global data segment, and a stack segment for every active thread. A thread is a binding of an executing program, its code segment, and a stack segment that contains the state information corresponding to the execution of the thread. Global variables are shared.

The procedure (or subroutine) is the fundamental element of execution and, with the exception of references to globally defined data and external procedures, represents a closed unit. Many compilers make the procedure the fundamental unit of compilation and do not attempt any interprocedural optimization. An ABI specifies conventions for the interprocedure interfaces.

The interface between two procedures is defined in terms of the caller and the callee. The caller computes parameters to the procedure, binds them to arguments, and then transfers control to the callee. The callee uses the arguments, computes a value (possibly null), and then returns control to the statement following the call. The details of this interface constitute much of the content of the ABI.

When a procedure is called, some prolog code may be executed to create a block of storage for the procedure on the run-time stack, called an activation record, before the procedure body is executed. When the procedure returns, some epilog code may be executed to clean up the state of the run-time stack.


A.1.1 Register Conventions

At the interface, the ABI defines the use of registers. Registers are classified as dedicated, volatile, or non-volatile. Dedicated registers have assigned uses and generally should not be modified by the compiler. Volatile registers are available for use at all times. Volatile registers are frequently called caller-save registers. Non-volatile registers are available for use, but they must be saved before being used in the local context and restored prior to return. These registers are frequently called callee-save registers. Figure A-1 describes the AIX register conventions for management of specific registers at the procedure call interface.



Figure A-1. AIX ABI Register Usage Conventions

Register Type       Register         Status         Use
------------------  ---------------  -------------  ----------------------------------------------------------
General-Purpose     GPR0             Volatile       Used in function prologs.
                    GPR1             Dedicated      Stack Pointer.
                    GPR2             Dedicated      Table of Contents (TOC) Pointer.
                    GPR3             Volatile       First argument word; first word of function return value.
                    GPR4             Volatile       Second argument word; second word of function return value.
                    GPR5             Volatile       Third argument word.
                    GPR6             Volatile       Fourth argument word.
                    GPR7             Volatile       Fifth argument word.
                    GPR8             Volatile       Sixth argument word.
                    GPR9             Volatile       Seventh argument word.
                    GPR10            Volatile       Eighth argument word.
                    GPR11            Volatile       Used in calls by pointer and as an environment pointer.
                    GPR12            Volatile       Used for special exception handling and in glink code.
                    GPR13:31         Non-volatile   Values are preserved across procedure calls.
Floating-Point      FPR0             Volatile       Scratch register.
                    FPR1             Volatile       First floating-point parameter; first floating-point scalar return value.
                    FPR2             Volatile       Second floating-point parameter; second floating-point scalar return value.
                    FPR3             Volatile       Third floating-point parameter; third floating-point scalar return value.
                    FPR4             Volatile       Fourth floating-point parameter; fourth floating-point scalar return value.
                    FPR5             Volatile       Fifth floating-point parameter.
                    FPR6             Volatile       Sixth floating-point parameter.
                    FPR7             Volatile       Seventh floating-point parameter.
                    FPR8             Volatile       Eighth floating-point parameter.
                    FPR9             Volatile       Ninth floating-point parameter.
                    FPR10            Volatile       Tenth floating-point parameter.
                    FPR11            Volatile       Eleventh floating-point parameter.
                    FPR12            Volatile       Twelfth floating-point parameter.
                    FPR13            Volatile       Thirteenth floating-point parameter.
                    FPR14:31         Non-volatile   Values are preserved across procedure calls.
Special-Purpose     LR               Volatile       Branch target address; procedure return address.
                    CTR              Volatile       Branch target address; loop count value.
                    XER              Volatile       Fixed-point exception register.
                    FPSCR            Volatile       Floating-point status and control register.
Condition Register  CR0, CR1         Volatile       Condition codes.
                    CR2, CR3, CR4    Non-volatile   Condition codes.
                    CR5, CR6, CR7    Volatile       Condition codes.
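
As an illustration of how a compiler's register allocator might encode Figure A-1, the following C sketch classifies a register by number. The type and function names (reg_class, classify_gpr, classify_fpr, classify_cr_field) are hypothetical and are not part of the AIX ABI; only the number ranges come from the figure.

```c
#include <assert.h>

/* Register classes named in Figure A-1. */
typedef enum { REG_DEDICATED, REG_VOLATILE, REG_NONVOLATILE } reg_class;

/* Classify general-purpose register GPRn (0 <= n <= 31). */
reg_class classify_gpr(int n)
{
    assert(n >= 0 && n <= 31);
    if (n == 1 || n == 2)
        return REG_DEDICATED;      /* GPR1 stack pointer, GPR2 TOC pointer */
    if (n >= 13)
        return REG_NONVOLATILE;    /* GPR13:31, callee-save */
    return REG_VOLATILE;           /* GPR0 and GPR3:12, caller-save */
}

/* Classify floating-point register FPRn (0 <= n <= 31). */
reg_class classify_fpr(int n)
{
    assert(n >= 0 && n <= 31);
    return n >= 14 ? REG_NONVOLATILE : REG_VOLATILE;
}

/* Classify condition register field CRn (0 <= n <= 7). */
reg_class classify_cr_field(int n)
{
    assert(n >= 0 && n <= 7);
    return (n >= 2 && n <= 4) ? REG_NONVOLATILE : REG_VOLATILE;
}
```

An allocator built this way would prefer volatile registers for values that do not live across a call, because they need no save and restore code in the prolog and epilog.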


A.1.2 Run-Time Stack

The stack provides storage for local variables. A single dedicated register, GPR1 (also called SP), maintains the stack pointer, which is used to address data in the stack. The stack grows from high addresses toward lower addresses. To ensure optimal alignment, the stack pointer is quadword aligned (i.e., its address is a multiple of 16).
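
The alignment rule can be sketched in C as follows. This is a minimal model, not ABI-mandated code: the helper names round_up_frame_size and push_frame are hypothetical, and the stack pointer is modeled as a plain 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_ALIGN 16u   /* quadword alignment required for the stack pointer */

/* Round a requested frame size up to the next multiple of 16 bytes. */
uint32_t round_up_frame_size(uint32_t bytes)
{
    return (bytes + (STACK_ALIGN - 1u)) & ~(STACK_ALIGN - 1u);
}

/* Model of pushing a frame: the stack grows toward lower addresses,
   so the new stack pointer is the old one minus the rounded size. */
uint32_t push_frame(uint32_t sp, uint32_t frame_bytes)
{
    assert(sp % STACK_ALIGN == 0);   /* caller's SP is already aligned */
    return sp - round_up_frame_size(frame_bytes);
}
```

Because both the incoming stack pointer and the rounded frame size are multiples of 16, the new stack pointer stays quadword aligned.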

To examine the structure of the run-time stack, consider the following sequence of procedure calls: aaa calls bbb calls ccc calls ddd. Figure A-2 shows the relevant areas of the run-time stack for procedure ccc. These areas include:


Figure A-2. Relevant Parts of the Run-Time Stack for Subprogram ccc


A.1.3 Leaf Procedures

A leaf procedure is a procedure that does not call another procedure. During the normal procedure calling process, the non-volatile registers are saved on the stack with a negative offset to the stack pointer. If an interrupt occurs, a handler that uses the stack must avoid modifying a 220-byte area (the size of a full save of all non-volatile registers) at a negative offset to the stack pointer to ensure that program execution will continue properly following the interrupt handling. Therefore, a leaf procedure can use this 220-byte area for saving non-volatile registers or as a local stack area. If the procedure either calls another procedure or requires more than 220 bytes of space on the stack, it should establish a new activation record.
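
The 220-byte figure follows from the register conventions: 19 non-volatile General-Purpose Registers (GPR13:31) of 4 bytes each plus 18 non-volatile Floating-Point Registers (FPR14:31) of 8 bytes each. The C sketch below reproduces that arithmetic and the decision described above; the function names are hypothetical.

```c
#include <assert.h>

enum {
    NUM_NONVOLATILE_GPRS = 31 - 13 + 1,  /* GPR13:31 -> 19 registers */
    NUM_NONVOLATILE_FPRS = 31 - 14 + 1,  /* FPR14:31 -> 18 registers */
    GPR_BYTES = 4,                       /* 32-bit general-purpose register */
    FPR_BYTES = 8                        /* 64-bit floating-point register */
};

/* Size of a full save of all non-volatile GPRs and FPRs. */
unsigned nonvolatile_save_area_bytes(void)
{
    unsigned bytes = NUM_NONVOLATILE_GPRS * GPR_BYTES
                   + NUM_NONVOLATILE_FPRS * FPR_BYTES;
    assert(bytes == 220);   /* 76 + 144, the figure quoted in the text */
    return bytes;
}

/* Returns nonzero if the procedure should establish its own activation
   record: it calls another procedure or needs more than 220 bytes. */
int leaf_needs_frame(int makes_calls, unsigned stack_bytes_needed)
{
    return makes_calls || stack_bytes_needed > nonvolatile_save_area_bytes();
}
```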


A.2 Procedure Calling Sequence


A.2.1 Argument Passing Rules

Where possible, the actual parameters to subprogram arguments are passed in registers. The procedure's argument list maps to the argument build area on the stack, even if the actual parameters are not stored on the stack. The storage location on the stack reserved for a parameter that has been passed in a register is called its home location. Only the first 8 words of the argument list need not be stored on the stack; the remaining portion of the argument list is always stored on the stack.

The argument-passing rules provide the maximum level of support for inter-language calls and facilitate consistent handling of calls between mismatched or incorrectly prototyped C function definitions. Compilers may discover and exploit the properties of individual calls (e.g., through the use of prototypes), and thereby modify various parts of the following description; however, the program behavior must appear as if the rules were applied uniformly. The argument-passing rules are:

Figure A-3 shows how the arguments are passed for the following function:


void foo1(long a, short b, char c);



Figure A-3. Argument Passing for foo1

                    Argument Words   Registers
Argument   Type     in Build Area    General-Purpose   Floating-Point
---------  -------  ---------------  ----------------  --------------
a          long     (0)              GPR3
b          short    (1)              GPR4
c          char     (2)              GPR5

() indicate that the resource is reserved on the stack or in the register, but the value may not be present.

Figure A-4 shows how the arguments are passed for another function that has both integer and floating-point values:


void foo2(long a, double b, float c, char d,
double e, double f, short g, float h);



Figure A-4. Argument Passing for foo2

                    Argument Words   Registers
Argument   Type     in Build Area    General-Purpose   Floating-Point
---------  -------  ---------------  ----------------  --------------
a          long     (0)              GPR3
b          double   (1:2)            (GPR4:5)          FPR1
c          float    (3)              (GPR6)            FPR2
d          char     (4)              GPR7
e          double   (5:6)            (GPR8:9)          FPR3
f          double   (7),8            (GPR10)           FPR4
g          short    9
h          float    10                                 FPR5

() indicate that the resource is reserved on the stack or in the register, but the value may not be present.
For f, the word at 32 contains the low-order half of f, and GPR10 is reserved for the high-order part of f.

The first 8 words of the argument list are passed in registers. This example assumes that the function prototype is visible at the point of the call; hence, floating-point parameters need not be copied to the General-Purpose Registers.
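
The layout in Figure A-4 can be reproduced with a short C sketch. It encodes only what the figure shows and is an inference from it, not the complete rule list: each integer or float argument takes one word, each double takes two words, word n of the first 8 words corresponds to GPR(3+n), and floating-point arguments take FPR1 through FPR13 in order. The names arg_kind, arg_slot, and layout_args are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARG_INT, ARG_FLOAT, ARG_DOUBLE } arg_kind;

typedef struct {
    int first_word;  /* index of the first word in the argument build area */
    int num_words;   /* 1 word, or 2 for a double */
    int first_gpr;   /* GPR number matching first_word, or -1 if beyond word 7 */
    int fpr;         /* FPR number carrying the value, or -1 if none */
} arg_slot;

/* Assign build-area words and registers to n arguments, left to right. */
void layout_args(const arg_kind *kinds, size_t n, arg_slot *out)
{
    int word = 0;      /* next free word in the argument list */
    int next_fpr = 1;  /* floating-point arguments start at FPR1 */

    assert(n == 0 || (kinds != NULL && out != NULL));
    for (size_t i = 0; i < n; i++) {
        out[i].first_word = word;
        out[i].num_words = (kinds[i] == ARG_DOUBLE) ? 2 : 1;
        /* Only the first 8 words map to GPR3:10. */
        out[i].first_gpr = (word < 8) ? 3 + word : -1;
        if (kinds[i] != ARG_INT && next_fpr <= 13)
            out[i].fpr = next_fpr++;
        else
            out[i].fpr = -1;
        word += out[i].num_words;
    }
}
```

Calling layout_args with the kinds of foo2's parameters (int, double, float, int, double, double, int, float) yields the word indices and register numbers listed in Figure A-4, including the split of f across word 7 (GPR10) and word 8 (stack only).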


A.2.2 Function Return Values

Where a function returns its value depends upon the type of the value being returned. The rules are:


A.2.3 Procedure Prologs and Epilogs

A procedure prolog sets up the execution environment for a procedure; a procedure epilog unwinds the execution environment and re-establishes the old environment so that execution can continue following the call. The AIX ABI does not specify a prescribed code sequence for prologs and epilogs, but it does stipulate that certain actions be performed. Any update of the SP must be performed atomically by a single instruction to ensure that there is no timing window during which an interrupt can occur and the stack is in a partially updated state.

Prolog code is responsible for establishing a new activation record and saving on the stack any state that must be preserved:

Subtracting this displacement from the current stack pointer to form the new stack pointer and saving the previous stack pointer at offset 0 from the new stack pointer must be performed atomically so that an interrupt cannot perturb the creation of a new activation record. If the magnitude of the displacement is less than 2^15, use:


stwu R1,-offset(R1)

If the displacement is greater than or equal to 2^15, load the negative offset into R3 and use:


stwux R1,R1,R3
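
A code generator might select between the two forms as in the C sketch below. The threshold of 2^15 (32768) is the limit of the signed 16-bit displacement field described in the text; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    FRAME_STWU,   /* stwu  R1,-offset(R1): displacement fits in the D field */
    FRAME_STWUX   /* stwux R1,R1,Rn: displacement held in a register */
} frame_insn;

/* Pick the single instruction that atomically allocates a frame of the
   given size in bytes and stores the back chain at offset 0. */
frame_insn choose_frame_insn(uint32_t displacement)
{
    assert(displacement > 0);
    return displacement < (1u << 15) ? FRAME_STWU : FRAME_STWUX;
}
```

Either way the stack pointer is updated by exactly one instruction, which is what keeps an interrupt from observing a partially built activation record.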

Epilog code is responsible for unwinding and deallocating the activation record:

The prolog and epilog sequences support a number of variations, depending upon the properties of the procedure being compiled. For example, a stackless leaf procedure (that is, a procedure which makes no calls and requires no local variables to be allocated in its stack frame) can save its caller's registers at a negative offset from the caller's stack pointer and does not actually need to acquire an activation record for its own execution.

The content of the prolog and epilog code involves a number of trade-offs. For example, if the number of General-Purpose Registers and Floating-Point Registers that need to be saved is small, the saves should be generated in-line. If there are many registers to be saved, the save and restore could be done with a system routine at the cost of a branch and link and return. For a high performance machine, the branch penalty may be substantial and needs to be traded off against the additional code (and instruction cache penalties) associated with doing the saves and restores in-line. Although load and store multiple instructions could be used, scalar loads and stores offer better performance for some implementations. Also, the load and store multiple instructions do not function in Little-Endian mode.


A.3 Dynamic Linking

The AIX ABI supports the dynamic linking of procedures. In effect, not all symbols need be resolved during linking; the execution module can bind to routines in other modules at load time or dynamically during program execution. Dynamic linking lets different applications share library routines, which reduces the size of each program, and lets those routines be modified without statically relinking the applications. On the other hand, out-of-module references carry a performance cost of approximately eight machine cycles per call.


A.3.1 Table Of Contents

The Table Of Contents (TOC) is a common storage area that may contain address constants and external scalars for a given object module. Each object module has its own unique TOC. The calling conventions between object modules involve multiple TOCs. The TOC contains addresses of data objects and load-time bound procedure addresses. The General-Purpose Register GPR2 (also called RTOC) contains the address of the current TOC.

Variables that are visible outside of the module are accessed using the TOC. The address of the variable is stored in the TOC at a compiler-known offset. The value may be accessed as follows:


lwz R3,offset_&value(RTOC)
lwz R4,0(R3)

To optimize the access of a number of variables, a single reference address may be stored in the TOC, and the different variables may be indexed from this address. Another optimization is to directly store the value of the variable in the TOC:


lwz R3,offset_value(RTOC)
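
The two access patterns correspond to one or two levels of indirection. The C sketch below models a TOC as an array of slots that hold either the address of a variable or the value itself. The toc_entry union and the function names are hypothetical, and slot indices stand in for the compiler-known byte offsets.

```c
#include <assert.h>

/* One TOC slot: either the address of a variable or the value itself. */
typedef union {
    int *address;
    int  value;
} toc_entry;

/* Two loads: fetch the variable's address from the TOC, then the value. */
int load_via_toc_address(const toc_entry *toc, int slot)
{
    assert(toc != 0);
    int *p = toc[slot].address;   /* lwz R3,offset_&value(RTOC) */
    return *p;                    /* lwz R4,0(R3) */
}

/* One load: the value is stored directly in the TOC. */
int load_from_toc(const toc_entry *toc, int slot)
{
    assert(toc != 0);
    return toc[slot].value;       /* lwz R3,offset_value(RTOC) */
}
```

Storing the value in the TOC saves a load per access, at the price of one TOC slot per variable.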


A.3.2 Function Descriptors

Figure A-5 shows the three-word structure defining a function descriptor. Every function that is externally visible has a function descriptor. The first word contains the address of the function. The second word contains the function's TOC pointer. The third word contains an optional environment pointer, which is useful for some programming languages. The loader initializes the function descriptors when a module is loaded for execution.



Figure A-5. Function Descriptor

struct {
    void (*func_ptr)();  /* the address of the function */
    void *toc_value;     /* RTOC value for the function */
    void *env;           /* environment pointer */
};


A.3.3 Out-of-Module Function Calls

Figure A-6 shows a C fragment and the assembly code generated by a compiler for a function call by pointer and a function call by name. The instructions indicated by asterisks on the left in the assembly listing represent the function calls.



Figure A-6. main: Function-Calling Code Example

C Source Code

extern int printf(char *,...);
main()
{
    int (*foo_bar)(char *,...);
    foo_bar = printf;
    foo_bar("Via pointer\n");
    printf("Direct\n");
}

Assembly Code

    mflr   R0                    # get value of LR
    stw    R31,-4(SP)            # save old R31 in stack
    lwz    R31,.CONSTANT(RTOC)   # get address of strings
    stw    R0,8(SP)              # save LR in caller's stack frame
    stwu   SP,-80(SP)            # create activation record
    lwz    R11,.printf(RTOC)     # get &(function descriptor)
    mr     R3,R31                # string address to parameter 1
*   bl     .ptrgl                # call pointer glue
*   lwz    RTOC,20(SP)           # reload RTOC from stack frame
    addi   R3,R31,16             # string address to parameter 1
*   bl     .printf               # call printf via glink code
*   ori    R0,R0,0               # no-op; placeholder for RTOC reload
    lwz    R12,88(SP)            # reload old LR
    lwz    R31,76(SP)            # restore R31
    mtlr   R12                   # load LR
    addi   SP,SP,80              # remove activation record
    blr                          # return via LR

The function call by pointer uses a system routine, ptrgl, shown in Figure A-7. The "." immediately preceding the function name in the assembly listing is a linker convention indicating that the address of the function is represented by the symbol. The ptrgl routine performs a control transfer to an external function whose address is unknown at compile-link time. On entry, it assumes that GPR11 contains the address of the function descriptor for the function being called. The ptrgl routine acts as a springboard to the external function, which will return directly to the call point and not to ptrgl. A compiler may inline the code for ptrgl.



Figure A-7. ptrgl Routine Code Sequence

lwz R0,0(R11) # load function's address
stw RTOC,20(SP) # save RTOC in stack frame
mtctr R0 # CTR = function address
lwz RTOC,4(R11) # RTOC = callee's RTOC
lwz R11,8(R11) # R11 = environment of callee
bctr # transfer to function
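
The effect of the ptrgl sequence, together with the caller's reload of RTOC after the call returns, can be modeled in C. This is only a sketch: the globals rtoc and r11_env stand in for GPR2 and GPR11, the local saved_toc stands in for the save slot at 20(SP), and func_desc, call_via_descriptor, and callee_double are hypothetical names. Real code performs these steps in registers, as shown in Figure A-7.

```c
#include <assert.h>

/* Three-word function descriptor, specialized to int (*)(int) for the model. */
typedef struct {
    int  (*func_ptr)(int);  /* the address of the function */
    void *toc_value;        /* RTOC value for the function */
    void *env;              /* environment pointer */
} func_desc;

void *rtoc;                 /* stand-in for GPR2 (RTOC) */
void *r11_env;              /* stand-in for GPR11 on entry to the callee */
void *toc_seen_by_callee;   /* records which TOC the callee ran with */

/* Hypothetical out-of-module callee. */
int callee_double(int x)
{
    toc_seen_by_callee = rtoc;
    return 2 * x;
}

/* Model of a call by pointer through a function descriptor. */
int call_via_descriptor(const func_desc *fd, int arg)
{
    assert(fd != 0 && fd->func_ptr != 0);
    void *saved_toc = rtoc;           /* stw RTOC,20(SP) */
    rtoc = fd->toc_value;             /* lwz RTOC,4(R11) */
    r11_env = fd->env;                /* lwz R11,8(R11)  */
    int result = fd->func_ptr(arg);   /* mtctr R0 ; bctr */
    rtoc = saved_toc;                 /* caller: lwz RTOC,20(SP) after return */
    return result;
}
```

The callee runs with its own TOC, and the caller's TOC is back in place once the call returns.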

When an external function is called by name, the linker injects a call to a global linkage (glink) routine and replaces the no-op with code to restore the caller's TOC address in RTOC on return:


bl .glink_printf # call glink for printf
lwz RTOC,20(SP) # restore TOC pointer

Figure A-8 shows the glink routine, which intercepts the call to the out-of-module function, obtains the location of the callee's function descriptor from the TOC, saves the caller's RTOC value, loads RTOC with the callee's TOC address, and transfers control to the function as in the case of call by pointer. This springboard code is unique for each procedure and is generated at link time.



Figure A-8. glink_printf Code Sequence

lwz R12,.printf(RTOC) # get address of descriptor
stw RTOC,20(SP) # save RTOC in stack frame
lwz R0,0(R12) # load function address
lwz RTOC,4(R12) # RTOC = callee's RTOC
mtctr R0 # CTR = function address
bctr # transfer to function

Statically (compile-time) bound procedures do not need springboard code and can be compiled without the no-op following the branch and link. The linker introduces the springboard code only when necessary. If the called routine is linked to the same module as its caller, then the compiled code sequence is unchanged; that is, the branch and link target is not directed to the springboard code and the no-op remains (control transfers directly to the called function).


Copyright 1998 IBMchips