Preface
This article introduces the ARM64 procedure call rules: it analyzes the call process in detail and explains how parameters are passed. Android, iOS, Linux, and other platforms generally follow these rules, although each operating system adds a small number of platform-specific rules of its own. The next post will cover the iOS-specific rules.
Terminology
Term | Meaning |
---|---|
A32 | The instruction set named ARM in the ARMv7 architecture; it uses 32-bit fixed-length instructions. |
A64 | The instruction set available in the AArch64 state. |
AAPCS64 | The Procedure Call Standard for the ARM 64-bit Architecture (AArch64). (PCS: Procedure Call Standard) |
AArch32 | The 32-bit general-purpose register width state of ARMv8, compatible with ARMv7-A. |
AArch64 | The 64-bit general-purpose register width state of ARMv8. |
ABI (Application Binary Interface) | The specification that an executable must conform to in order to run in a specific execution environment; for example, the Linux ABI is the specification for the Linux environment. |
ARM-based | Based on the ARM architecture. |
Floating point | Depending on context: (1) floating-point arithmetic that follows IEEE 754-2008; (2) the ARMv8 floating-point instruction set; (3) the register set shared by the ARMv8 floating-point instruction set and the ARMv8 SIMD instruction set. |
Q-o-I | Quality of Implementation |
SIMD | Single Instruction Multiple Data: one instruction operates on multiple data elements. |
T32 | The instruction set named Thumb in ARMv7; it uses variable-length instructions of 16 and 32 bits. |
Routine, subroutine | Routine: the caller; subroutine: the callee. |
Procedure | A routine that returns no result value. |
Function | A routine that returns a result value. |
PIC, PID | Position-independent code, position-independent data. |
Program state | The values of program memory and registers |
Caller-saved register | The caller saves the register before calling the function (usually by pushing it onto the stack) and restores it after the function returns (usually by popping it). |
Callee-saved register | The callee (the called function) saves the register on entry and restores it before returning. |
NGRN (The Next General-purpose Register Number) | The number of the next general-purpose register among r0-r7 (see Registers below) to use for an argument. It is set to 0 before parameter processing starts and is incremented by 1 each time a parameter is placed in an integer register. When it reaches 8, r0-r7 are used up and further parameters can only be placed in memory. |
NSRN (The Next SIMD and Floating-point Register Number) | Same as above, but counts the v0-v7 registers used. |
NSAA (The Next Stacked Argument Address) | The address at which the next stacked argument will be placed. It is set to SP before parameter processing starts, so the arguments passed in memory occupy the range from SP to NSAA. See Parameter passing below for details. |
Data types and alignment
Basic data types
Type Class | Machine Type | Byte size | Natural Alignment (bytes) |
---|---|---|---|
Integral | Unsigned byte | 1 | 1 |
Integral | Signed byte | 1 | 1 |
Integral | Unsigned half-word | 2 | 2 |
Integral | Signed half-word | 2 | 2 |
Integral | Unsigned word | 4 | 4 |
Integral | Signed word | 4 | 4 |
Integral | Unsigned double-word | 8 | 8 |
Integral | Signed double-word | 8 | 8 |
Integral | Unsigned quad-word | 16 | 16 |
Integral | Signed quad-word | 16 | 16 |
Floating Point | Half precision | 2 | 2 |
Floating Point | Single precision | 4 | 4 |
Floating Point | Double precision | 8 | 8 |
Floating Point | Quad precision | 16 | 16 |
Short vector | 64-bit vector | 8 | 8 |
Short vector | 128-bit vector | 16 | 16 |
Pointer | Data pointer | 8 | 8 |
Pointer | Code pointer | 8 | 8 |
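To connect the table to source code, here is a minimal C sketch that measures a few types with sizeof and _Alignof. The mapping from C types to machine types (for example uint32_t to "Unsigned word") is an assumption that holds on typical 64-bit targets such as AArch64 Linux; the names type_row, basic_types, and is_naturally_aligned are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a size/alignment table, measured on the compiling platform. */
struct type_row {
    const char *machine_type;
    size_t size;       /* bytes */
    size_t alignment;  /* bytes */
};

#define ROW(name, type) { name, sizeof(type), _Alignof(type) }

/* Assumed mapping from C types to the machine types in the table. */
static const struct type_row basic_types[] = {
    ROW("Unsigned byte",        uint8_t),
    ROW("Unsigned half-word",   uint16_t),
    ROW("Unsigned word",        uint32_t),
    ROW("Unsigned double-word", uint64_t),
    ROW("Single precision",     float),
    ROW("Double precision",     double),
    ROW("Data pointer",         void *),
};

/* An address is naturally aligned when it is a multiple of the alignment. */
static bool is_naturally_aligned(uintptr_t address, size_t natural_alignment) {
    assert(natural_alignment != 0);
    return address % natural_alignment == 0;
}
```

On an AArch64 target the measured values should agree with the table; on other platforms some rows (pointers in particular) may differ.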
Procedure call rules
Registers
Arm64 has two types of registers: general-purpose registers, which handle integers and pointers, and SIMD and floating-point registers.
- General-purpose registers and their AAPCS64 usage
Register | Alias | Meaning |
---|---|---|
SP | | Stack Pointer. |
r30 | LR | Link Register: holds the return address, i.e. the address of the instruction to execute after the called function returns. |
r29 | FP | Frame Pointer: holds the base address of the current function's stack frame. |
r19…r28 | | Callee-saved registers (see the terminology above). |
r18 | | Platform register, whose use is defined by the platform. If the platform does not assign it a special purpose, it can be used as a temporary register. (iOS reserves this register; applications must not use it.) |
r17 | IP1 | The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register. |
r16 | IP0 | The first intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register. |
r9…r15 | | Temporary registers. |
r8 | | In some cases the address of the return value is passed in r8 (see Function return values below). |
r0…r7 | | Pass arguments and return values during a function call. |
NZCV | | Condition flags: N (Negative), Z (Zero), C (Carry), V (Overflow). |
Arm64 has 31 general-purpose integer registers, r0-r30. A register is named x0-x30 when accessed as 64 bits and w0-w30 when accessed as 32 bits. Register names are written in uppercase (SP, LR, FP) when the register has a fixed role in this procedure call standard.
- SIMD and Floating-point registers
ARM64 has 32 registers, v0-v31, for SIMD and floating-point operations. The names B, H, S, D, and Q refer to views of 8 bits (byte), 16 bits (half), 32 bits (single), 64 bits (double), and 128 bits (quad), respectively. v0-v7 pass arguments and return values during a function call. v8-v15 are callee-saved registers (see the terminology above), but the callee preserves only the low 64 bits of each; if the caller needs the upper bits, it must save them itself. v0-v7 and v16-v31 need not be preserved by the callee, so the caller saves them if it still needs their values after the call.
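The register roles described in this section can be written down as a small lookup in C. This is only a sketch of the table and paragraph above; the enum and function names (gpr_role, gpr_role_of, simd_low64_callee_saved) are invented for this example and are not part of any ABI header.

```c
#include <assert.h>
#include <stdbool.h>

/* Role of a general-purpose register r0-r30 under AAPCS64. */
enum gpr_role {
    GPR_ARG_RESULT,      /* r0-r7: arguments and return values */
    GPR_INDIRECT_RESULT, /* r8: address of a result returned in memory */
    GPR_TEMPORARY,       /* r9-r15 */
    GPR_IP,              /* r16 (IP0), r17 (IP1) */
    GPR_PLATFORM,        /* r18 */
    GPR_CALLEE_SAVED,    /* r19-r28 */
    GPR_FRAME_POINTER,   /* r29 (FP) */
    GPR_LINK_REGISTER    /* r30 (LR) */
};

static enum gpr_role gpr_role_of(int n) {
    assert(n >= 0 && n <= 30);
    if (n <= 7)  return GPR_ARG_RESULT;
    if (n == 8)  return GPR_INDIRECT_RESULT;
    if (n <= 15) return GPR_TEMPORARY;
    if (n <= 17) return GPR_IP;
    if (n == 18) return GPR_PLATFORM;
    if (n <= 28) return GPR_CALLEE_SAVED;
    if (n == 29) return GPR_FRAME_POINTER;
    return GPR_LINK_REGISTER;
}

/* True when the callee must preserve the low 64 bits of v<n> (v8-v15). */
static bool simd_low64_callee_saved(int n) {
    assert(n >= 0 && n <= 31);
    return n >= 8 && n <= 15;
}
```

For example, gpr_role_of(19) reports a callee-saved register, while simd_low64_callee_saved(16) is false because v16-v31 are not preserved by the callee.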
Process, memory, stack
The memory of a process can be divided into five categories:
- Code (the program being executed). It must be readable by the process but need not be writable.
- Writable static data.
- Read-only static data.
- The heap.
- The stack.
Writable static data can be subdivided into initialized, zero-initialized, and uninitialized data. Except for the stack, each class of memory need not occupy a single contiguous region. A process must always have some code and a stack, but need not have any of the other three classes. The heap is an area of memory managed by the process itself, typically used to create dynamic data objects.
Memory address
The address space consists of one or more disjoint regions. A region must not span address zero, although it may start at zero. The use of tagged addressing is platform specific. When tagged addressing is disabled, all 64 bits of the pointer are passed to the address translation system. When tagged addressing is enabled, the top eight bits of the pointer are ignored for address translation. Note: tagged addressing here is not the same thing as tagged pointers in iOS.
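The effect of tagged addressing on a pointer can be illustrated with a few lines of C. This is a simplified model that only clears the top byte; real address translation is more involved, and the names pointer_tag and translation_input are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define ADDRESS_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* The top eight bits of the pointer, which form the tag. */
static uint8_t pointer_tag(uint64_t pointer) {
    return (uint8_t)(pointer >> TAG_SHIFT);
}

/* Simplified model of the bits that address translation looks at:
 * all 64 bits when tagging is disabled, the low 56 bits otherwise. */
static uint64_t translation_input(uint64_t pointer, bool tagging_enabled) {
    return tagging_enabled ? (pointer & ADDRESS_MASK) : pointer;
}
```

With tagging enabled, two pointers that differ only in their top byte refer to the same memory location in this model.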
The stack
The stack is a contiguous region of memory that can be used to store local variables and to pass parameters when the argument registers are insufficient. The stack grows from high addresses to low addresses, and the address of the current top of the stack is held in SP. Stack usage restrictions:
- Stack-limit < SP <= stack-base
- A process may only access (read or write) the closed interval [SP, stack-base-1]
- SP mod 16 = 0
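The three stack restrictions can be expressed as simple checks. The following C sketch is one way to write them down, assuming plain 64-bit integers stand in for addresses; sp_is_valid, stack_access_allowed, and reserve_stack are illustrative names, not real system APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stack-limit < SP <= stack-base, and SP mod 16 = 0. */
static bool sp_is_valid(uint64_t sp, uint64_t stack_limit, uint64_t stack_base) {
    return stack_limit < sp && sp <= stack_base && sp % 16 == 0;
}

/* An access of `size` bytes must lie inside [SP, stack-base - 1]. */
static bool stack_access_allowed(uint64_t address, uint64_t size,
                                 uint64_t sp, uint64_t stack_base) {
    return size > 0 && address >= sp && address < stack_base &&
           size <= stack_base - address;
}

/* The stack grows downward: reserving space lowers SP. The request is
 * rounded up to a multiple of 16 so that SP stays 16-byte aligned. */
static uint64_t reserve_stack(uint64_t sp, uint64_t bytes) {
    uint64_t rounded = (bytes + 15) & ~UINT64_C(15);
    return sp - rounded;
}
```

For instance, reserving 20 bytes lowers SP by 32, because the reservation is rounded up to keep SP a multiple of 16.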
A function call
The A64 instruction set contains the function call instructions BL and BLR. Executing BL stores the address of the next instruction in program order, i.e. the return address (the address of the instruction to execute once the call completes), in LR, and loads the branch target address into the PC (Program Counter). BLR behaves like BL, except that the target address is read from a register.
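What BL and BLR do to LR and PC can be modeled with a toy CPU state in C. This is a sketch, not an emulator: it relies on the fact that every A64 instruction is 4 bytes long, and the struct and function names are invented here. exec_ret models the RET instruction with its default register x30, which the text above does not cover but which completes the round trip.

```c
#include <assert.h>
#include <stdint.h>

#define A64_INSTRUCTION_SIZE 4u  /* every A64 instruction is 4 bytes */

/* Toy model: only the program counter and x0-x30 (x[30] is LR). */
struct cpu_state {
    uint64_t pc;
    uint64_t x[31];
};

/* BL: LR = address of the next instruction, PC = branch target. */
static void exec_bl(struct cpu_state *cpu, uint64_t target) {
    cpu->x[30] = cpu->pc + A64_INSTRUCTION_SIZE;
    cpu->pc = target;
}

/* BLR: same as BL, but the target comes from register x<reg>. */
static void exec_blr(struct cpu_state *cpu, unsigned reg) {
    assert(reg < 31);
    uint64_t target = cpu->x[reg];  /* read first, in case reg is 30 */
    cpu->x[30] = cpu->pc + A64_INSTRUCTION_SIZE;
    cpu->pc = target;
}

/* RET: continue at the return address held in LR. */
static void exec_ret(struct cpu_state *cpu) {
    cpu->pc = cpu->x[30];
}
```

Because a nested call overwrites LR, a function that calls another function has to save LR first (typically on the stack) and restore it before returning.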
Parameter passing
Parameters can be passed through r0-r7, v0-v7, and the stack. If there are few enough parameters to fit in registers, only registers are used.
Variadic functions
The parameters of a variadic function fall into two groups: named (declared) parameters and anonymous (optional) parameters. When a variadic function is called with no optional arguments (only the declared ones), the call is handled exactly like a call to a function with a fixed parameter list.
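In C, the distinction between named and anonymous parameters looks like this. The function sum_ints is a made-up example: count is the named parameter, and the integers that follow it are the anonymous ones, read with the standard stdarg.h macros.

```c
#include <stdarg.h>

/* `count` is the named (declared) parameter; the `count` int values
 * that follow it are the anonymous (optional) parameters. */
static long sum_ints(int count, ...) {
    va_list args;
    long total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(0) passes only the named argument, which is the case where the call behaves like a call to an ordinary fixed-argument function.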
Parameter passing rule
Parameter passing can be conceptually divided into two levels:
- Mapping from source language parameter types to machine types (the mapping rules differ for each source language)
- Marshaling the machine types to produce the final parameter list
The marshaling process itself is divided into three phases:
- Phase A – Initialization (this phase is performed exactly once, before any parameter is processed)
  - NGRN = 0 (for the meaning of NGRN, see the terminology)
  - NSRN = 0 (for the meaning of NSRN, see the terminology)
  - NSAA = SP (for the meaning of NSAA, see the terminology)
- Phase B – Pre-padding and extension of parameters (each parameter in the parameter list is matched against the following rules, and the first rule that matches is applied to it)
  - If the parameter is a composite type whose size cannot be statically determined by both the caller and the callee, the parameter is copied to memory and replaced with a pointer to that copy. (C and C++ have no such types; some other languages do.)
  - If the parameter is an HFA (Homogeneous Floating-point Aggregate) or HVA (Homogeneous Short-Vector Aggregate), it is used unmodified.
  - If the parameter is a composite type larger than 16 bytes, the caller allocates memory, copies the parameter into it, and replaces the parameter with a pointer to that copy.
  - If the parameter is a composite type, its size is rounded up to the nearest multiple of 8 bytes. (For example, a size of 9 bytes becomes 16 bytes.)
- Phase C – Assign parameters to registers or the stack (for each parameter in the parameter list, the following rules are applied in turn until the parameter has been placed in a register or on the stack; the parameter is then done and the next one is taken from the list). Note: when a parameter is assigned to a register, the value of any unused bits in the register is unspecified. When a parameter is assigned to the stack, the value of any padding bytes is unspecified.
  - (1) If the parameter is a half (16-bit), single (32-bit), double (64-bit), or quad (128-bit) floating-point type or a short vector type, and NSRN is less than 8, the parameter is placed in the least significant bits of register v[NSRN]. NSRN is incremented by 1. The parameter is done.
  - (2) If the parameter is an HFA or HVA, and NSRN + (number of HFA or HVA members) ≤ 8, each member is placed in turn in a SIMD and floating-point register, and NSRN = NSRN + number of HFA or HVA members. The parameter is done.
  - (3) If the parameter is an HFA or HVA, NSRN is set to 8 (so v0-v7 are treated as used up) and the size of the parameter is rounded up to the nearest multiple of 8 bytes. (For example, a size of 9 bytes becomes 16 bytes.)
  - (4) If the parameter is an HFA, an HVA, a quad (128-bit) floating-point type, or a short vector type, NSAA is rounded up to a multiple of max(8, natural alignment of the parameter).
  - (5) If the parameter is a half (16-bit) or single (32-bit) floating-point type, its size is extended to 8 bytes.
  - (6) If the parameter is an HFA, an HVA, a half, single, double, or quad floating-point type, or a short vector type, the parameter is copied to memory at NSAA, and NSAA = NSAA + size(parameter). The parameter is done.
  - (7) If the parameter is an integer or pointer type, size(parameter) <= 8 bytes, and NGRN is less than 8, the parameter is copied to the least significant bits of x[NGRN]. NGRN is incremented by 1. The parameter is done.
  - (8) If the parameter has an alignment of 16 bytes, NGRN is rounded up to the next even number. (For example, if NGRN is 2 it stays 2; if NGRN is 3 it becomes 4. Note: the iOS ABI does not have this rule.)
  - (9) If the parameter is an integer type of size 16 bytes and NGRN is less than 7, the parameter is copied to x[NGRN] and x[NGRN+1], where x[NGRN] holds the low half. NGRN = NGRN + 2. The parameter is done.
  - (10) If the parameter is a composite type that fits entirely in the remaining X registers (8 - NGRN >= size of the parameter in bytes / 8), it is copied into consecutive registers starting at x[NGRN]. The value of any unused bits is unspecified. NGRN = NGRN + number of registers used for this parameter. The parameter is done.
  - (11) NGRN is set to 8.
  - (12) NSAA is rounded up to a multiple of max(8, natural alignment of the parameter).
  - (13) If the parameter is a composite type, it is copied to memory at NSAA, and NSAA = NSAA + size(parameter). The parameter is done.
  - (14) If the parameter is smaller than 8 bytes, its size is set to 8 bytes, and the value of the upper bits is unspecified.
  - (15) The parameter is copied to memory at NSAA, and NSAA = NSAA + size(parameter). The parameter is done.
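The Phase B rules for composite types can be sketched as a small decision function in C. This is an illustration under stated assumptions: the size is statically known (so the first Phase B rule never applies), the caller already knows whether the type is an HFA or HVA, and the names prep_kind, prepared_arg, and prepare_composite are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

enum prep_kind {
    PREP_UNMODIFIED,  /* HFA or HVA: used as is */
    PREP_BY_POINTER,  /* larger than 16 bytes: replaced by a pointer to a copy */
    PREP_ROUNDED      /* size rounded up to a multiple of 8 bytes */
};

struct prepared_arg {
    enum prep_kind kind;
    size_t size;  /* size in bytes of what Phase C will place */
};

static size_t round_up(size_t value, size_t multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

/* Applies the first matching Phase B rule to a composite argument. */
static struct prepared_arg prepare_composite(size_t size, bool is_hfa_or_hva) {
    struct prepared_arg out;
    if (is_hfa_or_hva) {
        out.kind = PREP_UNMODIFIED;
        out.size = size;
    } else if (size > 16) {
        out.kind = PREP_BY_POINTER;
        out.size = 8;  /* a data pointer is 8 bytes */
    } else {
        out.kind = PREP_ROUNDED;
        out.size = round_up(size, 8);
    }
    return out;
}
```

Note how the order of the rules matters: a 32-byte HFA of four doubles is left unmodified because the HFA rule matches before the "larger than 16 bytes" rule.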
The rules above lead to the following observations:
- After all the arguments in the argument list have been processed, the caller knows how much stack space was used to pass them (NSAA - SP).
- Floating-point and short vector types are passed in the V registers or on the stack, never in the general-purpose registers (unless they are members of a small composite type).
- The values of the unused parts of registers and stack slots are unspecified.
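The bookkeeping of NGRN, NSRN, and NSAA in Phase C can be simulated for the simplest case. The sketch below is deliberately restricted to scalar arguments: integers or pointers of at most 8 bytes, and half, single, or double floating-point values. Composite types, quad precision, short vectors, and 16-byte integers are left out, and NSAA is tracked as an offset from SP. All type and function names are hypothetical.

```c
#include <stdint.h>

/* Only two argument classes are modeled in this sketch. */
enum arg_class {
    ARG_INTEGER,  /* integer or pointer, at most 8 bytes */
    ARG_FLOAT     /* half, single, or double precision */
};

enum location_kind { IN_X_REGISTER, IN_V_REGISTER, ON_STACK };

struct location {
    enum location_kind kind;
    unsigned index;         /* register number, when in a register */
    uint64_t stack_offset;  /* offset from SP, when on the stack */
};

struct marshal_state {
    unsigned ngrn;  /* next general-purpose register number */
    unsigned nsrn;  /* next SIMD and floating-point register number */
    uint64_t nsaa;  /* next stacked argument address, as an offset from SP */
};

/* Phase A: initialization. */
static void marshal_init(struct marshal_state *s) {
    s->ngrn = 0;
    s->nsrn = 0;
    s->nsaa = 0;
}

/* Phase C, restricted to scalars. */
static struct location marshal_scalar(struct marshal_state *s, enum arg_class cls) {
    struct location loc = { ON_STACK, 0, 0 };

    if (cls == ARG_FLOAT && s->nsrn < 8) {    /* rule (1): next V register */
        loc.kind = IN_V_REGISTER;
        loc.index = s->nsrn++;
        return loc;
    }
    if (cls == ARG_INTEGER && s->ngrn < 8) {  /* rule (7): next X register */
        loc.kind = IN_X_REGISTER;
        loc.index = s->ngrn++;
        return loc;
    }
    if (cls == ARG_INTEGER) {
        s->ngrn = 8;                          /* rule (11) */
    }
    /* Stack case: every scalar here occupies an 8-byte slot, so NSAA
     * stays a multiple of 8 and needs no extra rounding in this model. */
    loc.stack_offset = s->nsaa;
    s->nsaa += 8;
    return loc;
}
```

After the last argument has been processed, the nsaa field is the NSAA - SP value from the first observation above, i.e. the number of stack bytes used for arguments. A function taking ten integers, for example, puts eight in x0-x7 and uses 16 bytes of stack for the remaining two.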
Function return values
How a function returns its result depends on the type of the result.
- If the result is of type T and, for a function declared as follows,
void func(T arg)
the value of arg would be passed in a register (or group of registers) according to the parameter passing rules, then the result is returned in the same register (or group of registers).
- Otherwise, the caller allocates a block of memory (large enough and suitably aligned to hold the result) and passes its address to the callee in x8. While the callee runs, it can write to the memory that x8 points to in order to return the result.
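The two return conventions can be illustrated in C with one small and one large structure. This is a sketch of the idea only: make_large_lowered shows what a by-value return of a large structure conceptually turns into, but its explicit pointer parameter merely mimics the mechanism, since a real compiler passes the result address in x8 rather than as an ordinary argument. All names are invented for the example.

```c
#include <stdint.h>

/* 16 bytes: small enough to come back in registers. */
struct small_result {
    uint64_t low;
    uint64_t high;
};

/* 32 bytes and not an HFA: returned through caller-provided memory. */
struct large_result {
    uint64_t values[4];
};

static struct small_result make_small(uint64_t seed) {
    struct small_result r = { seed, seed + 1 };
    return r;
}

/* Source-level view: the structure is returned by value. */
static struct large_result make_large(uint64_t seed) {
    struct large_result r;
    for (int i = 0; i < 4; i++) {
        r.values[i] = seed + (uint64_t)i;
    }
    return r;
}

/* Conceptual lowered view: the caller reserves the memory and hands its
 * address to the callee, which fills it in. */
static void make_large_lowered(struct large_result *result, uint64_t seed) {
    for (int i = 0; i < 4; i++) {
        result->values[i] = seed + (uint64_t)i;
    }
}
```

Both make_large and make_large_lowered produce the same bytes; the difference is only in who owns the result memory and how its address reaches the callee.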
Conclusion
If you find mistakes in this article, please point them out in a comment, or email me at [email protected].