Ancient artifacts with the scale, law should be the moment. – Su Xun

What is a function

An executable program is a collection of machine instructions arranged according to specific rules in order to accomplish some task. Whether the source language is high-level or low-level, object-oriented or procedural, the code is ultimately translated into machine instructions before it runs. To make code easier to manage and reuse, a group of instructions that accomplishes one specific task is often separated out and treated as a unit. This is the concept of a function, which is also called a subroutine or procedure. The machine instructions of an executable program are therefore not one undivided run of code but a set of code blocks, one per function, and the program is built and organized out of functions calling one another.

A function consists of four parts: the function signature, the parameters, the return value, and the implementation. The first three define the function's boundary explicitly and are together known as the function interface description. The point of the interface description is that the caller no longer needs to know the implementation details of the called function; it only needs to interact with it through the interface that the called function defines. How to define a function, how to implement it, how to call it, how to pass parameters to it, and how to use the value it returns all need a unified specification. That specification exists at two levels. At the high-level language level, the rules are called the API. At the machine instruction level, different operating systems and different CPU architectures provide different instruction sets and different ways of constructing programs, so the rules at the system level are called the ABI. This article focuses on the ABI rules for function calls, parameter passing, and return values, which should give you a better understanding of what a function is. Note that the ABI rules described here are those for programs written in Objective-C (OC). They do not apply to programs written in Swift, nor to other operating systems such as Linux.
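The split between interface and implementation can be sketched in C. This is only an illustration: the names add_longs, add_three, and check_add are made up for this example and do not come from any library.

```c
#include <assert.h>

/* Interface: the name, the parameter types, and the return type.
   This declaration is all a caller needs to see. */
long add_longs(long a, long b);

/* A caller written against the interface only. It compiles without
   knowing how add_longs is implemented. */
long add_three(long a, long b, long c) {
    return add_longs(add_longs(a, b), c);
}

/* Implementation: hidden from callers and free to change as long as
   the interface above stays the same. */
long add_longs(long a, long b) {
    return a + b;
}

/* Small self-check that exercises the caller through the interface. */
void check_add(void) {
    assert(add_three(1, 2, 3) == 6);
}
```

At the source level, the declaration plays the role of the API. How a and b actually reach add_longs, and how the result comes back, is what the ABI decides.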

Because there is a lot of material, I have split it into two articles: this one covers the function interface, and the next one covers the function implementation.

Function calls

The program counter (IP/PC) in the CPU always holds the memory address of the next instruction to be executed. Executing an instruction updates the program counter so that execution can proceed to the next instruction, and the system runs a program by continually changing this value. In general the program counter advances in instruction order, and that order is broken only when jump instructions and function calls are executed.

The essence of a function call is to assign the address of the function's first instruction to the program counter (IP/PC), so that the next instruction executed is the first instruction of the function. Besides updating the program counter, the call must also save the call site, so that execution can continue with the instruction after the call once the function returns. Saving the call site means saving the address of the instruction that follows the call. Each CPU family provides dedicated instructions for this. For example, x86 provides the call instruction, which updates the program counter and also pushes the address of the instruction following the call onto the stack. ARM provides the BL family of branch instructions, which update the program counter and also save the address of the instruction following the call in the LR register.

The essence of a function return is to assign the saved call-site address to the program counter, so that the next instruction executed is the one that follows the call in the caller. Each CPU family also provides a dedicated return instruction, with the exception of 32-bit ARM. For example, x86 provides the ret instruction, which assigns the address stored at the top of the stack to the program counter and then pops the stack. 64-bit ARM also provides a ret instruction, which assigns the value of the LR register to the program counter.

On x86, the address of the caller's next instruction is pushed onto the stack when the call is executed, and the top of the stack then moves further down as the called function sets up its own local stack frame. Before the called function executes ret, the stack pointer register SP must therefore point to the same top-of-stack address as it did when the called function was entered. Otherwise ret pops the wrong value as the return address, and the program crashes.

On ARM there is only one LR register, so if the called function itself calls other functions, LR is overwritten, and once that happens the original call site can no longer be restored. For this reason the first few instructions of a called function normally save LR to stack memory, and its last few instructions restore LR from stack memory.

A special case arises when the function call is the last instruction of the calling function. The call site does not need to be saved, and the call instruction is replaced by a jump instruction, because there is no valid instruction after the caller's last instruction. If a call instruction were still used, the saved call site would be an invalid address, and returning from the function would jump to that invalid address and cause an execution exception. With a jump, the return instruction of the called function goes straight back to whoever called the caller.
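A minimal C sketch of a call in this last-instruction position (a tail call) next to one that is not. The function names are hypothetical, and whether a compiler really emits a jump instead of a call depends on the compiler and its optimization settings, so treat the comments as the expected pattern rather than a guarantee.

```c
#include <assert.h>

/* The recursive call is the last thing this function does, so nothing
   remains to be executed after the callee returns. A compiler may
   replace the call with a jump here. */
long sum_to_acc(long n, long acc) {
    assert(n >= 0);
    if (n == 0) {
        return acc;
    }
    return sum_to_acc(n - 1, acc + n);  /* call in tail position */
}

/* Not a tail call: the addition still has to run after the callee
   returns, so the call site must be saved with a real call. */
long sum_to(long n) {
    assert(n >= 0);
    if (n == 0) {
        return 0;
    }
    return n + sum_to(n - 1);
}
```

Both functions compute the same sum. Only the first leaves the compiler free to skip saving the call site.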

To describe the calling rules more concretely, suppose function A calls function B and then function C. The pseudocode below gives the address of each function, the address of each call, and the call itself:

// XX, YY, and ZZ stand for the memory addresses of the instructions in each function.
A
XX1:
XX2: call function B at address YY1
XX3:
XX4:
XXn: jump to function C at address ZZ1

B
YY1:
YY2:
YY3:
YYn: return

C
ZZ1:
ZZ2:
ZZ3:
ZZn: return

1. Function call rules in x86_64

1.1 Function invocation

The function call instruction is call. In assembly language, the operand of call is written as the absolute address of the target function. In the actual machine instruction, the operand is a relative value: the offset of the target function's address from the current instruction's address. On both x86 and ARM, when the operand of an instruction is a memory address, it is usually encoded as an offset relative to the current instruction rather than as an absolute address. Here is the call instruction and its equivalent internal operations.

call YY1   <==>   RIP = YY1,   RSP = RSP-8,  *RSP = XX3

That is, executing a call instruction is equivalent to assigning the address in the instruction to the IP register and pushing the function's return address onto the stack.

1.2 Jump of functions

The function jump instruction is jmp. In assembly language, the operand of jmp is written as the absolute address of the target function, while in the actual machine instruction the operand is a relative value: the offset of the target function's address from the current instruction's address. Here is the jump instruction and its equivalent internal operation.

  jmp ZZ1  <==>  RIP = ZZ1

That is, executing a jump instruction is equivalent to assigning the address in the instruction to the IP register.

1.3 Return of function

The function return instruction is ret, which generally takes no operand. Here is the return instruction and its equivalent internal operations.

 ret   <==>   RIP = *RSP,   RSP = RSP + 8

That is, executing a ret instruction is equivalent to assigning the value at the top of the stack to the IP register and then popping the stack.
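The three equivalences for call, jmp, and ret can be modeled with a toy C structure. This is a sketch for reasoning about the register updates, not an emulator: ToyX86 and the toy_* functions are invented for this example, and the stack is a small array indexed by a byte offset.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the two registers involved in call, jmp, and ret. */
typedef struct {
    uint64_t rip;        /* address of the next instruction */
    uint64_t rsp;        /* stack pointer as a byte offset into stack[] */
    uint64_t stack[16];  /* 16 eight-byte slots of stack memory */
} ToyX86;

void toy_init(ToyX86 *cpu, uint64_t entry) {
    cpu->rip = entry;
    cpu->rsp = sizeof cpu->stack;  /* empty stack: RSP at the highest address */
}

/* call target  <==>  RSP = RSP - 8, *RSP = return address, RIP = target */
void toy_call(ToyX86 *cpu, uint64_t target, uint64_t return_addr) {
    assert(cpu->rsp >= 8);
    cpu->rsp -= 8;
    cpu->stack[cpu->rsp / 8] = return_addr;
    cpu->rip = target;
}

/* jmp target  <==>  RIP = target (nothing is saved) */
void toy_jmp(ToyX86 *cpu, uint64_t target) {
    cpu->rip = target;
}

/* ret  <==>  RIP = *RSP, RSP = RSP + 8 */
void toy_ret(ToyX86 *cpu) {
    assert(cpu->rsp + 8 <= sizeof cpu->stack);
    cpu->rip = cpu->stack[cpu->rsp / 8];
    cpu->rsp += 8;
}
```

A call followed by a ret leaves RSP where it started and RIP at the saved return address, while a jmp changes only RIP.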

2. Function call rules in ARM32-bit system

2.1 Function Invocation

The function call instructions are BL and BLX. BL takes a PC-relative address offset, while BLX takes either an offset or a register. The difference between them is that BL does not switch instruction sets, whereas BLX can switch from the Thumb instruction set to the ARM instruction set or vice versa. A 32-bit ARM system has two instruction sets, Thumb and ARM. All ARM instructions are 32 bits long, while Thumb instructions are either 16 or 32 bits long. The instruction set is chosen per function: all instructions in a function are either ARM instructions or Thumb instructions. As a result, if the caller and the called function use different instruction sets, the call has to go through BLX, and if they use the same instruction set, BL is used. Here are the call instructions and their equivalent internal operations.

 bl/blx  YY1  <==>  PC = YY1,  LR = XX3

That is, executing a function call instruction is equivalent to assigning the address in the instruction to the PC register and the function's return address to the LR register.

2.2 Jump of functions

The function jump instructions are B and BX. B takes a PC-relative address offset, and BX takes a register. The difference between them is that B does not switch instruction sets, whereas BX can. Here are the jump instructions and their equivalent internal operation.

b/bx ZZ1   <==>  PC = ZZ1

That is, executing a jump instruction is equivalent to assigning the address in the instruction to the PC register.

2.3 Return of functions

32-bit ARM has no dedicated ret instruction, because the value of the PC register can be modified directly. A function can therefore return either by assigning the return address to PC or by executing bx lr, which jumps to the address held in the LR register.

bx lr    // or: mov pc, XXX

3. Function call rules in ARM64-bit system

3.1 Function Invocation

The function call instructions are BL and BLR. The operand of BL is an address offset relative to the current position, and the operand of BLR is a register that holds the address to call. Every instruction on 64-bit ARM occupies 4 bytes, and BL stores its offset in a 26-bit field that counts 4-byte words, so BL can reach targets within ±128MB of the current position. To call an address farther away, BLR is required. Here are the call instructions and their equivalent internal operations.

// If YY1 is within ±128MB of the call instruction, use bl.
bl YY1   <==>   PC = YY1,  LR = XX3

// If YY1 is more than ±128MB away from the call instruction, make an indirect call with blr.
ldr x16, YY1
blr x16

That is, executing a function call instruction is equivalent to assigning the target address to the PC register and the function's return address to the LR register.
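How far a PC-relative branch can reach follows from the width of its offset field. The helper below is a sketch with a hypothetical name (branch_in_range). It assumes the field is a signed count of 4-byte words, and it takes the field width as a parameter: as far as I know, the 64-bit ARM bl uses a 26-bit field (±128MB) and the 32-bit ARM-mode bl uses a 24-bit field (±32MB).

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `target` can be encoded as a PC-relative branch from `pc`
   when the instruction has a signed `imm_bits`-bit field that counts
   4-byte words, and 0 otherwise. */
int branch_in_range(uint64_t pc, uint64_t target, unsigned imm_bits) {
    assert(imm_bits >= 2 && imm_bits <= 32);
    int64_t offset = (int64_t)(target - pc);
    int64_t max_words = ((int64_t)1 << (imm_bits - 1)) - 1;
    int64_t min_words = -((int64_t)1 << (imm_bits - 1));
    if (offset % 4 != 0) {
        return 0;  /* targets must be 4-byte aligned relative to pc */
    }
    int64_t words = offset / 4;
    return words >= min_words && words <= max_words;
}
```

When the check fails, the target address has to be loaded into a register first and the call made indirectly.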

3.2 Jump of functions

The function jump instructions are B and BR. The operand of B is an address offset relative to the current position, and the operand of BR is a register that holds the address to jump to. Here are the jump instructions and their equivalent internal operation.

b ZZ1   <==>  PC = ZZ1

That is, executing a jump instruction is equivalent to assigning the address in the instruction to the PC register.

3.3 Return of a function

The function return instruction is ret. Here is the return instruction and its equivalent internal operation.

 ret  <==>   PC = LR

That is, executing a ret instruction is equivalent to assigning the value of the LR register to the PC register.
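Because bl writes the return address to LR instead of the stack, a function that makes further calls has to spill LR itself. The toy model below sketches why. ToyArm and the arm_* functions are invented for this illustration, and the addresses are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t pc;        /* address of the next instruction */
    uint64_t lr;        /* link register: the saved return address */
    uint64_t sp;        /* stack pointer as a byte offset into stack[] */
    uint64_t stack[8];
} ToyArm;

/* bl target  <==>  LR = return address, PC = target (stack untouched) */
void arm_bl(ToyArm *cpu, uint64_t target, uint64_t return_addr) {
    cpu->lr = return_addr;
    cpu->pc = target;
}

/* ret  <==>  PC = LR */
void arm_ret(ToyArm *cpu) {
    cpu->pc = cpu->lr;
}

/* Typical first step of a function that calls others: spill LR. */
void arm_push_lr(ToyArm *cpu) {
    assert(cpu->sp >= 8);
    cpu->sp -= 8;
    cpu->stack[cpu->sp / 8] = cpu->lr;
}

/* Typical last step before returning: reload LR. */
void arm_pop_lr(ToyArm *cpu) {
    assert(cpu->sp + 8 <= sizeof cpu->stack);
    cpu->lr = cpu->stack[cpu->sp / 8];
    cpu->sp += 8;
}

/* Runs A -> B -> C and reports the PC after B returns.
   save_lr selects whether B spills LR around its call to C. */
uint64_t arm_nested_return(int save_lr) {
    ToyArm cpu = {0};
    cpu.pc = 0x1000;
    cpu.sp = sizeof cpu.stack;
    arm_bl(&cpu, 0x2000, 0x1004);       /* A calls B: LR = 0x1004 */
    if (save_lr) arm_push_lr(&cpu);
    arm_bl(&cpu, 0x3000, 0x2008);       /* B calls C: LR = 0x2008 */
    arm_ret(&cpu);                      /* C returns into B */
    if (save_lr) arm_pop_lr(&cpu);
    arm_ret(&cpu);                      /* B returns to wherever LR points */
    return cpu.pc;
}
```

With the spill, B returns to 0x1004 in A. Without it, LR still holds 0x2008 and B "returns" into its own body.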

Function parameter transfer

Some functions are defined with parameters that the caller must pass to the called function, so the arguments have to be set up before the call instruction is executed. A function can take no parameters, a fixed number of parameters, or any number of parameters (variadic). Each parameter can be of an integer type, a floating-point type, a pointer, or a structure. The parameter passing rules therefore need to specify where the caller stores each argument and where the called function reads it. Different systems make different rules based on the number and types of the parameters. In general, each system designates certain registers for parameter passing and uses stack memory for whatever does not fit.

1. Parameter transfer rules in x86_64

1.1 General Type Parameters

General type parameters here means parameters of any type other than floating-point and structure types. Here are the rules for passing them:

  • R1: If the function has no arguments, nothing is done except to execute the function call. If the function has arguments, the parameter values should be set as follows before executing the function call instruction.

  • R2: If the function has 6 or fewer parameters, they are stored in the six registers RDI, RSI, RDX, RCX, R8, and R9, in the order they are declared from left to right.

  • R3: If the function has more than 6 parameters, the parameters beyond the sixth are pushed onto the stack from right to left. (Because the stack grows from high addresses toward low addresses, these parameters still appear in left-to-right order when read starting from the top of the stack.)

  • R4: If a parameter type is smaller than 8 bytes, then within the first 6 parameters it is stored in the corresponding 32-bit, 16-bit, or 8-bit form of the registers listed above.

Here are several function declarations and the rules that apply when calling them and passing their parameters (the first part of the listing is the function interface, and the second part shows the call under the ABI rules):

void foo1(long, long);
void foo2(long, long, long, long, long, long);
void foo3(long, long, long, long, long, long, long, int, short);

foo1(a, b) <==> RDI = a, RSI = b, call foo1
foo2(a, b, c, d, e, f) <==> RDI = a, RSI = b, RDX = c, RCX = d, R8 = e, R9 = f, call foo2
// Each argument passed on the stack occupies one 8-byte slot, even when its type is smaller.
foo3(a, b, c, d, e, f, g, h, i) <==> RDI = a, RSI = b, RDX = c, RCX = d, R8 = e, R9 = f, RSP -= 8, *RSP = i, RSP -= 8, *RSP = h, RSP -= 8, *RSP = g, call foo3

1.2 Floating point parameters

If a function has floating-point parameters (single or double precision), they are stored not in general-purpose registers but in dedicated floating-point registers. Here are the rules:

  • R5: If the function has 8 or fewer floating-point parameters, they are stored in the eight registers XMM0-XMM7 in the order they are declared from left to right.

  • R6: If the function has more than 8 floating-point parameters, the parameters beyond the eighth are pushed onto the stack from right to left.

  • R7: If a function has both floating-point and general parameters, the two kinds are assigned to their registers independently, and neither affects the order or rules of the other.

  • R8: A long double parameter, the 16-byte extended floating-point type, is always pushed directly onto the stack rather than placed in a floating-point register. (The stack here is the ordinary memory stack, not the floating-point register stack.)

Here are some examples of functions:

void foo4(double, double);
void foo5(double, float, double, double, double, double, double, double, float, double);
void foo6(long, double, long, double, long, long, double);
void foo7(double, long double, long);

foo4(a, b) <==> XMM0 = a, XMM1 = b, call foo4
// Stack arguments occupy one 8-byte slot each, so the float i also takes 8 bytes.
foo5(a, b, c, d, e, f, g, h, i, j) <==> XMM0 = a, XMM1 = b, XMM2 = c, XMM3 = d, XMM4 = e, XMM5 = f, XMM6 = g, XMM7 = h, RSP -= 8, *RSP = j, RSP -= 8, *RSP = i, call foo5
foo6(a, b, c, d, e, f, g) <==> RDI = a, XMM0 = b, RSI = c, XMM1 = d, RDX = e, RCX = f, XMM2 = g, call foo6
foo7(a, b, c) <==> XMM0 = a, RSP -= 16, *RSP = low 8 bytes of b, *(RSP+8) = high 8 bytes of b, RDI = c, call foo7
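Rules R2, R3, and R5 to R7 amount to two independent counters, one per register file. The sketch below makes that explicit. The function name sysv_arg_location and the one-letter encoding of argument kinds are assumptions made for this example; long double and structure arguments are deliberately left out, and stack arguments are only numbered, starting from the top of the stack.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* kinds holds one letter per argument: 'i' for an integer or pointer,
   'f' for a float or double. Writes the location of argument `index`
   into out, e.g. "RDX", "XMM1", or "stack[0]". */
void sysv_arg_location(const char *kinds, size_t index,
                       char *out, size_t out_size) {
    static const char *const int_regs[6] =
        {"RDI", "RSI", "RDX", "RCX", "R8", "R9"};
    size_t ints = 0, floats = 0, stack_args = 0;

    assert(index < strlen(kinds));
    for (size_t i = 0; i <= index; i++) {
        assert(kinds[i] == 'i' || kinds[i] == 'f');
        int is_float = (kinds[i] == 'f');
        size_t *used = is_float ? &floats : &ints;  /* independent counters (R7) */
        size_t limit = is_float ? 8 : 6;
        int in_register = (*used < limit);

        if (i == index) {
            if (!in_register) {
                snprintf(out, out_size, "stack[%zu]", stack_args);
            } else if (is_float) {
                snprintf(out, out_size, "XMM%zu", floats);
            } else {
                snprintf(out, out_size, "%s", int_regs[ints]);
            }
            return;
        }
        if (in_register) {
            (*used)++;
        } else {
            stack_args++;
        }
    }
}
```

For the foo6 signature, encoded as "ififiif", the sixth argument lands in RCX and the seventh in XMM2, which matches the listing above.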

1.3 Structural parameters

For structure parameters, both the data types of the members and the size of the structure matter. By size, structures fall into three classes: 8 bytes or less, 16 bytes or less, and more than 16 bytes. By member type, they also fall into three classes: all general data types, all floating-point types (excluding long double), and mixed types. That gives 9 combinations. The following table describes the passing rules for structure parameters:

  • R9:
Type / Size | <= 8 bytes | <= 16 bytes | > 16 bytes
All general data types | One of the six general-purpose registers | Two consecutive general-purpose registers | Pushed onto the stack
All floating-point types | One of the eight floating-point registers | Two of the eight floating-point registers | Pushed onto the stack
Mixed types | A general-purpose register takes precedence over a floating-point register, following the order of the members | Same as the cell to the left, applied to each 8-byte unit | Pushed onto the stack
  • R10: A structure of 16 bytes or less is not stored member by member. It is split into 8-byte units following the structure's memory layout, and each unit is stored in a register.

  • R11: If a function mixes structure, general, and floating-point parameters, each parameter is passed according to whichever of the rules above applies to it.

Here is some sample code for structures used as arguments:

struct S1 { char a; char b; int c; };
struct S2 { float a; float b; double c; };
struct S3 { int a; int b; double c; };
struct S4 { long a; long b; double c; };

void foo8(struct S1);
void foo9(struct S2);
void foo10(struct S3);
void foo11(struct S4);

struct S1 s1;
struct S2 s2;
struct S3 s3;
struct S4 s4;

foo8(s1) <==> RDI = s1.a | (s1.b << 8) | (s1.c << 32), call foo8
foo9(s2) <==> XMM0 = s2.a | (s2.b << 32), XMM1 = s2.c, call foo9
foo10(s3) <==> RDI = s3.a | (s3.b << 32), XMM0 = s3.c, call foo10
foo11(s4) <==> RSP -= 24, *RSP = s4.a, *(RSP+8) = s4.b, *(RSP+16) = s4.c, call foo11
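The packed value shown for foo8 is just the structure's memory image read as one 8-byte integer. The sketch below checks that claim in portable C. It assumes a little-endian machine and the usual layout in which struct S1 is 8 bytes with c at offset 4; the helper names s1_image and s1_formula are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct S1 { char a; char b; int c; };

/* Builds the 8-byte image by copying each member to its own offset,
   leaving the padding bytes as zero. */
uint64_t s1_image(struct S1 s) {
    unsigned char bytes[8] = {0};
    uint64_t image;

    assert(sizeof(struct S1) == 8);  /* layout assumption */
    memcpy(bytes + offsetof(struct S1, a), &s.a, sizeof s.a);
    memcpy(bytes + offsetof(struct S1, b), &s.b, sizeof s.b);
    memcpy(bytes + offsetof(struct S1, c), &s.c, sizeof s.c);
    memcpy(&image, bytes, sizeof image);
    return image;
}

/* The shift-and-or expression from the listing, with explicit casts so
   that negative members do not sign-extend into neighbouring fields. */
uint64_t s1_formula(struct S1 s) {
    return (uint64_t)(unsigned char)s.a
         | ((uint64_t)(unsigned char)s.b << 8)
         | ((uint64_t)(unsigned int)s.c << 32);
}
```

On a little-endian target the two functions agree, which is what rule R10 means by splitting the structure into 8-byte units by memory layout rather than by member.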

For structure types, it is recommended to pass a pointer to the structure rather than the structure value itself.

1.4 Variable Parameters

The types and number of the parameters of a variadic function are not fixed, so the compiler handles each call according to the types of the argument values actually passed at that call. The rules are as follows:

  • R12: Based on the number and types of the arguments, they are stored from left to right in the six general-purpose parameter registers or in XMM0-XMM7. Arguments beyond what the registers can hold are pushed onto the stack in turn.

  • R13: A call to a variadic function also uses the AL register. AL is set to the number of floating-point registers used to pass arguments (strictly, an upper bound on that number): 0 if no floating-point arguments are passed, 1 if one is passed, and so on. The reason is that the implementation of a variadic function does not know how many arguments, or of what types, the caller will pass, so it saves every register that could carry an argument, general-purpose and floating-point, into an array for later processing. Checking AL lets it skip the floating-point registers when none were used, which reduces the amount of the array that has to be filled in.

    Here is an example of calling a variadic function:

void foo12(int a, ...);

foo12(10, 20, 30.0, 40) <==> RDI = 10, RSI = 20, XMM0 = 30.0, RDX = 40, AL = 1, call foo12
foo12(10, 20, 30, 40) <==> RDI = 10, RSI = 20, RDX = 30, RCX = 40, AL = 0, call foo12

An interesting example is when the printf function is called with the following arguments:

printf("%f,%d,%d", 10, 20.0, 30.0);   // The output will be: 20.0,10,??

The reason is that the arguments are passed according to their actual types, which do not match the format string. Can you explain the output using the rules for passing variadic parameters?
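A variadic callee has no type information of its own and must trust some description of what was passed, just as printf trusts its format string. The sketch below uses the standard stdarg.h macros; the function sum_mixed and its one-letter description string are invented for this example.

```c
#include <assert.h>
#include <stdarg.h>

/* kinds describes the variadic arguments, one letter each:
   'i' for int, 'd' for double. Returns their sum. */
double sum_mixed(const char *kinds, ...) {
    va_list args;
    double total = 0.0;

    va_start(args, kinds);
    for (const char *k = kinds; *k != '\0'; k++) {
        assert(*k == 'i' || *k == 'd');
        if (*k == 'd') {
            total += va_arg(args, double);  /* read as a floating-point argument */
        } else {
            total += va_arg(args, int);     /* read as an integer argument */
        }
    }
    va_end(args);
    return total;
}
```

A call such as sum_mixed("idi", 10, 20.0, 30) works because the description matches the arguments. If the description and the arguments disagree, the behavior is undefined, which is the situation in the printf puzzle above.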

2. Parameter transfer rules in ARM32-bit system

32-bit ARM does not use floating-point registers for passing parameters or returning values. A basic type larger than 4 bytes is split into two parts that are stored in two consecutive registers.

2.1 General Parameters

  • R1: For 32-bit general parameters, up to 4 are stored in R0-R3. If there are more than 4, the remaining parameters are pushed onto the stack from right to left.

  • R2: A 64-bit parameter such as long long occupies two registers, with the low 32 bits in the first register and the high 32 bits in the second.

  • R3: If the first three parameters are 32-bit and the fourth is 64-bit, the first three go in R0, R1, and R2, the low 32 bits of the fourth go in R3, and its high 32 bits are pushed onto the stack.

2.2 Floating Point Parameters

  • R4: Floating-point parameters use registers R0-R3 just like general parameters: one register for a single-precision value and two registers for a double-precision value. Whatever does not fit is pushed onto the stack.

2.3 Structure parameters

  • R5: 32-bit ARM does not distinguish the member types of a structure, only its size. The structure is split into 4-byte units following its memory layout, and the units are stored in registers or stack memory.

  • R6: A structure of size <= 4 is stored in one register, of size <= 8 in two consecutive registers, of size <= 12 in three consecutive registers, and of size <= 16 in four consecutive registers. A structure larger than 16 bytes is stored in stack memory.

  • R7: If the first three parameters are all 32-bit and the fourth is a structure larger than 4 bytes, the low 4 bytes of the structure are stored in R3 and the rest in stack memory.

2.4 Variable Parameters

  • R8: Variadic arguments are stored from left to right in the four registers R0-R3, and the arguments that do not fit are pushed onto the stack from right to left. Example code:
void foo1(int a, ...);

// A high-level function call and the equivalent machine-level pseudocode.
foo1(10, 20, 30, 40, 50) <==> R0 = 10, R1 = 20, R2 = 30, R3 = 40, SP -= 4, *SP = 50, bl foo1
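For scalar parameters, the 32-bit ARM rules reduce to filling R0-R3 one 4-byte word at a time and spilling the remaining words to the stack, even in the middle of a 64-bit value. The sketch below counts the words on each side. Arm32Split and arm32_split are hypothetical names, and the sketch covers only 4-byte and 8-byte scalars, not structures.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int reg_words;    /* 4-byte words placed in R0-R3 */
    int stack_words;  /* 4-byte words placed on the stack */
} Arm32Split;

/* sizes[i] is the size in bytes of argument i (4 or 8 in this sketch). */
Arm32Split arm32_split(const int *sizes, size_t count) {
    Arm32Split result = {0, 0};

    for (size_t i = 0; i < count; i++) {
        assert(sizes[i] == 4 || sizes[i] == 8);
        int words = sizes[i] / 4;
        for (int w = 0; w < words; w++) {
            if (result.reg_words < 4) {
                result.reg_words++;    /* next free register among R0-R3 */
            } else {
                result.stack_words++;  /* registers exhausted: use the stack */
            }
        }
    }
    return result;
}
```

Three 32-bit parameters followed by one 64-bit parameter give four register words and one stack word, which is the split described in rule R3.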

3. Parameter transfer rules in ARM64-bit system

3.1 General Parameters

General parameters are non-floating point and non-struct parameters. Here are the rules for passing general parameters:

  • R1: If the function has no arguments, nothing is done except to execute the function call. If the function has arguments, the parameter values should be set as follows before executing the function call instruction.

  • R2: If the function has 8 or fewer parameters, they are stored in the eight registers X0-X7, in the order they are declared from left to right.

  • R3: If the function has more than 8 parameters, the parameters beyond the eighth are pushed onto the stack from right to left.

  • R4: If a parameter type is smaller than 8 bytes, then within the first 8 parameters it is stored in the corresponding 32-bit, 16-bit, or 8-bit form of the register.

Here are some examples of functions:

void foo1(long, long);
void foo2(long, long, long, long, long, long, long, long);
void foo3(long, long, long, long, long, long, long, long, long, int, short);

// A high-level function call and the equivalent machine-level pseudocode.
foo1(a, b) <==> X0 = a, X1 = b, bl foo1
foo2(a, b, c, d, e, f, g, h) <==> X0 = a, X1 = b, X2 = c, X3 = d, X4 = e, X5 = f, X6 = g, X7 = h, bl foo2
foo3(a, b, c, d, e, f, g, h, i, j, k) <==> X0 = a, X1 = b, X2 = c, X3 = d, X4 = e, X5 = f, X6 = g, X7 = h, SP -= 2, *SP = k, SP -= 4, *SP = j, SP -= 8, *SP = i, bl foo3

3.2 Floating Point Parameters

If a function has floating-point parameters (single or double precision), they are stored not in general-purpose registers but in dedicated floating-point registers. The system provides 32 128-bit floating-point registers, Q0-Q31 (also named V0-V31). The low 64 bits of each are named D0-D31, the low 32 bits S0-S31, the low 16 bits H0-H31, and the low 8 bits B0-B31. Single-precision values are therefore stored in the S registers and double-precision values in the D registers. On ARM systems a long double is 8 bytes long and can therefore be treated as a double.

Here are the rules for passing:

  • R5: If the function has 8 or fewer floating-point parameters, they are stored in the eight registers D0-D7 or S0-S7 in left-to-right order.

  • R6: If the function has more than 8 floating-point parameters, the parameters beyond the eighth are pushed onto the stack from right to left.

  • R7: If a function has both floating-point and general parameters, the two kinds are assigned to their registers independently, and neither affects the order or rules of the other.

Here are some examples of functions:

void foo4(double, double);
void foo5(double, float, float, double, double, double, double, double, double, double);
void foo6(long, double, long, double, long, long, double);

// A high-level function call and the equivalent machine-level pseudocode.
foo4(double a, double b) <==> D0 = a, D1 = b, bl foo4
foo5(double a, float b, float c, double d, double e, double f, double g, double h, double i, double j) <==> D0 = a, S1 = b, S2 = c, D3 = d, D4 = e, D5 = f, D6 = g, D7 = h, SP -= 8, *SP = j, SP -= 8, *SP = i, bl foo5
foo6(long a, double b, long c, double d, long e, long f, double g) <==> X0 = a, D0 = b, X1 = c, D1 = d, X2 = e, X3 = f, D2 = g, bl foo6
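The 64-bit ARM rules for scalar parameters follow the same two-counter pattern, with eight registers per kind and the register name chosen by the width of the value. The sketch below is an illustration only: arm64_arg_location and its letter encoding ('i' integer or pointer, 'f' float, 'd' double) are assumptions of this example, structures are not covered, and stack arguments are only numbered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the location of argument `index` into out,
   e.g. "X3", "S1", "D7", or "stack[0]". */
void arm64_arg_location(const char *kinds, size_t index,
                        char *out, size_t out_size) {
    size_t ints = 0, floats = 0, stack_args = 0;

    assert(index < strlen(kinds));
    for (size_t i = 0; i <= index; i++) {
        char kind = kinds[i];
        assert(kind == 'i' || kind == 'f' || kind == 'd');
        /* float and double share one counter; integers use another */
        size_t *used = (kind == 'i') ? &ints : &floats;
        int in_register = (*used < 8);

        if (i == index) {
            if (!in_register) {
                snprintf(out, out_size, "stack[%zu]", stack_args);
            } else {
                char prefix = (kind == 'i') ? 'X' : (kind == 'f') ? 'S' : 'D';
                snprintf(out, out_size, "%c%zu", prefix, *used);
            }
            return;
        }
        if (in_register) {
            (*used)++;
        } else {
            stack_args++;
        }
    }
}
```

Encoding foo5 as "dffddddddd" puts the second argument in S1 and the ninth on the stack, in line with the listing above.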

3.3 Structural parameters

For structure parameters, the size of the structure and the types and number of its members all matter. By size, structures are 8 bytes or less, 16 bytes or less, or more than 16 bytes. By member type, they are all non-floating-point, all floating-point of one precision (single and double are distinguished here), or mixed (a structure that contains both single- and double-precision members also counts as mixed). Here are the rules for structure parameters:

  • R8: If all members are non-floating-point, the structure is stored in one of the registers X0-X7 if its size is <= 8 and in two consecutive registers among X0-X7 if its size is <= 16. If its size is > 16, the structure is no longer passed by value; a pointer to it is passed in one of the registers X0-X7.

  • R9: If all members are single-precision floating-point and there are 4 or fewer of them, each member is stored in its own register, using consecutive registers among S0-S7. If there are more than 4 members, the structure is no longer passed by value; a pointer to it is passed in one of the registers X0-X7.

  • R10: If all members are double-precision floating-point and there are 4 or fewer of them, each member is stored in its own register, using consecutive registers among D0-D7. If there are more than 4 members, the structure is no longer passed by value; a pointer to it is passed in one of the registers X0-X7.

  • R11: If the members are of mixed types, the structure is stored in one of the registers X0-X7 if its size is <= 8 and in two consecutive registers among X0-X7 if its size is <= 16. If its size is > 16, the structure is no longer passed by value; a pointer to it is passed in one of the registers X0-X7.

  • R12: Because a structure parameter occupies parameter registers, it affects where the non-structure parameters described above are placed. To some extent, a structure can be thought of as several parameters passed together.

Here is the demo code:

struct S1 { char a; char b; int c; };
struct S2 { float a; float b; float c; };
struct S3 { int a; int b; double c; };
struct S4 { long a; long b; double c; };

void foo8(struct S1);
void foo9(struct S2);
void foo10(struct S3);
void foo11(struct S4);

struct S1 s1;
struct S2 s2;
struct S3 s3;
struct S4 s4;

foo8(s1) <==> X0 = s1.a | (s1.b << 8) | (s1.c << 32), bl foo8
foo9(s2) <==> S0 = s2.a, S1 = s2.b, S2 = s2.c, bl foo9
foo10(s3) <==> X0 = s3.a | (s3.b << 32), X1 = s3.c, bl foo10
foo11(s4) <==> X0 = &s4, bl foo11
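The structure rules for 64-bit ARM can be condensed into one decision function. This is a sketch of the rules as this article states them, not a full implementation of the platform ABI. StructPassing and arm64_struct_passing are hypothetical names, and the caller is expected to say how many members the structure has when they are all float or all double (0 otherwise).

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    PASS_FP_REGS,     /* each member in its own S or D register */
    PASS_ONE_GPR,     /* whole structure in one general-purpose register */
    PASS_TWO_GPRS,    /* two consecutive general-purpose registers */
    PASS_BY_POINTER   /* a pointer to the structure is passed */
} StructPassing;

/* size: sizeof the structure in bytes.
   uniform_fp_members: the member count if every member is float, or
   every member is double; 0 for any other structure. */
StructPassing arm64_struct_passing(size_t size, unsigned uniform_fp_members) {
    assert(size > 0);
    if (uniform_fp_members >= 1 && uniform_fp_members <= 4) {
        return PASS_FP_REGS;
    }
    if (size <= 8) {
        return PASS_ONE_GPR;
    }
    if (size <= 16) {
        return PASS_TWO_GPRS;
    }
    return PASS_BY_POINTER;
}
```

Applied to the four structures in the listing, S1 (8 bytes) uses one general-purpose register, S2 (three floats) uses floating-point registers, S3 (16 bytes, mixed) uses two general-purpose registers, and S4 (24 bytes) is passed by pointer.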

3.4 Variable Parameters

The types and number of the parameters of a variadic function are not fixed, so the compiler handles each call according to the types of the argument values actually passed at that call. The rules are as follows:

  • R13: The call is made according to the number and types of the arguments passed. The parameters with declared types are passed by the rules described above, while the variadic arguments are pushed onto the stack from right to left.

Here is the sample code:

void foo7(int a, ...);

foo7(10, 20, 30.0, 40) <==> X0 = 10, SP -= 8, *SP = 40, SP -= 8, *SP = 30.0, SP -= 8, *SP = 20, bl foo7

An interesting example is when the printf function is called as follows:

printf("%f,%d,%d", 10, 20.0, 30.0);   // The output will be: ?,?,?

Because ARM passes variadic arguments differently from x86, the results on a real device and on the simulator are inconsistent. The parameter passing rules even differ between 32-bit and 64-bit ARM. Can you tell why the output is unpredictable when the arguments passed above do not match the format string?

The return value of the function

In addition to passing parameters, function calls also return values. Parameters travel from the caller to the called function, while the return value travels from the called function back to the caller, so the two sides need a common rule here as well. The called function must set up the return value before it executes its return instruction, and the caller should process the returned result as early as possible, ideally in the instruction right after the call instruction. There are four kinds of return type: none (void), non-floating-point, floating-point, and structure. The system applies different rules to each of them.

1. Function return value rule in x86_64

1.1 General Type Return

  • R1: Always save the return value to the RAX register if the function has a return value.

1.2 Floating point type return

  • R2: The returned floating-point type is saved in the XMM0 register.

  • R3: A returned long double (extended double precision) is stored at the top of the floating-point register stack. The FPU provides eight independent 80-bit registers, STMM0-STMM7, organized as a stack and collectively known as the floating-point register stack. The instruction set provides dedicated instructions for pushing values onto and popping values off this stack. In floating-point instructions these registers are also written as ST(i), where i is the index of the register. Note that the XMM registers and the STMM registers are completely different sets of registers.

1.3 Structure type return

For structure-type returns, both the size of the structure and the data types of its members matter. The size falls into one of three ranges: at most 8 bytes, at most 16 bytes, or more than 16 bytes. The member types fall into one of three groups: all non-floating-point, all floating-point (excluding long double), or mixed. That gives nine cases, and the following table describes the rule for each:

  • R4

| Type / Size (bytes) | <= 8 | <= 16 | > 16 |
|---|---|---|---|
| All non-floating-point data members | RAX | RAX, RDX | The returned structure is saved to the memory pointed to by the RDI register. Because RDI holds the structure address, the first parameter in the source code is passed in RSI instead of RDI. |
| All floating-point data members | XMM0 | XMM0, XMM1 | Same as above |
| Mixed types | Stored first in RAX or XMM0, then in RDX or XMM1. Special case: if any member is a long double, the return value is always treated as > 16 bytes. | Same as left | Same as above |

Here is the demo code:

    struct S1 { char a; char b; int c; };
    struct S2 { int a; int b; double c; };
    struct S3 { long a; long b; double c; };

    struct S1 foo1();
    struct S2 foo2();
    struct S3 foo3(int);

    struct S1 s1 = foo1()  <==> call foo1, s1 = RAX
    struct S2 s2 = foo2()  <==> call foo2, s2.a and s2.b = RAX, s2.c = XMM0
    struct S3 s3 = foo3(a) <==> RDI = &s3, RSI = a, call foo3
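The table for R4 can also be read as a small decision function. The sketch below only restates the table; the enum names and x86_64_struct_return are invented for this example, and real compilers classify each 8-byte part of a structure individually, which this simplification hides behind the mixed case.

```c
#include <assert.h>
#include <stddef.h>

enum x64_kind { X64_ALL_INTEGER, X64_ALL_FLOAT, X64_MIXED };
enum x64_loc {
    X64_RAX, X64_RAX_RDX,        /* all non-floating-point members */
    X64_XMM0, X64_XMM0_XMM1,     /* all floating-point members */
    X64_MIXED_REGS,              /* RAX or XMM0 first, then RDX or XMM1 */
    X64_MEMORY_VIA_RDI           /* caller passes the address in RDI */
};

/* Hypothetical helper mirroring table R4. has_long_double marks a structure
   with a long double member, which the table treats as > 16 bytes. */
static enum x64_loc x86_64_struct_return(enum x64_kind kind, size_t size,
                                         int has_long_double)
{
    assert(size > 0);
    if (size > 16 || has_long_double)
        return X64_MEMORY_VIA_RDI;
    switch (kind) {
    case X64_ALL_INTEGER: return size <= 8 ? X64_RAX : X64_RAX_RDX;
    case X64_ALL_FLOAT:   return size <= 8 ? X64_XMM0 : X64_XMM0_XMM1;
    default:              return X64_MIXED_REGS;
    }
}
```

With the structures from the demo, S1 (8 bytes, non-floating-point) maps to RAX, S2 (16 bytes, mixed) maps to the mixed register case, and S3 (24 bytes) maps to memory addressed by RDI.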

2. Function return value rule in ARM32-bit system

2.1 General Type Return

  • R1: A return value of <= 4 bytes is saved in register R0. A return value of <= 8 bytes (such as long long) is saved in R0 and R1, with the lower 32 bits in R0 and the upper 32 bits in R1.

2.2 Floating point Type Return

  • R2: A single-precision floating-point number is returned in register R0, and a double-precision floating-point number is returned in R0 and R1, where R0 holds the lower 32 bits and R1 holds the upper 32 bits. A long double is returned in the same way as a double.
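The split of a 64-bit value across R0 and R1 is plain bit arithmetic, and it can be illustrated without an ARM32 device. In the sketch below, struct reg_pair, split_u64 and split_double are names made up for this example, and the double case assumes the IEEE 754 64-bit format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the R0/R1 register pair. */
struct reg_pair { uint32_t r0; uint32_t r1; };

/* Lower 32 bits go to R0, upper 32 bits go to R1. */
static struct reg_pair split_u64(uint64_t v)
{
    struct reg_pair p = { (uint32_t)(v & 0xFFFFFFFFu), (uint32_t)(v >> 32) };
    return p;
}

/* A double is split the same way, using its raw bit pattern. */
static struct reg_pair split_double(double d)
{
    uint64_t bits;

    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);     /* reinterpret without conversion */
    return split_u64(bits);
}
```

For example, the double 1.0 has the bit pattern 0x3FF0000000000000, so under these rules R0 would hold 0 and R1 would hold 0x3FF00000.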

2.3 Structure Type Return

  • R3: A structure of any type is always returned through the memory that the R0 register points to, so R0 holds a pointer. The first parameter of the function is then passed in R1, and the remaining parameters shift back in order. In other words, if a function returns a structure, the system passes a pointer to the returned value as the first parameter and treats the actual first parameter as the second.

The following code illustrates the situation:

    struct XXX {
        // arbitrary contents
    };

    // A function that returns a structure:
    struct XXX foo(int a) { /* ... */ }

    // is effectively converted into:
    void foo(struct XXX *pret, int a) { /* ... */ }

On ARM32 systems, for any function that returns a structure, the pointer to the returned structure is passed in R0 as the first argument of the call, and the first argument in the source code is passed in R1.
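The conversion can be written out by hand in C to show that both forms carry the same information. This is a sketch of the idea, not compiler output: the contents of struct XXX and the name foo_lowered are chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct XXX { int values[8]; };          /* hypothetical contents */

/* Source-level form: returns the structure by value. */
static struct XXX foo(int a)
{
    struct XXX r;
    for (int i = 0; i < 8; ++i)
        r.values[i] = a + i;
    return r;
}

/* Hand-written equivalent of what the caller and callee agree on:
   the caller supplies the storage, and the callee fills it in. */
static void foo_lowered(struct XXX *pret, int a)
{
    assert(pret != NULL);
    for (int i = 0; i < 8; ++i)
        pret->values[i] = a + i;
}
```

A call such as struct XXX x = foo(5); then corresponds to struct XXX x; foo_lowered(&x, 5);, with &x travelling in R0 and 5 in R1.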

3. Function return value rule in ARM64-bit system

3.1 General Type Return

  • R1: The return value of the function is saved to register X0.

3.2 Floating point Type Return

  • R2: Single-precision floating-point returns are saved to S0 and double-precision floating-point returns to D0

3.3 Structure Type Return

For structure-type returns, you need to consider the data types of the members and the overall size of the structure. The size falls into one of three ranges: at most 8 bytes, at most 16 bytes, or more than 16 bytes. The member types fall into one of three groups: all non-floating-point, all floating-point (single and double precision are distinguished here), or mixed (a structure that contains both single and double precision members also counts as mixed). Here are the rules for returning structure types:

  • R3: For a structure whose members are all non-floating-point, the value is returned in X0 if the size is <= 8 bytes and in X0 and X1 if the size is <= 16 bytes. If the size is > 16 bytes, the structure is saved to the memory pointed to by the X8 register, which is dedicated to holding the pointer to a returned structure.

  • R4: If all members of the structure are single-precision and there are at most 4 of them, the members are returned in S0, S1, S2 and S3 respectively. If there are more than 4 members, the structure is saved to the memory pointed to by the X8 register.

  • R5: If all members of the structure are double-precision and there are at most 4 of them, the members are returned in D0, D1, D2 and D3 respectively. If there are more than 4 members, the structure is saved to the memory pointed to by the X8 register.

  • R6: For a structure with mixed member types, the value is returned in X0 if the size is <= 8 bytes and in X0 and X1 if the size is <= 16 bytes. If the size is > 16 bytes, the structure is saved to the memory pointed to by the X8 register.

Here are some struct definitions and functions that return structs:

    struct S1 { char a; char b; double c; };
    struct S2 { int a; int b; int c; double d; };
    struct S3 { int a; int b; };

    CGRect foo1()
    {
        // The return in the high-level language
        return CGRectMake(10, 20, 30, 40);
        // Pseudocode of the machine instructions for the return:
        /*
           D0 = 10
           D1 = 20
           D2 = 30
           D3 = 40
           ret
        */
    }

    struct S1 foo2()
    {
        // The return in the high-level language
        return (struct S1){10, 20, 30};
        // Pseudocode of the machine instructions for the return:
        /*
           X0 = 10 | 20 << 8
           X1 = 30
           ret
        */
    }

    struct S2 foo3()
    {
        // The return in the high-level language
        return (struct S2){10, 20, 30, 40};
        // Pseudocode of the machine instructions for the return:
        /*
           struct S2 *p = X8
           p->a = 10
           p->b = 20
           p->c = 30
           p->d = 40
           ret
        */
    }

    struct S3 foo4()
    {
        // The return in the high-level language
        return (struct S3){20, 30};
        // Pseudocode of the machine instructions for the return:
        /*
           X0 = 20 | 30 << 32
           ret
        */
    }
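Rules R3 to R6 for ARM64 can be condensed into one decision function. The sketch below restates only what those rules say; the enum names and arm64_struct_return are invented for this example, and the caller is assumed to know the member group, the size in bytes and the member count of the structure.

```c
#include <assert.h>
#include <stddef.h>

enum a64_kind { A64_INTEGER, A64_ALL_FLOAT, A64_ALL_DOUBLE, A64_MIXED };
enum a64_loc  { A64_X0, A64_X0_X1, A64_S_REGS, A64_D_REGS, A64_MEMORY_VIA_X8 };

/* Hypothetical helper mirroring rules R3 to R6. */
static enum a64_loc arm64_struct_return(enum a64_kind kind, size_t size,
                                        size_t members)
{
    assert(size > 0 && members > 0);
    if (kind == A64_ALL_FLOAT)          /* R4: up to four floats in S0-S3 */
        return members <= 4 ? A64_S_REGS : A64_MEMORY_VIA_X8;
    if (kind == A64_ALL_DOUBLE)         /* R5: up to four doubles in D0-D3 */
        return members <= 4 ? A64_D_REGS : A64_MEMORY_VIA_X8;
    if (size <= 8)                      /* R3 and R6 depend on size only */
        return A64_X0;
    if (size <= 16)
        return A64_X0_X1;
    return A64_MEMORY_VIA_X8;
}
```

Applied to the demo functions, CGRect (four doubles) maps to D0-D3, S1 (16 bytes, mixed) to X0 and X1, S2 (24 bytes, mixed) to memory addressed by X8, and S3 (8 bytes) to X0.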

As you can see from the code above, on x86_64 and ARM32, if the returned type is a structure that meets certain conditions, the system passes a pointer to the structure as the first parameter of the function and shifts the registers used for the source-level parameters back by one. On ARM64, the dedicated X8 register handles structure returns that do not fit in registers.

The objc_msgSend series of functions

All OC methods are eventually called through the objc_msgSend family of functions. The family consists of the following functions:

objc_msgSend(void /* id self, SEL op, ... */ )
objc_msgSend_stret(void /* id self, SEL op, ... */ )
objc_msgSend_fpret(void /* id self, SEL op, ... */ )
objc_msgSend_fp2ret(void /* id self, SEL op, ... */ )

The main difference in this series of functions is the use of different message sending functions for different return types.

As the return value rules above show, x86_64 returns a long double on the special floating-point register stack. The objc_msgSend_fpret function is therefore used only on x86_64, for dispatching messages to OC methods whose return type is long double, and is not used on any other architecture. In addition, because C99 introduced the _Complex keyword, the objc_msgSend_fp2ret function is used for methods that return a long double _Complex.

If a returned structure is larger than a certain threshold, both x86_64 and ARM32 pass a pointer to it as the first parameter, which shifts the registers of the actual parameters back by one. ARM64 only uses the X8 register to hold the pointer to such a structure, so the order of parameter passing is not affected. The objc_msgSend_stret function is therefore used to dispatch messages to OC methods that return a structure larger than the threshold on every architecture except ARM64.
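The selection described in the two paragraphs above can be summarized as a lookup on the architecture and the return type. The sketch below is only a restatement of that text: the enums and pick_dispatcher are made up for this example, and RET_LARGE_STRUCT stands for a structure above the threshold of the architecture in question.

```c
#include <assert.h>

enum arch { ARCH_X86_64, ARCH_ARM32, ARCH_ARM64 };
enum ret_kind {
    RET_OTHER,                  /* void, integers, pointers, float, double, small structs */
    RET_LONG_DOUBLE,
    RET_LONG_DOUBLE_COMPLEX,
    RET_LARGE_STRUCT            /* structure above the architecture's threshold */
};
enum dispatcher { MSGSEND, MSGSEND_STRET, MSGSEND_FPRET, MSGSEND_FP2RET };

/* Hypothetical helper: which member of the objc_msgSend family the text
   above associates with a given architecture and return type. */
static enum dispatcher pick_dispatcher(enum arch a, enum ret_kind r)
{
    assert(a == ARCH_X86_64 || a == ARCH_ARM32 || a == ARCH_ARM64);
    if (r == RET_LARGE_STRUCT)
        return a == ARCH_ARM64 ? MSGSEND : MSGSEND_STRET;
    if (a == ARCH_X86_64 && r == RET_LONG_DOUBLE)
        return MSGSEND_FPRET;
    if (a == ARCH_X86_64 && r == RET_LONG_DOUBLE_COMPLEX)
        return MSGSEND_FP2RET;
    return MSGSEND;
}
```

In real projects the compiler makes this choice when it translates a message send, so application code normally never has to pick a variant by hand.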

The same function return rules as above apply to other functions in <objc/message.h>.


These are the rules for calling functions, passing arguments, and returning values. They apply to OC class methods as well as to ordinary functions. There are also rules for how a function is implemented internally. Those rules explain how a function works together with stack memory, how the function call stack is constructed, why some function calls do not appear in the call stack, and how variable-parameter functions are implemented internally. These details are discussed in depth in "Deep into the bottom of the iOS system functions (ii): implementation".



