Preface

This article analyzes the essence of functions from the assembly point of view. Along the way, it resolves the infinite-loop problem left open at the end of the previous article.

1. Basic knowledge

Following on from the previous article, 01- assembly foundation (1), this section introduces several common basics.

1.1 The stack

Before we talk about functions, let’s look at the stack, because the working area of a function’s code corresponds to a region of the stack in memory. A stack is a storage space with a special access pattern: Last In, First Out (LIFO). As we all know, there are only two stack operations 👇

  • Pushing onto the stack 👉 push
  • Popping off the stack 👉 pop

1.1.1 Stack structure

The stack structure is like a tube with only one port, as shown below 👇

  • We start with two pointers 👉 one to the top and one to the bottom of the stack; on an empty stack both point to the bottom
  • Opening up space
    • In the past: pushing a variable (push) made memory open up space for it
    • Now: the space is opened up first, and then the variable is stored in it

So 👉 how does the system know how much space to open up?

The compiler decides: once your code is compiled, the compiler knows how much stack space each function needs.

1.1.2 SP and FP registers

The SP and FP registers can be used to check the stack size, because 👇

  • The sp register always holds the address of the top of the stack
  • The fp register, also known as the x29 register, is a general-purpose register, but at certain times we use it to hold the address of the bottom of the stack!
    • fp is not required when a function makes no nested calls; it acts as a dividing line between frames

⚠️ Starting with ARM64, the 32-bit LDM, STM, PUSH and POP instructions are gone! LDR / LDP and STR / STP are used instead. Stack operations in ARM64 are 16-byte aligned!!

In ARM64, the stack space is opened up first: sp moves to the new top of the stack, and only then are contents stored in it. As mentioned above, the size is determined at compile time, so there is no push operation. Also, in iOS the stack grows from high addresses toward low addresses.
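To make "open the space first, on a stack that grows toward lower addresses" concrete, here is a minimal C sketch that models sp as a plain integer. The helper names (align16, reserve_frame, release_frame) are hypothetical and only for illustration; in real code the compiler works out the frame size.

```c
#include <stdint.h>

/* Round a byte count up to the next multiple of 16 (ARM64 stack alignment). */
static uint64_t align16(uint64_t bytes) {
    return (bytes + 15u) & ~(uint64_t)15u;
}

/* Model of "sub sp, sp, #size": the stack grows toward lower addresses,
   so reserving a frame subtracts the aligned size from sp. */
static uint64_t reserve_frame(uint64_t sp, uint64_t bytes_needed) {
    return sp - align16(bytes_needed);
}

/* Model of "add sp, sp, #size": releasing the frame restores sp. */
static uint64_t release_frame(uint64_t sp, uint64_t bytes_needed) {
    return sp + align16(bytes_needed);
}
```

For example, a function that needs 40 bytes of locals would reserve 48 bytes, so sp stays a multiple of 16 for the whole call.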

1.2 Function call stack

As we all know, a function runs on the stack, and its stack space is released automatically when it finishes. In assembly, the usual code for opening up and restoring a function’s call stack is 👇

    sub sp, sp, #0x40          ; stretch 0x40 (64) bytes of stack space
    stp x29, x30, [sp, #0x30]  ; save x29 (fp) and x30 (lr) on the stack
    add x29, sp, #0x30         ; x29 points to the bottom of the stack frame
    ...
    ldp x29, x30, [sp, #0x30]  ; restore x29 and x30
    add sp, sp, #0x40          ; stack balance
    ret

The assembly code above executes as follows 👇

  1. The sub instruction subtracts from sp to stretch the stack space. sp now points to the lower address (the top of the stack), and x29 (fp) points to the bottom of the stack frame, i.e. the higher address
  2. Before ret, that is, before the function returns, an add instruction restores sp to where it pointed originally. This is called stack balancing
  3. After sp is restored, the data in the released space is not destroyed. The next time the stack is stretched, always write before you read; if you read first, you read garbage data.
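The third point, that released stack data is left in place until something overwrites it, can be sketched with a toy stack in C. The ToyStack type and its functions are hypothetical stand-ins for real memory and registers, not anything the system provides.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256

/* A toy stack: memory plus an sp offset that starts at the high end. */
typedef struct {
    uint8_t mem[STACK_BYTES];
    size_t  sp;             /* offset of the current top of the stack */
} ToyStack;

static void toy_init(ToyStack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->sp = STACK_BYTES;    /* empty stack: sp at the highest address */
}

/* Model of a function that stretches 0x40 bytes, writes a local, and returns. */
static void toy_call(ToyStack *s, uint64_t local_value) {
    s->sp -= 0x40;                                   /* sub sp, sp, #0x40 */
    memcpy(&s->mem[s->sp + 0x8], &local_value, 8);   /* store a local at sp + 0x8 */
    s->sp += 0x40;                                   /* add sp, sp, #0x40 (stack balance) */
}

/* Read 8 bytes from the frame that was just released (below the current sp). */
static uint64_t toy_peek_released(const ToyStack *s, size_t offset_in_frame) {
    uint64_t v;
    memcpy(&v, &s->mem[s->sp - 0x40 + offset_in_frame], 8);
    return v;
}
```

After toy_call returns, sp is back where it started, yet toy_peek_released still finds the old local. A second call simply overwrites it, which is why a function must write its own frame before reading from it.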

2. Memory read and write instructions

⚠️ Note: data is read and written toward higher addresses.

There are two main read/write instructions 👇

  1. STR (store register) 👉 takes the data in a register and stores it in memory.
  2. LDR (load register) 👉 reads data from memory and puts it in a register.

STR and LDR are the dedicated instructions for moving data between memory and registers.

Two other common instructions, STP and LDP, do the same for two registers at once.
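As a rough mental model, the four instructions can be written as small C helpers over a byte array. The function names (str64, ldr64, stp64, ldp64) are made up for this sketch, and base is an offset into the array standing in for a real address.

```c
#include <stdint.h>
#include <string.h>

/* str xN, [base, #offset]: copy 8 bytes from a register into memory. */
static void str64(uint8_t *mem, uint64_t base, int64_t offset, uint64_t reg) {
    memcpy(&mem[base + offset], &reg, sizeof reg);
}

/* ldr xN, [base, #offset]: copy 8 bytes from memory into a register. */
static uint64_t ldr64(const uint8_t *mem, uint64_t base, int64_t offset) {
    uint64_t reg;
    memcpy(&reg, &mem[base + offset], sizeof reg);
    return reg;
}

/* stp xA, xB, [base, #offset]: store two registers in consecutive 8-byte slots. */
static void stp64(uint8_t *mem, uint64_t base, int64_t offset,
                  uint64_t reg_a, uint64_t reg_b) {
    str64(mem, base, offset, reg_a);        /* first register at the lower address */
    str64(mem, base, offset + 8, reg_b);    /* second register 8 bytes higher */
}

/* ldp xA, xB, [base, #offset]: load two registers from consecutive slots. */
static void ldp64(const uint8_t *mem, uint64_t base, int64_t offset,
                  uint64_t *reg_a, uint64_t *reg_b) {
    *reg_a = ldr64(mem, base, offset);
    *reg_b = ldr64(mem, base, offset + 8);
}
```

Note that the pair versions fill memory toward the higher address, which matches the note at the top of this section.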

Practice

Use the stack to swap the values of x0 and x1. The code is 👇

.text
.global _C

_C:
    sub sp, sp, #0x20        ; stretch stack space
    stp x0, x1, [sp, #0x10]  ; store x0 and x1 at sp + 0x10
    ldp x1, x0, [sp, #0x10]  ; load them back in the opposite order
    add sp, sp, #0x20        ; restore stack space
    ret

In the code above, the first and second-to-last instructions are the usual stretching and restoring of stack space. Focus on the middle two lines (read the operands from right to left) 👇

  • stp x0, x1, [sp, #0x10]
    • [sp, #0x10] 👉 the [] means addressing: the address is sp plus 0x10. ⚠️ Note that the address sp itself points to stays constant
    • stp x0, x1 👉 as discussed above, this operates on two registers: the values of x0 and x1 are stored in memory
  • ldp x1, x0, [sp, #0x10] 👉 reads the two values at that address back from memory into x1 and x0, in that order

At this point, the code has swapped the values in registers x0 and x1. Swapping two values a and b normally needs a third, temporary variable; here the memory acts as temp, as shown below 👇
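The same swap can be written in C, with a small local array standing in for the 32 bytes of stack. This is only a sketch of the idea: the Regs type and swap_via_stack are hypothetical names, and each line is annotated with the instruction it imitates.

```c
#include <stdint.h>

/* Toy register file holding just the two registers we care about. */
typedef struct {
    uint64_t x0, x1;
} Regs;

static void swap_via_stack(Regs *r) {
    uint64_t frame[4];      /* sub sp, sp, #0x20 : 32 bytes of stack */

    frame[2] = r->x0;       /* stp x0, x1, [sp, #0x10] : x0 -> [sp + 0x10] */
    frame[3] = r->x1;       /*                           x1 -> [sp + 0x18] */

    r->x1 = frame[2];       /* ldp x1, x0, [sp, #0x10] : [sp + 0x10] -> x1 */
    r->x0 = frame[3];       /*                           [sp + 0x18] -> x0 */
}                           /* add sp, sp, #0x20 ; ret */
```

The memory slots play the role of the temp variable: both values are parked there, then read back into the opposite registers.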

Debugging the example

⚠️ Note, however, that the values in memory are unchanged and the address sp points to is unchanged; only the values in the x0 and x1 registers change. Let’s debug it to confirm.

At the breakpoint at 0x104631C8c, assign 0xA to x0 and 0xB to x1, read the address in sp, and then step 👇

As shown in the figure above, x0 and x1 have been swapped, but reading sp again shows no change: it is still 0x000000016b7d1190. Then step to 👇

Once sp is restored, the stack space has been released, yet 0xA and 0xB are still in memory; nothing wiped them. If you think about it, this is not a problem: every time sub stretches the stack space, the code writes through STR or STP before reading, covering whatever was there. We can confirm this with View Memory 👇

In the figure above, we enter 0x000000016b7d1190 to inspect that address. As expected, 0xA and 0xB are still in memory.

2.1 BL and RET instructions

Next, let’s look at the BL and RET instructions.

bl label

  • Place the address of the next instruction into the lr (x30) register
  • Go to the label and execute the instructions there

B means jump, and L means putting the address of the next instruction into the lr (x30) register. Going back to the example above, check the value of the lr register after jumping into the C function, as shown below 👇

lr is equivalent to a saved way home.

ret

ret uses the value of the lr (x30) register by default, and the underlying hardware hints to the CPU that this is the address of the next instruction!

ret only looks at lr.

⚠️ Note: this is a feature of the ARM64 platform, which is optimized for it in hardware.
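The behaviour of bl and ret described above fits in a few lines of C. This is a toy model, not how the CPU is implemented: the Cpu struct and the exec_bl / exec_ret names are hypothetical, and only pc and lr are modelled.

```c
#include <stdint.h>

typedef struct {
    uint64_t pc;   /* address of the instruction being executed */
    uint64_t lr;   /* x30: link register, the "way home" */
} Cpu;

/* bl label: lr = address of the next instruction, then jump to label. */
static void exec_bl(Cpu *cpu, uint64_t label) {
    cpu->lr = cpu->pc + 4;   /* every ARM64 instruction is 4 bytes long */
    cpu->pc = label;
}

/* ret: jump to whatever address lr currently holds. */
static void exec_ret(Cpu *cpu) {
    cpu->pc = cpu->lr;
}
```

Notice that exec_bl overwrites lr without saving the old value anywhere, and exec_ret trusts lr blindly.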

2.2 X30 register

The x30 register, also known as the lr register, holds the return address of the function. When ret executes, it looks up the address stored in x30!

Case demonstration

Let’s look at a simple case: the C function uses bl to jump to the D function (C calls D) 👇

.text
.global _C, _D

_C:
    mov x0,#0xaaaa
    bl _D
    mov x0,#0xaaaa
    ret

_D:
    mov x0,#0xbbbb
    ret

The code to call is 👇

int C();
int D();
- (void)viewDidLoad {
    [super viewDidLoad];
    printf("C");
    C();
    printf("D");
}

Set a breakpoint on the C(); line, run, and view the assembly 👇

The current lr refers to the address of the bl C() instruction; then step into C() 👇

Then jump to D() 👇

At this point, lr has changed again, to 0x00000001021e5C78. Then continue back into C() 👇

lr is the same as it was in D(). If you keep stepping, you’ll notice that execution keeps jumping between 0x1021e5C78 and 0x1021e5C7c and never gets back to viewDidLoad, so it’s stuck in an infinite loop.

This is the infinite-loop problem we ran into at the end of the previous article, 01- assembly foundation (1). Let’s analyze it now:

Now that we know what bl does, namely overwrite lr with a new return address (the way home), we need to preserve the address that leads back to viewDidLoad, and we need to save it before the bl, because 👉 lr changes as soon as bl executes.

So we know when to save: before bl. But where?

If we save it in another register, there is no guarantee that the system won’t overwrite that register, so we have to save it in an area private to the function. Where is that? Obviously, the function’s own stack area.

At this point, we know that lr must be stored in the function’s own stack area before bl.
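To see both the infinite loop and the fix in one place, here is a small C interpreter for a toy version of the caller, _C and _D. Everything in it is an assumption made for illustration: instructions are numbered 0, 1, 2... instead of having real addresses, mov is treated as a no-op, and the function name returns_to_caller is hypothetical.

```c
#include <stdbool.h>

enum Op { NOP, BL, RET, PUSH_LR, POP_LR, HALT };
typedef struct { enum Op op; int target; } Insn;

/* Returns true if execution gets back to the caller within max_steps. */
static bool returns_to_caller(bool save_lr, int max_steps) {
    const Insn prog[] = {
        /* 0 */ { BL, 2 },                       /* caller: bl _C               */
        /* 1 */ { HALT, 0 },                     /* caller: next instruction    */
        /* 2 */ { save_lr ? PUSH_LR : NOP, 0 },  /* _C: save lr on the stack    */
        /* 3 */ { NOP, 0 },                      /* _C: mov x0, #0xaaaa         */
        /* 4 */ { BL, 8 },                       /* _C: bl _D                   */
        /* 5 */ { NOP, 0 },                      /* _C: mov x0, #0xaaaa         */
        /* 6 */ { save_lr ? POP_LR : NOP, 0 },   /* _C: restore lr from stack   */
        /* 7 */ { RET, 0 },                      /* _C: ret                     */
        /* 8 */ { NOP, 0 },                      /* _D: mov x0, #0xbbbb         */
        /* 9 */ { RET, 0 },                      /* _D: ret                     */
    };
    int pc = 0, lr = -1;
    int stack[8], sp = 8;                        /* stack grows downward */

    for (int step = 0; step < max_steps; step++) {
        Insn in = prog[pc];
        switch (in.op) {
        case BL:      lr = pc + 1; pc = in.target; break;  /* overwrite lr, jump */
        case RET:     pc = lr;                     break;  /* jump to lr         */
        case PUSH_LR: stack[--sp] = lr; pc++;      break;
        case POP_LR:  lr = stack[sp++]; pc++;      break;
        case HALT:    return true;                         /* back in the caller */
        case NOP:     pc++;                        break;
        }
    }
    return false;   /* never got back: stuck looping inside _C */
}
```

With save_lr set to false, the ret in _C keeps jumping to the instruction after bl _D, because that is the last value bl wrote into lr. With save_lr set to true, lr is restored before ret and execution returns to the caller.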

Next, how do we write the assembly for the save? Since we don’t know yet, let’s write it in OC (Objective-C) first and see what the compiler generates.

void b(void);

void a() {
    b();
    return;
}

void b() {
    
}

- (void)viewDidLoad {
    [super viewDidLoad];
    a();
}

Step in to view the assembly of a() 👇

It seems the key instructions are the first one and the one just before ret. Let’s look at the first instruction, reading from right to left as usual 👇

  • stp x29, x30, [sp, #-0x10]!
    • [sp, #-0x10]! 👉 because the offset #-0x10 is negative, this stretches 16 bytes of stack space. Note the exclamation mark !: it means the computed address is written back to sp, so the whole thing is equivalent to sp -= 0x10
    • stp x29, x30 👉 after sp has been moved, x29 and x30 are stored in order starting at the new sp, so x29 is at sp and x30 is at sp + 0x08

With the first instruction analyzed, the ldp instruction is not hard 👇

  • ldp x29, x30, [sp], #0x10
    • [sp], #0x10 👉 as you might guess, this restores sp: after the load, sp += 0x10, which releases the stack space
    • ldp x29, x30 👉 loads the values stored on the stack at sp into x29 and x30
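The two addressing forms differ only in when sp is updated relative to the memory access. A minimal C sketch, assuming a hypothetical Machine struct whose sp is an offset into a small byte array:

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  mem[64];
    uint64_t sp;        /* byte offset into mem, standing in for an address */
    uint64_t x29, x30;
} Machine;

/* stp x29, x30, [sp, #-0x10]!  (pre-index: move sp first, then store) */
static void stp_preindex(Machine *m) {
    m->sp -= 0x10;
    memcpy(&m->mem[m->sp],     &m->x29, 8);   /* x29 -> [sp]       */
    memcpy(&m->mem[m->sp + 8], &m->x30, 8);   /* x30 -> [sp + 0x8] */
}

/* ldp x29, x30, [sp], #0x10  (post-index: load first, then move sp) */
static void ldp_postindex(Machine *m) {
    memcpy(&m->x29, &m->mem[m->sp],     8);   /* [sp]       -> x29 */
    memcpy(&m->x30, &m->mem[m->sp + 8], 8);   /* [sp + 0x8] -> x30 */
    m->sp += 0x10;
}
```

Used as a pair, the two functions leave sp exactly where it started, which is the stack balance that every function has to maintain.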

So for C() and D(), we just write 👇

.text
.global _C, _D

_C:
    str x30, [sp,#-0x10]!
    mov x0,#0xaaaa
    bl _D
    mov x0,#0xaaaa
    ldr x30,[sp],#0x10
    ret

_D:
    mov x0,#0xbbbb
    ret

Run and debug 👇

Step into the C function 👇

Then step into the D function 👇

Then keep stepping 👇

The value 0x0000000100C7dc50 saved in lr is the return address back into the C function. The sp register holds 0x000000016f1851A0; view the memory there 👇

As we know, sp points to the top of the stack. Let’s go back to the assembly around the bl to C() in viewDidLoad 👇

The value 0x0100C7dCD0 is the address of the instruction after the bl to C(), which confirms that the lr value for returning to viewDidLoad was saved on the stack.

After ldr x30, [sp], #0x10 executes, x30 = 0x0100C7dcd0 again, so ret goes back to viewDidLoad at 0x0100C7dcd0, and the infinite loop is resolved.

In summary

⚠️ When function calls are nested, x30 needs to be saved on the stack!

Stretching only 8 bytes

What happens if you stretch only 8 bytes of space? 👇

_C:
    str x30, [sp,#-0x8]!
    mov x0,#0xaaaa
    bl _D
    mov x0,#0xaaaa
    ldr x30,[sp],#0x8
    ret

str x30, [sp,#-0x8]! stretches only 8 bytes. Run 👇

Execution fails at ldr x30, [sp], #0x8: by then sp is no longer 16-byte aligned when it is used to address memory.

Therefore, the stack must maintain the principle of 16-byte alignment!
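The alignment rule itself is just a check on the low four bits of sp. A tiny C sketch, with sp_is_aligned as a hypothetical helper name and the sample address taken from the debugging session above:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when sp satisfies the 16-byte rule, i.e. its low four bits are zero. */
static bool sp_is_aligned(uint64_t sp) {
    return (sp & 0xF) == 0;
}
```

Starting from an aligned sp such as 0x16f1851a0, stretching by 0x10 keeps the check true, while stretching by 0x8 makes it false.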

Function parameters and return values

Next, let’s look at how assembly handles functions that take arguments and return values. For example 👇

int sum(int a, int b) {
    return a + b;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    sum(10, 20);
}

Put a breakpoint on the sum(10, 20); line and view the assembly 👇

The arguments are passed in registers w0 and w1: w0 = 10, w1 = 20.

Finally, before returning to viewDidLoad, the result is saved in register w0. So we can implement the sum function ourselves in assembly, like this 👇

.text
.global _sum

_sum:
    add x0,x0,x1
    ret

x0 = x0 + x1, because the arguments are stored in x0 and x1.

Call it 👇

int sum(int a, int b);

- (void)viewDidLoad {
    [super viewDidLoad];
    printf("%d", sum(10, 20));
}

Run it 👇

  1. Under ARM64, function parameters are stored in the eight registers x0 to x7 (w0 to w7).
  2. If there are more than eight parameters, the extra ones are passed on the stack.
  3. The function’s return value is placed in the x0 register.
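The first two rules can be sketched as a small C lookup that says where the n-th int argument goes. The ArgLocation type, the int_arg_location function and the slot_bytes parameter are all hypothetical, introduced only for this illustration; it ignores floating-point and structure arguments entirely.

```c
#include <stdbool.h>

typedef struct {
    bool in_register;   /* true: passed in w<reg>; false: passed on the stack */
    int  reg;           /* register number 0..7 when in_register is true      */
    int  stack_offset;  /* byte offset from the caller's sp otherwise         */
} ArgLocation;

/* Location of the index-th (0-based) int argument. slot_bytes is the stack
   space each stack-passed int occupies: 4 matches the "x9 offset 4 bytes"
   note in the viewDidLoad listing later in this article; other ABIs may
   give every argument a full 8-byte slot. */
static ArgLocation int_arg_location(int index, int slot_bytes) {
    ArgLocation loc = { false, -1, -1 };
    if (index < 8) {
        loc.in_register = true;
        loc.reg = index;
    } else {
        loc.stack_offset = (index - 8) * slot_bytes;
    }
    return loc;
}
```

For a call with ten int arguments, indexes 0 to 7 land in w0 to w7, and indexes 8 and 9 land at the caller's sp and sp + 4 when slot_bytes is 4.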

The number of parameters exceeds 8

int test(int a, int b, int c ,int d, int e, int f, int g, int h, int i) {
    return a + b + c + d + e + f + g + h + i;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    test(1, 2, 3, 4, 5, 6, 7, 8, 9);
}

Parameter distribution and SP direction are shown in the following figure 👇

Then we step into the test function 👇

The whole process of accumulation is shown in the following figure 👇

The final function return value is put into w0.

In Release mode, the call to test is optimized out, because its result is unused and it has no effect on the app.

The return value

  • The return value of a function is normally no larger than a pointer, i.e. at most 8 bytes, so the x0 register is perfectly sufficient.
  • What if you want to return a structure type that is larger than that?

Take a look at the following example 👇

struct str {
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
};

struct str getStr(int a, int b, int c, int d, int e, int f) {
    struct str str1;
    str1.a = a;
    str1.b = b;
    str1.c = c;
    str1.d = d;
    str1.e = e;
    str1.f = f;
    return str1;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    struct str str2 = getStr(1, 2, 3, 4, 5, 6);
}

Set a breakpoint and view the assembly 👇

Step into the getStr function 👇

The entire assembly assignment process of getStr is shown below 👇

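In the end, instead of being returned in x0, the return value is written to memory through the x8 register, which points into the stack of the previous function (viewDidLoad).

In summary, when the return value is too large to be returned in registers (as with this 24-byte structure), it is stored in the stack space of the previous function. Under the standard ARM64 calling convention, structures of up to 16 bytes can still come back in x0 and x1.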
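One way to picture this is that the compiler quietly rewrites a "return the structure by value" function into one that takes a hidden pointer to the caller's storage. The C sketch below shows both forms side by side; getStr_lowered is a hypothetical name for the rewritten form, and on ARM64 its result pointer would travel in x8 rather than as an ordinary argument.

```c
struct str { int a, b, c, d, e, f; };

/* What the source code says: return the structure by value. */
static struct str getStr(int a, int b, int c, int d, int e, int f) {
    struct str str1;
    str1.a = a; str1.b = b; str1.c = c;
    str1.d = d; str1.e = e; str1.f = f;
    return str1;
}

/* Roughly what the generated code does: the caller reserves space in its
   own stack frame and hands over the address; the callee writes through it. */
static void getStr_lowered(struct str *result,
                           int a, int b, int c, int d, int e, int f) {
    result->a = a; result->b = b; result->c = c;
    result->d = d; result->e = e; result->f = f;
}
```

Calling getStr_lowered(&str2, 1, 2, 3, 4, 5, 6) with a local str2 fills the same memory that struct str str2 = getStr(1, 2, 3, 4, 5, 6) would.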

The structure has more than 8 members

What happens if you have more than eight members of your structure?

struct str {
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
    int g;
    int h;
    int i;
    int j;
};

struct str getStr(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j) {
    struct str str1;
    str1.a = a;
    str1.b = b;
    str1.c = c;
    str1.d = d;
    str1.e = e;
    str1.f = f;
    str1.g = g;
    str1.h = h;
    str1.i = i;
    str1.j = j;
    return str1;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    struct str str2 = getStr(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    printf("%d", func(10, 20));
}

ViewDidLoad assembly 👇

ASMPrj`-[ViewController viewDidLoad]:
    0x100f31c80 <+0>:   sub    sp, sp, #0x60             ; stretch stack space
    0x100f31c84 <+4>:   stp    x29, x30, [sp, #0x50]     ; 👉 save the way home
    0x100f31c88 <+8>:   add    x29, sp, #0x50            ; =0x50
    0x100f31c8c <+12>:  stur   x0, [x29, #-0x8]
    0x100f31c90 <+16>:  stur   x1, [x29, #-0x10]
    0x100f31c94 <+20>:  ...
    0x100f31c98 <+24>:  sub    x9, x29, #0x20            ; =0x20
    0x100f31c9c <+28>:  stur   x8, [x29, #-0x20]
    // adrp 👉 address of a page
    0x100f31ca0 <+32>:  adrp   x8, 4
    0x100f31ca4 <+36>:  add    x8, x8, #0x4e0            ; =0x4e0
    0x100f31ca8 <+40>:  ldr    x8, [x8]
    0x100f31cac <+44>:  str    x8, [x9, #0x8]
    0x100f31cb0 <+48>:  adrp   x8, 4
    0x100f31cb4 <+52>:  add    x8, x8, #0x458            ; =0x458
    0x100f31cb8 <+56>:  ldr    x1, [x8]
    0x100f31cbc <+60>:  mov    x0, x9
    0x100f31cc0 <+64>:  bl     0x100f32524               ; symbol stub for: objc_msgSendSuper2
    // x8 points to sp + 0x8
    0x100f31cc4 <+68>:  add    x8, sp, #0x8              ; =0x8
    0x100f31cc8 <+72>:  mov    w0, #0x1
    0x100f31ccc <+76>:  mov    w1, #0x2
    0x100f31cd0 <+80>:  mov    w2, #0x3
    0x100f31cd4 <+84>:  mov    w3, #0x4
    0x100f31cd8 <+88>:  mov    w4, #0x5
    0x100f31cdc <+92>:  mov    w5, #0x6
    0x100f31ce0 <+96>:  mov    w6, #0x7
    0x100f31ce4 <+100>: mov    w7, #0x8
    // copy sp to x9
    0x100f31ce8 <+104>: mov    x9, sp
    // w10 holds the 9th argument, stored where x9 points
    0x100f31cec <+108>: mov    w10, #0x9
    0x100f31cf0 <+112>: str    w10, [x9]
    0x100f31cf4 <+116>: mov    w10, #0xa
    // x9 offset by 4 bytes, store w10
    0x100f31cf8 <+120>: str    w10, [x9, #0x4]
    0x100f31cfc <+124>: bl     0x100f31bf4               ; getStr at ViewController.m:30
    0x100f31d00 <+128>: ldp    x29, x30, [sp, #0x50]
    0x100f31d04 <+132>: add    sp, sp, #0x60             ; =0x60
    0x100f31d08 <+136>: ret

Then look at the assembly of getStr 👇

ASMPrj`getStr:
->  0x1004ddbf4 <+0>:   sub    sp, sp, #0x30             ; =0x30
    0x1004ddbf8 <+4>:   ldr    w9, [sp, #0x30]
    0x1004ddbfc <+8>:   ldr    w10, [sp, #0x34]
    0x1004ddc00 <+12>:  str    w0, [sp, #0x2c]
    0x1004ddc04 <+16>:  str    w1, [sp, #0x28]
    0x1004ddc08 <+20>:  str    w2, [sp, #0x24]
    0x1004ddc0c <+24>:  str    w3, [sp, #0x20]
    0x1004ddc10 <+28>:  str    w4, [sp, #0x1c]
    0x1004ddc14 <+32>:  str    w5, [sp, #0x18]
    0x1004ddc18 <+36>:  str    w6, [sp, #0x14]
    0x1004ddc1c <+40>:  str    w7, [sp, #0x10]
    0x1004ddc20 <+44>:  str    w9, [sp, #0xc]
    0x1004ddc24 <+48>:  str    w10, [sp, #0x8]
    0x1004ddc28 <+52>:  ldr    w9, [sp, #0x2c]
    0x1004ddc2c <+56>:  str    w9, [x8]
    0x1004ddc30 <+60>:  ldr    w9, [sp, #0x28]
    0x1004ddc34 <+64>:  str    w9, [x8, #0x4]
    0x1004ddc38 <+68>:  ldr    w9, [sp, #0x24]
    0x1004ddc3c <+72>:  str    w9, [x8, #0x8]
    0x1004ddc40 <+76>:  ldr    w9, [sp, #0x20]
    0x1004ddc44 <+80>:  str    w9, [x8, #0xc]
    0x1004ddc48 <+84>:  ldr    w9, [sp, #0x1c]
    0x1004ddc4c <+88>:  str    w9, [x8, #0x10]
    0x1004ddc50 <+92>:  ldr    w9, [sp, #0x18]
    0x1004ddc54 <+96>:  str    w9, [x8, #0x14]
    0x1004ddc58 <+100>: ldr    w9, [sp, #0x14]
    0x1004ddc5c <+104>: str    w9, [x8, #0x18]
    0x1004ddc60 <+108>: ldr    w9, [sp, #0x10]
    0x1004ddc64 <+112>: str    w9, [x8, #0x1c]
    0x1004ddc68 <+116>: ldr    w9, [sp, #0xc]
    0x1004ddc6c <+120>: str    w9, [x8, #0x20]
    0x1004ddc70 <+124>: ldr    w9, [sp, #0x8]
    0x1004ddc74 <+128>: str    w9, [x8, #0x24]
    // stack balance
    0x1004ddc78 <+132>: add    sp, sp, #0x30             ; =0x30
    0x1004ddc7c <+136>: ret

The entire execution process is shown in the following figure 👇

As shown in the figure above, both the argument and the return value are in the stack of the previous function (ViewDidLoad), and the address of the return value is high and the argument is low.

Local variables of functions

Finally, let’s look at the local variables of a function, starting with the following example 👇

int func(int a, int b) {
    int c = 6;
    return  a + b + c;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    func(10, 20);
}

First look at the assembly of func 👇

👉 The function’s local variables are placed on the function’s own stack.

Nested calls

What about a nested call scenario? For example 👇

int func1(int a, int b) {
    int c = 6;
    int d = func2(a, b, c);
    int e = func2(a, b, c);
    return  d + e;
}

int func2(int a, int b, int c) {
    int d = a + b + c;
    printf("%d",d);
    return d;
}

- (void)viewDidLoad {
    [super viewDidLoad];
    func1(10, 20);
}

Assembly code 👇

As you can see in the figure above, parameters and return values are still stored on the stack.

The context that has to be protected includes: fp, lr, parameters, and return values.

Conclusion

  • The stack
    • A storage space with a special access pattern (Last In, First Out, LIFO)
    • SP and FP registers
      • The sp register always holds the address of the top of the stack
      • The fp (x29) register is a general-purpose register that is used at certain times to hold the address of the bottom of the stack (nested calls)
    • Stack operations in ARM64 are 16-byte aligned
    • Stack read/write instructions
      • Read: ldr (load register) 👉 ldr, ldp
      • Write: str (store register) 👉 str, stp
    • Assembly instructions:
      • sub sp, sp, #0x10 ; stretch the stack space by 16 bytes
      • stp x0, x1, [sp] ; store x0 and x1 where sp points
      • ldp x0, x1, [sp] ; load x0 and x1 from where sp points
      • add sp, sp, #0x10 ; restore the stack space
    • Shorthand:
      • stp x0, x1, [sp, #-0x10]! ; only works when the newly opened space is filled exactly. The values are stored at sp - 0x10, and sp is updated to that address.
      • ldp x0, x1, [sp], #0x10 ; loads the two values at sp, then adds 0x10 to sp
  • bl instruction
    • Jump instruction: bl label goes to the label to execute instructions and saves the address of the next instruction in the lr register
    • B stands for jump
    • L stands for the lr (x30) register
  • ret instruction
    • Similar to return in a function
    • Makes the CPU execute the instruction that the lr register points to
    • If there is a further jump, the context must be protected first.
  • Functions
    • Function call stack
      • The ARM64 stack is a descending stack that grows toward lower addresses
      • The sp register points to the top of the stack
      • The x29 (fp) register points to the bottom of the stack
    • Parameters of a function
      • In ARM64, by default, parameters are stored in the eight registers x0 to x7
      • Floating-point parameters use floating-point registers
      • If there are more than 8 parameters, the extra ones are passed on the stack. (Stack-passed parameters are not released when the called function ends; they behave like local variables of the caller, and are released only when the caller performs its own stack balancing at the end.)
    • The return value of a function
      • Normally, the return value of a function is stored in the x0 register
      • If the return value is too large for registers, memory is used: it is written into the caller’s stack frame, with the x8 register holding the address.
    • Local variables of a function
      • The stack is used to hold local variables
      • Nested calls to a function
        • The x29 (fp) and x30 (lr) registers are saved on the stack for protection.
        • Altogether, the context to protect is: fp, lr, parameters, and return values.