1. Registers in ARM64

When a program runs, it relies above all on the registers in the CPU. The number and kinds of registers differ from one CPU architecture to another.

ARM64 has 31 64-bit (8-byte) general-purpose registers, named x0 to x30. A few registers have special uses:

  1. x29 is fp (frame pointer): it records the bottom of the current stack frame.
  2. x30 is lr (link register): it records the return address, that is, the address of the next instruction to execute when the current function returns.
  3. sp (stack pointer): it holds the address of the top of the stack at any given time. It is often listed as x31, but it is not a general-purpose register; register number 31 in an instruction encoding means either sp or xzr, depending on the instruction.

There are also several other registers worth knowing:

  1. pc (program counter): holds the address of the instruction the CPU is currently reading.
  2. cpsr (Current Program Status Register): records status flags produced during CPU calculations. On ARM64 this state is formally called PSTATE, although debuggers such as LLDB still display it as cpsr.
  3. xzr (zero register): always reads as 0, and writes to it are discarded.
  4. w0 to w30: 31 32-bit registers, which are simply the lower 32 bits of x0 to x30.
  5. The CPU also has floating-point registers, vector registers, and so on.
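The relationship between the w and x names can be sketched in C. This is only a toy model: the `Reg` type and the helpers `read_w` and `write_w` are made-up names for illustration, not part of any real API. It assumes the ARM64 rule that a write to a w register clears the upper 32 bits of the matching x register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one general-purpose register: x is the full 64 bits. */
typedef struct {
    uint64_t x;
} Reg;

/* Reading wN returns the low 32 bits of xN. */
uint32_t read_w(const Reg *r) {
    assert(r != NULL);
    return (uint32_t)r->x;
}

/* Writing wN replaces the low 32 bits and clears the upper 32 bits of xN. */
void write_w(Reg *r, uint32_t value) {
    assert(r != NULL);
    r->x = (uint64_t)value;
}
```

So w0 is not a separate register: it is a 32-bit view of x0.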

2. Program execution from the CPU's perspective

While it runs, a program repeats the following steps:

  1. The CPU loads data from memory (heap, stack, etc.) into general-purpose registers
  2. Operations are performed on the data in those registers
  3. The CPU stores the result back into memory
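The three steps above can be sketched in C. The function name `increment` and the local variable standing in for a register are illustrative assumptions; a real compiler decides which register to use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds 1 to a value that lives in memory, spelled out as load/operate/store. */
void increment(uint64_t *counter_in_memory) {
    assert(counter_in_memory != NULL);
    uint64_t reg = *counter_in_memory;   /* 1. load from memory into a "register" */
    reg = reg + 1;                       /* 2. operate on the register */
    *counter_in_memory = reg;            /* 3. store the result back to memory */
}
```

On ARM64 these three lines correspond roughly to an ldr, an add, and a str.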

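In addition, inside memory or a binary on disk, instructions and data look exactly the same. For example, take the bits 1110 0000 0000 0011 0000 1000 1010 1010:

  1. As data: 0xE00308AA (the bytes E0 03 08 AA in memory order)
  2. As an instruction: mov x0, x8 (instructions are stored little-endian, so the CPU reads these bytes as the 32-bit word 0xAA0803E0)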
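To make the "same bits, two readings" point concrete, here is a minimal C sketch that reads those four bytes as a little-endian word and checks whether the word matches the encoding of `mov Xd, Xm`. The helper names `word_from_bytes` and `decode_mov_reg` are made up for this example, and the decoder recognizes only this one instruction form (an alias of `orr Xd, xzr, Xm`), nothing else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four bytes, in memory order, into a little-endian 32-bit word. */
uint32_t word_from_bytes(const uint8_t b[4]) {
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/*
 * Returns 1 if insn is "mov Xd, Xm", which is an alias of
 * "orr Xd, xzr, Xm" (64-bit ORR, shifted register, no shift), and
 * stores the register numbers in *rd and *rm. Returns 0 otherwise.
 */
int decode_mov_reg(uint32_t insn, unsigned *rd, unsigned *rm) {
    assert(rd != NULL && rm != NULL);
    /* Fixed bits: sf=1 opc=01 01010 shift=00 N=0, imm6=0, Rn=31 (xzr). */
    if ((insn & 0xFFE0FFE0u) != 0xAA0003E0u) {
        return 0;
    }
    *rm = (insn >> 16) & 0x1Fu;   /* source register number */
    *rd = insn & 0x1Fu;           /* destination register number */
    return 1;
}
```

Fed the bytes E0 03 08 AA, the decoder reports destination register 0 and source register 8, which is `mov x0, x8`. Whether the CPU treats those bytes as an instruction or as data depends only on how they are reached.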

CPU execution can be understood as follows:

  1. Whatever the pc register points to is interpreted and executed by the CPU as an instruction
  2. After the current instruction has executed, pc points to the next instruction and execution continues from there
  3. By changing the contents of pc we control which instructions the CPU executes. In ARM64, the bl instruction changes pc to implement a function jump; the b family of instructions (b, bl, and so on) are called branch (transfer) instructions
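The loop described above can be sketched as a toy interpreter in C. Everything here is hypothetical: the opcodes `OP_ADD_IMM`, `OP_JUMP`, and `OP_HALT` are invented for the example and are not ARM64 encodings, and `pc` is an array index rather than a memory address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are not real ARM64 encodings. */
typedef enum { OP_ADD_IMM, OP_JUMP, OP_HALT } Op;

typedef struct {
    Op op;
    int64_t operand;   /* immediate for OP_ADD_IMM, target index for OP_JUMP */
} Insn;

/* Runs program from index 0 and returns the final accumulator value. */
int64_t run(const Insn *program, size_t count) {
    size_t pc = 0;       /* index of the instruction to fetch next */
    int64_t acc = 0;
    assert(program != NULL);
    while (pc < count) {
        Insn insn = program[pc];   /* fetch whatever pc points at */
        pc += 1;                   /* by default, move on to the next instruction */
        switch (insn.op) {
        case OP_ADD_IMM:
            acc += insn.operand;
            break;
        case OP_JUMP:
            assert(insn.operand >= 0 && (size_t)insn.operand < count);
            pc = (size_t)insn.operand;   /* a branch simply overwrites pc */
            break;
        case OP_HALT:
            return acc;
        }
    }
    return acc;
}
```

A program that adds 1, jumps over an "add 100", and then adds 10 ends with 11 in the accumulator: the skipped instruction never runs because pc never pointed at it.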

Here is a simple assembly example in which the function void A() calls the function void B():

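_A:
    stp x29, x30, [sp, #-16]!   ; save fp and lr, because the bl below overwrites lr
    mov x0, #0xa0
    mov x1, #0x00
    add x1, x0, #0x14           ; x1 = 0xa0 + 0x14 = 0xb4
    mov x0, x1                  ; x0 = 0xb4
    bl _B                       ; lr = address of the next instruction, pc = _B
    mov x0, #0x0
    ldp x29, x30, [sp], #16     ; restore fp and lr so that ret returns to A's caller
    ret

_B:
    add x0, x0, #0x10           ; x0 = 0xb4 + 0x10 = 0xc4
    ret                         ; jump to the address held in lr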

3. Functions from the CPU's perspective

In ARM64, a function grows its stack by subtracting from sp (sub) and balances it again by adding the same amount back (add). Here is an example:

    sub sp, sp, #0x40             ; grow the stack by 0x40 (64) bytes
    stp x29, x30, [sp, #0x30]     ; save x29 and x30 on the stack
    add x29, sp, #0x30            ; x29 points to the bottom of the stack frame
    ...
    ldp x29, x30, [sp, #0x30]     ; restore x29 and x30
    add sp, sp, #0x40             ; stack balance
    ret

Memory operations in functions:

  1. Store: use stp (store pair) or str (store register) to write register contents to memory.

  2. Load: use ldp (load pair) or ldr (load register) to read data from memory into the specified registers.

Note: ARM64 general-purpose registers are 64 bits (8 bytes) wide, so str/ldr on an x register moves at most 8 bytes at a time, whereas stp/ldp moves 16 bytes at a time.
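The prologue and epilogue pattern, together with the 8-byte and 16-byte store sizes, can be modeled in C. This is a sketch under stated assumptions: the `Cpu` struct, the 256-byte toy stack, and the helpers `str64`, `ldr64`, `prologue`, and `epilogue` are all invented for illustration, and offsets into the array stand in for real addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256u

typedef struct {
    uint8_t  mem[STACK_SIZE]; /* toy stack memory; offsets stand in for addresses */
    uint64_t sp;              /* offset of the top of the stack; grows downward */
    uint64_t fp, lr;          /* x29 and x30 */
} Cpu;

/* str: store one 64-bit register (8 bytes) at addr. */
void str64(Cpu *c, uint64_t addr, uint64_t v) {
    assert(addr + 8 <= STACK_SIZE);
    memcpy(&c->mem[addr], &v, 8);
}

/* ldr: load one 64-bit register (8 bytes) from addr. */
uint64_t ldr64(const Cpu *c, uint64_t addr) {
    uint64_t v;
    assert(addr + 8 <= STACK_SIZE);
    memcpy(&v, &c->mem[addr], 8);
    return v;
}

/* sub sp, sp, #0x40 ; stp x29, x30, [sp, #0x30] ; add x29, sp, #0x30 */
void prologue(Cpu *c) {
    assert(c->sp >= 0x40 && c->sp <= STACK_SIZE);
    c->sp -= 0x40;
    str64(c, c->sp + 0x30, c->fp);   /* stp writes two registers, */
    str64(c, c->sp + 0x38, c->lr);   /* 16 bytes in total */
    c->fp = c->sp + 0x30;
}

/* ldp x29, x30, [sp, #0x30] ; add sp, sp, #0x40 */
void epilogue(Cpu *c) {
    c->fp = ldr64(c, c->sp + 0x30);
    c->lr = ldr64(c, c->sp + 0x38);
    c->sp += 0x40;
}
```

After `prologue` followed by `epilogue`, sp, fp, and lr hold their original values again, even if lr was overwritten in between. That is what "stack balance" means in practice.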

ARM64 instructions and registers related to function calls:

  1. bl label

    • Puts the address of the next instruction into the lr (x30) register

    • Jumps to label and continues executing there

  2. ret

    • Uses the value in the lr (x30) register by default; the CPU takes it as the address of the next instruction to execute

  3. x30 (lr) register

    • x30 holds the function's return address. When ret executes, the CPU jumps to the address saved in x30.
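The effect of bl and ret on pc and lr can be sketched in a few lines of C. The `Cpu` struct and the functions `bl` and `ret` below are a toy model written for this example; the addresses used with them are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t pc;   /* address of the instruction being executed */
    uint64_t lr;   /* x30 */
} Cpu;

/* bl label: save the address of the next instruction in lr, then jump. */
void bl(Cpu *c, uint64_t label) {
    assert(c != NULL);
    assert(label % 4 == 0);   /* ARM64 instructions are 4 bytes long and 4-byte aligned */
    c->lr = c->pc + 4;
    c->pc = label;
}

/* ret: continue at the address held in lr. */
void ret(Cpu *c) {
    assert(c != NULL);
    c->pc = c->lr;
}
```

Note that a second `bl` executed inside the callee overwrites lr. A function that calls another function therefore has to save x30 on the stack first and restore it before its own `ret`.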

Function parameters and return values:

  1. In ARM64, function parameters are passed in the eight registers x0 to x7. If there are more than eight parameters, the extra ones are written to stack memory before the call.
  2. The return value of the function is placed in the x0 register.
  3. The local variables of the function are kept in stack memory.
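The parameter rule can be illustrated with a small C sketch. It is deliberately simplified: it handles only integer-sized arguments, ignores floating-point registers and the exact stack layout, and the names `CallSetup`, `place_args`, and `MAX_STACK_ARGS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ARG_REGS 8u
#define MAX_STACK_ARGS 8u   /* arbitrary limit for this toy model */

typedef struct {
    uint64_t x[NUM_ARG_REGS];         /* x0..x7 */
    uint64_t stack[MAX_STACK_ARGS];   /* arguments beyond the eighth */
    size_t   stack_count;
} CallSetup;

/* Places the first eight arguments in x0..x7 and the rest in stack memory. */
CallSetup place_args(const uint64_t *args, size_t count) {
    CallSetup s = {0};
    assert(args != NULL);
    assert(count <= NUM_ARG_REGS + MAX_STACK_ARGS);
    for (size_t i = 0; i < count; i++) {
        if (i < NUM_ARG_REGS) {
            s.x[i] = args[i];
        } else {
            s.stack[s.stack_count++] = args[i];
        }
    }
    return s;
}
```

With ten arguments, the first eight land in x0 to x7 and the last two go to the stack.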

4. Multithreading from the CPU's perspective

In ARM64, the CPU itself is not aware that threads exist while a program runs. The operating system kernel maintains a separate function call stack for each thread, and whenever it switches threads it saves the CPU state, including the key registers, in a kernel data structure:

// This is the thread context structure for Linux on a 32-bit ARM CPU, from
// http://elixir.free-electrons.com/linux/latest/source/arch/arm/include/asm/thread_info.h
// Not all registers are saved here: the ABI used by Linux on ARM does not
// use every register, so only the specified registers need to be saved.
// Not all CPUs store the same content; it varies with the CPU architecture.
// The thread context structure defined by iOS is not available because the iOS kernel is not open source.
struct cpu_context_save {
    __u32 r4;
    __u32 r5;
    __u32 r6;
    __u32 r7;
    __u32 r8;
    __u32 r9;
    __u32 sl;
    __u32 fp;
    __u32 sp;
    __u32 pc;
    __u32 extra[2];                         /* Xscale 'acc' register, etc */
};

struct thread_info {
    unsigned long flags;                    /* low level flags */
    int preempt_count;                      /* 0 => preemptable, <0 => bug */
    mm_segment_t addr_limit;                /* address limit */
    struct task_struct *task;               /* main task structure */
    __u32 cpu;                              /* cpu */
    __u32 cpu_domain;                       /* cpu domain */
    struct cpu_context_save cpu_context;    /* cpu context */
    __u32 syscall;                          /* syscall number */
    __u8 used_cp[16];                       /* thread used copro */
    unsigned long tp_value[2];              /* TLS registers */
#ifdef CONFIG_CRUNCH
    struct crunch_state crunchstate;
#endif
    union fp_state fpstate __attribute__((aligned(8)));   /* float register */
    union vfp_state vfpstate;
#ifdef CONFIG_ARM_THUMBEE
    unsigned long thumbee_state;            /* ThumbEE Handler Base register */
#endif
};

Source: ouyang eldest brother 2013, https://juejin.cn/post/6844903560145010702 (Juejin). Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.
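What the kernel does with such a structure when it switches threads can be sketched as follows. This is a minimal model, not the layout or code of any real kernel: `Context`, `Cpu`, `Thread`, and `context_switch` are hypothetical names, and a real switch is written in assembly and saves only the registers the ABI requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical saved context; not the layout used by any real kernel. */
typedef struct {
    uint64_t x[31];   /* x0..x30 */
    uint64_t sp;
    uint64_t pc;
} Context;

typedef struct { Context regs; } Cpu;       /* the single physical register set */
typedef struct { Context saved; } Thread;   /* per-thread copy kept by the kernel */

/* Switches the CPU from thread from to thread to. */
void context_switch(Cpu *cpu, Thread *from, const Thread *to) {
    assert(cpu != NULL && from != NULL && to != NULL);
    from->saved = cpu->regs;   /* save the outgoing thread's registers */
    cpu->regs = to->saved;     /* load the incoming thread's registers */
}
```

Because sp and pc are part of the saved state, each thread resumes on its own stack at the instruction where it left off, and the CPU never needs to know that a switch happened.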

5. objc_msgSend assembly analysis

There are many analyses online; see, for example, www.jianshu.com/p/62ecc3f31…

6. Hook all Objective-C methods and measure the time each method takes

Consider two implementations:

  1. github.com/czqasngit/o…

  2. github.com/maniackk/Ti…

Reference:

  1. www.cnblogs.com/bwangblog/p…
  2. juejin.cn/post/684490…