This article first appeared on my blog: blog.shenyuanluo.com. Please subscribe.

Arm64 assembly preparation

Registers

General purpose register

There are 31 general-purpose registers, R0 to R30, each 64 bits wide. When a register is accessed as x0-x30, it is a 64-bit value; when it is accessed as w0-w30, it is a 32-bit value and only the lower 32 bits of the register are used, as shown in the figure:
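The relationship between the x and w views can be sketched in C. This is only an illustrative model: the `Reg` type and the `read_x`, `read_w` and `write_w` helpers are hypothetical names, not part of any toolchain. It assumes the usual AArch64 behaviour that writing a w register clears the upper 32 bits of the underlying 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one general-purpose register: 64 bits of storage. */
typedef struct { uint64_t bits; } Reg;

/* Reading xN returns all 64 bits. */
uint64_t read_x(const Reg *r) { assert(r != 0); return r->bits; }

/* Reading wN returns only the lower 32 bits. */
uint32_t read_w(const Reg *r) { assert(r != 0); return (uint32_t)r->bits; }

/* Writing wN stores the 32-bit value and zeroes the upper 32 bits. */
void write_w(Reg *r, uint32_t value) { assert(r != 0); r->bits = (uint64_t)value; }
```

For example, if a register holds 0x1122334455667788, the x view sees the whole value while the w view sees 0x55667788.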

Vector register

Each vector register is 128 bits wide. Different widths of the register can be accessed through the names Bn, Hn, Sn, Dn and Qn, as shown in the figure:

Note: a word is 32 bits, that is, 4 bytes.

  • Bn: a byte, i.e. 8 bits
  • Hn: a half word, i.e. 16 bits
  • Sn: a single word, i.e. 32 bits
  • Dn: a double word, i.e. 64 bits
  • Qn: a quad word, i.e. 128 bits
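As a rough illustration of these views, the sketch below models a 128-bit vector register as two 64-bit halves and reads the narrower views from the low-order bits. The `VReg` type and the `read_b`, `read_h`, `read_s` and `read_d` helpers are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register (the Qn view is the whole struct). */
typedef struct {
    uint64_t lo;  /* bits 0..63   */
    uint64_t hi;  /* bits 64..127 */
} VReg;

/* Bn: lowest 8 bits. */
uint8_t read_b(const VReg *v) { assert(v != 0); return (uint8_t)v->lo; }

/* Hn: lowest 16 bits. */
uint16_t read_h(const VReg *v) { assert(v != 0); return (uint16_t)v->lo; }

/* Sn: lowest 32 bits. */
uint32_t read_s(const VReg *v) { assert(v != 0); return (uint32_t)v->lo; }

/* Dn: lowest 64 bits. */
uint64_t read_d(const VReg *v) { assert(v != 0); return v->lo; }
```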

Special register

  • SP: (Stack Pointer) the stack-top register, which holds the address of the top of the stack;
  • FP (x29): (Frame Pointer) the stack-base register, which holds the base (bottom) address of the current stack frame;
  • LR (x30): (Link Register) holds the address of the instruction that follows the bl call instruction, i.e. the return address;
  • ZR (encoded as register 31): (Zero Register) xzr/wzr are its 64-bit and 32-bit forms; it always reads as 0, and values written to it are discarded;
  • PC: holds the address of the instruction to be executed (its value is managed by the system and cannot be written directly).

Status register CPSR

The CPSR (Current Program Status Register) differs from the other registers. The other registers store data, and each register as a whole has a single meaning. The CPSR is interpreted bit by bit: each bit has its own meaning and records a specific piece of information, as shown in the following figure:

Note: the CPSR register is 32 bits wide.

  1. The lower 8 bits of the CPSR (I, F, T, and M[4:0]) are called control bits; a program cannot modify them unless the CPU is running in a privileged mode.

  2. N, Z, C and V are the condition-code flags. Arithmetic and logical operations can change them, and they determine whether a conditional instruction is executed.

    • N (Negative) flag: bit 31 of the CPSR, the sign flag. It records whether the result of the instruction was negative: N = 1 if the result is negative, N = 0 otherwise.

    • Z (Zero) flag: bit 30 of the CPSR, the zero flag. It records whether the result of the instruction was 0: Z = 1 if the result is 0, Z = 0 otherwise.

    • C (Carry) flag: bit 29 of the CPSR, the carry flag:

      • Addition: if the operation produces a carry (unsigned overflow), C = 1; otherwise C = 0.
      • Subtraction (including CMP): if the operation produces a borrow (unsigned underflow), C = 0; otherwise C = 1.
    • V (Overflow) flag: bit 28 of the CPSR, the overflow flag. When a signed operation produces a result outside the range the machine can represent, this is called overflow and V = 1.
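To make the flag rules concrete, here is a minimal C sketch of how the four flags could be derived for a 64-bit comparison (a subtraction whose result is discarded). The `Flags` type and `flags_after_cmp` function are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by a 64-bit "cmp a, b", i.e. a - b with the result discarded. */
Flags flags_after_cmp(uint64_t a, uint64_t b) {
    uint64_t result = a - b;            /* unsigned arithmetic wraps modulo 2^64 */
    Flags f;
    f.n = (result >> 63) != 0;          /* sign bit of the result */
    f.z = (result == 0);                /* result is zero */
    f.c = (a >= b);                     /* C = 1 means no borrow was needed */
    /* Signed overflow: the operands have different signs and the
       sign of the result differs from the sign of a. */
    f.v = (((a ^ b) & (a ^ result)) >> 63) != 0;
    assert(f.z == (a == b));            /* sanity check: Z is set exactly when a == b */
    return f;
}
```

Comparing 1 with 2 gives N = 1 and C = 0 (a borrow occurred), while comparing the most negative signed value with 1 sets V because the true result does not fit in 64 signed bits.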

Conditional code list

Opcode | Condition mnemonic              | Flags        | Meaning
0000   | EQ                              | Z=1          | Equal
0001   | NE (Not Equal)                  | Z=0          | Not equal
0010   | CS/HS (Carry Set/High or Same)  | C=1          | Unsigned greater than or equal
0011   | CC/LO (Carry Clear/LOwer)       | C=0          | Unsigned less than
0100   | MI (MInus)                      | N=1          | Negative
0101   | PL (PLus)                       | N=0          | Positive or zero
0110   | VS (oVerflow Set)               | V=1          | Overflow
0111   | VC (oVerflow Clear)             | V=0          | No overflow
1000   | HI (HIgh)                       | C=1 and Z=0  | Unsigned greater than
1001   | LS (Lower or Same)              | C=0 or Z=1   | Unsigned less than or equal
1010   | GE (Greater or Equal)           | N=V          | Signed greater than or equal
1011   | LT (Less Than)                  | N!=V         | Signed less than
1100   | GT (Greater Than)               | Z=0 and N=V  | Signed greater than
1101   | LE (Less or Equal)              | Z=1 or N!=V  | Signed less than or equal
1110   | AL                              | any          | Unconditional execution (default)
1111   | NV                              | any          | Never execute on 32-bit ARM; on AArch64 this encoding behaves like AL
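The table can be read as a set of boolean tests on the flags. The following C sketch evaluates each condition from EQ to AL; the `Flags` type, the `Cond` enumeration and `cond_holds` are hypothetical names, and the NV encoding is deliberately left out of the model. Note that HI requires both C = 1 and Z = 0, while LS holds when either C = 0 or Z = 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the four condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* Condition codes in opcode order (0000 .. 1110). */
enum Cond { EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL };

/* Returns true if the condition is satisfied by the given flags. */
bool cond_holds(enum Cond cond, Flags f) {
    assert(cond <= AL);
    switch (cond) {
    case EQ: return f.z;
    case NE: return !f.z;
    case CS: return f.c;                    /* also written HS */
    case CC: return !f.c;                   /* also written LO */
    case MI: return f.n;
    case PL: return !f.n;
    case VS: return f.v;
    case VC: return !f.v;
    case HI: return f.c && !f.z;
    case LS: return !f.c || f.z;
    case GE: return f.n == f.v;
    case LT: return f.n != f.v;
    case GT: return !f.z && f.n == f.v;
    case LE: return f.z || f.n != f.v;
    case AL: return true;
    }
    return false;
}
```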

Read instructions

In the ARM64 architecture, every instruction has a fixed length of 32 bits, that is, 4 bytes, so each instruction fetch reads 4 bytes.

Arm64 conventions (in general)

  • x0 ~ x7 hold the first eight parameters of a method, in order. If there are more than eight arguments, the extra arguments are stored on the stack, and the called method reads them from the stack.
  • A method usually returns its value in x0. If the method returns a large data structure, the result is written to the memory address held in x8.
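A small C file can be used to observe these conventions in the generated assembly. In the sketch below, `sum9`, `Big` and `make_big` are hypothetical names; the comments describe what a compiler following the convention above would typically do, and the actual output depends on the compiler and optimization level. The 16-byte threshold for "large" is an assumption taken from the standard AArch64 procedure call rules.

```c
#include <assert.h>
#include <stdint.h>

/* Nine integer arguments: a..h would typically be passed in x0..x7,
   and the ninth argument i would be passed on the stack. */
int64_t sum9(int64_t a, int64_t b, int64_t c, int64_t d, int64_t e,
             int64_t f, int64_t g, int64_t h, int64_t i) {
    return a + b + c + d + e + f + g + h + i;   /* result returned in x0 */
}

/* A 32-byte structure: too large to be returned in registers. */
typedef struct { int64_t v[4]; } Big;

/* The caller would typically pass the address of the result buffer in x8,
   and the function writes the structure to that address. */
Big make_big(int64_t seed) {
    Big b;
    assert(sizeof(Big) > 16);   /* assumption: structs above 16 bytes go through x8 */
    for (int k = 0; k < 4; k++) {
        b.v[k] = seed + k;
    }
    return b;
}
```

Compiling such a file to assembly and looking for the stack load in `sum9` and the stores through x8 in `make_big` is one way to check the convention on a given toolchain.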

Common assembly instructions

  • MOV: copies the value of one register to another register (it can only move values between registers, or from a constant to a register; it cannot access memory), e.g.:

    mov x1, x0 ; copy the value of register x0 into register x1
  • ADD: adds the value of one register to the value of another register (or a constant) and saves the result in a destination register, e.g.:

    add x0, x0, #1 ; add the constant 1 to the value of register x0 and save the result in register x0
    add x0, x1, x2 ; add the values of registers x1 and x2 and save the result in register x0

    Note: add cannot take a memory operand such as [x2]; a value in memory must first be loaded into a register with ldr.
  • SUB: subtracts the value of one register from the value of another register and saves the result in a destination register, e.g.:

    sub x0, x1, x2 ; subtract the value of register x2 from the value of register x1 and save the result in register x0
  • MUL: multiplies the value of one register by the value of another register and saves the result in a destination register, e.g.:

    mul x0, x1, x2 ; multiply the values of registers x1 and x2 and save the result in register x0
  • SDIV (signed; udiv is the unsigned counterpart): divides the value of one register by the value of another register and saves the result in a destination register, e.g.:

    sdiv x0, x1, x2 ; divide the value of register x1 by the value of register x2 and save the result in register x0
  • AND: bitwise ANDs the value of one register with the value of another register (or a constant) and saves the result in a destination register, e.g.:

    and x0, x0, #0xf ; bitwise AND the value of register x0 with the constant 0xf and save the result in register x0
  • ORR: bitwise ORs the value of one register with the value of another register (or a constant) and saves the result in a destination register, e.g.:

    orr x0, x0, #8 ; bitwise OR the value of register x0 with the constant 8 and save the result in register x0 (the constant must be encodable as a bitmask immediate; 9, for example, is not)
  • EOR: bitwise XORs the value of one register with the value of another register (or a constant) and saves the result in a destination register, e.g.:

    eor x0, x0, #0xf ; bitwise XOR the value of register x0 with the constant 0xf and save the result in register x0
  • STR: (store register) writes the value of a register to memory, e.g.:

    str w9, [sp, #0x8] ; save the value of register w9 to stack memory at [sp + 0x8]
  • STRB: (store register byte) writes the value of a register to memory (storing only one byte), e.g.:

    strb w8, [sp, #7] ; save the lowest byte of register w8 to stack memory at [sp + 7]
  • LDR: (load register) reads a value from memory into a register, e.g.:

    ldr x0, [x1]       ; use the value of register x1 as the memory address and read the value at that address into register x0
    ldr w8, [sp, #0x8] ; read the stack memory value at [sp + 0x8] into register w8
    ldr x0, [x1, #4]!  ; use the value of register x1 plus 4 as the memory address, read the value at that address into register x0, then write the value of x1 plus 4 back to register x1
    ldr x0, [x1], #4   ; use the value of register x1 as the memory address, read the value at that address into register x0, then write the value of x1 plus 4 back to register x1
    ldr x0, [x1, x2]   ; use the sum of registers x1 and x2 as the memory address and read the value at that address into register x0
  • LDRSB: (load register signed byte) reads a single byte from memory into a register and sign-extends it, e.g.:

    ldrsb w8, [sp, #7] ; read one byte from stack memory at [sp + 7], sign-extend it, and store it in register w8
  • STUR: like STR, writes the value of a register to memory; the offset is unscaled, so it is usually used when the offset is negative, e.g.:

    stur w10, [x29, #-0x4] ; save the value of register w10 to stack memory at [x29 - 0x4]
  • LDUR: like LDR, reads a value from memory into a register; the offset is unscaled, so it is usually used when the offset is negative, e.g.:

    ldur w8, [x29, #-0x4] ; read the stack memory value at [x29 - 0x4] into register w8
  • STP: push instruction (store pair, a variant of STR that stores two registers at once), e.g.:

    stp x29, x30, [sp, #0x10] ; store the values of x29 and x30 in memory starting at sp + 16 bytes
  • LDP: pop instruction (load pair, a variant of LDR that loads two registers at once), e.g.:

    ldp x29, x30, [sp, #0x10] ; read the two values stored at sp + 16 bytes into registers x29 and x30
  • SCVTF: (Signed Convert To Float) converts a signed integer (fixed-point number) to a floating-point number, e.g.:

    scvtf d1, w0 ; convert the signed integer in register w0 to floating point and save it in the vector/floating-point register d1
  • FCVTZS: (Float Convert To Zero Signed) converts a floating-point number to a signed integer (rounding toward zero), e.g.:

    fcvtzs w0, s0 ; convert the floating-point value in register s0 to a signed integer and save it in register w0
  • CBZ: (Compare and Branch on Zero) compares a register with 0 and branches to the given label if it is zero;

  • CBNZ: (Compare and Branch on Non-Zero) compares a register with 0 and branches to the given label if it is not zero;

  • CMP: comparison instruction, equivalent to subs with the result discarded; it only updates the flags in the program status register CPSR;

  • CSET: conditional set instruction; sets the register to 1 if the condition is met and to 0 otherwise, e.g.:

    cmp w8, #2   ; compare register w8 with the constant 2
    cset w8, gt  ; set register w8 to 1 if the comparison was greater than (gt), and to 0 otherwise
  • BRK: breakpoint instruction; it raises a debug exception, so it can be understood as a special kind of jump instruction

  • LSL: logical shift left

  • LSR: logical shift right

  • ASR: arithmetic shift right

  • ROR: rotate right

  • ADRP: used to locate data in the data segment. Because ASLR randomizes the addresses of code and data, adrp computes the page address of the target relative to the PC to assist in locating it

  • B: (branch) jumps to an address (without return) and does not change the value of the lr (x30) register; usually used for jumps within the current method, such as while loops and if/else, e.g.:

    b LBB0_1 ; jump to the label 'LBB0_1' and continue execution there
  • BL: jumps to an address (with return); it first saves the address of the next instruction (the function return address) in register lr (x30) and then jumps; usually used for direct calls to other methods, e.g.:

    bl 0x100cfa754 ; save the address of the next instruction (the return address for the call to '0x100cfa754') in register 'lr', then call the function at '0x100cfa754'
  • BLR: jumps to the address held in a register (with return); it first saves the address of the next instruction (the function return address) in register lr (x30) and then jumps, e.g.:

    blr x20 ; save the address of the next instruction (the return address for the call through 'x20') in register 'lr', then call the function whose address is in 'x20'
  • BR: jumps to the address held in a register (without return) and does not change the value of the lr (x30) register.

  • RET: subroutine (function call) return instruction; by default it returns to the address saved in register lr (x30)
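The addressing forms of ldr listed above differ mainly in when the base register is updated. The C sketch below models them on a small byte array; `Mem`, `load64`, `store64` and the `ldr_*` helpers are hypothetical names for this illustration, and addresses are simply offsets into the array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory; an "address" is a byte offset into it. */
typedef struct { uint8_t bytes[64]; } Mem;

/* Read 8 bytes at addr (models the memory access performed by ldr). */
uint64_t load64(const Mem *m, uint64_t addr) {
    uint64_t value;
    assert(addr + sizeof value <= sizeof m->bytes);
    memcpy(&value, m->bytes + addr, sizeof value);
    return value;
}

/* Write 8 bytes at addr (models the memory access performed by str). */
void store64(Mem *m, uint64_t addr, uint64_t value) {
    assert(addr + sizeof value <= sizeof m->bytes);
    memcpy(m->bytes + addr, &value, sizeof value);
}

/* ldr x0, [x1, #imm] : address is x1 + imm, x1 is left unchanged. */
uint64_t ldr_offset(const Mem *m, uint64_t x1, int64_t imm) {
    return load64(m, x1 + (uint64_t)imm);
}

/* ldr x0, [x1, #imm]! : x1 is updated first, then used as the address. */
uint64_t ldr_pre_index(const Mem *m, uint64_t *x1, int64_t imm) {
    *x1 += (uint64_t)imm;
    return load64(m, *x1);
}

/* ldr x0, [x1], #imm : x1 is used as the address, then updated. */
uint64_t ldr_post_index(const Mem *m, uint64_t *x1, int64_t imm) {
    uint64_t value = load64(m, *x1);
    *x1 += (uint64_t)imm;
    return value;
}

/* ldr x0, [x1, x2] : address is the sum of two registers. */
uint64_t ldr_reg_offset(const Mem *m, uint64_t x1, uint64_t x2) {
    return load64(m, x1 + x2);
}
```

With the value 111 stored at offset 8 and 222 at offset 16, and x1 = 8, the pre-index form with an offset of 8 returns 222 while the post-index form returns 111; both leave x1 at 16.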

A function call

For every function call, there is a push operation and a pop operation.

Example: PushAndPop.c

The source code

#include <stdio.h>

void TestPushAndPop() {
    printf("Push an Pop !");
}

Assembly code

  • In Xcode: "Product -> Perform Action -> Assemble PushAndPop.c"

  • It can also be compiled into assembly code with clang:

    clang -S PushAndPop.c
    # For arm64 assembly, use the following command to specify the architecture and the directory containing the system header files. Be sure to change the SDK version in -isysroot to a version that exists in your Xcode!
    clang -S -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS11.1.sdk PushAndPop.c

After removing the irrelevant directives, the corresponding assembly code is as follows:

sub sp, sp, #32             ; allocate 32 bytes of stack space
stp x29, x30, [sp, #16]     ; push: save x29 (fp) and x30 (lr) at sp + 16
add x29, sp, #16            ; update the value of the stack-base register (fp)
adrp x0, l_.str@PAGE        ; get the page address of the 'l_.str' label
add x0, x0, l_.str@PAGEOFF  ; add the offset of the 'l_.str' label within that page
bl _printf                  ; call the printf function
stur w0, [x29, #-4]         ; save the return value of printf at [x29 - 4]
ldp x29, x30, [sp, #16]     ; pop: restore x29 (fp) and x30 (lr)
add sp, sp, #32             ; restore the stack-top register to its value before the function was called
ret                         ; return

In the assembly code above, 32 bytes of stack space are allocated: 16 bytes are used for the push operation (saving x29 and x30), and the remaining 16 bytes are used for storing temporary variables.

Question: the example function declares no temporary variables, so why does it still need to allocate space for them?

Explanation: although the function has no temporary variables of its own, after the call to printf the compiler automatically adds code to handle the function's return value. Because arm64 returns integer values in the x0 register, there is a hidden local variable, int return_value;, which takes up 4 bytes. In addition, because arm64 requires sp to be 16-byte aligned when it is used for addressing, 16 bytes of space are reserved for the temporary variable. See here for details.
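The arithmetic behind the 32-byte frame can be sketched as follows. `align16` and `frame_size` are hypothetical helpers written for this example; real compilers may lay out frames differently, so this only reproduces the reasoning above: 16 bytes for the saved x29/x30 pair, plus the 4-byte temporary rounded up to a multiple of 16.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of 16. */
size_t align16(size_t n) {
    return (n + 15u) & ~(size_t)15u;
}

/* Frame size = bytes for saved registers + local bytes rounded up to 16. */
size_t frame_size(size_t saved_bytes, size_t local_bytes) {
    assert(saved_bytes % 16 == 0);   /* x29 and x30 are saved as one 16-byte pair */
    return saved_bytes + align16(local_bytes);
}
```

For the example function, `frame_size(16, 4)` gives 32, which matches the `sub sp, sp, #32` at the start of the assembly.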

  • The push operation (function entry) proceeds as follows:

  • The pop operation (function exit) proceeds as follows:

    Note: allocating or releasing stack space only subtracts from or adds to the stack pointer; it does not change the contents of stack memory (freed stack space is not reset to 0).
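The push and pop flow can also be followed step by step in a small C model. The `Cpu` type and the `store_pair`, `load_pair`, `prologue` and `epilogue` functions are hypothetical names; the stack is a 64-byte array and sp is an offset into it, which is a simplification of a real stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };

/* Hypothetical CPU state: a tiny stack plus the three registers involved. */
typedef struct {
    uint8_t  stack[STACK_BYTES];
    uint64_t sp;    /* offset of the stack top inside stack[] */
    uint64_t x29;   /* frame pointer */
    uint64_t x30;   /* link register */
} Cpu;

/* Models stp: store two 64-bit values at addr and addr + 8. */
void store_pair(Cpu *c, uint64_t addr, uint64_t a, uint64_t b) {
    assert(addr + 16 <= STACK_BYTES);
    memcpy(c->stack + addr, &a, 8);
    memcpy(c->stack + addr + 8, &b, 8);
}

/* Models ldp: load two 64-bit values from addr and addr + 8. */
void load_pair(const Cpu *c, uint64_t addr, uint64_t *a, uint64_t *b) {
    assert(addr + 16 <= STACK_BYTES);
    memcpy(a, c->stack + addr, 8);
    memcpy(b, c->stack + addr + 8, 8);
}

/* Function entry (push). */
void prologue(Cpu *c) {
    c->sp -= 32;                                  /* sub sp, sp, #32         */
    store_pair(c, c->sp + 16, c->x29, c->x30);    /* stp x29, x30, [sp, #16] */
    c->x29 = c->sp + 16;                          /* add x29, sp, #16        */
}

/* Function exit (pop). */
void epilogue(Cpu *c) {
    load_pair(c, c->sp + 16, &c->x29, &c->x30);   /* ldp x29, x30, [sp, #16] */
    c->sp += 32;                                  /* add sp, sp, #32         */
}
```

After `epilogue` runs, sp, x29 and x30 are back to their values before the call, but the bytes written by `store_pair` are still present in the array, which is the behaviour described in the note above.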
