Notes from recently learning assembly.

First, the development of assembly language

1. Machine language

Machine instructions are made up of zeros and ones.

  • Add: 0100 0000
  • Subtract: 0100 1000
  • Multiply: 1111 0111 1110 0000
  • Divide: 1111 0111 1111 0000

2. Assembly language

Mnemonics are used in place of machine code:

  • Add: INC EAX, which the assembler turns into 0100 0000
  • Subtract: DEC EAX, which the assembler turns into 0100 1000
  • Multiply: MUL EAX, which the assembler turns into 1111 0111 1110 0000
  • Divide: DIV EAX, which the assembler turns into 1111 0111 1111 0000

3. High-level programming languages

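Languages such as C, C++, Java, Objective-C, and Swift are closer to natural human language. For example, in C:

  • Add: A + B, which the compiler turns into machine code such as 0100 0000
  • Subtract: A - B, which the compiler turns into machine code such as 0100 1000
  • Multiply: A * B, which the compiler turns into machine code such as 1111 0111 1110 0000
  • Divide: A / B, which the compiler turns into machine code such as 1111 0111 1111 0000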

On a device, our code goes through the following process:

  • Assembly language and machine language correspond one to one: each machine instruction has a corresponding assembly instruction
  • Assembly language can be assembled into machine language, and machine language can be disassembled back into assembly language
  • High-level languages can be compiled into assembly language or machine language, but assembly language or machine language is almost impossible to restore to the high-level language

Second, the characteristics of assembly language

  • It can directly access and control hardware such as memory and the CPU, making the most of what the hardware can do

  • It gives complete control over the generated binary code, free of the limitations of a compiler

  • The object code is short, occupies little memory, and executes fast

  • Assembly instructions are mnemonics for machine instructions and correspond to them one to one. Each kind of CPU has its own machine instruction set and assembly instruction set, so assembly language is not portable

  • There is a lot to learn: developers need to understand the structure of the CPU and other hardware, and the code is hard to write, debug, and maintain

  • It is case insensitive: for example, mov is the same as MOV

Third, uses of assembly

  • Writing drivers and operating systems (such as some key parts of the Linux kernel)
  • Programs or code snippets that need high performance; these can be mixed with a high-level language (inline assembly)
  • Software security
  • The best starting point and the most efficient way to understand the entire computer system
  • Laying the foundation for writing efficient code
  • Getting to the bottom of how code works
    • What is the nature of a function?
    • How is ++a + ++a + ++a implemented under the hood?
    • What does the compiler really do for us?
    • What key details of DEBUG and RELEASE modes have we overlooked?
    • ...

Fourth, types of assembly language

  • The assembly languages most commonly discussed today are:

    • 8086 assembly (the 8086 processor is a 16-bit CPU)
    • Win32 assembly
    • Win64 assembly
    • ARM assembly (embedded, Mac, iOS)
    • ...
  • The iPhone uses ARM assembly, but the details vary from device to device because the CPU architectures differ.

| Architecture | Devices |
| --- | --- |
| armv6 | iPhone, iPhone2, iPhone3G, first-generation and second-generation iPod Touch |
| armv7 | iPhone3GS, iPhone4, iPhone4S, iPad, iPad2, iPad3 (The New iPad), iPad mini, iPod Touch 3G, iPod Touch4 |
| armv7s | iPhone5, iPhone5C, iPad4 (iPad with Retina Display) |
| arm64 | iPhone5S, iPhoneX, iPad Air, iPad mini2 |

Fifth, some essential background

  • Understand the structure of hardware such as the CPU
  • Understand how an app or program is executed

  • The most important hardware involved is the CPU and memory
  • In assembly, most instructions deal with the CPU and memory

The bus

  • Each CPU chip has a number of pins connected to a bus through which the CPU interacts with external devices
  • Bus: A collection of wires
  • Bus classification
    • The address bus
    • The data bus
    • Control bus

For example

  • The address bus
    • Its width determines the addressing power of the CPU
    • The address bus of the 8086 is 20 bits wide, so its addressing capability is 1M (2^20).

  • The data bus
    • Its width determines the amount of data that the CPU can transmit at a time, that is, the speed at which it can transmit data
    • The data bus of the 8086 is 16 bits wide, so at most 2 bytes of data can be transferred at a time
  • Control bus
    • Its width determines how much control the CPU can have over other devices

For example,

  • A CPU with 8 KB of addressing capacity has an address bus width of 13 (2^N = 8 * 1024, so N = 13)
  • The address bus widths of the 8080, 8088, 80286, and 80386 are 16, 20, 24, and 32 respectively, so their addressing capacities are 64 KB (2^6 KB), 1 MB, 16 MB (2^4 MB), and 4 GB (2^2 GB)
  • The data bus widths of the 8080, 8088, 8086, 80286, and 80386 are 8, 8, 16, 16, and 32 respectively, so the data they can transfer at one time is 1 B, 1 B, 2 B, 2 B, and 4 B
  • To read 1024 bytes of data from memory, the 8086 must read at least 512 times, and the 80386 must read at least 256 times.
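
The arithmetic in these examples can be sketched in C. This is only an illustration: the function names `addressable_units` and `min_transfers` are made up for this note, and the sketch assumes one byte per address and a data bus width that is a whole number of bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of distinct addresses reachable with an
 * address bus of the given width, i.e. 2^width. */
uint64_t addressable_units(unsigned address_bus_width)
{
    assert(address_bus_width < 64); /* keep the shift inside 64 bits */
    return UINT64_C(1) << address_bus_width;
}

/* Hypothetical helper: minimum number of transfers needed to move
 * total_bytes over a data bus that is data_bus_width bits wide. */
uint64_t min_transfers(uint64_t total_bytes, unsigned data_bus_width)
{
    assert(data_bus_width >= 8 && data_bus_width % 8 == 0);
    uint64_t bytes_per_transfer = data_bus_width / 8;
    /* Round up so that a partial last transfer still counts. */
    return (total_bytes + bytes_per_transfer - 1) / bytes_per_transfer;
}
```

With these definitions, `addressable_units(20)` is 1,048,576 (the 1 MB of the 8086), and `min_transfers(1024, 16)` and `min_transfers(1024, 32)` give the 512 and 256 reads mentioned above.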

Memory

  • The size of the memory address space is limited by the width of the CPU address bus. The address bus of the 8086 is 20 bits wide, so it can locate 2^20 different memory units (memory addresses 0x00000 to 0xFFFFF), which makes the memory address space of the 8086 1 MB

  • 0x00000 to 0x9FFFF: main memory. Readable and writable

  • 0xA0000 to 0xBFFFF: video memory. Data written here is output to the display by the video card. Readable and writable

  • 0xC0000 to 0xFFFFF: stores various hardware and system information. Read-only
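
As a rough sketch, the three ranges above can be turned into a lookup in C. The names `Region`, `region_of`, and `is_writable` are hypothetical and exist only for this illustration; the address boundaries are the ones listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three ranges of the 8086 address space. */
typedef enum {
    REGION_MAIN_MEMORY,   /* 0x00000 to 0x9FFFF */
    REGION_VIDEO_MEMORY,  /* 0xA0000 to 0xBFFFF */
    REGION_ROM            /* 0xC0000 to 0xFFFFF */
} Region;

/* Classify a 20-bit address according to the ranges in the text. */
Region region_of(uint32_t address)
{
    assert(address <= 0xFFFFF); /* the 8086 has only 20 address lines */
    if (address <= 0x9FFFF) return REGION_MAIN_MEMORY;
    if (address <= 0xBFFFF) return REGION_VIDEO_MEMORY;
    return REGION_ROM;
}

/* Main memory and video memory are read/write; the last range is read-only. */
bool is_writable(uint32_t address)
{
    return region_of(address) != REGION_ROM;
}
```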

Sixth, number bases

Obstacles to learning number bases

We are used to thinking about other bases in terms of decimal, and we always convert to decimal first when we need to do a calculation. Why convert to decimal? Only because decimal is the base we know best. Every base is a complete system in its own right, so we should forget about decimal and stop converting between bases!

### Definition of a base

  • Octal is made up of 8 symbols: 0, 1, 2, 3, 4, 5, 6, 7
  • Decimal is made up of 10 symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
  • Base N is made up of N symbols, carrying one at every N
##### Consider a question

  • Under what circumstances does 1 + 1 equal 3?

Suppose the decimal system consists of these 10 symbols, in this order: 0, 1, 3, 2, 8, A, B, E, S, 7

If decimal is defined like this, then 1 + 1 = 3, exactly!

What is this good for? A custom decimal notation differs from the traditional one, so unless we share the symbol table, nobody else can recover our actual data. This can be used for encryption!

Decimal is made up of ten symbols and carries one at every ten, but the symbols themselves can be customized!
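
A minimal sketch of this idea in C, assuming the symbol table from the example above. The function name `encode_custom_decimal` and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write `value` in base 10, but spell each digit
 * with the caller's own symbol table (table[d] is the symbol for digit d).
 * `out` receives a NUL-terminated string of at most out_size bytes. */
void encode_custom_decimal(unsigned value, const char table[10],
                           char *out, size_t out_size)
{
    char reversed[32];
    size_t n = 0;

    /* Peel off digits from the least significant end. */
    do {
        reversed[n++] = table[value % 10];
        value /= 10;
    } while (value != 0);

    assert(n + 1 <= out_size); /* room for the digits plus the terminator */
    for (size_t i = 0; i < n; ++i)
        out[i] = reversed[n - 1 - i];
    out[n] = '\0';
}
```

With the table `"01328ABES7"`, encoding the value of 1 + 1 yields the string `"3"`, because the symbol at position 2 of the table is `3`. Someone who does not know the table cannot tell which quantity the written symbol stands for.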

### Shorthand for binary

Binary: 1011 1011 1100
Grouped by three bits: 101 110 111 100, which in octal is 5 6 7 4
Grouped by four bits: 1011 1011 1100, which in hexadecimal is B B C

Binary, counting from 0 to 1111: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

Writing binary this way is too cumbersome, so each group is replaced with a simpler symbol: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. That is hexadecimal.
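
The grouping rule can be sketched in C as follows. `group_bits` is a hypothetical helper written for this note; it assumes the input contains at most 128 binary digits and that the caller provides a large enough output buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turn a string of '0'/'1' characters (spaces are
 * ignored) into digits of base 2^group by reading the bits in groups,
 * counted from the right. group = 3 gives octal, group = 4 gives hex.
 * `out` must have room for one character per input bit plus a NUL. */
void group_bits(const char *bits, int group, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    char clean[128];
    size_t n = 0;

    assert(group >= 1 && group <= 4);
    for (const char *p = bits; *p != '\0'; ++p) {
        if (*p == '0' || *p == '1') {
            assert(n < sizeof clean);
            clean[n++] = *p;
        }
    }
    if (n == 0) {               /* no bits at all: treat as zero */
        out[0] = '0';
        out[1] = '\0';
        return;
    }

    size_t g = (size_t)group;
    size_t len = n % g;         /* the leftmost group may be shorter */
    if (len == 0) len = g;

    size_t pos = 0, written = 0;
    while (pos < n) {
        int value = 0;
        for (size_t i = 0; i < len; ++i)
            value = value * 2 + (clean[pos + i] - '0');
        out[written++] = digits[value];
        pos += len;
        len = g;                /* every later group is full width */
    }
    out[written] = '\0';
}
```

Calling it on `"1011 1011 1100"` with a group size of 3 produces `"5674"`, and with a group size of 4 produces `"BBC"`, matching the example above.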

Seventh, data width

Numbers in mathematics have no size limit and can be infinitely large. In a computer, however, hardware constraints limit the length of data (we call this the data width), and any bits beyond the maximum width are discarded.

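#import <UIKit/UIKit.h>
#import "AppDelegate.h"

// 0x1FFFFFFFF needs 33 bits, but an int is only 32 bits wide,
// so the highest bit is discarded (the compiler warns about this).
int test(void) {
    int cTemp = 0x1FFFFFFFF;
    return cTemp;
}

int main(int argc, char * argv[]) {
    printf("%x\n", test()); // prints ffffffff
    @autoreleasepool {
        return UIApplicationMain(argc, argv, nil, NSStringFromClass([AppDelegate class]));
    }
}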
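
The same effect can be shown in plain C without the compiler doing the truncation implicitly. This is a minimal sketch; `keep_low_bits` is a hypothetical name, and it simply masks off everything above the chosen width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the lowest `width` bits of a value,
 * which is what happens when data is stored in a container that is
 * only `width` bits wide. */
uint64_t keep_low_bits(uint64_t value, unsigned width)
{
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return value;           /* nothing to discard */
    return value & ((UINT64_C(1) << width) - 1);
}
```

For example, `keep_low_bits(0x1FFFFFFFF, 32)` gives `0xFFFFFFFF`: the 33rd bit does not fit and is dropped.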

Common data widths in computers

  • Bit: a bit is one binary digit, 0 or 1
  • Byte: a byte consists of 8 bits. The byte is the smallest unit of memory.
  • Word: a word consists of two bytes (16 bits), which are called the high byte and the low byte.
  • Doubleword: a doubleword consists of two words (32 bits).
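
To make the relationship between these widths concrete, here is a small C sketch that takes a doubleword apart. The helper names are invented for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: split a 32-bit doubleword into 16-bit words. */
uint16_t high_word(uint32_t dword) { return (uint16_t)(dword >> 16); }
uint16_t low_word(uint32_t dword)  { return (uint16_t)(dword & 0xFFFF); }

/* Byte `index` of a doubleword, where 0 is the lowest byte and 3 the highest. */
uint8_t byte_at(uint32_t dword, unsigned index)
{
    assert(index < 4); /* a doubleword holds exactly four bytes */
    return (uint8_t)(dword >> (8 * index));
}
```

For the doubleword `0x12345678`, the high word is `0x1234`, the low word is `0x5678`, and the four bytes from lowest to highest are `0x78`, `0x56`, `0x34`, `0x12`.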

When a computer stores data, it distinguishes between signed and unsigned numbers. Taking 4 bits as an example:

Unsigned numbers: converted directly!
Signed numbers:
Positive numbers: 0 1 2 3 4 5 6 7
Negative numbers: F E D C B A 9 8 stand for -1 -2 -3 -4 -5 -6 -7 -8
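
The signed reading of a bit pattern can be sketched in C. `as_signed` is a hypothetical helper; it assumes the usual two's complement interpretation, which is what the mapping above describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpret the lowest `width` bits of `raw` as a
 * signed (two's complement) number. Patterns whose top bit is set count
 * downwards from -1. */
int64_t as_signed(uint64_t raw, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint64_t modulus = UINT64_C(1) << width;   /* 2^width */
    raw &= modulus - 1;                        /* discard bits beyond the width */
    if (raw >= modulus / 2)                    /* top bit set: negative */
        return (int64_t)raw - (int64_t)modulus;
    return (int64_t)raw;
}
```

With a width of 4, `0x7` stays 7, `0xF` becomes -1, and `0x8` becomes -8.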

Eighth, registers

The components inside the CPU are connected by buses.

The CPU computes very fast. For performance, the CPU sets aside a small temporary storage area and copies data from memory into it before operating on the data. We call this small temporary storage area a register.

  • For programmers, the most important parts of the CPU are the registers: the CPU is controlled by changing the contents of its registers
  • The number and structure of registers differ from CPU to CPU

General-purpose registers

  • ARM64 has 31 64-bit general-purpose registers, x0 through x30. They are usually used to hold ordinary data, which is why they are called general-purpose registers (although some of them also have special uses).
    • w0 through w30 are 32 bits wide. A 64-bit CPU is compatible with 32-bit code, so you can use just the lower 32 bits of a 64-bit register.
    • For example, w0 is the lower 32 bits of x0!

  • Typically, the CPU first loads data from memory into a general-purpose register and then operates on the data in that register
  • Suppose a red block of memory holds the value 3, and we want to add 1 to it and store the result in a blue block of memory

  • The CPU first puts the value of the red memory into register x0: mov x0, red memory
  • Then it adds 1 to register x0: add x0, 1
  • Finally, it stores the value into the blue memory: mov blue memory, x0
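
The register points above can be modelled in C. This is only a toy model: `RegisterFile`, `read_w`, and `increment_via_x0` are names invented for this note, and the red and blue memory blocks are represented by two ordinary variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the 31 general-purpose registers x0..x30. */
typedef struct {
    uint64_t x[31];
} RegisterFile;

/* wN is simply the lower 32 bits of xN. */
uint32_t read_w(const RegisterFile *regs, unsigned index)
{
    assert(regs != NULL && index <= 30);
    return (uint32_t)regs->x[index];
}

/* The three steps from the text, with x0 as the working register. */
void increment_via_x0(RegisterFile *regs, const uint64_t *red, uint64_t *blue)
{
    assert(regs != NULL && red != NULL && blue != NULL);
    regs->x[0] = *red;      /* 1. copy the red memory value into x0 */
    regs->x[0] += 1;        /* 2. add 1 to x0 */
    *blue = regs->x[0];     /* 3. store x0 into the blue memory */
}
```

Starting with 3 in the red variable, the blue variable ends up holding 4 while the red one is unchanged. If x0 holds `0x1122334455667788`, reading w0 gives `0x55667788`.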

PC register (program counter)

  • This is the instruction pointer register: it holds the address of the instruction that the CPU is currently reading
  • In memory or on disk, instructions and data are indistinguishable: both are just binary information
  • While it works, the CPU treats some information as instructions and some as data, giving the same information different meanings
    • For example, 1110 0000 0000 0011 0000 1000 1010 1010
    • Can be regarded as the data 0xE00308AA
    • Can also be regarded as the instruction mov x0, x8
  • On what basis does the CPU interpret information in memory as instructions?
    • The CPU treats the contents of the memory cell that the PC points to as an instruction
    • If something in memory has been executed by the CPU, the memory cell it lives in must have been pointed to by the PC
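
To illustrate how the same four bytes can be read either way, here is a small C sketch. It assumes the bit string above is the byte sequence E0 03 08 AA as it sits in memory, and that instructions are fetched as little-endian 32-bit words, which gives the word 0xAA0803E0. The function names are hypothetical, and the decoder recognises only the plain register form of mov (an alias of orr xd, xzr, xm), nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four bytes in memory order into a little-endian 32-bit word. */
uint32_t load_le32(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* Hypothetical decoder for "mov xd, xm" only.
 * Fixed bits: 31..21 (opcode), 15..10 (shift amount 0), 9..5 (xzr as source).
 * Variable bits: 20..16 hold m, 4..0 hold d. */
bool decode_mov_x(uint32_t word, unsigned *rd, unsigned *rm)
{
    assert(rd != NULL && rm != NULL);
    if ((word & 0xFFE0FFE0u) != 0xAA0003E0u)
        return false;                 /* some other instruction, or plain data */
    *rd = word & 0x1F;
    *rm = (word >> 16) & 0x1F;
    return true;
}
```

Feeding the bytes E0 03 08 AA through `load_le32` and then `decode_mov_x` reports destination register 0 and source register 8, that is, mov x0, x8. Whether those bytes are ever decoded like this depends only on whether the PC points at them.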

Data and address registers

Data and address registers are usually used for temporary storage, accumulation, counting, holding addresses, and similar tasks during computation. Their main purpose is to hold the operands of CPU instructions, so they act like ordinary variables inside the CPU. In ARM64 they are:

  • 64-bit: X0-X30, XZR (zero register)
  • 32-bit: W0-W30, WZR (zero register)

Note: 8086 assembly has a special kind of register, the segment registers. The four registers CS, DS, SS, and ES hold the base addresses of the segments. They belong to Intel-architecture CPUs and do not exist in ARM.

Floating-point and vector registers

Because floating-point numbers are stored and operated on in a special way, the CPU provides dedicated floating-point registers for them.

  • Floating-point registers: 64-bit: D0-D31, 32-bit: S0-S31

Current CPUs support vector operations (which are used heavily in graphics processing), and to support them the system also provides a number of vector registers.

  • Vector registers: 128-bit: V0-V31

SP and FP registers

  • The SP register holds the address of the top of the stack at any given moment.
  • The FP register, also known as the X29 register, is a general-purpose register, but at certain times we use it to hold the address of the bottom of the stack!

A stack is a storage area with a special access pattern: last in, first out (LIFO).

Note: starting with ARM64, the 32-bit LDM, STM, PUSH, and POP instructions are gone; LDR/LDP and STR/STP are used instead.

Stack operations in ARM64 are 16-byte aligned!
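
A minimal sketch of what the alignment rule means for SP, written in C. `reserve_stack` is a hypothetical helper, and the sketch assumes the stack grows toward lower addresses, so reserving space means subtracting from SP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: new SP after reserving at least `bytes` of stack
 * space while keeping SP a multiple of 16. */
uint64_t reserve_stack(uint64_t sp, uint64_t bytes)
{
    assert((sp & 15) == 0);                           /* SP starts out aligned */
    uint64_t rounded = (bytes + 15) & ~UINT64_C(15);  /* round up to 16 */
    return sp - rounded;
}
```

Starting from `0x1000`, asking for 8 bytes or 16 bytes both move SP to `0xFF0`, while asking for 17 bytes moves it to `0xFE0`. This is why a function that only needs one 8-byte slot still moves SP by 16.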

STR (store register) instruction

Reads data out of a register and stores it in memory.

LDR (load register) instruction

Reads data from memory and stores it in a register.

The LDP and STP variants of LDR and STR operate on two registers at a time.

Status register

Inside the CPU there is a special kind of register (its number and structure may vary from processor to processor). In ARM it is called the status register (CPSR). Other registers are used to store data, and the register as a whole carries a single meaning. The CPSR instead works bit by bit: each bit has its own special meaning and records specific information.

Note: the CPSR register is 32 bits wide.

The lower 8 bits of the CPSR (including I, F, T, and M[4:0]) are called the control bits and cannot be modified by a program unless the CPU is running in a privileged mode!

N, Z, C, and V are the condition code flags. Their contents are changed by the results of arithmetic or logical operations, and they can determine whether an instruction is executed. They are very important!

  • N (Negative)

Bit 31 of the CPSR is N, the sign flag. It records whether the result of the relevant instruction is negative. If the result is negative, N = 1; if it is non-negative, N = 0.

Note that in the ARM64 instruction set, the instructions that affect the status register, such as adds and subs, are mostly instructions that perform arithmetic or logical operations.

  • Z (Zero)

Bit 30 of the CPSR is Z, the zero flag. It records whether the result of the relevant instruction is 0. If the result is 0, Z = 1; if it is not 0, Z = 0.

One way to remember the value of Z: Z marks whether the result of the relevant instruction is 0. If it is 0, Z should record the affirmative answer "yes". In computers, 1 means logical true, so when the result is 0, Z = 1 means "the result is 0". If the result is not 0, Z records the negative answer "not 0". In computers, 0 means logical false, so when the result is not 0, Z = 0 means "the result is not 0".
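
The way N and Z follow from a result can be sketched in C. The names `cpsr_bit` and `update_nz` are invented for this note; the bit positions 31 and 30 are the ones given above, and the result is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of the two flags described above. */
enum { CPSR_N_BIT = 31, CPSR_Z_BIT = 30 };

/* Read a single bit of a 32-bit status value. */
unsigned cpsr_bit(uint32_t cpsr, unsigned position)
{
    assert(position < 32);
    return (cpsr >> position) & 1u;
}

/* Hypothetical helper: return `cpsr` with N and Z updated for a 32-bit
 * result, leaving every other bit untouched. */
uint32_t update_nz(uint32_t cpsr, uint32_t result)
{
    cpsr &= ~((UINT32_C(1) << CPSR_N_BIT) | (UINT32_C(1) << CPSR_Z_BIT));
    if (result & UINT32_C(0x80000000))      /* top bit set: negative */
        cpsr |= UINT32_C(1) << CPSR_N_BIT;
    if (result == 0)                        /* result is zero */
        cpsr |= UINT32_C(1) << CPSR_Z_BIT;
    return cpsr;
}
```

A result of 0 sets Z and clears N; a result such as `0xFFFFFF01` sets N and clears Z.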

  • C (Carry)

Bit 29 of the CPSR is C, the carry flag. It generally applies to unsigned arithmetic. Addition: C = 1 if the operation produces a carry (unsigned overflow), otherwise C = 0. Subtraction (including CMP): C = 0 if the operation needs a borrow (unsigned overflow), otherwise C = 1.

For an N-bit unsigned number, the highest bit of its binary representation, bit N-1, is its most significant bit, and the imaginary bit N is the next higher bit relative to the most significant bit.

Carry

When two pieces of data are added, a carry from the most significant bit into a higher bit may be produced. For example, adding two 32-bit values, 0xAAAAAAAA + 0xAAAAAAAA, produces a carry. Since the carry cannot be stored in 32 bits, we usually just say that the carry is lost. In fact, the CPU does not discard the carry but records it in a bit of a special register. ARM uses the C bit to record the carry. For example, take the following instructions:

mov w0, #0xaaaaaaaa ; the binary of 0xA is 1010
adds w0, w0, w0 ; equivalent to 1010 << 1: carry 1 (unsigned overflow), so C = 1
adds w0, w0, w0 ; equivalent to 0101 << 1: carry 0 (no unsigned overflow), so C = 0
adds w0, w0, w0 ; the same pattern repeats
adds w0, w0, w0
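
The carry rule can be checked with a short C sketch. `adds32` is a hypothetical function that mimics only the result and the C flag of a 32-bit adds; it does the addition in 64 bits so that the carry out of bit 31 is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit add that also reports the carry flag.
 * *carry is set to 1 when the true sum does not fit in 32 bits. */
uint32_t adds32(uint32_t a, uint32_t b, int *carry)
{
    assert(carry != NULL);
    uint64_t wide = (uint64_t)a + (uint64_t)b;  /* cannot overflow 64 bits */
    *carry = (int)((wide >> 32) & 1);           /* bit 32 is the carry out */
    return (uint32_t)wide;                      /* keep the low 32 bits */
}
```

Adding `0xAAAAAAAA` to itself gives `0x55555554` with a carry of 1; adding `0x55555554` to itself gives `0xAAAAAAA8` with a carry of 0, which is the alternating pattern in the instructions above.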

Borrow

When one number is subtracted from another, a borrow from a higher bit may be needed. For example, subtracting two 32-bit values, 0x00000000 - 0x000000FF, needs a borrow. After borrowing, the calculation is equivalent to 0x100000000 - 0x000000FF, which gives 0xFFFFFF01. Because a bit was borrowed, the C bit marks the borrow: C = 0. For example:

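mov w0, #0x0
subs w0, w0, #0xff ; 0x0 - 0xff needs a borrow, so C = 0 and w0 = 0xffffff01
subs w0, w0, #0xff ; 0xffffff01 - 0xff needs no borrow, so C = 1
subs w0, w0, #0xff ; again no borrow, so C = 1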
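
The borrow rule can be sketched in C in the same spirit. `subs32` is a hypothetical function that mimics only the result and the C flag of a 32-bit subs, following the convention stated above that C = 0 means a borrow happened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit subtract that also reports the carry flag.
 * *carry is 1 when no borrow was needed (a >= b) and 0 when one was. */
uint32_t subs32(uint32_t a, uint32_t b, int *carry)
{
    assert(carry != NULL);
    *carry = (a >= b) ? 1 : 0;
    return a - b;               /* unsigned arithmetic wraps around modulo 2^32 */
}
```

Subtracting `0xFF` from 0 gives `0xFFFFFF01` with C = 0; subtracting `0xFF` again gives `0xFFFFFE02` with C = 1.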
  • V (Overflow)

Bit 28 of the CPSR is V, the overflow flag. When an operation on signed numbers produces a result outside the range the machine can represent, it is called an overflow.

  • Positive + positive giving a negative result is an overflow
  • Negative + negative giving a positive result is an overflow
  • Positive + negative can never overflow
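
The three rules above can be condensed into one bit test, sketched here in C. `adds32_overflow` is a hypothetical function; it works on the raw 32-bit patterns and treats bit 31 as the sign bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the V flag for a 32-bit addition.
 * Overflow happens exactly when both operands have the same sign bit
 * and the sum has the opposite one. */
uint32_t adds32_overflow(uint32_t a, uint32_t b, int *overflow)
{
    assert(overflow != NULL);
    uint32_t sum = a + b;       /* wraps around modulo 2^32 */
    /* (a ^ sum) has bit 31 set if a and sum differ in sign; same for b. */
    *overflow = (int)(((a ^ sum) & (b ^ sum)) >> 31);
    return sum;
}
```

Adding `0x7FFFFFFF` and 1 (positive + positive) sets V, adding `0x80000000` to itself (negative + negative) sets V, and adding `0x7FFFFFFF` to `0x80000000` (positive + negative) leaves V clear.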

Cache

The A11 ARM processor in the iPhoneX has a 64 KB level 1 cache and an 8 MB level 2 cache.

Before executing an instruction, the CPU reads it from memory into the CPU. Registers are much faster than memory reads and writes, so for performance the CPU also integrates a cache. When a program runs, the code and data about to be used are first copied into the cache (this is done by the operating system), and the CPU then reads the instructions from the cache to execute them.