Preface
As programmers, most of us have run into assembly at some point while learning or working, but we rarely stop to ask what assembly actually is, or what mastering it lets us do. Simply put, assembling is the process of translating assembly language into machine language that a computer can execute. Learning assembly lets you do reverse engineering, and it also helps you understand how a computer works. An important part of reverse analysis is static analysis. An app can run on a phone because the phone holds its executable, which is essentially a binary file. What the phone executes is binary, and static analysis works on that binary. To analyze binaries, you need to know assembly.
The development of assembly language
- What is machine language?
Machine language is made up of machine instructions, which are just zeros and ones. For example:
Add: 0100 0000
Subtract: 0100 1000
Multiply: 1111 0111 1110 0000
Divide: 1111 0111 1111 0000
Assembly language uses mnemonics in place of machine instructions. The same four operations look like this:
Add: INC EAX (increment by one), assembled to 0100 0000
Subtract: DEC EAX (decrement by one), assembled to 0100 1000
Multiply: MUL EAX, assembled to 1111 0111 1110 0000
Divide: DIV EAX, assembled to 1111 0111 1111 0000
How is the code we write every day translated on the device? The figure above gives an intuitive view of how code is transformed. Note that assembly language corresponds to machine language: every machine instruction has a corresponding assembly instruction. Assembly language can be assembled into machine language, and machine language can be disassembled back into assembly language. A high-level language can be compiled into assembly language and then machine language, but assembly or machine language is almost impossible to restore to the original high-level source.
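The one-to-one mapping between mnemonics and machine instructions can be sketched as a lookup table. This is only a toy built from the four encodings listed above, not a real assembler: the names `ENCODINGS`, `assemble`, `disassemble`, and `to_bits` are hypothetical, and real x86 encoding has many more rules.

```python
# Toy table built only from the four examples in the text.
# Real instruction encoding is far richer; this is an illustration.
ENCODINGS = {
    "INC EAX": bytes([0b0100_0000]),
    "DEC EAX": bytes([0b0100_1000]),
    "MUL EAX": bytes([0b1111_0111, 0b1110_0000]),
    "DIV EAX": bytes([0b1111_0111, 0b1111_0000]),
}
# Reverse table: machine code back to its mnemonic.
DECODINGS = {code: asm for asm, code in ENCODINGS.items()}


def assemble(line: str) -> bytes:
    """Translate one mnemonic line into machine-code bytes (hypothetical helper)."""
    # Normalize case and whitespace before the lookup.
    return ENCODINGS[" ".join(line.upper().split())]


def disassemble(code: bytes) -> str:
    """Translate machine-code bytes back into the mnemonic (hypothetical helper)."""
    return DECODINGS[bytes(code)]


def to_bits(code: bytes) -> str:
    """Render bytes as groups of four bits, matching the notation in the text."""
    bits = "".join(f"{b:08b}" for b in code)
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


if __name__ == "__main__":
    for asm in ENCODINGS:
        code = assemble(asm)
        print(f"{asm} -> {to_bits(code)} -> {disassemble(code)}")
```

Because the table is a plain bijection, going from mnemonic to bits and back loses nothing. That is the sense in which disassembly is always possible, while recovering high-level source is not.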
- Features of assembly language
Assembly language can directly access and control hardware such as memory and the CPU, so it can get the most out of the hardware. The object code is short, uses little memory, and executes quickly. Assembly instructions are mnemonics for machine instructions and correspond to them one to one. Each kind of CPU has its own machine instruction set and therefore its own assembly instruction set, so assembly language is not portable. Mnemonics are case insensitive; for example, mov is the same as MOV.
The more common assembly languages today are 8086 assembly, Win32 assembly, Win64 assembly, and ARM assembly (embedded, Mac, iOS).
The iPhone uses ARM assembly, but devices still differ from one another because they have different CPU architectures:
armv6: iPhone, iPhone2, iPhone3G, iPod Touch
armv7: iPhone3GS, iPhone4, iPhone4S, iPad, iPad2, iPad3 (The New iPad), iPad mini, iPod Touch 3G, iPod Touch4
armv7s: iPhone5, iPhone5C, iPad4 (iPad with Retina Display)
arm64: iPhone5S and later, iPhoneX, iPad Air, iPad Mini2 and later
Each CPU architecture has its own instruction set, so the same code assembles to different instructions on different architectures.
How does an application execute on a phone or computer? The picture above gives a basic feel for how a program runs. The most important piece of hardware is the CPU, and most assembly instructions deal with the CPU and memory.
When we talk about 32-bit and 64-bit systems, the difference is in data throughput, which is tied to the data bus. A CPU works by sending electrical signals outward. The CPU in a 32-bit system moves 4 bytes at a time, and the CPU in a 64-bit system moves 8 bytes at a time.
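As a rough illustration of the throughput difference, the sketch below counts how many transfers a bus needs to move a block of data. It assumes an idealized model in which every transfer moves exactly one bus width of data. The names `bytes_per_transfer` and `transfers_needed` are hypothetical and do not come from any real API.

```python
def bytes_per_transfer(bus_width_bits: int) -> int:
    """Bytes moved in one transfer: 32 bits -> 4 bytes, 64 bits -> 8 bytes."""
    return bus_width_bits // 8


def transfers_needed(total_bytes: int, bus_width_bits: int) -> int:
    """Transfers required to move total_bytes (idealized model, an assumption)."""
    step = bytes_per_transfer(bus_width_bits)
    # Ceiling division: a partial final chunk still costs one transfer.
    return -(-total_bytes // step)


if __name__ == "__main__":
    for width in (32, 64):
        print(
            f"{width}-bit: {bytes_per_transfer(width)} bytes per transfer, "
            f"{transfers_needed(1024, width)} transfers for 1024 bytes"
        )
```

Under this simplified model, doubling the width halves the number of transfers for the same amount of data.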
So what is the data bus? A bus is simply a collection of wires. Every CPU chip has many pins, and these pins are connected to the bus. The CPU interacts with external devices through the bus. Buses fall into three categories: the address bus, the data bus, and the control bus.
The address bus consists of multiple lines, and the number of lines is the width of the bus. The width determines the addressing capability: N address lines can select 2^N distinct addresses.
During program execution we often come across the term image file (image). It simply refers to an executable file that has been loaded from disk into memory.
The 8086 has an address bus width of 20, so its addressing capability is 1M (2^20). The 8080 has an address bus width of 16, so its addressing capability is 64K (2^16).
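The addressing figures above follow from a single rule: an address bus with N lines can select 2^N distinct addresses. A minimal sketch of that arithmetic, with the hypothetical helper names `addressable_units` and `human_readable`:

```python
def addressable_units(address_bus_width: int) -> int:
    """Number of distinct addresses selectable with the given number of lines."""
    return 1 << address_bus_width  # 2 ** width


def human_readable(count: int) -> str:
    """Format a count using the K (2^10) and M (2^20) units used in the text."""
    for unit, size in (("M", 1 << 20), ("K", 1 << 10)):
        if count >= size and count % size == 0:
            return f"{count // size}{unit}"
    return str(count)


if __name__ == "__main__":
    for chip, width in (("8086", 20), ("8080", 16)):
        print(f"{chip}: {width} address lines -> {human_readable(addressable_units(width))}")
```

Each extra address line doubles the addressable range, which is why 4 more lines take the 8086 from 64K to 1M.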
Well, that is our starting point for assembly, and we'll pick it up from here next time.