iOS underlying principles + reverse engineering: article summary
This article focuses on the Mach-O file format and universal binaries
Overview of Mach-O files
Mach-O is short for Mach Object file format. It is the executable file format on macOS and iOS, comparable to PE (Portable Executable) on Windows and ELF (Executable and Linkable Format) on Linux.
Mach-O is a file format for executables, object code, and dynamic libraries. As a replacement for the a.out format, Mach-O offers greater extensibility.
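A quick way to see what "being a Mach-O file" means in practice is to look at the first bytes of a file. The sketch below is a rough stand-in for what the file command does, not its real implementation: it reads the magic number and maps it to a description. The magic values are the ones defined in Apple's <mach-o/loader.h> and <mach-o/fat.h>; the function names classify and classify_file are made up for this example.

```python
import struct

# Magic numbers from <mach-o/loader.h> and <mach-o/fat.h>, as they appear
# when the first four bytes of the file are read as a big-endian integer.
MAGICS = {
    0xFEEDFACE: "Mach-O, 32-bit, big-endian",
    0xCEFAEDFE: "Mach-O, 32-bit, little-endian",
    0xFEEDFACF: "Mach-O, 64-bit, big-endian",
    0xCFFAEDFE: "Mach-O, 64-bit, little-endian",
    # Note: Java class files start with the same four bytes.
    0xCAFEBABE: "universal (fat) binary",
}


def classify(first_bytes: bytes) -> str:
    """Return a rough description of a file given its first bytes."""
    # Static libraries (.a) are ar archives that wrap Mach-O object files.
    if first_bytes.startswith(b"!<arch>\n"):
        return "ar archive (static library)"
    if len(first_bytes) < 4:
        return "unknown"
    (magic,) = struct.unpack(">I", first_bytes[:4])
    return MAGICS.get(magic, "unknown")


def classify_file(path: str) -> str:
    """Classify the file at `path` by reading only its first 8 bytes."""
    with open(path, "rb") as f:
        return classify(f.read(8))
```

On a Mac, classify_file("/usr/lib/dyld") should report either a thin Mach-O or a universal binary, depending on the system.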
Mach-O file formats
There are several common kinds of Mach-O file:
1. Object files (.o)
2. Library files, mainly of the following types: .a, .dylib, .framework
3. Executable files
4. dyld
You can check a file's type with the file command in the terminal:
- $ file <file path>
1. Object files (.o)
- 1. Create a C file, test.c
#include <stdio.h>
int main(){
return 0;
}
- 2. Compile the C file with the clang command and inspect the results:
– clang -c test.c: compiles the .c file into a .o file (Objective-C is compiled with clang as the front end and LLVM as the back end; clang is part of the LLVM project)
– file test.o: checks the file type, which is a 64-bit Mach-O object file for the x86_64 architecture
– clang test.o: links the .o file into an executable (named a.out by default)
– ./a.out: runs the executable
- 3. A C file can also be compiled straight into an executable with a single command:
clang -o test2 test.c
- 4. Repeat step 3 to generate another executable, test3. This raises a question: are the three executables (a.out, test2, test3) identical? We can check with md5: identical files produce identical hash values, and different files produce different ones.
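The md5 comparison above can also be scripted. This is a minimal sketch using Python's standard hashlib module; the helper names md5_of and all_identical are invented for this example, and the file names in the usage note are the ones from the steps above.

```python
import hashlib


def md5_of(path, chunk_size=1 << 16):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths):
    """True if every file in `paths` has the same MD5 digest."""
    return len({md5_of(p) for p in paths}) <= 1
```

For the three executables, all_identical(["a.out", "test2", "test3"]) answers the question in one call.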
The intermediate .o file
Between the .c source file and the a.out executable there is an intermediate .o file. In real projects there are usually multiple source files, so the final executable is built from several of them. For example, two .o files can be linked into one executable:
- Generate the .o files: clang -c test1.c test.c
- Link the .o files into an executable with clang: clang -o demo1 test1.o test.o
- What if the link order is changed? For example:
clang -o demo2 test.o test1.o
- Generate the executable from multiple source files in one step:
clang -o demo test1.c test.c
- Are the three executables generated above identical? Again, compare their md5 hash values.
Conclusion: the comparison shows that changing the link order of the .o files changes the resulting Mach-O file.
- You can use objdump to see the link order inside a Mach-O file, for example:
– objdump --macho -d demo and objdump --macho -d demo1: in demo and demo1 the source files are linked in the order test1 + main
– objdump --macho -d demo2: in demo2 the source files are linked in the order main + test1
The same applies in Xcode under Target -> Build Phases -> Compile Sources: if the order of the source files is changed, the generated executable is different.
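Why the order matters can be shown with a deliberately simplified model. The sketch below is a toy, not how a real linker works: it assumes the "linker" just lays the object code out in the order it is given, and the two byte strings are hypothetical stand-ins for the contents of test1.o and test.o.

```python
import hashlib


def toy_link(objects):
    """Toy 'linker': lay the object blobs out in the order they are given."""
    return b"".join(objects)


# Hypothetical stand-ins for the machine code in test1.o and test.o.
test1_o = b"\x55\x48\x89\xe5_test1"
test_o = b"\x55\x48\x89\xe5_main"

demo1 = toy_link([test1_o, test_o])  # compare: clang -o demo1 test1.o test.o
demo2 = toy_link([test_o, test1_o])  # compare: clang -o demo2 test.o test1.o

# Same inputs, same total size, but a different layout and so a different hash.
print(len(demo1) == len(demo2))                                          # True
print(hashlib.md5(demo1).hexdigest() == hashlib.md5(demo2).hexdigest())  # False
```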
2. Library files
A library is essentially a collection of program code, N files organized together, and is a way of sharing code
Classification of libraries
Open source libraries (public libraries): the source code is public and you can see the implementation of each file, such as AFNetworking and SDWebImage on GitHub
Closed source libraries (private libraries): the source code is not public; what you get is a compiled binary, so the concrete implementation is not visible. These are further divided into static libraries and dynamic libraries
Static library & dynamic library
Static libraries come in two main forms, .a and .framework (.framework is recommended):
1. A .a file is a pure binary; it cannot be used on its own and needs its .h header files
2. A .framework contains resource files in addition to the binary, and can be used directly
.a + .h + sourceFile = .framework
Advantages: code is easy to share and easy to use. Typical uses:
- To modularize an iOS project, stable business modules are built as static libraries
- To share code with others without exposing its implementation
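Dynamic libraries also come in two forms: .dylib and .framework
Why can a .framework be either a static or a dynamic library?
1. The system's .framework bundles are dynamic libraries
2. A custom .framework is typically a static library (since iOS 8 a custom framework can also be built as a dynamic library embedded in the app)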
The difference between static and dynamic libraries:
1. A static library is copied in full into the executable at link time. If multiple apps use the same static library, each one carries its own copy, which wastes memory
2. A dynamic library exists as a single copy that is loaded into memory dynamically while the program runs. Multiple apps share that one copy, which saves memory
Reference links:
- iOS static libraries: .a and .framework
Verify that .a and .dylib are Mach-O files
Verifying .a
Find .a files: find /usr -name "*.a"
Inspect any one of them: file <.a file path>. This kind of file is an ar archive, that is, a collection of Mach-O object files
Verifying .dylib
Find .dylib files: find /usr -name "*.dylib"
Inspect a .dylib file: file <.dylib file path>
3. Executable files
The executable here is the one produced when an everyday project is compiled. You can check its type with the file command
4. dyld
dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. After the kernel has finished preparing the program, dyld is responsible for the rest of the work. It is also open source: anyone can download the source code from Apple's official website to study how it works and how the system loads dynamic libraries.
Shared cache mechanism
In iOS, the dynamic libraries that each program depends on must be loaded into memory one by one by dyld (located at /usr/lib/dyld). Many system libraries, however, are used by almost every program, and loading them again for every program at launch would be slow. To speed up launch and improve application performance, the shared cache mechanism was introduced: all the default dynamic libraries are merged into one large cache file in the /System/Library/Caches/com.apple.dyld/ directory, stored separately for each architecture.
Verify dyld
- Look for dyld on the Mac
– Enter the directory: cd /usr/lib
– List dyld: ls dyld
- Check the file type of dyld: file dyld. It is a dynamic linker, and is itself a Mach-O file
5. dSYM files
Introduction to dSYM
When Xcode compiles a project, it also produces a dSYM file with the same name. A dSYM is an intermediate file that stores the mapping between hexadecimal addresses and function information. The symbols we debug with are contained in this file, and a new dSYM file is generated every time the project is compiled. It is located under /Users/<user name>/Library/Developer/Xcode/Archives, so it is important to keep the corresponding Archives for every release.
When the software is packaged in release mode or goes live, a crash no longer shows the error as directly as it does in Xcode, so the crash report has to be analyzed. An iOS device keeps log files that record the memory address of the function in which each application failed, and Xcode's Organizer can export these device logs from the device as crash files. Using the address of the failing function, we can then look up the corresponding function name and file name in the dSYM file. The precondition is having the dSYM file for exactly that version of the software, which is why the Archives for every release must be kept.
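The lookup described above, from a crash address to a function name and file name, is at heart a search in a sorted table. The following is only a sketch of that idea: a real dSYM stores its data as DWARF debug information inside a Mach-O file, while the SYMBOLS table, its addresses and the symbolicate function here are all hypothetical.

```python
import bisect

# Hypothetical symbol table: (start address, function name, source file),
# sorted by start address. A real dSYM holds this as DWARF debug info.
SYMBOLS = [
    (0x1000, "main", "main.m"),
    (0x1040, "-[ViewController viewDidLoad]", "ViewController.m"),
    (0x10C0, "-[ViewController crashTapped:]", "ViewController.m"),
]
STARTS = [start for start, _, _ in SYMBOLS]


def symbolicate(address, load_address=0):
    """Map a crash address to (function, file, offset into function).

    `load_address` is where the image was loaded at run time; subtracting it
    gives the address as recorded in the symbol table. Returns None when the
    address lies before the first symbol. This sketch keeps no end addresses,
    so anything past the last symbol is attributed to the last function.
    """
    file_address = address - load_address
    i = bisect.bisect_right(STARTS, file_address) - 1
    if i < 0:
        return None
    start, name, source = SYMBOLS[i]
    return name, source, file_address - start
```

The table only matches the binary it was generated with, which is the practical reason for keeping the dSYM of every released build.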
Verifying .dSYM files
- When the program is built for a real device in release mode, a .dSYM file is produced. It is also a Mach-O file: a symbol table, used mainly to symbolicate a crash when one occurs so that problems are easier to troubleshoot
Universal binary
The CPUs and hardware platforms supported by macOS have changed considerably over time. To solve the problem of software compatibility across multiple hardware platforms, Apple developed the Universal Binary format, also known as the fat binary.
A program format proposed by Apple: a single binary that can run on multiple architectures
It provides optimal performance for multiple architectures in the same package.
Universal binaries are generally larger than single-platform binaries because they store code for several architectures.
However, because the architectures share non-executable resources (everything other than code), a two-architecture binary is not twice the size of a single-architecture one.
And since only the code for the current architecture is used during execution, no extra memory is required to run it.
- In everyday projects, Build Settings -> Mach-O Type specifies the type of the Mach-O file
Generally, an executable built for a real device has a single architecture, arm64, because iOS 11 and later support only 64-bit architectures
- If the minimum deployment version is then lowered to 10.3, the executable turns out to contain two architectures: armv7 + arm64
You can also set the architectures to compile for in Build Settings -> Architectures
- The environment variable used for this setting contains arm64 and armv7
– There is one more architecture, armv7s, used by the iPhone 5 and iPhone 5c. After it is added in the project, the executable supports three architectures
The ARM architecture
The ARM architecture, formerly Advanced RISC Machine (and earlier still, Acorn RISC Machine), is a reduced instruction set (RISC) processor architecture that was originally 32-bit. ARM processors are well suited to mobile communications and are designed for low power consumption.
The first difference between ARM and Intel processors is that the former uses a reduced instruction set (RISC) while the latter uses a complex instruction set (CISC).
ARM processor instruction set: the set of operation instructions that an ARM processor understands.
armv6, armv7, armv7s, arm64, and arm64e are all instruction sets of ARM processors. In principle, they are backward compatible
The Apple A7 processor supports two different instruction sets:
- the 32-bit ARM instruction set (armv6 | armv7 | armv7s)
- the 64-bit ARM instruction set (arm64)
i386 | x86_64 are the instruction sets for Mac processors.
Instruction sets supported by iOS devices
ARM instruction set | Devices |
armv6 | iPhone, iPhone 3G, iPod 1G/2G |
armv7 | iPhone 3GS, iPhone 4, iPhone 4S, iPod 3G/4G/5G, iPad, iPad 2, iPad 3, iPad Mini |
armv7s | iPhone 5, iPhone 5c, iPad 4 |
arm64 | iPhone X, iPhone 8 (Plus), iPhone 7 (Plus), iPhone 6 (Plus), iPhone 6s (Plus), iPhone 5s, iPad Air (2), Retina iPad Mini (2, 3) |
arm64e | iPhone XS, iPhone XR, iPhone XS Max |
Reference: iOS instruction set architectures armv6, armv7, armv7s, arm64, arm64e, x86_64, i386
Universal binary source code
- The header structure of a universal binary, fat_header, is declared in <mach-o/fat.h>:
struct fat_header {
    uint32_t magic;      /* magic number identifying a fat binary */
    uint32_t nfat_arch;  /* number of architectures, i.e. how many Mach-O files there are */
};
- Each architecture in a fat binary is described by a fat_arch structure: fat_header is followed by one or more contiguous fat_arch structures.
struct fat_arch {
    cpu_type_t    cputype;     /* CPU type */
    cpu_subtype_t cpusubtype;  /* CPU subtype */
    uint32_t      offset;      /* file offset of this architecture's data */
    uint32_t      size;        /* size of the data */
    uint32_t      align;       /* alignment boundary of the data as a power of 2 (the stored value is the exponent); it ensures the object file for the current CPU architecture is aligned when loaded into memory */
};
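To make the two structures concrete, here is a minimal sketch that reads them from raw bytes with Python's struct module. It assumes the classic 32-bit fat format (magic 0xCAFEBABE, all fields big-endian, 8-byte header followed by 20-byte fat_arch entries). The cputype and cpusubtype numbers come from <mach-o/machine.h>; the function name parse_fat and the shape of its return value are choices made for this example.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
CPU_ARCH_ABI64 = 0x01000000    # flag bit marking 64-bit CPU types
CPU_SUBTYPE_BITS = 0x00FFFFFF  # low bits of cpusubtype; high bits are flags

# (cputype, cpusubtype) -> name, for the architectures in this article.
ARCH_NAMES = {
    (7, 3): "i386",
    (7 | CPU_ARCH_ABI64, 3): "x86_64",
    (12, 6): "armv6",
    (12, 9): "armv7",
    (12, 11): "armv7s",
    (12 | CPU_ARCH_ABI64, 0): "arm64",
    (12 | CPU_ARCH_ABI64, 2): "arm64e",
}


def parse_fat(data):
    """Parse fat_header and the fat_arch entries that follow it.

    Returns one dict per architecture. All fields are big-endian.
    """
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat binary")
    slices = []
    for i in range(nfat_arch):
        # fat_arch entries are 20 bytes each and start right after the header.
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">IIIII", data, 8 + i * 20
        )
        name = ARCH_NAMES.get((cputype, cpusubtype & CPU_SUBTYPE_BITS), "unknown")
        slices.append(
            {"arch": name, "offset": offset, "size": size, "align": 1 << align}
        )
    return slices
```

Passing the first few hundred bytes of a fat executable to parse_fat should give roughly the same information as otool -f.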
Terminal commands
1. View the fat_header information with otool: otool -f <executable file>
2. Split and merge fat binaries with the lipo command. Common commands are as follows
$ lipo -info <MachO file>: use lipo -info to view the architectures contained in a Mach-O file
$ lipo <MachO file> -thin <architecture> -output <output file path>: use lipo -thin to extract one architecture
$ lipo -create <MachO1> <MachO2> -output <output file path>: use lipo -create to merge multiple architectures
lipo demo
- Open the executable with Hopper and you will see that it is a FAT archive (a fat binary, meaning it contains multiple architectures), from which you select one architecture
- View the architectures contained in the binary:
lipo -info 12-macho
- Split:
lipo 12-macho -thin armv7 -output macho_armv7
If the requested architecture is not in the file, the split reports an error
- View the file type of the extracted executable:
file macho_armv7
- Merge:
lipo -create macho_armv7 macho_arm64 -output macho_v7_64
- View the file type of the merged executable:
file macho_v7_64
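What the split and merge steps do to the bytes can be sketched in a few lines. This is not lipo's actual implementation, only an illustration of the file layout under some assumptions: the 32-bit fat format, every slice aligned to 2^14 bytes (real lipo chooses the alignment per architecture), and images treated as opaque bytes. The names create and thin mirror the lipo options but are otherwise invented here.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
HEADER = struct.Struct(">II")   # fat_header: magic, nfat_arch
ARCH = struct.Struct(">IIIII")  # fat_arch: cputype, cpusubtype, offset, size, align


def create(slices, align=14):
    """Merge thin images into one fat image (compare: lipo -create).

    `slices` is a list of (cputype, cpusubtype, image_bytes). Each image is
    placed at the next multiple of 2**align after the previous one.
    """
    boundary = 1 << align
    out = bytearray(HEADER.pack(FAT_MAGIC, len(slices)))
    out += bytes(ARCH.size * len(slices))   # placeholder for the fat_arch table
    for i, (cputype, cpusubtype, image) in enumerate(slices):
        out += bytes(-len(out) % boundary)  # zero padding up to the boundary
        ARCH.pack_into(out, HEADER.size + i * ARCH.size,
                       cputype, cpusubtype, len(out), len(image), align)
        out += image
    return bytes(out)


def thin(fat, cputype, cpusubtype):
    """Extract one architecture's image (compare: lipo -thin)."""
    magic, count = HEADER.unpack_from(fat, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    for i in range(count):
        ctype, csub, offset, size, _ = ARCH.unpack_from(
            fat, HEADER.size + i * ARCH.size
        )
        if (ctype, csub) == (cputype, cpusubtype):
            return fat[offset:offset + size]
    raise ValueError("architecture not present in this file")
```

Asking thin for an architecture that is not in the table raises an error, which corresponds to the failure lipo reports in the same situation.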
Summary
Mach-O is short for Mach Object file format, the executable file format on macOS and iOS. It is a file format for executables, object code, and dynamic libraries, and it offers greater extensibility than a.out
Common Mach-O formats: .o, library files (.a, .dylib, .framework), executable files, dyld, .dSYM
.a + .h + sourceFile = .framework
Dynamic .framework: system framework libraries
Static .framework: custom framework libraries
To view a file's type, run file <file path>
To see the link order of the source files in a Mach-O file: objdump --macho -d <executable file>
dyld (the dynamic link editor) is Apple's dynamic linker; on the Mac it is located in /usr/lib
A dSYM is an intermediate file that stores the mapping between hexadecimal addresses and function information, located under /Users/<user name>/Library/Developer/Xcode/Archives. Given the address of a failing function, the dSYM file yields the corresponding function name and file name
Universal binaries (also known as fat binaries) mainly solve compatibility problems across multiple platforms
To view fat_header information, run otool -f <executable file>
The lipo command splits and merges fat binaries
$ lipo -info <MachO file>: use lipo -info to view the architectures contained in a Mach-O file
$ lipo <MachO file> -thin <architecture> -output <output file path>: use lipo -thin to extract one architecture
$ lipo -create <MachO1> <MachO2> -output <output file path>: use lipo -create to merge multiple architectures