Digression
App compilation process
More on app loading
Cold start and hot start
Cold start: the app is launched again after its process has been killed in the background. Hot start: the app has gone from running to suspended but is still alive in the background and has not been killed; bringing it back to the foreground is called a hot start.
Composition of startup time
Startup time can be divided into two stages, with the main() function as the dividing point:
- T1 stage: the time spent on processing before main() is called, known as pre-main
- T2 stage: the time spent in main() and everything after it
T1 stage: pre-main
T2 stage
Time in the T2 stage is spent mainly in business code. BLStopwatch is recommended for measuring how long each piece of business code takes.
Measuring pre-main time in Xcode
In Edit Scheme -> Run -> Arguments, add the environment variable DYLD_PRINT_STATISTICS and set its value to YES.
Phase | Work | How to optimize |
---|---|---|
Load dylibs | dyld reads the list of dependent dynamic libraries from the header of the main executable and then has to find each dylib. Those dylibs may in turn depend on other dylibs, so what dyld actually loads is the full set of recursive dependencies | 1. Avoid embedded dylibs where possible, because loading them is expensive; 2. Merge existing dylibs or use static archives to reduce the number of dylibs; 3. Load dylibs lazily, but be aware that dlopen() can cause problems of its own and actually does more work overall |
Rebase and Bind | 1. Rebase adjusts pointers that point inside the image. In the past a dynamic library was loaded at a specified address, so all pointers and data were already correct for the code. Now that the address space layout is randomized, these pointers have to be corrected by the random offset. 2. Bind sets up pointers that refer to content outside the image. These external pointers are bound by symbol name, so dyld has to search the symbol table to find the implementation of each symbol | 1. Reduce the number of ObjC classes, methods, and categories; 2. Reduce the number of C++ virtual functions (creating a virtual function table is expensive); 3. Use Swift structs (internally optimized to produce fewer symbols) |
Objc setup | 1. Register all declared Objective-C classes in a global class table; 2. Insert the category definitions into the category registry; 3. Ensure that every selector is unique | Reduce the number of Objective-C classes, selectors, and categories by merging or deleting classes |
Initializers | 1. Objective-C +load methods; 2. C/C++ functions marked with the constructor attribute; 3. Construction of C++ static global variables of non-primitive types (usually classes or structs) | 1. Do less work in a class's +load method and defer it to +initialize where possible; 2. Reduce the number of constructor functions and do less work in them; 3. Reduce the number of C++ static global variables |
Physical memory and virtual memory
Memory management
Memory is managed in pages: the mapping table works in units of pages, not bytes.
- Linux uses 4 KB pages
- macOS uses 4 KB pages (on Intel Macs; Apple silicon Macs use 16 KB)
- iOS uses 16 KB pages
To check the value, type pagesize in a terminal.
Wasted memory
On early computers, if you kept launching applications, at some point an error was reported and no further application could run. Early computers had no virtual addresses: an application was loaded into physical memory in its entirety when it started, and once physical memory ran out, nothing more could be launched. Applications were also laid out one after another in memory, so a process could reach another process's memory simply by moving an address slightly past the end of its own range, which is very unsafe.
Virtual memory
An app does not use all of its memory at once. If the whole app were loaded into memory as soon as it started, a lot of memory would be wasted. Virtual memory was introduced to solve this problem. After the app starts, it believes it has been given all the memory it needs to run, but in fact that much physical memory has not been allocated. Only a table is created that associates virtual memory with physical memory.
Address translation
When the app needs to use a virtual memory address, the address is looked up in this table to check whether physical memory has already been allocated for it.
- If it has, the physical memory address is accessed through the table entry.
- If not, a page of physical memory is allocated and recorded in the table (a Page Fault).
Mapping a process's virtual addresses to different physical memory locations through its mapping table is called address translation. It requires cooperation between the CPU and the operating system.
Page Fault
When the requested data is not in physical memory, the following steps are performed:
- The system blocks the process
- The data for the page is loaded from disk into memory
- The virtual memory page is pointed at the physical memory
This behavior is called a Page Fault.
Flexible Memory Management
This solves the waste problem, but what happens when all of the physical memory is in use? Memory may run short. To keep the current app working, data is loaded according to the following rules:
- If there is free memory, the data is placed there
- If not, the data of another process is overwritten
- Which data gets overwritten is decided by the operating system
Resolving security issues
The space problem has been solved, but what about the security problem?
For safety, ASLR (Address Space Layout Randomization) and code signing are applied when dylibs are loaded.
ASLR: when an image (an executable, dylib, or bundle) is loaded, a random slide is added to its preferred_address, so that its internal addresses cannot be predicted in advance.
Why binary rearrangement
Virtual memory brings Page Faults with it, and a Page Fault can be a time-consuming operation. The cost of a single Page Fault also varies widely, from 1 microsecond to 0.8 milliseconds.
This is not noticeable during normal use, but a large amount of data is loaded at startup. If many Page Faults occur then, the delays add up and become clearly perceptible to the user.
If we place all the code needed at startup on one or two pages, we can greatly reduce the number of Page Faults and speed up startup. This is binary rearrangement.
Binary rearrangement
Binary rearrangement is designed to reduce startup time by reducing the number of Page Faults.
Principle
By default, the binary is written in the order in which the object files (.o) are linked, and within each object file in the order in which its functions appear.
A static library (.a file) is an ar archive of a set of .o files. Its contents can be viewed with the nm command.
Suppose we have only two pages, Page1 and Page2, and that method1 and method3 (shown in green) are called when the application starts. If they sit on different pages, the system must take two Page Faults to execute them. If we arrange method1 and method3 next to each other on the same page, only one Page Fault is needed. This is the core principle of binary rearrangement.
How to check an app's Page Faults
1. Open the project to be analyzed, press Command+I to open the Instruments debugging tool, and choose System Trace.
2. Click Run, and click Stop once the app has launched and the home page is visible.
3. When the analysis is complete, the whole trace is shown. Type "Main Thread" in the search box, then select Main Thread --> Virtual Memory in the list below.
How to implement binary rearrangement
To actually implement binary rearrangement, we need to collect the symbols of all the methods, functions, and so on that are called during startup, preserve the order in which they are called, and write them to an order file.
Common ways to obtain the symbols include:
- fishhook
- Clang instrumentation
fishhook
fishhook (github.com/facebook/fi…) is a tool open-sourced by Facebook that can hook system functions. We can use it to hook the system's objc_msgSend and collect function symbols that way. However, this approach cannot hook initialize, blocks, or directly called functions.
Clang instrumentation
This approach can hook Objective-C methods, C functions, and blocks. At compile time, hook code is inserted into the binary inside every function, which gives the effect of a global hook.
We need to track the execution of each method to obtain the order in which methods run at startup, and then write the order file in that order.
So how does Clang do this?
Code Coverage tool
LLVM has simple code coverage instrumentation built in:
- It inserts calls to user-defined functions at the function, basic-block, and edge levels
- Default implementations of these callbacks provide simple coverage reporting and visualization
Tracing PCs with Guards
Implementation
1. In the project's Build Settings -> Other C Flags, add -fsanitize-coverage=trace-pc-guard.
2. Add the following two functions to a class file:
    #include <stdint.h>
    #include <stdio.h>

    void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
        static uint64_t N;  // Counter for the guards.
        if (start == stop || *start) return;  // Initialize only once.
        printf("INIT: %p %p\n", start, stop);
        for (uint32_t *x = start; x < stop; x++)
            *x = ++N;  // Guards should start from 1.
    }

    void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
        if (!*guard) return;  // Skip guards that are zero (disabled).
        void *PC = __builtin_return_address(0);  // Return address inside the calling function.
        printf("guard: %p %x PC %p\n", guard, *guard, PC);
    }
Run the code and look at the output.
It is not obvious what the output means, so let's set a breakpoint.
The range from start to stop holds a run of sequence numbers.
Could these be function numbers? Let's add a function to find out. Sure enough, after adding one function the last number goes from 08 to 09.
__sanitizer_cov_trace_pc_guard
The guard_init function above gives us the numbers of all the methods, so there must also be a way to get specific information about each method. The key is __sanitizer_cov_trace_pc_guard, which we analyze next. Add a click handler that calls the method just added.
Trigger the click to call the test method and look at the breakpoint.
You can see that a call to __sanitizer_cov_trace_pc_guard was inserted into the method being called.
If that is not clear, let's use Hopper to look at the generated Mach-O binary.
As the disassembly shows, the hook function is called inside every function.
In other words, inside __sanitizer_cov_trace_pc_guard we can use __builtin_return_address to get the return address of the call to __sanitizer_cov_trace_pc_guard, which is an instruction address inside the calling function.
Getting the function name from the memory address
Once we have an address inside the function, how do we get the function's name? dlfcn.h declares the following:

    typedef struct dl_info {
        const char *dli_fname;  /* file */
        void       *dli_fbase;  /* file address */
        const char *dli_sname;  /* symbol name */
        void       *dli_saddr;  /* function start address */
    } Dl_info;

    int dladdr(const void *, Dl_info *);

To experiment, import the header with #import <dlfcn.h> and modify the code as follows:

    void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
        if (!*guard) return;  // Duplicate the guard check.
        void *PC = __builtin_return_address(0);
        Dl_info info;
        if (dladdr(PC, &info) == 0) return;  // No loaded image contains this address.
        printf("fname=%s \nfbase=%p \nsname=%s\nsaddr=%p \n",
               info.dli_fname, info.dli_fbase,
               info.dli_sname ? info.dli_sname : "(unknown)", info.dli_saddr);
        printf("guard: %p %x PC %p\n", guard, *guard, PC);
    }
Now that we can get the function information, let's briefly cover blocks, C functions, and the Swift configuration.
Block, C function
- Add a block
- Add a C function
Both can be hooked as well, but we won't walk through them here.
Mixed Swift projects
In Target -> Build Settings -> Custom Compiler Flags -> Other Swift Flags, add:
-sanitize-coverage=func
-sanitize=undefined
Binary rearrangement in practice
- Configure Clang instrumentation
- Get the symbol list
The quick approach is to define a class that collects the symbols of the methods called during startup and writes them to a local file named binary.order
- Configure binary.order as the order file (Build Settings -> Order File)
LinkMap
Enable Write Link Map File in Xcode's Build Settings:
Then run the project to generate a link map file, which contains the list of symbols in link order:
This file is divided into four sections:
1. Path
- Path is the path of the generated output file
- Arch is the architecture type
- Object Files lists all the .o and .tbd files linked into the executable. The number at the start of each line is the file's index.
2. Sections (Mach-O information): records the address range of each segment/section
3. Symbols
- Address is the address of the method in the file.
- Size is the amount of memory the method occupies.
- File is the index of the file, matching the number in brackets in the Object Files section.
- Name is the name of the method.
4. Dead Stripped Symbols
Symbols that the linker considers unused and therefore leaves out when linking