Startup
Launch usually refers to the span from the user tapping the app icon until AppDelegate's didFinishLaunching method completes. There are two kinds of launch, cold and warm:
- Cold start: the app's data is not in memory and must be loaded from disk into memory.
- Killing an app does not necessarily make the next launch a cold start. That is up to the system and depends on whether the app's memory has been overwritten. Restarting the phone usually guarantees a cold start.
- Warm start: a launch in which the app's data is still in memory after the app process has been killed.
The startup optimization discussed here refers to cold start, which is divided into two parts:
- T1: the pre-main stage, before main runs, when the operating system loads the app's executable into memory and performs a series of loading and linking steps, i.e. the dyld loading process.
- T2: after main, from main until AppDelegate's didFinishLaunching method finishes executing. The main work here is building the first screen and finishing its rendering.
T1 + T2 is therefore the time from the user tapping the app icon to the user seeing the app's main screen, and it is the part that needs to be optimized.
Pre-main phase optimization
The dyld loading process was covered in OC Underlying Principles 09: dyld Loading Process. The time spent in the pre-main phase is essentially the time taken by the dyld loading process.
To measure the time spent before main, Apple provides a built-in method: in Edit Scheme -> Run -> Arguments -> Environment Variables, click + to add the environment variable DYLD_PRINT_STATISTICS, set it to 1, and then run. The following is the pre-main time for a normal launch on an iPhone 7 Plus (taking WeChat as an example).
The pre-main phase takes 1.7 seconds in total:
- Dylib loading time: loading the dynamic libraries takes 320.32 ms.
- Rebase/binding time (offset correction and symbol binding): 160.52 ms.
  - Rebase (offset correction): in the binary generated for any app, every method and function call has an address, which is its offset within the binary file. At run time (that is, once the binary is in memory), the system assigns a random ASLR (Address Space Layout Randomization) value on each launch. For example, if the binary has a test method at offset 0x0001 and the randomly assigned ASLR value is 0x1F00, then the memory address (the real address) of test is ASLR + offset = the address determined at run time (i.e. 0x1F00 + 0x0001 = 0x1F01).
  - Binding: take NSLog as an example. At compile time a symbol, !NSLog, is created in the generated Mach-O file (it currently points to a random address). At run time, when the image is loaded from disk into memory, the real address is assigned to the symbol, that is, the address is bound to the symbol in memory. This is done by dyld and is also called dynamic library symbol binding. In a word, binding is the process of assigning values to symbols.
- ObjC setup time (time required for OC class registration): the more OC classes, the more time required.
- Initializer time (time taken to execute +load methods and constructors).
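The rebase arithmetic and the idea of binding can be sketched in a few lines of plain C. This is only an illustration of the 0x1F00 + 0x0001 example above, not dyld's actual implementation; the names rebased_address, SymbolSlot, and bind_symbol are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rebase: the address used at run time is the ASLR value chosen for this
   launch plus the offset recorded in the binary file. */
uintptr_t rebased_address(uintptr_t aslr_slide, uintptr_t file_offset) {
    /* The sum must not wrap around the address space. */
    assert(file_offset <= UINTPTR_MAX - aslr_slide);
    return aslr_slide + file_offset;
}

/* Binding: a symbol slot has no usable address until the loader fills it in.
   (Hypothetical structure, for illustration only.) */
typedef struct {
    const char *name;     /* symbol name, e.g. "_NSLog" */
    uintptr_t   address;  /* 0 until the real address is bound */
} SymbolSlot;

/* Assign the real address to the symbol. */
void bind_symbol(SymbolSlot *slot, uintptr_t real_address) {
    slot->address = real_address;
}
```

Both operations are cheap individually, but they are repeated for every pointer and every imported symbol, which is why the rebase/binding time grows with the size of the binary and the number of external symbols.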
For these, the optimization suggestions are as follows:
- Use as few external dynamic libraries as possible. Apple officially recommends no more than six custom dynamic libraries. If there are more than six, merge them.
- Reduce the number of OC classes, because the more OC classes there are, the more time is consumed.
- Defer work that does not have to happen in +load to +initialize, and try not to use C++ virtual functions.
- If you are using Swift, prefer struct.
Optimization after main
After main, the didFinishLaunching method tends to accumulate all sorts of work, much of which does not need to run immediately. Such work can be loaded lazily so that it does not affect startup time.
The work in didFinishLaunching falls into three main types:
- [First type] Initialization of third-party SDKs
- [Second type] Configuration of the app's running environment
- [Third type] Initialization of the app's own utility classes, etc.
The optimization suggestions for the stage after main are as follows:
- Reduce the work done during startup initialization: lazy-load what can be lazy-loaded, delay what can be delayed, and initialize in the background what can be initialized in the background, so that the main thread's startup time is occupied as little as possible.
- Optimize the code logic and remove all unnecessary code logic to reduce the time each step takes.
- During startup, use multiple threads for initialization wherever possible.
- Build the UI in pure code as far as possible, especially the main UI framework such as UITabBarController. Try to avoid XIBs and storyboards (SB), which take more time than pure code.
- Delete obsolete classes and methods.
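As a minimal sketch of the "lazy-load what can be lazy-loaded" suggestion, the plain C below defers an expensive setup until the first time the service is actually requested. AnalyticsService and its functions are hypothetical names, and the sketch assumes access from a single thread; in an iOS app the same pattern is normally guarded with dispatch_once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical service whose setup is too slow to run during launch. */
typedef struct {
    bool ready;
    int  init_count;  /* how many times the expensive setup actually ran */
} AnalyticsService;

static AnalyticsService g_analytics;

/* Lets a caller check that the setup ran at most once. */
int analytics_init_count(void) {
    return g_analytics.init_count;
}

/* Nothing happens at launch; the setup runs on first use only. */
AnalyticsService *analytics_shared(void) {
    if (!g_analytics.ready) {
        g_analytics.init_count += 1;  /* the expensive setup would go here */
        g_analytics.ready = true;
        assert(g_analytics.init_count == 1);
    }
    return &g_analytics;
}
```

With this shape, didFinishLaunching no longer pays for the setup at all; the cost moves to the first call of analytics_shared, which can happen after the first screen is on display.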
Next, I would like to focus on one optimization for the pre-main stage: binary rearrangement. The scheme became popular after Douyin's R&D practice article, "A solution based on binary file rearrangement: increasing app startup speed by more than 15%".
Binary rearrangement principle
As covered in the section on virtual memory, when a process accesses a virtual memory page that has no corresponding physical memory, a Page Fault is triggered and the process is blocked. The data must then be loaded into physical memory before the access can be retried, which has some impact on performance.
During a cold start, a large number of classes, categories, and third-party libraries need to be loaded and executed, and the resulting Page Faults can take a lot of time. Taking WeChat as an example, let's look at the number of Page Faults during the startup stage:
- Press CMD+i and select System Trace.
- Click Start (restart the phone first to clear cached data), stop once the first screen appears, and follow the steps shown in the figure below.
The figure shows that WeChat triggered more than 2,800 Page Faults, which is very bad for performance.
Let’s create a demo project of our own and check the order of methods at compile time. Define the following methods in the following order in ViewController:
@implementation ViewController

// Forward declaration so that test() can call the block defined further down
void (^block1)(void);

void test() {
    block1();
}

int test1() {
    return 0;
}

void (^block1)(void) = ^(void) {
};

- (void)viewDidLoad {
    [super viewDidLoad];
    test();
}

+ (void)load {
    [SwiftTest swiftTest];
}

@end
- In Build Settings, set Write Link Map File to YES.
- Press CMD+B to build the demo, then find the LinkMap file at the corresponding path, as shown below. You can see that the functions within a class are ordered from top to bottom as written, and that the files are ordered as listed in Build Phases -> Compile Sources.
- LinkMap file location
From the Page Fault count and the symbol order above, we can see that the root cause of too many Page Faults is that the methods called at startup are spread across different pages. The optimization idea is therefore to arrange all the methods called at startup on the same page, so that multiple Page Faults become one. This is the core principle of binary rearrangement, as shown below.
Note: In the iOS production environment, when a Page Fault occurs, iOS also verifies the app's signature when the page is loaded. A Page Fault in production therefore takes more time than one in a Debug environment.
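To make the principle concrete, the sketch below counts how many distinct pages a set of startup symbols falls on, which is an upper bound on the Page Faults those symbols can cause. It is a simplified model rather than a measurement tool: count_touched_pages is a hypothetical helper, the addresses in the usage note are made up, and the 16 KB page size is an assumption about the target device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the distinct pages that contain at least one of the given addresses. */
size_t count_touched_pages(const uintptr_t *addrs, size_t n, uintptr_t page_size) {
    assert(page_size != 0);
    size_t pages = 0;
    for (size_t i = 0; i < n; i++) {
        uintptr_t page = addrs[i] / page_size;
        int seen = 0;
        /* Has an earlier address already landed on this page? */
        for (size_t j = 0; j < i; j++) {
            if (addrs[j] / page_size == page) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            pages++;
        }
    }
    return pages;
}
```

For example, four startup functions at 0x0000, 0x4010, 0x8020, and 0xC030 touch four 16 KB pages, while the same four functions packed at 0x0000, 0x0010, 0x0020, and 0x0030 touch only one.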
Binary rearrangement practice
Now let's put this into practice. First, some terminology.
LinkMap
LinkMap is an intermediate product of iOS compilation that records the layout of the binary file. It must be enabled via Write Link Map File in Xcode's Build Settings. A LinkMap consists of three parts:
- Object Files: the paths and file numbers of the link units used to generate the binary
- Sections: the address range of each segment/section in the Mach-O
- Symbols: the address range of each symbol, in order
ld
ld is the linker used by Xcode. It has an order_file parameter, which we can set via Build Settings -> Order File to the path of a file with the .order suffix. The symbols written in this order file are laid out in the listed order when the project is built, which achieves our optimization.
- Symbols listed in the order file that do not exist in the binary are ignored automatically.
So the essence of binary rearrangement is rearranging the symbols that are loaded at startup.
If the project is small, you can write an order file by hand and add the methods in order manually. If the project is large and involves many methods, how do we get the list of functions that run at startup? There are several approaches:
1. Hook objc_msgSend: as we know, a method call is essentially a message send, which ends up in objc_msgSend at the bottom layer. Because objc_msgSend takes variadic arguments, the hook has to be written in assembly, which places higher demands on the developer. It also only captures OC methods and Swift methods marked @objc.
2. Static scanning: scan the symbol and function data stored in specific segments or sections of the Mach-O.
3. Clang instrumentation: a batch hook that achieves 100% symbol coverage, that is, it captures Swift, OC, C, and block functions.
Clang instrumentation
LLVM has a simple code coverage instrumentation built in, SanitizerCoverage. It inserts calls to user-defined functions at the function level, basic block level, and edge level. Our batch hook relies on SanitizerCoverage.
Clang's official documentation for this instrumentation, the Clang SanitizerCoverage documentation, gives a detailed overview as well as a short demo.
- [Step 1: Configure] Enable SanitizerCoverage
  - For an OC project: in Build Settings, add -fsanitize-coverage=func,trace-pc-guard to "Other C Flags". With this flag, clang inserts a call to __sanitizer_cov_trace_pc_guard into every method, function, and block at compile time.
  - For a Swift project, or an OC project mixed with Swift: additionally add -sanitize-coverage=func and -sanitize=undefined to "Other Swift Flags".
  - Every binary linked into the app needs SanitizerCoverage enabled in order to cover all calls.
  - The flags can also be configured through the Podfile:
post_install do |installer|
  installer.pods_project.targets.each do |target|
    target.build_configurations.each do |config|
      config.build_settings['OTHER_CFLAGS'] = '-fsanitize-coverage=func,trace-pc-guard'
      config.build_settings['OTHER_SWIFT_FLAGS'] = '-sanitize-coverage=func -sanitize=undefined'
    end
  end
end
After configuring as shown in the figure, the build fails with two errors. They indicate that once this flag is set, the two functions from the documentation's example are called, so we need to implement those two functions.
- In ViewController.m, implement the two functions __sanitizer_cov_trace_pc_guard_init and __sanitizer_cov_trace_pc_guard. The code is as follows:
#import "ViewController.h"
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sanitizer/coverage_interface.h>
#import <dlfcn.h>
#import <libkern/OSAtomic.h>
#import "Test-Swift.h"

@interface ViewController ()
@end

@implementation ViewController

// Forward declaration so that test() can call the block defined further down
void (^block1)(void);

void test() {
    block1();
}

int test1() {
    return 0;
}

void (^block1)(void) = ^(void) {
};

- (void)viewDidLoad {
    [super viewDidLoad];
    test();
}

+ (void)load {
    [SwiftTest swiftTest];
}

// Atomic queue, used to keep writes thread-safe
static OSQueueHead symbolList = OS_ATOMIC_QUEUE_INIT;

// Linked-list node that stores one captured symbol address
typedef struct {
    void *pc;
    void *next;
} MMNode;

/*
 - start: the start of the guard array
 - stop: the end of the guard array. It points just past the last guard, so the
   last guard is at stop - 4 (each guard is a 4-byte unsigned int) and holds the
   number of symbols
 */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    static uint64_t N;
    if (start == stop || *start) return;
    printf("INIT: %p - %p\n", start, stop);
    for (uint32_t *x = start; x < stop; x++) {
        *x = ++N;
    }
}

/*
 Hooks every method, function, and block call in order to capture symbols.
 It can be called from multiple threads. Only the PC is stored, in a linked list.
 - guard: a sentinel that tells us which numbered symbol was called
 */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    // if (!*guard) return; // Left commented out: this check would filter out +load

    /*
     __builtin_return_address(0) is the return address of the current function,
     i.e. an address inside the function that called this hook.
     __builtin_return_address(1) would be the return address of that caller.
     */
    void *PC = __builtin_return_address(0);

    // Create a node and store the PC in it
    MMNode *node = malloc(sizeof(MMNode));
    *node = (MMNode){PC, NULL};

    // Enqueue the node. Nodes are linked through their next pointer rather than
    // indexed, so the queue needs the offset of next within MMNode.
    OSAtomicEnqueue(&symbolList, node, offsetof(MMNode, next));
}

@end
- The __sanitizer_cov_trace_pc_guard_init function
  - Parameter 1, start, is a pointer to an unsigned int (4 bytes). It is the starting position of an array, that is, the starting position of the symbols.
  - Parameter 2, stop, marks the end of that array. It points just past the last entry, so the last entry is not at stop itself: because each entry occupies 4 bytes, real address of the last entry = printed stop address - 0x4.
  - What does the value stored at that address represent? It is the number of symbols: when a method, block, C++ function, or property is added (a property adds three), the value increases by the corresponding amount, for example by one after adding a test1 function.
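The numbering done by __sanitizer_cov_trace_pc_guard_init can be tried out on an ordinary array, which also shows why the entry at stop - 4 bytes holds the symbol count. This is a stand-alone sketch: demo_guard_init and next_guard_id are hypothetical names that mirror the callback and its static N, and the five-entry array in the usage note stands in for the guard section that clang generates.

```c
#include <assert.h>
#include <stdint.h>

/* Plays the role of the static counter N in the real callback. */
static uint32_t next_guard_id;

/* Number every guard in [start, stop) with 1, 2, 3, ... */
void demo_guard_init(uint32_t *start, uint32_t *stop) {
    assert(start <= stop);
    /* Nothing to do for an empty range or one that was already numbered. */
    if (start == stop || *start) {
        return;
    }
    for (uint32_t *guard = start; guard < stop; guard++) {
        *guard = ++next_guard_id;
    }
}
```

Calling demo_guard_init(guards, guards + 5) on a zeroed five-entry array numbers the entries 1 through 5, so the last entry, guards[4], which sits 4 bytes before stop, contains 5, the total number of guards. A second call with the same range leaves the values unchanged because the first guard is already non-zero.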
- The __sanitizer_cov_trace_pc_guard function captures every symbol called during startup and enqueues it.
  - The guard parameter is a sentinel that tells us which numbered symbol was called.
  - Storing the symbols requires a linked list, so we define the list node MMNode.
  - An atomic queue is created with OSQueueHead to keep reads and writes thread-safe.
  - OSAtomicEnqueue enqueues each node. The next symbol is reached through the node's next pointer.
- Retrieve the symbols and write them to a file:
  - A while loop takes the symbols out of the queue, adds the "_" prefix to non-OC symbols, and stores them in an array.
  - The array is reversed, because the queue stores the symbols in reverse order.
  - The array is de-duplicated, and the symbol of the method doing this work is removed.
  - The symbols in the array are joined into a string and written to the mm.order file in the sandbox's tmp folder.
Let’s write it in the touch method:
- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    // Collect the symbol names
    NSMutableArray<NSString *> *symbolNames = [NSMutableArray array];

    while (YES) { // Note: the loop itself is also hooked
        MMNode *node = OSAtomicDequeue(&symbolList, offsetof(MMNode, next));
        if (node == NULL) {
            break;
        }
        Dl_info info = {0};
        dladdr(node->pc, &info);
        free(node);
        // Skip addresses that dladdr cannot resolve to a symbol name
        if (info.dli_sname == NULL) {
            continue;
        }
        NSString *name = @(info.dli_sname);

        // OC method names are stored as-is; other symbols need a "_" prefix
        BOOL isObjc = [name hasPrefix:@"+["] || [name hasPrefix:@"-["];
        NSString *symbolName = isObjc ? name : [@"_" stringByAppendingString:name];
        [symbolNames addObject:symbolName];
    }

    // Reverse the array, because the queue returns the newest entry first
    NSEnumerator *enumerator = [symbolNames reverseObjectEnumerator];

    // De-duplicate, keeping the first occurrence of each symbol
    NSMutableArray<NSString *> *funcs = [NSMutableArray arrayWithCapacity:symbolNames.count];
    NSString *name;
    while ((name = [enumerator nextObject])) {
        if (![funcs containsObject:name]) {
            [funcs addObject:name];
        }
    }
    // Remove this method's own symbol
    [funcs removeObject:[NSString stringWithFormat:@"%s", __FUNCTION__]];

    // Join the array into a string, one symbol per line
    NSString *funcStr = [funcs componentsJoinedByString:@"\n"];

    // Write the string to mm.order in the tmp directory (on a real device)
    NSString *filePath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"mm.order"];
    NSData *fileContents = [funcStr dataUsingEncoding:NSUTF8StringEncoding];
    [[NSFileManager defaultManager] createFileAtPath:filePath contents:fileContents attributes:nil];
}
Run the app on a real device and tap the screen, then download the file locally.
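The reverse and de-duplicate steps are easy to get backwards, so here is the same ordering logic as a plain C sketch on string arrays. It assumes the input is in dequeue order (newest call first), as described above; order_symbols is a hypothetical helper and the symbol names in the usage note are only examples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Turn dequeue order (newest first) into call order (oldest first), keeping
   only the first call of each symbol. out must have room for n entries.
   Returns the number of entries written to out. */
size_t order_symbols(const char **dequeued, size_t n, const char **out) {
    assert(dequeued != NULL && out != NULL);
    size_t count = 0;
    /* Walk the input backwards so that the oldest call comes out first. */
    for (size_t i = n; i-- > 0; ) {
        int seen = 0;
        for (size_t j = 0; j < count; j++) {
            if (strcmp(out[j], dequeued[i]) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            out[count++] = dequeued[i];
        }
    }
    return count;
}
```

Given the dequeued sequence _test, -[ViewController viewDidLoad], _test, +[ViewController load], the result is +[ViewController load], _test, -[ViewController viewDidLoad]: each symbol appears once, at the position of its earliest call, which is the order the order file needs.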
- Copy the mm.order file to the specified location and set the Order File path in Build Settings to point to it, e.g. ./mm.order.
Below is a comparison of the LinkMap symbol order before and after configuring the order file (before on top, after below).
Note: avoid infinite loops
- If Build Settings -> Other C Flags is set to -fsanitize-coverage=trace-pc-guard, the while loop will run forever (we debug this in the touchesBegan method).
- With assembly debugging enabled, we find three calls to __sanitizer_cov_trace_pc_guard:
  - The first bl is for touchesBegan.
  - The second bl is caused by the while loop. Every jump is hooked, that is, any branch instruction such as b (branch) or bl (branch with link) is hooked. Each pass through the loop therefore enqueues a new node, and the queue never becomes empty.
  - The third bl is for printf.
- Solution: change Other C Flags to -fsanitize-coverage=func,trace-pc-guard, so that only function entries are hooked.
References
- iOS optimization: App startup time optimization
- AppOrderFiles