Summary of basic principles of iOS

Premise: The previous two articles introduced the basic concepts and ideas behind startup optimization. This article focuses on one optimization for the pre-main stage: binary rearrangement. The scheme comes from Douyin's engineering practice, described in the article "The solution based on binary file rearrangement has increased APP startup speed by more than 15%".

Binary rearrangement principle

From the virtual memory section, we know that when a process accesses a virtual memory page that has no corresponding physical page, a Page Fault is triggered and the process is blocked. The data has to be loaded into physical memory before the access can be retried, and this costs time.
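To make the cost concrete, here is a minimal sketch in portable C (not iOS-specific) that counts the page faults caused by touching freshly mapped memory. The function names `total_page_faults` and `faults_from_touching` are hypothetical and chosen for illustration; the counters come from the POSIX `getrusage` call, and some sandboxed environments may report them as zero.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Total page faults (minor + major) charged to this process so far. */
static long total_page_faults(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_minflt + usage.ru_majflt;
}

/* Map `pages` fresh anonymous pages, write one byte into each of them,
   and return how many page faults the process was charged for meanwhile.
   Returns -1 if the mapping could not be created. */
long faults_from_touching(size_t pages) {
    long page_size = sysconf(_SC_PAGESIZE);
    size_t length = pages * (size_t)page_size;
    volatile unsigned char *memory = mmap(NULL, length, PROT_READ | PROT_WRITE,
                                          MAP_PRIVATE | MAP_ANON, -1, 0);
    if (memory == MAP_FAILED) return -1;

    long before = total_page_faults();
    for (size_t i = 0; i < pages; i++) {
        memory[i * (size_t)page_size] = 1;  /* first touch of each page */
    }
    long after = total_page_faults();

    munmap((void *)memory, length);
    return after - before;
}
```

Calling `faults_from_touching(64)` on a typical desktop system usually reports a number close to the number of pages touched, because each first access to a page has to be resolved by the kernel.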

During a cold start, an app loads and executes a large number of classes, categories, and third-party libraries, and the resulting Page Faults can add up to a significant amount of time. Taking WeChat as an example, let's look at the number of Page Faults during the startup stage:

  • Press CMD+I to open Instruments and select System Trace

  • Click Start (restart the phone first to clear cached data), stop recording once the first interface appears, and follow the steps in the following figure

  • As the figure shows, WeChat triggers 2800+ Page Faults during startup, which has a noticeable impact on performance.

  • Next, we use a demo to see how methods are ordered at compile time. Define the following methods, in this order, in ViewController

@implementation ViewController

void test1(){
    printf("1");
}

void test2(){
    printf("2");
}

- (void)viewDidLoad {
    [super viewDidLoad];
    
    test1();
}

+(void)load{
    printf("3");
    test2();
}
@end

In Build Settings, set Write Link Map File to YES.

Press CMD+B to build the demo, then open the link map file at the corresponding path, as shown below. The functions within a class are laid out from top to bottom in source order, and the files are laid out in the order listed in Build Phases -> Compile Sources.

From the number of Page Faults and the layout order above, we can see the root cause of the excessive Page Faults: the methods called at startup are scattered across different pages. The optimization idea is therefore to line up all the methods needed at startup next to each other, ideally within a single page, so that many Page Faults become a single Page Fault (or as few as possible). This is the core principle of binary rearrangement, as shown below.
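The effect of packing startup symbols together can be estimated with simple arithmetic: count how many distinct pages contain at least one startup symbol. The sketch below does that for a list of addresses. The function name `pages_touched` and the sample addresses in the usage note are made up for illustration; 16 KB (0x4000) is the page size on arm64 iOS devices.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for 64-bit page numbers. */
static int compare_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Return the number of distinct pages that contain at least one of the
   given symbol addresses. Each distinct page is at most one Page Fault
   the first time the startup code runs through it. */
size_t pages_touched(const uint64_t *addresses, size_t count, uint64_t page_size) {
    if (count == 0 || page_size == 0) return 0;

    uint64_t *pages = malloc(count * sizeof *pages);
    if (pages == NULL) return 0;

    for (size_t i = 0; i < count; i++) {
        pages[i] = addresses[i] / page_size;  /* page number of the symbol */
    }
    qsort(pages, count, sizeof *pages, compare_u64);

    size_t distinct = 1;
    for (size_t i = 1; i < count; i++) {
        if (pages[i] != pages[i - 1]) distinct++;
    }
    free(pages);
    return distinct;
}
```

With four hypothetical startup symbols at 0x100000000, 0x100004008, 0x100010000, and 0x10002C000, the function reports four pages. After rearrangement places the same four symbols at 0x100000000, 0x100000040, 0x100000100, and 0x100000200, it reports one.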

Note: In the iOS production environment, when a Page Fault occurs, the system also verifies the app's signature as the page is loaded. A Page Fault in production therefore takes longer than one in the Debug environment.

Binary rearrangement practice

Now let's put this into practice. First, some terminology.

Linkmap

The link map is an intermediate product of an iOS build that records the layout of the binary. To generate it, enable Write Link Map File in Xcode's Build Settings. The link map consists of three parts:

  • Object Files: the path and file number of each link unit used to generate the binary
  • Sections: the address range of each segment/section in the Mach-O file
  • Symbols: the address range of each symbol, in order

ld

ld is the linker used by Xcode, and it has an order_file parameter. In Build Settings -> Order File we can configure the path to a file with the .order suffix. The symbols we need are written into this order file in the desired order, and when the project is built the linker lays them out in that order, which achieves our optimization.

So the essence of binary rearrangement is to rearrange the symbols that are loaded at startup.

If the project is small, we can write the order file by hand and list the methods manually. If the project is large and involves many methods, however, how do we find out which functions run at startup? There are several approaches:

  • 1. Hook objc_msgSend: an Objective-C method call is essentially a message send that ends up in objc_msgSend. Because objc_msgSend takes variadic parameters, the hook has to be written in assembly, which places higher demands on the developer. It also only captures Objective-C methods and Swift methods marked @objc.
  • 2. Static scanning: scan the symbol and function data stored in specific segments or sections of the Mach-O file.
  • 3. Clang instrumentation: a batch hook that can achieve 100% symbol coverage, capturing Swift, Objective-C, C, and block functions alike.

Clang instrumentation

LLVM has a simple code coverage instrumentation built in, SanitizerCoverage. It inserts calls to user-defined functions at the function level, basic block level, and edge level. We use SanitizerCoverage for the batch hook here.

The official documentation for Clang's coverage instrumentation is the Clang SanitizerCoverage documentation, which gives a detailed overview as well as a short demo.

  • [Step 1: Configure] Enable SanitizerCoverage

    • For an Objective-C project, add -fsanitize-coverage=func,trace-pc-guard to Other C Flags in Build Settings
    • For a Swift project, additionally add -sanitize-coverage=func and -sanitize=undefined to Other Swift Flags
    • Every binary linked into the app needs SanitizerCoverage enabled in order to cover all calls.
    • The flags can also be configured through the Podfile
post_install do |installer|
  installer.pods_project.targets.each do |target|
    target.build_configurations.each do |config|
      config.build_settings['OTHER_CFLAGS'] = '-fsanitize-coverage=func,trace-pc-guard'
      config.build_settings['OTHER_SWIFT_FLAGS'] = '-sanitize-coverage=func -sanitize=undefined'
    end
  end
end

[Step 2: Implement the callbacks] Create an Objective-C file, CJLOrderFile, and implement the following two callbacks

  • __sanitizer_cov_trace_pc_guard_init method

    • Parameter 1, start, is a pointer to an unsigned int (4 bytes). It is the starting position of an array with one entry per symbol (when inspecting the memory, the bytes are read from high to low).
    • Parameter 2, stop, marks the end of that array. It is not the address of the last entry itself: because each entry is 4 bytes, the real address of the last entry = printed address of stop - 0x4.
    • What does the value stored at that last address represent? It is the number of symbols. After adding a method, a block, a C++ function, or a property (a property adds three), the value increases by the corresponding number, for example by one after adding a test1 method.
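The relationship between start, stop, and the symbol count can be reproduced outside of Xcode. The sketch below runs the same numbering loop as the init callback over an ordinary array. `number_guards` is a hypothetical stand-in for `__sanitizer_cov_trace_pc_guard_init`, used here so that the code runs without any sanitizer flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Number every guard in [start, stop) the way the init callback does and
   return how many guards the range holds. Returns 0 when the range is
   empty or has already been numbered. */
size_t number_guards(uint32_t *start, uint32_t *stop) {
    static uint32_t next_id;  /* shared across calls, like the static N in the callback */

    if (start == stop || *start) return 0;
    for (uint32_t *guard = start; guard < stop; guard++) {
        *guard = ++next_id;   /* guards are numbered from 1 */
    }
    return (size_t)(stop - start);
}
```

For an array of five guards, the last guard sits 4 bytes before stop and ends up holding the value 5, which matches the observation that the value at stop - 0x4 equals the number of instrumented symbols.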

The __sanitizer_cov_trace_pc_guard method captures every symbol called during startup and enqueues it

  • The guard parameter is a sentinel that tells us which numbered symbol was called
  • Storing the symbols requires a linked list, so we define a list node type, CJLNode
  • An atomic queue is created with OSQueueHead to keep reads and writes thread-safe
  • Nodes are enqueued with OSAtomicEnqueue, and the next symbol is reached through the node's next pointer
#import <Foundation/Foundation.h>
#import <dlfcn.h>
#import <libkern/OSAtomic.h>

// Atomic queue that holds every captured symbol
static OSQueueHead queue = OS_ATOMIC_QUEUE_INIT;

// Linked-list node: the captured PC plus the link used by the queue
typedef struct {
    void *pc;
    void *next;
} CJLNode;

/*
 - start: address of the first guard
 - stop: not the address of the last guard; the last guard is at stop - 4,
   because each guard is an unsigned int (4 bytes)
 */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    static uint64_t N;  // counter
    if (start == stop || *start) return;  // initialize only once
    printf("INIT: %p - %p\n", start, stop);
    for (uint32_t *x = start; x < stop; x++) {
        *x = ++N;  // guards are numbered from 1
    }
}

/*
 Hooks every method, function, and block call to capture its symbol.
 It runs on multiple threads, so it only stores the PC, in a linked list.
 - guard: a sentinel that tells us which numbered symbol was called
 */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    // if (!*guard) return;

    // __builtin_return_address(0) is the return address of this hook, which lies
    // inside the function that called it, i.e. the function being recorded.
    // Level 1 would be the return address of that function's caller.
    void *PC = __builtin_return_address(0);

    // Create a node and fill it in
    CJLNode *node = malloc(sizeof(CJLNode));
    *node = (CJLNode){PC, NULL};

    // Enqueue. Symbols are reached through the next pointer rather than by index,
    // so offsetof(CJLNode, next) tells the queue where the link field is.
    OSAtomicEnqueue(&queue, node, offsetof(CJLNode, next));
}
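OSAtomicEnqueue and OSAtomicDequeue behave as a last-in, first-out list, which is why the symbols later come out in reverse call order. The sketch below imitates that behaviour with C11 atomics so it can run on any platform. `LifoQueue`, `lifo_push`, and `lifo_pop` are hypothetical names, and this is an illustration of the ordering, not Apple's implementation.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node {
    void *pc;            /* captured return address */
    struct Node *next;   /* link to the node pushed before this one */
} Node;

typedef struct {
    _Atomic(Node *) head;
} LifoQueue;

/* Push a PC onto the list. Safe to call from several threads at once:
   the compare-and-swap retries until the new node becomes the head.
   Returns 1 on success and 0 if allocation failed. */
int lifo_push(LifoQueue *queue, void *pc) {
    Node *node = malloc(sizeof *node);
    if (node == NULL) return 0;
    node->pc = pc;
    node->next = atomic_load(&queue->head);
    while (!atomic_compare_exchange_weak(&queue->head, &node->next, node)) {
        /* node->next now holds the current head; try again */
    }
    return 1;
}

/* Pop the most recently pushed PC, or NULL when the list is empty.
   Assumption: only one thread pops, after collection has finished,
   which sidesteps the ABA problem of a general lock-free stack. */
void *lifo_pop(LifoQueue *queue) {
    Node *node = atomic_load(&queue->head);
    while (node != NULL &&
           !atomic_compare_exchange_weak(&queue->head, &node, node->next)) {
        /* node now holds the current head; try again */
    }
    if (node == NULL) return NULL;
    void *pc = node->pc;
    free(node);
    return pc;
}
```

Pushing three values in the order 1, 2, 3 and then popping yields 3, 2, 1, so a consumer that wants call order has to reverse what it dequeues.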

[Step 3: Get all the symbols and write them to a file] A while loop dequeues the symbols, adds the prefix required by non-Objective-C symbols, and stores them in an array

  • Reverse the array, because the queue stores the symbols in reverse order
  • Remove duplicates from the array, and remove the symbol of the collecting function itself
  • Join the symbols in the array into a string and write it to the cjl.order file
// Set to YES once collection has finished (the declaration is needed by getOrderFile)
static BOOL collectFinished = NO;

extern void getOrderFile(void(^completion)(NSString *orderFilePath)) {
    collectFinished = YES;
    __sync_synchronize();
    NSString *functionExclude = [NSString stringWithFormat:@"_%s", __FUNCTION__];
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(0.01 * NSEC_PER_SEC)), dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        // Create the symbol array
        NSMutableArray<NSString *> *symbolNames = [NSMutableArray array];

        // Dequeue every node
        while (YES) {
            CJLNode *node = OSAtomicDequeue(&queue, offsetof(CJLNode, next));
            if (node == NULL) break;

            // Resolve the PC to symbol information
            Dl_info info = {0};
            dladdr(node->pc, &info);
            // printf("%s \n", info.dli_sname);

            if (info.dli_sname) {
                // Objective-C methods are kept as-is; other symbols get a "_" prefix
                NSString *name = @(info.dli_sname);
                BOOL isObjc = [name hasPrefix:@"+["] || [name hasPrefix:@"-["];
                NSString *symbolName = isObjc ? name : [@"_" stringByAppendingString:name];
                [symbolNames addObject:symbolName];
            }
        }

        if (symbolNames.count == 0) {
            if (completion) {
                completion(nil);
            }
            return;
        }

        // Reverse the array
        NSEnumerator *emt = [symbolNames reverseObjectEnumerator];

        // Remove duplicates
        NSMutableArray<NSString *> *funcs = [NSMutableArray arrayWithCapacity:symbolNames.count];
        NSString *name;
        while (name = [emt nextObject]) {
            if (![funcs containsObject:name]) {
                [funcs addObject:name];
            }
        }

        // Remove this function's own symbol
        [funcs removeObject:functionExclude];

        // Join the array into a string
        NSString *funcStr = [funcs componentsJoinedByString:@"\n"];
        NSLog(@"Order:\n%@", funcStr);

        // Write the string to a file
        NSString *filePath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"cjl.order"];
        NSData *fileContents = [funcStr dataUsingEncoding:NSUTF8StringEncoding];
        BOOL success = [[NSFileManager defaultManager] createFileAtPath:filePath contents:fileContents attributes:nil];
        if (completion) {
            completion(success ? filePath : nil);
        }
    });
}
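The post-processing in this step (prefixing, reversing, removing duplicates) does not depend on Foundation. Here is a minimal sketch of the same logic in plain C, assuming the symbol names have already been resolved to strings. `format_order_symbol` and `reverse_and_dedupe` are hypothetical helper names and are not part of the project above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the order-file spelling of a symbol into `out`: Objective-C
   methods ("+[...]" or "-[...]") are kept as-is, every other symbol gets
   a leading underscore. Returns 1 on success, 0 if `out` is too small. */
int format_order_symbol(const char *name, char *out, size_t out_size) {
    int is_objc = strncmp(name, "+[", 2) == 0 || strncmp(name, "-[", 2) == 0;
    int written = snprintf(out, out_size, "%s%s", is_objc ? "" : "_", name);
    return written >= 0 && (size_t)written < out_size;
}

/* `names` holds symbols in dequeue order (most recent call first).
   Walk it backwards to restore call order and keep only the first
   occurrence of each name. `out` must have room for `count` pointers.
   Returns the number of names kept. */
size_t reverse_and_dedupe(const char *const *names, size_t count, const char **out) {
    size_t kept = 0;
    for (size_t i = count; i-- > 0;) {
        int seen = 0;
        for (size_t j = 0; j < kept; j++) {
            if (strcmp(out[j], names[i]) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen) out[kept++] = names[i];
    }
    return kept;
}
```

For the demo class, a dequeue order of test2, test1, -[ViewController viewDidLoad], test2, +[ViewController load] becomes +[ViewController load], test2, -[ViewController viewDidLoad], test1, and the two C functions are then written to the order file as _test2 and _test1. The quadratic duplicate check mirrors the containsObject: loop and is fine for a few thousand startup symbols.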
[Step 4: Call it in didFinishLaunchingWithOptions] Note that where you call it is up to you; generally it is once the first interface has rendered
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
    
    [self test11];
    
    getOrderFile(^(NSString *orderFilePath) {
        NSLog(@"OrderFilePath:%@", orderFilePath);
    });
    
    return YES;
}

- (void)test11{
    
}

At this point, cjl.order contains only these three methods.

[Step 5: Copy the file, place it in the specified location, and configure the path] The file is usually placed in the main project directory, and Build Settings -> Order File is set to ./cjl.order.

Note: avoid infinite loops

  • If Build Settings -> Other C Flags is set to -fsanitize-coverage=trace-pc-guard, the while loop turns into an infinite loop (we debugged this in the touchBegin method)

With assembly debugging turned on, we find three calls to __sanitizer_cov_trace_pc_guard

  • The first bl is for touchBegin itself
  • The second bl is caused by the while loop: every jump is hooked, so any bl or b instruction gets hooked. Each iteration of the loop therefore enqueues a new node while it dequeues one, and the queue never drains.
  • The third bl is for printf

Solution: change Other C Flags to -fsanitize-coverage=func,trace-pc-guard. Adding func means that only function entries are hooked.

References

  • iOS optimization: App startup time optimization
  • AppOrderFiles