This article is based on Dai Ming's course "iOS Development Master Class", combined with my own practice and understanding. It has also been published on Jianshu: App start speed monitoring – method level start time check tool
How to build a method-level startup time check tool to aid analysis and monitoring
To check how long each method takes during startup by hooking objc_msgSend, we need to build a handy startup time check tool of our own.
First of all, we need to understand why hooking objc_msgSend lets us hook every Objective-C method.
Every object in Objective-C points to a class, every class has a method list, and each entry in that list consists of a selector, a function pointer, and metadata.
At runtime, objc_msgSend uses the receiving object and the method's selector to find the corresponding function pointer, then executes it. In other words, every Objective-C method call passes through objc_msgSend, so controlling it means controlling all Objective-C methods.
objc_msgSend itself is written in assembly language for two main reasons:
- objc_msgSend is the most frequently called function, so any optimization made to it improves performance throughout the App lifecycle. Assembly gives instruction-level control, which lets the optimization be pushed to the limit.
- In other languages it is difficult to jump to an arbitrary function pointer while preserving unknown parameters.
Apple has open-sourced the Objective-C runtime. You can find the source code for objc_msgSend on the Apple open source website.
The figure above lists the implementations for each architecture, including x86_64. objc_msgSend is at the heart of method execution on iOS.
objc_msgSend performs the following logic: it first obtains the class of the object, then obtains that class's method cache, looks up the function pointer by the method's selector, and finally, after handling exceptions and errors, jumps to the corresponding function implementation.
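The lookup order described above can be sketched in plain C. This is a toy model only, not the runtime's code: every `toy_` name is hypothetical, selectors are compared as strings (the real runtime compares selector pointers), and a missing method simply returns -1 instead of going through message forwarding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function pointer type, playing the role of an IMP. */
typedef int (*toy_imp)(void *self, const char *sel);

typedef struct { const char *sel; toy_imp imp; } toy_method;

typedef struct toy_class {
    const toy_method *methods;   /* method list */
    size_t method_count;
    toy_method cache[4];         /* tiny selector -> IMP cache */
    int slow_lookups;            /* how often the method list was scanned */
} toy_class;

typedef struct { toy_class *isa; int value; } toy_object;

/* Look in the cache first, then fall back to scanning the method list. */
static toy_imp toy_lookup(toy_class *cls, const char *sel) {
    size_t slot = strlen(sel) % 4;           /* trivial hash, illustration only */
    if (cls->cache[slot].sel && strcmp(cls->cache[slot].sel, sel) == 0)
        return cls->cache[slot].imp;          /* cache hit */
    cls->slow_lookups++;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].sel, sel) == 0) {
            cls->cache[slot] = cls->methods[i];   /* fill the cache */
            return cls->methods[i].imp;
        }
    }
    return NULL;
}

/* Toy counterpart of objc_msgSend: receiver -> class -> IMP -> jump. */
static int toy_msg_send(toy_object *self, const char *sel) {
    assert(self != NULL && self->isa != NULL);
    toy_imp imp = toy_lookup(self->isa, sel);
    if (imp == NULL) return -1;               /* stands in for "unrecognized selector" */
    return imp(self, sel);
}

/* Example method implementation: returns the object's value. */
static int toy_get_value(void *self, const char *sel) {
    (void)sel;
    return ((toy_object *)self)->value;
}
```

Sending the same selector twice scans the method list only once; the second call is served from the cache, which is the reason the real lookup is fast enough to sit on every method call.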
Hooking the objc_msgSend method
Facebook has open-sourced a library called fishhook for dynamically rebinding symbols in Mach-O binaries running on iOS: GitHub address
The general idea behind fishhook is that C functions can be hooked by rebinding their symbols. dyld binds lazy and non-lazy symbols by updating pointers in specific sections of the __DATA segment of the Mach-O binary. fishhook determines where each symbol name passed to rebind_symbols has its pointer stored, and writes the corresponding replacement there to rebind the symbol.
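To make the idea of rebinding concrete, here is a minimal, portable sketch that models a symbol pointer table as an ordinary array. It does not touch Mach-O or dyld at all; `toy_slot`, `toy_rebinding`, and `toy_rebind` are hypothetical names. Only the three-field shape (name, replacement, and an out-pointer for the original) is borrowed from fishhook's `struct rebinding`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*int_fn)(int);

/* One slot of a toy "symbol pointer table": callers always call through ptr. */
typedef struct { const char *name; int_fn ptr; } toy_slot;

/* Same shape as fishhook's struct rebinding, with toy types. */
typedef struct {
    const char *name;     /* symbol to rebind */
    int_fn replacement;   /* new implementation */
    int_fn *replaced;     /* receives the original pointer, may be NULL */
} toy_rebinding;

/* Swap the pointer of every slot whose name matches; return how many were rebound. */
static int toy_rebind(toy_slot *slots, size_t n, const toy_rebinding *rb) {
    assert(slots != NULL && rb != NULL);
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(slots[i].name, rb->name) != 0) continue;
        if (rb->replaced != NULL && slots[i].ptr != rb->replacement)
            *rb->replaced = slots[i].ptr;     /* hand the original back to the caller */
        slots[i].ptr = rb->replacement;
        count++;
    }
    return count;
}

static int real_double(int x) { return x * 2; }

static int_fn orig_double;                     /* filled in by toy_rebind */
static int hooked_double(int x) { return orig_double(x) + 1; }
```

After `toy_rebind` runs, every call that goes through the table reaches `hooked_double`, which can still reach the original through `orig_double`. The code that calls through the table does not change, which is why this style of hook needs no modification of the hooked function itself.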
fishhook first iterates through all images loaded by dyld and fetches each image's header and slide:

```c
if (!_rebindings_head->next) {
  // First call: register a callback for image additions
  // (dyld also invokes it for images that are already loaded)
  _dyld_register_func_for_add_image(_rebind_symbols_for_image);
} else {
  uint32_t c = _dyld_image_count();
  // Iterate over the images
  for (uint32_t i = 0; i < c; i++) {
    // Read the image header and slide
    _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
  }
}
```
Next, find the load commands related to the symbol table: the __LINKEDIT segment command, the symtab command, and the dysymtab command.

```c
segment_command_t *cur_seg_cmd;
segment_command_t *linkedit_segment = NULL;
struct symtab_command *symtab_cmd = NULL;
struct dysymtab_command *dysymtab_cmd = NULL;

// Load commands start right after the Mach-O header
uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
  cur_seg_cmd = (segment_command_t *)cur;
  if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
    if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
      // __LINKEDIT segment command
      linkedit_segment = cur_seg_cmd;
    }
  } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
    // symtab command
    symtab_cmd = (struct symtab_command *)cur_seg_cmd;
  } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
    // dysymtab command
    dysymtab_cmd = (struct dysymtab_command *)cur_seg_cmd;
  }
}
```
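Then compute the base address and use it to locate the symbol table, the string table, and the indirect symbol table:

```c
// Base address: slide + __LINKEDIT vmaddr - __LINKEDIT file offset
uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
// Symbol table and string table
nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);
// Indirect symbol table (an array of uint32_t indices into the symbol table)
uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);
```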
Finally, with the symbol tables and the array of rebindings passed in, we can replace the pointers that the symbol table entries resolve to:

```c
uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
for (uint i = 0; i < section->size / sizeof(void *); i++) {
  uint32_t symtab_index = indirect_symbol_indices[i];
  if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
      symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
    continue;
  }
  uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
  char *symbol_name = strtab + strtab_offset;
  if (strnlen(symbol_name, 2) < 2) {
    continue;
  }
  struct rebindings_entry *cur = rebindings;
  while (cur) {
    for (uint j = 0; j < cur->rebindings_nel; j++) {
      // Skip the leading underscore of the symbol name when comparing
      if (strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
        if (cur->rebindings[j].replaced != NULL &&
            indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
          // Save the original pointer so the hook can still call it
          *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
        }
        // Point the symbol at the replacement
        indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
        goto symbol_loop;
      }
    }
    cur = cur->next;
  }
symbol_loop:;
}
```
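One detail in the loop above is easy to miss: the comparison uses &symbol_name[1] because names in the symbol table carry a leading underscore, while the names passed to fishhook do not. A minimal sketch of that check alone, with a hypothetical helper name, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when a symbol-table name such as "_objc_msgSend" refers to the
   C-level name "objc_msgSend". Names shorter than two characters are
   rejected, mirroring the strnlen(symbol_name, 2) < 2 guard. */
static bool toy_symbol_matches(const char *symbol_name, const char *c_name) {
    assert(symbol_name != NULL && c_name != NULL);
    if (symbol_name[0] == '\0' || symbol_name[1] == '\0')
        return false;                              /* too short to hold "_x" */
    return strcmp(&symbol_name[1], c_name) == 0;   /* skip the first character */
}
```

Note that, like the original loop, the helper does not verify that the skipped character really is an underscore; it only skips one character.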
fishhook works at a low level. Its way of locating the symbol table is similar to how stack symbolication is implemented, so understanding this principle also helps in understanding the internal structure of Mach-O executables.
Next, a question: can fishhook alone handle hooking objc_msgSend?
Of course not. objc_msgSend is implemented in assembly language, so we also need some extra work at the assembly level.
We need to implement two functions, pushCallRecord and popCallRecord, that record the time before and after each objc_msgSend call. Subtracting the two timestamps gives the method's execution time.
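A rough sketch of the bookkeeping behind those two functions is shown below. It is written under several assumptions: the `toy_` names, signatures, and fixed stack depth are hypothetical, timestamps are passed in by the caller so that the example stays deterministic, and a single stack is used, whereas a real tool would read a clock and most likely keep one stack per thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DEPTH 64

typedef struct {
    const char *sel;       /* selector name of the call */
    uintptr_t   lr;        /* return address to restore afterwards */
    uint64_t    start_us;  /* timestamp taken on entry */
} toy_call_record;

typedef struct {
    toy_call_record frames[TOY_MAX_DEPTH];
    int depth;             /* number of calls currently in flight */
} toy_call_stack;

/* Record the start of a call. Returns 0 on success, -1 if the stack is full. */
static int toy_push_call_record(toy_call_stack *s, const char *sel,
                                uintptr_t lr, uint64_t now_us) {
    assert(s != NULL);
    if (s->depth >= TOY_MAX_DEPTH) return -1;
    toy_call_record *r = &s->frames[s->depth++];
    r->sel = sel;
    r->lr = lr;
    r->start_us = now_us;
    return 0;
}

/* Close the innermost call: report its cost and return the saved lr. */
static uintptr_t toy_pop_call_record(toy_call_stack *s, uint64_t now_us,
                                     const char **sel_out, uint64_t *cost_us_out) {
    assert(s != NULL && s->depth > 0);
    toy_call_record *r = &s->frames[--s->depth];
    if (sel_out) *sel_out = r->sel;
    if (cost_us_out) *cost_us_out = now_us - r->start_us;
    return r->lr;
}
```

Because method calls nest, a stack is the natural structure: the record pushed last belongs to the call that returns first. Storing lr in the record matters because the hook overwrites the link register when it calls the original objc_msgSend, so the saved value is the only way back to the original caller.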
Below, for the arm64 architecture, we write assembly code that preserves the unknown arguments and jumps to an arbitrary C function pointer, which is what makes hooking objc_msgSend possible.
arm64 has 31 64-bit integer registers, named x0 to x30. The main implementation ideas are as follows:
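- The argument registers are x0 to x7. For objc_msgSend, the first argument in x0 is the receiving object and the second argument in x1 is the selector _cmd. x8 must be preserved as well: the arm64 calling convention uses it as the indirect result location register (the syscall number lives in x8 on Linux, but in x16 on iOS).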
- Rearrange the arguments held in the registers, moving the return address in the link register lr into x1.
- Call pushCallRecord using the bl label syntax.
- Execute the original objc_msgSend and save its return value.
- Call popCallRecord using the bl label syntax.
The specific assembly code is as follows: