This is the first day of my participation in the November Writing Challenge. For event details, see: The Last Writing Challenge of 2021.

Hook objc_msgSend

This approach writes hook_msgSend directly in assembly and then installs it with fishhook. The principle is fairly simple: fishhook walks each loaded image looking for its __la_symbol_ptr and __nl_symbol_ptr/__got sections and rewrites the pointer stored there for the target symbol. Rebinding is usually done after dynamic linking, once main has started running; it can also be done in +load, but that is harder to control. Hooking objc_msgSend directly therefore lets you measure how long each OC method takes, and then monitor and optimize accordingly.
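To make the rebinding idea concrete, here is a toy model in Python. It is only a sketch of the principle, not fishhook itself: the dictionary `symbol_ptrs` stands in for an image's lazy symbol pointer table, and `rebind`, `timings` and the `Greeter` class are hypothetical names made up for the illustration.

```python
import time

# Toy stand-in for an image's lazy symbol pointer table (__la_symbol_ptr):
# every call site looks its target up in this table.
symbol_ptrs = {}


def objc_msgSend(receiver, selector, *args):
    """Stand-in for the real dispatcher: call the method named by selector."""
    return getattr(receiver, selector)(*args)


symbol_ptrs["objc_msgSend"] = objc_msgSend


def rebind(table, name, replacement):
    """Swap the pointer stored under `name` and return the original one."""
    original = table[name]
    table[name] = replacement
    return original


timings = []  # (class name, selector, elapsed seconds)


def hook_msgSend(receiver, selector, *args):
    """Time the call, then forward it to the original dispatcher."""
    start = time.perf_counter()
    try:
        return original_msgSend(receiver, selector, *args)
    finally:
        elapsed = time.perf_counter() - start
        timings.append((type(receiver).__name__, selector, elapsed))


original_msgSend = rebind(symbol_ptrs, "objc_msgSend", hook_msgSend)


class Greeter:
    def greet(self, name):
        return "hello " + name


# The call site still goes through the table, so it now reaches the hook.
result = symbol_ptrs["objc_msgSend"](Greeter(), "greet", "world")
```

Because every call goes through the same table slot, the hook sees all calls, which is exactly why this approach cannot hook only part of the project.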

Existing problems:

  1. Can methods in categories not be hooked?
  2. It can only hook everything, not a subset;
  3. The timing code in the hook is itself a performance cost;

The second problem leads to:

  1. [Fixed] Hooking everything causes performance issues
  2. The project itself may crash (for example with a bad access), and such problems may not be easy to fix. If a crash cannot be fixed and the hook cannot be limited to part of the code, the whole project cannot use this tool.

Hence the second approach...

Static instrumentation

I don't know where the term "static instrumentation" (literally "statically inserting a pile") came from, but I don't think "pile" is the right reading. The word means a stub function, and it probably refers to the stub functions in the __TEXT segment.

It works by changing what the symbol-table entry for objc_msgSend points to, or by changing the name to hook_msgSend in the string table.

Besides static instrumentation, a hook can also be applied at build time in the following ways:

  1. LLVM syntax-tree processing, that is, turning calls to objc_msgSend into calls to our own function, for example hook_msgSend;
  2. After compiling and before linking, modify the calls in the text section of the object file;

Both of these are relatively difficult and have a high barrier to entry, while static instrumentation is simpler and more effective:

Before linking, every source file is compiled into an object file. Calls to external functions in an object file are recorded in its symbol table. At link time, the symbol tables of all object files are merged into the final executable to complete the link.

The basic idea is to replace a string in the string table of every object file (.o file) after compilation and before linking. The symbol table does not store strings; it stores symbol-related information and pointers, including each name's offset into the string table. The address of a static external symbol is 0 before static linking, and that of a dynamic external symbol is 0 before dynamic linking. Patching the string table is also the simpler option: once the string is replaced, every objc_msgSend call in the object file becomes a call to hook_msgSend.
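The split between symbol table and string table can be seen by decoding a few symbol entries by hand. The sketch below assumes the 64-bit `nlist_64` layout from `mach-o/nlist.h` (16 bytes: `n_strx`, `n_type`, `n_sect`, `n_desc`, `n_value`). The two tables are synthetic bytes built for the demo, and `read_symbols` is a hypothetical helper, not part of any tool.

```python
import struct

# nlist_64: n_strx (u32), n_type (u8), n_sect (u8), n_desc (u16), n_value (u64)
NLIST_64 = struct.Struct("<IBBHQ")
N_EXT = 0x01    # external-symbol bit in n_type
N_TYPE = 0x0E   # mask for the type bits in n_type
N_UNDF = 0x00   # undefined: left for a later link step to resolve
N_SECT = 0x0E   # defined in one of this file's sections


def read_symbols(symtab: bytes, strtab: bytes):
    """Return (name, is_undefined_external, n_value) for each entry."""
    symbols = []
    for n_strx, n_type, _sect, _desc, n_value in NLIST_64.iter_unpack(symtab):
        # The entry holds only an offset; the name lives in the string table.
        end = strtab.index(b"\x00", n_strx)
        name = strtab[n_strx:end].decode("ascii")
        undefined = (n_type & N_TYPE) == N_UNDF and bool(n_type & N_EXT)
        symbols.append((name, undefined, n_value))
    return symbols


# Synthetic tables: _main is defined locally, _objc_msgSend is external.
strtab = b"\x00_main\x00_objc_msgSend\x00"
symtab = (
    NLIST_64.pack(1, N_SECT | N_EXT, 1, 0, 0x10)    # name at offset 1
    + NLIST_64.pack(7, N_UNDF | N_EXT, 0, 0, 0)     # name at offset 7
)

symbols = read_symbols(symtab, strtab)
```

In this demo the undefined external entry carries a value of 0 and nothing but a string-table offset for its name, which is why changing the bytes at that offset is enough to retarget the symbol.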

Conclusion:

  1. Modifying the symbol table directly is not feasible, or costs too much (you would have to iterate over the symbol table, add hook_msgSend to strtab, and repoint the objc_msgSend entry).
  2. Instead, patch the object file as a whole by replacing objc_msgSend in its string table.
  3. Because this happens before linking, the replacement is done per .o file, so hooking only some of the files is supported;
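The replacement described in these conclusions can be sketched as follows. This is a minimal sketch under stated assumptions, not a finished tool: it handles only a thin 64-bit little-endian Mach-O file, it locates the string table through the `LC_SYMTAB` load command, and it relies on `_objc_msgSend` and `_hook_msgSend` having the same length so that no offsets move. `patch_string_table` is a hypothetical helper name, and the input is a tiny synthetic file rather than a real .o file.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit little-endian Mach-O
LC_SYMTAB = 0x2
HEADER_SIZE = 32          # sizeof(struct mach_header_64)


def patch_string_table(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace symbol name `old` with `new` inside the string table only."""
    if len(old) != len(new):
        raise ValueError("names must be the same length to keep offsets valid")
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsv = struct.unpack_from(
        "<IiiIIIII", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit Mach-O file")
    buf = bytearray(data)
    offset = HEADER_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmd == LC_SYMTAB:
            # symtab_command: cmd, cmdsize, symoff, nsyms, stroff, strsize
            _symoff, _nsyms, stroff, strsize = struct.unpack_from(
                "<IIII", data, offset + 8)
            strtab = bytes(buf[stroff:stroff + strsize])
            # Match whole NUL-delimited names so longer names are untouched.
            buf[stroff:stroff + strsize] = strtab.replace(
                b"\x00" + old + b"\x00", b"\x00" + new + b"\x00")
        offset += cmdsize
    return bytes(buf)


# Synthetic "object file": header + one LC_SYMTAB command + string table.
strtab = b"\x00_main\x00_objc_msgSend\x00"
stroff = HEADER_SIZE + 24
header = struct.pack("<IiiIIIII", MH_MAGIC_64, 0x0100000C, 0, 1, 1, 24, 0, 0)
symtab_cmd = struct.pack("<IIIIII", LC_SYMTAB, 24, 0, 0, stroff, len(strtab))
obj = header + symtab_cmd + strtab

patched = patch_string_table(obj, b"_objc_msgSend", b"_hook_msgSend")
```

Running this over only a chosen subset of .o files gives the partial hook mentioned above. The final link then has to be given an implementation of hook_msgSend, since the patched files now reference that symbol.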

Binary rearrangement

The Mach-O file is loaded into memory before the App starts. One difference between virtual memory and disk is that, because of paging, a segment's size in memory may not be the same as its size in the Mach-O file on disk.

The App calls many functions during startup, so it loads the implementations of many symbols, that is, code in the __TEXT segment. Because memory is paged, if the functions needed at startup are scattered across many pages, startup triggers many page faults.
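A small simulation shows how much the layout matters. Everything here is made up for illustration: 64 functions of 4 KB each, of which every eighth one is called at startup, and a 16 KB page size (assumed to match arm64 iOS devices). `pages_touched` is a hypothetical helper that counts the distinct pages containing startup code, which is an upper bound on the page faults that code can cause.

```python
PAGE_SIZE = 16 * 1024  # assumed page size for the demo


def pages_touched(layout, called):
    """Count distinct pages holding any function named in `called`.

    layout: list of (name, size_in_bytes) in the order they are linked.
    called: set of function names used during startup.
    """
    pages = set()
    addr = 0
    for name, size in layout:
        if name in called:
            first = addr // PAGE_SIZE
            last = (addr + size - 1) // PAGE_SIZE
            pages.update(range(first, last + 1))
        addr += size
    return len(pages)


# Hypothetical binary: f0..f63, 4 KB each; startup calls f0, f8, ..., f56.
names = [f"f{i}" for i in range(64)]
startup = {f"f{i}" for i in range(0, 64, 8)}
original = [(name, 4096) for name in names]

# Same functions, but with the startup ones moved to the front.
reordered = sorted(original, key=lambda item: item[0] not in startup)

scattered = pages_touched(original, startup)
packed = pages_touched(reordered, startup)
```

With these numbers the scattered layout touches 8 pages and the packed layout touches 2, for the same 32 KB of startup code.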

Is the whole Mach-O file loaded into memory when an image is loaded? I don't think so: judging from the dyld source, one page is loaded first, and more pages are loaded if that is not enough. My guess is that a fixed range of virtual memory is reserved first, to keep addresses contiguous and consistent with the Mach-O layout. In that case, virtual memory that has not been paged in does not point to any physical memory, and a page-in is triggered when the code being called has not been loaded yet. (I am not sure about this.)

The core of binary rearrangement is to place the symbols called at startup on as few pages as possible, which minimizes page faults, reduces I/O, and ultimately shortens startup time.
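In practice the desired order is handed to the linker as an order file (ld's `-order_file` option), which lists one symbol per line. A minimal sketch of producing that list from a startup call trace is shown below. The trace itself is assumed to come from some instrumentation, such as the objc_msgSend hook discussed earlier, and the symbol names and `build_order_file` are hypothetical.

```python
def build_order_file(trace):
    """Return order-file text: each symbol once, in first-call order."""
    seen = set()
    lines = []
    for symbol in trace:
        if symbol not in seen:
            seen.add(symbol)
            lines.append(symbol)
    return "\n".join(lines) + "\n"


# Hypothetical startup trace; repeated calls appear more than once.
trace = [
    "_main",
    "-[AppDelegate application:didFinishLaunchingWithOptions:]",
    "-[HomeViewController viewDidLoad]",
    "-[AppDelegate application:didFinishLaunchingWithOptions:]",
    "-[HomeViewController viewDidLoad]",
]

order_file_text = build_order_file(trace)
```

Keeping first-call order means the code needed earliest sits at the front of the segment, and deduplicating keeps the file to one line per symbol.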

Problems with binary rearrangement:

  1. Difficult;
  2. Effect to be confirmed;
  3. Since iOS 13, dyld3 has introduced launch closures, so rearrangement may offer a poor return on the effort;

In dyld3, the information needed for App startup is stored on disk, which amounts to doing part of the pre-launch work ahead of time. The exact mechanism is unknown to me, so the binary rearrangement scheme still needs to be validated.

Related articles

Douyin Quality Construction – iOS Startup optimization principles

Douyin Quality Construction – iOS startup optimization

Douyin R&D practice: App startup speed increased by more than 15% with a solution based on binary file rearrangement

Implementing method hooks with static instrumentation

TimeProfiler

Monitor the time of all OC methods.

Simple implementation of static interception of iOS object method calls