Aside

App compilation process

More on App loading

Cold start and hot start

Cold start: the App is launched again after its process has been killed in the background. This startup mode is called a cold start.
Hot start: the App goes from running to suspended but is still alive in the background and has not been killed. Bringing it back to the foreground is called a hot start.

Composition of startup time

Startup time can be split into two stages, with the main() function as the dividing point:

T1 stage: the time spent before main() is called, known as pre-main
T2 stage: the time spent in main() and everything after it

T1 stage: pre-main

T2 stage

Time in the T2 stage is spent mainly in business code. BLStopwatch is recommended for measuring how long each piece of business code takes.

Measuring pre-main time with Xcode

Edit Scheme -> Run -> Arguments: add the environment variable DYLD_PRINT_STATISTICS and set its value to YES.

Phase	Work	How to optimize
Load dylibs	dyld reads the list of dependent dynamic libraries from the header of the main executable and then has to locate each dylib. Those dylibs may in turn depend on other dylibs, so what dyld actually loads is the full, recursive set of dependencies.	1. Avoid embedded dylibs where possible, because loading them is expensive; 2. Merge existing dylibs or use static archives to reduce the number of dylibs; 3. Load dylibs lazily, but be aware that dlopen() can cause problems of its own and actually does more work overall
Rebase and Bind	1. Rebase fixes up the pointers that point inside the image. Dynamic libraries used to be loaded at a fixed address, so every pointer in the code and data was already correct. Now that the address space layout is randomized, each pointer has to be adjusted from its original value by the random offset. 2. Bind sets up the pointers to content outside the image. These external pointers are bound by symbol name, so dyld has to search the symbol table for the implementation of each symbol.	1. Reduce the number of ObjC classes, methods, and categories; 2. Reduce the number of C++ virtual functions (creating a virtual function table is expensive); 3. Use Swift structs (internally optimized to need fewer symbols)
Objc setup	1. Register every declared Objective-C class in the global class table; 2. Insert category definitions into the category table; 3. Ensure that every selector is unique	Reduce the number of Objective-C classes, selectors, and categories by merging or deleting classes
Initializers	1. Objective-C +load methods; 2. C++ functions marked with the constructor attribute; 3. Construction of C++ static global variables of non-primitive types (usually classes or structs)	1. Do less in a class's +load method and defer the work to +initialize where possible; 2. Reduce the number of constructor functions and do less in each one; 3. Reduce the number of C++ static global variables

Physical memory and virtual memory

Memory management

Memory is managed in pages: the mapping table works in units of pages, not bytes.

Linux typically uses 4 KB pages
macOS uses 4 KB pages (on Intel Macs)
iOS uses 16 KB pages

To check the page size, type pagesize in a terminal.

Memory waste

On early computers, launching one application after another eventually produced an error once a certain number were running, and applications could no longer run properly. Early computers had no virtual addresses, so an application was loaded into physical memory in its entirety. Once physical memory ran out, no further application could be started. Applications were also laid out one after another in memory, so a process could reach another process's memory simply by reading a little past the end of its own address range, which is quite unsafe.

Virtual memory

An App does not use all of its code and data at the same time. If the whole App were loaded into memory as soon as it started, a lot of memory would be wasted. Virtual memory was introduced to solve this problem. After the App starts, it believes it has been given all the memory it needs to run, but in fact that much physical memory has not been allocated. The system only creates a table that maps virtual memory to physical memory.

Address translation

When the App needs to use a virtual memory address, the table is consulted to check whether physical memory has already been allocated for that virtual address.

If it has, the physical memory address is read from the table entry and accessed.
If it has not, a page of physical memory is allocated and recorded in the table (a Page Fault).

Mapping each process to different physical memory through its own mapping table is called address translation. It requires cooperation between the CPU and the operating system.

Page Fault

When the requested data is not in physical memory, the following happens:

The system blocks the process
The data for that page is loaded from disk into memory
The virtual memory page is pointed at the physical memory

This sequence is a Page Fault.

Flexible Memory Management

This solves the waste problem, but what happens when all of the physical memory is in use? Memory may run short. To keep the current App working, data is loaded according to the following rules:

If there is free memory, the data is placed in the free memory
If there is none, the data of another process is overwritten
Exactly what gets overwritten is decided by the operating system

Resolving security issues

The space problem is solved, but what about the security problem?

For safety, ASLR (Address Space Layout Randomization) and code signing are applied when dylibs are loaded.

ASLR: images (executables, dylibs, and bundles) are loaded at their preferred_address plus a random slide, so that their internal addresses cannot be predicted.

Why binary rearrangement

Virtual memory leads to Page Faults, and handling one takes time. The cost of a single Page Fault also varies widely, from 1 microsecond to 0.8 milliseconds.

This cost is not noticeable during normal use, but a large amount of data is loaded at startup. If many Page Faults occur, the delays add up and users clearly notice them.

If we put all the startup code on one or two pages, we can greatly reduce the number of Page Faults and speed up startup. This is binary rearrangement.

Binary rearrangement

Binary rearrangement aims to reduce startup time by reducing Page Faults.

Principle

By default, the binary is laid out in the order in which the object files (.o) are linked, and within each object file in the order in which its functions appear.

A static library (.a) file is an ar archive of .o files. Its contents can be inspected with the nm command.

Suppose there are only two pages, Page1 and Page2, and that method1 and method3, which sit on different pages, are called when the application starts. To execute them, the system has to take two Page Faults. If method1 and method3 are placed on the same page instead, a single Page Fault is enough. This is the core principle of binary rearrangement.


How to check an App's Page Faults

1. Open the project to be analyzed, press Command+I to open the Instruments debugging tool, and choose System Trace

2. Click "Run", and click "Stop" once the App has started and the home page is visible

3. When the run finishes, the full analysis is displayed. Enter "main thread" in the search box, then select Main Thread --> Virtual Memory below

Approaches to binary rearrangement

To actually implement binary rearrangement, we need to capture the symbols of all the methods, functions, and so on that run during startup, preserve their order, and write them to an order file.

There are several common ways to obtain the symbols:

fishHook
Clang instrumentation

fishHook

fishHook (github.com/facebook/fi…) is a tool open-sourced by Facebook that can hook system functions. We can use it to hook the system's objc_msgSend and collect function symbols. However, this approach cannot catch +initialize, blocks, or directly called functions.

Clang instrumentation

With Clang instrumentation, Objective-C methods, functions, and blocks can all be hooked. It works by inserting hook code into the binary of every function at compile time, which gives the effect of a global hook.

We need to track the execution of each method to obtain the order in which methods run at startup, and then write the order file in that order.

So what does Clang provide?

Code coverage tool

LLVM has a simple code coverage tool built in.

It inserts calls to user-defined functions at the function, basic-block, and edge levels
Default implementations of those callbacks provide simple coverage reporting and visualization

Tracing PCs with Guards

Implementation

1. In the project's Build Settings -> Other C Flags, add -fsanitize-coverage=trace-pc-guard

2. Write the following two functions in a source file

    #include <stdint.h>
    #include <stdio.h>

    // Called once per module; start..stop is that module's array of guards.
    void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
        static uint64_t N;  // Counter for the guards.
        if (start == stop || *start) return;  // Initialize only once.
        printf("INIT: %p %p\n", (void *)start, (void *)stop);
        for (uint32_t *x = start; x < stop; x++)
            *x = ++N;  // Guards should start from 1.
    }

    // Called from instrumented code; guard points at the number assigned above.
    void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
        if (!*guard) return;
        // Return address: a location inside the code that called this hook.
        void *PC = __builtin_return_address(0);
        printf("guard: %p %x PC %p\n", (void *)guard, *guard, PC);
    }

Run the code and look at the output.

The output alone does not explain much, so let's set a breakpoint.

The memory between start and stop holds a sequence of numbers.

Could these be function numbers? Add one more function to check: sure enough, after adding a function, 08 becomes 09.

__sanitizer_cov_trace_pc_guard

The guard_init function above gives us a number for every function, so there must be a way to get the details of each function. The key is __sanitizer_cov_trace_pc_guard, which we analyze next. Add a click handler that calls the function added earlier.

Click to call the test method and hit the breakpoint.

You can see that a call to __sanitizer_cov_trace_pc_guard was inserted into the method being called.

If that is still unclear, use Hopper to look at the generated Mach-O binary.

The disassembly shows a call to the hook function inside every function.

In other words, inside __sanitizer_cov_trace_pc_guard we can call __builtin_return_address to get the return address, which lies inside the function that called __sanitizer_cov_trace_pc_guard.

Getting the function name from a memory address

Once we have an address inside a function, how do we get the function's name? dlfcn.h declares the following:

    typedef struct dl_info {
        const char *dli_fname;  /* file */
        void       *dli_fbase;  /* file address */
        const char *dli_sname;  /* symbol name */
        void       *dli_saddr;  /* function start address */
    } Dl_info;

    int dladdr(const void *, Dl_info *);

To experiment, import the header #import <dlfcn.h> and modify the code as follows:

    void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
        if (!*guard) return;  // Duplicate the guard check.
        void *PC = __builtin_return_address(0);
        Dl_info info;
        if (dladdr(PC, &info) == 0) return;  // No image contains this address.
        printf("fname=%s \nfbase=%p \nsname=%s\nsaddr=%p \n",
               info.dli_fname, info.dli_fbase,
               info.dli_sname ? info.dli_sname : "(unknown)",  // May be NULL.
               info.dli_saddr);
        printf("guard: %p %x PC %p\n", (void *)guard, *guard, PC);
    }

Now that we can get the function information, let's briefly look at blocks, C functions, and the Swift configuration.

Block, C function

Add a block
Add a C function

Both can be hooked as well, but we won't walk through that here.

Mixed Swift projects

In Target -> Build Settings -> Swift Compiler - Custom Flags -> Other Swift Flags, add:

-sanitize-coverage=func
-sanitize=undefined

Binary rearrangement practice

Configure Clang instrumentation

Get the symbol list

The lazy approach is to define a class that collects the symbols of the methods called during startup and writes them to a local file named binary.order

Configure the binary.order file

LinkMap

In Xcode's Build Settings, enable Write Link Map File:

Then click Run to generate a link map file, which lists the symbols in the order in which they were linked:

This file is divided into four sections:

1. Path

Path is the path of the linked output file
Arch is the architecture type
Object Files lists all the .o and .tbd files linked into the executable. The number at the beginning of each line is the file number.

2. Sections (Mach-O information): records the address range of each segment/section

3. Symbols

Address is the address of the method in the file.
Size is the amount of memory the method occupies.
File is the file number, matching the number in brackets in the Object Files section.
Name is the method name.

4. Dead Stripped Symbols

Symbols that the linker considers unused and therefore leaves out of the final link
