1. Introduction
In daily development, apps commonly grow larger over many iterations and start up more and more slowly. Is there a solution? Let's first look at where the application's launch time goes.
1.1 Printing the application startup time
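We add the environment variable DYLD_PRINT_STATISTICS to the project's scheme (Edit Scheme > Run > Arguments > Environment Variables) to print startup timing. DYLD_PRINT_STATISTICS_DETAILS prints more detailed startup information.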
Run on the simulator iPhone12, and the running results are as follows:
Running again some time after the process has been killed:
It takes a little less time on a real device; the simulator's performance is a little worse.
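The pre-main stage mainly does the following:
- Dylib loading: as we saw when analyzing dyld earlier, this stage mainly loads and links the dynamic libraries the app depends on.
- Rebase/binding: rebasing (fixing up pointers inside the image for its load address) and binding (resolving symbols that live in other images).
- ObjC setup: registration of Objective-C classes, including categories.
- Initializer: class initialization (+load) and constructor functions.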
The specific details are displayed under Total. Total time: 4.0 seconds. Total images loaded: 475 (445 from dyld shared cache)
1.2 Optimization Ideas
For these stages, we can make the following optimizations:
- Minimize the number of linked dynamic libraries: remove dynamic libraries you don't need and prefer system libraries. Apple recommends keeping the number to 6 or fewer.
- Remove classes that are not needed, and merge similar classes and categories.
- Reduce the use of +load: use lazy loading, or use +initialize instead.
2. Concepts to know about startup optimization
Before we start tuning, we need to understand the concepts involved so that we know how the optimization works.
2.1 Physical memory and Virtual Memory
In early systems, programs ran directly in physical memory. The maximum addressable memory depends on the address width: 2^32 bytes = 4 GB for a 32-bit system and 2^64 bytes = 16 EB for a 64-bit system. Our program is loaded into physical memory at runtime. Physical memory is the memory space provided by the physical memory modules (RAM). Its main function is to provide temporary storage for the operating system and programs while the computer is running. Common physical memory sizes used to be 256 MB, 512 MB, 1 GB, 2 GB and so on. As computer hardware has developed, 4 GB, 8 GB and even larger capacities have become common.
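As a quick check of the address space arithmetic, here is a minimal Python sketch. The function name and the unit constants are only illustrative; the sketch assumes byte-addressable memory and binary units (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes).

```python
def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with a `bits`-wide address."""
    return 2 ** bits

GIB = 2 ** 30  # bytes in one gibibyte
EIB = 2 ** 60  # bytes in one exbibyte

size_32 = address_space_bytes(32) // GIB  # 32-bit address space, in GiB
size_64 = address_space_bytes(64) // EIB  # 64-bit address space, in EiB

print(f"32-bit: {size_32} GiB, 64-bit: {size_64} EiB")
```

The 64-bit figure is a theoretical upper bound; real devices have far less physical memory than that.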
2.1.1 What Happens When Memory Is Insufficient?
Our operating system has to support running multiple programs at once, so using physical memory directly quickly becomes a bottleneck. For example, with 200 MB of physical memory, if program A takes 100 MB and program B takes 150 MB, then starting B while A is running fails. Long ago, Windows would report an error such as "system fault, out of memory, try again later". But in fact program A does not use all of its 100 MB, because some of its features are never used by the user, and this wastes a great deal of memory. This is the problem of inefficient memory use.
There is a second problem. When programs run, they are loaded into physical memory contiguously. If program A reads or writes out of bounds, it can modify program B's data or cause errors. Early game cheats worked this way: a cheat tool could modify a value in a game's memory. So programs are not secure, because their address spaces are not isolated.
The third problem is that every time a program runs, a large enough free region of memory has to be allocated for it, and the location of that region is not fixed. But the target addresses of many data accesses and instruction jumps in the program are fixed, so the program has to be relocated.
What we want is for each program to behave as if it had memory and the CPU to itself, unaffected by other programs.
2.1.2 Virtual Memory
Adding an intermediate layer is a common approach in development, and here it means accessing memory through indirect addresses. Roughly, we treat the address given by the program as a virtual address and translate it into an actual physical address through some mapping. As long as we control the virtual-to-physical mapping properly, we can guarantee that the physical memory regions accessible to one program never overlap those of another, which achieves address space isolation. A virtual address space is, as the name says, virtual: it is an imagined address space that does not physically exist. Each process has its own virtual space and can only access its own address space, which effectively isolates processes from each other.
2.2 Segmentation
Segmentation was the earliest method used. The basic idea is to map a virtual space the size of the memory the program needs onto a region of the physical address space of the same size, one-to-one, so that each byte of virtual space corresponds to one byte of physical space. The mapping is set up by software, for example by the operating system, and the actual address translation is done by hardware. The following figure shows the mapping between virtual space and physical space for program A and program B under segmentation:
Segmentation basically solves two of the three problems above. First, it achieves address isolation: when program A accesses a virtual address outside its own range, the system treats it as an illegal access. Second, whichever physical region is assigned, this is transparent to the program. The program does not need to care about changes in physical address, so relocation is no longer required. But segmentation does not solve the efficiency problem. Memory regions are still mapped in units of whole programs, so when memory runs out, the entire program is swapped in and out of disk, causing a large amount of disk access and seriously affecting speed.
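To make the segment mapping concrete, here is a minimal Python sketch of a base-plus-bounds translation. It is an illustration under stated assumptions, not how any real operating system is implemented: the `Segment` class, the chosen base addresses and the sizes for programs A and B are all hypothetical.

```python
class Segment:
    """One program's virtual space mapped onto a contiguous physical region."""

    def __init__(self, base: int, size: int):
        self.base = base  # physical address where the segment starts (assumed)
        self.size = size  # size of the program's virtual space in bytes

    def translate(self, virtual_address: int) -> int:
        # Isolation: addresses outside the program's own range are rejected.
        if not 0 <= virtual_address < self.size:
            raise MemoryError(f"illegal access at {virtual_address:#x}")
        # Transparency: the program only sees virtual addresses;
        # the base can change without the program being relocated.
        return self.base + virtual_address


MB = 1024 * 1024
program_a = Segment(base=0 * MB, size=100 * MB)    # hypothetical layout
program_b = Segment(base=100 * MB, size=50 * MB)   # hypothetical layout

# The same virtual address lands in different physical locations.
a_phys = program_a.translate(0x1000)
b_phys = program_b.translate(0x1000)
print(hex(a_phys), hex(b_phys))

# Program A cannot reach past its own segment into program B's memory.
try:
    program_a.translate(100 * MB)
    out_of_range_rejected = False
except MemoryError:
    out_of_range_rejected = True
```

Note that each segment still covers the whole program, which is exactly the coarse granularity that makes swapping expensive.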
2.3 Paging
The granularity of segmentation is rather coarse, so people turned to dividing and mapping memory at a finer granularity, which takes full advantage of the program's locality principle and greatly improves memory utilization. This method is paging. The basic approach is to divide the address space into fixed-size pages. The page size is determined by the hardware or, if the hardware supports several page sizes, chosen by the operating system. The page size is currently 16 KB on iOS and 4 KB on Intel Macs (you can check it with the pagesize command). Pages in virtual space are called virtual pages, pages in physical memory are called physical pages, and pages on disk are called disk pages. When a page is used but has not been loaded into memory, a page fault occurs: the CPU interrupts the code it is executing, and the operating system loads the needed data into physical memory. If there is a free physical page, the data is placed there. In general, after a phone has been running for a while there are almost no free pages left, so the operating system uses a page replacement algorithm to overwrite inactive memory.
Implementing virtual memory requires hardware support, which varies from CPU to CPU, but almost all hardware uses a component called the Memory Management Unit (MMU) to perform the mapping.
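The page fault and page replacement behavior described in this section can be sketched in a few lines of Python. This is a simplified model, not the actual iOS implementation: the `PagedMemory` class is hypothetical, and least-recently-used (LRU) replacement is assumed purely for illustration, since the text does not say which replacement algorithm the system uses.

```python
from collections import OrderedDict

PAGE_SIZE = 16 * 1024  # 16 KB, the iOS page size mentioned in this section


def split_address(virtual_address: int) -> tuple:
    """Split a virtual address into (virtual page number, offset in page)."""
    return virtual_address // PAGE_SIZE, virtual_address % PAGE_SIZE


class PagedMemory:
    """Toy page table with a fixed number of physical pages (frames)."""

    def __init__(self, physical_frames: int):
        self.physical_frames = physical_frames
        self.resident = OrderedDict()  # virtual page -> frame, least recent first
        self.page_faults = 0

    def access(self, virtual_address: int) -> int:
        page, offset = split_address(virtual_address)
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as most recently used
        else:
            self.page_faults += 1  # page not in physical memory: page fault
            if len(self.resident) < self.physical_frames:
                frame = len(self.resident)  # a free frame is still available
            else:
                # No free frame: reuse the frame of the least recently used page.
                _, frame = self.resident.popitem(last=False)
            self.resident[page] = frame
        return self.resident[page] * PAGE_SIZE + offset


memory = PagedMemory(physical_frames=2)
for address in (0x0000, 0x0010, 0x4000, 0x8000, 0x0000):
    memory.access(address)
print(memory.page_faults)
```

In this run the second access hits a page that is already resident, while the last access faults again because its page was replaced in the meantime.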
3. PageFault debugging & startup optimization principle
In the virtual memory section, we saw that when a process accesses a virtual page whose corresponding physical page is not present, a Page Fault is triggered and the process is blocked. The data has to be loaded into physical memory before the access can be retried, which costs performance. We used the script to run Weike-7.0.8, and pre-main took 2.3 seconds.
So let's look at the number of Page Faults. Open Instruments (shortcut CMD+I), select System Trace, click the record button at the top left, then select the main thread and choose Summary: Virtual Memory.
The third run uses a lot of cached data, so it takes less time.
Turn off the phone and restart it to clear the cached data.
3.1 How Page Faults are produced
In Build Settings, search for Write Link Map File and set it to Yes. After compiling, use Show in Finder, go up one level from the folder shown, and open the generated link map file.
After opening it:
All the functions and methods we call are listed here, in the order in which their implementations are laid out in the binary. Address + ASLR is the address of the method in virtual memory. Let's add some methods
Compile again
From the number of Page Faults above and the loading order, we can see that the root cause of too many Page Faults is that the methods called at startup are located in different pages. Therefore, our optimization idea is: line up all the methods that need to be called at startup next to each other, ideally on a single page, so that multiple Page Faults turn into a single Page Fault.
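The effect of lining up startup methods can be estimated with a small Python sketch that counts how many pages the startup symbols touch before and after reordering. The symbol names and sizes are hypothetical, and the model assumes symbols are laid out back to back from address 0 with a 16 KB page size.

```python
PAGE_SIZE = 16 * 1024  # 16 KB pages, as on iOS


def pages_touched(layout, startup_symbols):
    """Count pages containing startup code.

    layout: list of (symbol, size_in_bytes) in the order they are linked.
    """
    startup = set(startup_symbols)
    pages = set()
    address = 0
    for symbol, size in layout:
        if symbol in startup:
            first = address // PAGE_SIZE
            last = (address + size - 1) // PAGE_SIZE
            pages.update(range(first, last + 1))
        address += size
    return len(pages)


# Hypothetical link order: startup code interleaved with unrelated code.
layout = [
    ("startup_a", 1024), ("unused_1", 20 * 1024),
    ("startup_b", 1024), ("unused_2", 20 * 1024),
    ("startup_c", 1024), ("unused_3", 20 * 1024),
]
startup = ["startup_a", "startup_b", "startup_c"]

# Reordered layout: startup symbols first, the rest keeps its relative order.
reordered = sorted(layout, key=lambda item: item[0] not in startup)

before = pages_touched(layout, startup)
after = pages_touched(reordered, startup)
print(before, after)
```

Each page that contains startup code costs at most one Page Fault on a cold launch, so fewer touched pages means fewer faults.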
3.2 Binary rearrangement
In the objc source code, we find a libobjc.order file, which looks like this when opened:
The order file is used at link time. When the linker reads the order file, it arranges the symbols in the binary in the order listed there.
Let's create a new order file, place it in the project root directory, and edit it as shown below
Let’s go to Build Settings and search for Order File
Compile again
Therefore, we can reduce the load time caused by page faults by listing the methods needed during startup in the .order file. How to collect those methods will be discussed in the next chapter.
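An order file is plain text with one symbol name per line, in the order the symbols should be laid out. As a minimal sketch, the Python below renders such a file from a list of startup symbols and drops duplicates while keeping the first occurrence. The symbol names are hypothetical examples, and how to collect the real list is outside this sketch.

```python
def render_order_file(symbols):
    """Return order-file text: one symbol per line, first occurrence kept."""
    seen = set()
    lines = []
    for symbol in symbols:
        if symbol not in seen:
            seen.add(symbol)
            lines.append(symbol)
    return "\n".join(lines) + "\n"


# Hypothetical startup symbols in the order they are first called.
startup_symbols = [
    "_main",
    "+[AppDelegate load]",
    "-[AppDelegate application:didFinishLaunchingWithOptions:]",
    "-[ViewController viewDidLoad]",
    "_main",  # duplicate, dropped by render_order_file
]

order_text = render_order_file(startup_symbols)
print(order_text, end="")
```

The resulting text can be saved to the file that the Order File build setting points to.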
4. Summary
Every optimization starts from waste that can be eliminated. When we do startup optimization, in addition to the usual methods of reducing +load, using lazy loading, and removing or merging classes, we can also do binary rearrangement: list the classes and methods needed at startup in the .order file in the desired order so that they sit on the same pages. This reduces page faults, the time spent loading pages, and therefore startup time. In addition, some concepts described in this article come from "Self-cultivation of programmers"; interested readers can have a look.