Preface

Hi Coder, I’m CoderStar!

It has been nearly a month since I finished my year-end summary. During this month I interviewed nearly 40 candidates and came away with some impressions, which I will cover in a separate article later.

As mentioned in the iOS Optimization – Slimming article, iOS optimization will be a series. Today's article, the second in the series, focuses on startup optimization, that is, how to reduce the startup time of your App.

There is already a lot of material on this topic online. This article mainly sorts out the optimization techniques I know of and summarizes them together with my own practical experience. WWDC has a session, Optimizing App Launch, which I recommend watching, since Apple's engineers are the more authoritative source. Some of the concepts described below are based on that video.

App startup types

There are three kinds of App startup: cold start, warm start, and resume.

Cold                    Warm                         Resume
After reboot            Recently terminated          App is suspended
App is not in memory    App is partially in memory   App is fully in memory
No process exists       No process exists            Process exists

Here are the differences between these startup types:

  • Cold start: occurs after the device reboots or when the App has not been launched for a long time. The system has to create the process and start the system-side services that support the App.
  • Warm start: compared with a cold start, the system-side services do not need to be started again.
  • Resume: strictly speaking this is not a startup, just a state change from background to foreground.

Why does a cold start occur when the App has not been launched for a long time? On iOS, background applications are gradually evicted from memory to free memory for the foreground application. If the user runs a memory-intensive application such as a game and then returns to your App, the frameworks and daemons your App depends on may have to be restarted and paged in from disk.

When we actually measure startup time, we should measure warm starts. The state of a cold start is not uniform, because it is hard to determine the running state of the system-side services or which caches are in use.

App Startup Process

Before optimizing, we need to understand the entire startup process of the App, so that we know how startup time is distributed across stages, which stages can be optimized, and which stage has the highest ROI.

In most cases, the startup process of an App is divided into two parts, pre-main and post-main. In fact, it can be further divided into three steps:

  • pre-main: before the main() function. The operating system loads the App executable into memory, performs a series of loading and linking tasks, and finally calls the App's main() function;
  • post-main: after the main() function, that is, from the start of main() until the AppDelegate's application:didFinishLaunchingWithOptions: method completes;
  • First screen rendering: until the first screen is rendered as a browsable/operable page;

pre-main

In this phase the operating system does almost all the work, so to optimize it you must first understand what the operating system does before main().

Before main(), the operating system loads the executable (in Mach-O format) into memory, loads dyld, and performs a series of dynamic linking and initialization operations (loading, binding, and initializers).

Program loading starts from exec(), which is a system call. The operating system first allocates a memory space for the process, then loads the App's executable into that space and loads dyld. After that, control of the startup process is handed over to dyld.

Loading process

The loading process in the pre-main stage is essentially the loading process of dyld, so what follows mainly walks through how dyld loads an App.

dyld (the dynamic link editor) is Apple's dynamic linker, a dedicated component for loading dynamic libraries, and it is open source. After the XNU kernel has prepared the program for startup, execution switches from kernel mode to user mode, and dyld completes the rest of the loading.

dyld first reads the Header and Load Commands of the Mach-O file to learn which dynamic libraries the executable depends on. For example, it loads dynamic library A into memory, then checks which libraries A depends on, and so on recursively until all dynamic libraries are loaded. An App typically depends on about 100-400 dynamic libraries; most are system libraries cached in the dyld shared cache, so loading them is efficient.
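The recursive dependency walk described above can be illustrated with a small model. This is only a sketch, not dyld's actual implementation: the `Image` type, the `loadOrder` function, and the sample dependency graph are all made up for illustration.

```swift
/// Hypothetical model of an image: its name and the libraries its load commands reference.
struct Image {
    let name: String
    let dependencies: [String]
}

/// Returns the order in which images are discovered when loading `root`:
/// an image first, then (recursively) everything it depends on, each image only once.
func loadOrder(root: String, images: [String: Image]) -> [String] {
    var loaded: [String] = []
    var visited: Set<String> = []

    func load(_ name: String) {
        // Skip images that were already loaded; this also protects against cycles.
        guard visited.insert(name).inserted, let image = images[name] else { return }
        loaded.append(name)
        for dependency in image.dependencies {
            load(dependency)
        }
    }

    load(root)
    return loaded
}

// Made-up dependency graph: the App links UIKit and a custom MyKit framework.
let images: [String: Image] = [
    "App": Image(name: "App", dependencies: ["UIKit", "MyKit"]),
    "UIKit": Image(name: "UIKit", dependencies: ["Foundation"]),
    "MyKit": Image(name: "MyKit", dependencies: ["Foundation"]),
    "Foundation": Image(name: "Foundation", dependencies: []),
]
let order = loadOrder(root: "App", images: images)
print(order)
```

Foundation is referenced twice but loaded only once, which is why shared dependencies do not multiply the loading cost.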

  1. dylib loading
    • Set up the runtime environment. This step mainly sets run parameters, environment variables, and so on. These are the Environment Variables and Arguments Passed On Launch that we usually configure in Xcode.
    • Load the shared cache. This loads system-level dynamic libraries such as UIKit. The cache is located at /System/Library/Caches/com.apple.dyld/dyld_shared_cache_armX, where armX is the instruction set architecture of the ARM processor.
    • Instantiate the main program. This step loads the main program's Mach-O into memory and instantiates an ImageLoader. The main program itself is loaded by the kernel.
    • Load inserted dynamic libraries. This step loads the dynamic libraries configured in the DYLD_INSERT_LIBRARIES environment variable; dyld is responsible for loading them.
  2. fixup: rebase (offset correction) / binding (symbol binding)
    • Link the main program. This step calls the link() function to dynamically fix up the instantiated main program so that the binary becomes executable.
    • Link the inserted dynamic libraries.
    • Perform weak symbol binding.
  3. Objc setup & initializer
    • Run the initialization methods. dyld initializes the dynamic libraries first and then the main program. Initialization consists of two parts:
      • Objc setup
        • Initialize the Objective-C runtime (registration of ObjC classes, registration of categories, selector uniqueness checks, and so on)
      • initializer
        • Call the ObjC +load methods
        • Run C/C++ functions declared with __attribute__((constructor))
        • Create C++ static global variables
  4. Execute main
    • Find the entry point, return it, and execute the main function

The steps above combine the usual description of the App's pre-main startup process with the dyld process. As you can see, this stage is mainly the dyld loading flow, so Apple's engineers also keep optimizing dyld's loading process. Compared with dyld2, dyld3 introduces optimizations such as launch closures. A separate article will be published later to introduce the evolution of dyld.

Rebase & Bind

Some readers may have questions about the rebase and bind steps, so here are some additional details.

Every method and function call inside the binary generated for an App has an address, which is an offset within that binary. Before ASLR (Address Space Layout Randomization), a program was loaded at a fixed address, so an attacker could know the address of any function inside the program and, for example, plant malicious code or modify function addresses, which was very dangerous.

With ASLR, every time the App starts the system assigns a random ASLR value (a security mechanism: a random offset applied to the start of the binary image in memory). For example, suppose the binary has a test method at offset 0x0001 and the randomly assigned ASLR value is 0x1F00. To access the test method, its memory address (the real address) is ASLR value + offset, determined at run time: 0x1F00 + 0x0001 = 0x1F01.

Rebase is the process of fixing up the image's internal memory addresses according to the random ASLR value during program startup. Roughly, dyld takes the pointer information from __LINKEDIT, adjusts each pointer by the offset, and stores it in __DATA. Rebase solves the problem of internal symbol references.

Binding: when the code references functions or variables in a dynamic library, the current Mach-O file points into another dylib, and binding is required. dyld finds the address of the function or variable through the symbol table. Binding fixes up pointers that point outside the image. For example, calling NSLog in a program creates an NSLog symbol in the Mach-O file at compile time, which at that point does not yet hold the real address. At run time, once the image has been loaded from disk into memory, dyld gives the symbol its real address; this is also known as dynamic library symbol binding. In a nutshell, binding is the process of assigning values to symbols.
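To make the arithmetic concrete, here is a minimal sketch of both fixups as plain functions. It only models the idea; `rebase`, `bind`, and the exported-symbol table are hypothetical names, and the NSLog address is made up.

```swift
/// The random ASLR value chosen at launch (0x1F00 in the example above).
let slide: UInt64 = 0x1F00

/// Rebase: an internal pointer stored as an offset becomes slide + offset.
func rebase(offset: UInt64, slide: UInt64) -> UInt64 {
    return slide + offset
}

/// Bind: an external symbol is looked up by name among the symbols
/// exported by the other loaded images.
func bind(symbol: String, exportedSymbols: [String: UInt64]) -> UInt64? {
    return exportedSymbols[symbol]
}

// Internal reference: the test method at offset 0x0001.
let testAddress = rebase(offset: 0x0001, slide: slide)
print(String(testAddress, radix: 16)) // 1f01

// External reference: a made-up runtime address for NSLog, for illustration only.
let exported: [String: UInt64] = ["_NSLog": 0x1_8000_4000]
let nsLogAddress = bind(symbol: "_NSLog", exportedSymbols: exported)
let missingAddress = bind(symbol: "_notExported", exportedSymbols: exported)
```

Rebase is pure arithmetic on addresses the image already contains, while bind needs a lookup by name, which is one reason a large number of external symbols costs more at startup.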

Related interview questions

  • Can a +load method in a category call a method with the same name?
  • In what order are the +load methods of dynamic libraries and of the main project called?

post-main

This phase covers the time from the start of the main function until the AppDelegate's application:didFinishLaunchingWithOptions: method finishes executing.

It involves startup items such as SDK initialization, setting the root view controller, and so on.

First screen rendering

This phase is mainly the rendering of the first screen. We would normally use viewDidAppear of the root view controller as the end of rendering, but by then some time has already passed since the first frame was rendered. In MetricKit, Apple defines the end of startup as the first CA::Transaction::commit().

Metrics and measurement methods

When the application starts, a launch animation is played: 400 ms on iPhone and 500 ms on iPad. Apple recommends that startup time not exceed the duration of this animation, and any startup that takes longer than 20 seconds is killed directly by the system's watchdog.

Measurement of startup time can be divided into offline and online parts. Offline, we can use the tools provided by Apple; online, we can add our own tracking points or obtain the data through the performance monitoring tools provided by Apple.

The pre-main stage is completed automatically by the operating system, so measuring it generally requires tool support, while the latter two stages can be measured with tracking points.

When verifying an optimization, we usually use offline methods first. However, because of the small sample size and other factors, conclusions drawn offline may not be very accurate. The real metrics still have to come from online statistics, such as TP90.
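As a reminder of what TP90 means, the sketch below computes it with the nearest-rank method: the value at or below which 90% of the samples fall. The `percentile` function and the sample launch times are made up for illustration; a real monitoring backend may use a different interpolation.

```swift
/// Returns the value at or below which `percent` percent of the samples fall
/// (nearest-rank method), or nil for empty input or an invalid percentage.
func percentile(_ percent: Int, of samples: [Double]) -> Double? {
    guard !samples.isEmpty, percent > 0, percent <= 100 else { return nil }
    let sorted = samples.sorted()
    // Ceiling of percent * count / 100, done in integers to avoid rounding issues.
    let rank = (percent * sorted.count + 99) / 100
    return sorted[rank - 1]
}

// Hypothetical launch times in milliseconds, made up for illustration.
let launchTimes: [Double] = [820, 640, 1500, 700, 910, 760, 3000, 680, 730, 850]
let tp90 = percentile(90, of: launchTimes)
let tp50 = percentile(50, of: launchTimes)
```

With these samples the median is 760 ms while TP90 is 1500 ms, which shows why a tail metric says more about slow launches than an average does.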

Offline

During testing we need to keep the test environment as consistent as possible. To do so:

  • Restart the device and leave it idle for 2-3 minutes;
  • Enable Airplane Mode or mock the network data to exclude the influence of the network on startup;
  • Turn off iCloud;
  • Test with a Release build whenever possible; this removes unnecessary debugging overhead from the measurement and takes advantage of compile-time optimizations;
  • Choose a device with slightly lower performance, so that the results cover as many users as possible;

Environment Variables

dyld has timing statistics built into its loading code, so we can get the pre-main time by adding environment variables.

Go to Product > Scheme > Edit Scheme… > Run > Arguments > Environment Variables, add DYLD_PRINT_STATISTICS, and set it to 1. For more detailed information, use DYLD_PRINT_STATISTICS_DETAILS.

After DYLD_PRINT_STATISTICS is added, the console prints the time spent in each part of the pre-main phase.

The order in which the times are displayed matches the dyld loading process introduced above.

After DYLD_PRINT_STATISTICS_DETAILS is added, the output is more detailed.

There are two things to note when using this approach:

  • iOS 15 and later no longer print this timing data;
  • Data obtained in a Debug session includes debugger pause time. We can turn off Debug executable in the scheme to remove this factor.

App Launch

Xcode 11 added the App Launch template to Instruments, which records and analyzes the launch process of our App.

In Xcode, build and run the App with the Profile action to open Instruments.

I won't go into the details of how to use it here; there are many tutorials online.

XCTest

For UI test targets, Xcode automatically generates a test case that measures App startup. This test launches your application six times and uses only the last five measurements. The first launch is skipped because it is treated as a "cold start" in which caches have to be set up.
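The generated test case looks roughly like the following. This is written from memory of the Xcode UI test template, so details may differ between Xcode versions; the class name `LaunchPerformanceTests` is made up. It only runs inside a UI test target on Apple platforms.

```swift
import XCTest

final class LaunchPerformanceTests: XCTestCase {
    func testLaunchPerformance() throws {
        if #available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 7.0, *) {
            // Measures how long it takes to launch the application under test.
            measure(metrics: [XCTApplicationLaunchMetric()]) {
                XCUIApplication().launch()
            }
        }
    }
}
```

Because the measurement is part of a test, it can be run regularly with a baseline, which helps catch startup regressions early.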

Logs

Since iOS 13.0, Settings > Privacy > Analytics & Improvements > Analytics Data contains log files named log-power-xxx.session, which provide some basic data about how applications run.

There is not one file per App; all applications are recorded in the same file, and we can find the record for our App by its bundle ID.

The following is a snippet of one application's record, taken from the log-power-2022-01-09-113331.session file on my device.

{
  "app_sessionreporter_key" : "69A6A581-C7E1-4ECD-BF82-3EAC569B13A7",
  "app_build_version" : "10.2.51.6000",
  "app_is_beta" : "false",
  "app_multiple_versions" : 0,
  "app_cohort" : "7|date=1603521000000&sf=143465&pgtp=Search&pgid=c6323522-55b2-4d88-b4ae-b3b338b1fd0d&prpg=Genre_179183&ctxt=Search&issrch=1",
  "app_version" : "10.2.51",
  "app_adamid" : 333206289,
  "app_arch" : "",
  "app_bundleid" : "com.alipay.iphoneclient", // App bundle ID
  "slice_uuid" : "",
  "app_storefront" : 143465,
  "performance_metrics" : {
    "disk_io" : {
      "totalWrites" : 160706560,
      "totalReads" : 2143322112
    },
    "memory" : {
      "average" : 164666852,
      "peak" : 223054080
    },
    "app_performance" : {
      "launch" : { // App startup duration data
        "fg" : {
          "count" : 0,
          "sessions" : []
        },
        "bg" : {
          "count" : 2, // Number of startups
          "sessions" : [ // Duration of each startup
            1000,
            1200
          ]
        }
      },
      "processExits" : {
        "bg" : {
          "cumulativeMemoryPressureExitCount" : 3
        }
      }
    }
  },
  "app_is_clip" : 0
}

From the comments above, we can clearly see the data related to App package name, startup times and startup duration.
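If you want to pull the startup durations out of such a record programmatically, a minimal sketch with Codable could look like this. It assumes you have already isolated a single record as valid JSON (the comments in the snippet above are annotations, not part of the file), and the `SessionRecord` type is made up for illustration. Only the fields needed here are modeled.

```swift
import Foundation

// Only the fields needed for startup data are modeled; JSONDecoder ignores the rest.
struct SessionRecord: Decodable {
    struct LaunchBucket: Decodable {
        let count: Int
        let sessions: [Double]
    }
    struct Launch: Decodable {
        let fg: LaunchBucket
        let bg: LaunchBucket
    }
    struct AppPerformance: Decodable {
        let launch: Launch
    }
    struct PerformanceMetrics: Decodable {
        let appPerformance: AppPerformance
        enum CodingKeys: String, CodingKey {
            case appPerformance = "app_performance"
        }
    }

    let bundleID: String
    let metrics: PerformanceMetrics
    enum CodingKeys: String, CodingKey {
        case bundleID = "app_bundleid"
        case metrics = "performance_metrics"
    }
}

// A trimmed copy of the record above, without the annotations.
let json = """
{
  "app_bundleid": "com.alipay.iphoneclient",
  "performance_metrics": {
    "app_performance": {
      "launch": {
        "fg": { "count": 0, "sessions": [] },
        "bg": { "count": 2, "sessions": [1000, 1200] }
      }
    }
  }
}
"""

let record = try JSONDecoder().decode(SessionRecord.self, from: Data(json.utf8))
let backgroundLaunches = record.metrics.appPerformance.launch.bg.sessions
print(record.bundleID, backgroundLaunches)
```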

Online

Xcode Organizer

Open Xcode > Window > Organizer and select Launch Time in the left sidebar to view the startup time data of online users. This view mainly shows how startup times are distributed across your users.

MetricKit

MetricKit is a framework Apple introduced in iOS 13. It aggregates the performance data collected over the past 24 hours and delivers it to us through a delegate callback the next time the App launches.

It provides official, unified monitoring of App performance, covering startup, power, memory, and other aspects.

MXAppLaunchMetric can be used to monitor App startup.

For details, see WWDC 2020 session 10081, What's New in MetricKit, or the translation of that session by Veteran Drivers Weekly.
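A minimal sketch of receiving the launch histogram from MetricKit might look like this. The class name `LaunchMetricsSubscriber` is made up, and what you do with each bucket (here just printing) is up to your own reporting pipeline. It only compiles against Apple's SDKs.

```swift
import MetricKit

@available(iOS 13.0, *)
final class LaunchMetricsSubscriber: NSObject, MXMetricManagerSubscriber {
    /// Call once, early in the App's lifetime, to start receiving payloads.
    func start() {
        MXMetricManager.shared.add(self)
    }

    /// Called by the system with the aggregated payloads.
    func didReceive(_ payloads: [MXMetricPayload]) {
        for payload in payloads {
            guard let launch = payload.applicationLaunchMetrics else { continue }
            let enumerator = launch.histogrammedTimeToFirstDraw.bucketEnumerator
            while let bucket = enumerator.nextObject() as? MXHistogramBucket<UnitDuration> {
                // Each bucket says how many launches fell between bucketStart and bucketEnd.
                print(bucket.bucketStart, bucket.bucketEnd, bucket.bucketCount)
            }
        }
    }
}
```

The data arrives as a histogram rather than as individual launches, so it suits trend monitoring better than debugging a single slow launch.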

Tracking points

Process start time

Method 1

According to colleagues and some articles, this method has a certain deviation.

// Objective-C
#import <sys/sysctl.h>
#import <mach/mach.h>

+ (BOOL)processInfoForPID:(int)pid procInfo:(struct kinfo_proc *)procInfo {
    int cmd[4] = {CTL_KERN, KERN_PROC, KERN_PROC_PID, pid};
    size_t size = sizeof(*procInfo);
    return sysctl(cmd, sizeof(cmd) / sizeof(*cmd), procInfo, &size, NULL, 0) == 0;
}

// Returns the process start time in milliseconds since 1970.
+ (NSTimeInterval)processStartTime {
    struct kinfo_proc kProcInfo;
    if ([self processInfoForPID:[[NSProcessInfo processInfo] processIdentifier] procInfo:&kProcInfo]) {
        return kProcInfo.kp_proc.p_un.__p_starttime.tv_sec * 1000.0 + kProcInfo.kp_proc.p_un.__p_starttime.tv_usec / 1000.0;
    } else {
        NSAssert(NO, @"can't get process information");
        return 0;
    }
}

// Swift
import Foundation

extension ProcessInfo {
    /// Time elapsed since the process started.
    public var uptime: TimeInterval {
        return Date().timeIntervalSince(startTime)
    }

    public var startTime: Date {
        return processStartTime(for: processIdentifier)
    }

    public func processStartTime(for pid: Int32) -> Date {
        var mib = [CTL_KERN, KERN_PROC, KERN_PROC_PID, pid]
        var proc = kinfo_proc()
        var size = MemoryLayout.size(ofValue: proc)
        mib.withUnsafeMutableBufferPointer { p in
            _ = sysctl(p.baseAddress, 4, &proc, &size, nil, 0)
        }
        // p_starttime is a C macro for p_un.__p_starttime, so use the field directly.
        let time = proc.kp_proc.p_un.__p_starttime
        // tv_usec is in microseconds, so divide by 1_000_000 to get seconds.
        let seconds = Double(time.tv_sec) + Double(time.tv_usec) / 1_000_000

        return Date(timeIntervalSince1970: seconds)
    }
}

Method 2

Create a custom dynamic library (or use an existing one) and record a point in its +load method as the App's start time. To include as much of the time spent in other dynamic libraries as possible, place the custom dynamic library first in the load order of all dynamic libraries.

So how do you put a dynamic library first in the load order?

If the project is managed by CocoaPods, adjust the order of the OTHER_LDFLAGS entries in Pods-XXXX.debug.xcconfig (the default order should be alphabetical).

If it is a plain Xcode project, change the order in Build Phases > Link Binary With Libraries.

End of pre-main

It is recommended to use the moment an __attribute__((constructor)) function is called as the end of the pre-main phase; this keeps the tracking point as decoupled as possible.

The main reason for not using +load is that we cannot be sure which +load method runs last, and even if we determine it now, it will change as the business iterates.

static void __attribute__((constructor)) before_main(void) {
    // Record the time here
}

End of main

Record a point at the end of the application:didFinishLaunchingWithOptions: method.

First screen rendering completion time

Record a point in viewDidAppear of the root view controller, or at the first CA::Transaction::commit(), which MetricKit defines as the end of startup.
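Once the four tracking points above are in place, the phase durations are just differences between timestamps. The following is a minimal sketch of such a recorder; `LaunchPoint`, `LaunchTimeline`, and the sample timestamps are made up for illustration, and a real implementation would also need to report the results.

```swift
import Foundation

/// The tracking points described above.
enum LaunchPoint {
    case processStart, preMainEnd, mainEnd, firstFrame
}

/// Collects one timestamp per tracking point and derives the phase durations.
final class LaunchTimeline {
    private var timestamps: [LaunchPoint: TimeInterval] = [:]

    /// Records a point once; later calls for the same point are ignored.
    func mark(_ point: LaunchPoint, at time: TimeInterval = Date().timeIntervalSince1970) {
        if timestamps[point] == nil {
            timestamps[point] = time
        }
    }

    /// Seconds between two points, or nil if either point is missing.
    func duration(from start: LaunchPoint, to end: LaunchPoint) -> TimeInterval? {
        guard let startTime = timestamps[start], let endTime = timestamps[end] else { return nil }
        return endTime - startTime
    }
}

let timeline = LaunchTimeline()
// Hypothetical timestamps in seconds, made up for illustration.
timeline.mark(.processStart, at: 100.00)
timeline.mark(.preMainEnd, at: 100.25)
timeline.mark(.mainEnd, at: 100.50)
timeline.mark(.firstFrame, at: 100.75)
timeline.mark(.firstFrame, at: 105.00) // ignored: only the first frame counts

let preMain = timeline.duration(from: .processStart, to: .preMainEnd)
let total = timeline.duration(from: .processStart, to: .firstFrame)
```

Ignoring repeated marks matters for points such as viewDidAppear, which can be called again later in the App's lifetime.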

Extension

In iOS 15 and later, the system may prewarm your application depending on device conditions, that is, start application processes that are not running in order to reduce the time users wait before the application becomes usable.

Prewarming also affects our startup tracking points: the time between process start and main can be much longer than normal.

if let activePrewarm = ProcessInfo.processInfo.environment["ActivePrewarm"] {
  // activePrewarm is 1
}

Reference: about_the_app_launch_sequence

Optimization measures

Below are some commonly used optimization techniques. Optimizing is easy; preventing regressions is hard.

pre-main phase optimization

Library optimization

  • Convert dynamic libraries to static libraries;
  • Reduce the number of dynamic libraries by merging several into one. It is recommended to keep the number of dynamic libraries below 6;
  • Load dynamic libraries lazily;

The dynamic libraries here are not the system dynamic libraries but the ones we create ourselves, the so-called Embedded Frameworks. Unlike system libraries they cannot be shared with other applications; they can only be shared between the App and its App Extensions.

Normally the libraries in our projects are static, but there are also cases where dynamic libraries are needed:

  • We take advantage of the fact that a dynamic library's initializers run first, to execute methods that need to run as early as possible;
  • Early versions of CocoaPods could only bring third-party and second-party libraries into Swift projects as dynamic libraries;
  • When the project has an App Extension, the fact that dynamic libraries can be shared within that scope can be used to reduce package size;
  • ...

Of course, we can also use the runtime linking ability of dynamic libraries to load some of them lazily. In a CocoaPods project, add spec.weak_frameworks = 'XXX' to the podspec; in a plain project, do not link the dynamic library in Link Binary With Libraries or Other Linker Flags. Then load the dynamic library through -[NSBundle loadAndReturnError:] or dlopen() before calling the actual business code.

This optimization suits stable libraries with few dependencies. As far as I know, apps that lazily load dynamic libraries include 58 and Beike ("shells").

Compared with a static library, a lazily loaded dynamic library saves the fixup and initializer time during startup.
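The lazy-loading idea can be sketched with dlopen() as follows. This is a minimal, single-threaded sketch: the `LazyLibraryLoader` class and the framework path are made up for illustration, and a production version would need locking and a way to reach the library's symbols or classes after loading.

```swift
import Foundation

/// Loads dynamic libraries on first use and remembers their handles.
final class LazyLibraryLoader {
    private var handles: [String: UnsafeMutableRawPointer] = [:]

    /// Returns true if the library at `path` is loaded, loading it now if necessary.
    func loadIfNeeded(path: String) -> Bool {
        if handles[path] != nil { return true }
        // RTLD_NOW resolves all symbols immediately, so failures surface here.
        guard let handle = dlopen(path, RTLD_NOW) else {
            if let error = dlerror() {
                print("dlopen failed: \(String(cString: error))")
            }
            return false
        }
        handles[path] = handle
        return true
    }
}

let loader = LazyLibraryLoader()
// Hypothetical framework path, made up for illustration.
let frameworkPath = Bundle.main.bundlePath + "/Frameworks/Feature.framework/Feature"
if loader.loadIfNeeded(path: frameworkPath) {
    // From here on it is safe to call into the feature's code.
}
```

The cost of loading, fixup, and initializers is not removed, only moved from startup to the first use of the feature, so the first entry into that feature becomes slightly slower.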

In my own project, a mixed Swift/OC project whose main body is Swift, the libraries managed by CocoaPods were dynamic libraries, and I switched them to static libraries. The steps were:

  • Remove use_frameworks!;
  • Add use_modular_headers!. The reason is that some second-party Swift libraries use OC code internally, so the libraries that did not support modular headers had to be adjusted, mainly WCBD;
  • Check how the libraries use their internal resources and whether any resource paths are hard-coded. The check did find such problems, which were then fixed;

After the switch, according to offline test data, the pre-main stage became nearly 100% faster than before.

The switch also brought some benefit in package size. Previously the Strip Style of each Pod was Debugging Symbols; once they became static libraries, the project-level All Symbols setting applies. According to the App Store Connect dashboard, the download size of the package was reduced by 5.1 MB after that release.

Initialization control

  • Clean up unused classes, categories, methods, and so on in the project. This helps in several ways: it reduces the number of fixups, shortens Objc setup, and also reduces package size. Developers need to get into the habit of removing dead code;
  • Defer the logic in +load methods, for example until after the first screen is rendered, or move it into +initialize. This needs to be adjusted case by case according to the business;
  • Limit the number of C++ global variables;
  • Avoid C++ virtual functions;

Other

  • Binary rearrangement;
  • In Swift, prefer language features that use direct dispatch;
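As a brief illustration of the last point, the sketch below contrasts declarations that generally allow direct calls (final classes and structs) with an open class hierarchy. The type names are made up, and how much this matters for startup depends on the project; the compiler may also devirtualize calls on its own when it can prove there is no override.

```swift
/// Methods of a non-final class may be overridden by a subclass,
/// so calls generally go through dynamic dispatch.
class Tracker {
    func track(_ event: String) -> String { "base:\(event)" }
}

/// `final` rules out overriding, so the compiler can call the method directly.
final class LaunchTracker {
    func track(_ event: String) -> String { "launch:\(event)" }
}

/// Struct methods cannot be overridden either, so they are also called directly.
struct Stopwatch {
    var elapsed: Double = 0
    mutating func add(_ seconds: Double) { elapsed += seconds }
}

let tracker = LaunchTracker()
var stopwatch = Stopwatch()
stopwatch.add(0.25)
let tracked = tracker.track("didFinishLaunching")
print(tracked, stopwatch.elapsed)
```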

Binary rearrangement is probably familiar to most readers, so I will only describe it briefly. The core idea is to reorder the symbols in the binary so as to reduce the number of page faults that occur during startup.

There are two main steps:

  • Use Clang instrumentation to collect the symbols of all functions, blocks, Swift methods, and C++ constructors that are executed during startup;
  • Configure the order file in Xcode's Build Settings;

The first step is the key one. Its core is to use SanitizerCoverage, the coverage tool built into LLVM, to collect the symbols.

  • In Build Settings, add -fsanitize-coverage=func,trace-pc-guard to Other C Flags;
  • If the project mixes OC and Swift, also add the following to Other Swift Flags:
    • -sanitize-coverage=func
    • -sanitize=undefined

If the project is managed by CocoaPods, these compile options need to be added for each Pod.

For code examples, see Yandi's AppOrderFiles directly.
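The second half of the job, turning the collected symbols into an order file, is mostly deduplication while keeping the call order. The sketch below shows that step only. It is not the AppOrderFiles implementation: the function name is made up, the sample symbols are invented, and the rule for the leading underscore is an assumption about how symbol names returned at run time map to names in the Mach-O symbol table.

```swift
/// Builds order-file contents from symbols recorded in call order.
/// Each symbol is kept at its first occurrence, one per line.
func orderFileContents(from recordedSymbols: [String]) -> String {
    var seen: Set<String> = []
    var ordered: [String] = []
    for symbol in recordedSymbols {
        guard seen.insert(symbol).inserted else { continue }
        // Assumption: Objective-C methods are written as-is, while C and Swift
        // symbols need the leading underscore they carry in the symbol table.
        let isObjCMethod = symbol.hasPrefix("+[") || symbol.hasPrefix("-[")
        ordered.append(isObjCMethod ? symbol : "_" + symbol)
    }
    return ordered.joined(separator: "\n")
}

// Hypothetical symbols, made up for illustration.
let recorded = [
    "main",
    "+[AppDelegate load]",
    "main",
    "-[AppDelegate application:didFinishLaunchingWithOptions:]",
]
let orderFile = orderFileContents(from: recorded)
print(orderFile)
```

Keeping first-call order is the point of the exercise: symbols used together at startup end up on the same pages, so fewer pages have to be faulted in.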

As an aside, when generating the binary the compiler by default handles the OC code first and then the Swift code. Within the OC code, the output follows the order of the compiled files and then the order of the methods within each file.

post-main phase optimization

This stage is more closely tied to our business. In application:didFinishLaunchingWithOptions: we usually do a lot of initialization work, such as networking, global styling, and third-party SDK initialization, so this phase has to be optimized according to the nature of our own business.

  • Sort out the business logic and defer loading whatever libraries or logic can be deferred;
  • Consider using multiple threads to make full use of the CPU;
  • ...
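One way to make "defer what can be deferred" explicit is to register every startup task with the stage at which it has to run. The sketch below is a minimal, single-threaded version of that idea; `StartupStage`, `StartupTask`, `StartupTaskRegistry`, and the task names are all made up for illustration.

```swift
/// When a startup task has to run.
enum StartupStage {
    case beforeFirstFrame   // must finish inside didFinishLaunching
    case afterFirstFrame    // can wait until the first screen is visible
}

struct StartupTask {
    let name: String
    let stage: StartupStage
    let work: () -> Void
}

final class StartupTaskRegistry {
    private var tasks: [StartupTask] = []

    func register(_ name: String, stage: StartupStage, work: @escaping () -> Void) {
        tasks.append(StartupTask(name: name, stage: stage, work: work))
    }

    /// Runs the tasks of one stage in registration order and returns their names.
    @discardableResult
    func run(_ stage: StartupStage) -> [String] {
        let due = tasks.filter { $0.stage == stage }
        due.forEach { $0.work() }
        return due.map { $0.name }
    }
}

let registry = StartupTaskRegistry()
registry.register("crash reporting", stage: .beforeFirstFrame) { /* must be ready early */ }
registry.register("analytics", stage: .afterFirstFrame) { /* can wait */ }
registry.register("image cache warm-up", stage: .afterFirstFrame) { /* can wait */ }

// Called at the end of application:didFinishLaunchingWithOptions:.
let early = registry.run(.beforeFirstFrame)
// Called once the first screen is visible.
let deferred = registry.run(.afterFirstFrame)
```

Having every task declare its stage also makes regressions easier to spot in review, because a new task in the early stage has to be justified.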

I recommend reading my earlier article on decoupling the AppDelegate. Once the AppDelegate is decoupled, the services involved in startup are separated from each other and can be adjusted individually.

First screen rendering optimization

This stage is really about UI rendering efficiency, so the techniques are the general ones for rendering optimization.

  • Build the UI in code where possible and reduce the use of xib/storyboard; do not overcomplicate the home page layout, and if necessary replace AutoLayout with frame-based layout;
  • Do as little work as possible in viewDidLoad and viewWillAppear, or do it asynchronously;
  • Reduce the depth of the view hierarchy;
  • Load views lazily;
  • ...

Finally

In the pre-main stage, the optimizations used by different Apps tend to be the same, while the latter two stages have to be optimized according to the characteristics of our own business. The principle is simple: do as little as possible, and ideally nothing at all. After a dedicated startup optimization effort, measures should also be put in place to prevent regressions.

Let’s be CoderStar!

References

  • reducing-your-app-s-launch-time
  • 58.com App Performance Governance Practices - iOS startup time optimization
  • iOS Optimization - Startup optimization: binary rearrangement with Clang instrumentation
  • How does iOS start in seconds
  • iOS app startup process and optimization details
  • Douyin Quality Construction - iOS startup optimization principles
  • Optimizing App Launch
  • Meituan Takeout iOS App cold startup governance
  • dyld
  • launch-time-performance-optimization

Having a technical circle and a group of like-minded people matters a lot. Follow my official account, where we only talk about solid technical content.

WeChat official account: CoderStar