Preface

As the App's business keeps growing, it has gone through many iterations and its features keep expanding. With that comes the need to keep improving the performance users experience on their phones.

The research on App startup and the underlying framework is split into five blog posts, explained in the following order. Follows and likes are welcome!

  • App system kernel loading

  • LLVM+Clang+ compiler + linker — Value preservation

  • App startup optimization ideas [Road to Progress 3] (this article)

  • Binary rearrangement can be seen in this way.

  • How to monitor App startup? [Road to Progress 5]

To optimize App startup, you first need to know what happens during startup. Only then can you ask how to optimize it.

The following covers the basics of Mach-O and dyld. Jumping straight into optimization without them would be impractical; it is best to understand the principles first.

I. Mach-O file

Before we talk about startup optimization, let's talk about the Mach-O file.

1. Overview

Mach-O is short for Mach Object file format. It is the file format for executables, dynamic libraries, and object code on iOS and Mac, comparable to the EXE format on Windows and the ELF format on Linux. It replaces the a.out format and offers better extensibility.

2. Common formats

The common Mach-O formats are as follows:

  • Object file: .o
  • Library files:
  1. .a
  2. .dylib
  3. .framework
  • Executable file
  • dyld
  • .dSYM

Use the file command followed by a file path to view the file type

2.1 Object file: .o

Starting from a test.c file, we can compile it into an object file with the clang command

Then check the file type with the file command (below)

It’s a Mach-O file.

2.2 dylib

Run cd /usr/lib to look at the dylib files

Run the file command to view the file type
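The file command recognizes a Mach-O file largely from its first four bytes, the magic number. The following is a minimal C sketch of that check. The constants match the values in <mach-o/loader.h> and <mach-o/fat.h>, but the helper name macho_kind is made up for illustration and is not part of any Apple API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Magic numbers, same values as in <mach-o/loader.h> and <mach-o/fat.h>. */
#define MH_MAGIC    0xfeedfaceu  /* 32-bit Mach-O, host byte order    */
#define MH_CIGAM    0xcefaedfeu  /* 32-bit Mach-O, swapped byte order */
#define MH_MAGIC_64 0xfeedfacfu  /* 64-bit Mach-O, host byte order    */
#define MH_CIGAM_64 0xcffaedfeu  /* 64-bit Mach-O, swapped byte order */
#define FAT_MAGIC   0xcafebabeu  /* universal binary                  */
#define FAT_CIGAM   0xbebafecau  /* universal binary, swapped         */

/* Hypothetical helper: classify a file from its first four bytes. */
static const char *macho_kind(const unsigned char first4[4]) {
    assert(first4 != NULL);
    uint32_t magic;
    memcpy(&magic, first4, sizeof magic);  /* read in host byte order */
    switch (magic) {
    case MH_MAGIC:    case MH_CIGAM:    return "Mach-O 32-bit";
    case MH_MAGIC_64: case MH_CIGAM_64: return "Mach-O 64-bit";
    case FAT_MAGIC:   case FAT_CIGAM:   return "universal binary";
    default:                            return "not Mach-O";
    }
}
```

Because both the normal and the byte-swapped magic are accepted, the check gives the same answer regardless of the byte order of the machine it runs on.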

3. Universal binaries

The universal binary format was devised by Apple itself. Its basics are as follows

Let's look at a universal binary by inspecting a Mach-O file from the command line

Then view the file type through the file command

The Mach-O file contains three architectures: armv7, armv7s and arm64.
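How one file can hold several architectures can be sketched in C. A universal binary starts with a fat_header (magic, number of slices) followed by one 20-byte fat_arch record per slice, all stored big-endian. The constants below match <mach-o/fat.h> and <mach/machine.h>; the helper names read_be32 and fat_cpu_types are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC      0xcafebabeu
#define CPU_TYPE_ARM   12u                  /* armv7 and armv7s differ only in cpusubtype */
#define CPU_TYPE_ARM64 (12u | 0x01000000u)  /* CPU_TYPE_ARM with the 64-bit ABI flag */
#define FAT_ARCH_SIZE  20u                  /* cputype, cpusubtype, offset, size, align */

/* Read a 32-bit big-endian value, independent of the host byte order. */
static uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: stores the cputype of each slice in out (at most
 * max entries) and returns how many were stored, or -1 if buf does not
 * start with a fat header. */
static int fat_cpu_types(const unsigned char *buf, size_t len,
                         uint32_t *out, int max) {
    assert(buf != NULL && out != NULL);
    if (len < 8 || read_be32(buf) != FAT_MAGIC)
        return -1;
    uint32_t nfat_arch = read_be32(buf + 4);
    int stored = 0;
    for (uint32_t i = 0; i < nfat_arch && stored < max; ++i) {
        size_t off = 8 + (size_t)i * FAT_ARCH_SIZE;
        if (off + FAT_ARCH_SIZE > len)
            break;                              /* truncated file */
        out[stored++] = read_be32(buf + off);   /* first field: cputype */
    }
    return stored;
}
```

Each fat_arch record also carries the offset and size of its slice, which is what lets a tool such as lipo cut a single architecture out of the file.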

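We can do a few things with the Mach-O file, using the lipo command to split and merge architectures

3.1 Use lipo -info to view the Mach-O file's architectures

3.2 Thin the Mach-O file by splitting out one architecture

A new file, MachO_armv7, is produced

3.3 Add architectures by merging

Merge multiple architectures with lipo -create

The lipo commands are summarized as follows:

  • View the architectures: lipo -info <Mach-O file>
  • Split out one architecture: lipo <Mach-O file> -thin <architecture> -output <output path>
  • Merge architectures: lipo -create <file 1> <file 2> -output <output path>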

4. Mach-O file structure

Here is Apple's official illustration of the Mach-O file structure

The Mach-O file consists of three parts:

  • Header: contains general information about the binary file:
  1. Byte order, number of load commands, and architecture type
  2. Information for quick checks, such as whether the file is 32-bit or 64-bit, the file type, and the processor
  • Load Commands: a table with many entries, including the location of each region, the symbol table, the dynamic symbol table, and so on
  • Data: usually the largest part of an object file; it contains the concrete data of the segments

4.1 Data structure of header

In the project, press Command+Space, type loader.h to open it, and find mach_header

The above is mach_header. The fields of the structure have the following meanings
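For reference, the 64-bit variant of the header can be mirrored in C as below. The field names, field order and constant values follow struct mach_header_64 in <mach-o/loader.h>; the names with the _mirror suffix, read_header_64 and filetype_name are assumptions made for this sketch, and the reader only handles files in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_MAGIC_64 0xfeedfacfu

/* File types matching the common formats listed earlier. */
#define MH_OBJECT   0x1u  /* .o object file    */
#define MH_EXECUTE  0x2u  /* executable        */
#define MH_DYLIB    0x6u  /* .dylib            */
#define MH_DYLINKER 0x7u  /* dyld itself       */
#define MH_DSYM     0xau  /* .dSYM debug file  */

/* Mirror of struct mach_header_64; 32 bytes in total. */
struct mach_header_64_mirror {
    uint32_t magic;       /* 64-bit Mach-O marker, also reveals byte order */
    int32_t  cputype;     /* CPU architecture, e.g. arm64                  */
    int32_t  cpusubtype;  /* variant within the architecture               */
    uint32_t filetype;    /* MH_OBJECT, MH_EXECUTE, MH_DYLIB, ...          */
    uint32_t ncmds;       /* number of load commands                       */
    uint32_t sizeofcmds;  /* total size of all load commands in bytes      */
    uint32_t flags;       /* flag bits such as MH_PIE                      */
    uint32_t reserved;    /* present only in the 64-bit header             */
};

/* Hypothetical helper: copies the header out of buf.
 * Returns 1 if buf starts with a host-order 64-bit Mach-O header. */
static int read_header_64(const unsigned char *buf, size_t len,
                          struct mach_header_64_mirror *out) {
    assert(buf != NULL && out != NULL);
    if (len < sizeof *out)
        return 0;
    memcpy(out, buf, sizeof *out);
    return out->magic == MH_MAGIC_64;
}

/* Hypothetical helper: human-readable name for the filetype field. */
static const char *filetype_name(uint32_t filetype) {
    switch (filetype) {
    case MH_OBJECT:   return "object file";
    case MH_EXECUTE:  return "executable";
    case MH_DYLIB:    return "dynamic library";
    case MH_DYLINKER: return "dynamic linker";
    case MH_DSYM:     return "debug symbols";
    default:          return "other";
    }
}
```

The ncmds and sizeofcmds fields are what a loader uses to find and walk the load commands that follow the header.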

View the Mach64 Header information using MachOView

4.2 LoadCommands

The LoadCommands are shown below

The contents may be hard to read on their own, so the picture is annotated with the main meaning of each command

4.3 Data

Data contains the segments, which store the concrete data. Use MachOView to view how the addresses are mapped
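The load commands sit right after the header, and each one begins with the same two fields, cmd and cmdsize. A rough C sketch of walking them follows. The command values match <mach-o/loader.h>; the struct name load_command_mirror and the helper count_load_commands are assumptions for illustration, and the bytes are assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Command types, same values as in <mach-o/loader.h>. */
#define LC_REQ_DYLD      0x80000000u
#define LC_SYMTAB        0x2u              /* symbol table               */
#define LC_DYSYMTAB      0xbu              /* dynamic symbol table       */
#define LC_LOAD_DYLIB    0xcu              /* a dependent dynamic library */
#define LC_LOAD_DYLINKER 0xeu              /* path of dyld               */
#define LC_SEGMENT_64    0x19u             /* a segment to map           */
#define LC_MAIN          (0x28u | LC_REQ_DYLD)  /* entry point           */

/* Every load command starts with these two fields. */
struct load_command_mirror {
    uint32_t cmd;      /* type of the command            */
    uint32_t cmdsize;  /* total size of the command, in bytes */
};

/* Hypothetical helper: counts the commands of type wanted among the
 * ncmds commands stored back to back in cmds.
 * Returns -1 if the table is malformed. */
static int count_load_commands(const unsigned char *cmds, size_t len,
                               uint32_t ncmds, uint32_t wanted) {
    assert(cmds != NULL);
    size_t off = 0;
    int found = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        struct load_command_mirror lc;
        if (len - off < sizeof lc)
            return -1;
        memcpy(&lc, cmds + off, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > len - off)
            return -1;
        if (lc.cmd == wanted)
            ++found;
        off += lc.cmdsize;  /* cmdsize leads to the next command */
    }
    return found;
}
```

Counting LC_LOAD_DYLIB commands this way gives the number of dynamic libraries the file links against directly, which is relevant later when reducing frameworks.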

II. dyld

2.1 dyld overview

dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. Once the kernel has finished its preparation, the rest of the work is handed over to dyld. The system first reads the App's executable file (Mach-O) and obtains the dyld path from it, then loads dyld. dyld initializes the runtime environment, enables the cache strategy, loads the libraries the program depends on (including the executable itself) and links them, and finally calls the initializers of each dependent library. The Runtime is initialized in this step. After all dependent libraries have been initialized, the last image, the executable itself, is initialized. At this point the Runtime initializes the class structures of all classes in the project and then calls all +load methods. Finally, dyld returns the address of the main function, main is called, and we arrive at the familiar program entry point.
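The step "obtains the dyld path from it" refers to the LC_LOAD_DYLINKER load command, which stores the path as a string inside the command. Below is a minimal C sketch of finding it. The layout follows struct dylinker_command in <mach-o/loader.h>; the names dylinker_command_mirror and find_dylinker_path are assumptions for this example, and host byte order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LC_LOAD_DYLINKER 0xeu

/* Mirror of struct dylinker_command from <mach-o/loader.h>. */
struct dylinker_command_mirror {
    uint32_t cmd;          /* LC_LOAD_DYLINKER                          */
    uint32_t cmdsize;      /* includes the path string that follows     */
    uint32_t name_offset;  /* offset of the path from the command start */
};

/* Hypothetical helper: scans ncmds load commands stored in cmds and
 * returns the dyld path, or NULL if absent or malformed. */
static const char *find_dylinker_path(const unsigned char *cmds, size_t len,
                                      uint32_t ncmds) {
    assert(cmds != NULL);
    size_t off = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        struct dylinker_command_mirror dc = {0, 0, 0};
        if (len - off < 8)
            return NULL;
        memcpy(&dc, cmds + off, 8);  /* cmd and cmdsize only */
        if (dc.cmdsize < 8 || dc.cmdsize > len - off)
            return NULL;
        if (dc.cmd == LC_LOAD_DYLINKER && dc.cmdsize > sizeof dc) {
            const unsigned char *start = cmds + off;
            memcpy(&dc, start, sizeof dc);
            /* The path must lie inside the command and be terminated. */
            if (dc.name_offset >= sizeof dc && dc.name_offset < dc.cmdsize &&
                memchr(start + dc.name_offset, '\0',
                       dc.cmdsize - dc.name_offset) != NULL)
                return (const char *)(start + dc.name_offset);
            return NULL;
        }
        off += dc.cmdsize;
    }
    return NULL;
}
```

For an ordinary App executable the stored path is typically /usr/lib/dyld.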

2.2 dyld loading process

A program's entry point is usually the main function, but few people care about what happens before main(). This time let's explore the dyld loading process. (Note that +load runs before main.)

2.2.1 Create a project and set a breakpoint on the main function


Before main() there is a start entry in libdyld.dylib, but it is not what we are looking for. Find the __dyld_start function in the dyld source code

2.2.2 The dyld main function

The dyld _main() function is the key function. dyld is itself an executable, so it also has a main function

//
// Entry point for dyld.  The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
        int argc, const char* argv[], const char* envp[], const char* apple[], 
        uintptr_t* startGlue)
{
    // Grab the cdHash of the main executable from the environment
    // Step 1: set up the runtime environment
    uint8_t mainExecutableCDHashBuffer[20];
    const uint8_t* mainExecutableCDHash = nullptr;
    if ( hexToBytes(_simple_getenv(apple, "executable_cdhash"), 40, mainExecutableCDHashBuffer) )
        // Get the hash of the main executable
        mainExecutableCDHash = mainExecutableCDHashBuffer;

    // Trace dyld's load
    notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
#if !TARGET_IPHONE_SIMULATOR
    // Trace the main executable's load
    notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif

    uintptr_t result = 0;
    // Get the macho_header structure of the main executable
    sMainExecutableMachHeader = mainExecutableMH;
    // Get the slide value of the main executable
    sMainExecutableSlide = mainExecutableSlide;

    CRSetCrashLogMessage("dyld: launch started");
    // Set up the context information
    setContext(mainExecutableMH, argc, argv, envp, apple);

    // Pickup the pointer to the exec path.
    // Get the path of the main executable
    sExecPath = _simple_getenv(apple, "executable_path");

    // <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
    if (!sExecPath) sExecPath = apple[0];

    if ( sExecPath[0] != '/' ) {
        // have relative path, use cwd to make absolute
        char cwdbuff[MAXPATHLEN];
        if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
            // maybe use static buffer to avoid calling malloc so early...
            char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
            strcpy(s, cwdbuff);
            strcat(s, "/");
            strcat(s, sExecPath);
            sExecPath = s;
        }
    }

    // Remember short name of process for later logging
    // Get the process name
    sExecShortName = ::strrchr(sExecPath, '/');
    if ( sExecShortName != NULL )
        ++sExecShortName;
    else
        sExecShortName = sExecPath;
    
    // Configure the process restrictions
    configureProcessRestrictions(mainExecutableMH);


    // Check the environment variables
    checkEnvironmentVariables(envp);
    defaultUninitializedFallbackPaths(envp);

    // If DYLD_PRINT_OPTS is set, call printOptions() to print the arguments
    if ( sEnv.DYLD_PRINT_OPTS )
        printOptions(argv);
    // If DYLD_PRINT_ENV is set, call printEnvironmentVariables() to print the environment variables
    if ( sEnv.DYLD_PRINT_ENV ) 
        printEnvironmentVariables(envp);
    // Get the architecture of the current program
    getHostInfo(mainExecutableMH, mainExecutableSlide);
    //------------- End of step 1 -------------
    
    // load shared cache
    // Step 2: load the shared cache
    // Check whether the shared cache is disabled; on iOS it must be enabled
    checkSharedRegionDisable((mach_header*)mainExecutableMH);
    if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
        mapSharedCache();
    }
    ...

    try {
        // add dyld itself to UUID list
        addDyldImageToUUIDList();

        // instantiate ImageLoader for main executable
        // Step 3: instantiate the main executable
        sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
        gLinkContext.mainExecutable = sMainExecutable;
        gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

        // Now that shared cache is loaded, setup an versioned dylib overrides
    #if SUPPORT_VERSIONED_PATHS
        checkVersionedPaths();
    #endif


        // dyld_all_image_infos image list does not contain dyld
        // add it as dyldPath field in dyld_all_image_infos
        // for simulator, dyld_sim is in image list, need host dyld added
#if TARGET_IPHONE_SIMULATOR
        // get path of host dyld from table of syscall vectors in host dyld
        void* addressInDyld = gSyscallHelpers;
#else
        // get path of dyld itself
        void*  addressInDyld = (void*)&__dso_handle;
#endif
        char dyldPathBuffer[MAXPATHLEN+1];
        int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
        if ( len > 0 ) {
            dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
            if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
                gProcessInfo->dyldPath = strdup(dyldPathBuffer);
        }

        // load any inserted libraries
        // Step 4: load the inserted dynamic libraries
        if  ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
            for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
                loadInsertedDylib(*lib);
        }
        // record count of inserted libraries so that a flat search will look at 
        // inserted libraries, then main, then others.
        // Record the number of inserted dynamic libraries
        sInsertedDylibCount = sAllImages.size()-1;

        // link main executable
        // Step 5: link the main executable
        gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
        if ( mainExcutableAlreadyRebased ) {
            // previous link() on main executable has already adjusted its internal pointers for ASLR
            // work around that by rebasing by inverse amount
            sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
        }
#endif
        link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
        sMainExecutable->setNeverUnloadRecursive();
        if ( sMainExecutable->forceFlat() ) {
            gLinkContext.bindFlat = true;
            gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
        }

        // link any inserted libraries
        // do this after linking main executable so that any dylibs pulled in by inserted 
        // dylibs (e.g. libSystem) will not be in front of dylibs the program uses
        // Step 6: link the inserted dynamic libraries
        if ( sInsertedDylibCount > 0 ) {
            for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                ImageLoader* image = sAllImages[i+1];
                link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
                image->setNeverUnloadRecursive();
            }
            // only INSERTED libraries can interpose
            // register interposing info after all inserted libraries are bound so chaining works
            for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                ImageLoader* image = sAllImages[i+1];
                image->registerInterposing();
            }
        }

        // <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
        for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
            ImageLoader* image = sAllImages[i];
            if ( image->inSharedCache() )
                continue;
            image->registerInterposing();
        }
        ...

        // apply interposing to initial set of images
        for(int i=0; i < sImageRoots.size(); ++i) {
            sImageRoots[i]->applyInterposing(gLinkContext);
        }
        gLinkContext.linkingMainExecutable = false;
        
        // <rdar://problem/12186933> do weak binding only after all inserted images linked
        // Step 7: perform weak symbol binding
        sMainExecutable->weakBind(gLinkContext);

        // If cache has branch island dylibs, tell debugger about them
        if ( (sSharedCacheLoadInfo.loadAddress != NULL) && (sSharedCacheLoadInfo.loadAddress->header.mappingOffset >= 0x78) && (sSharedCacheLoadInfo.loadAddress->header.branchPoolsOffset != 0) ) {
            uint32_t count = sSharedCacheLoadInfo.loadAddress->header.branchPoolsCount;
            dyld_image_info info[count];
            const uint64_t* poolAddress = (uint64_t*)((char*)sSharedCacheLoadInfo.loadAddress + sSharedCacheLoadInfo.loadAddress->header.branchPoolsOffset);
            // <rdar://problem/20799203> empty branch pools can be in development cache
            if ( ((mach_header*)poolAddress)->magic == sMainExecutableMachHeader->magic ) {
                for (int poolIndex=0; poolIndex < count; ++poolIndex) {
                    uint64_t poolAddr = poolAddress[poolIndex] + sSharedCacheLoadInfo.slide;
                    info[poolIndex].imageLoadAddress = (mach_header*)(long)poolAddr;
                    info[poolIndex].imageFilePath = "dyld_shared_cache_branch_islands";
                    info[poolIndex].imageFileModDate = 0;
                }
                // add to all_images list
                addImagesToAllImages(count, info);
                // tell gdb about new branch island images
                gProcessInfo->notification(dyld_image_adding, count, info);
            }
        }

        CRSetCrashLogMessage("dyld: launch, running initializers");
        ...
        // run all initializers
        // Step 8: run the initializers
        initializeMainExecutable(); 

        // notify any montoring proccesses that this process is about to enter main()
        dyld3::kdebug_trace_dyld_signpost(DBG_DYLD_SIGNPOST_START_MAIN_DYLD2, 0, 0);
        notifyMonitoringDyldMain();

        // find entry point for main executable
        // Step 9: find the entry point and return it
        result = (uintptr_t)sMainExecutable->getThreadPC();
        if ( result != 0 ) {
            // main executable uses LC_MAIN, needs to return to glue in libdyld.dylib
            if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
                *startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
            else
                halt("libdyld.dylib support not present for LC_MAIN");
        }
        else {
            // main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
            result = (uintptr_t)sMainExecutable->getMain();
            *startGlue = 0;
        }
    }
    catch(const char* message) {
        syncAllImages();
        halt(message);
    }
    catch(...) {
        dyld::log("dyld: launch failed\n");
    }
    ...
    
    return result;
}

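Collapsing the dyld _main function, its steps can be summarized as follows:

  1. Set up the runtime environment
  2. Load the shared cache
  3. Instantiate the main executable
  4. Load the inserted dynamic libraries
  5. Link the main executable
  6. Link the inserted dynamic libraries
  7. Perform weak symbol binding
  8. Run the initializers
  9. Find the entry point and return it

The above is only my own summary. Here is a flow chart to help you understand it.

Having covered Mach-O files and the dyld dynamic linker, we move on to today's real topic: startup optimization

III. iOS startup process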

Total App startup time = Pre-main time + Main time

The startup of an iOS program can be divided into the pre-main stage and the main stage.

  • The pre-main stage
  1. Load all dependent Mach-O files recursively
  2. Load the dynamic linker dyld
  3. Load the methods in class extensions [Category]
  4. Call the objc +load methods and load C++ static objects
  5. Execute the functions declared with attribute(constructor)

The diagram below:

  • The main stage
  1. Call main()
  2. Call UIApplicationMain()
  3. Call applicationWillFinishLaunching

IV. Startup optimization
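Step 5 of the pre-main stage above can be illustrated with a small C sketch. A function marked __attribute__((constructor)) is run by the loader while the image is being initialized, so by the time main() starts its work has already been done and paid for. The names g_constructors_run, early_setup and constructors_run are made up for this example.

```c
#include <assert.h>

/* Number of constructor functions that have run so far. */
static int g_constructors_run = 0;

/* Called by the loader during image initialization, before main().
 * Any work done here adds directly to the pre-main time. */
__attribute__((constructor))
static void early_setup(void) {
    assert(g_constructors_run == 0);  /* runs exactly once */
    g_constructors_run += 1;
}

/* Hypothetical accessor: call it from main() to observe that the
 * constructor has already run. */
static int constructors_run(void) {
    return g_constructors_run;
}
```

Calling constructors_run() as the first statement of main() returns 1, which is why heavy work in such functions, like heavy work in +load, delays the moment main() is reached.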

4.1 Optimization in the pre-main stage

Based on the description of dyld above, the pre-main stage can be optimized in the following ways

  • Delete unnecessary code [unused static variables, classes and methods], which AppCode can detect:
  1. Unused local variables
  2. Unused parameters
  3. Unused values

  • +load method optimization

Defer the work done in +load to +initialize, or at least avoid time-consuming operations in +load, and remove unnecessary +load methods.

We can use Instruments in Xcode to see all the +load methods that run at startup and how long each one takes.

Here is a picture that shows this more clearly:

So how can the +load methods be optimized? By optimizing them with __attribute.

The project has many +load methods. Most of them register cells: each cell corresponds to a template, and each template corresponds to a string, much like the key-value pairs of a dictionary.

We can use the Data part of the Mach-O file structure described above: at compile time, __attribute((used, section("__DATA,"#sectname" "))) writes a string into the "TempSection" section of the __DATA segment. The string is a key:value pair in JSON form, which is later parsed into the key and the value.

#ifndef ZXYStoreListTemplateSectionName
#define ZXYStoreListTemplateSectionName "ZYTempSection"
#endif

#define ZXYStoreListTemplateDATA(sectname) __attribute((used, section("__DATA,"#sectname" ")))

#define ZXYStoreListTemplateRegister(templatename,templateclass) \
class NSObject; char * k##templatename##_register ZXYStoreListTemplateDATA(ZXYTempSection) = "{ \""#templatename"\" : \""#templateclass"\"}";

/**
 Register a template class name with ZXYStoreListTemplateRegister(key, classname)
 (the class must be a subclass of ZXYStoreListBaseTemplate).
 [Note]
 The registration information is bound through __attribute at compile time, so it is fast to read at runtime.
 It is read the first time a call is triggered and does not affect the pre-main time.
 The key field does not support symbols other than the underscore '_'.
 @ZYStoreListTemplateRegister(baseTemp,ZYStoreListBaseTemplate)
 **/

This removes a large number of repetitive +load methods. In addition, because the registration information is bound through __attribute at compile time, it is fast to read at runtime: it is read the first time a call is triggered, without affecting the pre-main time.
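The read side of this technique is not shown in the article, so here is a rough, self-contained C sketch of both halves. It is an illustration under stated assumptions, not the author's implementation: the section name TmplReg, the macro REGISTER_TEMPLATE, the helper template_class_for_key and the plain "key:class" string format (instead of JSON) are all made up. On Apple platforms it reads the section with getsectiondata and _mh_execute_header, which only works for code linked into the main executable; elsewhere it falls back to the __start_/__stop_ symbols that GNU ld and lld generate for ELF sections.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__APPLE__)
#include <mach-o/getsect.h>
#include <mach-o/ldsyms.h>
#define TEMPLATE_SECTION "__DATA,TmplReg"  /* segment,section on Mach-O */
#else
#define TEMPLATE_SECTION "TmplReg"         /* plain section name on ELF */
extern const char *__start_TmplReg;        /* section bounds provided   */
extern const char *__stop_TmplReg;         /* by the ELF linker         */
#endif

/* Hypothetical registration macro: places a pointer to "key:class" in
 * the section at compile time, with no +load or constructor involved. */
#define REGISTER_TEMPLATE(key, cls) \
    static const char *k##key##_register \
        __attribute__((used, section(TEMPLATE_SECTION))) = #key ":" #cls

REGISTER_TEMPLATE(baseTemp, BaseTemplate);
REGISTER_TEMPLATE(bannerTemp, BannerTemplate);

/* Hypothetical lookup, meant to be called lazily on first use.
 * Returns the class name registered for key, or NULL. */
static const char *template_class_for_key(const char *key) {
    assert(key != NULL);
    const char **begin;
    const char **end;
#if defined(__APPLE__)
    unsigned long size = 0;
    begin = (const char **)getsectiondata(&_mh_execute_header,
                                          "__DATA", "TmplReg", &size);
    if (begin == NULL)
        return NULL;
    end = begin + size / sizeof(const char *);
#else
    begin = &__start_TmplReg;
    end = &__stop_TmplReg;
#endif
    size_t key_len = strlen(key);
    for (const char **entry = begin; entry < end; ++entry) {
        /* An entry matches when it starts with "key:". */
        if (strncmp(*entry, key, key_len) == 0 && (*entry)[key_len] == ':')
            return *entry + key_len + 1;
    }
    return NULL;
}
```

Nothing in this sketch runs before main(): the registrations are plain data in the binary, and the section is only scanned when template_class_for_key is first called.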

  • Reduce unnecessary frameworks

For the pre-main stage, Douyin has shared a binary rearrangement scheme that can improve startup time by 15%. The scheme is not covered here; see the articles mp.weixin.qq.com/s/Drmmx5Jtj… and juejin.cn/post/684490…

4.2 Optimization in the main stage

This stage covers the period from the start of the main function to the completion of the first screen's rendering. The aim of optimization is to reduce the time from the start of main to the first screen.

4.2.1 didFinishLaunchingWithOptions

  1. Configure the environment required by the App
  2. Integrate third-party SDKs [push, event tracking, payment, etc.]
  3. Logging, etc.

Where possible, use lazy loading or asynchronous loading so that these operations do not all run serially and add up.

In Instruments, use Call Tree -> Hide System Libraries to filter out the system libraries and check the time taken by methods on the main thread.

You can also measure how long a call takes by printing the elapsed time

double launchTime = CFAbsoluteTimeGetCurrent();
[SDWebImageManager sharedManager];
NSLog(@"launchTime = %f seconds ", CFAbsoluteTimeGetCurrent() - launchTime);

Identify which operations can be loaded lazily, which can run on a background thread, and which can be deferred.
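The same measurement can be wrapped in a small reusable helper. The C sketch below uses the POSIX monotonic clock rather than CFAbsoluteTimeGetCurrent, since wall-clock time can jump if the system clock is adjusted; this is a swapped-in technique, not what the snippet above does. The names now_seconds, time_step and init_fake_sdk are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Monotonic time in seconds; it never jumps backwards. */
static double now_seconds(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: runs step once and returns its duration in seconds. */
static double time_step(void (*step)(void)) {
    assert(step != NULL);
    double start = now_seconds();
    step();
    return now_seconds() - start;
}

/* Stand-in for an expensive launch task such as an SDK initialization. */
static void init_fake_sdk(void) {
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 100000UL; ++i)
        sink += i;
}
```

Timing each launch task with time_step(init_fake_sdk) and logging the result makes it easy to rank the tasks and decide which ones to defer.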

4.2.2 Optimization of home page

  • Many apps do not go straight to the home page after launch; they first show the user a short splash screen. When an App is complex, building its UI for the first time at startup takes a while. Say it takes 0.3 seconds. If we build the home page UI first and only then add the splash screen to the Window, the App will actually stall for 0.2 seconds on a cold start. If we instead make the splash screen the App's RootViewController, the initial build is quick: the splash screen contains only a simple ImageView, and while that ImageView is shown to the user for a short time we can build the home page UI, killing two birds with one stone. The diagram below:

  • Avoid xib and storyboard; build the UI directly in code
  • Do as little as possible in viewDidLoad and viewWillAppear: defer the work or do it asynchronously

Conclusion

I will keep updating this blog with substantial posts based on my own views and experience. If you want to grow together, likes, follows and comments are welcome. Thanks!