Basic principles of iOS - Application loading

Preface

Mobile applications are now part of everyday life and have made many tasks more convenient. But when you open an app you use every day, how is it loaded and run? What happens between tapping the icon and the first frame appearing on screen? This article focuses on the application loading process and how an app is loaded into memory.

Application loading principles

The compilation process

Source files are first precompiled (preprocessed) and then go through lexical and syntax analysis
When precompilation is complete, the result is handed to the compiler
Compilation produces assembly files
The assembly files are assembled and linked to produce the executable file

Static and dynamic libraries

Libraries are executable binaries that are loaded into memory by the operating system.

Forms of libraries
- Static library
- Dynamic library
The difference between static and dynamic libraries
- Static libraries generally use extensions such as .a and .lib; dynamic libraries generally use extensions such as .so, .dll, .framework, and .dylib
- Static libraries are linked statically, at build time; dynamic libraries are linked dynamically, at load time or run time
Link structure diagram for static and dynamic libraries
- A static library is copied into every executable that links it, so duplicate copies of the same library may exist
- A dynamic library is loaded as a shared library file, so one copy can be shared, which saves space

How are libraries loaded into memory

Loading flow chart

The loading process of the dynamic linker dyld is roughly as follows:

When the app starts, dyld loads libSystem
The runtime registers the relevant callback functions
New images (image files) are loaded, which maps the library files into memory
map_images() and load_images() are executed
The main function is called
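
The ordering in the list above (initializers first, main last) can be observed with a small portable sketch. It assumes a GCC- or Clang-compatible compiler for `__attribute__((constructor))`; the names `startup_log`, `early_initializer`, and `record_main_entry` are hypothetical and are not part of dyld.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Function-local static, so the log exists whenever it is first used,
// regardless of static-initialization order.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// The loader runs constructor functions while it initializes the image,
// which happens before control reaches main(). (Hypothetical name.)
__attribute__((constructor))
static void early_initializer() {
    startup_log().push_back("initializer");
}

// Call this first thing in main() to record when main() was reached.
void record_main_entry() {
    startup_log().push_back("main");
}
```

Calling `record_main_entry()` at the top of `main` and then printing `startup_log()` should show the initializer entry ahead of the main entry, which is the same ordering that the +load experiment later in this article demonstrates.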

LLDB debugging

Following the process above, we use LLDB in a real project to analyze application loading: what happens between app launch and the main function?

First create an empty project, set a breakpoint at the entry of the main function, and run.

We find that a start function executes before main. Let's look at this start function.

The stack shows libdyld.dylib start, so start comes from the libdyld.dylib dynamic library. How it is loaded and called cannot be determined from this result alone, so next we set a symbolic breakpoint.

Add a symbolic breakpoint on start to the project and run again.

The breakpoint is not hit; execution goes straight to main. According to the flow above, load methods run before main, so next we set a breakpoint in the +load method of ViewController.

With the breakpoint set in ViewController's +load method, run again.

The +load method is reached before main.

Enter bt to print the stack.

You can see that the call chain starts at _dyld_start in dyld. To see how dyld does the loading, download the dyld open-source code (dyld-852 is used here) from Apple's open source website.

Source code analysis (forward projection)

The analysis above gave us the _dyld_start function. Next, search globally for it in the dyld source code.

The search shows that _dyld_start calls the C++ function dyldbootstrap::start.

dyldbootstrap::start

//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
    // Omit some code
    ......

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = appsMachHeader->getSlide();
    return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}

dyldbootstrap::start ends by calling dyld::_main, which is dyld's main function.

dyld::_main

The dyld::_main function is more than 800 lines long, so it is not reproduced in full here. It mainly involves the following steps:

Preparing the conditions (environment, platform, version, path, host information…)
instantiateFromLoadedImage: instantiate the main program
loadInsertedDylib: load the inserted dynamic libraries
mapSharedCache: load the shared cache
link: link the main program
link: link the inserted dynamic libraries
weakBind: weak-bind the main program
initializeMainExecutable: run the initializers for the main program
notifyMonitoringDyldMain: notify monitoring processes that this process is about to enter main

The value returned by dyld::_main, result, is the entry point of the main program. It is obtained from sMainExecutable, the image loader for the main executable.

uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
      int argc, const char* argv[], const char* envp[], const char* apple[],
      uintptr_t* startGlue)
{
    ......
    // <rdar://problem/12186933> do weak binding only after all inserted images linked
    sMainExecutable->weakBind(gLinkContext);    // weak binding
    gLinkContext.linkingMainExecutable = false;

    sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
    ......
#if SUPPORT_OLD_CRT_INITIALIZATION
    // Old way is to run initializers via a callback from crt1.o
    if ( ! gRunInitializersOldWay )
        initializeMainExecutable();
#else
    // run all initializers
    initializeMainExecutable();
#endif

    // notify any monitoring processes that this process is about to enter main()
    notifyMonitoringDyldMain();
    ......
    {
        // find entry point for main executable
        result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
        if ( result != 0 ) {
            ......
        }
        else {
            // main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
            result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
            *startGlue = 0;
        }
    }
    ......
    if (sSkipMain) {
        notifyMonitoringDyldMain();
        ......
        result = (uintptr_t)&fake_main;
        *startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
    }

    return result;
}

initializeMainExecutable

One important call in dyld::_main is initializeMainExecutable(), which runs all the initializers. Let's see how this function is implemented.

void initializeMainExecutable()
{
    ......
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }

    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    ......
}

initializeMainExecutable calls runInitializers. Search globally for the definition of runInitializers.
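
The ordering in initializeMainExecutable can be sketched with a toy model. This is only an illustration: `ToyImage` and `initialization_order` are hypothetical names, and the sketch assumes, as the loop starting at index 1 suggests, that the first root is the main executable and the remaining roots are inserted libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an image root; not dyld's ImageLoader.
struct ToyImage {
    std::string name;
};

// Mirrors the order in the excerpt: roots[1..] (assumed to be inserted
// libraries) are initialized first, then roots[0] (assumed to be the
// main executable).
std::vector<std::string> initialization_order(const std::vector<ToyImage>& roots) {
    std::vector<std::string> order;
    if (roots.empty()) return order;
    for (std::size_t i = 1; i < roots.size(); ++i) {
        order.push_back(roots[i].name);
    }
    order.push_back(roots[0].name);
    return order;
}
```

With roots named MainApp, InsertedA, and InsertedB, the returned order is InsertedA, InsertedB, MainApp.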

runInitializers

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}

The function above calls processInitializers, so look at that next.

processInitializers

void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // Calling recursive init on all images in images list, building a new list of
    // uninitialized upward dependencies.
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}

processInitializers calls recursiveInitialization on every image in the list. Next, look at the definition of recursiveInitialization.

recursiveInitialization

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    ......
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            ......
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            ......
        }
    }
    recursiveSpinUnLock();
}

The recursiveInitialization function proceeds roughly as follows:

Take the image (and its path) and begin initialization
Initialize the images it depends on
Initialize the image itself
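
The dependency-first order and the cycle-breaking state change can be sketched with a toy graph. This is a simplified illustration, not dyld's implementation: `ToyImage`, its three states, and `recursive_initialization` are hypothetical, and pushing the name onto `order` stands in for the call to doInitialization.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model of an image; not dyld's ImageLoader.
struct ToyImage {
    std::string name;
    std::vector<ToyImage*> dependencies;
    enum State { Uninitialized, Initializing, Initialized } state = Uninitialized;
};

// Depth-first initialization: dependencies first, then the image itself.
// Marking the image as Initializing before recursing breaks cycles,
// in the spirit of the "break cycles" state change in the excerpt.
void recursive_initialization(ToyImage& image, std::vector<std::string>& order) {
    if (image.state != ToyImage::Uninitialized) return;  // visited already
    image.state = ToyImage::Initializing;
    for (ToyImage* dep : image.dependencies) {
        recursive_initialization(*dep, order);
    }
    order.push_back(image.name);  // stands in for doInitialization()
    image.state = ToyImage::Initialized;
}
```

If App depends on LibA, and LibA and LibB depend on each other, initializing App visits LibB first, then LibA, then App, and the cycle between the two libraries does not cause infinite recursion.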

Next, search globally for the definition of notifySingle.

notifySingle

static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    ......
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        ......
    }
    ......
}

notifySingle calls through sNotifyObjCInit. Now let's see where sNotifyObjCInit is defined and assigned.

sNotifyObjCInit

The search gives the following definition:

static _dyld_objc_notify_init		sNotifyObjCInit;

Searching for _dyld_objc_notify_init shows that it is the type of the second parameter of registerObjCNotifiers.

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init;
    sNotifyObjCUnmapped = unmapped;
    ......
}

Here sNotifyObjCInit = init. Next, find out where registerObjCNotifiers is called.
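
The register-then-notify pattern seen in registerObjCNotifiers and notifySingle can be sketched as follows. This is a simplified model under stated assumptions: `InitCallback` takes only a path (dyld's real callback type also receives a mach_header pointer), and `ToyState`, `register_init_callback`, `notify_single`, and `record_path` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified callback type.
using InitCallback = void (*)(const char* path);

enum ToyState { DependentsInitialized, FullyInitialized };

static InitCallback g_notify_init = nullptr;  // plays the role of sNotifyObjCInit

// Registration only records the pointer; nothing is called yet.
void register_init_callback(InitCallback init) {
    g_notify_init = init;
}

// Later, a matching state change triggers the stored callback,
// but only if one has been registered.
void notify_single(ToyState state, const char* path) {
    if (state == DependentsInitialized && g_notify_init != nullptr) {
        (*g_notify_init)(path);
    }
}

// Example callback that records which images it was told about.
std::vector<std::string>& notified_paths() {
    static std::vector<std::string> paths;
    return paths;
}

void record_path(const char* path) {
    notified_paths().push_back(path);
}
```

Before registration, `notify_single` does nothing; after `register_init_callback(&record_path)`, each notification with the DependentsInitialized state reaches the callback. This is why the order in which libobjc registers and images are initialized matters.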

registerObjCNotifiers

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

Step by step, starting from the loading of a single image, we arrive at the function _dyld_objc_notify_register, which is called from _objc_init, the initialization function of libobjc.dylib. Looking at the implementation of _objc_init ties the process together.

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
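
The first three statements of _objc_init form a run-once guard. A minimal sketch of that guard, with the hypothetical names `toy_init`, `g_setup_runs`, and `setup_runs`, shows why repeated calls are harmless:

```cpp
#include <cassert>

static int g_setup_runs = 0;  // hypothetical counter, for illustration only

// Same guard shape as in the excerpt: the body runs on the first call only.
// A plain bool guard like this is not thread-safe by itself.
void toy_init() {
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    ++g_setup_runs;  // stands in for environ_init(), tls_init(), ...
}

int setup_runs() { return g_setup_runs; }
```

However many times `toy_init()` is called, `setup_runs()` stays at 1 after the first call.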

LLDB debugging

With the derivation and source analysis above in hand, we run a real project to verify that the actual process matches. Set a breakpoint on _objc_init, run, and enter bt in LLDB to print the current stack.

The frames marked with the red box in the stack are the ones derived in the forward analysis above. The calls above the red box are not yet explained, so we continue by working backward from them. The call that precedes _objc_init is _os_object_init; select that frame to see the details.

The _os_object_init function comes from libdispatch.dylib. You can download the libdispatch source from Apple's open source site.

Source code analysis (backward derivation)

Stack flow information

The stack output above locates the _os_object_init function. Next, search the libdispatch source code for _os_object_init.

_os_object_init

void
_os_object_init(void)
{
    _objc_init();
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void (*)(const void *))&objc_retain,
        (void (*)(const void *))&objc_release,
        (void (*)(const void *))&_os_objc_destructInstance
    };
    _Block_use_RR2(&callbacks);
    ......
}

_os_object_init calls _objc_init. Search for _objc_init in libdispatch.

You can see that _objc_init comes from libobjc. Combined with the stack output above, the call flow is roughly: libSystem_initializer ~> libdispatch_init ~> _os_object_init ~> _objc_init

libdispatch_init

Search for libdispatch_init in the libdispatch source

void
libdispatch_init(void)
{
    ......
    
    _dispatch_hw_config_init();
    _dispatch_time_init();
    _dispatch_vtable_init();
    _os_object_init();
    _voucher_init();
    _dispatch_introspection_init();
}

You can see that libdispatch_init calls _os_object_init, which confirms the flow above. To check libSystem_initializer, you also need to download the libSystem source code.

libSystem_initializer

// libsyscall_initializer() initializes all of libSystem.dylib
// <rdar://problem/4892197>
__attribute__((constructor))
static void
libSystem_initializer(int argc, const char* argv[], const char* envp[], const char* apple[], const struct ProgramVars* vars)
{
    ......
    libdispatch_init();
    _libSystem_ktrace_init_func(LIBDISPATCH);
    ......
}

Searching the libSystem source shows that libSystem_initializer does call libdispatch_init. In the stack, the frame before libSystem_initializer is dyld's doModInitFunctions, so we return to dyld. The call flow now becomes:

doModInitFunctions ~> libSystem_initializer ~> libdispatch_init ~> _os_object_init ~> _objc_init

doModInitFunctions

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    ......
    Initializer* inits = (Initializer*)(sect->addr + fSlide);
    ......
    Initializer func = inits[j];
    ......
    // call the initializer; for libSystem this is where libSystem_initializer runs
    func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
    ......
}
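
The core of doModInitFunctions is a walk over a table of function pointers. A minimal sketch of that idea follows, assuming a simplified `Initializer` signature with only argc and argv; the table, the two sample initializers, and `run_sample_table` are hypothetical and do not read any real Mach-O section.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified initializer signature (the excerpt also passes envp, apple,
// and a ProgramVars pointer).
using Initializer = void (*)(int argc, const char* argv[]);

static std::vector<int>& call_log() {
    static std::vector<int> log;
    return log;
}

// Two hypothetical initializers standing in for entries of a
// module-initializer section.
static void first_initializer(int, const char*[])  { call_log().push_back(1); }
static void second_initializer(int, const char*[]) { call_log().push_back(2); }

// Walk a table of function pointers and call each one in order,
// as the loop over inits[j] does in the excerpt.
void run_mod_init_functions(const Initializer* inits, std::size_t count,
                            int argc, const char* argv[]) {
    for (std::size_t j = 0; j < count; ++j) {
        Initializer func = inits[j];
        func(argc, argv);
    }
}

// Runs the two sample initializers and returns the order they ran in.
std::vector<int> run_sample_table() {
    const Initializer table[] = { &first_initializer, &second_initializer };
    const char* argv[] = { "toy", nullptr };
    call_log().clear();
    run_mod_init_functions(table, 2, 1, argv);
    return call_log();
}
```

In this model, a function marked `__attribute__((constructor))`, such as libSystem_initializer, would simply be one entry in the table.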

Where is doModInitFunctions called? A search leads to doInitialization.

doInitialization

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);

    CRSetCrashLogMessage2(NULL);

    return (fHasDashInit || fHasInitializers);
}

doModInitFunctions is called from doInitialization, which in turn is called from the recursiveInitialization function reached in the forward analysis above, so the chain is closed.

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    ......
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        ......
        try {
            ......
            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            ......
        }
        ......
    }
}

Conclusion

Combining the source code analysis with project debugging, we obtain the following chain of function calls:

_dyld_start ~> dyld::_main ~> initializeMainExecutable ~> runInitializers ~> processInitializers ~> recursiveInitialization ~> doInitialization ~> doModInitFunctions ~> libSystem_initializer ~> libdispatch_init ~> _os_object_init ~> _objc_init

recursiveInitialization calls notifySingle, while registerObjCNotifiers, which we reached from notifySingle, appeared to do nothing but assign the callbacks. So when are the callbacks actually invoked?

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init,
                           _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped	= mapped;
    sNotifyObjCInit	= init;
    sNotifyObjCUnmapped = unmapped;
    
    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }
}

After the assignments, registerObjCNotifiers calls notifyBatchPartial, which invokes sNotifyObjCMapped for all images mapped so far.

sNotifyObjCInit is invoked in notifySingle, and notifySingle is called from recursiveInitialization above. Because recursiveInitialization is recursive, the callback fires as each image is initialized for the first time.
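
The "call 'mapped' function with all images mapped so far" step can be sketched with a toy loader. This is only a model: `ToyLoader`, `MappedCallback`, and `remember_images` are hypothetical, and dyld's real mapped callback receives a count plus arrays of paths and mach_header pointers rather than a vector of strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified callback type.
using MappedCallback = void (*)(const std::vector<std::string>& paths);

struct ToyLoader {
    std::vector<std::string> mapped;     // images mapped so far
    MappedCallback on_mapped = nullptr;  // plays the role of sNotifyObjCMapped

    // Mapping an image notifies the callback only if one is registered.
    void map_image(const std::string& path) {
        mapped.push_back(path);
        if (on_mapped) on_mapped({path});
    }

    // Registration stores the pointer, then immediately reports every
    // image that was mapped before registration happened.
    void register_mapped(MappedCallback cb) {
        on_mapped = cb;
        if (!mapped.empty()) on_mapped(mapped);
    }
};

// Example callback that accumulates every path it is given.
std::vector<std::string>& seen_images() {
    static std::vector<std::string> seen;
    return seen;
}

void remember_images(const std::vector<std::string>& paths) {
    seen_images().insert(seen_images().end(), paths.begin(), paths.end());
}
```

Images mapped before registration are delivered in one batch at registration time, and images mapped afterwards are delivered as they arrive. That is why registration is more than a plain assignment.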

Reference video

For an introduction to dyld2 and dyld3, see Apple's WWDC video.
