Introduction

We all know that the +load method is called before main when an app launches. That raises a few questions:

  • What happens before the load method is called?
  • How does Apple load the dynamic libraries into memory?
  • How are the dynamic libraries linked before that point?

With those questions in mind, let's introduce a really cool piece of the system: the dyld dynamic linker.

dyld overview

dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems. Once the kernel has finished preparing the process, dyld is responsible for the rest of the work. It is also open source: anyone can download the source code from Apple's website to read how it works and learn the details of how the system loads dynamic libraries.

Preparation

  • dyld-852 (the latest version, dyld3)
  • libdispatch source link
  • Libsystem source link
  • objc4-818.2 source link

Compilation and libraries

iOS developers mostly write upper-level code and are most familiar with .h and .m files. Here is how .h and .m files are compiled into an executable file:

  • Source files: .h, .m, .cpp, and other files
  • Preprocessing: comments are removed, conditional compilation is resolved, headers are expanded, and macros are replaced
  • Compilation: lexical analysis and syntax analysis are performed, an intermediate representation (IR) file is produced, and finally an assembly .s file is generated
  • Assembly: the .s files are converted to machine code, producing .o files
  • Linking: all the .o files, together with any linked third-party libraries, are combined into one executable file of Mach-O type

The build process

The way the dyld linker works is shown below.

dyld introduction and process analysis

First, let's be clear that dyld's linking happens before the main function. Create a new app project, set a breakpoint in main, and print the stack.

The stack shows libdyld.dylib`start -> main, but not what happens in between. So pick a method that runs before main, load, set a breakpoint there, and print the stack.

  • The stack for the load flow: 1. _dyld_start -> 2. dyldbootstrap::start -> 3. dyld::_main -> 4. dyld::initializeMainExecutable() -> 5. runInitializers -> 6. processInitializers -> 7. recursiveInitialization -> 8. dyld::notifySingle -> 9. load_images -> 10. +[ViewController load]

  • There are many steps here. The general idea when analyzing them is to keep in mind what dyld is: a linker. Reading the dyld source is fairly tedious, so be patient with yourself. With that said, let's look at the dyld source code, which is linked in the Preparation section; please download it first.

_dyld_start

  • Open the dyld source and locate the _dyld_start method

  • The bl instruction calls the function dyldbootstrap::start
  • dyldbootstrap::start is C++ syntax: search for the namespace dyldbootstrap first, then find the start method inside it

dyldbootstrap::start

  • The earlier part of the function is dyld's bootstrap handling and C++ setup code
  • The return value is the result of calling dyld::_main, so next we locate the dyld::_main function

dyld::_main

  • dyld::_main is about 1,000 lines in total, most of which is conditional preparation: environment, platform information, paths, host information. We won't analyze it line by line, since that is not our focus. Those interested can watch the related WWDC sessions for details.

  • getHostInfo(mainExecutableMH, mainExecutableSlide): platform information handling

  • The dyld condition-preparation stage

mapSharedCache: the shared cache

In iOS, the dynamic libraries each program depends on have to be loaded into memory one by one by dyld (located at /usr/lib/dyld). Many system libraries, however, are used by almost every program. If every program loaded them again separately, startup would inevitably be slow. To optimize startup speed and improve performance, the shared cache mechanism was introduced: all the default dynamic libraries are merged into one large cache file, stored in the /System/Library/Caches/com.apple.dyld/ directory.

  • We look at two functions: checkSharedRegionDisable checks whether the shared cache is disabled (iOS cannot run without the shared region), and mapSharedCache calls loadDyldCache to load the cache.

mapSharedCache: loading the shared cache

There are three cases when loading the shared cache:

  • The cache is loaded only into the current process: mapCachePrivate() is called.
  • The shared cache is already loaded: no action is taken.
  • The current process is the first to load the shared cache: mapCacheSystemWide() is called.
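The three cases above can be read as a small decision function. The following is only a sketch of that decision: CacheRequest, CacheAction, chooseCacheAction, and demoCacheCases are hypothetical names invented for illustration, and the real loadDyldCache works from dyld's own options structure, which is not reproduced here.

```cpp
#include <cassert>

// Hypothetical inputs for illustration; not dyld's real option structure.
struct CacheRequest {
    bool forcePrivate;         // map the cache only into the current process
    bool alreadyMappedShared;  // the system-wide cache is already mapped
};

enum class CacheAction { MapPrivate, ReuseExisting, MapSystemWide };

// Mirrors the three cases listed above, checked in that order.
CacheAction chooseCacheAction(const CacheRequest& req) {
    if (req.forcePrivate)
        return CacheAction::MapPrivate;     // case 1: mapCachePrivate()
    if (req.alreadyMappedShared)
        return CacheAction::ReuseExisting;  // case 2: nothing to do
    return CacheAction::MapSystemWide;      // case 3: mapCacheSystemWide()
}

// Usage: one call per case.
void demoCacheCases() {
    assert(chooseCacheAction({true, false}) == CacheAction::MapPrivate);
    assert(chooseCacheAction({false, true}) == CacheAction::ReuseExisting);
    assert(chooseCacheAction({false, false}) == CacheAction::MapSystemWide);
}
```

Checking the private case first is an assumption of this sketch; it simply follows the order in which the cases are listed.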

instantiateFromLoadedImage: instantiating the image loader for the main program

  • The maximum number of segments is 256
  • The maximum number of commands is 4096
  • The image must depend on the libSystem library

To learn about segments, commands, and the Mach-O format in general, inspect an executable file with the MachOView tool.

loadInsertedDylib: inserting dynamic libraries

loadInsertedDylib loads the inserted dynamic libraries. The number of inserted dynamic libraries is the total number of images minus one (the main executable).

link: linking the main program

    void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly,
                           bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
    {
        ...
        // recursively load all dependent libraries
        this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
        context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);
        ...
        __block uint64_t t2, t3, t4, t5;
        {
            dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
            t2 = mach_absolute_time();
            // recursive rebase
            this->recursiveRebaseWithAccounting(context);
            context.notifyBatch(dyld_image_state_rebased, false);

            t3 = mach_absolute_time();
            if ( !context.linkingMainExecutable )
                // recursive binding of non-lazy symbols
                this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

            t4 = mach_absolute_time();
            if ( !context.linkingMainExecutable )
                // weak binding
                this->weakBind(context);
            t5 = mach_absolute_time();
        }
    }

The ImageLoader::link method is called here. ImageLoader is responsible for loading image files, and link performs these steps:

  • Recursively load all dependent dynamic libraries
  • Recursively rebase the images
  • Recursively bind the non-lazy symbols
  • Weak binding

recursiveLoadLibraries: recursively loading dynamic libraries

    void ImageLoader::recursiveLoadLibraries(const LinkContext& context, bool preflightOnly,
                                             const RPathChain& loaderRPaths, const char* loadPath)
    {
        ...
        // get the list of libraries this image depends on
        DependentLibraryInfo libraryInfos[fLibraryCount];
        this->doGetDependentLibraries(libraryInfos);

        // get the rpaths this image contributes
        std::vector<const char*> rpathsFromThisImage;
        this->getRPaths(context, rpathsFromThisImage);
        const RPathChain thisRPaths(&loaderRPaths, &rpathsFromThisImage);

        // load each dependent library and save it
        for (unsigned int i = 0; i < fLibraryCount; ++i) {
            ...
            dependentLib = context.loadLibrary(requiredLibInfo.name, true, this->getPath(), &thisRPaths, cacheIndex);
            setLibImage(i, dependentLib, depLibReExported, requiredLibInfo.upward);
            ...
        }

        // tell each dependent library to load its own dependents
        for (unsigned int i = 0; i < libraryCount(); ++i) {
            ImageLoader* dependentImage = libImage(i);
            if ( dependentImage != NULL ) {
                dependentImage->recursiveLoadLibraries(context, preflightOnly, thisRPaths, libraryInfos[i].name);
            }
        }
    }

  • Get the dynamic libraries the current image depends on, along with their file paths
  • Load each dynamic library the image depends on and save it
  • Tell each of those dependent libraries to load the dynamic libraries it needs in turn
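The two passes above (load the direct dependencies, then ask each one to load its own) can be modelled without any Mach-O parsing. This is a simplified sketch under stated assumptions: DependencyTable, loadDependents, and loadAll are hypothetical names, and the table stands in for what doGetDependentLibraries and context.loadLibrary would provide in dyld.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: image name -> names of the libraries it links against.
using DependencyTable = std::map<std::string, std::vector<std::string>>;

// Pass 1 loads the direct dependencies of `image` (each library only once);
// pass 2 asks every dependency to load its own dependents.
void loadDependents(const DependencyTable& table, const std::string& image,
                    std::set<std::string>& loaded, std::set<std::string>& visited,
                    std::vector<std::string>& order) {
    if (!visited.insert(image).second)
        return;  // this image has already processed its dependents
    auto it = table.find(image);
    if (it == table.end())
        return;  // no recorded dependencies
    for (const std::string& dep : it->second) {
        if (loaded.insert(dep).second)
            order.push_back(dep);  // "load" the library and remember it
    }
    for (const std::string& dep : it->second)
        loadDependents(table, dep, loaded, visited, order);
}

// Returns the order in which images were loaded, starting from `root`.
std::vector<std::string> loadAll(const DependencyTable& table, const std::string& root) {
    assert(!root.empty());
    std::set<std::string> loaded{root};
    std::set<std::string> visited;
    std::vector<std::string> order{root};
    loadDependents(table, root, loaded, visited, order);
    return order;
}
```

With a table where App links Foundation and UIKit, and both of those link libSystem, a shared dependency such as libSystem appears in the result only once, which is the point of saving each loaded library.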

weakBind: weak binding for the main program

  • All of this preparation leads up to initializeMainExecutable(), so let's focus on it

initializeMainExecutable: initializing the main program

  • The initializers of the dynamic libraries run first, and then the initializers of the main program
  • Search globally for runInitializers

    void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
    {
        uint64_t t1 = mach_absolute_time();
        mach_port_t thisThread = mach_thread_self();
        ImageLoader::UninitedUpwards up;
        up.count = 1;
        up.imagesAndPaths[0] = { this, this->getPath() };
        processInitializers(context, thisThread, timingInfo, up);
        context.notifyBatch(dyld_image_state_initialized, false);
        mach_port_deallocate(mach_task_self(), thisThread);
        uint64_t t2 = mach_absolute_time();
        fgTotalInitTime += (t2 - t1);
    }
  • Search globally for processInitializers

    // upward dylib initializers can be run too soon
    // To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
    // have their initialization postponed until after the recursion through downward dylibs
    // has completed.
    void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                          InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
    {
        uint32_t maxImageCount = context.imageCount()+2;
        ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
        ImageLoader::UninitedUpwards& ups = upsBuffer[0];
        ups.count = 0;
        // Calling recursive init on all images in images list, building a new list of
        // uninitialized upward dependencies.
        for (uintptr_t i = 0; i < images.count; ++i) {
            images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
        }
        // If any upward dependencies remain, init them.
        if ( ups.count > 0 )
            processInitializers(context, thisThread, timingInfo, ups);
    }
  • Search globally for recursiveInitialization

    void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                              InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
    {
        recursive_lock lock_info(this_thread);
        recursiveSpinLock(lock_info);
        ...
        // initialize lower level libraries first
        for (unsigned int i = 0; i < libraryCount(); ++i) {
            ImageLoader* dependentImage = libImage(i);
            if ( dependentImage != NULL ) {
                // don't try to initialize stuff "above" me yet
                if ( libIsUpward(i) ) {
                    uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                    uninitUps.count++;
                }
                else if ( dependentImage->fDepth >= fDepth ) {
                    dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                }
            }
        }
        ...
        // let objc know we are about to initialize this image
        uint64_t t1 = mach_absolute_time();
        fState = dyld_image_state_dependents_initialized;
        oldState = fState;
        context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

        // initialize this image
        bool hasInitializers = this->doInitialization(context);

        // let anyone know we finished initializing this image
        fState = dyld_image_state_initialized;
        oldState = fState;
        context.notifySingle(dyld_image_state_initialized, this, NULL);
        ...
        recursiveSpinUnLock();
    }
  • recursiveInitialization does three things:

  • 1. Initialize the system libraries the image depends on

  • 2. Initialize, recursively, the images that those libraries depend on

  • 3. Initialize the image itself with doInitialization, and send the notification that initialization is complete

  • The context on which notifySingle is called is a const LinkContext&. LinkContext is a struct, and notifySingle is one of its function-pointer members:

void (*notifySingle)(dyld_image_states, const ImageLoader* image, InitializerTimingList*);

  • Next we focus on the notifySingle flow and on doInitialization
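The ordering that recursiveInitialization enforces (dependencies first, then a notification, then the image's own initializers, then a second notification) can be sketched with a toy model. Image, ImageState, and recursiveInit below are hypothetical stand-ins, not dyld types, and the sketch deliberately leaves out the upward-link handling and the locking shown in the real function.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ImageState { Mapped, DependentsInitialized, Initialized };

// Hypothetical stand-in for an ImageLoader; only what the ordering needs.
struct Image {
    std::string name;
    std::vector<Image*> dependencies;
    ImageState state = ImageState::Mapped;
    bool inProgress = false;  // breaks dependency cycles
};

// Records, in order, what would happen to each image.
void recursiveInit(Image& image, std::vector<std::string>& events) {
    if (image.inProgress || image.state == ImageState::Initialized)
        return;
    image.inProgress = true;

    // 1. Lower-level libraries are initialized first.
    for (Image* dep : image.dependencies) {
        assert(dep != nullptr);
        recursiveInit(*dep, events);
    }

    // 2. First notification: all dependents are initialized.
    image.state = ImageState::DependentsInitialized;
    events.push_back("dependents_initialized " + image.name);

    // 3. Run this image's own initializers (doInitialization in dyld).
    events.push_back("doInitialization " + image.name);

    // 4. Second notification: this image is fully initialized.
    image.state = ImageState::Initialized;
    events.push_back("initialized " + image.name);
    image.inProgress = false;
}
```

For an App image that depends on Foundation, which in turn depends on libSystem, the recorded events start with libSystem and end with App, matching the rule that the dynamic libraries are initialized before the main program.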

notifySingle: notification injection

Search globally for notifySingle:

gLinkContext.notifySingle = &notifySingle;

  • (*sNotifyObjCInit)(image->getRealPath(), image->machHeader()) is a function call whose arguments describe the image. So now we only need to find out where sNotifyObjCInit is assigned. Search globally for sNotifyObjCInit to find the assignment.

  • In registerObjCNotifiers, the second parameter, init, is of type _dyld_objc_notify_init and is assigned to sNotifyObjCInit. So where does the value of init come from?

  • A global search of the dyld source does not turn up a single place that supplies this value. The only solution is to follow the flow with a symbolic breakpoint on _dyld_objc_notify_register.

  • The stack shows that _dyld_objc_notify_register is called by libobjc.A.dylib`_objc_init, so let's follow up in the libobjc source.

  • Flow analysis: doInitialization -> doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init -> _dyld_objc_notify_register -> registerObjCNotifiers

  • The libSystem_initializer method is in the libSystem system library

  • The libdispatch_init and _os_object_init methods are in the libdispatch system library

  • The _objc_init method is in the libobjc system library

  • From here we get the complete closed loop: dyld initializes libSystem, which in turn initializes libdispatch and libobjc. Once an image and its dependencies have finished initializing, dyld sends a notification to libobjc, which then takes care of the classes, methods, and so on.
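The closed loop described above is a callback registration: one side stores a function pointer, the other side supplies it during its own initialization. The sketch below models only that pattern. Every name in it is a simplified stand-in chosen for illustration (the comments say which real symbol each one imitates), and the callback takes just a path, whereas the call shown in the article passes the image's path and its Mach-O header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// ---- the "dyld" side ----
using NotifyInit = void (*)(const char* path);  // imitates _dyld_objc_notify_init
static NotifyInit sNotifyInit = nullptr;         // imitates sNotifyObjCInit

// Imitates registerObjCNotifiers: remember the callback for later.
void registerNotifier(NotifyInit init) {
    assert(init != nullptr);
    sNotifyInit = init;
}

// Imitates notifySingle: call back only if someone has registered.
void notifySingle(const char* path) {
    if (sNotifyInit != nullptr)
        (*sNotifyInit)(path);
}

// ---- the "libobjc" side ----
static std::vector<std::string> gLoadedImages;

// Imitates load_images: this is where +load methods would be called.
void loadImages(const char* path) {
    gLoadedImages.push_back(path);
}

// Imitates _objc_init: hand the callback over to the "dyld" side.
void objcInit() {
    registerNotifier(&loadImages);
}
```

A notification sent before objcInit() runs reaches nobody; after it runs, every notifySingle call lands in loadImages. That is why libSystem, and through it libobjc, has to be initialized before the notifications for other images matter.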

The flow chart of dyld

It’s still being made.

Conclusion

+load call flow

  • _dyld_start -> dyldbootstrap::start -> dyld::_main -> initializeMainExecutable -> runInitializers -> processInitializers -> recursiveInitialization -> notifySingle -> load_images -> +[ViewController load]

_objc_init call flow

  • doInitialization -> doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init -> _dyld_objc_notify_register -> registerObjCNotifiers

These two call flows form a complete loop through doInitialization and notifySingle.
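As a side note on the second flow, the effect of initializer functions running before main can be observed from ordinary code. The minimal sketch below assumes a GCC or Clang toolchain, since __attribute__((constructor)) is a compiler extension. The names gInitializerRan, earlyInitializer, and checkFromMain are made up for illustration, and the link to doModInitFunctions is this note's reading of the flow rather than something the sketch itself demonstrates.

```cpp
#include <cassert>

// Constant-initialized, so it is already valid when the initializer runs.
static bool gInitializerRan = false;

// GCC/Clang extension: run this function while the image is being
// initialized, that is, before main is entered.
__attribute__((constructor))
static void earlyInitializer() {
    gInitializerRan = true;
}

// Reports whether the initializer has already run.
bool initializerRan() {
    return gInitializerRan;
}

// Call this from main: by then the initializer has already executed.
void checkFromMain() {
    assert(initializerRan());
}
```

Setting a breakpoint inside earlyInitializer in an app project and printing the stack is one way to compare its path with the +load stack shown earlier.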