This is day 22 of my participation in the August writing challenge.

Following on from the previous article, let's continue our analysis of how dyld loads a program.

3.3 mapSharedCache: loading the shared cache

The shared cache holds the system dynamic libraries, such as UIKit and Foundation. mapSharedCache calls loadDyldCache:

static void mapSharedCache(uintptr_t mainExecutableSlide)
{
    ...
    // call loadDyldCache
    loadDyldCache(opts, &sSharedCacheLoadInfo);
    ...
}

3.3.1 loadDyldCache

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress  = 0;
    results->slide        = 0;
    results->errorMessage = nullptr;

#if TARGET_OS_SIMULATOR
    // simulator only supports mmap()ing cache privately into process
    return mapCachePrivate(options, results);
#else
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        }
        else {
            // slow path: this is the first process to load cache
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}

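loadDyldCache has three paths:
1. Private mapping (forcePrivate is set, or always on the simulator): mapCachePrivate maps the cache into the current process only. It is not placed in the shared region and is for this process's own use.
2. The cache is already mapped into the shared region: reuseExistingCache succeeds and no further mapping is needed.
3. The current process is the first one to load the cache: mapCacheSystemWide is called.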
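The three paths can be sketched as a small decision function. This is only an illustration, not dyld code: CacheOptions, CachePath, chooseCachePath and the alreadyMapped flag are hypothetical names, and the error handling of the real function is left out.

```cpp
// Hypothetical stand-in for dyld's SharedCacheOptions; only the fields needed here.
struct CacheOptions {
    bool forcePrivate;  // mirrors options.forcePrivate in the excerpt
    bool isSimulator;   // stands in for the TARGET_OS_SIMULATOR build flag
};

// The three outcomes described above.
enum class CachePath { Private, ReuseExisting, SystemWide };

// Picks the branch loadDyldCache would take. alreadyMapped models whether
// reuseExistingCache would find the cache in the shared region.
CachePath chooseCachePath(const CacheOptions& opts, bool alreadyMapped) {
    if (opts.isSimulator || opts.forcePrivate)
        return CachePath::Private;        // mapCachePrivate
    if (alreadyMapped)
        return CachePath::ReuseExisting;  // fast path: reuseExistingCache
    return CachePath::SystemWide;         // slow path: mapCacheSystemWide
}
```

Note that the private path wins regardless of whether the cache is already mapped, which matches the order of the checks in the excerpt.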

During app startup, the shared cache of dynamic libraries is the first thing to be loaded.

3.4 instantiateFromLoadedImage: instantiating the main program (creating the image)

static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // instantiate the image
    ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
    // add the image to all images
    addImage(image);
    return (ImageLoaderMachO*)image;
    // throw "main executable not a known format";
}
  • The main program's header, ASLR slide, and path are passed in to instantiate the main program and produce an image.
  • The image is added to all images.

The actual instantiation is done by ImageLoaderMachO::instantiateMainExecutable:

// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    // get the load commands
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    // (select the subclass that matches the value of compressed to instantiate the image)
    if ( compressed )
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}
  • sniffLoadCommands is called to gather the relevant information, such as compressed.
  • The value of compressed determines which subclass is used to load the image. ImageLoader is an abstract class; the main program is instantiated by selecting the matching subclass.
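The subclass selection is a classic factory pattern. A minimal sketch, assuming hypothetical Loader, CompressedLoader and ClassicLoader classes in place of dyld's ImageLoader hierarchy, and a supportClassic flag in place of the SUPPORT_CLASSIC_MACHO build setting:

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical abstract base, standing in for ImageLoader.
struct Loader {
    virtual ~Loader() = default;
    virtual std::string kind() const = 0;
};

// Stand-ins for ImageLoaderMachOCompressed and ImageLoaderMachOClassic.
struct CompressedLoader : Loader {
    std::string kind() const override { return "compressed"; }
};
struct ClassicLoader : Loader {
    std::string kind() const override { return "classic"; }
};

// Chooses the concrete subclass from the sniffed `compressed` flag.
// supportClassic models the SUPPORT_CLASSIC_MACHO compile-time switch.
std::unique_ptr<Loader> makeMainExecutableLoader(bool compressed, bool supportClassic) {
    if (compressed)
        return std::unique_ptr<Loader>(new CompressedLoader());
    if (supportClassic)
        return std::unique_ptr<Loader>(new ClassicLoader());
    throw std::runtime_error("missing LC_DYLD_INFO load command");
}
```

The caller only ever sees the abstract type, which is why the rest of dyld can work with ImageLoader pointers without caring which format the image uses.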

sniffLoadCommands

void ImageLoaderMachO::sniffLoadCommands(const macho_header* mh, const char* path, bool inCache, bool* compressed,
                                         unsigned int* segCount, unsigned int* libCount, const LinkContext& context,
                                         const linkedit_data_command** codeSigCmd,
                                         const encryption_info_command** encryptCmd)
{
    // set from LC_DYLD_INFO and LC_DYLD_INFO_ONLY
    *compressed = false;
    // number of segments
    *segCount = 0;
    // number of libs
    *libCount = 0;
    *codeSigCmd = NULL;
    *encryptCmd = NULL;
    ...
    if ( *segCount > 255 )
        dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);
    if ( *libCount > 4095 )
        dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);
    if ( needsAddedLibSystemDepency(*libCount, mh) )
        *libCount = 1;
    // dylibs that use LC_DYLD_CHAINED_FIXUPS have that load command removed when put in the dyld cache
    if ( !*compressed && (mh->flags & MH_DYLIB_IN_CACHE) )
        *compressed = true;
}
  • compressed is determined from the LC_DYLD_INFO and LC_DYLD_INFO_ONLY load commands.
  • segCount can be at most 255.
  • libCount can be at most 4095.
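To make the counting and the limits concrete, here is a simplified sketch. It assumes a hypothetical Cmd enum and SniffResult struct instead of real Mach-O load commands (the real constants live in <mach-o/loader.h>), and it ignores details of the real function such as which segments are counted.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical simplified command kinds, standing in for LC_* load commands.
enum class Cmd { Segment, LoadDylib, DyldInfo, DyldInfoOnly, Other };

struct SniffResult {
    bool compressed;
    unsigned segCount;
    unsigned libCount;
};

// Walks the commands once, then applies the same limits as the excerpt.
SniffResult sniff(const std::vector<Cmd>& cmds) {
    SniffResult r = {false, 0, 0};
    for (Cmd c : cmds) {
        switch (c) {
            case Cmd::DyldInfo:
            case Cmd::DyldInfoOnly: r.compressed = true; break;
            case Cmd::Segment:      ++r.segCount;        break;
            case Cmd::LoadDylib:    ++r.libCount;        break;
            case Cmd::Other:                             break;
        }
    }
    if (r.segCount > 255)
        throw std::runtime_error("malformed image: more than 255 segments");
    if (r.libCount > 4095)
        throw std::runtime_error("malformed image: more than 4095 dependent libraries");
    return r;
}
```

Because the checks use a strict greater-than comparison, 255 segments and 4095 libraries are still accepted; one more of either is rejected.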

3.5 loadInsertedDylib: loading inserted dynamic libraries

static void loadInsertedDylib(const char* path)
{
    unsigned cacheIndex;
    try {
        ...
        // call load to actually load the dynamic library
        load(path, context, cacheIndex);
    }
    ...
}
  • After the context is configured, load is called to load the dynamic library.

3.6 ImageLoader::link: linking the main program and dynamic libraries

void link(ImageLoader* image, bool forceLazysBound, bool neverUnload, const ImageLoader::RPathChain& loaderRPaths, unsigned cacheIndex)
{
    // add to list of known images. This did not happen at creation time for bundles
    if ( image->isBundle() && !image->isLinked() )
        addImage(image);

    // we detect root images as those not linked in yet
    if ( !image->isLinked() )
        addRootImage(image);

    // process images
    try {
        const char* path = image->getPath();
#if SUPPORT_ACCELERATE_TABLES
        if ( image == sAllCacheImagesProxy )
            path = sAllCacheImagesProxy->getIndexedPath(cacheIndex);
#endif
        // link
        image->link(gLinkContext, forceLazysBound, false, neverUnload, loaderRPaths, path);
    }
    // (catch block omitted in this excerpt)
}
  • link ultimately calls ImageLoader::link.

ImageLoader::link

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);

    // start time
    uint64_t t0 = mach_absolute_time();
    // recursively load the libraries that the main program depends on
    this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
    context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);
    ...
    uint64_t t1 = mach_absolute_time();
    context.clearAllDepths();
    this->updateDepth(context.imageCount());

    __block uint64_t t2, t3, t4, t5;
    {
        dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
        t2 = mach_absolute_time();
        // rebase: correct for ASLR
        this->recursiveRebaseWithAccounting(context);
        context.notifyBatch(dyld_image_state_rebased, false);

        t3 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind non-lazy symbols
            this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

        t4 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind weak symbols
            this->weakBind(context);
        t5 = mach_absolute_time();
    }

    // interpose any dynamically loaded images
    if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
        // recursively apply interposing from the inserted dynamic libraries
        this->recursiveApplyInterposing(context);
    }

    // now that all fixups are done, make __DATA_CONST segments read-only
    if ( !context.linkingMainExecutable )
        this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
    uint64_t t6 = mach_absolute_time();

    if ( context.registerDOFs != NULL ) {
        std::vector<DOFInfo> dofs;
        this->recursiveGetDOFSections(context, dofs);
        // register
        context.registerDOFs(dofs);
    }
    uint64_t t7 = mach_absolute_time();

    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);

    // record the timings; with the right environment variables set, dyld prints how long loading took
    fgTotalLoadLibrariesTime += t1 - t0;
    fgTotalRebaseTime += t3 - t2;
    fgTotalBindTime += t4 - t3;
    fgTotalWeakBindTime += t5 - t4;
    fgTotalDOF += t7 - t6;

    // done with initial dylib loads
    fgNextPIEDylibAddress = 0;
}
  • Rebase to correct for ASLR.
  • Bind the non-lazy symbols.
  • Bind the weak symbols.
  • Register the DOF sections.
  • Record the timings. With the right environment variables configured, dyld prints how long loading the app took.
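The timing pattern in ImageLoader::link (take a timestamp before and after each phase and accumulate the difference) can be sketched with std::chrono. This is an illustration only: PhaseTiming and runPhases are hypothetical names, and std::chrono::steady_clock is used here in place of mach_absolute_time.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical record of how long one phase took.
struct PhaseTiming {
    std::string name;
    Clock::duration elapsed;
};

using Phase = std::pair<std::string, std::function<void()>>;

// Runs the phases strictly in order and measures each one separately,
// the way link() brackets each step with timestamps t0..t7.
std::vector<PhaseTiming> runPhases(const std::vector<Phase>& phases) {
    std::vector<PhaseTiming> timings;
    for (const Phase& phase : phases) {
        Clock::time_point start = Clock::now();
        phase.second();  // e.g. load libraries, rebase, bind, weak bind
        timings.push_back({phase.first, Clock::now() - start});
    }
    return timings;
}
```

Keeping one duration per phase is what makes a per-phase breakdown possible afterwards, instead of a single total.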

3.7 initializeMainExecutable: initializing the main program

void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initializers for any inserted dylibs
    // get all of the images
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        // start from index 1 and go to the end
        for (size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }

    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);

    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL )
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
  • The images are initialized starting from index 1 (the inserted dylibs); then runInitializers is called on the main program (index 0).
  • Setting the environment variables DYLD_PRINT_STATISTICS and DYLD_PRINT_STATISTICS_DETAILS prints the related timing information.
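The ordering rule is easy to get backwards, so here is a tiny sketch of it. It assumes a hypothetical initializationOrder helper that takes the root image names with the main program at index 0; it only models the order of the top-level runInitializers calls, not what each call does internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the order in which runInitializers would be called on the roots:
// inserted dylibs (index 1 and up) first, the main program (index 0) last.
std::vector<std::string> initializationOrder(const std::vector<std::string>& roots) {
    std::vector<std::string> order;
    for (std::size_t i = 1; i < roots.size(); ++i)
        order.push_back(roots[i]);
    if (!roots.empty())
        order.push_back(roots[0]);
    return order;
}
```

With no inserted dylibs, the list has a single root and only the main program's initializers are run at this level.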

ImageLoader::runInitializers(ImageLoader.cpp)

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    // call processInitializers
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}
  • up.count is set to 1, and then processInitializers is called.

ImageLoader::processInitializers

void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                      InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // recursively initialize each image in the list
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}
  • This ends up calling recursiveInitialization.

ImageLoader::recursiveInitialization(ImageLoader.cpp)

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    ...
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for (unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    ...
                    dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                }
            }
            ...
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            // notifySingle only triggers +load when the state is dyld_image_state_dependents_initialized,
            // so this call ends up invoking this image's own +load methods. This starts with libobjc.A.dylib.
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            // initialize this image and call its C++ constructors.
            // For libSystem, this is where libSystem_initializer is called, which calls objc_init:
            // 1. libSystem_initializer -> objc_init registers the callbacks.
            // 2. _dyld_objc_notify_register calls map_images and load_images; load_images is called for
            //    some already-initialized system libraries, for example libsystem_featureflags.dylib,
            //    libsystem_trace.dylib, and libxpc.dylib.
            // For each image, its own +load methods run first, followed by its C++ constructors.
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            // +load is not called here: inside notifySingle, +load is only called when
            // the state is dyld_image_state_dependents_initialized.
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            ...
        }
        ...
    }
    recursiveSpinUnLock();
}
  • The whole process is recursive: the dependent libraries are initialized first, then the image itself.
  • notifySingle ultimately calls all of the image's +load methods in objc. Only the first notifySingle call triggers +load; the second does not, because its argument is dyld_image_state_initialized. The state dyld_image_state_dependents_initialized means that the dependencies have been initialized and the image is ready to initialize itself.
  • doInitialization ultimately calls the C++ constructors. The first initializer to run is libSystem_initializer -> objc_init, which registers the callbacks. During registration, map_images and load_images (+load) are called. Here load_images is invoked for some already-initialized system libraries, for example libsystem_featureflags.dylib, libsystem_trace.dylib, and libxpc.dylib.
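The "dependencies first, then self, and break cycles" logic can be sketched on a plain dependency graph. This is a simplified model, not dyld's implementation: DependencyGraph and InitWalker are hypothetical names, the three-value State enum stands in for fState, and appending to order_ stands in for running +load and the constructors.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: image name -> names of the libraries it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

class InitWalker {
public:
    explicit InitWalker(DependencyGraph graph) : graph_(std::move(graph)) {}

    // Initializes the dependencies of `name` first, then `name` itself.
    void recursiveInitialization(const std::string& name) {
        State& state = states_[name];  // new entries start as NotStarted
        if (state != State::NotStarted)
            return;                    // already done, or in progress: breaks cycles
        state = State::InProgress;
        for (const std::string& dep : graph_[name])
            recursiveInitialization(dep);
        order_.push_back(name);        // stands in for +load and the C++ constructors
        state = State::Initialized;
    }

    const std::vector<std::string>& order() const { return order_; }

private:
    enum class State { NotStarted, InProgress, Initialized };
    DependencyGraph graph_;
    std::map<std::string, State> states_;
    std::vector<std::string> order_;
};
```

Marking an image as in progress before visiting its dependencies plays the same role as setting fState to dyld_image_state_dependents_initialized-1 in the excerpt: if two libraries depend on each other, the second visit returns immediately instead of recursing forever.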

C++ system constructor

__attribute__((constructor)) void func() {
   printf("\n ---func--- \n");
}

Within the same image, the +load methods are called before the C++ constructors.

dyld::notifySingle(dyld2.cpp)

notifySingle is the function assigned to the link context's notifySingle field in setContext:

static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    ...
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        // the callback pointer sNotifyObjCInit is assigned in registerObjCNotifiers;
        // here execution goes to load_images in objc
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    ...
}
  • No direct call to load_images can be found in notifySingle (the stack trace shows load_images being called right after notifySingle).
  • This function invokes the sNotifyObjCInit callback, on the condition that state is dyld_image_state_dependents_initialized.

Searching for where the sNotifyObjCInit callback is assigned shows that it is set in registerObjCNotifiers:

// Who calls registerObjCNotifiers? _dyld_objc_notify_register. Three parameters are assigned, of types
// _dyld_objc_notify_mapped, _dyld_objc_notify_init, and _dyld_objc_notify_unmapped
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    // first parameter: map_images
    sNotifyObjCMapped = mapped;
    // second parameter: load_images
    sNotifyObjCInit = init;
    // third parameter: unmap_image
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            // call load_images for some system libraries
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}
  • In registerObjCNotifiers, sNotifyObjCInit is assigned from the second parameter, of type _dyld_objc_notify_init.
  • After the assignments, notifyBatchPartial is called (which internally calls sNotifyObjCMapped).
  • It then loops over all images and calls load_images for the system libraries that are already initialized, for example libsystem_featureflags.dylib, libsystem_trace.dylib, and libxpc.dylib.
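The interplay between registration and notifySingle can be sketched as a small callback registry. This is a simplified model under stated assumptions: ObjCNotifier, ImageInfo, ImageState and registerInit are hypothetical names, only the init callback is modeled, and the image->notifyObjC() check is left out.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subset of dyld's image states.
enum class ImageState { Bound, DependentsInitialized, Initialized };

struct ImageInfo {
    std::string path;
    ImageState state;
};

class ObjCNotifier {
public:
    using InitCallback = std::function<void(const std::string& path)>;

    explicit ObjCNotifier(std::vector<ImageInfo> images) : images_(std::move(images)) {}

    // Models registerObjCNotifiers: store the callback, then replay it for
    // every image that finished initializing before registration happened.
    void registerInit(InitCallback init) {
        init_ = std::move(init);
        if (!init_)
            return;
        for (const ImageInfo& image : images_) {
            if (image.state == ImageState::Initialized)
                init_(image.path);
        }
    }

    // Models notifySingle: the callback fires only for dependents_initialized.
    void notifySingle(ImageState state, const ImageInfo& image) const {
        if (state == ImageState::DependentsInitialized && init_)
            init_(image.path);
    }

private:
    std::vector<ImageInfo> images_;
    InitCallback init_;
};
```

The replay loop is what lets libraries initialized before objc registered itself still receive their load_images call, while every later image gets it through notifySingle.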

Searching for callers of registerObjCNotifiers shows that it is called by _dyld_objc_notify_register.

_dyld_objc_notify_register(dyldAPIs.cpp)

// Called via _objc_init -> _dyld_objc_notify_register. Setting a symbolic breakpoint shows
// that the caller is _objc_init in objc-os.mm.
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}
  • The caller of _dyld_objc_notify_register cannot be found in the dyld source.

Setting a symbolic breakpoint on _dyld_objc_notify_register shows that it is called by _objc_init.

_objc_init is defined in objc-os.mm:

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;

    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();

    // _objc_init calls _dyld_objc_notify_register in dyldAPIs.cpp; the second argument is load_images
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
  • This confirms that _objc_init calls _dyld_objc_notify_register.
  • The first argument, map_images, is assigned to sNotifyObjCMapped.
  • The second argument, load_images, is assigned to sNotifyObjCInit.
  • The third argument, unmap_image, is assigned to sNotifyObjCUnmapped.

How these three callbacks interact with dyld is described in more detail later.

ImageLoaderMachO::doInitialization(ImageLoaderMachO.cpp)

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    // run the C++ constructors
    doModInitFunctions(context);

    CRSetCrashLogMessage2(NULL);

    return (fHasDashInit || fHasInitializers);
}

Add the following code, then inspect the resulting Mach-O file:

__attribute__((constructor)) void func1() {
    printf("\n ---func1--- \n");
}

__attribute__((constructor)) void func2() {
    printf("\n ---func2--- \n");
}

You'll notice an extra __mod_init_func section in the Mach-O file.

  • doModInitFunctions is called to run the C++ constructors (C functions marked with __attribute__((constructor))).

ImageLoaderMachO::doModInitFunctions

  • Internally, it reads parts of the Mach-O file.
  • It runs the functions found in the __mod_init_func section, which matches the verification above.
  • The libSystem library must be initialized before these initializers run.
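A quick way to observe that these functions really run before main is to have them leave a trace that later code can check. A minimal sketch, assuming a GCC- or Clang-compatible compiler; gConstructorCalls and constructorCalls are hypothetical names added for this illustration:

```cpp
// Constant-initialized, so it is already zero before any constructor function runs.
static int gConstructorCalls = 0;

// Each function is registered as an initializer (on Apple platforms, as an
// entry in the __mod_init_func section) and is called before main() starts.
__attribute__((constructor)) static void func1() {
    ++gConstructorCalls;
}

__attribute__((constructor)) static void func2() {
    ++gConstructorCalls;
}

// By the time ordinary code runs, both constructor functions have been called.
int constructorCalls() {
    return gConstructorCalls;
}
```

Calling constructorCalls() from main should report both calls, without main ever invoking func1 or func2 itself.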