This is day 22 of my participation in the August Writing Challenge.
Following on from the previous article, let's continue our analysis of how dyld loads a program.
3.3 mapSharedCache: loading the shared cache
The shared cache holds the system dynamic libraries, such as UIKit and Foundation. mapSharedCache calls loadDyldCache:
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
    ...
    // load the dyld shared cache
    loadDyldCache(opts, &sSharedCacheLoadInfo);
    ...
}
3.3.1 loadDyldCache
bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress  = 0;
    results->slide        = 0;
    results->errorMessage = nullptr;
#if TARGET_OS_SIMULATOR
    // simulator only supports mmap()ing cache privately into process
    return mapCachePrivate(options, results);
#else
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        }
        else {
            // slow path: this is the first process to load cache
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}
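loadDyldCache has three paths:
1. Private mapping: mapCachePrivate maps the cache into the current process only. The cache is not placed in the shared region and is for this process's own use. This path is taken on the simulator or when options.forcePrivate is set.
2. Already mapped: if the cache is already mapped into the shared region, reuseExistingCache succeeds and nothing more needs to be done.
3. First load: the current process is the first to load the cache, so it calls mapCacheSystemWide.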
The shared cache of the dynamic library is loaded first during the entire application startup process.
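The three paths can be summarized with a small standalone model. This is only a sketch: CacheOptions, CachePath, and chooseCachePath are hypothetical names invented for illustration, and the alreadyMapped flag stands in for reuseExistingCache succeeding.

```cpp
#include <cassert>

// Hypothetical stand-in for dyld's SharedCacheOptions; not the real type.
struct CacheOptions {
    bool simulator = false;     // models TARGET_OS_SIMULATOR
    bool forcePrivate = false;  // models options.forcePrivate
};

// The three outcomes described in the list above.
enum class CachePath { Private, ReuseExisting, SystemWide };

// Picks the path that the loadDyldCache excerpt would take.
// alreadyMapped models reuseExistingCache() returning true.
CachePath chooseCachePath(const CacheOptions& opts, bool alreadyMapped) {
    if (opts.simulator || opts.forcePrivate)
        return CachePath::Private;        // map into this process only
    if (alreadyMapped)
        return CachePath::ReuseExisting;  // fast path: nothing more to do
    return CachePath::SystemWide;         // slow path: first process to load the cache
}
```

The private path is checked first, so a forced private mapping wins even when a shared mapping already exists.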
3.4 instantiateFromLoadedImage: instantiating the main program (creating the image)
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // instantiate the image
    ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
    // add the image to all images
    addImage(image);
    return (ImageLoaderMachO*)image;
    // throw "main executable not a known format";
}
- The main program's header, ASLR slide, and path are passed in to instantiate the main program and produce an image.
- The image is added to all images.
The actual instantiation is done by ImageLoaderMachO::instantiateMainExecutable:
// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    // get the load commands
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    // (the value of compressed selects the subclass used to instantiate the image)
    if ( compressed )
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}
- sniffLoadCommands is called to gather information about the image, such as compressed.
- The value of compressed decides which subclass is used to load the image. ImageLoader is an abstract class, so the main program is instantiated through the subclass that matches this value.
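The pattern here is a factory that returns a concrete subclass of an abstract base. A minimal sketch of that idea follows; Loader, CompressedLoader, ClassicLoader, and makeLoader are hypothetical names, and the supportClassic flag stands in for the SUPPORT_CLASSIC_MACHO compile-time switch.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical miniature of the ImageLoader hierarchy; names are illustrative only.
struct Loader {
    virtual ~Loader() = default;
    virtual std::string kind() const = 0;  // pure virtual: Loader is abstract
};

struct CompressedLoader : Loader {
    std::string kind() const override { return "compressed"; }
};

struct ClassicLoader : Loader {
    std::string kind() const override { return "classic"; }
};

// Chooses the concrete subclass from the compressed flag, as the excerpt does.
// supportClassic models the SUPPORT_CLASSIC_MACHO build setting.
std::unique_ptr<Loader> makeLoader(bool compressed, bool supportClassic) {
    if (compressed)
        return std::unique_ptr<Loader>(new CompressedLoader());
    if (supportClassic)
        return std::unique_ptr<Loader>(new ClassicLoader());
    throw std::runtime_error("missing LC_DYLD_INFO load command");
}
```

Callers only ever see the base type, which is why the rest of the loading code can work with ImageLoader pointers regardless of the image format.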
sniffLoadCommands
void ImageLoaderMachO::sniffLoadCommands(const macho_header* mh, const char* path, bool inCache, bool* compressed,
                                         unsigned int* segCount, unsigned int* libCount, const LinkContext& context,
                                         const linkedit_data_command** codeSigCmd,
                                         const encryption_info_command** encryptCmd)
{
    // determined from LC_DYLD_INFO and LC_DYLD_INFO_ONLY
    *compressed = false;
    // number of segments
    *segCount = 0;
    // number of libraries
    *libCount = 0;
    *codeSigCmd = NULL;
    *encryptCmd = NULL;
    ...
    if ( *segCount > 255 )
        dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);
    if ( *libCount > 4095 )
        dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);
    if ( needsAddedLibSystemDepency(*libCount, mh) )
        *libCount = 1;
    // dylibs that use LC_DYLD_CHAINED_FIXUPS have that load command removed when put in the dyld cache
    if ( !*compressed && (mh->flags & MH_DYLIB_IN_CACHE) )
        *compressed = true;
}
- compressed is determined from the LC_DYLD_INFO and LC_DYLD_INFO_ONLY load commands.
- segCount can be at most 255; a larger value throws.
- libCount can be at most 4095; a larger value throws.
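The two limits are easy to get off by one, so here is a minimal sketch of the same boundary checks. checkCounts and countsAreValid are hypothetical helpers, and std::runtime_error is used in place of dyld::throwf.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper mirroring the two range checks in the sniffLoadCommands excerpt.
void checkCounts(unsigned segCount, unsigned libCount, const std::string& path) {
    if (segCount > 255)
        throw std::runtime_error("malformed mach-o image: more than 255 segments in " + path);
    if (libCount > 4095)
        throw std::runtime_error("malformed mach-o image: more than 4095 dependent libraries in " + path);
}

// Convenience wrapper that reports validity instead of throwing.
bool countsAreValid(unsigned segCount, unsigned libCount) {
    try {
        checkCounts(segCount, libCount, "example");
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

With these checks, 255 segments and 4095 libraries are accepted, while 256 and 4096 are rejected.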
3.5 loadInsertedDylib: loading inserted dynamic libraries
static void loadInsertedDylib(const char* path)
{
    unsigned cacheIndex;
    try {
        ...
        // call load, which does the actual loading of the dynamic library
        load(path, context, cacheIndex);
    }
    ...
}
- After setting up the load context, it calls load to load the dynamic library.
3.6 ImageLoader::link: linking the main program and dynamic libraries
void link(ImageLoader* image, bool forceLazysBound, bool neverUnload, const ImageLoader::RPathChain& loaderRPaths, unsigned cacheIndex)
{
    // add to list of known images. This did not happen at creation time for bundles
    if ( image->isBundle() && !image->isLinked() )
        addImage(image);
    // we detect root images as those not linked in yet
    if ( !image->isLinked() )
        addRootImage(image);
    // process images
    try {
        const char* path = image->getPath();
#if SUPPORT_ACCELERATE_TABLES
        if ( image == sAllCacheImagesProxy )
            path = sAllCacheImagesProxy->getIndexedPath(cacheIndex);
#endif
        // link
        image->link(gLinkContext, forceLazysBound, false, neverUnload, loaderRPaths, path);
    }
    // (catch block omitted in this excerpt)
}
- link ultimately calls ImageLoader::link.
ImageLoader::link
void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);
    // start time, used to record how long each phase takes
    uint64_t t0 = mach_absolute_time();
    // recursively load the libraries that the main program depends on
    this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
    context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);
    ...
    uint64_t t1 = mach_absolute_time();
    context.clearAllDepths();
    this->updateDepth(context.imageCount());
    __block uint64_t t2, t3, t4, t5;
    {
        dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
        t2 = mach_absolute_time();
        // rebase: correct addresses for ASLR
        this->recursiveRebaseWithAccounting(context);
        context.notifyBatch(dyld_image_state_rebased, false);
        t3 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind non-lazy symbols
            this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);
        t4 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind weak symbols
            this->weakBind(context);
        t5 = mach_absolute_time();
    }
    // interpose any dynamically loaded images
    if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
        // recursively apply interposing for inserted dynamic libraries
        this->recursiveApplyInterposing(context);
    }
    // now that all fixups are done, make __DATA_CONST segments read-only
    if ( !context.linkingMainExecutable )
        this->recursiveMakeDataReadOnly(context);
    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
    uint64_t t6 = mach_absolute_time();
    if ( context.registerDOFs != NULL ) {
        std::vector<DOFInfo> dofs;
        this->recursiveGetDOFSections(context, dofs);
        // register
        context.registerDOFs(dofs);
    }
    uint64_t t7 = mach_absolute_time();
    // clear error strings
    // the durations recorded below can be printed by setting an environment variable
    (*context.setErrorStrings)(0, NULL, NULL, NULL);
    fgTotalLoadLibrariesTime += t1 - t0;
    fgTotalRebaseTime += t3 - t2;
    fgTotalBindTime += t4 - t3;
    fgTotalWeakBindTime += t5 - t4;
    fgTotalDOF += t7 - t6;
    // done with initial dylib loads
    fgNextPIEDylibAddress = 0;
}
- Rebase to correct addresses for ASLR.
- Bind the non-lazy symbols.
- Bind the weak symbols.
- Register the DOF sections (registerDOFs).
- Record the time spent in each phase. With the right environment variable set, dyld prints how long loading the application took.
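One detail in the excerpt is worth modeling: the bind and weak-bind calls are guarded by !context.linkingMainExecutable, so this function skips them when that flag is set. The sketch below captures only that ordering. linkPhases is a hypothetical function, and the strings are labels rather than calls into dyld.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the phase order in the ImageLoader::link excerpt.
// Returns the names of the phases that would run, in order.
std::vector<std::string> linkPhases(bool linkingMainExecutable) {
    std::vector<std::string> phases;
    phases.push_back("recursiveLoadLibraries");  // always runs
    phases.push_back("recursiveRebase");         // always runs
    if (!linkingMainExecutable) {
        phases.push_back("recursiveBind");       // guarded in the excerpt
        phases.push_back("weakBind");            // guarded in the excerpt
    }
    return phases;
}
```

Calling linkPhases(true) yields only the load and rebase phases, while linkPhases(false) yields all four.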
3.7 initializeMainExecutable: initializing the main program
void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;
    // run initializers for any inserted dylibs
    // (the timing list is sized for all image files)
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        // start at index 1 and go to the end
        for (size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL )
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
- The images are initialized starting at index 1. After that, the main program (index 0) is initialized by calling runInitializers.
- The environment variables DYLD_PRINT_STATISTICS and DYLD_PRINT_STATISTICS_DETAILS can be set to print the related timing information.
ImageLoader::runInitializers (ImageLoader.cpp)
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    // process the initializers
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}
- up.count is set to 1, and then processInitializers is called.
ImageLoader::processInitializers
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                      InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    for (uintptr_t i=0; i < images.count; ++i) {
        // recursively initialize each image
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}
- This ends up calling recursiveInitialization.
ImageLoader::recursiveInitialization (ImageLoader.cpp)
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    ...
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    ...
                    dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                }
            }
            ...
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            // With state dyld_image_state_dependents_initialized, notifySingle ends up
            // calling this image's own +load methods. The first image to get here is libobjc.A.dylib.
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            // Initialize this image and call its C++ constructors.
            // 1. For libSystem, libSystem_initializer runs here and calls _objc_init, which registers the callbacks.
            // 2. _dyld_objc_notify_register calls map_images and load_images (+load). At that point load_images
            //    is called for system libraries such as libsystem_featureflags.dylib, libsystem_trace.dylib, libxpc.dylib.
            bool hasInitializers = this->doInitialization(context);
            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            // +load is not called here: notifySingle only calls it when
            // state == dyld_image_state_dependents_initialized.
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            ...
        }
        ...
    }
    recursiveSpinUnLock();
}
- The whole process is recursive: the libraries an image depends on are initialized first, then the image itself.
- notifySingle is called, which ultimately calls all of objc's +load methods. The first notifySingle call is the one that triggers +load. The second one does not, because its argument is dyld_image_state_initialized. Here dyld_image_state_dependents_initialized means that the image's dependencies have been initialized and the image itself is ready to be initialized.
- doInitialization is called, which ultimately runs the C++ constructors. The first thing called is libSystem_initializer -> _objc_init, which registers the callbacks. The callbacks are map_images and load_images (+load). During registration, load_images is also called for system libraries that are already initialized, such as libsystem_featureflags.dylib, libsystem_trace.dylib, libxpc.dylib.
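The ordering described above (dependencies first, then the image's +load, then its constructors) can be reproduced with a toy model. This is a sketch under simplifying assumptions: Image, recursiveInit, and demoInitOrder are hypothetical, the state enum only imitates the cycle-breaking idea, and upward dependencies are not modeled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature image; not dyld's ImageLoader.
struct Image {
    enum State { Bound, DependentsInitializing, DependentsInitialized, Initialized };
    std::string name;
    std::vector<Image*> deps;
    State state;
    explicit Image(const std::string& n) : name(n), state(Bound) {}
};

// Initializes dependencies first, then the image itself.
void recursiveInit(Image& img, std::vector<std::string>& log) {
    if (img.state != Image::Bound)
        return;                                  // already visited: this breaks cycles
    img.state = Image::DependentsInitializing;
    for (Image* dep : img.deps)
        recursiveInit(*dep, log);                // lower-level libraries first
    img.state = Image::DependentsInitialized;
    log.push_back("+load " + img.name);          // models the first notifySingle
    log.push_back("ctors " + img.name);          // models doInitialization
    img.state = Image::Initialized;
}

// Builds App -> libA -> libB with a deliberate libB -> libA cycle.
std::vector<std::string> demoInitOrder() {
    Image app("App"), libA("libA"), libB("libB");
    app.deps.push_back(&libA);
    libA.deps.push_back(&libB);
    libB.deps.push_back(&libA);
    std::vector<std::string> log;
    recursiveInit(app, log);
    return log;
}
```

Marking the image as visited before walking its dependencies is what keeps the libA/libB cycle from recursing forever, which is the role the "break cycles" assignment plays in the excerpt.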
C++ system constructor
__attribute__((constructor)) void func() {
printf("\n ---func--- \n");
}
Within the same image, the +load methods are called before the C++ constructors.
dyld::notifySingle (dyld2.cpp)
context.notifySingle points to the function that is assigned in setContext:
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    ...
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        // The callback pointer sNotifyObjCInit is assigned in registerObjCNotifiers.
        // Execution goes to objc's load_images here.
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    ...
}
- load_images does not appear by name inside notifySingle, although the stack trace shows load_images being called right after notifySingle.
- The function invokes the sNotifyObjCInit callback, on the condition that state is dyld_image_state_dependents_initialized.
Searching for where the sNotifyObjCInit callback is assigned shows that it is set in registerObjCNotifiers:
// Who calls registerObjCNotifiers? _dyld_objc_notify_register.
// The three parameters have the types _dyld_objc_notify_mapped, _dyld_objc_notify_init, and _dyld_objc_notify_unmapped.
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    // first parameter: map_images
    sNotifyObjCMapped = mapped;
    // second parameter: load_images
    sNotifyObjCInit = init;
    // third parameter: unmap_image
    sNotifyObjCUnmapped = unmapped;
    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }
    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            // call load_images for the system libraries that are already initialized
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}
- In registerObjCNotifiers, sNotifyObjCInit is assigned from the second parameter, whose type is _dyld_objc_notify_init.
- After the assignment, notifyBatchPartial is called (it calls sNotifyObjCMapped internally).
- load_images is then called in a loop for the dependent system libraries that are already initialized, such as libsystem_featureflags.dylib, libsystem_trace.dylib, libxpc.dylib.
A search shows that registerObjCNotifiers is called by _dyld_objc_notify_register.
_dyld_objc_notify_register (dyldAPIs.cpp)
// Called from _objc_init (objc-os.mm). Setting a symbolic breakpoint on
// _dyld_objc_notify_register shows that the caller is _objc_init.
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}
- The caller of _dyld_objc_notify_register cannot be found in the dyld source.
- A symbolic breakpoint on _dyld_objc_notify_register shows that it is called by _objc_init.
_objc_init is defined in objc-os.mm:
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();
    // _objc_init calls _dyld_objc_notify_register (dyldAPIs.cpp); the second parameter is load_images
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}
- This confirms that _objc_init calls _dyld_objc_notify_register.
- The first parameter, map_images, is assigned to sNotifyObjCMapped.
- The second parameter, load_images, is assigned to sNotifyObjCInit.
- The third parameter, unmap_image, is assigned to sNotifyObjCUnmapped.
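The registration flow traced above can be condensed into a small model: a callback pointer that starts as null, a register function that stores it and replays it for images that were initialized earlier, and a notify function that only fires in one state. Everything in this sketch is hypothetical (registerNotifier, notifySingleModel, recordLoadImages, the globals), and the real callbacks also receive a mach_header pointer, which is left out here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback type; the real one also takes a mach_header pointer.
typedef void (*InitCallback)(const char* path);

enum ImageState { DependentsInitialized, Initialized };

static InitCallback sInit = nullptr;                  // models sNotifyObjCInit
static std::vector<std::string> gAlreadyInitialized;  // images initialized before registration
static std::vector<std::string> gLoadImagesCalls;     // every callback invocation, in order

// Plays the role of objc's load_images.
void recordLoadImages(const char* path) {
    gLoadImagesCalls.push_back(path);
}

// Models registerObjCNotifiers: store the callback, then replay it
// for every image that was initialized before registration.
void registerNotifier(InitCallback init) {
    sInit = init;
    for (const std::string& path : gAlreadyInitialized)
        (*sInit)(path.c_str());
}

// Models notifySingle: the callback fires only in the
// dependents-initialized state and only once it is registered.
void notifySingleModel(ImageState state, const char* path) {
    if (state == DependentsInitialized && sInit != nullptr)
        (*sInit)(path);
}
```

Before registration a notification does nothing, which matches the sNotifyObjCInit != NULL check in the excerpt. The replay loop is the reason libraries initialized earlier still receive a load_images call.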
How these three parameters interact with dyld is described in more detail later.
ImageLoaderMachO::doInitialization (ImageLoaderMachO.cpp)
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());
    // mach-o has -init and static initializers
    doImageInit(context);
    // run the C++ constructors
    doModInitFunctions(context);
    CRSetCrashLogMessage2(NULL);
    return (fHasDashInit || fHasInitializers);
}
Add the following code to the project, then inspect the resulting Mach-O file:
__attribute__((constructor)) void func1() {
printf("\n ---func1--- \n");
}
__attribute__((constructor)) void func2() {
printf("\n ---func2--- \n");
}
You'll notice an additional __mod_init_func section in the Mach-O file.
- doModInitFunctions is called to run the C++ constructors (C functions marked with __attribute__((constructor))).
ImageLoaderMachO::doModInitFunctions
- Internally it reads parts of the Mach-O file.
- It looks for the __mod_init_func section, which is consistent with the check above.
- The libSystem library must be loaded first, before these initializers run.
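A quick way to observe that such functions run during image initialization, before main, is to have one increment a counter. This sketch assumes a GCC or Clang toolchain, since __attribute__((constructor)) is a compiler extension. countingConstructor and constructorCallsSoFar are hypothetical names.

```cpp
#include <cassert>

// Constant-initialized to 0 before any constructor function runs.
static int gConstructorCalls = 0;

// Runs during image initialization, before main (GCC/Clang extension).
__attribute__((constructor))
static void countingConstructor() {
    ++gConstructorCalls;
}

// Reports how many times the constructor function has run so far.
int constructorCallsSoFar() {
    return gConstructorCalls;
}
```

Calling constructorCallsSoFar() from main should already report one call, because the function marked with the attribute has run by then.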