dyld is the dynamic linker. How does it load our main program and the dynamic libraries it uses, and when is the +load method called?
_dyld_start: the dyld entry point
First, we create a new iOS project, dyldDemo. We override the +(void)load method in ViewController.m, set a breakpoint in it, and use the LLDB bt command to view the call stack at that point:
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x000000010e58ce4c dyldDemo`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:23:1
    frame #1: 0x00007fff201804e3 libobjc.A.dylib`load_images + 1442
    frame #2: 0x000000010e5a0e54 dyld_sim`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010e5af887 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 437
    frame #4: 0x000000010e5adbb0 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
    frame #5: 0x000000010e5adc50 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x000000010e5a12a9 dyld_sim`dyld::initializeMainExecutable() + 199
    frame #7: 0x000000010e5a5d50 dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4431
    frame #8: 0x000000010e5a01c7 dyld_sim`start_sim + 122
    frame #9: 0x000000010f84857a dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*) + 2093
    frame #10: 0x000000010f845df3 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 1199
    frame #11: 0x000000010f84022b dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 457
    frame #12: 0x000000010f840025 dyld`_dyld_start + 37
We can see that dyld begins at _dyld_start, which calls dyldbootstrap::start. The relevant part of that function is:
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

rebaseDyld(dyldsMachHeader);   // 1

// the apple array starts just past the NULL that terminates envp
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;

// set up random value for stack canary
__guard_setup(apple);   // 2

#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(argc, argv, envp, apple);
#endif

_subsystem_init(apple);

// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
- 1. Rebase dyld. When the app launches, the system gives the process a random virtual memory offset (ASLR), so dyld must first correct its own load-sensitive addresses for that offset.
- 2. Set up the stack canary to protect the stack.
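To make the rebase step more concrete, here is a minimal C sketch of what applying a slide means. It is not dyld's rebaseDyld: the names compute_slide and apply_slide, and the flat array of pointer slots, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of dyld: the slide is the difference
 * between where the image was actually loaded and where it was linked
 * to be loaded. */
static uintptr_t compute_slide(uintptr_t actual_base, uintptr_t preferred_base)
{
    return actual_base - preferred_base;
}

/* Hypothetical helper, not part of dyld: add the slide to every
 * pointer-sized slot that was written assuming the preferred address. */
static void apply_slide(uintptr_t *slots, size_t count, uintptr_t slide)
{
    assert(slots != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        slots[i] += slide;   /* actual address = preferred address + slide */
    }
}
```

Real rebasing reads the list of locations to fix from the Mach-O file rather than from a plain array, but the arithmetic on each location is the same addition.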
Main program environment configuration
dyld::_main first configures the environment for the main program. It initializes mainExecutableCDHash (the code directory hash of the main executable), sMainExecutableMachHeader (the Mach-O header), and sMainExecutableSlide (the ASLR slide of the main program), which gives dyld the address of the main program's Mach-O header and its memory offset.
If we set the environment variable DYLD_PRINT_OPTS or DYLD_PRINT_ENV to 1, dyld prints its launch arguments and environment variables at startup. The output looks like this:
opt[0] = "/Users/bel/Library/Developer/CoreSimulator/Devices/B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/data/Containers/Bundle/Application/E364F774-F594-4E77-8622-DE78B7742C0D/dyldDemo.app/dyldDemo"
IOS_SIMULATOR_SYSLOG_SOCKET=/tmp/com.apple.CoreSimulator.SimDevice.B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/syslogsock
SIMULATOR_SHARED_RESOURCES_DIRECTORY=/Users/bel/Library/Developer/CoreSimulator/Devices/B428260A-1DE5-4090-A8E9-52E8EE6F2F2E/data
XPC_SIMULATOR_LAUNCHD_NAME=com.apple.CoreSimulator.SimDevice.B428260A-1DE5-4090-A8E9-52E8EE6F2F2E
DYLD_ROOT_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot esources/capabilities.plist
SIMULATOR_FRAMEBUFFER_FRAMEWORK=/Library/Developer/PrivateFrameworks/CoreSimulator.framework/Resources/Platforms/iphoneos/Library/PrivateFrameworks/SimFramebuffer.framework/SimFramebuffer
DYLD_LIBRARY_PATH=/Users/bel/Library/Developer/Xcode/DerivedData/dyldDemo-aekwxanrcjrzwagsdjsahkghypfj/Build/Products/Debug-iphonesimulator:/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/system/introspection
....
DYLD_PRINT_OPTS=1
TESTMANAGERD_SIM_SOCK=/private/tmp/com.apple.launchd.j1MbmWtF0N/com.apple.testmanagerd.unix-domain.socket
DYLD_PRINT_ENV=1
XPC_FLAGS=1
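As a rough illustration of how a flag such as DYLD_PRINT_OPTS can be picked out of the environment, the sketch below scans a NULL-terminated envp array for a NAME=value entry. The helper env_lookup is hypothetical and is not how dyld itself parses its environment; it only shows the general idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the value of `name` in a NULL-terminated
 * envp array of "NAME=value" strings, or NULL if it is not present. */
static const char *env_lookup(const char *const *envp, const char *name)
{
    assert(envp != NULL && name != NULL);
    size_t len = strlen(name);
    for (; *envp != NULL; ++envp) {
        /* the name must match in full and be followed by '=' */
        if (strncmp(*envp, name, len) == 0 && (*envp)[len] == '=') {
            return *envp + len + 1;
        }
    }
    return NULL;
}
```

Requiring the '=' right after the name keeps a prefix such as DYLD_PRINT from accidentally matching DYLD_PRINT_OPTS.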
Loading the shared cache
After reading the main program's header and configuring the environment variables, dyld loads the shared cache. On iOS the shared cache is mandatory: system libraries such as UIKit and Foundation are stored in it.
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
    ...
    loadDyldCache(opts, &sSharedCacheLoadInfo);
    ...
}

bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    ......
    if ( options.forcePrivate ) {
        // mmap cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: when cache is already mapped into shared region
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) {   // 1
            hasError = (results->errorMessage != nullptr);
        }
        else {
            // slow path: this is first process to load cache
            hasError = mapCacheSystemWide(options, results);   // 2
        }
        return hasError;
    }
#endif
}
....
- 1. If the shared cache is already mapped into the shared region, dyld reuses it and does no further work.
- 2. If this is the first process to load the cache, dyld maps it system-wide so that later processes can reuse it.
From this we can see that the shared cache is loaded first, and the dependent dynamic libraries are loaded afterwards.
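The fast path and slow path described above follow a common "reuse if present, otherwise create once" pattern. The C sketch below models only that control flow; SharedRegion, CacheResult, and load_cache are hypothetical names, and no real memory mapping takes place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the system-wide shared region. */
typedef struct {
    bool mapped;      /* has some process already mapped the cache? */
    int  map_count;   /* how many times the expensive mapping ran   */
} SharedRegion;

typedef enum { CACHE_REUSED, CACHE_MAPPED } CacheResult;

/* Fast path: reuse an existing mapping. Slow path: map it once. */
static CacheResult load_cache(SharedRegion *region)
{
    assert(region != NULL);
    if (region->mapped) {
        return CACHE_REUSED;          /* fast path */
    }
    region->mapped = true;            /* slow path: first loader maps it */
    region->map_count += 1;
    return CACHE_MAPPED;
}
```

Calling load_cache twice on the same region maps once and reuses once, which is why only the first process at boot pays the mapping cost.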
Instantiate the main program
Continuing through dyld::_main, the next step instantiates the main executable:
// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
// try mach-o loader
// if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
addImage(image);
return (ImageLoaderMachO*)image;
// }
// throw "main executable not a known format";
}
instantiateFromLoadedImage takes the header address of the main program, its slide (the ASLR value), and the path of the Mach-O file, and produces an ImageLoader object for that image. The first image loaded here is our main program.
After all the libraries are loaded, their code signatures are validated.
Link the main program to the dynamic library
Once all the libraries are loaded, it’s time to link the library files
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
The work is done by the ImageLoader::link method:
void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
    //dyld::log("ImageLoader::link(%s) refCount=%d, neverUnload=%d\n", imagePath, fDlopenReferenceCount, fNeverUnload);

    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);

    // record the start time
    uint64_t t0 = mach_absolute_time();
    // recursively load the libraries the main program depends on, then send a notification
    this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
    context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);

    // we only do the loading step for preflights
    if ( preflightOnly )
        return;

    uint64_t t1 = mach_absolute_time();
    context.clearAllDepths();
    this->updateDepth(context.imageCount());

    __block uint64_t t2, t3, t4, t5;
    {
        dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
        t2 = mach_absolute_time();
        // rebase: correct addresses for ASLR
        this->recursiveRebaseWithAccounting(context);
        context.notifyBatch(dyld_image_state_rebased, false);

        t3 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind non-lazy symbols
            this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

        t4 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind weak symbols
            this->weakBind(context);
        t5 = mach_absolute_time();
    }

    // interpose any dynamically loaded images
    if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
        // recursively apply interposing
        this->recursiveApplyInterposing(context);
    }

    // now that all fixups are done, make __DATA_CONST segments read-only
    if ( !context.linkingMainExecutable )
        this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
    uint64_t t6 = mach_absolute_time();

    if ( context.registerDOFs != NULL ) {
        std::vector<DOFInfo> dofs;
        this->recursiveGetDOFSections(context, dofs);
        // register the DOF sections
        context.registerDOFs(dofs);
    }
    uint64_t t7 = mach_absolute_time();

    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);

    fgTotalLoadLibrariesTime += t1 - t0;
    fgTotalRebaseTime += t3 - t2;
    fgTotalBindTime += t4 - t3;
    fgTotalWeakBindTime += t5 - t4;
    fgTotalDOF += t7 - t6;

    // done with initial dylib loads
    fgNextPIEDylibAddress = 0;
}
If the environment variable DYLD_PRINT_STATISTICS is set for the project, dyld prints how long the launch phases took, including the time spent linking the dynamic libraries. That time is measured inside ImageLoader::link, which performs these steps:
- 1. Record the start time, then recursively load the libraries the image depends on.
- 2. Rebase: correct internal addresses for the ASLR slide.
- 3. Bind symbols: first the non-lazy symbols, then the weak symbols. Lazy symbols are bound on first use, not at launch.
- 4. Register the DOF (DTrace) sections.
After these steps, the main program and its dynamic libraries have been loaded, linked, and registered, and it is time to initialize the main program.
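Step 3 says that lazy symbols are bound on first use. The following C sketch illustrates that idea with a function pointer that initially targets a stub; the first call binds the real function and later calls go straight to it. All names (lazy_stub, add_ptr, real_add, call_add) are hypothetical, and real lazy binding goes through dyld's stub helper rather than code like this.

```c
#include <assert.h>
#include <stddef.h>

static int resolve_count = 0;           /* how many times binding ran */

static int real_add(int a, int b) { return a + b; }

static int lazy_stub(int a, int b);

/* The "lazy pointer": starts out pointing at the stub. */
static int (*add_ptr)(int, int) = lazy_stub;

/* The first call lands here, binds the real target, then forwards the call. */
static int lazy_stub(int a, int b)
{
    resolve_count += 1;
    add_ptr = real_add;                 /* bind on first use */
    return add_ptr(a, b);
}

/* Callers always go through the pointer. */
static int call_add(int a, int b)
{
    assert(add_ptr != NULL);
    return add_ptr(a, b);
}
```

The design trades a small cost on the first call for a faster launch, because symbols that are never used are never bound.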
Initializing the main program
dyld load callback
The main program is initialized in the initializeMainExecutable method:
void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initializers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }

    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);

    // register handler that runs static terminators when the process exits
    if ( gLibSystemHelpers != NULL )
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
At the beginning of the article, we printed the call stack leading up to +load:
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x000000010e58ce4c dyldDemo`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:23:1
    frame #1: 0x00007fff201804e3 libobjc.A.dylib`load_images + 1442
    frame #2: 0x000000010e5a0e54 dyld_sim`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x000000010e5af887 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 437
    frame #4: 0x000000010e5adbb0 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
    frame #5: 0x000000010e5adc50 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x000000010e5a12a9 dyld_sim`dyld::initializeMainExecutable() + 199
Starting from initializeMainExecutable(), the call chain reaches dyld::notifySingle. notifySingle invokes the callback (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());, passing the path of the image and its Mach-O header. Searching the dyld source for where this callback is set shows that it is assigned when _dyld_objc_notify_register is called, but nothing inside dyld itself calls _dyld_objc_notify_register. Going back to the call stack, the frame above notifySingle is load_images, which lives in libobjc. In the objc4 source we can see where the registration happens:
void _objc_init(void)
{
static bool initialized = false;
if (initialized) return;
initialized = true;
// fixme defer initialization until an objc-using image is found?
environ_init();
tls_init();
static_init();
runtime_init();
exception_init();
#if __OBJC2__
cache_t::init();
#endif
_imp_implementationWithBlock_init();
_dyld_objc_notify_register(&map_images, load_images, unmap_image);
#if __OBJC2__
didCallDyldNotifyRegister = true;
#endif
}
_objc_init calls _dyld_objc_notify_register and passes load_images as the init callback, which is what dyld stores in sNotifyObjCInit. In other words, _objc_init is where libobjc registers the callbacks that dyld invokes as images are mapped and initialized.
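The registration described here is an ordinary callback handshake: one side stores a function pointer, the other side calls it later. The C sketch below models that shape only. The names register_init_callback, notify_single, and on_image_init are hypothetical stand-ins for _dyld_objc_notify_register, dyld::notifySingle, and load_images, and the real functions take more arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*init_callback)(const char *path);

static init_callback s_notify_init = NULL;   /* plays the role of sNotifyObjCInit */
static char last_path[64];                   /* records what the callback saw */

/* Runtime side: called once, early, to hand its callback to the loader. */
static void register_init_callback(init_callback cb)
{
    s_notify_init = cb;
}

/* Loader side: tell the runtime an image is about to be initialized. */
static void notify_single(const char *path)
{
    if (s_notify_init != NULL) {
        s_notify_init(path);
    }
}

/* Example callback standing in for load_images. */
static void on_image_init(const char *path)
{
    assert(path != NULL);
    strncpy(last_path, path, sizeof(last_path) - 1);
    last_path[sizeof(last_path) - 1] = '\0';
}
```

Until the callback is registered, notify_single does nothing, which mirrors why the loader cannot run +load for an image before the Objective-C runtime has initialized.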
Call to the load method
Let’s look at the implementation of the load_images method
void load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();   // 1
}
- 1. Call the +load method of each class in turn, followed by the +load methods of categories.
C++ constructor call
Let's go back to the dyld source and explore further:
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize, InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info);

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }

            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

            bool hasInitializers = this->doInitialization(context);

            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);

            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }

    recursiveSpinUnLock();
}
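The essential shape of recursiveInitialization is a depth-first walk that initializes dependencies before the image that needs them, and marks each image as in progress first so that dependency cycles terminate. The C sketch below shows just that shape. The Image struct, its state values, and init_recursive are hypothetical simplifications; locking, upward links, and depth checks are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

typedef struct Image {
    const char   *name;
    struct Image *deps[MAX_DEPS];
    size_t        dep_count;
    int           state;        /* 0 = untouched, 1 = in progress, 2 = initialized */
} Image;

/* Append each image to `order` after all of its dependencies. */
static void init_recursive(Image *img, const char **order, size_t *n)
{
    assert(img != NULL && order != NULL && n != NULL);
    if (img->state != 0) {
        return;                 /* already done, or part of a cycle */
    }
    img->state = 1;             /* mark first to break cycles */
    for (size_t i = 0; i < img->dep_count; ++i) {
        init_recursive(img->deps[i], order, n);
    }
    order[(*n)++] = img->name;  /* "run initializers" for this image */
    img->state = 2;
}
```

With an app that depends on Foundation and libSystem, where Foundation also depends on libSystem, the resulting order is libSystem, Foundation, then the app, so lower-level libraries are always ready first.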
After the notifySingle call, recursiveInitialization calls doInitialization. Let's look at its implementation:
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
CRSetCrashLogMessage2(this->getPath());
// mach-o has -init and static initializers
doImageInit(context);
doModInitFunctions(context); //1
CRSetCrashLogMessage2(NULL);
return (fHasDashInit || fHasInitializers);
}
- 1. Run the global C++ constructors.
In main.m, we define two global constructor functions using __attribute__((constructor)):
__attribute__((constructor)) void func1() {
    printf("fun1 comes!");
}
__attribute__((constructor)) void func2() {
    printf("fun2 comes!");
}
After compilation, a __mod_init_func section is added to the Mach-O file.
The doModInitFunctions(context) method walks that section and calls our global constructor functions.
So the overall order is: +(void)load -> C++ constructors -> main
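The constructor-before-main part of that order can be checked with a few lines of plain C, assuming a GCC or Clang toolchain that supports __attribute__((constructor)). The helper names below (record, early_setup, calls_before_main) are hypothetical, and +load is not modeled because it requires the Objective-C runtime.

```c
#include <assert.h>
#include <stddef.h>

static const char *call_log[4];
static size_t      call_count = 0;

/* Remember who ran, in order. */
static void record(const char *who)
{
    assert(call_count < 4);
    call_log[call_count++] = who;
}

/* Runs before main, because the loader calls it while initializing
 * the image. */
__attribute__((constructor))
static void early_setup(void)
{
    record("constructor");
}

/* Hypothetical helper a main function could call to see what already ran. */
static size_t calls_before_main(void)
{
    return call_count;
}
```

If main calls calls_before_main() as its first statement, the constructor has already been recorded, which is the same ordering the breakpoint experiment in this article shows.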
Finding the main entry point
After initializeMainExecutable completes, dyld looks up the address of the main function in the Mach-O file:
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
The address of main is assigned to result and returned from dyld::_main.
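The LC_MAIN load command stores the entry point as an offset, so the lookup boils down to adding that offset to the address where the Mach-O header sits in memory. The sketch below assumes a simplified stand-in struct, EntryPointCommand, that keeps only the entryoff field of the real entry_point_command from <mach-o/loader.h>; entry_from_lc_main is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for the LC_MAIN load command; only the field used
 * here is kept. */
typedef struct {
    uint64_t entryoff;    /* offset of main from the start of the image */
} EntryPointCommand;

/* Entry address = where the header actually sits in memory + entryoff.
 * The header address already includes the ASLR slide. */
static uintptr_t entry_from_lc_main(uintptr_t header_addr,
                                    const EntryPointCommand *cmd)
{
    assert(cmd != NULL);
    return header_addr + (uintptr_t)cmd->entryoff;
}
```

Because the header address is taken after loading, no separate slide correction is needed for the result.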
Conclusion:
After this exploration, we know that dyld's loading process is as follows:
- 1. Start at _dyld_start and enter dyldbootstrap::start.
- 2. Enter the dyld::_main function.
- 3. Configure environment variables and rebase according to the ASLR slide.
- 4. Load the shared cache, which contains the system dynamic libraries.
- 5. Instantiate the main program, load the dynamic libraries, link them, and bind symbols.
- 6. Initialize the main program: once the dynamic libraries are loaded and linked, the Objective-C +load methods are called.
- 7. Call doModInitFunctions, which runs the functions in __mod_init_func, that is, the global C++ constructors.
- 8. Read the address of main from the Mach-O file and return that address.