Preface
In previous articles, we covered the data behind objects and classes: properties, methods, member variables, and so on. All of these are written in code, and code has to be loaded into memory before we can use it; otherwise it is just files on disk. Today we will explore how they are loaded into an application.
Preparation
- objc4-818.2 source
- dyld-852 source
- libdispatch source
- Libsystem source
I. How applications are loaded
Every application depends on some underlying libraries when it loads, such as UIKit, CoreFoundation, and AVFoundation. Libraries are binary files of executable code that the operating system can load into memory. They come in two kinds: static libraries and dynamic libraries.
The build process
Executable file
1. Executable files of the project
Create a macOS project:
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}
- This is the default template code that prints a message; it is left unmodified.
Next, build the executable file and drag it into Terminal:
- As shown above, dragging the executable file into Terminal runs it, and it prints Hello, World!.
2. A system library's executable file
Find the executable file of the system Foundation library:
- Use image list (an LLDB command) to get the path of the Foundation executable file; the file can then be found at that path on disk.
Static and dynamic linking
- With dynamic linking, a single copy of a dynamic library can be shared, which saves memory, so Apple's libraries are dynamic libraries.
The loading process
Libraries are loaded into memory through dyld (dynamic linker). The overall process can be represented by the following diagram:
II. Tracing our way to dyld
Let's create an iOS project and add a load method to ViewController.m:
@implementation ViewController
+ (void)load{
NSLog(@"%s",__func__);
}
@end
Set a breakpoint at main and run the program:
- The program stops in the main function. The stack shows that the start function is called before main, so let's add a symbolic breakpoint on start and keep debugging.
Add the start symbolic breakpoint and run the program again:
- The start symbolic breakpoint is not hit and the program still goes straight to main, which means the symbol this breakpoint matches is not the implementation of the start we saw in the stack. We do know that +[ViewController load] is called before main, so let's set a breakpoint in the load method instead.
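Code running before main is not unique to +load. Plain C++ has a close analogue: the constructor of a global object also runs before main is entered, because the image's initializers are run first. Here is a minimal sketch of that idea; LoadProbe, g_probe, and ranBeforeMain are hypothetical names made up for illustration, and this is not how +load itself is implemented.

```cpp
#include <cassert>

// Hypothetical flag; it is zero-initialized before any constructor runs.
static bool g_initializerRan = false;

// Hypothetical type whose constructor plays the role of an image initializer.
struct LoadProbe {
    LoadProbe() { g_initializerRan = true; }
};

// A global object: its constructor runs before main() is entered.
static LoadProbe g_probe;

// Reports whether the initializer has already run.
bool ranBeforeMain() { return g_initializerRan; }
```

Calling ranBeforeMain() as the very first statement of main already returns true, which is the same ordering we observe between load and main.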
With a breakpoint in ViewController's load method, run the program:
- Once the program stops in the load method, print the stack with bt. The stack contains the _dyld_start function, which leads us to dyld. Download the dyld-852 source code and start exploring dyld.
III. The dyld flow
Search globally for _dyld_start in the source code:
- The implementation of _dyld_start is in the dyldStartup.s file. It contains the instruction call dyldbootstrap::start. dyldbootstrap is a C++ namespace, and the start function lives in that namespace.
Find the start function in the dyldbootstrap namespace:
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
...
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
- As you can see, the start function returns the result of the _main function, so _main is what we analyze next.
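The pattern here is that start does not run the app itself: it returns an address, and its caller then jumps to that address. A rough model of the hand-off in plain C++ follows. EntryPoint, fakeAppMain, bootstrapMain, bootstrapStart, and launch are hypothetical stand-ins, not dyld symbols, and the real jump is done in assembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signature for a program entry point.
using EntryPoint = int (*)(int argc, const char* argv[]);

// Stand-in for the app's real main().
int fakeAppMain(int argc, const char* argv[]) {
    (void)argv;
    return argc;  // return something observable
}

// Stand-in for dyld::_main: does its setup, then reports where to jump next.
std::uintptr_t bootstrapMain() {
    // ... loading, linking, and initializing images would happen here ...
    return reinterpret_cast<std::uintptr_t>(&fakeAppMain);
}

// Stand-in for dyldbootstrap::start: simply forwards _main's result.
std::uintptr_t bootstrapStart() {
    return bootstrapMain();
}

// Stand-in for the caller of start: jump to the returned address.
int launch(int argc, const char* argv[]) {
    EntryPoint entry = reinterpret_cast<EntryPoint>(bootstrapStart());
    return entry(argc, argv);
}
```

Returning the address instead of calling it keeps the bootstrap code out of the app's call stack once the app is running, which is one plausible reason for this design.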
IV. dyld's _main function: the main flow
Jump into the _main function:
uintptr_t _main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
                int argc, const char* argv[], const char* envp[], const char* apple[],
                uintptr_t* startGlue)
{
    ...
    // get host info
    getHostInfo(mainExecutableMH, mainExecutableSlide);
    {
        __block bool platformFound = false;
        ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
            if (platformFound) {
                halt("MH_EXECUTE binaries may only specify one platform");
            }
            gProcessInfo->platform = (uint32_t)platform;
            platformFound = true;
        });
    }
    const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
    if ( (rootPath != NULL) ) {
        ...
    }
    else {
        ...
    }
    // load shared cache
    mapSharedCache(mainExecutableSlide);
    // instantiate ImageLoader for main executable
    sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
    // load any inserted libraries
    if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
        for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
            loadInsertedDylib(*lib);
    }
    // link main executable
    link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
    // Bind and notify for the inserted images now interposing has been registered
    if ( sInsertedDylibCount > 0 ) {
        for (unsigned int i=0; i < sInsertedDylibCount; ++i) {
            ImageLoader* image = sAllImages[i+1];
            image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
        }
    }
    // <rdar://problem/12186933> do weak binding only after all inserted images linked
    sMainExecutable->weakBind(gLinkContext);
    gLinkContext.linkingMainExecutable = false;
    // run all initializers
    initializeMainExecutable();
    // notify any monitoring processes that this process is about to enter main()
    notifyMonitoringDyldMain();
    result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
    return result;
}
V. The initializeMainExecutable flow: running the main program
Enter the initializeMainExecutable function:
void initializeMainExecutable()
{
    // run initializers for any dylibs
    // allImagesCount(): the total number of image files
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for (size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    // run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
}
- As you can see, initializing the image files and initializing the main program both go through runInitializers.
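The ordering in initializeMainExecutable is easy to miss: the loop starts at index 1, and the main executable is handled last. The sketch below models only that ordering. ToyImage and initializationOrder are hypothetical names, and it assumes, as the loop starting at i = 1 suggests, that slot 0 of the root list is the main executable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical image record: just a name for tracing.
struct ToyImage {
    std::string name;
};

// Returns the order in which initializers would run.
// Assumption: roots[0] is the main executable, roots[1..] are the other roots.
std::vector<std::string> initializationOrder(const std::vector<ToyImage>& roots) {
    std::vector<std::string> order;
    if (roots.empty()) return order;
    // First the roots after slot 0, mirroring the loop that starts at i = 1.
    for (std::size_t i = 1; i < roots.size(); ++i) {
        order.push_back(roots[i].name);
    }
    // Then the main executable (in real dyld, along with everything it brings up).
    order.push_back(roots[0].name);
    return order;
}
```

With roots App and libInserted.dylib (in that order), the function reports libInserted.dylib first and App second.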
runInitializers
Enter runInitializers:
- The key call in this function is processInitializers.
processInitializers
Enter processInitializers:
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
uint32_t maxImageCount = context.imageCount()+2;
ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
ImageLoader::UninitedUpwards& ups = upsBuffer[0];
ups.count = 0;
// Calling recursive init on all images in images list, building a new list of
// uninitialized upward dependencies.
for (uintptr_t i=0; i < images.count; ++i) {
images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
}
// If any upward dependencies remain, init them.
if ( ups.count > 0 )
processInitializers(context, thisThread, timingInfo, ups);
}
- The key call in this function is recursiveInitialization.
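The shape of processInitializers is "initialize this batch, collect whatever was deferred, then process the deferred batch the same way". A toy version of that loop is sketched below. UpImage, initializeOne, and processAll are hypothetical names, and the sketch keeps only the deferral of upward dependencies, none of dyld's locking, timing, or buffer handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical image with a list of "upward" images to handle in a later pass.
struct UpImage {
    std::string name;
    std::vector<UpImage*> upward;
    bool initialized = false;
};

// Initializes one image and defers its upward dependencies.
void initializeOne(UpImage& image, std::vector<UpImage*>& deferred,
                   std::vector<std::string>& order) {
    if (image.initialized) return;  // already done: contributes nothing new
    image.initialized = true;
    order.push_back(image.name);
    for (UpImage* up : image.upward) {
        if (up != nullptr) deferred.push_back(up);  // not yet, they are "above" us
    }
}

// Processes a batch, then recurses on whatever was deferred.
void processAll(const std::vector<UpImage*>& images, std::vector<std::string>& order) {
    std::vector<UpImage*> deferred;
    for (UpImage* image : images) {
        assert(image != nullptr);
        initializeOne(*image, deferred, order);
    }
    if (!deferred.empty()) processAll(deferred, order);
}
```

Because an already-initialized image defers nothing, the recursion ends even when two images list each other as upward dependencies.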
recursiveInitialization
Enter the recursiveInitialization function:
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for (unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }
            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);
            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            // initialize this image (calls the init functions)
            bool hasInitializers = this->doInitialization(context);
            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
        }
        catch (const char* msg) {
            ...
        }
    }
}
- context.notifySingle (first call): sends a single-image notification before the image is initialized.
- this->doInitialization: calls the image's init functions.
- context.notifySingle (second call): notifies that initialization is complete.
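Stripped of the details, recursiveInitialization is a depth-first walk: mark the image as visited to break cycles, initialize what it depends on, then initialize the image itself. The sketch below shows just that skeleton. ToyState, ToyNode, and recursiveInit are hypothetical, and the upward-dependency and fDepth checks are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified image states (real dyld has more).
enum class ToyState { Loaded, Initializing, Initialized };

struct ToyNode {
    std::string name;
    std::vector<ToyNode*> dependencies;  // libraries this image links against
    ToyState state = ToyState::Loaded;
};

// Depth-first: initialize dependencies first, then the image itself.
void recursiveInit(ToyNode& image, std::vector<std::string>& order) {
    if (image.state != ToyState::Loaded) return;  // already visited: breaks cycles
    image.state = ToyState::Initializing;
    for (ToyNode* dep : image.dependencies) {
        if (dep != nullptr) recursiveInit(*dep, order);
    }
    order.push_back(image.name);  // stands in for doInitialization()
    image.state = ToyState::Initialized;
}
```

For an App that depends on Foundation, which in turn depends on libSystem, the recorded order is libSystem, Foundation, App: the lowest-level library is always initialized first.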
notifySingle
Search globally for notifySingle and jump to its implementation:
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress = image->machHeader();
        info.imageFilePath = image->getRealPath();
        info.imageFileModDate = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, " image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    if ( state == dyld_image_state_mapped ) {
        // <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
        // <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
        if ( !image->inSharedCache() || (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion) ) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
}
- The key line is (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());.
sNotifyObjCInit
Search for sNotifyObjCInit to obtain the relevant code:
static _dyld_objc_notify_init sNotifyObjCInit;
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init;
sNotifyObjCUnmapped = unmapped;
}
sNotifyObjCInit is of type _dyld_objc_notify_init and is assigned in the registerObjCNotifiers function. So where is registerObjCNotifiers called?
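What we have here is ordinary callback registration: a file-scope function pointer that is empty until someone fills it in, and a call site that checks the pointer before using it. A minimal sketch of that pattern follows. NotifyInit, sToyNotifyInit, toyLoadImages, toyRegisterNotifier, and toyNotifySingle are hypothetical stand-ins, and the real callbacks take more arguments.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback type, loosely modeled on _dyld_objc_notify_init.
using NotifyInit = void (*)(const char* path);

// File-scope slot, empty until someone registers (like sNotifyObjCInit).
static NotifyInit sToyNotifyInit = nullptr;

// Records which images the callback was invoked for.
static std::vector<std::string> gNotified;

// Stand-in for the function objc hands over as the init callback.
void toyLoadImages(const char* path) { gNotified.push_back(path); }

// Stand-in for registerObjCNotifiers: just remember the function to call.
void toyRegisterNotifier(NotifyInit init) { sToyNotifyInit = init; }

// Stand-in for the relevant branch of notifySingle: call only if registered.
void toyNotifySingle(const char* path) {
    if (sToyNotifyInit != nullptr) {
        (*sToyNotifyInit)(path);
    }
}
```

Before toyRegisterNotifier is called, toyNotifySingle silently does nothing; afterwards every notification reaches toyLoadImages. The null check is what makes it safe for notifications to start before the callback exists.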
registerObjCNotifiers
Search registerObjCNotifiers globally:
- registerObjCNotifiers is called in the _dyld_objc_notify_register function, and _dyld_objc_notify_register is something we have seen before.
Look at the implementation of the _objc_init function in the objc4-818.2 source:
- _dyld_objc_notify_register is called here, so let's take _objc_init as the entry point and continue exploring.
VI. The image initialization flow
We will trace the flow backwards, starting from _objc_init.
Open the objc4-818.2 source, set a breakpoint at _objc_init, and run the program:
- As you can see, the _os_object_init function is called just before _objc_init. This function is in the libdispatch library.
_os_object_init
Download the libdispatch source code and search globally for the _os_object_init function:
- We find the implementation of _os_object_init, and the _objc_init() function is called inside it. The flow so far: _os_object_init -> _objc_init().
Let's see which function calls _os_object_init:
- It is called by libdispatch_init, which also belongs to the libdispatch library.
libdispatch_init
The implementation of libdispatch_init:
void libdispatch_init(void)
{
...
_dispatch_hw_config_init();
_dispatch_time_init();
_dispatch_vtable_init();
_os_object_init();
_voucher_init();
_dispatch_introspection_init();
}
- _os_object_init() is called in the implementation of libdispatch_init. The flow so far: libdispatch_init -> _os_object_init -> _objc_init().
Again, look at which function calls libdispatch_init:
- libdispatch_init is called by the libSystem_initializer function, which belongs to the libSystem library. Let's verify that.
libSystem_initializer
Download the Libsystem source code and search globally for libSystem_initializer:
- We find the implementation of libSystem_initializer, and the libdispatch_init function is called on line 239. The flow so far: libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init().
Look at the function that calls libSystem_initializer:
- The caller is the doModInitFunctions function, which takes us back to dyld.
doModInitFunctions
Find doModInitFunctions and enter the function:
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    if ( fHasInitializers ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                    const uint8_t type = sect->flags & SECTION_TYPE;
                    if ( type == S_MOD_INIT_FUNC_POINTERS ) {
                        Initializer* inits = (Initializer*)(sect->addr + fSlide);
                        const size_t count = sect->size / sizeof(uintptr_t);
                        for (size_t j=0; j < count; ++j) {
                            // the initializers include libSystem_initializer
                            Initializer func = inits[j];
                            if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                                // <rdar://problem/17973316> libSystem initializer must run first
                                const char* installPath = getInstallPath();
                                if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
                                    dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                            }
                            bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
                            {
                                dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                // call the initializer; this is where libSystem_initializer gets called
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
                            bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                            if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                                // now safe to use malloc() and other calls in libSystem.dylib
                                dyld::gProcessInfo->libSystemInitialized = true;
                            }
                        }
                    }
                    else if ( type == S_INIT_FUNC_OFFSETS ) {
                    }
                }
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
- From this function we can tell that libSystem has to be the first image file to be initialized; otherwise an error is thrown. UIKit, Foundation, and all the other libraries rely on the foundations it sets up (the runtime, threading, the environment, and so on), so libSystem comes first.
- In Initializer func = inits[j];, the first func obtained is libSystem_initializer, and it is called through func(context.argc, context.argv, context.envp, context.apple, &context.programVars);. Next, let's look at where doModInitFunctions is called.
- The flow so far: doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init().
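The heart of doModInitFunctions is a table of function pointers that is walked in order, with a guard that refuses to run anything until libSystem has been set up. The sketch below imitates that with a plain vector. ToyInitializer, runModInitFunctions, and the two toy initializers are hypothetical, and the guard compares a simple image name where the code above compares install paths.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical initializer signature (real ones also receive argc, argv, ...).
using ToyInitializer = void (*)();

// Becomes true once the libSystem stand-in has run (like libSystemInitialized).
static bool gToyLibSystemInitialized = false;
static std::vector<std::string> gInitTrace;

void toyLibSystemInitializer() {
    gToyLibSystemInitialized = true;
    gInitTrace.push_back("libSystem_initializer");
}

void toyOtherInitializer() { gInitTrace.push_back("other_initializer"); }

// Walks one image's initializer table in order. Any image other than
// libSystem is rejected until libSystem has been initialized.
void runModInitFunctions(const std::string& imageName,
                         const std::vector<ToyInitializer>& inits) {
    for (ToyInitializer func : inits) {
        if (!gToyLibSystemInitialized && imageName != "libSystem") {
            throw std::runtime_error("initializer in image (" + imageName +
                                     ") that does not link with libSystem");
        }
        func();
    }
}
```

Running an initializer for Foundation first throws; running libSystem's table first and Foundation's afterwards succeeds, which mirrors the "libSystem initializer must run first" rule in the source.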
Search for doModInitFunctions to see its calls:
OK! This brings us back to the doInitialization function mentioned above.
- The flow so far: doInitialization -> doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init().
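To keep the chain straight, it can help to write it down as code. In the sketch below each function is an empty stand-in that only records its name and calls the next link. The chain* names and gCallChain are hypothetical, and the real functions of course do far more than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace of which stand-in functions were reached, in order.
static std::vector<std::string> gCallChain;

// Each stand-in records itself, then calls the next link in the chain.
void chainObjcInit()             { gCallChain.push_back("_objc_init"); }
void chainOsObjectInit()         { gCallChain.push_back("_os_object_init");       chainObjcInit(); }
void chainLibdispatchInit()      { gCallChain.push_back("libdispatch_init");      chainOsObjectInit(); }
void chainLibSystemInitializer() { gCallChain.push_back("libSystem_initializer"); chainLibdispatchInit(); }
void chainDoModInitFunctions()   { gCallChain.push_back("doModInitFunctions");    chainLibSystemInitializer(); }
void chainDoInitialization()     { gCallChain.push_back("doInitialization");      chainDoModInitFunctions(); }
```

Calling chainDoInitialization() fills gCallChain with the six names in exactly the order of the flow above, crossing from dyld into libSystem, libdispatch, and finally objc.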
VII. Executing the functions that link dyld and objc
From the previous analysis, the flow starting from doInitialization is:
doInitialization -> doModInitFunctions -> libSystem_initializer -> libdispatch_init -> _os_object_init -> _objc_init() -> _dyld_objc_notify_register -> registerObjCNotifiers.
Let's review the calls to _dyld_objc_notify_register and registerObjCNotifiers, along with the key code that implements them:
void _objc_init(void)
{
    ...
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
    ...
}

void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
                                _dyld_objc_notify_init init,
                                _dyld_objc_notify_unmapped unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped = mapped;
    sNotifyObjCInit = init;
    sNotifyObjCUnmapped = unmapped;
    ...
}
- Comparing the two shows that map_images() = sNotifyObjCMapped() and load_images() = sNotifyObjCInit().
- Next, let's find out where map_images() and load_images() are called.
Search globally for sNotifyObjCMapped:
- The call to sNotifyObjCMapped is in the notifyBatchPartial function.
Search notifyBatchPartial globally:
- notifyBatchPartial is called in the registerObjCNotifiers function. In other words, right after sNotifyObjCMapped is assigned in that function, it is called directly.
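"Assign, then call right away" is a way of catching up: images that were loaded before the callback existed still need to be reported. A small sketch of that idea follows. MappedFn, sToyMapped, toyImageLoaded, toyMapImages, and toyRegisterMapped are hypothetical names, and the real callback also receives the Mach-O headers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical callback type, loosely modeled on _dyld_objc_notify_mapped.
using MappedFn = void (*)(unsigned count, const char* const paths[]);

static MappedFn sToyMapped = nullptr;          // like sNotifyObjCMapped
static std::vector<const char*> gLoadedPaths;  // images loaded so far
static unsigned gLastMappedCount = 0;          // what the callback last saw

// Stand-in for the mapped callback: only remembers how many images it got.
void toyMapImages(unsigned count, const char* const paths[]) {
    (void)paths;
    gLastMappedCount = count;
}

// Stand-in for "an image was loaded before anyone registered".
void toyImageLoaded(const char* path) { gLoadedPaths.push_back(path); }

// Stand-in for registerObjCNotifiers: store the callback, then immediately
// report every image that is already loaded.
void toyRegisterMapped(MappedFn mapped) {
    sToyMapped = mapped;
    if (!gLoadedPaths.empty()) {
        (*sToyMapped)(static_cast<unsigned>(gLoadedPaths.size()), gLoadedPaths.data());
    }
}
```

If two images are loaded before toyRegisterMapped is called, the callback is invoked once during registration with a count of 2, so nothing loaded earlier is missed.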
Where is sNotifyObjCInit called? Continue searching:
- The call to sNotifyObjCInit is in the notifySingle function.
notifySingle is called in recursiveInitialization, and so is doInitialization:
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        try {
            // single-image notification
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            // initialize this image (calls the init functions)
            bool hasInitializers = this->doInitialization(context);
            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
        }
        catch (const char* msg) {
            ...
        }
    }
}
- Looking at the code, context.notifySingle is called before this->doInitialization, yet sNotifyObjCInit is only registered inside doInitialization. How can that work?
- Because recursiveInitialization is a recursive function that runs for one image after another. The first time notifySingle is called, sNotifyObjCInit has not been assigned yet, so it is skipped; by the time the following images come through, sNotifyObjCInit has a value.
- To summarize: on the first pass, doInitialization is called, and inside it map_images and load_images are registered, with map_images called immediately after registration. From then on, when execution reaches the notifySingle function, load_images is called.
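This ordering puzzle is easier to see in a toy simulation: notify first, initialize second, and let the very first image's initializer be the one that registers the callback. In the sketch below SimImage, simRecursiveInitialization, simLoadImages, and sRegisteredInit are hypothetical, and everything else dyld does is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

using LoadImagesFn = void (*)(const std::string& image);

static LoadImagesFn sRegisteredInit = nullptr;     // plays the role of sNotifyObjCInit
static std::vector<std::string> gLoadImagesCalls;  // images passed to the callback

// Stand-in for load_images.
void simLoadImages(const std::string& image) { gLoadImagesCalls.push_back(image); }

struct SimImage {
    std::string name;
    std::vector<SimImage*> deps;
    bool registersObjC = false;  // true for the image whose initializer registers the callback
    bool visited = false;
};

void simRecursiveInitialization(SimImage& image) {
    if (image.visited) return;
    image.visited = true;
    for (SimImage* dep : image.deps) simRecursiveInitialization(*dep);

    // Step 1: notify, before this image's own initializers run.
    if (sRegisteredInit != nullptr) sRegisteredInit(image.name);

    // Step 2: run initializers; for one image this registers the callback.
    if (image.registersObjC) sRegisteredInit = &simLoadImages;
}
```

With App depending on Foundation and Foundation depending on libSystem (the registering image), the callback is skipped for libSystem and then invoked for Foundation and App, in that order. That is the same "empty the first time, set afterwards" behavior described above.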
VIII. dyld flow diagram
That covers the overall dyld flow. The next article will explore class loading and related details.