The previous article did not cover every detail of the dyld::_main function, so this article continues the analysis.


Now let’s analyze initializeMainExecutable.

initializeMainExecutable

We analyzed the overall flow of the dyld::_main function above, where the initializeMainExecutable function runs all the initializers. Let’s see how it executes.

void initializeMainExecutable()
{
    // record that we've reached this step
    // Mark in the gLinkContext global variable that main Executable now begins Initializers
    gLinkContext.startedInitializingMainExecutable = true;

    // Run initializers for any inserted dylibs
    
    // Create an array of struct InitializerTimingList to record how long the initializers take
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    
    initializerTimes[0].count = 0;
    
    // sImageRoots is a static global variable: static std::vector<ImageLoader*> sImageRoots;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            // ⬇️ call runInitializers of ImageLoader
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    
    // ⬇️ run initializers for main executable and everything it brings up
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    
    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL ) 
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    // Determine whether to print the information based on the environment variables
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

gLinkContext is a global variable declared as ImageLoader::LinkContext gLinkContext;. LinkContext is a structure defined in class ImageLoader; it holds many function pointers and member variables used to record and process the link context. Its bool startedInitializingMainExecutable; member records that initialization of the main executable has started, so it is set to true here.

InitializerTimingList is also a fairly simple structure defined in class ImageLoader. It records the time taken by each image's initializers.

struct InitializerTimingList
{
    uintptr_t    count;
    struct {
        const char*        shortName;
        uint64_t        initTime;
    } images[1];

    void addTime(const char* name, uint64_t time);
};

// Appends the time to the specified image
void ImageLoader::InitializerTimingList::addTime(const char* name, uint64_t time)
{
    for (int i=0; i < count; ++i) {
        if ( strcmp(images[i].shortName, name) == 0 ) {
            images[i].initTime += time;
            return;
        }
    }
    images[count].initTime = time;
    images[count].shortName = name;
    ++count;
}

The addTime function adds the time to the entry for the named image if one already exists; otherwise it appends a new entry for that image.
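
The lookup-or-append behaviour of addTime can be tried in isolation. The following is a minimal sketch, not dyld's code: TimingList, Entry, timeFor, and exampleTimings are hypothetical names, and std::vector and std::string replace the fixed-layout array used by the real structure.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for ImageLoader::InitializerTimingList.
struct TimingList {
    struct Entry {
        std::string shortName;
        uint64_t    initTime;
    };
    std::vector<Entry> images;

    // Accumulate into an existing entry, or append a new one.
    void addTime(const std::string& name, uint64_t time) {
        for (Entry& e : images) {
            if (e.shortName == name) {
                e.initTime += time;
                return;
            }
        }
        images.push_back({name, time});
    }

    // Total time recorded for one image, 0 if it was never recorded.
    uint64_t timeFor(const std::string& name) const {
        for (const Entry& e : images)
            if (e.shortName == name) return e.initTime;
        return 0;
    }
};

// Usage example: two calls for the same image end up in one entry.
void exampleTimings() {
    TimingList t;
    t.addTime("libobjc.A.dylib", 120);
    t.addTime("Foundation", 40);
    t.addTime("libobjc.A.dylib", 30);   // same image: accumulates
    assert(t.images.size() == 2);
    assert(t.timeFor("libobjc.A.dylib") == 150);
}
```

This matters later in the walkthrough, because the same image can be timed twice: once for the objc notification and once for its own initializers.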

runInitializers

Let’s look at the runInitializers function that both sImageRoots[i] and sMainExecutable call. runInitializers is defined in the ImageLoader class.

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    // Start the timer
    uint64_t t1 = mach_absolute_time();
    
    // Record the current thread
    mach_port_t thisThread = mach_thread_self();
    
    // UninitedUpwards is a structure defined inside the ImageLoader class,
    // Its imagesAndPaths member variable is used to record the image and its path
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    
    // core ⬇️⬇️⬇️
    processInitializers(context, thisThread, timingInfo, up);
    
    // Send the batch notification that initialization is done
    context.notifyBatch(dyld_image_state_initialized, false);
    
    // Release the thread port obtained from mach_thread_self()
    mach_port_deallocate(mach_task_self(), thisThread);
    
    // Stop the timer
    uint64_t t2 = mach_absolute_time();
    
    // Accumulate the duration
    fgTotalInitTime += (t2 - t1); 
}

In runInitializers we see two familiar functions: mach_absolute_time, used to measure the initialization time, and mach_thread_self, used to record the current thread.

UninitedUpwards is a very simple structure defined inside the ImageLoader class. Its member variable is std::pair<ImageLoader*, const char*> imagesAndPaths[1];, where one value records the address of the ImageLoader and the other records the path of that ImageLoader.

struct UninitedUpwards
{
    uintptr_t count;
    std::pair<ImageLoader*, const char*> imagesAndPaths[1];
};

processInitializers(context, thisThread, timingInfo, up); is the most important call here, so let’s look at it.

processInitializers

In processInitializers, recursiveInitialization is called on every image in the list. Recursion is needed because a library can itself depend on other libraries, so the dependencies form a multi-level structure that has to be walked level by level.

// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
// To handle dangling dylibs that are linked upward but not downward, all upward-linked dylibs defer their initialization until the recursion through the downward dylibs has completed
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                     InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount() +2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    
    // Calling recursive init on all images in images list, building a new list of uninitialized upward dependencies.
    // Recursively initialize all images in the list, building a new list of uninitialized upward dependencies
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    
    // If any upward dependencies remain, init them.
    // If there are any upward dependencies, initialize them
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}
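
The "postpone upward dependencies" idea can be illustrated with a small self-contained sketch. This is not dyld's implementation: Image, Dependency, recursiveInit, and processInits are hypothetical names, a simple visited flag stands in for dyld's image states, and a std::vector replaces the fixed-size UninitedUpwards buffer.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Image;  // hypothetical stand-in for ImageLoader

struct Dependency {
    Image* image;
    bool   upward;   // true for an "upward" link
};

struct Image {
    std::string             name;
    std::vector<Dependency> deps;
    bool                    visited = false;
};

// Initialize `img` after its downward dependencies; collect upward ones in `ups`.
void recursiveInit(Image* img, std::vector<Image*>& ups, std::vector<std::string>& order) {
    assert(img != nullptr);
    if (img->visited) return;
    img->visited = true;                // mark first so cycles terminate
    for (const Dependency& d : img->deps) {
        if (d.upward)
            ups.push_back(d.image);     // postpone: do not initialize "above" us yet
        else
            recursiveInit(d.image, ups, order);
    }
    order.push_back(img->name);         // the image itself runs after its dependencies
}

// Process a list of roots, then whatever upward dependencies remain.
void processInits(const std::vector<Image*>& images, std::vector<std::string>& order) {
    std::vector<Image*> ups;
    for (Image* img : images)
        recursiveInit(img, ups, order);
    if (!ups.empty())
        processInits(ups, order);
}
```

With a chain main -> A -> B where B links upward to C, the sketch initializes B, A, and main first, and C only in the second pass.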

images.imagesAndPaths[i].first is an ImageLoader pointer (ImageLoader*), on which the recursiveInitialization function of class ImageLoader is called. Let’s look at the definition of the recursiveInitialization function.

recursiveInitialization

The function initializes lower-level libraries first: a for loop walks the libraries this image depends on, and for each one that exists and is not an upward link it calls dependentImage->recursiveInitialization, because that library may in turn depend on other libraries.

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    // A recursive lock structure that holds the current thread
    // struct recursive_lock {
    // recursive_lock(mach_port_t t) : thread(t), count(0) {}
    // mach_port_t thread;
    // int count;
    // };
    
    recursive_lock lock_info(this_thread);
    
    // Recursively lock
    recursiveSpinLock(lock_info);

    // dyld_image_state_dependents_initialized = 45, // Only single notification for this
    
    if ( fState < dyld_image_state_dependents_initialized- 1 ) {
        uint8_t oldState = fState;
        // break cycles
        // Break the recursive loop
        fState = dyld_image_state_dependents_initialized- 1;
        
        try {
            // initialize lower level libraries first
            // Initialize the lower-level libraries first
            
            // unsigned int libraryCount() const { return fLibraryCount; } 
            
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        // Recursive initialization of dependent libraries
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }
            
            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time(); // ⬅️ start time
            
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            
            // core ⬇️⬇️⬇️
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            // ⬆️⬆️⬆️
            
            // initialize this image
            
            // This is where the image's own initializers finally run. Internally it calls these two functions:
            
            // mach-o has -init and static initializers
            // doImageInit(context);
            // doModInitFunctions(context); // Initializers in __mod_init_func are executed
            
            // core ⬇️⬇️⬇️
            bool hasInitializers = this->doInitialization(context);
            // ⬆️⬆️⬆️
            
            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            
            // void (*notifySingle)(dyld_image_states, const ImageLoader* image, InitializerTimingList*);
            
            // core ⬇️⬇️⬇️
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            // ⬆️⬆️⬆️
            
            // Record the elapsed time if this image had initializers
            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time(); // ⬅️ finish time
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
        
            // this image is not initialized
            // If initialization fails, restore the state, unlock, and rethrow
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }
    
    // Release the recursive lock
    recursiveSpinUnLock();
}

recursiveInitialization is a recursive function: it initializes the libraries an image depends on before the image itself. The two calls we mainly need to study are notifySingle and doInitialization.
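
The "break cycles" step and the state restore in the catch block can be sketched separately. The following is an illustrative sketch under assumptions, not dyld's code: Node, initNode, failInit, and the log are hypothetical, the state values only mirror the enum discussed in this article, and the locking is left out.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// State values chosen to mirror dyld_image_states; kInitializing plays the role of
// dyld_image_state_dependents_initialized-1.
enum State { kBound = 40, kInitializing = 44, kDependentsInitialized = 45, kInitialized = 50 };

struct Node {
    std::string        name;
    std::vector<Node*> deps;
    int                state = kBound;
    bool               failInit = false;   // hypothetical switch to simulate a throwing initializer
};

void initNode(Node* n, std::vector<std::string>& log) {
    assert(n != nullptr);
    if (n->state >= kInitializing) return;     // already done, or in progress (a cycle)
    int oldState = n->state;
    n->state = kInitializing;                  // break cycles before recursing
    try {
        for (Node* d : n->deps) initNode(d, log);
        n->state = kDependentsInitialized;
        oldState = n->state;
        log.push_back("notify " + n->name);    // where dyld would call notifySingle
        if (n->failInit) throw std::runtime_error("initializer failed");
        log.push_back("init " + n->name);      // where dyld would call doInitialization
        n->state = kInitialized;
    } catch (...) {
        n->state = oldState;                   // this image is not initialized
        throw;
    }
}
```

For two nodes A and B that depend on each other, calling initNode on A terminates and logs "notify B", "init B", "notify A", "init A", because B sees A as already in progress and does not recurse back into it.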

Earlier, we set a breakpoint in a +load method and used the bt command to print the call stack. That stack contains the recursiveInitialization and notifySingle frames we have now reached.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0:  Test_ipa_Simple`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:17:5
    frame #1:  libobjc.A.dylib`load_images + 944
    frame #2:  dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 464
    frame #3:  dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 512
    frame #4:  dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 184
    frame #5:  dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 92
    frame #6:  dyld`dyld::initializeMainExecutable() + 216
    frame #7:  dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 5216
    frame #8:  dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 396
    frame #9:  dyld`_dyld_start + 56
(lldb)

The call flow can be read all the way through: _dyld_start -> dyldbootstrap::start -> dyld::_main -> dyld::initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> dyld::notifySingle -> libobjc.A.dylib load_images -> +[ViewController load]. We are now at notifySingle.

Next, we continue analyzing the recursiveInitialization function. The call under the comment // let objc know we are about to initialize this image is one of its most important parts:

context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);

First, if we trace the call chain all the way back to initializeMainExecutable, we can see that the context argument is the global variable ImageLoader::LinkContext gLinkContext;.

// notifySingle is a function pointer of this type
void (*notifySingle)(dyld_image_states, const ImageLoader* image, InitializerTimingList*);

Then, in the dyld/src/dyld2.cpp file:

static void setContext(const macho_header* mainExecutableMH, int argc, const char* argv[], const char* envp[], const char* apple[]) { ... }

Inside this static function, gLinkContext.notifySingle is assigned &notifySingle, and notifySingle is itself a static function defined in dyld2.cpp. So the context.notifySingle called in recursiveInitialization is gLinkContext.notifySingle, which is the static notifySingle function in dyld/src/dyld2.cpp.
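
The pattern of storing a file-local function in a context structure and calling it through the pointer can be shown with a reduced sketch. Everything here is hypothetical and only illustrates the wiring: Context, notifySingleImpl, setUpContext, initializeImage, and gEvents are not dyld names, and the real LinkContext has many more members.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-reduced stand-in for ImageLoader::LinkContext.
struct Context {
    void (*notifySingle)(int state, const char* imageName);
};

static std::vector<std::string> gEvents;   // records calls so they can be inspected

// A file-local function, like the static notifySingle in dyld2.cpp.
static void notifySingleImpl(int state, const char* imageName) {
    gEvents.push_back(std::to_string(state) + ":" + imageName);
}

static Context gContext = { nullptr };

// Wire the function pointer once, in the spirit of setContext.
static void setUpContext() {
    gContext.notifySingle = &notifySingleImpl;
}

// Code that only sees the context calls through the pointer.
static void initializeImage(const Context& context, const char* imageName) {
    assert(context.notifySingle != nullptr);
    context.notifySingle(45, imageName);   // dependents initialized
    context.notifySingle(50, imageName);   // initialized
}
```

The caller never names notifySingleImpl directly, which is why the article has to trace context back to gLinkContext to find out which function actually runs.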

notifySingle

Next we search for the notifySingle function directly in dyld2.cpp, where it is defined as a static function, and look at its implementation.

First, let’s look at a set of enumeration values defined in ImageLoader.h, each of which represents a state of the image during the dyld process.

enum dyld_image_states
{
    dyld_image_state_mapped                    = 10,        // No batch notification for this
    dyld_image_state_dependents_mapped        = 20,        // Only batch notification for this
    dyld_image_state_rebased                = 30,
    dyld_image_state_bound                    = 40,
    dyld_image_state_dependents_initialized    = 45,        // Only single notification for this
    dyld_image_state_initialized            = 50,
    dyld_image_state_terminated                = 60        // Only single notification for this
};
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    
    // Callbacks registered for image state changes (dyld_image_state_change_handler)
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress    = image->machHeader();
        info.imageFilePath       = image->getRealPath();
        info.imageFileModDate    = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, " image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    
    // If state is dyld_image_state_mapped, some shared-cache-related processing is done.
    // recursiveInitialization calls context.notifySingle twice, with
    // dyld_image_state_dependents_initialized and dyld_image_state_initialized,
    // so we ignore the dyld_image_state_mapped case for the moment.
    
    if ( state == dyld_image_state_mapped ) {
        // <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
        // <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
        if ( !image->inSharedCache()
            || (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    
    // ⬇️⬇️⬇️
    // sNotifyObjCInit is a static global variable: static _dyld_objc_notify_init sNotifyObjCInit;
    // typedef void (*_dyld_objc_notify_init)(const char* path, const struct mach_header* mh);
    
    // image->notifyObjC() defaults to 1
    
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        // Start time of statistics
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        
        // ⬇️⬇️ calls sNotifyObjCInit. When is this global function pointer assigned?
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        
        // Take the end time and add the elapsed time
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    
    // The case where state is dyld_image_state_terminated can be ignored for now
    
    // mach message csdlc about dynamically unloaded images
    if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
    
        notifyKernel(*image, false);
        
        const struct mach_header* loadAddress[] = { image->machHeader() };
        const char* loadPath[] = { image->getPath() };
        
        notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    }
}

The core of notifySingle is the call (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());.

typedef void (*_dyld_objc_notify_init)(const char* path, const struct mach_header* mh);

static _dyld_objc_notify_init sNotifyObjCInit;

sNotifyObjCInit is a static global variable whose type is the function pointer type _dyld_objc_notify_init. Searching the dyld2.cpp file shows that it is assigned in the registerObjCNotifiers function.

registerObjCNotifiers

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    
    sNotifyObjCMapped    = mapped;
    
    // ⬇️⬇️⬇️
    sNotifyObjCInit        = init;
    // ⬆️⬆️⬆️
    
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            
            // ⬇️⬇️⬇️
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}

The three global variables assigned here are declared together; they are function pointers of three different types.

typedef void (*_dyld_objc_notify_mapped)(unsigned count, const char* const paths[], const struct mach_header* const mh[]);
typedef void (*_dyld_objc_notify_init)(const char* path, const struct mach_header* mh);
typedef void (*_dyld_objc_notify_unmapped)(const char* path, const struct mach_header* mh);

static _dyld_objc_notify_mapped        sNotifyObjCMapped;
static _dyld_objc_notify_init        sNotifyObjCInit;
static _dyld_objc_notify_unmapped    sNotifyObjCUnmapped;

We can see that the _dyld_objc_notify_init init parameter of registerObjCNotifiers is assigned directly to sNotifyObjCInit, and that sNotifyObjCInit is then called in the for loop for every image that is already initialized.
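
The "store the callback, then replay it for images that are already initialized" behaviour can be sketched on its own. This is a simplified illustration under assumptions, not dyld's code: InitCallback, LoadedImage, registerInitCallback, notifyImageInit, recordImage, sImages, and sSeen are hypothetical names, and the image states are reduced to a single flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef void (*InitCallback)(const char* path);

struct LoadedImage {
    std::string path;
    bool        initialized;
};

static InitCallback             sInitCallback = nullptr;
static std::vector<LoadedImage> sImages;   // hypothetical stand-in for sAllImages
static std::vector<std::string> sSeen;     // filled by the example callback below

// Store the callback, then replay it for images that finished initializing earlier.
void registerInitCallback(InitCallback init) {
    assert(init != nullptr);
    sInitCallback = init;
    for (const LoadedImage& image : sImages) {
        if (image.initialized)
            (*sInitCallback)(image.path.c_str());
    }
}

// Called when a later image reaches its notification point (simplified).
void notifyImageInit(const std::string& path) {
    sImages.push_back({path, true});
    if (sInitCallback != nullptr)
        (*sInitCallback)(path.c_str());
}

// Example callback playing the role that load_images plays for objc.
void recordImage(const char* path) {
    sSeen.push_back(path);
}
```

The replay loop is what keeps images initialized before registration from being missed, while later images are reported as they come up.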

_dyld_objc_notify_register

So when is registerObjCNotifiers called, and what is passed as the _dyld_objc_notify_init init argument? (The parameters of registerObjCNotifiers may look familiar: the _objc_init function of objc has a part that deals with images.)

A global search for registerObjCNotifiers shows that dyld::registerObjCNotifiers (it belongs to namespace dyld) is called inside the _dyld_objc_notify_register function in dyld/src/dyldAPIs.cpp.

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

The _dyld_objc_notify_register function is also called in the objc source code, inside the _objc_init function. Let’s look at the declaration of _dyld_objc_notify_register first.

//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to call dlopen() on them to keep them from being unloaded.  
// During the call to _dyld_objc_notify_register(), dyld will call the "mapped" function with already loaded objc images.  
// During any later dlopen() call, dyld will also call the "mapped" function.  
// Dyld will call the "init" function when dyld would be called initializers in that image.  
// This is when objc calls any +load methods in that image.

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);

The _dyld_objc_notify_register function is for use by the objc runtime only. It registers handlers to be called when objc images are mapped, unmapped, and initialized. dyld calls the mapped function with an array of images that contain an objc-image-info section. Images that are dylibs have their reference counts bumped automatically, so objc no longer needs to call dlopen() on them to keep them from being unloaded. During the call to _dyld_objc_notify_register(), dyld calls the mapped function with the objc images that are already loaded, and it also calls the mapped function during any later dlopen() call. dyld calls the init function at the point where it would call the initializers in an image; this is when objc calls any +load methods in that image.

Since the function is provided only for the objc runtime, let’s go to the objc4 source code and look for the call to _dyld_objc_notify_register.

_objc_init

Searching for _dyld_objc_notify_register in objc4-781 shows that it is called in _objc_init.

void _objc_init(void)
{
    // initialized is a local static variable, so it is initialized only once; this ensures the body of _objc_init runs only once
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
    cache_init();
    _imp_implementationWithBlock_init();
    
    // ⬇️⬇️⬇️
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

So, after all this, the init argument that gets assigned to sNotifyObjCInit is load_images. The other two arguments are &map_images, assigned to sNotifyObjCMapped, and unmap_image, assigned to sNotifyObjCUnmapped.

load_images calls the +load methods of all the superclasses, subclasses, and categories in the image. We will not expand on +load here because we analyzed it in detail in a previous article.

We can now connect everything at the source level: recursiveInitialization -> context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo) -> (*sNotifyObjCInit)(image->getRealPath(), image->machHeader()), where sNotifyObjCInit is the load_images function registered by the _dyld_objc_notify_register call in _objc_init. dyld then invokes these callbacks. (The two Apple open-source projects used here are objc4-781 and dyld-832.7.3.)

This covers every function shown in the earlier bt output: _dyld_start -> dyldbootstrap::start -> dyld::_main -> dyld::initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> dyld::notifySingle -> libobjc.A.dylib load_images.

To sum up: the callback functions are registered in _objc_init and called by dyld.


doInitialization

The dyld callbacks are registered by calling _dyld_objc_notify_register inside the _objc_init function. So when is _objc_init called? _objc_init is the initialization function of the libobjc image, so when is libobjc initialized?

We are still following the recursiveInitialization function above. Right after context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);, the following call is made:

// initialize this image
// Initialize the image
bool hasInitializers = this->doInitialization(context);

So let’s look at the doInitialization.

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // ⬇️⬇️⬇️
    // mach-o has -init and static initializers
    // doImageInit loops over the load commands and calls the image's -init function (libSystem must be initialized first)
    doImageInit(context);
    
    doModInitFunctions(context);
    
    CRSetCrashLogMessage2(NULL);
    
    return (fHasDashInit || fHasInitializers);
}

The heart of this function is the two calls doImageInit(context); and doModInitFunctions(context);.

The core of doImageInit(context); is a for loop that calls the image's initialization function. Note that the libSystem library must be initialized first.

doImageInit

Let’s look at the implementation of the doImageInit function:

// <rdar://problem/17973316> libSystem initializer must run first

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
    // fHasDashInit is one of the bitfields of class ImageLoaderMachO: fHasDashInit: 1,
    if ( fHasDashInit ) {
    
        // Number of load commands
        // fMachOData is a member of class ImageLoaderMachO: const uint8_t* fMachOData;
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        
        // Directly addresses the load_command location from the Mach header
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        
        // Load_command is iterated
        for (uint32_t i = 0; i < cmd_count; ++i) {
        
            switch (cmd->cmd) {
            
                // Only handle load commands of type LC_ROUTINES_COMMAND
                // (#define LC_ROUTINES_COMMAND LC_ROUTINES_64)
                // This type of load command holds the image's -init initializer.
                case LC_ROUTINES_COMMAND:
                    
                    // Under __LP64__, macho_routines_command inherits from routines_command_64:
                    // struct macho_routines_command : public routines_command_64 {};
                    // Search for routines_command_64 in darwin-xnu/EXTERNAL_HEADERS/mach-o/loader.h to find the following definition
                    
                    /* * The 64-bit routines command. Same use as above. */
                    // struct routines_command_64 { /* for 64-bit architectures */
                    // uint32_t cmd; /* LC_ROUTINES_64 */
                    // uint32_t cmdsize; /* total size of this command */
                    // uint64_t init_address; /* address of initialization routine */
                    // uint64_t init_module; /* index into the module table that the init routine is defined in */
                    // uint64_t reserved1;
                    // uint64_t reserved2;
                    // uint64_t reserved3;
                    // uint64_t reserved4;
                    // uint64_t reserved5;
                    // uint64_t reserved6;
                    // };
                    
                    // This is the Initializer for the current load_command
                    Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
                    
#if __has_feature(ptrauth_calls)
                    func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                    // <rdar://problem/8543820&9228031> verify initializers are in image
                    if(!this->containsAddress(stripPointer((void*)func)) ) {
                        dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                    }
                    
                    // The initializer for libSystem must run first
                    // extern struct dyld_all_image_infos* gProcessInfo; is declared in the dyld namespace
                    // If libSystem.dylib has not been initialized yet, an error is thrown
                    if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                        // <rdar://problem/17973316> libSystem initializer must run first
                        dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                    }
                    }
                    
                    if ( context.verboseInit )
                        dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
                    
                    // Execute the initializer
                    {
                        dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                        func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                    }
                    
                    break;
            }
            
            // cmd points to the next load_command
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}

Inside doImageInit, dyld iterates over the load commands of the current image, finds the load command of type LC_ROUTINES_COMMAND, and computes the address of the initializer by adding the image’s slide to init_address: Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide); Before the initializer is run, the check if ( ! dyld::gProcessInfo->libSystemInitialized ) throws an error if libSystem has not been initialized yet.
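The load command walk described above can be sketched in isolation. This is only an illustration under stated assumptions: LoadCommand and RoutinesCommand are simplified stand-ins for the real Mach-O structures, and findInitAddress and makeSampleCommands are hypothetical helpers, not dyld functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-ins for the Mach-O structures; the real definitions in
// <mach-o/loader.h> carry more fields.
struct LoadCommand {
    uint32_t cmd;      // command type
    uint32_t cmdsize;  // size of the whole command in bytes, header included
};

struct RoutinesCommand {
    uint32_t cmd;
    uint32_t cmdsize;
    uint64_t init_address;  // unslid address of the -init routine
};

constexpr uint32_t kRoutines64 = 0x1A;  // numeric value of LC_ROUTINES_64

// Walks `count` variable-sized commands and returns the slid initializer
// address of the first routines command, or 0 when there is none.
uint64_t findInitAddress(const uint8_t* cmds, uint32_t count, uint64_t slide) {
    const uint8_t* p = cmds;
    for (uint32_t i = 0; i < count; ++i) {
        LoadCommand lc;
        std::memcpy(&lc, p, sizeof lc);   // copy out to avoid unaligned reads
        assert(lc.cmdsize >= sizeof lc);  // a smaller size would stall the walk
        if (lc.cmd == kRoutines64) {
            RoutinesCommand rc;
            std::memcpy(&rc, p, sizeof rc);
            return rc.init_address + slide;  // same arithmetic as init_address + fSlide
        }
        p += lc.cmdsize;  // commands are laid out back to back
    }
    return 0;
}

// Hypothetical sample data: one filler command followed by a routines command.
std::vector<uint8_t> makeSampleCommands() {
    const LoadCommand filler{0x19, 24};  // 8-byte header plus 16 payload bytes
    const RoutinesCommand rc{kRoutines64, sizeof(RoutinesCommand), 0x1000};
    std::vector<uint8_t> buf(filler.cmdsize + sizeof rc, 0);
    std::memcpy(buf.data(), &filler, sizeof filler);
    std::memcpy(buf.data() + filler.cmdsize, &rc, sizeof rc);
    return buf;
}
```

The key point is that each command is advanced by its own cmdsize, which is why dyld casts to char* before adding.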

The if ( ! dyld::gProcessInfo->libSystemInitialized ) check in doImageInit confirms that libSystem.dylib must be initialized first. Why? Because libobjc is initialized from within libdispatch’s initialization, and libdispatch is initialized as part of libSystem’s initialization. So how do we verify this?

Set a breakpoint on _objc_init in the objc4-781 source and run it.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001860964a0 libobjc.A.dylib`_objc_init // ⬅️ here
    frame #1: 0x00000001002f5014 libdispatch.dylib`_os_object_init + 24 // ⬅️ here
    frame #2: 0x0000000100308728 libdispatch.dylib`libdispatch_init + 476 // ⬅️ here
    frame #3: 0x000000018f8777e8 libSystem.B.dylib`libSystem_initializer + 220 // ⬅️ here
    
    frame #4: 0x000000010003390c dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 868
    frame #5: 0x0000000100033b94 dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 56
    frame #6: 0x000000010002d84c dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 620
    frame #7: 0x000000010002d794 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 436
    frame #8: 0x000000010002b300 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 192
    frame #9: 0x000000010002b3cc dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 96
    frame #10: 0x00000001000167fc dyld`dyld::initializeMainExecutable() + 140
    frame #11: 0x000000010001cb98 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 7388
    frame #12: 0x0000000100015258 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 476
    frame #13: 0x0000000100015038 dyld`_dyld_start + 56
(lldb) 

We are already familiar with the calls from _dyld_start down to ImageLoaderMachO::doModInitFunctions, so let’s focus on the calls from libSystem_initializer up to _objc_init:

libobjc.A.dylib`_objc_init
⬆️⬆️⬆️
libdispatch.dylib`_os_object_init
⬆️⬆️⬆️
libdispatch.dylib`libdispatch_init
⬆️⬆️⬆️
libSystem.B.dylib`libSystem_initializer

As it happens, the libraries in which these functions reside are open source, so let’s download the source code to find out.

libSystem_initializer

libSystem_initializer is called first. It is defined in Libsystem/init.c, and the function begins as follows:

// libsyscall_initializer() initializes all of libSystem.dylib
// <rdar://problem/4892197>
__attribute__((constructor))
static void
libSystem_initializer(int argc,
              const char* argv[],
              const char* envp[],
              const char* apple[],
              const struct ProgramVars* vars)
{
    static const struct _libkernel_functions libkernel_funcs = {
        .version = 4,
        // V1 functions
#if !TARGET_OS_DRIVERKIT
        .dlsym = dlsym,
#endif
        .malloc = malloc,
        .free = free,
        .realloc = realloc,
        ._pthread_exit_if_canceled = _pthread_exit_if_canceled,
        ...

Here’s an excerpt from the libSystem_initializer function:

// Initialization of the kernel
__libkernel_init(&libkernel_funcs, envp, apple, vars);

// Initializing the platform
__libplatform_init(NULL, envp, apple, vars);

// Thread initialization (our GCD can only be used after initialization)
__pthread_init(&libpthread_funcs, envp, apple, vars);

// Initialize libc
_libc_initializer(&libc_funcs, envp, apple, vars);

// Initialize malloc
// TODO: Move __malloc_init before __libc_init after breaking malloc's upward link to Libc
// Note that __malloc_init() will also initialize ASAN when it is present
__malloc_init(apple);

// Initialize dyld. dyld is not initialized back in _dyld_start; dyld is also a library
_dyld_initializer();

// Initialize libdispatch, which also appears in the backtrace above
libdispatch_init();
_libxpc_initializer();

In the Libsystem/init.c file we can see the declarations of a set of external functions:

// system library initialisers
extern void mach_init(void);            // from libsystem_kernel.dylib
extern void __libplatform_init(void *future_use, const char *envp[], const char *apple[], const struct ProgramVars *vars);
extern void __pthread_init(const struct _libpthread_functions *libpthread_funcs, const char *envp[], const char *apple[], const struct ProgramVars *vars);    // from libsystem_pthread.dylib
extern void __malloc_init(const char *apple[]); // from libsystem_malloc.dylib
extern void __keymgr_initializer(void);        // from libkeymgr.dylib
extern void _dyld_initializer(void);        // from libdyld.dylib
extern void libdispatch_init(void);        // from libdispatch.dylib
extern void _libxpc_initializer(void);        // from libxpc.dylib
extern void _libsecinit_initializer(void);        // from libsecinit.dylib
extern void _libtrace_init(void);        // from libsystem_trace.dylib
extern void _container_init(const char *apple[]); // from libsystem_containermanager.dylib
extern void __libdarwin_init(void);        // from libsystem_darwin.dylib

Inside libSystem_initializer, the initialization functions of other libraries are called. One of them is _dyld_initializer();, which initializes the dyld library, because dyld is also a dynamic library.

When an executable is launched, the kernel finishes setting up the environment and then hands control to dyld, which performs loading and linking.

libdispatch_init

Libdispatch_init () is called inside the libSystem_initializer function; Libdispatch_init is in libdispatch.dylib. Libdispatch_init libdispatch/Dispatch Source/queue.c libdispatch_init is defined as follows:

DISPATCH_EXPORT DISPATCH_NOTHROW
void
libdispatch_init(void)
{
    dispatch_assert(sizeof(struct dispatch_apply_s) <=
            DISPATCH_CONTINUATION_SIZE);

    if (_dispatch_getenv_bool("LIBDISPATCH_STRICT", false)) {
        _dispatch_mode |= DISPATCH_MODE_STRICT;
    }
    ...

Reading further down the definition of libdispatch_init, there are some GCD-related creation operations for _dispatch_thread, and near the end of the function we find the following calls:

...
_dispatch_hw_config_init();
_dispatch_time_init();
_dispatch_vtable_init();
// ⬇️⬇️⬇️
_os_object_init();

_voucher_init();
_dispatch_introspection_init();
}

_os_object_init

_os_object_init also belongs to libdispatch.dylib. Its definition can be found in libdispatch/Dispatch Source/object.m.

void
_os_object_init(void)
{
    // ⬇️⬇️⬇️
    _objc_init();
    
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void(*) (const void *))&objc_retain,
        (void(*) (const void *))&objc_release,
        (void(*) (const void *))&_os_objc_destructInstance
    };
    
    _Block_use_RR2(&callbacks);
#if DISPATCH_COCOA_COMPAT
    const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
#endif
}

The first thing _os_object_init does is call _objc_init, so at runtime control passes from _os_object_init into _objc_init. _objc_init in turn calls _dyld_objc_notify_register, which assigns sNotifyObjCInit.

So here we can summarize the _objc_init call flow:

  • _dyld_start ->
  • dyldbootstrap::start ->
  • dyld::_main ->
  • dyld::initializeMainExecutable ->
  • ImageLoader::runInitializers ->
  • ImageLoader::processInitializers ->
  • ImageLoader::recursiveInitialization ->
  • doInitialization ->
  • doModInitFunctions ->
  • libSystem_initializer (in libSystem.B.dylib) ->
  • libdispatch_init (in libdispatch.dylib) ->
  • _os_object_init (in libdispatch.dylib) ->
  • _objc_init (in libobjc.A.dylib)

  1. When dyld starts initializing the main executable, it calls the recursiveInitialization function recursively.
  2. The first time this function runs, it initializes libSystem, going through doInitialization -> doModInitFunctions -> libSystem_initializer.
  3. While libSystem is initializing, it calls libdispatch_init; libdispatch_init calls _os_object_init, and _os_object_init calls _objc_init.
  4. _objc_init calls dyld’s _dyld_objc_notify_register function, which saves the addresses of the map_images, load_images, and unmap_images functions.
  5. After registration, control returns to dyld’s recursiveInitialization, which recurses into the next image, for example libObjc. When recursiveInitialization reaches libObjc, the saved load_images callback is triggered and the load_images function is called.
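
The steps above can be sketched as a small model: dependencies are initialized before the image that needs them, and a notification callback only starts firing once some initializer has registered it. Everything here (Image, Loader, Result, runDemo) is a hypothetical miniature for illustration, not dyld’s ImageLoader.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct Loader;

// Hypothetical miniature of a loaded image; not dyld's ImageLoader class.
struct Image {
    std::string name;
    std::vector<Image*> deps;
    std::function<void(Loader&)> initializer;  // may be empty
    bool initialized = false;
};

struct Loader {
    std::function<void(const Image&)> notify;  // stand-in for sNotifyObjCInit
    std::vector<std::string> initOrder;        // images in initialization order
    std::vector<std::string> notified;         // images the callback was told about

    void recursiveInitialization(Image& img) {
        if (img.initialized) return;
        img.initialized = true;  // mark first so dependency cycles terminate
        for (Image* dep : img.deps) recursiveInitialization(*dep);
        if (notify) notify(img);  // only fires once a callback is registered
        if (img.initializer) img.initializer(*this);
        initOrder.push_back(img.name);
    }
};

struct Result { std::vector<std::string> initOrder, notified; };

Result runDemo() {
    Image libSystem{"libSystem", {}, {}};
    Image libobjc{"libobjc", {&libSystem}, {}};
    Image app{"app", {&libobjc, &libSystem}, {}};
    Loader loader;
    // The libSystem-like initializer registers the callback, as _objc_init does.
    libSystem.initializer = [](Loader& l) {
        l.notify = [&l](const Image& i) { l.notified.push_back(i.name); };
    };
    loader.recursiveInitialization(app);
    return {loader.initOrder, loader.notified};
}
```

In this model the callback is not yet registered when libSystem itself is processed, so the first image it is told about is libobjc.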

There is still one function we have not looked at. Above we analyzed void ImageLoaderMachO::doImageInit(const LinkContext& context). In the definition of ImageLoaderMachO::doInitialization, doImageInit(context); is called first, followed by doModInitFunctions(context);, and the backtrace shows that libSystem_initializer is called from doModInitFunctions.

doModInitFunctions

Let’s have a look at the definition of void ImageLoaderMachO::doModInitFunctions(const LinkContext& context).

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    // fHasInitializers is a bit field declared in ImageLoaderMachO (fHasInitializers : 1) that flags whether the image has initializers
    if ( fHasInitializers ) {
    
        // Locate the load command for the current image
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        
        // Iterate through load Command
        for (uint32_t i = 0; i < cmd_count; ++i) {
        
            // Load command of type LC_SEGMENT_COMMAND (#define LC_SEGMENT_COMMAND LC_SEGMENT_64)
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                
                // struct macho_segment_command : public segment_command_64 {};
                // Convert to segment_command_64 and move the pointer to find the macho_section location
                const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                
                // Iterate over the section of the current segment
                for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                    
                    // Retrieve the current section type
                    const uint8_t type = sect->flags & SECTION_TYPE;
                    
                    // If the current section type is S_MOD_INIT_FUNC_POINTERS (that is, __mod_init_func)
                    if ( type == S_MOD_INIT_FUNC_POINTERS ) {
                    
                        Initializer* inits = (Initializer*)(sect->addr + fSlide);
                        const size_t count = sect->size / sizeof(uintptr_t);
                        
                        // <rdar://problem/23929217> Ensure __mod_init_func section is within segment
                        // Make sure the __mod_init_func section lies within the current segment
                        if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
                            dyld::throwf("__mod_init_funcs section has malformed address range for %s\n", this->getPath());
                        
                        // Walk through all initializers for the current section
                        for (size_t j=0; j < count; ++j) {
                            
                            // Fetch each Initializer
                            Initializer func = inits[j];
                            
                            // <rdar://problem/8543820&9228031> verify initializers are in image
                            // Verify that initializers are in the image
                            if(!this->containsAddress(stripPointer((void*)func)) ) {
                                dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                            }
                            
                            // Make sure libSystem.dylib is initialized first
                            if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                                // <rdar://problem/17973316> libSystem initializer must run first
                                const char* installPath = getInstallPath();
                                
                                // If libSystem.dylib has not been initialized yet and the current image's install path
                                // is NULL or is not the path of libSystem.dylib, throw an error:
                                // libSystem.dylib must be initialized first
                                if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
                                    dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                            }
                            
                            // Print to start the call initialization
                            if ( context.verboseInit )
                                dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
                            
                            // call initialization
                            // const struct LibSystemHelpers* gLibSystemHelpers = NULL; Is a global variable used to assist LibSystem
                            // struct LibSystemHelpers helpers is a structure full of function Pointers
                            
                            // If the current image is libSystem.dylib, haveLibSystemHelpersBefore is false here
                            bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
                            
                            {
                                // Timing
                                dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                
                                // Execute the initialization function
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
                            
                            // If the current image is libSystem.dylib, haveLibSystemHelpersAfter is true here
                            bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                            
                            // If the helpers were absent before the call and present after it, libSystem.dylib has just been initialized, so set libSystemInitialized to true
                            if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                            
                                // now safe to use malloc() and other calls in libSystem.dylib
                                // libSystem is initialized, so it is now safe to use malloc() and other calls in libSystem.dylib
                                
                                dyld::gProcessInfo->libSystemInitialized = true;
                            }
                        }
                    }
                    // If the current section type is S_INIT_FUNC_OFFSETS (that is, __init_offsets)
                    else if ( type == S_INIT_FUNC_OFFSETS ) {
                        // Read inits
                        const uint32_t* inits = (uint32_t*)(sect->addr + fSlide);
                        const size_t count = sect->size / sizeof(uint32_t);
                        
                        // Ensure section is within segment
                        // Make sure the current section lies within the current segment
                        if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
                            dyld::throwf("__init_offsets section has malformed address range for %s\n", this->getPath());
                        
                        // Verify that the current segment is read-only
                        if ( seg->initprot & VM_PROT_WRITE )
                            dyld::throwf("__init_offsets section is not in read-only segment %s\n", this->getPath());
                        
                        // Iterate over all inits in the current area
                        for (size_t j=0; j < count; ++j) {
                            uint32_t funcOffset = inits[j];
                            
                            // verify initializers are in image
                            // Verify that initializers are in the image
                            if(!this->containsAddress((uint8_t*)this->machHeader() + funcOffset) ) {
                                dyld::throwf("initializer function offset 0x%08X not in mapped image for %s\n", funcOffset, this->getPath());
                            }
                            
                            // Make sure libSystem.dylib is initialized first
                            if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                                // <rdar://problem/17973316> libSystem initializer must run first
                                
                                // libSystemPath(context) returns the path of libSystem; context is only used to decide whether this is a DriverKit process
                                // #define LIBSYSTEM_DYLIB_PATH "/usr/lib/libSystem.B.dylib"
                                // The libSystem dynamic library lives at a fixed location on the system
                                
                                const char* installPath = getInstallPath();
                                
                                // If libSystem.dylib has not been initialized yet and the current image's install path
                                // is NULL or is not the path of libSystem.dylib, throw an error:
                                // libSystem.dylib must be initialized first
                                if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
                                    dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                            }
                            
                            // Convert to Initializer function pointer
                            Initializer func = (Initializer)((uint8_t*)this->machHeader() + funcOffset);
                            
                            // Print to start the call initialization
                            if ( context.verboseInit )
                                dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
                                
#if __has_feature(ptrauth_calls)
                            func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                            
                            // call initialization
                            
                            // haveLibSystemHelpersBefore and haveLibSystemHelpersAfter are used in the same way as in the previous branch
                            bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
                            {
                                // Timing
                                dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                
                                // Execute the initialization function
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
                            bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                            
                            // If the helpers were absent before the call and present after it, libSystem.dylib has just been initialized, so set libSystemInitialized to true
                            if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                            
                                // now safe to use malloc() and other calls in libSystem.dylib
                                // libSystem is initialized and it is now safe to use malloc() and other calls in libsystem.dylib
                                
                                dyld::gProcessInfo->libSystemInitialized = true;
                            }
                        }
                        
                    }
                }
            }
            
            // Advance to the next load command
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
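
The three-part range test that guards both section types in the listing above is worth isolating, because the third comparison is easy to overlook: it rejects a section whose end address wraps around. A minimal sketch, where sectionWithinSegment is a hypothetical helper name and not a dyld function:

```cpp
#include <cassert>
#include <cstdint>

// True when [addr, addr + size) lies inside [segStart, segStart + segSize).
// Unsigned arithmetic wraps on overflow, so the last test catches a section
// whose computed end has wrapped past the top of the address space.
bool sectionWithinSegment(uint64_t addr, uint64_t size,
                          uint64_t segStart, uint64_t segSize) {
    if (addr < segStart) return false;                   // starts before the segment
    if (addr + size > segStart + segSize) return false;  // ends after the segment
    if (addr + size < addr) return false;                // end wrapped around
    return true;
}
```

With a wrapped end address, the second comparison alone would pass, which is why the extra test is needed.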

doModInitFunctions calls the initializers found in the __mod_init_func and __init_offsets sections. The libSystem_initializer of libSystem.dylib is stored in the __mod_init_func section of the libSystem.dylib library. (The __mod_init_func section is used only to store initialization functions.)

At the beginning of this article we saw that +load methods are executed first. The previous article looked at +load execution in detail; you may remember that its entry point is the load_images function. In the objc4-781 source, everything starts in _objc_init, which calls _dyld_objc_notify_register and passes load_images as one of the callbacks. Inside load_images, prepare_load_methods and call_load_methods carry out every +load call for the superclasses, subclasses, and categories of the whole project.

The doModInitFunctions function first iterates over the load commands to find those of type LC_SEGMENT_COMMAND. It then walks their sections, and for every section of type S_MOD_INIT_FUNC_POINTERS or S_INIT_FUNC_OFFSETS it iterates over the initializers stored there and executes each one.

typedef void (*Initializer)(int argc, const char* argv[], const char* envp[], const char* apple[], const ProgramVars* vars);
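
To make the pointer-table case concrete, here is a small sketch that treats a block of memory as a packed array of Initializer pointers and calls each entry, deriving the count from the byte size just as the listing does with sect->size / sizeof(uintptr_t). The names runInitSection, runSample, initOne, and initTwo are hypothetical; only the Initializer typedef comes from the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ProgramVars;  // opaque here; it is only passed through

typedef void (*Initializer)(int argc, const char* argv[], const char* envp[],
                            const char* apple[], const ProgramVars* vars);

static_assert(sizeof(Initializer) == sizeof(uintptr_t),
              "this sketch assumes pointer-sized table slots");

std::vector<int> g_ran;  // records which sample initializers ran, in order

void initOne(int, const char*[], const char*[], const char*[], const ProgramVars*) {
    g_ran.push_back(1);
}
void initTwo(int, const char*[], const char*[], const char*[], const ProgramVars*) {
    g_ran.push_back(2);
}

// Calls every function pointer stored in [data, data + sizeInBytes) in order.
void runInitSection(const void* data, size_t sizeInBytes, int argc, const char* argv[]) {
    const Initializer* inits = static_cast<const Initializer*>(data);
    const size_t count = sizeInBytes / sizeof(uintptr_t);  // one pointer per slot
    for (size_t j = 0; j < count; ++j)
        inits[j](argc, argv, nullptr, nullptr, nullptr);
}

// Hypothetical usage: a two-entry table standing in for a section's contents.
std::vector<int> runSample() {
    g_ran.clear();
    const Initializer table[] = { initOne, initTwo };
    const char* argv[] = { "demo", nullptr };
    runInitSection(table, sizeof table, 1, argv);
    return g_ran;
}
```

The __init_offsets case differs only in how each entry is resolved: the table holds 32-bit offsets that are added to the address of the Mach-O header instead of ready-made pointers.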

We set a breakpoint in the main_front function we wrote earlier, run the program, and use bt to look at the call stack.

We can see that main_front is executed from ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&); in other words, C++ static initializers are run by doModInitFunctions.

Let’s use MachOView to look at the structure of libSystem.dylib. (libSystem.dylib could not be located here, so we skip this step for now.)

__attribute__((constructor))

A function marked with __attribute__((constructor)) is executed before main, or when the dynamic library that contains it is loaded. In Mach-O, pointers to the functions marked with __attribute__((constructor)) are stored in the __mod_init_func section of the __DATA segment. What if several functions marked with __attribute__((constructor)) need to run in a particular order? __attribute__((constructor)) supports a priority argument: __attribute__((constructor(1))).
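
A short sketch of the priority behaviour, assuming a GCC or Clang toolchain: constructors with a lower priority number run earlier, and constructors without a priority run after the prioritized ones. The GCC documentation reserves priorities 0 to 100 for the implementation, so the sketch uses 101 and 102 instead of 1. The function names and the g_order log are hypothetical.

```cpp
#include <cassert>

// Plain globals with constant initializers, so they are ready before any
// constructor function runs.
int g_order[3];
int g_count = 0;

// Declared out of order on purpose: the priorities decide the run order.
__attribute__((constructor(102))) static void runsSecond() { g_order[g_count++] = 2; }
__attribute__((constructor))      static void runsLast()   { g_order[g_count++] = 3; }
__attribute__((constructor(101))) static void runsFirst()  { g_order[g_count++] = 1; }
```

By the time main starts, all three functions have already run, so g_order can simply be inspected there.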

Having learned about __attribute__((constructor)), we know that functions marked with it end up in the __mod_init_func section of the __DATA segment. When we search for the libSystem_initializer function in Libsystem-1292.100.5, we can see __attribute__((constructor)) in front of it, so libSystem_initializer lives in the __mod_init_func section of libSystem.dylib. While void ImageLoaderMachO::doModInitFunctions runs it walks the __mod_init_func section, and libSystem_initializer is executed when doModInitFunctions is called for libSystem.dylib.

Once libSystem_initializer has been called, dyld sets gProcessInfo->libSystemInitialized to indicate that libSystem has been initialized.

_dyld_initializer

dyld learns that libSystem has been initialized through the _dyld_initializer function:

// called by libSystem_initializer only
extern void _dyld_initializer(void);

The _dyld_initializer function is called only by libSystem_initializer.

//
// during initialization of libSystem this routine will run and call dyld, 
// registering the helper functions.
//
extern "C" void tlv_initializer();

void _dyld_initializer()
{    
    void (*p)(dyld::LibSystemHelpers*);

    // Get the optimized objc pointer now that the cache is loaded
    // Now that the cache is loaded, get the optimized objc pointer
    
    const dyld_all_image_infos* allInfo = _dyld_get_all_image_infos();
    
    if ( allInfo != nullptr ) {
        const DyldSharedCache* cache = (const DyldSharedCache*)(allInfo->sharedCacheBaseAddress);
        if ( cache != nullptr )
            // Only used to assign gObjCOpt
            gObjCOpt = cache->objcOpt();
    }

    if ( gUseDyld3 ) {
        // If dyld3 is in use, initialize all images in gAllImages
        dyld3::gAllImages.applyInitialImages();

#if TARGET_OS_IOS && !TARGET_OS_SIMULATOR

        // For binaries built before 13.0, set the lookup function if they need it
        // For binaries built before 13.0, set the lookup feature if necessary
        
        if (dyld_get_program_sdk_version() < DYLD_PACKED_VERSION(13,0,0))
            setLookupFunc((void*)&dyld3::compatFuncLookup);
            
#endif

    }
    else {
        _dyld_func_lookup("__dyld_register_thread_helpers", (void**)&p);
        if ( p != NULL )
            // sHelpers is a static global struct: static dyld::LibSystemHelpers sHelpers = { ... }
            p(&sHelpers);
    }
    
    // Call the tlv_initializer function
    tlv_initializer();
}

During libSystem initialization, this routine runs and calls into dyld to register the helper functions (LibSystemHelpers). This ties in with what we saw inside doModInitFunctions: bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL); and bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL); are sampled before and after the initializer call, and together they decide whether to set dyld::gProcessInfo->libSystemInitialized = true;, which indicates that libSystem has been initialized and that it is now safe to use malloc() and other calls in libSystem.dylib.

Initialization of libSystem is internal to libSystem, so how does dyld know that it has finished? libSystem is a special dylib that every executable depends on by default, and dyld provides a dedicated API for it: _dyld_initializer. libSystem calls it during its own initialization, and that is how dyld knows that libSystem has been initialized.
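
The before/after detection can be sketched on its own. This is a simplified model under stated assumptions: Helpers, LoaderState, runInitializer, and the two sample initializers are hypothetical stand-ins, with LoaderState::helpers playing the role of dyld::gLibSystemHelpers.

```cpp
#include <cassert>

// Hypothetical stand-in for dyld::LibSystemHelpers, which in dyld is a
// structure full of function pointers.
struct Helpers { int version; };

struct LoaderState {
    const Helpers* helpers = nullptr;   // role of dyld::gLibSystemHelpers
    bool libSystemInitialized = false;  // role of gProcessInfo->libSystemInitialized
};

using Init = void (*)(LoaderState&);

// Runs one initializer and flips the flag only if the helpers pointer went
// from null to non-null during that call.
void runInitializer(LoaderState& st, Init fn) {
    const bool before = (st.helpers != nullptr);
    fn(st);
    const bool after = (st.helpers != nullptr);
    if (!before && after) st.libSystemInitialized = true;
}

const Helpers kHelpers = {1};

// Behaves like libSystem's initializer: it registers the helpers.
void libSystemLikeInit(LoaderState& st) { st.helpers = &kHelpers; }

// Behaves like any other initializer: it registers nothing.
void ordinaryInit(LoaderState&) {}

// Convenience wrapper: starts from a fresh state and reports the flag.
bool flagAfter(Init fn) {
    LoaderState st;
    runInitializer(st, fn);
    return st.libSystemInitialized;
}
```

The design means the loader never has to ask which image it is running; it only observes a side effect that a single initializer can produce.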

Calling main() from _dyld_start

How does the _dyld_start function in dyld end up calling main() in main.m?

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001045d1ad8 Test_ipa_Simple`main(argc=1, argv=0x000000016b82dc88) at main.mm:89:5
    frame #1: 0x0000000180223cbc libdyld.dylib`start + 4
(lldb) 

The dyldbootstrap::start function returns the result of the dyld::_main function: return dyld::_main((macho_header*)mainExecutableMH, appsSlide, argc, argv, envp, apple, startGlue); The dyld::_main function in turn returns the address of main(), as its comment states:

Entry point for dyld. The kernel loads dyld and jumps to __dyld_start, which sets up some registers and calls this function. Returns the address of main() in the target program, which __dyld_start jumps to.

Let’s dive into the implementation of the dyld::_main function to see what the result is:

// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
...
return result;

Let’s look at the getEntryFromLC_MAIN function implementation:

void* ImageLoaderMachO::getEntryFromLC_MAIN() const
{
    // Number of load commands
    const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
    
    // Skip macho_header addressing to the load command location
    const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
    const struct load_command* cmd = cmds;
    
    // Iterate through load Command
    for (uint32_t i = 0; i < cmd_count; ++i) {
        // Find the load_command of type LC_MAIN
        if ( cmd->cmd == LC_MAIN ) {
            
            // Return the entry
            entry_point_command* mainCmd = (entry_point_command*)cmd;
            void* entry = (void*)(mainCmd->entryoff + (char*)fMachOData);
            
            // <rdar://problem/8543820&9228031> verify entry point is in image
            if ( this->containsAddress(entry) ) {
            
#if __has_feature(ptrauth_calls)
                // start() calls the result pointer as a function pointer so we need to sign it.
                return __builtin_ptrauth_sign_unauthenticated(entry, 0, 0);
#endif

                return entry;
            }
            else
                throw "LC_MAIN entryoff is out of range";
        }
        cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
    }
    return NULL;
}

In other words, it returns the entry point recorded in LC_MAIN, which is the address of main() in the current executable.
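
Note that getEntryFromLC_MAIN adds entryoff to fMachOData (the address where the Mach-O header was mapped) and never touches fSlide, whereas doImageInit adds fSlide to init_address. The two conventions land on the same place, which a little arithmetic shows. The sketch below uses made-up addresses and assumes the header sits at the image’s preferred base with the offset measured from it; the function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// LC_ROUTINES style: an unslid virtual address plus the image's slide.
uint64_t addressViaSlide(uint64_t unslidAddress, uint64_t slide) {
    return unslidAddress + slide;
}

// LC_MAIN style: an offset from wherever the Mach-O header was mapped.
uint64_t addressViaHeader(uint64_t loadedHeader, uint64_t entryoff) {
    return loadedHeader + entryoff;
}

// Hypothetical numbers chosen only for illustration.
constexpr uint64_t kPreferredBase = 0x100000000;  // base address the image was linked at
constexpr uint64_t kLoadedBase    = 0x104000000;  // base address it was actually mapped at
constexpr uint64_t kEntryOff      = 0x3F00;       // offset of main() from the header
constexpr uint64_t kSlide         = kLoadedBase - kPreferredBase;
```

Because the mapped header address already includes the slide, the offset form needs no separate adjustment.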

__dyld_start itself is implemented in assembly. Here is the part that follows the call to dyldbootstrap::start:

// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
bl    __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm

// ⬆️ here is the dyldbootstrap::start call, which returns the main() entry address and is saved in x16

mov    x16,x0                  // save entry point address in x16

#if __LP64__
ldr     x1, [sp]
#else
ldr     w1, [sp]
#endif

cmp    x1, #0
b.ne    Lnew

// LC_UNIXTHREAD way, clean up stack and jump to result
#if __LP64__
add    sp, x28, #8             // restore unaligned stack pointer without app mh
#else
add    sp, x28, #4             // restore unaligned stack pointer without app mh
#endif

// ⬇️ jumps to the entry of the program, the main() function

#if __arm64e__
braaz   x16                     // jump to the program's entry point
#else
br      x16                     // jump to the program's entry point
#endif

// LC_MAIN case, set up stack for call to main()
Lnew:    mov    lr, x1            // simulate return address into _start in libdyld.dylib

// ⬇️ Here are the familiar argc and argv arguments to main

#if __LP64__
ldr    x0, [x28, #8]       // main param1 = argc
add    x1, x28, #16        // main param2 = argv
add    x2, x1, x0, lsl #3
add    x2, x2, #8          // main param3 = &env[0]
mov    x3, x2

After dyldbootstrap::start returns, main() is called. We can also see that the address of main() is read from the load command of type LC_MAIN, which shows that main() is a fixed, hard-wired entry point at the lowest level.

The main function is compiled into the executable under a fixed name, and that fixed entry point is what runs once the program has been loaded into memory. If we rename main, linking fails with the following error: Entry point (_main) undefined. for architecture x86_64

At this point we have walked through the entire flow before main(). That’s a wrap 🎉🎉🎉
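
The LC_MAIN branch of the assembly derives envp from argc and argv alone: add x2, x1, x0, lsl #3 skips argc pointers, and add x2, x2, #8 skips the null pointer that terminates argv. The same step in C++ looks like this; envpFromArgv and kSampleVector are hypothetical names used only for the sketch.

```cpp
#include <cassert>
#include <cstring>

// envp starts right after argv's terminating null pointer, so it sits
// argc + 1 pointer slots past argv. Pointer arithmetic scales by the
// pointer size, which is what `lsl #3` does by hand on a 64-bit target.
const char** envpFromArgv(long argc, const char** argv) {
    return argv + argc + 1;
}

// Hypothetical layout: argv[0], argv[1], NULL, envp[0], NULL.
const char* kSampleVector[] = { "prog", "-v", nullptr, "HOME=/tmp", nullptr };
```

For the sample layout, argc is 2, so the result points at the fourth slot, the first environment string.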
