
IV. Tracing the link between objc and dyld in reverse

As the symbol breakpoint walkthrough above shows, libraries other than dyld sit between _dyld_objc_notify_register and doModInitFunctions. Setting a breakpoint on _objc_init gives the following call stack:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
  * frame #0: 0x00000001002d2d44 libobjc.A.dylib`_objc_init at objc-os.mm:925:9
    frame #1: 0x000000010044b0bc libdispatch.dylib`_os_object_init + 13
    frame #2: 0x000000010045bafc libdispatch.dylib`libdispatch_init + 282
    frame #3: 0x00007fff69543791 libSystem.B.dylib`libSystem_initializer + 220
    frame #4: 0x000000010002f1d3 dyld`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 535
    frame #5: 0x000000010002f5de dyld`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #6: 0x0000000100029ffb dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 493
    frame #7: 0x0000000100029f66 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 344
    frame #8: 0x00000001000280b4 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
    frame #9: 0x0000000100028154 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #10: 0x0000000100016662 dyld`dyld::initializeMainExecutable() + 129
    frame #11: 0x000000010001bbba dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 6667
    frame #12: 0x0000000100015227 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 453
    frame #13: 0x0000000100015025 dyld`_dyld_start + 37

What happens after doModInitFunctions is unknown, that is, the path from doModInitFunctions to _objc_init. The best approach is therefore to start from _objc_init and work backward through everything that calls it.

4.1 _os_object_init

_objc_init is called by _os_object_init, which lives in libdispatch.dylib. Here is the source of _os_object_init:

void _os_object_init(void)
{
    // _objc_init is called here
    _objc_init();
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void (*)(const void *))&objc_retain,
        (void (*)(const void *))&objc_release,
        (void (*)(const void *))&_os_objc_destructInstance
    };
    _Block_use_RR2(&callbacks);
#if DISPATCH_COCOA_COMPAT
    const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
#endif
}

So _objc_init() is indeed called directly from _os_object_init. _os_object_init is in turn called from libdispatch_init:

  • libdispatch_init also handles TLS keys and thread-related setup.

4.2 libSystem_initializer

libSystem_initializer lives in the libSystem library. Download the libSystem source (version 1292.120.1) and search for libSystem_initializer directly:

  • It calls libdispatch_init directly, and also calls __malloc_init, _dyld_initializer, and _libtrace_init. libSystem_initializer is itself called from ImageLoaderMachO::doModInitFunctions, so the trail leads back into dyld and the whole flow is connected.

The following code was found in doModInitFunctions:

  • libSystem must be initialized first. That makes sense, because dispatch and objc are initialized from it and every other image depends on it.
  • func is the call to a C++ constructor.

So where is libSystem_initializer called? No explicit call to it appears in doModInitFunctions, yet the breakpoint backtrace shows that it is called from there:

As analyzed earlier, doModInitFunctions calls C++ constructors, and libSystem_initializer is precisely such a constructor:

So the whole flow fits together: libSystem_initializer is simply the C++ constructor that gets called first.
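
To make the "constructor runs before main" idea concrete, here is a minimal sketch. It assumes a Clang or GCC toolchain, since `__attribute__((constructor))` is a compiler extension; `demoInitializer`, `initLog`, and `initializerAlreadyRan` are hypothetical names used only for illustration and are not part of libSystem.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records which hooks have run. A function-local static is used so the
// vector is guaranteed to exist when the constructor function runs.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for an initializer such as libSystem_initializer.
// The attribute asks the toolchain to register this function as an image
// initializer, so the loader calls it before main.
__attribute__((constructor))
void demoInitializer() {
    assert(initLog().empty());  // nothing else has logged yet
    initLog().push_back("demoInitializer");
}

// Returns true when the initializer has already run (for example when
// called from main).
bool initializerAlreadyRan() {
    return !initLog().empty() && initLog().front() == "demoInitializer";
}
```

Calling `initializerAlreadyRan()` at the top of main should return true, because the loader has already walked the image's initializer list by then.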

The C++ constructors of libSystem, dyld, libobjc, and Foundation all run before those of the main program.

V. A brief analysis of the objc callbacks registered with dyld

_dyld_objc_notify_register is called from _objc_init to register the callbacks.

// First parameter: map_images
sNotifyObjCMapped = mapped;
// Second parameter: load_images
sNotifyObjCInit = init;
// Third parameter: unmap_image
sNotifyObjCUnmapped = unmapped;

The logic of these three callbacks is examined in detail below.

5.1 sNotifyObjCMapped (map_images)

In dyld, sNotifyObjCMapped is only called in notifyBatchPartial:

notifyBatchPartial is called from registerObjCNotifiers, registerImageStateBatchChangeHandler, and notifyBatch. Based on the analysis so far, the call here comes from registerObjCNotifiers, which invokes the callback right after registering it. Set a breakpoint on map_images in the objc source:

This verifies that map_images is invoked immediately after the callback is registered.

map_images calls map_images_nolock directly, which performs the class-related loading work. That logic will be analyzed in a separate article.
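
The "register, then get called back right away" behavior can be sketched as follows. This is a toy model, not dyld's code: `ToyLoader`, its callback types, and the recording helpers are all hypothetical, and the real registerObjCNotifiers does considerably more work.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callback types, loosely modeled on the three dyld callbacks.
using MappedFn   = void (*)(const std::vector<std::string>& paths);
using InitFn     = void (*)(const std::string& path);
using UnmappedFn = void (*)(const std::string& path);

struct ToyLoader {
    std::vector<std::string> loadedImages;  // images mapped before registration
    MappedFn   notifyMapped   = nullptr;
    InitFn     notifyInit     = nullptr;
    UnmappedFn notifyUnmapped = nullptr;

    // Store the callbacks, then immediately replay the already-loaded images
    // to the "mapped" callback, mirroring what was observed for map_images.
    void registerNotifiers(MappedFn mapped, InitFn init, UnmappedFn unmapped) {
        assert(mapped && init && unmapped);
        notifyMapped   = mapped;
        notifyInit     = init;
        notifyUnmapped = unmapped;
        if (!loadedImages.empty()) notifyMapped(loadedImages);
    }
};

// Recording hooks used to observe the replay.
std::vector<std::string> seenMapped;
void recordMapped(const std::vector<std::string>& paths) { seenMapped = paths; }
void ignoreInit(const std::string&) {}
void ignoreUnmapped(const std::string&) {}
```

After `loader.registerNotifiers(recordMapped, ignoreInit, ignoreUnmapped)`, `seenMapped` already holds every image that was loaded before registration, which is the effect seen at the map_images breakpoint.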

5.2 sNotifyObjCInit (load_images)

In dyld, sNotifyObjCInit is called in the following places:

  • 1. In notifySingleFromCache.
  • 2. In notifySingle.
  • 3. In registerObjCNotifiers.

notifySingleFromCache and notifySingle share the same logic; they differ only in whether the cache is involved. In registerObjCNotifiers, the callback is invoked directly when the callback functions are registered. Setting a breakpoint on load_images gives the following trace:

You can see that for the system base libraries, load_images is called immediately after the callback is registered.

For other libraries it is the notifySingle callback logic:

5.2.1 load_images (objc-runtime-new.mm)

sNotifyObjCInit is load_images, which is implemented as follows:

void load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        // Load all categories
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        // Prepare all +load methods
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}

  • Load all categories.
  • Prepare all +load methods.
  • Finally, call call_load_methods.

prepare_load_methods

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    classref_t const *classlist = _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        // Schedule the class's +load method
        schedule_class_load(remapClass(classlist[i]));
    }

    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        // Realize the class
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        // Add the category's +load method
        add_category_to_loadable_list(cat);
    }
}

  • Add the +load methods of the classes.
  • Add the +load methods of the categories.

schedule_class_load

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    ASSERT(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering (recursive call)
    schedule_class_load(cls->getSuperclass());

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED);
}

  • The +load method is scheduled recursively, superclass first, until the superclass is nil.
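
The superclass-first recursion can be sketched in isolation. This is a toy model under stated assumptions: `ToyClass`, `scheduleClassLoad`, and `scheduleAll` are hypothetical names, and a plain `scheduled` flag plays the role of RW_LOADED.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy class record; the real runtime uses objc_class and the RW_LOADED flag.
struct ToyClass {
    std::string name;
    ToyClass*   superclass;
    bool        scheduled;  // plays the role of RW_LOADED
    ToyClass(const std::string& n, ToyClass* s = nullptr)
        : name(n), superclass(s), scheduled(false) {}
};

// Queue superclasses before the class itself, skipping repeats.
void scheduleClassLoad(ToyClass* cls, std::vector<std::string>& order) {
    if (!cls) return;            // recursion stops past the root class
    if (cls->scheduled) return;  // already queued through a subclass
    scheduleClassLoad(cls->superclass, order);
    order.push_back(cls->name);
    cls->scheduled = true;
}

// Convenience wrapper: schedule a list of classes and return the queue order.
std::vector<std::string> scheduleAll(const std::vector<ToyClass*>& classes) {
    std::vector<std::string> order;
    for (ToyClass* cls : classes) {
        assert(cls != nullptr);
        scheduleClassLoad(cls, order);
    }
    return order;
}
```

Whatever order the classes appear in the input list, a superclass always lands in the queue ahead of its subclasses, and the flag keeps it from being queued twice.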

add_class_to_loadable_list & add_category_to_loadable_list

void add_class_to_loadable_list(Class cls)
{
    IMP method;

    loadMethodLock.assertLocked();

    // Get the +load method
    method = cls->getLoadMethod();
    if (!method) return;  // Don't bother if cls has no +load method

    if (PrintLoading) {
        _objc_inform("LOAD: class '%s' scheduled for +load",
                     cls->nameForLogging());
    }

    // Grow the list when the allocated space is used up
    if (loadable_classes_used == loadable_classes_allocated) {
        loadable_classes_allocated = loadable_classes_allocated*2 + 16;
        loadable_classes = (struct loadable_class *)
            realloc(loadable_classes,
                    loadable_classes_allocated *
                    sizeof(struct loadable_class));
    }

    // Store the cls/method pair in loadable_classes
    loadable_classes[loadable_classes_used].cls = cls;
    loadable_classes[loadable_classes_used].method = method;
    loadable_classes_used++;
}

void add_category_to_loadable_list(Category cat)
{
    IMP method;

    loadMethodLock.assertLocked();

    // Get the +load method
    method = _category_getLoadMethod(cat);

    // Don't bother if cat has no +load method
    if (!method) return;

    if (PrintLoading) {
        _objc_inform("LOAD: category '%s(%s)' scheduled for +load",
                     _category_getClassName(cat), _category_getName(cat));
    }

    if (loadable_categories_used == loadable_categories_allocated) {
        loadable_categories_allocated = loadable_categories_allocated*2 + 16;
        loadable_categories = (struct loadable_category *)
            realloc(loadable_categories,
                    loadable_categories_allocated *
                    sizeof(struct loadable_category));
    }

    // Store the cat/method pair in loadable_categories
    loadable_categories[loadable_categories_used].cat = cat;
    loadable_categories[loadable_categories_used].method = method;
    loadable_categories_used++;
}

  • The +load method is found by comparing method names as strings.
  • When space runs out, the list is grown. The new capacity is 2x + 16 entries, where x is the current capacity, and each entry is 16 bytes on a 64-bit platform, so (2x + 16) * 16 bytes are allocated.

struct loadable_class {
  Class cls;  // may be nil
  IMP method;
};

  • The corresponding entries are added to loadable_classes and loadable_categories.
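
The growth rule is easy to check with a few numbers. The sketch below only reproduces the arithmetic; `ToyLoadable`, `nextCapacity`, and `bytesFor` are hypothetical names, and the entry size depends on the platform's pointer size.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as struct loadable_class: two pointer-sized fields.
struct ToyLoadable {
    void* cls;
    void (*method)();
};

// Growth rule applied when the list is full: new capacity = old * 2 + 16.
std::size_t nextCapacity(std::size_t allocated) {
    return allocated * 2 + 16;
}

// Bytes requested from realloc for a given capacity.
std::size_t bytesFor(std::size_t capacity) {
    // Guard against overflow in the multiplication below.
    assert(capacity <= static_cast<std::size_t>(-1) / sizeof(ToyLoadable));
    return capacity * sizeof(ToyLoadable);
}
```

Starting from an empty list, the capacity goes 16, 48, 112, and so on; with 8-byte pointers each entry is 16 bytes, so the first allocation is 256 bytes.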

⚠️ Classes and categories are handled separately during loading. The reasons for this distinction will be analyzed in detail in later articles.

getLoadMethod

IMP objc_class::getLoadMethod()
{
    runtimeLock.assertLocked();

    const method_list_t *mlist;

    // Walk all baseMethods looking for the load method
    mlist = ISA()->data()->ro()->baseMethods();
    if (mlist) {
        for (const auto& meth : *mlist) {
            const char *name = sel_cname(meth.name());
            // Match "load"
            if (0 == strcmp(name, "load")) {
                return meth.imp(false);
            }
        }
    }

    return nil;
}

IMP _category_getLoadMethod(Category cat)
{
    runtimeLock.assertLocked();

    const method_list_t *mlist;

    mlist = cat->classMethods;
    if (mlist) {
        for (const auto& meth : *mlist) {
            const char *name = sel_cname(meth.name());
            if (0 == strcmp(name, "load")) {
                return meth.imp(false);
            }
        }
    }

    return nil;
}

  • The +load method is located by comparing each method name against the string "load".
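
The name-based lookup amounts to a linear scan with strcmp. A minimal sketch, assuming a toy method table (`ToyMethod`, `findLoadMethod`, and the two dummy functions are hypothetical and only stand in for the runtime's method_list_t):

```cpp
#include <cassert>
#include <cstring>
#include <vector>

using ToyImp = void (*)();

// Toy method entry: a selector name and its implementation.
struct ToyMethod {
    const char* name;
    ToyImp      imp;
};

// Scan the class-method list and return the implementation named "load",
// or nullptr when the list has no such method.
ToyImp findLoadMethod(const std::vector<ToyMethod>& classMethods) {
    for (const ToyMethod& m : classMethods) {
        assert(m.name != nullptr);
        if (std::strcmp(m.name, "load") == 0) return m.imp;
    }
    return nullptr;
}

// Dummy implementations used to populate a table.
void toyLoad() {}
void toyInitialize() {}
```

A list containing `{"load", toyLoad}` yields `toyLoad`; a list without it yields `nullptr`, which corresponds to the early return in add_class_to_loadable_list.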

5.2.2 call_load_methods (objc-loadmethod.mm)

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            // Call +load on each class
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        // Within an image, category +loads run after all class +loads.
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

  • call_class_loads is called to run the +load methods of the classes.
  • call_category_loads is then called to run the +load methods of the categories. This also shows that, within an image, category +load methods are called after all class +load methods.
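
The ordering described above can be sketched with two plain queues. This is a simplified model: `ToyLoadQueues`, `drainInto`, and `callLoadMethods` are hypothetical names, and the toy "+load" entries never enqueue further work, so the outer loop runs only once here, whereas the real runtime may loop again when a +load triggers more loading.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for loadable_classes and loadable_categories.
struct ToyLoadQueues {
    std::vector<std::string> classes;
    std::vector<std::string> categories;
};

// Detach a queue (leaving it empty) and record one call per entry.
void drainInto(std::vector<std::string>& queue, std::vector<std::string>& called) {
    std::vector<std::string> detached;
    detached.swap(queue);
    for (const std::string& entry : detached) {
        called.push_back("+[" + entry + " load]");
    }
}

// Drain class entries first, then category entries, and repeat while work
// is left. Returns the order in which the +load stand-ins were called.
std::vector<std::string> callLoadMethods(ToyLoadQueues& q) {
    std::vector<std::string> called;
    do {
        while (!q.classes.empty()) drainInto(q.classes, called);
        drainInto(q.categories, called);
    } while (!q.classes.empty() || !q.categories.empty());
    assert(q.classes.empty() && q.categories.empty());
    return called;
}
```

Detaching the queue before iterating is the same idea as in call_class_loads: new entries added during a pass go into a fresh list and are picked up by the next pass.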

This is where the +load methods are called, which is why +load runs before main.

call_class_loads

static void call_class_loads(void)
{
    int i;

    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    // Reset the count
    loadable_classes_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        // Get the +load method
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        // Call +load
        (*load_method)(cls, @selector(load));
    }

    // Destroy the detached list.
    if (classes) free(classes);
}

  • Internally, it loops over loadable_classes, takes each +load method, and calls it.

call_category_loads

static bool call_category_loads(void)
{
    int i, shift;
    bool new_categories_added = NO;

    // Detach current loadable list.
    struct loadable_category *cats = loadable_categories;
    int used = loadable_categories_used;
    int allocated = loadable_categories_allocated;
    loadable_categories = nil;
    loadable_categories_allocated = 0;
    loadable_categories_used = 0;

    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Category cat = cats[i].cat;
        // Get the +load method
        load_method_t load_method = (load_method_t)cats[i].method;
        Class cls;
        if (!cat) continue;

        cls = _category_getClass(cat);
        if (cls && cls->isLoadable()) {
            if (PrintLoading) {
                _objc_inform("LOAD: +[%s(%s) load]\n",
                             cls->nameForLogging(),
                             _category_getName(cat));
            }
            (*load_method)(cls, @selector(load));
            cats[i].cat = nil;
        }
    }
    ...
}

  • Category +load methods are called the same way: the function loops over loadable_categories, takes each +load method, and calls it. The category version contains more internal processing logic.

This is why, once the +load methods and C++ constructors have run, dyld returns the LC_MAIN entry and main is called. This can be verified with an assembly breakpoint:

This matches what we started with. If you rename the main function, the build fails with an error, because the main program's entry point, main, is hard-coded. You can hook main to hide your own logic. From the analysis above, dyld walks the image list in order: it calls runInitializers starting with the image at index 1 (the calls can be thought of as grouped per image), then the next image's runInitializers, and finally the runInitializers of the main program (index 0). Inside runInitializers, all class +load methods are called first, then all category +load methods, and finally the C++ constructors.

+load is called in objc; doModInitFunctions is called in dyld.
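
The per-image ordering (dependencies first, then +load, then C++ constructors, with the main program last) can be sketched as a small recursion. This is a toy model under stated assumptions: `ToyImage` and this `recursiveInitialization` are hypothetical and only borrow the name from the call stack; the real ImageLoader code tracks more states and handles upward links.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy image with a list of images it depends on.
struct ToyImage {
    std::string            name;
    std::vector<ToyImage*> dependencies;
    bool                   initialized;
    explicit ToyImage(const std::string& n) : name(n), initialized(false) {}
};

// Initialize dependencies first, then this image: the +load notification
// is logged before the C++ constructors.
void recursiveInitialization(ToyImage& image, std::vector<std::string>& log) {
    if (image.initialized) return;
    image.initialized = true;  // set early so dependency cycles terminate
    for (ToyImage* dep : image.dependencies) {
        assert(dep != nullptr);
        recursiveInitialization(*dep, log);
    }
    log.push_back(image.name + ": +load");
    log.push_back(image.name + ": constructors");
}
```

With an app that depends on libobjc and libSystem, and libobjc depending on libSystem, the log starts with libSystem and ends with the app, each image showing +load before constructors.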

⚠️ If protection logic is placed in +load, an attacker can set breakpoints on external symbols and act before +load executes, which bypasses the protection. The most important thing about protection is not letting others find the protection logic; once it is found, it is easy to crack.

5.3 sNotifyObjCUnmapped (unmap_image)

In dyld, sNotifyObjCUnmapped is only called in removeImage:

removeImage is called by checkandAddImage, garbageCollectImages, and _dyld_link_module.

  • garbageCollectImages: called when link and similar steps hit an exception, and during garbage collection.
  • checkandAddImage: checks a loaded image and removes it if it is not in the image list.
  • _dyld_link_module: it is not clear where this is called from.

5.3.1 unmap_image

unmap_image calls unmap_image_nolock, whose core code is as follows:

void unmap_image_nolock(const struct mach_header *mh)
{
    ...
    header_info *hi;
    ...
    // Release class- and category-related resources
    _unload_image(hi);

    // Remove header_info from header list
    removeHeader(hi);
    free(hi);
}

  • Release the resources related to classes and categories.
  • Remove the header information.

VI. dyld3 closure mode analysis

When closure mode is enabled, the launch returns from the closure path, so the core logic is in launchWithClosure:

static bool launchWithClosure(const dyld3::closure::LaunchClosure* mainClosure,
                              const DyldSharedCache* dyldCache,
                              const dyld3::MachOLoaded* mainExecutableMH, uintptr_t mainExecutableSlide,
                              int argc, const char* argv[], const char* envp[], const char* apple[], Diagnostics& diag,
                              uintptr_t* entry, uintptr_t* startGlue, bool* closureOutOfDate, bool* recoverable)
{
    ...
    libDyldEntry->runInitialzersBottomUp((mach_header*)mainExecutableMH);
    ...
}

launchWithClosure calls runInitialzersBottomUp:

void AllImages::runInitialzersBottomUp(const closure::Image* topImage)
{
    // walk closure specified initializer list, already ordered bottom up
    topImage->forEachImageToInitBefore(^(closure::ImageNum imageToInit, bool& stop) {
        // get copy of LoadedImage about imageToInit, but don't keep reference into _loadedImages, because it may move if initialzers call dlopen()
        uint32_t    indexHint = 0;
        LoadedImage loadedImageCopy = findImageNum(imageToInit, indexHint);
        // skip if the image is already inited, or in process of being inited (dependency cycle)
        if ( (loadedImageCopy.state() == LoadedImage::State::fixedUp) && swapImageState(imageToInit, indexHint, LoadedImage::State::fixedUp, LoadedImage::State::beingInited) ) {
            // tell objc to run any +load methods in image
            if ( (_objcNotifyInit != nullptr) && loadedImageCopy.image()->mayHavePlusLoads() ) {
                dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)loadedImageCopy.loadedAddress(), 0, 0);
                const char* path = imagePath(loadedImageCopy.image());
                log_notifications("dyld: objc-init-notifier called with mh=%p, path=%s\n", loadedImageCopy.loadedAddress(), path);
                // +load
                (*_objcNotifyInit)(path, loadedImageCopy.loadedAddress());
            }

            // run all initializers in image (C++ constructors)
            runAllInitializersInImage(loadedImageCopy.image(), loadedImageCopy.loadedAddress());

            // advance state to inited
            swapImageState(imageToInit, indexHint, LoadedImage::State::beingInited, LoadedImage::State::inited);
        }
    });
}

  • _objcNotifyInit ends up calling the +load methods.
  • runAllInitializersInImage calls the C++ constructors, which includes registering the callbacks.

void AllImages::runAllInitializersInImage(const closure::Image* image, const MachOLoaded* ml)
{
    image->forEachInitializer(ml, ^(const void* func) {
        Initializer initFunc = (Initializer)func;
#if __has_feature(ptrauth_calls)
        initFunc = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)initFunc, 0, 0);
#endif
        {
            ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)ml, (uint64_t)func, 0);
            // call the C++ constructor
            initFunc(NXArgc, NXArgv, environ, appleParams, _programVars);
        }
        log_initializers("dyld: called initialzer %p in %s\n", initFunc, image->path());
    });
}
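
The fixedUp, beingInited, inited progression in runInitialzersBottomUp is a guard against running an image's initializers twice. A minimal sketch of that guard follows. It uses `std::atomic` compare-and-swap as a stand-in, which is an assumption made for illustration; how dyld3 actually synchronizes swapImageState is not shown in the excerpt. `ToyLoadedImage`, this `swapImageState`, and `initializeOnce` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

enum class ToyState { fixedUp, beingInited, inited };

struct ToyLoadedImage {
    std::atomic<ToyState> state;
    int                   initializerRuns;
    ToyLoadedImage() : state(ToyState::fixedUp), initializerRuns(0) {}
};

// Move the image from `expected` to `next` only if it is still in `expected`.
bool swapImageState(ToyLoadedImage& image, ToyState expected, ToyState next) {
    return image.state.compare_exchange_strong(expected, next);
}

// Run initializers at most once; return whether this call did the work.
bool initializeOnce(ToyLoadedImage& image) {
    if (!swapImageState(image, ToyState::fixedUp, ToyState::beingInited)) {
        return false;  // already inited, or being inited (dependency cycle)
    }
    ++image.initializerRuns;  // stand-in for +load and the C++ constructors
    bool advanced = swapImageState(image, ToyState::beingInited, ToyState::inited);
    assert(advanced);
    (void)advanced;
    return true;
}
```

A second call to `initializeOnce` on the same image returns false and leaves `initializerRuns` at 1, which is the behavior the "skip if the image is already inited" comment describes.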

_dyld_objc_notify_register() can route to dyld3::_dyld_objc_notify_register:

However, the callbacks that actually fire, and their callers, follow the dyld2 logic. Take a look at the source:

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    if ( gUseDyld3 )
        return dyld3::_dyld_objc_notify_register(mapped, init, unmapped);

    DYLD_LOCK_THIS_BLOCK;
    typedef bool (*funcType)(_dyld_objc_notify_mapped, _dyld_objc_notify_init, _dyld_objc_notify_unmapped);
    static funcType __ptrauth_dyld_function_ptr p = NULL;

    if(p == NULL)
        dyld_func_lookup_and_resign("__dyld_objc_notify_register", &p);
    p(mapped, init, unmapped);
}
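
The second half of this function follows a "look up by name once, cache the pointer, then call through it" pattern. A minimal sketch of that pattern, with every name hypothetical (`toyFuncLookup` merely stands in for the role played by dyld_func_lookup_and_resign and is not its real signature):

```cpp
#include <cassert>
#include <cstring>

using RegisterFn = int (*)(int);

// The implementation that the lookup resolves to.
int toyRegisterImpl(int x) { return x + 1; }

int lookupCount = 0;  // counts how often the name lookup runs

// Hypothetical name-based lookup: fills `out` when the name is known.
bool toyFuncLookup(const char* name, RegisterFn* out) {
    ++lookupCount;
    if (std::strcmp(name, "__toy_register") == 0) {
        *out = toyRegisterImpl;
        return true;
    }
    return false;
}

// Public entry point: resolve the pointer on first use, then reuse it.
int toyRegister(int x) {
    static RegisterFn p = nullptr;  // cached across calls
    if (p == nullptr) {
        bool found = toyFuncLookup("__toy_register", &p);
        assert(found);
        (void)found;
    }
    return p(x);
}
```

Calling `toyRegister` twice performs the name lookup only once; later calls go straight through the cached pointer.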

If gUseDyld3 is false, the dyld2 logic is followed. If the dyld3 path is taken instead, the three registered callback pointers are stored under names that differ from dyld2's:

_objcNotifyMapped   = map;
_objcNotifyInit     = init;
_objcNotifyUnmapped = unmap;
  • _objcNotifyInit, as already established, is called in runInitialzersBottomUp.
  • _objcNotifyUnmapped is called in garbageCollectImages -> removeImages.
  • _objcNotifyMapped is called in runImageCallbacks, which has two callers: applyInitialImages and loadImage.
    • applyInitialImages is called by _dyld_initializer. Part IV established that _dyld_initializer is called in libSystem_initializer. Because _dyld_initializer is called before libdispatch_init, the callback should not have been registered yet at that point.
    • loadImage is called in dlopen.

I found no way to enter closure mode for debugging on a real device, the simulator, or a Mac, and the closure-mode code is hard to read. The conclusions here are therefore drawn from the source alone and may not be accurate.