Preface

In the previous article on dyld loading, we analyzed how dyld produces an image (a Mach-O file). At that point the image has not yet been read into the runtime's memory structures, and it can only be used once it has. How does that happen? This article analyzes the process.

_objc_init

  • In the study of the dyld loading process, the core flow led to _objc_init, where we looked briefly at _dyld_objc_notify_register. Now let's analyze the whole of _objc_init:
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init(); // Read the environment variables that configure the runtime
    tls_init(); // Set up the thread-local storage keys
    static_init(); // Run C++ static constructor functions
    runtime_init(); // Initialize the runtime tables
    exception_init(); // Initialize the exception system
#if __OBJC2__
    cache_t::init(); // Initialize the cache
#endif
    _imp_implementationWithBlock_init(); // Start the callback mechanism

    _dyld_objc_notify_register(&map_images, load_images, unmap_image); // Register the notification callbacks

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

environ_init

  • environ_init reads the environment variables. To see them all, take the code that prints the environment variables out of its guarding conditions so that it runs unconditionally:



Run again, and all the environment variables are printed. There are many of them:

Many environment variables are available, including a familiar one related to nonpointer isa. Environment variables are already used in everyday development, for example to control the output of NSLog. Does nonpointer isa here mean the opposite of a pure isa? Let's add OBJC_DISABLE_NONPOINTER_ISA and print the result to find out.

Testing environment variables
  • Press cmd+shift+, to open Edit Scheme, choose Run -> Arguments, and under Environment Variables add OBJC_DISABLE_NONPOINTER_ISA with the value YES. First, leave the environment variable unchecked.

  • Then print the isa of WSPerson:



  • Check the environment variable again, then rerun and print:



  • After the environment variable is set, isa is a pure isa: it holds only the class pointer and nothing else.
  • You can also set OBJC_PRINT_LOAD_METHODS to see where +load methods are called, which helps when tracking down related problems in a project.
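How an on/off option such as OBJC_DISABLE_NONPOINTER_ISA can be read from the environment may be sketched as follows. This is a minimal illustration, not the runtime's own code: the helper name envFlagEnabled is hypothetical, and the rule that only the exact value YES turns the option on is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of libobjc): returns true when the named
// environment variable is set to exactly "YES".
bool envFlagEnabled(const char *name) {
    assert(name != nullptr);
    const char *value = std::getenv(name);   // nullptr when the variable is unset
    return value != nullptr && std::strcmp(value, "YES") == 0;
}
```

A call such as envFlagEnabled("OBJC_PRINT_LOAD_METHODS") would then report whether that variable was set to YES in the scheme or in the shell that launched the process.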

Viewing from the terminal

  • You can also run export OBJC_HELP=1 in the terminal to view the environment variables:



tls_init

  • tls_init mainly sets up the keys for thread-local storage. The code is as follows:
void tls_init(void)
{
#if SUPPORT_DIRECT_THREAD_KEYS
    pthread_key_init_np(TLS_DIRECT_KEY, &_objc_pthread_destroyspecific);
#else
    _objc_pthread_key = tls_create(&_objc_pthread_destroyspecific);
#endif
}
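The idea behind tls_init, creating one key that every thread uses to look up its own private data, can be sketched with plain pthreads. This is a sketch under assumptions: ThreadData, gDemoKey, demoTlsInit, and demoThreadData are hypothetical names, and it uses pthread_key_create rather than the runtime's direct-key variant.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical per-thread payload; the runtime stores its own structure.
struct ThreadData {
    int value;
};

// One process-wide key; each thread sees its own value behind it.
pthread_key_t gDemoKey;

// Called by pthreads when a thread exits with a non-null value for the key.
void destroyThreadData(void *data) {
    delete static_cast<ThreadData *>(data);
}

// Create the key once and register the destructor (compare tls_init).
bool demoTlsInit() {
    return pthread_key_create(&gDemoKey, &destroyThreadData) == 0;
}

// Return the calling thread's data, allocating it on first use.
ThreadData *demoThreadData() {
    auto *data = static_cast<ThreadData *>(pthread_getspecific(gDemoKey));
    if (data == nullptr) {
        data = new ThreadData{0};
        int rc = pthread_setspecific(gDemoKey, data);
        assert(rc == 0);
        (void)rc; // avoids an unused-variable warning when NDEBUG is defined
    }
    return data;
}
```

The destructor argument plays the same role as _objc_pthread_destroyspecific in the source above: it cleans up a thread's data when that thread exits.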

static_init

  • static_init mainly runs the C++ static constructor functions. libc calls _objc_init before dyld would call the static constructors, so libobjc runs them itself. In other words, the system's C++ constructors run before custom ones.
/***********************************************************************
* static_init
* Run C++ static constructor functions.
* libc calls _objc_init() before dyld would call our static constructors,
* so we have to do it ourselves.
**********************************************************************/
static void static_init()
{
    size_t count;
    auto inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
    auto offsets = getLibobjcInitializerOffsets(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        UnsignedInitializer init(offsets[i]);
        init();
    }
}
  • We can simulate this scenario. Above the _objc_init method, write a C++ constructor function that imitates a system function:
__attribute__((constructor)) void ws_func() {
    printf("🎉 🎉 🎉 %s", __func__);
}
  • Then set a breakpoint in static_init and Step Over. You can see that ws_func has been called, while the +load of WSPerson has not run yet:
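The first loop of static_init, which walks a table of initializer function pointers and calls each one in order, can be sketched in isolation. The names below (Initializer, runInitializers, demoRunInitializers, initA, initB) are hypothetical; the real table comes from the Mach-O via getLibobjcInitializers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Initializer = void (*)();

// Records the order in which the demo initializers ran.
std::vector<int> gInitOrder;

void initA() { gInitOrder.push_back(1); }
void initB() { gInitOrder.push_back(2); }

// Call every initializer in table order, like the first loop of static_init.
void runInitializers(const Initializer *inits, std::size_t count) {
    assert(inits != nullptr || count == 0);
    for (std::size_t i = 0; i < count; i++) {
        inits[i]();
    }
}

// Build a small table, run it, and return the recorded call order.
std::vector<int> demoRunInitializers() {
    static const Initializer table[] = { &initA, &initB };
    gInitOrder.clear();
    runInitializers(table, sizeof(table) / sizeof(table[0]));
    return gInitOrder;
}
```

Because the table is walked from index 0 upward, the call order is fully determined by the table layout, which is why the constructor placed in libobjc runs during static_init and before any +load method.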

runtime_init

  • runtime_init has two main steps: initializing the category table and initializing the table of classes.
void runtime_init(void)
{
    objc::unattachedCategories.init(32); // Initialize the category table
    objc::allocatedClasses.init(); // Initialize the table of classes
}

exception_init

  • exception_init initializes the exception system.

Analysis

  • It mainly registers a callback for exceptions through std::set_terminate. When an exception goes uncaught in lower-level code, this callback fires. The source is as follows:
void exception_init(void)
{
    old_terminate = std::set_terminate(&_objc_terminate);
}
  • The main logic is implemented in _objc_terminate:
static void (*old_terminate)(void) = nil;
static void _objc_terminate(void)
{
    if (PrintExceptions) {
        _objc_inform("EXCEPTIONS: terminating");
    }

    if (!__cxa_current_exception_type()) {
        // No current exception.
        (*old_terminate)();
    }
    else {
        // There is a current exception. Check if it's an objc exception.
        @try {
            __cxa_rethrow(); // Rethrow so that the handlers below can classify the exception
        } @catch (id e) {
            // It's an objc object. Call Foundation's handler, if any.
            (*uncaught_handler)((id)e); // The exception e is passed back through uncaught_handler
            (*old_terminate)();
        } @catch (...) {
            // It's not an objc object. Continue to C++ terminate.
            (*old_terminate)();
        }
    }
}
  • The whole function is a try-catch. When an uncaught Objective-C exception reaches it, the exception e is passed to uncaught_handler.
  • The source code around uncaught_handler is as follows:
static void _objc_default_uncaught_exception_handler(id exception)
{
}
static objc_uncaught_exception_handler uncaught_handler = _objc_default_uncaught_exception_handler;

// objc_setUncaughtExceptionHandler source:
objc_uncaught_exception_handler 
objc_setUncaughtExceptionHandler(objc_uncaught_exception_handler fn)
{
    objc_uncaught_exception_handler result = uncaught_handler;
    uncaught_handler = fn;
    return result;
}
  • From this analysis, objc_setUncaughtExceptionHandler takes a parameter fn and assigns it to uncaught_handler. When an exception occurs, uncaught_handler passes the exception to the outside world.
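The handler swap performed by objc_setUncaughtExceptionHandler (store the new function and return the old one so that it can be restored or chained) can be sketched in plain C++. The names Handler, setHandler, reportUncaught, and countingHandler are hypothetical, and a C string stands in for the exception object.

```cpp
#include <cassert>

using Handler = void (*)(const char *what);

// Default handler: does nothing, like _objc_default_uncaught_exception_handler.
void defaultHandler(const char *) {}

// The currently installed handler.
Handler gHandler = &defaultHandler;

// Install fn and return the previously installed handler.
Handler setHandler(Handler fn) {
    assert(fn != nullptr);
    Handler previous = gHandler;
    gHandler = fn;
    return previous;
}

// What the terminate path would do: forward the exception to the handler.
void reportUncaught(const char *what) {
    gHandler(what);
}

// Demo handler that counts how many exceptions it has seen.
int gReported = 0;
void countingHandler(const char *) { ++gReported; }
```

Returning the previous handler is the design choice that lets several components coexist: a newly installed handler can keep the old one and call it after doing its own work.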

Catch exceptions

Based on this principle, we can simulate the scenario and catch exceptions ourselves:

    1. Define an UncaughtExceptionHanlder class, and then write the handler code:
// .h
@interface UncaughtExceptionHanlder : NSObject

+ (void)installHandler;

@end

// .m
@implementation UncaughtExceptionHanlder

void ws_exceptionHander(NSException *exception) {
    NSLog(@"\n caught an exception 🎉🎉🎉 %@\n", exception);
    NSLog(@" Here we go, buddy.");
}

+ (void)installHandler {
    NSSetUncaughtExceptionHandler(&ws_exceptionHander);
}

@end

    2. Then call the installHandler method in AppDelegate: [UncaughtExceptionHanlder installHandler];
    3. In ViewController, write some code that triggers an exception:
@interface ViewController ()

@property (nonatomic, strong) NSArray *dataSource;

@end

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    self.dataSource = @[@"Kobe", @"Garnett", @"O'Neal", @"Iversen", @"Duncan"];
}

- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    NSString *name = self.dataSource[5]; // Index 5 is out of bounds for 5 elements, so this raises an exception
}

@end
    4. Set a breakpoint in ws_exceptionHander, then run and tap the screen to raise the exception:



  • The exception was then caught, proving that the analysis was correct.

cache_t::init

  • cache_t::init initializes the cache. The source is as follows:
void cache_t::init()
{
#if HAVE_TASK_RESTARTABLE_RANGES
    mach_msg_type_number_t count = 0;
    kern_return_t kr;
    
    while (objc_restartableRanges[count].location) {
        count++;
    }
    // Register the set of restartable ranges for the current task
    kr = task_restartable_ranges_register(mach_task_self(),
                                          objc_restartableRanges, count);

    if (kr == KERN_SUCCESS) return;
    _objc_fatal("task_restartable_ranges_register failed (result 0x%x: %s)", kr, mach_error_string(kr));
#endif // HAVE_TASK_RESTARTABLE_RANGES
}

_imp_implementationWithBlock_init

  • _imp_implementationWithBlock_init starts the callback mechanism. Usually this does nothing, because all the initialization is lazy, but for some processes the trampolines dylib is loaded eagerly. The source is as follows:
void
_imp_implementationWithBlock_init(void)
{
#if TARGET_OS_OSX
    // Eagerly load libobjc-trampolines.dylib in certain processes. Some
    // programs (most notably QtWebEngineProcess used by older versions of
    // embedded Chromium) enable a highly restrictive sandbox profile which
    // blocks access to that dylib. If anything calls
    // imp_implementationWithBlock (as AppKit has started doing) then we'll
    // crash trying to load it. Loading it here sets it up before the sandbox
    // profile is enabled and blocks it.
    //
    // This fixes EA Origin (rdar://problem/50813789)
    // and Steam (rdar://problem/55286131)
    if (__progname &&
        (strcmp(__progname, "QtWebEngineProcess") == 0 ||
         strcmp(__progname, "Steam Helper") == 0)) {
        Trampolines.Initialize();
    }
#endif
}

_dyld_objc_notify_register

  • _dyld_objc_notify_register registers the notification callbacks. What matters is the three parameters passed in: _dyld_objc_notify_register(&map_images, load_images, unmap_image);

load_images

  • As analyzed in the last article, load_images mainly calls the +load methods.

unmap_image

  • dyld calls this function when it removes an image.

map_images

  • This argument is written differently from the other two: it is passed as &map_images. For a function, &map_images and map_images produce the same function pointer, so the difference is only in spelling. map_images is called when a Mach-O image is mapped into memory. Let's take a look at the source code implementation:
void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}
  • The implementation simply forwards to map_images_nolock, which is the key. Its implementation is as follows:



  • We know that a Mach-O image cannot be read until it is loaded into memory, so the next thing to look at is the _read_images method.

_read_images analysis

  • The method body is long. After folding the code, you can see that each section is followed by a log:
void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{

#define EACH_HEADER \
    hIndex = 0;         \
    hIndex < hCount && (hi = hList[hIndex]); \
    hIndex++

    // 1. Condition control for the first load
    if (!doneOnce) { ... }

    // Fix up @selector references
    // 2. Fix up @selector references that got mixed up during precompilation
    static size_t UnfixedSelectors;
    { ... }

    ts.log("IMAGE TIMES: fix up selector references");

    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    // 3. Handle erroneous, mixed-up classes
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

    for (EACH_HEADER) { ... }

    ts.log("IMAGE TIMES: discover classes");

    // 4. Remap classes that were not loaded from the image file
    if (!noClassesRemapped()) { ... }
    ts.log("IMAGE TIMES: remap classes");

#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
// 5. Fix some messages
    for (EACH_HEADER) { ... }

    ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
#endif

    // Discover protocols. Fix up protocol refs.
// 6. Read the protocol when it is in our class
    for (EACH_HEADER) { ... }

    ts.log("IMAGE TIMES: discover protocols");

// 7. Fix the protocol that was not loaded
    for (EACH_HEADER) { ... }

    ts.log("IMAGE TIMES: fix up @protocol references");

    // 8. Category processing
    if (didInitialAttachCategories) { ... }

    ts.log("IMAGE TIMES: discover categories");

// 9. Class loading processing
    for (EACH_HEADER) { ... }

    ts.log("IMAGE TIMES: realize non-lazy classes");

    // 10. Realize classes that have not been processed yet (resolved future classes)
    if (resolvedFutureClasses) { ... }

    ts.log("IMAGE TIMES: realize future classes");

    if (DebugNonFragileIvars) { ... }

    // Print preoptimization statistics
    // Some print
    if (PrintPreopt) { ... }

#undef EACH_HEADER
}
  • Based on the log messages, _read_images can be divided into the following steps:
      1. Condition control for the first load
      2. Fix up @selector references that got mixed up during precompilation
      3. Handle erroneous, mixed-up classes
      4. Remap classes that were not loaded from the image file
      5. Fix up some messages
      6. Read protocols when classes have them
      7. Fix up protocols that were not loaded
      8. Category processing
      9. Class loading processing
      10. Realize classes that have not been processed yet (resolved future classes)
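Most of the steps above iterate over the images with for (EACH_HEADER). The macro packs the three clauses of a for loop into one token, and the loop stops early if a header pointer is null. A minimal sketch of the same trick, with hypothetical names (Header, totalClassCount, EACH_DEMO_HEADER):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for header_info.
struct Header {
    int classCount;
};

// Sum classCount over the list, using a macro that expands to the
// init; condition; increment clauses of a for loop.
int totalClassCount(Header **list, std::size_t count) {
    assert(list != nullptr || count == 0);
    Header *hi = nullptr;
    std::size_t index = 0;
    int total = 0;

#define EACH_DEMO_HEADER \
    index = 0; \
    index < count && (hi = list[index]); \
    index++

    for (EACH_DEMO_HEADER) {
        total += hi->classCount;   // hi was assigned inside the loop condition
    }

#undef EACH_DEMO_HEADER
    return total;
}
```

Assigning hi inside the condition is what lets every loop body in _read_images use hi directly without repeating the lookup.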

if (!doneOnce)

  • This block mainly flips state variables so that it runs only on the first load. The source is as follows:
// 1. Condition control for the first load
if (!doneOnce) {
    doneOnce = YES;
    launchTime = YES;

#if SUPPORT_NONPOINTER_ISA
    // Disable non-pointer isa under some conditions.
    
# if SUPPORT_INDEXED_ISA
    // Disable nonpointer isa if any image contains old Swift code
    for (EACH_HEADER) { ... }
# endif
    
# if TARGET_OS_OSX
    for (EACH_HEADER) { ... }
# endif
#endif
    
    if (DisableTaggedPointers) {
        disableTaggedPointers();
    }

    initializeTaggedPointerObfuscator(); // Obfuscate tagged pointers (small objects)
    
    if (PrintConnecting) { ... }
    
    // namedClasses
    // Preoptimized classes don't go in this table.
    // 4/3 is NXMapTable's load factor
    int namedClassesSize =
    (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
    gdb_objc_realized_classes =
    NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);
    
    ts.log("IMAGE TIMES: first time tasks");
}
  • Here NXCreateMapTable creates a table whose size is 4/3 of the class count. The previous article mentioned that a bucket expands when it is 3/4 full, so why 4/3 here? The reason is simple and can be expressed as a formula: allocated size * 3/4 = number of classes. So the size to allocate when creating the table is the number of classes * 4/3.
  • The allocatedClasses table was also mentioned above. What is the difference? allocatedClasses stores the classes that have been loaded, whereas NXCreateMapTable here creates the master table.
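The 4/3 sizing can be checked with a few lines of arithmetic. In this sketch namedClassesCapacity and exceedsLoadFactor are hypothetical helper names; only the factors 4/3 and 3/4 come from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Capacity to request so that classCount entries fill at most 3/4 of the
// table: capacity * 3/4 = classCount, so capacity = classCount * 4/3.
std::size_t namedClassesCapacity(std::size_t classCount) {
    assert(classCount <= SIZE_MAX / 4);   // guard against overflow in * 4
    return classCount * 4 / 3;
}

// True when `used` entries exceed a 3/4 load factor for `capacity`.
bool exceedsLoadFactor(std::size_t used, std::size_t capacity) {
    return used * 4 > capacity * 3;
}
```

For 300 classes the requested capacity is 400, and 300 entries sit exactly at the 3/4 limit, so the table does not need to grow while the expected classes are inserted.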

size_t UnfixedSelectors

  • This step mainly fixes up @selector references that got mixed up during precompilation. The source is as follows:
// 2. Fix @selector confusion during precompile
static size_t UnfixedSelectors;
{
    mutex_locker_t lock(selLock);
    for (EACH_HEADER) {
        if (hi->hasPreoptimizedSelectors()) continue;
        
        bool isBundle = hi->isBundle();
        SEL *sels = _getObjc2SelectorRefs(hi, &count);
        UnfixedSelectors += count;
        for (i = 0; i < count; i++) {
            const char *name = sel_cname(sels[i]);
            SEL sel = sel_registerNameNoLock(name, isBundle);
            if (sels[i] != sel) {
                sels[i] = sel;
            }
        }
    }
}
  • A sel is a name plus an address. Here each sel read from the image is compared with the registered one, and if they differ the reference is fixed up in place. _getObjc2SelectorRefs reads the sels from the Mach-O, while sel_registerNameNoLock, judging from its source, returns the sel from dyld. Two sels with the same name may have different addresses because their relative positions differ, so the references are fixed up and the sel from dyld prevails.
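The fix-up idea (same name, possibly different address, so rewrite every reference to one canonical address) can be sketched with a string-interning table. SelectorTable, registerName, and fixupSelectorRefs are hypothetical names for this sketch; the runtime's own table and sel_registerNameNoLock work on SEL values rather than C strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the selector table: one canonical copy per name.
class SelectorTable {
public:
    // Return the canonical pointer for name, registering it on first sight.
    const char *registerName(const char *name) {
        assert(name != nullptr);
        // unordered_set never moves its elements, so the pointer stays valid.
        return names_.insert(name).first->c_str();
    }

private:
    std::unordered_set<std::string> names_;
};

// Rewrite each reference to the canonical pointer; return how many changed.
std::size_t fixupSelectorRefs(SelectorTable &table,
                              const char **refs, std::size_t count) {
    std::size_t changed = 0;
    for (std::size_t i = 0; i < count; i++) {
        const char *canonical = table.registerName(refs[i]);
        if (refs[i] != canonical) {
            refs[i] = canonical;
            ++changed;
        }
    }
    return changed;
}
```

After the fix-up, two references to the same selector name compare equal by pointer, which is the property that lets selectors be compared by address instead of by string.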

Discover classes

  • This step handles erroneous, mixed-up classes. The source is as follows:
bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

for (EACH_HEADER) {
    if (!mustReadClasses(hi, hasDyldRoots)) {
        // Image is sufficiently optimized that we need not call readClass()
        continue;
    }

    classref_t const *classlist = _getObjc2ClassList(hi, &count);

    bool headerIsBundle = hi->isBundle();
    bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

    for (i = 0; i < count; i++) {
        Class cls = (Class)classlist[i];
        Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

        if (newCls != cls && newCls) {
            // Class was moved but not deleted. Currently this occurs
            // only when the new class resolved a future class.
            // Non-lazily realize the class below.
            resolvedFutureClasses = (Class *)
                realloc(resolvedFutureClasses,
                        (resolvedFutureClassCount+1) * sizeof(Class));
            resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
        }
    }
}
  • Stepping through with breakpoints shows that the if branch is not taken. According to the comment, the class was moved but not deleted, which only happens when the new class resolved a future class, so this branch can be ignored here.
  • Then, with breakpoints, print cls before and after readClass:



  • Before readClass, the cls taken from the Mach-O is only an address; after readClass, it is associated with a name. To see how this is achieved, look at the source implementation:



  • You can see that it ultimately returns a cls. if (missingWeakSuperclass(cls)) returns nil, so we can skip it.
    • Next, look at if (mangledName != nullptr), which contains ro- and rw-related code. Is this the place we are looking for? Let's verify with the LGPerson class by first writing some code that filters for LGPerson:



    • strcmp compares two strings and returns 0 when they are equal, in which case the branch is entered. Set a breakpoint at the printf, run again, and once the breakpoint is hit, Step Over. The class being processed is now LGPerson, and it does not enter the if (mangledName != nullptr) branch. Instead it goes to the addNamedClass and addClassTableEntry(cls); methods. Let's look at these two methods.

addNamedClass

  • The source of addNamedClass is as follows:
static void addNamedClass(Class cls, const char *name, Class replacing = nil)
{
    runtimeLock.assertLocked();
    Class old;
    if ((old = getClassExceptSomeSwift(name)) && old != replacing) {
        inform_duplicate(name, old, cls);

        // getMaybeUnrealizedNonMetaClass uses name lookups.
        // Classes not found by name lookup must be in the
        // secondary meta->nonmeta table.
        addNonMetaClass(cls);
    } else {
        NXMapInsert(gdb_objc_realized_classes, name, cls);
    }
    ASSERT(!(cls->data()->flags & RO_META));

    // wrong: constructed classes are already realized when they get here
    // ASSERT(!cls->isRealized());
}
  • Stepping down from the breakpoint shows that execution ends up in NXMapInsert, which inserts into the hash map table. Printing cls before and after NXMapInsert shows that NXMapInsert is what associates the address with the name of the class.

addClassTableEntry

  • The source code is as follows:
static void
addClassTableEntry(Class cls, bool addMeta = true)
{
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    auto &set = objc::allocatedClasses.get(); // Get the table of allocated classes

    ASSERT(set.find(cls) == set.end());

    if (!isKnownClass(cls)) // Not yet a known class: insert it into the table
        set.insert(cls);
    if (addMeta) // addMeta is set: the metaclass must be added as well
        addClassTableEntry(cls->ISA(), false); // Add the metaclass, without recursing further
}
  • It mainly adds the class to the table. If the class is not already a known class, it is inserted into the allocatedClasses table, and if addMeta is true, the metaclass of the class is fetched and goes through the same process.

Although rw and ro are mentioned in readClass, the breakpoints show that this code path is not taken. What readClass does here is associate the address of the class with the class name and insert the class into the tables.
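The two insertions performed by readClass can be sketched with standard containers: a name-to-class map standing in for the NXMapTable, and a set of known classes that also receives the metaclass. DemoClass and ClassTables are hypothetical types for illustration, and the isKnownClass check of the real function is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for objc_class: only the isa link is modeled.
struct DemoClass {
    DemoClass *isa;   // points at the metaclass of a class
};

struct ClassTables {
    std::unordered_map<std::string, DemoClass *> named;   // name -> class
    std::unordered_set<DemoClass *> allocated;            // classes and metaclasses

    // Associate the class address with its name (compare addNamedClass).
    void addNamedClass(DemoClass *cls, const std::string &name) {
        assert(cls != nullptr);
        named.emplace(name, cls);   // keeps the first registration on duplicates
    }

    // Insert the class and, when addMeta is true, its metaclass as well
    // (compare addClassTableEntry).
    void addClassTableEntry(DemoClass *cls, bool addMeta = true) {
        assert(cls != nullptr);
        allocated.insert(cls);
        if (addMeta) {
            addClassTableEntry(cls->isa, false);   // do not recurse past the metaclass
        }
    }
};
```

Passing false on the recursive call is what stops the chain after one level, so adding a class registers exactly two entries: the class and its metaclass.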

remapClassRef

// 4. Fix remapping some classes that were not loaded by the image file
if (!noClassesRemapped()) {
    for (EACH_HEADER) {
        Class *classrefs = _getObjc2ClassRefs(hi, &count); // The class in macho
        for (i = 0; i < count; i++) {
            remapClassRef(&classrefs[i]);
        }
        // fixme why doesn't test future1 catch the absence of this?
        classrefs = _getObjc2SuperRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapClassRef(&classrefs[i]);
        }
    }
}
  • This step goes through the class references and superclass references in the Mach-O and uses remapClassRef to fix up any reference whose class has been remapped.
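Fixing up references against a remap table can be sketched as follows. Cls, RemapTable, and remapRef are hypothetical names; the sketch only shows the core idea that a reference is rewritten when its target appears in the table and left alone otherwise.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for a class object.
struct Cls {
    const char *name;
};

// Maps an old class address to the address that replaced it.
using RemapTable = std::unordered_map<Cls *, Cls *>;

// Rewrite *ref when its current target has been remapped.
void remapRef(const RemapTable &table, Cls **ref) {
    assert(ref != nullptr);
    auto it = table.find(*ref);
    if (it != table.end()) {
        *ref = it->second;
    }
}
```

Taking the address of the reference (Cls **), as remapClassRef(&classrefs[i]) does in the source above, is what allows the slot in the reference list itself to be updated.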

objc_msgSend_fixup

  • This step fixes up some messages. The code is as follows:
// Fix some messages
for (EACH_HEADER) {
    message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
    if (count == 0) continue;

    if (PrintVtables) {
        _objc_inform("VTABLES: repairing %zu unsupported vtable dispatch "
                     "call sites in %s", count, hi->fname());
    }
    for (i = 0; i < count; i++) {
        fixupMessageRef(refs+i);
    }
}

discover protocols

  • When classes have protocols, the protocols are read:
for (EACH_HEADER) {
    extern objc_class OBJC_CLASS_$_Protocol;
    Class cls = (Class)&OBJC_CLASS_$_Protocol;
    ASSERT(cls);
    NXMapTable *protocol_map = protocols();
    bool isPreoptimized = hi->hasPreoptimizedProtocols();

    // Skip reading protocols if this is an image from the shared cache
    // and we support roots
    // Note, after launch we do need to walk the protocol as the protocol
    // in the shared cache is marked with isCanonical() and that may not
    // be true if some non-shared cache binary was chosen as the canonical
    // definition
    if (launchTime && isPreoptimized) {
        if (PrintProtocols) {
            _objc_inform("PROTOCOLS: Skipping reading protocols in image: %s",
                         hi->fname());
        }
        continue;
    }

    bool isBundle = hi->isBundle();

    protocol_t * const *protolist = _getObjc2ProtocolList(hi, &count);
    for (i = 0; i < count; i++) {
        readProtocol(protolist[i], cls, protocol_map,
                     isPreoptimized, isBundle);
    }
}
  • This step mainly gets the array of protocols from the Mach-O and then reads each one with the readProtocol method.

fix up @protocol

for (EACH_HEADER) {
    // At launch time, we know preoptimized image refs are pointing at the
    // shared cache definition of a protocol. We can skip the check on
    // launch, but have to visit @protocol refs for shared cache images
    // loaded later.
    if (launchTime && hi->isPreoptimized())
        continue;
    protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
    for (i = 0; i < count; i++) {
        remapProtocolRef(&protolist[i]);
    }
}
  • remapProtocolRef checks whether each protocol reference points at the loaded protocol and, if it does not, fixes it up.

discover categories

if (didInitialAttachCategories) {
    for (EACH_HEADER) {
        load_categories_nolock(hi);
    }
}
  • If there are categories, they are loaded in two steps:
      1. Register the category with its class
      2. If the class is already realized, rebuild the method list of the class

realize non-lazy classes

  • Class loading processing
for (EACH_HEADER) {
    classref_t const *classlist = hi->nlclslist(&count);
    for (i = 0; i < count; i++) {
        Class cls = remapClass(classlist[i]);
        if (!cls) continue;

        addClassTableEntry(cls);

        if (cls->isSwiftStable()) {
            if (cls->swiftMetadataInitializer()) {
                _objc_fatal("Swift class %s with a metadata initializer "
                            "is not allowed to be non-lazy",
                            cls->nameForLogging());
            }
            // fixme also disallow relocatable classes
            // We can't disallow all Swift classes because of
            // classes like Swift.__EmptyArrayStorage
        }
        realizeClassWithoutSwift(cls, nil);
    }
}
  • Stepping through with breakpoints, execution enters two functions: addClassTableEntry and realizeClassWithoutSwift.
  • addClassTableEntry adds a class to the table of all classes. If addMeta is true, the metaclass of the class is automatically added as well. The source is as follows:
static void
addClassTableEntry(Class cls, bool addMeta = true)
{
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    auto &set = objc::allocatedClasses.get();

    ASSERT(set.find(cls) == set.end());

    if (!isKnownClass(cls))
        set.insert(cls);
    if (addMeta)
        addClassTableEntry(cls->ISA(), false);
}
  • realizeClassWithoutSwift handles rw and ro, which will be discussed in detail in the next article.

realize future classes

if (resolvedFutureClasses) {
    for (i = 0; i < resolvedFutureClassCount; i++) {
        Class cls = resolvedFutureClasses[i];
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class is not allowed to be future");
        }
        realizeClassWithoutSwift(cls, nil);
        cls->setInstancesRequireRawIsaRecursively(false/*inherited*/);
    }
    free(resolvedFutureClasses);
}
  • This step realizes the classes that have not been processed yet, and setInstancesRequireRawIsaRecursively marks the class and all of its subclasses as requiring a raw isa pointer.