Preface

The previous articles established how dyld processes images at the lowest level: each image is mapped into the process, but at that point it is only a library identified by name. Its data has not yet been loaded into the runtime's in-memory structures, so the methods, protocols, member variables and so on cannot be read. We also know that these image files are in Mach-O format, so how does the system load them into memory? The approach is to read the Mach-O data into memory, store it in table-like structures in which each entry corresponds to a class, and then fill in each class's contents through ro and rw.

Initialization preparations

environ_init(): reads the environment variables that affect the runtime. It can also print help for those variables if requested.

tls_init(): sets up the thread key bindings, such as the destructors for per-thread data.

static_init(): runs libobjc's own C++ static constructors. libc calls _objc_init() before dyld would call these static constructors, so libobjc runs them itself instead of waiting for dyld.

lock_init(): has an empty implementation; the locks rely on C++ features.

exception_init(): initializes libobjc's exception-handling system.

runtime_init(): initializes the runtime environment, namely the unattachedCategories and allocatedClasses tables.

cache_init(): performs the cache initialization.

_imp_implementationWithBlock_init(): initializes the trampoline machinery. Normally this does very little because everything is initialized lazily, but for some processes the trampolines dylib is loaded eagerly.
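
To make the static_init() step more concrete, here is a minimal sketch of a library running its own list of initializers in order rather than waiting for the loader. The names (Initializer, kOwnInitializers, init_locks, init_tables) are invented for illustration and are not libobjc's.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Initializer = void (*)();          // hypothetical: one static constructor entry

static std::vector<int> g_order;         // records which initializers ran, in order

void init_locks()  { g_order.push_back(1); }
void init_tables() { g_order.push_back(2); }

// Hypothetical list of the library's own initializers.
static const Initializer kOwnInitializers[] = { init_locks, init_tables };

// Run the initializers ourselves, in declaration order.
std::size_t run_own_static_initializers() {
    std::size_t count = sizeof(kOwnInitializers) / sizeof(kOwnInitializers[0]);
    for (std::size_t i = 0; i < count; ++i) {
        kOwnInitializers[i]();
    }
    return count;
}
```

The point of the sketch is only the ordering: the library decides when its constructors run, so later setup steps can rely on them.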

map_images

With the preparation done, the key step is the registration with dyld:

_dyld_objc_notify_register(&map_images, load_images, unmap_image);

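map_images is passed by pointer (&map_images): dyld stores the function's address, not a copy of a value, so every later call goes to the same function in libobjc. load_images and unmap_image are stored as function pointers in the same way.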
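
A minimal sketch of this registration pattern, assuming a toy loader that stores three callbacks and invokes them later. Everything here (Registry, register_callbacks, loader_simulate, the on_* functions) is hypothetical and only mirrors the shape of the real call.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the three callback types.
using MappedFn   = void (*)(unsigned count, const char* const paths[]);
using InitFn     = void (*)(const char* path);
using UnmappedFn = void (*)(const char* path);

struct Registry {                 // hypothetical: what the loader remembers
    MappedFn   mapped;
    InitFn     init;
    UnmappedFn unmapped;
};

static Registry g_registry = { nullptr, nullptr, nullptr };
static std::vector<std::string> g_log;   // records calls for inspection

// The runtime registers its callbacks once, at start-up.
void register_callbacks(MappedFn m, InitFn i, UnmappedFn u) {
    g_registry.mapped = m;
    g_registry.init = i;
    g_registry.unmapped = u;
}

void on_mapped(unsigned count, const char* const paths[]) {
    for (unsigned k = 0; k < count; ++k) {
        g_log.push_back(std::string("map:") + paths[k]);
    }
}
void on_init(const char* path)     { g_log.push_back(std::string("init:") + path); }
void on_unmapped(const char* path) { g_log.push_back(std::string("unmap:") + path); }

// The loader side: report the mapped images, then initialize each one.
void loader_simulate(const char* const paths[], unsigned count) {
    if (g_registry.mapped) g_registry.mapped(count, paths);
    for (unsigned k = 0; k < count; ++k) {
        if (g_registry.init) g_registry.init(paths[k]);
    }
}
```

Calling register_callbacks(&on_mapped, on_init, on_unmapped) stores three function pointers; writing the & or leaving it out yields the same pointer for a function.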

_read_images

map_images calls map_images_nolock, which is where we can see how images are added to the tables. Inside map_images_nolock, _read_images is, as the name implies, the key function that does the reading.

The whole process

doneOnce

UnfixedSelectors

Selector fix-up

static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }

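The selector addresses recorded in the Mach-O file are not accurate: they are compile-time values and are offset when the image is loaded. Each reference is therefore rewritten to the selector registered at runtime, and dyld's value is taken as the standard.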
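
The fix-up can be pictured as string interning: every selector name maps to one canonical pointer, and each image's references are rewritten to it. The sketch below is a simplified model under that assumption; Sel, register_selector and fix_selector_refs are hypothetical names, not the runtime's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

using Sel = const char*;   // hypothetical: a selector is a unique C-string pointer

// Intern table: one canonical string per distinct selector name.
static std::unordered_set<std::string> g_selectorTable;

Sel register_selector(const char* name) {
    // insert() returns an iterator to the existing element if the name is present.
    // Elements of an unordered_set keep their address, so the pointer stays valid.
    auto it = g_selectorTable.insert(name).first;
    return it->c_str();
}

// Rewrite every selector reference of one "image" to the canonical pointer.
// Returns how many references had to be changed.
std::size_t fix_selector_refs(std::vector<Sel>& refs) {
    std::size_t changed = 0;
    for (Sel& ref : refs) {
        Sel canonical = register_selector(ref);
        if (ref != canonical) {
            ref = canonical;
            ++changed;
        }
    }
    return changed;
}
```

After the fix-up, two images that both mention the same selector name hold the same pointer, so selectors can be compared by address instead of by string.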

readClass

    for (EACH_HEADER) {
        if (!mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs 
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses, 
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

You can see here that there is a readClass method

Class readClass(Class cls, bool headerIsBundle, bool headerIsPreoptimized)
{
    const char *mangledName = cls->nonlazyMangledName();
    
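    // Debugging aid added for this study: react only to our own class.
    const char *NXPersonName = "NXPerson";

    if (mangledName && strcmp(mangledName, NXPersonName) == 0) {
        // Set a breakpoint here to study how our own class is read.
        printf("%s -NX: class under study: - %s\n", __func__, mangledName);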
    }
    
    if (missingWeakSuperclass(cls)) {
        // No superclass (probably weak-linked). 
        // Disavow any knowledge of this subclass.
        if (PrintConnecting) {
            _objc_inform("CLASS: IGNORING class '%s' with "
                         "missing weak-linked superclass", 
                         cls->nameForLogging());
        }
        addRemappedClass(cls, nil);
        cls->setSuperclass(nil);
        return nil;
    }
    
    cls->fixupBackwardDeployingStableSwift();

    Class replacing = nil;
    if (mangledName != nullptr) {
        if (Class newCls = popFutureNamedClass(mangledName)) {
            // This name was previously allocated as a future class.
            // Copy objc_class to future class's struct.
            // Preserve future's rw data block.

            if (newCls->isAnySwift()) {
                _objc_fatal("Can't complete future class request for '%s' "
                            "because the real class is too big.",
                            cls->nameForLogging());
            }

            class_rw_t *rw = newCls->data();
            const class_ro_t *old_ro = rw->ro();
            memcpy(newCls, cls, sizeof(objc_class));

            // Manually set address-discriminated ptrauthed fields
            // so that newCls gets the correct signatures.
            newCls->setSuperclass(cls->getSuperclass());
            newCls->initIsa(cls->getIsa());

            rw->set_ro((class_ro_t *)newCls->data());
            newCls->setData(rw);
            freeIfMutable((char *)old_ro->getName());
            free((void*)old_ro);

            addRemappedClass(cls, newCls);
            replacing = cls;
            cls = newCls;
        }
    }

    if (headerIsPreoptimized && !replacing) {
        // class list built in shared cache
        // fixme strict assert doesn't work because of duplicates
        // ASSERT(cls == getClass(name));
        ASSERT(mangledName == nullptr || getClassExceptSomeSwift(mangledName));
    } else {
        if (mangledName) { //some Swift generic classes can lazily generate their names
            addNamedClass(cls, mangledName, replacing);
        } else {
            Class meta = cls->ISA();
            const class_ro_t *metaRO = meta->bits.safe_ro();
            ASSERT(metaRO->getNonMetaclass() && "Metaclass with lazy name must have a pointer to the corresponding nonmetaclass.");
            ASSERT(metaRO->getNonMetaclass() == cls && "Metaclass nonmetaclass pointer must equal the original class.");
        }
        addClassTableEntry(cls);
    }

    // for future reference: shared cache never contains MH_BUNDLEs
    if (headerIsBundle) {
        cls->data()->flags |= RO_FROM_BUNDLE;
        cls->ISA()->data()->flags |= RO_FROM_BUNDLE;
    }
    
    return cls;
}


At this point you might guess that readClass is where ro and rw are handled. But a breakpoint conditioned on our own class shows otherwise: our class never enters the branch that touches ro and rw. Execution only reaches addNamedClass and addClassTableEntry.

addNamedClass inserts the class into the gdb_objc_realized_classes hash map. The name passed in is the mangledName read above.

static void addNamedClass(Class cls, const char *name, Class replacing = nil)
{
    runtimeLock.assertLocked();
    Class old;
    if ((old = getClassExceptSomeSwift(name))  &&  old != replacing) {
        inform_duplicate(name, old, cls);

        // getMaybeUnrealizedNonMetaClass uses name lookups.
        // Classes not found by name lookup must be in the
        // secondary meta->nonmeta table.
        addNonMetaClass(cls);
    } else {
        NXMapInsert(gdb_objc_realized_classes, name, cls);
    }
    ASSERT(!(cls->data()->flags & RO_META));

    // wrong: constructed classes are already realized when they get here
    // ASSERT(!cls->isRealized());
}

So far the class has only been added to the tables; my guess is that it is added to the tables first and ro and rw are handled afterwards. addClassTableEntry also registers the metaclass. Since nothing else happens in readClass, we move on to the other parts of _read_images and use the same breakpoint technique to find the key function.
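
As a rough model of the name table used by addNamedClass, the sketch below keeps a name-to-class map and pushes duplicates into a secondary list instead of overwriting the first registration. ClassInfo, g_namedClasses, g_secondaryTable and the function names are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

struct ClassInfo { std::string name; };   // hypothetical stand-in for a class

static std::unordered_map<std::string, ClassInfo*> g_namedClasses;  // name -> class
static std::vector<ClassInfo*> g_secondaryTable;                    // duplicates go here

// Returns true if cls became the entry for `name`, false if it was a duplicate.
bool add_named_class(ClassInfo* cls, const std::string& name,
                     ClassInfo* replacing = nullptr) {
    auto it = g_namedClasses.find(name);
    if (it != g_namedClasses.end() && it->second != replacing) {
        // A different class already owns this name: keep the first one.
        g_secondaryTable.push_back(cls);
        return false;
    }
    g_namedClasses[name] = cls;
    return true;
}

ClassInfo* look_up_class(const std::string& name) {
    auto it = g_namedClasses.find(name);
    if (it == g_namedClasses.end()) return nullptr;
    return it->second;
}
```

Note that nothing here looks inside the class; the table only makes the class findable by name, which matches what the breakpoint showed.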

static void addClassTableEntry(Class cls, bool addMeta = true)
{
    runtimeLock.assertLocked();

    // This class is allowed to be a known class via the shared cache or via
    // data segments, but it is not allowed to be in the dynamic table already.
    auto &set = objc::allocatedClasses.get();

    ASSERT(set.find(cls) == set.end());

    if (!isKnownClass(cls))
        set.insert(cls);
    if (addMeta)
        addClassTableEntry(cls->ISA(), false);
}

Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

After readClass the runtime knows the class (its name and address), but the contents of the class are not yet fully known.
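
The class table step can be sketched in the same spirit: a set of known class addresses, where adding a class also adds its metaclass exactly once. Cls, g_allocatedClasses and the function names below are hypothetical.

```cpp
#include <cassert>
#include <unordered_set>

struct Cls { Cls* isa; };   // hypothetical: only the link to the metaclass matters here

static std::unordered_set<Cls*> g_allocatedClasses;

bool is_known_class(Cls* cls) {
    return g_allocatedClasses.count(cls) != 0;
}

void add_class_table_entry(Cls* cls, bool addMeta = true) {
    if (!is_known_class(cls)) {
        g_allocatedClasses.insert(cls);
    }
    if (addMeta) {
        // Register the metaclass too, without recursing any further.
        add_class_table_entry(cls->isa, false);
    }
}
```

The addMeta flag is what stops the recursion after one level, so a class and its metaclass are recorded together.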

realizeClassWithoutSwift

Setting a breakpoint in _read_images conditioned on our own class shows that it goes through realizeClassWithoutSwift, which is where the class and its related classes are wired up, so let's take a closer look.

static Class realizeClassWithoutSwift(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    class_rw_t *rw;
    Class supercls;
    Class metacls;

    if (!cls) return nil;
    if (cls->isRealized()) {
        validateAlreadyRealizedClass(cls);
        return cls;
    }
    ASSERT(cls == remapClass(cls));

    // fixme verify class is not in an un-dlopened part of the shared cache?

    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = objc::zalloc<class_rw_t>();
        rw->set_ro(ro);
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        cls->setData(rw);
    }

    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

#if FAST_CACHE_META
    if (isMeta) cls->cache.setBit(FAST_CACHE_META);
#endif

    // Choose an index for this class.
    // Sets cls->instancesRequireRawIsa if indexes no more indexes are available
    cls->chooseClassArrayIndex();

    if (PrintConnecting) {
        _objc_inform("CLASS: realizing class '%s'%s %p %p #%u %s%s",
                     cls->nameForLogging(), isMeta ? " (meta)" : "", 
                     (void*)cls, ro, cls->classArrayIndex(),
                     cls->isSwiftStable() ? "(swift)" : "",
                     cls->isSwiftLegacy() ? "(pre-stable swift)" : "");
    }

    // Realize superclass and metaclass, if they aren't already.
    // This needs to be done after RW_REALIZED is set above, for root classes.
    // This needs to be done after class index is chosen, for root metaclasses.
    // This assumes that none of those classes have Swift contents,
    // or that Swift's initializers have already been called.
    // fixme that assumption will be wrong if we add support
    // for ObjC subclasses of Swift classes.
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        // Metaclasses do not need any features from non pointer ISA
        // This allows for a faspath for classes in objc_retain/objc_release.
        cls->setInstancesRequireRawIsa();
    } else {
        // Disable non-pointer isa for some classes and/or platforms.
        // Set instancesRequireRawIsa.
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;

        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            instancesRequireRawIsa = true;
        }
        else if (!hackedDispatch  &&  0 == strcmp(ro->getName(), "OS_object"))
        {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        }
        else if (supercls  &&  supercls->getSuperclass()  &&
                 supercls->instancesRequireRawIsa())
        {
            // This is also propagated by addSubclass()
            // but nonpointer isa setup needs it earlier.
            // Special case: instancesRequireRawIsa does not propagate
            // from root class to root metaclass
            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }

        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

    // Update superclass and metaclass in case of remapping
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);

    // Reconcile instance variable offsets / layout.
    // This may reallocate class_ro_t, updating our ro variable.
    if (supercls  &&  !isMeta) reconcileInstanceVariables(cls, supercls, ro);

    // Set fastInstanceSize if it wasn't set already.
    cls->setInstanceSize(ro->instanceSize);

    // Copy some flags from ro to rw
    if (ro->flags & RO_HAS_CXX_STRUCTORS) {
        cls->setHasCxxDtor();
        if (!(ro->flags & RO_HAS_CXX_DTOR_ONLY)) {
            cls->setHasCxxCtor();
        }
    }

    // Propagate the associated objects forbidden flag from ro or from
    // the superclass.
    if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||
        (supercls && supercls->forbidsAssociatedObjects()))
    {
        rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;
    }

    // Connect this class to its superclass's subclass lists
    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }

    // Attach categories
    methodizeClass(cls, previously);

    return cls;
}

Set a breakpoint before stepping in and print ro: the method list shows no data yet, which means the data has not been loaded into the class. On the first pass we take the else branch, which allocates rw and stores ro in it with rw->set_ro(ro); this is the ro -> rw step.
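
The else branch can be modeled as follows: before realization the class only has its read-only data, and realization allocates a writable record that points back at it. This is a simplified sketch; ClassRO, ClassRW, Cls and realize are invented names and the real structures carry far more state.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

struct ClassRO {                         // hypothetical read-only, compile-time data
    std::string name;
    std::vector<std::string> baseMethods;
    std::uint32_t instanceSize;
};

struct ClassRW {                         // hypothetical writable, runtime data
    const ClassRO* ro = nullptr;         // rw points at ro; ro itself is not copied
    bool realized = false;
};

struct Cls {
    const ClassRO* ro;                   // before realization: only ro is known
    std::unique_ptr<ClassRW> rw;         // after realization: access goes through rw
};

Cls* realize(Cls* cls) {
    if (!cls) return nullptr;
    if (cls->rw && cls->rw->realized) return cls;   // already realized: nothing to do
    cls->rw.reset(new ClassRW());                   // allocate the writable data
    cls->rw->ro = cls->ro;                          // the "ro -> rw" step
    cls->rw->realized = true;
    return cls;
}
```

Calling realize twice on the same class leaves the first rw in place, which mirrors the early return for classes that are already realized.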

Stepping further down, the superclass and the metaclass are realized recursively, but there is still no visible operation on ro and rw.

    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

So far we have been looking for the place where ro and rw are filled in, but nothing can be printed all the way down to the return, so the guess is that methodizeClass does the work; printing ro and rw at the start of that function shows no values either. (The metaclass needs to be checked separately here as well.)

methodizeClass

When the class has a method list, methodizeClass calls prepareMethodLists, and inside that you can see a call to fixupMethodList.

static void
fixupMethodList(method_list_t *mlist, bool bundleCopy, bool sort)
{
    runtimeLock.assertLocked();
    ASSERT(!mlist->isFixedUp());

    // fixme lock less in attachMethodLists ?
    // dyld3 may have already uniqued, but not sorted, the list
    if (!mlist->isUniqued()) {
        mutex_locker_t lock(selLock);

        // Unique selectors in list.
        for (auto& meth : *mlist) {
            const char *name = sel_cname(meth.name());
            meth.setName(sel_registerNameNoLock(name, bundleCopy));
        }
    }

    // Sort by selector address.
    // Don't try to sort small lists, as they're immutable.
    // Don't try to sort big lists of nonstandard size, as stable_sort
    // won't copy the entries properly.
    if (sort && !mlist->isSmallList() && mlist->entsize() == method_t::bigSize) {
        method_t::SortBySELAddress sorter;
        std::stable_sort(&mlist->begin()->big(), &mlist->end()->big(), sorter);
    }
    
    // Mark method list as uniqued and sorted.
    // Can't mark small lists, since they're immutable.
    if (!mlist->isSmallList()) {
        mlist->setFixedUp();
    }
}

1. Register each method's selector name so that it is unique.

2. Sort the methods by selector address.

But I still can't print out ro and rw.
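
The two steps can be sketched with standard containers: intern each selector name, then stable-sort the list by selector address. The lookup function is my own addition to show why an address-sorted list is convenient (it permits a binary search); Method, unique_sel, fixup_method_list and find_method are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

struct Method {                 // hypothetical, simplified method entry
    const char* sel;            // selector: a unique pointer once fixed up
    int impId;                  // stands in for the implementation pointer
};

static std::unordered_set<std::string> g_selTable;

// Step 1 helper: return the canonical pointer for a selector name.
const char* unique_sel(const char* name) {
    return g_selTable.insert(name).first->c_str();
}

void fixup_method_list(std::vector<Method>& list) {
    for (Method& m : list) {
        m.sel = unique_sel(m.sel);                       // 1. unique the names
    }
    std::stable_sort(list.begin(), list.end(),           // 2. sort by address
        [](const Method& a, const Method& b) {
            return reinterpret_cast<std::uintptr_t>(a.sel) <
                   reinterpret_cast<std::uintptr_t>(b.sel);
        });
}

// With an address-sorted list, a lookup can use binary search.
const Method* find_method(const std::vector<Method>& list, const char* sel) {
    std::uintptr_t key = reinterpret_cast<std::uintptr_t>(sel);
    auto it = std::lower_bound(list.begin(), list.end(), key,
        [](const Method& m, std::uintptr_t k) {
            return reinterpret_cast<std::uintptr_t>(m.sel) < k;
        });
    if (it != list.end() && it->sel == sel) return &*it;
    return nullptr;
}
```

The lookup key must itself be a uniqued selector, which is why step 1 has to happen before step 2 is of any use.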

Lazy classes

By default realizeClassWithoutSwift is not reached from _read_images; it is reached there only when the class implements the +load method.

That path runs at load time, but most classes are loaded only when we need them; there is no need to load them all up front. We also know that loading a class must go through realizeClassWithoutSwift. So where is it called from when a class is loaded lazily?

Set a breakpoint for our class inside realizeClassWithoutSwift and print the stack with bt. The trace leads back to lookUpImpOrForward -> realizeClassMaybeSwiftMaybeRelock.

🤔 After main is entered, the first message sent to a class, whether alloc or any other class method, ends up in lookUpImpOrForward, so the class is loaded the first time one of its methods is used.
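
A toy model of this lazy path, assuming a lookup function that realizes the class on first use and never again afterwards. LazyClass, realize_class, look_up_imp and answer_imp are hypothetical and stand in for the much richer runtime logic.

```cpp
#include <cassert>
#include <map>
#include <string>

using Imp = int (*)();                       // hypothetical implementation pointer

int answer_imp() { return 42; }              // a sample method body

struct LazyClass {                           // hypothetical class record
    std::string name;
    bool realized = false;
    int realizeCount = 0;                    // how many times realization ran
    std::map<std::string, Imp> methods;      // filled in when the class is realized
};

void realize_class(LazyClass& cls) {
    cls.methods["answer"] = answer_imp;      // stands in for attaching method lists
    cls.realized = true;
    ++cls.realizeCount;
}

// Reached when a message is sent and the method must be looked up.
Imp look_up_imp(LazyClass& cls, const std::string& sel) {
    if (!cls.realized) {
        realize_class(cls);                  // first use triggers loading
    }
    auto it = cls.methods.find(sel);
    if (it == cls.methods.end()) return nullptr;
    return it->second;
}
```

The cost of realization is paid once, by whichever message happens to arrive first.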

RWE

rwe is an extension of rw that holds class information added dynamically at runtime, including categories and so on. To explore it, we can write a category, compile the main file with xcrun and inspect the underlying implementation, and then look in the source for how rwe is actually implemented; that is, approach it from the outside and break through from the inside.

As the compiled output shows, a category is also a structure. How does the category's structure get loaded into the main class?

In methodizeClass there is a comment that says Attach categories, followed by a call to attachToClass. attachToClass in turn calls attachCategories, where rwe is obtained by calling extAllocIfNeeded.
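
The idea behind extAllocIfNeeded can be sketched as an extension record that does not exist until something is added at runtime. This is a simplified model under my own assumptions (RO, RWE, RW, ext_alloc_if_needed, attach_category are illustrative names), including the assumption that category methods are placed ahead of the base methods.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct RO  { std::vector<std::string> baseMethods; };   // compile-time data
struct RWE { std::vector<std::string> methods; };       // runtime additions

struct RW {
    const RO* ro = nullptr;
    std::unique_ptr<RWE> ext;                           // absent until needed

    // Allocate the extension on first use, seeded with the ro method list.
    RWE& ext_alloc_if_needed() {
        if (!ext) {
            ext.reset(new RWE());
            ext->methods = ro->baseMethods;
        }
        return *ext;
    }

    // Readers use the extension when present, otherwise fall back to ro.
    const std::vector<std::string>& methods() const {
        if (ext) return ext->methods;
        return ro->baseMethods;
    }
};

void attach_category(RW& rw, const std::vector<std::string>& categoryMethods) {
    RWE& rwe = rw.ext_alloc_if_needed();
    // Assumption for this sketch: category methods go in front of the base ones.
    rwe.methods.insert(rwe.methods.begin(),
                       categoryMethods.begin(), categoryMethods.end());
}
```

A class with no categories never pays for the extension, and the read-only data is left untouched either way.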

Searching globally for attachCategories finds its callers: besides attachToClass there is load_categories_nolock, so categories are loaded through one of these two functions. When are these two functions called? That is for next time.

Afterword

I have been rather busy with work recently, so there has not been much time to write summaries. Class loading is still very important: it is a process of tracing abstractions and finding methods.