1. Review

In the previous post, we traced the whole process from dyld to _objc_init to read_images, and identified that class initialization happens in realizeClassWithoutSwift. This post takes a closer look at class loading.

iOS low-level exploration of class loading (I): read_images analysis

2. realizeClassWithoutSwift

In the read_images process, some fix-up work is done on each class: the class name is associated with the class, inserted into the mapping table, and added to the in-memory class table. But we still don't know what happens to rw and ro. When is the class information compiled into the Mach-O file attached to the corresponding cls in memory?

// Category discovery MUST BE Late to avoid potential races
    // when other threads call the new category code before
    // this thread finishes its fixups.

    // +load handled by prepare_load_methods()

    // Realize non-lazy classes (for +load methods and static instances)
    for (EACH_HEADER) {
        classref_t const *classlist = hi->nlclslist(&count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            addClassTableEntry(cls);

            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }

As the comments in the read_images source show, this code initializes non-lazily loaded classes. So what makes a class non-lazily loaded? Implementing the +load method.

  • A non-lazily loaded class is one that implements the +load method.
  • nlclslist returns the list of non-lazily loaded classes.
  • The loop walks this list and initializes each non-lazily loaded class.
  • addClassTableEntry adds the class to the in-memory class table.
  • realizeClassWithoutSwift initializes (realizes) the class.

2.1 realizeClassWithoutSwift source

/***********************************************************************
* realizeClassWithoutSwift
* Performs first-time initialization on class cls,
* including allocating its read-write data.
* Does not perform any Swift-side initialization.
* Returns the real class structure for the class.
* Locking: runtimeLock must be write-locked by the caller
**********************************************************************/
static Class realizeClassWithoutSwift(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    class_rw_t *rw;
    Class supercls;
    Class metacls;

    if (!cls) return nil;
    if (cls->isRealized()) {
        validateAlreadyRealizedClass(cls);
        return cls;
    }
    ASSERT(cls == remapClass(cls));

    // fixme verify class is not in an un-dlopened part of the shared cache?
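    // Debug code added for this post (not part of the runtime source):
    // lets us stop only when the class being realized is JPStudent.
    const char *className = "JPStudent";
    if (strcmp(class_getName(cls), className) == 0)
    {
        printf("hello JPStudent...");
    }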

    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = objc::zalloc<class_rw_t>();
        rw->set_ro(ro);
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        cls->setData(rw);
    }

    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

#if FAST_CACHE_META
    if (isMeta) cls->cache.setBit(FAST_CACHE_META);
#endif

    // Choose an index for this class.
    // Sets cls->instancesRequireRawIsa if indexes no more indexes are available
    cls->chooseClassArrayIndex();

    if (PrintConnecting) {
        _objc_inform("CLASS: realizing class '%s'%s %p %p #%u %s%s",
                     cls->nameForLogging(), isMeta ? " (meta)" : "", 
                     (void*)cls, ro, cls->classArrayIndex(),
                     cls->isSwiftStable() ? "(swift)" : "",
                     cls->isSwiftLegacy() ? "(pre-stable swift)" : "");
    }

    // Realize superclass and metaclass, if they aren't already.
    // This needs to be done after RW_REALIZED is set above, for root classes.
    // This needs to be done after class index is chosen, for root metaclasses.
    // This assumes that none of those classes have Swift contents,
    // or that Swift's initializers have already been called.
    // fixme that assumption will be wrong if we add support
    // for ObjC subclasses of Swift classes.
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        // Metaclasses do not need any features from non pointer ISA
        // This allows for a faspath for classes in objc_retain/objc_release.
        cls->setInstancesRequireRawIsa();
    } else {
        // Disable non-pointer isa for some classes and/or platforms.
        // Set instancesRequireRawIsa.
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;

        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            instancesRequireRawIsa = true;
        }
        else if (!hackedDispatch  &&  0 == strcmp(ro->getName(), "OS_object"))
        {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        }
        else if (supercls  &&  supercls->getSuperclass()  &&
                 supercls->instancesRequireRawIsa())
        {
            // This is also propagated by addSubclass()
            // but nonpointer isa setup needs it earlier.
            // Special case: instancesRequireRawIsa does not propagate
            // from root class to root metaclass
            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }

        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

    // Update superclass and metaclass in case of remapping
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);

    // Reconcile instance variable offsets / layout.
    // This may reallocate class_ro_t, updating our ro variable.
    if (supercls  &&  !isMeta) reconcileInstanceVariables(cls, supercls, ro);

    // Set fastInstanceSize if it wasn't set already.
    cls->setInstanceSize(ro->instanceSize);

    // Copy some flags from ro to rw
    if (ro->flags & RO_HAS_CXX_STRUCTORS) {
        cls->setHasCxxDtor();
        if (! (ro->flags & RO_HAS_CXX_DTOR_ONLY)) {
            cls->setHasCxxCtor();
        }
    }

    // Propagate the associated objects forbidden flag from ro or from
    // the superclass.
    if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||
        (supercls && supercls->forbidsAssociatedObjects()))
    {
        rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;
    }

    // Connect this class to its superclass's subclass lists
    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }

    // Attach categories
    methodizeClass(cls, previously);

    return cls;
}

2.2 Processing of RO and RW

The data address read from the Mach-O file is cast to class_ro_t *. For a normal class, space for rw is then allocated and rw is pointed at ro through set_ro.

auto ro = (const class_ro_t *)cls->data();
auto isMeta = ro->flags & RO_META;
// Check whether it is a metaclass
if (ro->flags & RO_FUTURE) {
    // This was a future class. rw data is already allocated.
    rw = cls->data();
    ro = cls->data()->ro();
    ASSERT(!isMeta);
    cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
} else {
    // Normal class. Allocate writeable class data.
    rw = objc::zalloc<class_rw_t>();
    rw->set_ro(ro);
    rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
    cls->setData(rw);
}

  • ro belongs to clean memory: its contents are fixed at compile time, it is read-only, and it does not change after loading. It holds the class name, methods, protocols, and instance variable information.
  • rw belongs to dirty memory: it is a run-time structure that can be read and written. Because the runtime is dynamic, properties, methods, and protocols can be added to a class, and rw is the memory that changes at run time.

For details, see the WWDC 2020 session on the Objective-C runtime, which describes and analyzes this in depth.
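To make the ro/rw split more concrete, here is a minimal C++ sketch of the "normal class" branch above. It is a toy model, not the runtime's code: ToyRO, ToyRW, ToyClass, toyRealize, toyAddMethod, and the flag value are all hypothetical names invented for illustration. The point it tries to show is that rw holds a pointer to ro, and that run-time changes land in rw while ro stays untouched.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for class_ro_t: fixed at compile time, never mutated.
struct ToyRO {
    uint32_t flags;
    std::string name;
    std::vector<std::string> baseMethods;
};

// Hypothetical stand-in for class_rw_t: allocated at run time, writable.
struct ToyRW {
    uint32_t flags = 0;
    const ToyRO *ro = nullptr;              // points at ro, does not copy it
    std::vector<std::string> addedMethods;  // run-time additions live here
};

constexpr uint32_t TOY_RW_REALIZED = 1u << 31;  // illustrative flag value only

struct ToyClass {
    const ToyRO *compiled;      // what the image provides (clean memory)
    std::unique_ptr<ToyRW> rw;  // empty until the class is realized (dirty memory)

    bool isRealized() const { return rw && (rw->flags & TOY_RW_REALIZED); }
};

// Mirrors the "normal class" branch: allocate rw, point it at ro, mark realized.
inline void toyRealize(ToyClass &cls) {
    if (cls.isRealized()) return;
    auto rw = std::make_unique<ToyRW>();
    rw->ro = cls.compiled;
    rw->flags = TOY_RW_REALIZED;
    cls.rw = std::move(rw);
}

// Run-time changes touch only rw; ro is left exactly as compiled.
inline void toyAddMethod(ToyClass &cls, const std::string &sel) {
    toyRealize(cls);
    cls.rw->addedMethods.push_back(sel);
}
```

Calling toyAddMethod on an unrealized ToyClass first realizes it and then records the new method in rw only, which is the clean/dirty distinction described above.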

2.3 Class processing

Parent class and metaclass handling

    // Recursively realize the parent class and the metaclass
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);


Next, the runtime decides whether instances of the class use a non-pointer isa. With non-pointer isa (pointer optimization) enabled, the lowest bit of isa is 1. Metaclasses and a few special cases do not enable pointer optimization; they use a raw isa, whose lowest bit is 0.

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        // The metaclass isa is a pure pointer.
        cls->setInstancesRequireRawIsa();
    } else {
        // Whether isa is a pure pointer (bit 13 in flags)
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;
        // DisableNonpointerIsa reflects the OBJC_DISABLE_NONPOINTER_ISA environment variable
        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            // With the environment variable set to YES, isa is a pure pointer.
            instancesRequireRawIsa = true;
        }
        // The OS_object class uses a pure pointer
        else if (!hackedDispatch  &&  0 == strcmp(ro->getName(), "OS_object"))
        {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        }
        // The parent class requires a pure pointer and itself has a parent class,
        // so this class must use a pure pointer too.
        // rawIsaIsInherited marks that the requirement is inherited.
        else if (supercls  &&  supercls->getSuperclass()  &&
                 supercls->instancesRequireRawIsa())
        {

            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }
        // Recursively mark this class and its subclasses as requiring a pure-pointer isa
        // (if the parent class uses a pure pointer, so do its subclasses).
        // rawIsaIsInherited only controls logging.
        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

  • Recursively realize the parent class and the metaclass.
  • Decide whether isa is a pure pointer:
  • The metaclass isa is always a pure pointer.
  • For a regular class, the flags, the environment variable, and the OS_object special case decide whether isa is a pure pointer.
  • If the parent class uses a pure pointer and itself has a parent class, this class uses a pure pointer too. rawIsaIsInherited (which only controls logging) marks that the requirement is inherited.
  • The pure-pointer requirement is set recursively, so subclasses of a pure-pointer class also use a pure pointer.
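The decision rules above can be condensed into a small pure function. This is only a sketch under stated assumptions: ToyIsaInputs, ToyIsaDecision, and toyDecideRawIsa are hypothetical names, the inputs are plain booleans standing in for the runtime queries named in the comments, and the one-shot hackedDispatch guard from the real source is deliberately left out.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical inputs; in the runtime these come from cls, ro and supercls.
struct ToyIsaInputs {
    bool isMeta;
    bool alreadyRequiresRawIsa;  // stands in for cls->instancesRequireRawIsa()
    bool nonpointerIsaDisabled;  // stands in for DisableNonpointerIsa
    const char *name;            // stands in for ro->getName()
    bool hasSuper;               // supercls != nil
    bool superHasSuper;          // supercls->getSuperclass() != nil
    bool superRequiresRawIsa;    // supercls->instancesRequireRawIsa()
};

struct ToyIsaDecision {
    bool rawIsa;     // instances must use a pure-pointer isa
    bool inherited;  // the requirement came from the superclass
};

inline ToyIsaDecision toyDecideRawIsa(const ToyIsaInputs &in) {
    // Metaclasses always use a raw isa.
    if (in.isMeta) return {true, false};

    ToyIsaDecision d{in.alreadyRequiresRawIsa, false};
    if (in.nonpointerIsaDisabled) {
        d.rawIsa = true;                       // disabled by environment
    } else if (std::strcmp(in.name, "OS_object") == 0) {
        d.rawIsa = true;                       // libdispatch special case
    } else if (in.hasSuper && in.superHasSuper && in.superRequiresRawIsa) {
        d.rawIsa = true;                       // inherited from a non-root superclass
        d.inherited = true;
    }
    return d;
}
```

Note the superHasSuper condition: a class whose superclass is a root class does not inherit the requirement, which matches the "does not propagate from root class" comment in the source.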

Associate the class with its parent class and metaclass


    // Link the class to its parent class and metaclass, i.e. the inheritance chain and the isa chain.
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);

2.4 Debugging in code

Add the following code to realizeClassWithoutSwift so that we can stop on the class we want to debug:

    const char *className = "JPStudent";
    if (strcmp(class_getName(cls), className) == 0)
    {
        printf("hello JPStudent...");
    }

Set a breakpoint there and print ro in LLDB before and after it is assigned, to see how its data changes.

  • ro printed before assignment:
hello JPStudent...
(lldb) p cls
(Class) $3 = 0x00000001000084e0
(lldb) p cls
(Class) $4 = JPStudent
(lldb) p ro
(const class_ro_t *) $5 = 0x00007ffeefbff190
(lldb) p *$5
(const class_ro_t) $6 = {
  flags = 4022333872
  instanceStart = 32766
  instanceSize = 0
  reserved = 0
   = {
    ivarLayout = 0x0000000100008508 "\xe0\x84"
    nonMetaclass = JPStudent
  }
  name = {
    std::__1::atomic<const char* > ="\xe0\x84" {
      Value = 0x0000000100008508 "\xe0\x84"
    }
  }
  baseMethodList = 0x00007ffeefbff1e0
  baseProtocols = 0x00000001003269a0
  ivars = 0x00000001000084e0
  weakIvarLayout = 0x0000000100008508 "\xe0\x84"
  baseProperties = 0x000000010036d080
  _swiftMetadataInitializer_NEVER_USE = {}
}

  • ro printed after assignment:
(lldb) p ro
(const class_ro_t *) $9 = 0x00000001000081b8
(lldb) p *$9
(const class_ro_t) $10 = {
  flags = 0
  instanceStart = 8
  instanceSize = 24
  reserved = 0
   = {
    ivarLayout = 0x0000000000000000
    nonMetaclass = nil
  }
  name = {
    std::__1::atomic<const char* > ="JPStudent" {
      Value = 0x0000000100003d2d "JPStudent"
    }
  }
  baseMethodList = 0x0000000100008200
  baseProtocols = nil
  ivars = 0x0000000100008340
  weakIvarLayout = 0x0000000000000000
  baseProperties = 0x0000000100008388
  _swiftMetadataInitializer_NEVER_USE = {}
}
(lldb) p $10.baseMethodList
(void *const) $11 = 0x0000000100008200
(lldb) p *$11
(lldb) 

Printing in LLDB does not show the contents of the method list (baseMethodList is displayed as a void * here), even though JPStudent has methods. At this point we only have the ro data structure and the address of its method list; sel and imp have not been bound yet.

At the end of realizeClassWithoutSwift, the following code is called. Judging from the comment, it handles categories. The cls argument is the class itself, and previously is nil when the call comes from _read_images.

// Attach categories
methodizeClass(cls, previously);

Next, let's look at methodizeClass.

3. methodizeClass analysis

3.1 methodizeClass source

/***********************************************************************
* methodizeClass
* Fixes up cls's method list, protocol list, and property list.
* Attaches any outstanding categories.
* Locking: runtimeLock must be held by the caller
**********************************************************************/
static void methodizeClass(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro();
    auto rwe = rw->ext();

    // Methodizing for the first time
    if (PrintConnecting) {
        _objc_inform("CLASS: methodizing class '%s' %s", 
                     cls->nameForLogging(), isMeta ? "(meta)" : "");
    }

    // Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls), nullptr);
        if (rwe) rwe->methods.attachLists(&list, 1);
    }

    property_list_t *proplist = ro->baseProperties;
    if (rwe && proplist) {
        rwe->properties.attachLists(&proplist, 1);
    }

    protocol_list_t *protolist = ro->baseProtocols;
    if (rwe && protolist) {
        rwe->protocols.attachLists(&protolist, 1);
    }

    // Root classes get bonus method implementations if they don't have 
    // them already. These apply before category replacements.
    if (cls->isRootMetaclass()) {
        // root metaclass
        addMethod(cls, @selector(initialize), (IMP)&objc_noop_imp, "", NO);
    }

    // Attach categories.
    if (previously) {
        if (isMeta) {
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_METACLASS);
        } else {
            // When a class relocates, categories with class methods
            // may be registered on the class itself rather than on
            // the metaclass. Tell attachToClass to look for those.
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_CLASS_AND_METACLASS);
        }
    }
    objc::unattachedCategories.attachToClass(cls, cls,
                                             isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

#if DEBUG
    // Debug: sanity-check all SELs; log method list contents
    for (const auto& meth : rw->methods()) {
        if (PrintConnecting) {
            _objc_inform("METHOD %c[%s %s]", isMeta ? '+' : '-', 
                         cls->nameForLogging(), sel_getName(meth.name()));
        }
        ASSERT(sel_registerName(sel_getName(meth.name())) == meth.name());
    }
#endif
}

  • Get rw, ro, and rwe, and the isMeta flag
  • Get the method list
  • Get the property list
  • Get the protocol list
  • If the class is the root metaclass, add the initialize method
  • Handle categories

When debugging methodizeClass, rwe is NULL: it has not been created yet, so the attachLists calls for methods, properties, and protocols are skipped.

3.2 prepareMethodLists

The prepareMethodLists method is called when method_list_t is processed. The core code is as follows:

// Add method lists to array.
    // Reallocate un-fixed method lists.
    // The new methods are PREPENDED to the method list array.

    for (int i = 0; i < addedCount; i++) {
        method_list_t *mlist = addedLists[i];
        ASSERT(mlist);

        // Fixup selectors if necessary
        if (!mlist->isFixedUp()) {
            fixupMethodList(mlist, methodsFromBundle, true/*sort*/);
        }
    }

  • addedCount is 1 and addedLists is a method_list_t ** pointing at that single list, so mlist is ro's base method list.

  • If the list has not been fixed up yet, fixupMethodList fixes it up and sorts ro's method list.

  • fixupMethodList

fixupMethodList fixes up each method's name (SEL) and sorts the list by SEL address, which is what the binary search in the slow message lookup path relies on.

  • Sorting validation

Print the methods before and after sorting with the following code (shown here for the post-sort case):

for (auto& meth : *mlist) {
    const char *name = sel_cname(meth.name());
    printf("After sorting: %s -- %p\n", name, meth.name());
}

Oddly, the order is the same before and after sorting. My guess is that the list was already in sorted order, so there was nothing left to reorder, or that it had been sorted when the methods were added.
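As a rough illustration of why sorting by SEL address matters, the sketch below sorts a toy method list by selector address and then finds a method with a binary search. ToyMethod, toyFixupMethodList, and toyFindMethod are hypothetical names, and selectors and IMPs are modeled as plain integers; this is a simplified model of the idea, not the runtime's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for method_t: a selector address and an IMP address.
struct ToyMethod {
    uintptr_t sel;
    uintptr_t imp;
};

// Sort by selector address; stable, so equal selectors keep their order.
inline void toyFixupMethodList(std::vector<ToyMethod> &list) {
    std::stable_sort(list.begin(), list.end(),
                     [](const ToyMethod &a, const ToyMethod &b) {
                         return a.sel < b.sel;
                     });
}

// Binary search over the sorted list; returns nullptr when sel is absent.
inline const ToyMethod *toyFindMethod(const std::vector<ToyMethod> &list,
                                      uintptr_t sel) {
    auto it = std::lower_bound(list.begin(), list.end(), sel,
                               [](const ToyMethod &m, uintptr_t key) {
                                   return m.sel < key;
                               });
    return (it != list.end() && it->sel == sel) ? &*it : nullptr;
}
```

If the selectors already happen to be in ascending address order, toyFixupMethodList leaves the list unchanged, which would be consistent with the observation above that the printed order did not change.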

4. Exploring categories

rwe is still NULL after prepareMethodLists, so the subsequent attachLists calls are skipped. According to WWDC, rwe appears when a class has categories, so let's create a category.

4.1 Underlying structure of a category

Use clang to view the structure of a category:

struct _category_t {
    const char *name;
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;
    const struct _method_list_t *class_methods;
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};

  • A category is also a struct type.
  • name should be the name of the category.
  • cls points to the class the category extends.
  • A class has a single methods list, but a category has both instance_methods and class_methods, because a category has no metaclass of its own to hold the class methods.
  • A category also has properties.

The same structure can also be found in the Xcode help documentation.

Let's search the objc source code for objc_category to verify this.

It matches! This confirms that a category is a struct under the hood.

4.2 Exploring the category source code

From the analysis of the methodizeClass source, the core logic for handling categories is in attachLists and attachToClass. rwe is obtained from rw->ext() as follows:

auto rwe = rw->ext();

  • ext()
class_rw_ext_t *ext() const {
    return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>(&ro_or_rw_ext);
}

class_rw_ext_t *extAllocIfNeeded() {
    // Get rwe
    auto v = get_ro_or_rwe();
    if (fastpath(v.is<class_rw_ext_t *>())) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext);
    } else {
        // Create rwe
        return extAlloc(v.get<const class_ro_t *>(&ro_or_rw_ext));
    }
}

rwe is created in extAllocIfNeeded. Searching the objc source for calls to extAllocIfNeeded shows that the call in attachCategories fits our case. attachCategories is in turn called from attachToClass and load_categories_nolock.
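The "ro pointer or rwe" slot and its lazy allocation can be modeled with std::variant. This is a simplified sketch under assumptions: MiniRO, MiniRWE, MiniRW, and miniAttachCategory are hypothetical names, method lists are plain strings, and the real runtime uses its own pointer-union type rather than std::variant. It requires C++17.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for class_ro_t.
struct MiniRO {
    std::vector<std::string> baseMethods;
};

// Hypothetical stand-in for class_rw_ext_t.
struct MiniRWE {
    const MiniRO *ro;
    std::vector<std::string> methods;
};

class MiniRW {
    // Either a plain ro pointer, or an owned rwe that itself points at ro.
    std::variant<const MiniRO *, std::unique_ptr<MiniRWE>> roOrRwe;

public:
    explicit MiniRW(const MiniRO *ro) : roOrRwe(ro) {}

    // Returns the rwe if it exists, otherwise nullptr (never allocates).
    MiniRWE *ext() const {
        auto p = std::get_if<std::unique_ptr<MiniRWE>>(&roOrRwe);
        return p ? p->get() : nullptr;
    }

    // Returns the rwe, creating it from ro on first use.
    MiniRWE *extAllocIfNeeded() {
        if (MiniRWE *rwe = ext()) return rwe;
        const MiniRO *ro = std::get<const MiniRO *>(roOrRwe);
        auto rwe = std::make_unique<MiniRWE>();
        rwe->ro = ro;
        rwe->methods = ro->baseMethods;  // seed with the class's own methods
        roOrRwe = std::move(rwe);
        return ext();
    }
};

// Attaching a category is what forces rwe into existence in this model;
// category methods are placed in front of the existing ones.
inline void miniAttachCategory(MiniRW &rw,
                               const std::vector<std::string> &catMethods) {
    MiniRWE *rwe = rw.extAllocIfNeeded();
    rwe->methods.insert(rwe->methods.begin(),
                        catMethods.begin(), catMethods.end());
}
```

Before any category is attached, ext() returns nullptr, which corresponds to the rwe being NULL during the methodizeClass debugging session earlier.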

So there are two paths by which categories are loaded:

  • methodizeClass --> attachToClass --> attachCategories
  • load_images --> loadAllCategories --> load_categories_nolock --> attachCategories

5. Summary

  • Lazy and non-lazy classes differ in whether the class implements the +load method.
  • Non-lazy classes: all class data is loaded during map_images: _getObjc2NonlazyClassList --> readClass --> realizeClassWithoutSwift --> methodizeClass
  • Lazy classes: data loading is delayed until the first message: lookUpImpOrForward --> realizeClassMaybeSwiftMaybeRelock --> realizeClassWithoutSwift --> methodizeClass
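The difference between the two paths can be sketched as a tiny simulation. Everything here is hypothetical and only models the timing: LazyClass, lazyRealize, lazyMapImages, and lazySendMessage are invented names, and JPTeacher is an assumed second class without +load.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a class: only the fields needed to show the timing.
struct LazyClass {
    std::string name;
    bool hasLoadMethod;  // implementing +load makes the class non-lazy
    bool realized;
};

// Realizes the class once and records when it happened.
inline void lazyRealize(LazyClass &cls, std::vector<std::string> &log) {
    if (cls.realized) return;
    cls.realized = true;
    log.push_back("realize " + cls.name);
}

// Image-load time: only classes that implement +load are realized.
inline void lazyMapImages(std::vector<LazyClass> &classes,
                          std::vector<std::string> &log) {
    for (auto &cls : classes) {
        if (cls.hasLoadMethod) lazyRealize(cls, log);
    }
}

// First message: a lazy class is realized on demand, then dispatch proceeds.
inline void lazySendMessage(LazyClass &cls, std::vector<std::string> &log) {
    lazyRealize(cls, log);
    log.push_back("dispatch to " + cls.name);
}
```

In this model, a class with +load is realized inside lazyMapImages, while a class without it stays unrealized until lazySendMessage is first called on it.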

More to come
