Preface

The previous article analyzed the whole flow _objc_init -> map_images -> _read_images and found that class loading is completed in the realizeClassWithoutSwift function. This article continues that analysis.

Preparation

  • dyld source code.
  • objc4-818.2 source code.

1: The realizeClassWithoutSwift function

Classes that implement the +load method (non-lazy classes) are passed to this function from _read_images. The function realizes the class, recursively realizes its superclass and metaclass, allocates the class's read-write data, and returns the real class structure.

static Class realizeClassWithoutSwift(Class cls, Class previously)
{

    class_rw_t *rw;
    Class supercls;
    Class metacls;

    ...
    // 1. Generate rw data
    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = objc::zalloc<class_rw_t>();
        rw->set_ro(ro);
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        cls->setData(rw);
    }
    
    // 2. Cache initialization
    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

#if FAST_CACHE_META
    // metaclass processing
    if (isMeta) cls->cache.setBit(FAST_CACHE_META);
#endif
    
    // Choose an index for this class.
    // Sets cls->instancesRequireRawIsa if indexes no more indexes are available
    // Designed for 32 bits, objc_indexed_classes_count is index
    // Store it in objc_indexed_classes (where index-CLS is recorded)
    // Whether isa is pure pointer
    cls->chooseClassArrayIndex();

    ...
    // 3. Associate the superclass and metaclass
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        // Metaclasses do not need any features from non pointer ISA
        // This allows for a faspath for classes in objc_retain/objc_release.
        cls->setInstancesRequireRawIsa();
    } else {
        // Disable non-pointer isa for some classes and/or platforms.
        // Set instancesRequireRawIsa.
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;

        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            instancesRequireRawIsa = true;
        }
        else if (!hackedDispatch && 0 == strcmp(ro->getName(), "OS_object"))
        {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        }
        else if (supercls  &&  supercls->getSuperclass()  &&
                 supercls->instancesRequireRawIsa())
        {
            // This is also propagated by addSubclass()
            // but nonpointer isa setup needs it earlier.
            // Special case: instancesRequireRawIsa does not propagate
            // from root class to root metaclass
            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }

        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

    // Update superclass and metaclass in case of remapping
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);
    
    // 4. Adjust ivars
    // Reconcile instance variable offsets / layout.
    // This may reallocate class_ro_t, updating our ro variable.
    if (supercls && !isMeta) reconcileInstanceVariables(cls, supercls, ro);

    // Set fastInstanceSize if it wasn't set already.
    cls->setInstanceSize(ro->instanceSize);
    
    // 5. Synchronize the flags bit
    // Copy some flags from ro to rw
    if (ro->flags & RO_HAS_CXX_STRUCTORS) {
        cls->setHasCxxDtor();
        if (!(ro->flags & RO_HAS_CXX_DTOR_ONLY)) {
            cls->setHasCxxCtor();
        }
    }

    // Propagate the associated objects forbidden flag from ro or from
    // the superclass.
    if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||
        (supercls && supercls->forbidsAssociatedObjects()))
    {
        rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;
    }
    
    // 6. Connect the class's inheritance chain
    // Connect this class to its superclass's subclass lists
    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }
    
    // 7. Attach categories
    // Attach categories
    methodizeClass(cls, previously);
    
    return cls;
}
  1. Generate the rw data from ro.
  2. Initialize the cache.
  3. Associate the superclass and metaclass. The superclass and metaclass are realized recursively and then associated with the current class.
  4. Adjust the offsets/layout of the ivars.
  5. Synchronize flag bits from ro to rw.
  6. Connect the class's inheritance chain.
  7. Attach categories to the main class. This logic lives in the methodizeClass function. Each step is analyzed separately below.

1.1: Generating the rw data

    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;


    // Start of debugging code
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "HPObject") == 0 && !isMeta) {
        printf("%s %s\n", __func__, mangledName);
    }
    // End of debugging code


    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        // Future class (its memory was moved but not freed); not the lazy-loading path.
        // The rw data already exists.
        rw = cls->data();
        ro = cls->data()->ro();
        ASSERT(!isMeta);
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        // Normal class: rw data has not been allocated yet, so allocate it.
        rw = objc::zalloc<class_rw_t>();
        // Store ro in rw. This is why ro is later read through data()->ro().
        rw->set_ro(ro);
        // flags is a uint32_t: 1<<31 | 1<<19 | 1<<0 (bit 0 only for a metaclass)
        rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
        // Store rw in data; from now on data() returns the rw data.
        cls->setData(rw);
    }
  • ro is first obtained through cls->data(). In the earlier chapter on class structure, data() returned the rw data. It returns ro here because rw has not been assigned yet: at this point data() still holds the value read from the class's data field in the Mach-O file, for the classes listed in the __objc_classlist or __objc_nlclslist section. This can be verified by comparing the address returned by cls->data() before and after the assignment.
  • Future classes:
    • For a future class, rw and ro are read again from the existing data, and changeInfo is called to modify the rw flags.
  • Normal classes:
    • For a normal class, rw space is allocated and ro is stored in rw. (Only the address is stored in rw; Apple has optimized this area accordingly.)
    • The rw flags are set. flags is a uint32_t: bit 31 indicates that realization is complete (RW_REALIZED), bit 19 indicates that realization is in progress (RW_REALIZING), and bit 0 indicates whether the class is a metaclass (1 for a metaclass, 0 otherwise).
    • rw is stored in data.
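
The flag composition described above can be tried in isolation. The following is a minimal sketch, assuming the bit positions listed in the bullets (bit 31, bit 19, bit 0); makeRwFlags is a hypothetical helper for illustration, not a runtime function.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions as described in the text (assumed to match objc4-818.2).
constexpr uint32_t RW_REALIZED  = 1u << 31; // realization is complete
constexpr uint32_t RW_REALIZING = 1u << 19; // realization is in progress
constexpr uint32_t RO_META      = 1u << 0;  // class is a metaclass

// Hypothetical helper mirroring `rw->flags = RW_REALIZED|RW_REALIZING|isMeta`.
constexpr uint32_t makeRwFlags(uint32_t roFlags) {
    uint32_t isMeta = roFlags & RO_META;   // keeps only bit 0 of the ro flags
    return RW_REALIZED | RW_REALIZING | isMeta;
}
```

For the HPObject class printed later (ro flags = 128), bit 0 is clear, so only bits 31 and 19 end up set in the rw flags.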

⚠️ Because a metaclass and its class have the same name, both are passed to realizeClassWithoutSwift. The isMeta check is added to the debugging code so that the metaclass does not interfere with debugging.

The ro data is already generated at compile time by LLVM.

class_ro_t is defined in LLVM.

class_ro_t::Read:

  • Compute size from the member variables of the class_ro_t structure.
  • Read a buffer of size bytes from the incoming addr.
  • Extract the data at that address through the buffer.
  • Assign the values to the member variables of the class_ro_t structure.
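
The four steps above can be sketched as follows. This is only an illustration under stated assumptions: RoHeader, kRoHeaderSize and readRoHeader are hypothetical names, a byte vector stands in for the inspected process memory, and only three fields are read. It is not the LLVM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified stand-in for the fields read from memory.
struct RoHeader {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
};

// Step 1: the size is the sum of the field sizes.
constexpr size_t kRoHeaderSize = 3 * sizeof(uint32_t);

// Steps 2-4: copy kRoHeaderSize bytes starting at addr into a buffer,
// then extract each field from the buffer at its running offset.
inline bool readRoHeader(const std::vector<uint8_t> &memory, size_t addr, RoHeader &out) {
    if (addr > memory.size() || memory.size() - addr < kRoHeaderSize) return false;

    uint8_t buffer[kRoHeaderSize];
    std::memcpy(buffer, memory.data() + addr, kRoHeaderSize);

    size_t cursor = 0;
    std::memcpy(&out.flags, buffer + cursor, sizeof(uint32_t));
    cursor += sizeof(uint32_t);
    std::memcpy(&out.instanceStart, buffer + cursor, sizeof(uint32_t));
    cursor += sizeof(uint32_t);
    std::memcpy(&out.instanceSize, buffer + cursor, sizeof(uint32_t));
    return true;
}
```

The bounds check comes first so that a short read fails cleanly instead of producing a partially filled structure.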

The class_ro_t::Read function is called by Read_class_row.

LLDB debugging verifies that ro data is available when the class is loaded into memory:

(lldb) p ro
(const class_ro_t *) $0 = 0x00000001000080a8
(lldb) p *$0
(const class_ro_t) $1 = {
  flags = 128
  instanceStart = 8
  instanceSize = 8
  reserved = 0
   = {
    ivarLayout = 0x0000000000000000
    nonMetaclass = nil
  }
  name = {
    std::__1::atomic<const char *> = "HPObject" {
      Value = 0x0000000100003f2e "HPObject"
    }
  }
  baseMethodList = 0x00000001000080f0
  baseProtocols = 0x0000000000000000
  ivars = 0x0000000000000000
  weakIvarLayout = 0x0000000000000000
  baseProperties = 0x0000000000000000
  _swiftMetadataInitializer_NEVER_USE = {}
}
(lldb) p $1.baseMethodList
(void *const) $2 = 0x00000001000080f0
(lldb) p *$2
(lldb) p $1.baseMethods()
(method_list_t *) $3 = 0x00000001000080f0
(lldb) p *$3
(method_list_t) $4 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 24, count = 1)
}
(lldb) p $4.get(0).big()
(method_t::big) $5 = {
  name = "instanceMethod"
  types = 0x0000000100003f57 "v16@0:8"
  imp = 0x0000000100003ea0 (HPObjcTest`-[HPObject instanceMethod])
}
  • After obtaining ro, printing baseMethodList directly shows no data, so call the baseMethods() function to fetch the list instead.
  • Then call get(0).big() to get the details of the method.

1.1.1: Exploring how cls->data() returns ro data

Let’s first look at the definition of the data() function:

The data() function defined in struct objc_class:

class_rw_t *data() const {
    return bits.data();
}
  • What is actually called is the data() function of struct class_data_bits_t.
class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
  • bits & FAST_DATA_MASK is computed, cast to class_rw_t *, and returned.

bits is of type uintptr_t (unsigned long), i.e. the pointer's address stored as an integer, and on x86_64 FAST_DATA_MASK is 0x00007ffffffffff8UL. The masked value is then cast to a class_rw_t pointer and returned.
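
The masking step can be illustrated on its own. This is a minimal sketch, assuming the x86_64 mask value 0x00007ffffffffff8; dataAddress is a hypothetical helper, and uint64_t is used instead of uintptr_t so the sketch also compiles on 32-bit hosts.

```cpp
#include <cassert>
#include <cstdint>

// Assumed x86_64 mask: clears the low 3 flag bits and the unused high bits.
constexpr uint64_t FAST_DATA_MASK = 0x00007ffffffffff8ULL;

// Hypothetical helper: strips the flag bits and returns the stored address.
constexpr uint64_t dataAddress(uint64_t bits) {
    return bits & FAST_DATA_MASK;
}
```

With the ro address from the LLDB session, 0x00000001000080a8, any flag bits stored in the low 3 bits are removed by the mask and the original address comes back unchanged.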

Back in realizeClassWithoutSwift, the returned pointer is cast to a class_ro_t pointer. The cast is valid because, before the class is realized, the address stored in data() is that of the class_ro_t read from the Mach-O file.

So how does a returned pointer become a class_ro_t structure?

A class_ro_t pointer points to the first address of the structure, and the structure's member variables are read and written one by one through address offsets from that pointer. If the pointer type does not match the actual data layout, the fields are parsed incorrectly.

1.1.2: Verifying that a pointer can be restored to its data structure

Verify this by restoring the objc_method structure from a Method pointer:

As we all know, methods can be retrieved through the Runtime API, but what is actually returned is a pointer to the method structure.

Method me = class_getInstanceMethod([XJPerson class], @selector(loveEveryone));

Method has two underlying definitions:

// Also visible without the runtime source
typedef struct objc_method *Method;

struct objc_method {
    SEL _Nonnull method_name                                 OBJC2_UNAVAILABLE;
    char * _Nullable method_types                            OBJC2_UNAVAILABLE;
    IMP _Nonnull method_imp                                  OBJC2_UNAVAILABLE;
};

// Visible only in the runtime source
typedef struct method_t *Method;

// Methods and static member variables are omitted
struct method_t {
    ...
    struct big {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };
    ...
    struct small {
        RelativePointer<const void *> name;
        RelativePointer<const char *> types;
        RelativePointer<IMP> imp;
    };
    ...
};
  • Although the member variables of the objc_method structure are marked OBJC2_UNAVAILABLE, it is a type that Apple exposes outside the runtime source, so it can still be used for testing.
  • Because the method_t structure is more complex, a simplified version can be used for the test.

  • You can see that the corresponding structure data is successfully resolved from the pointer.
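
The same idea can be reproduced without the runtime. The sketch below is an illustration only: MethodMirror, sampleImp, opaqueMethod and restore are hypothetical names, and the three fields simply mimic the name/types/imp layout shown above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical mirror of a three-field method record (illustrative names).
struct MethodMirror {
    const char *name;
    const char *types;
    void (*imp)();
};

inline void sampleImp() {}

// Returns an untyped pointer to a record, the way an opaque handle hides its layout.
inline const void *opaqueMethod() {
    static const MethodMirror record = {"loveEveryone", "v16@0:8", &sampleImp};
    return &record;
}

// Casting the untyped pointer back to the matching struct type recovers the fields.
inline const MethodMirror *restore(const void *opaque) {
    return static_cast<const MethodMirror *>(opaque);
}
```

The cast works because the pointer holds the first address of the record and the struct type tells the compiler the offset of each field. A mirror with a different field order would read the wrong bytes.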

So the ro data is obtained by reading the corresponding address from the Mach-O file and interpreting it as a class_ro_t structure.

method_t definition in LLVM:

method_t::Read:

  • Compute size.
  • Read a buffer of size bytes from the incoming addr.
  • Extract the data at that address through the buffer.
  • Assign the values to the member variables of the method_t structure.

The method_t::Read function is called by Describe.

1.2: Cache initialization

The cache is initialized to an empty cache.

    cls->cache.initializeToEmptyOrPreoptimizedInDisguise();

inline void initializeToEmptyOrPreoptimizedInDisguise() { initializeToEmpty(); }

void cache_t::initializeToEmpty()
{
    _bucketsAndMaybeMask.store((uintptr_t)&_objc_empty_cache, std::memory_order_relaxed);
    _originalPreoptCache.store(nullptr, std::memory_order_relaxed);
}
struct objc_cache _objc_empty_cache =
{
    0,        // mask
    0,        // occupied
    { NULL }  // buckets
};
  • The call ends up in the cache_t::initializeToEmpty function, which initializes the cache to an empty cache.
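
The pattern used here, storing the address of one shared empty cache into an atomic word, can be sketched as follows. EmptyCache, sharedEmptyCache and MiniCache are hypothetical simplifications for illustration; the real cache_t has more fields and platform-specific layouts.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the shared, immutable empty cache.
struct EmptyCache {
    uint32_t mask;
    uint32_t occupied;
};

inline const EmptyCache &sharedEmptyCache() {
    static const EmptyCache cache = {0, 0};
    return cache;
}

// Simplified cache: one atomic word holding the buckets address.
struct MiniCache {
    std::atomic<uintptr_t> bucketsAndMaybeMask{0};

    void initializeToEmpty() {
        // Relaxed ordering: only the stored value matters here.
        bucketsAndMaybeMask.store(reinterpret_cast<uintptr_t>(&sharedEmptyCache()),
                                  std::memory_order_relaxed);
    }

    bool isEmpty() const {
        return bucketsAndMaybeMask.load(std::memory_order_relaxed) ==
               reinterpret_cast<uintptr_t>(&sharedEmptyCache());
    }
};
```

Every freshly initialized cache points at the same empty object, so no per-class allocation is needed until the first method is actually cached.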

1.3: Associating the superclass and metaclass

    // Load the parent and metaclass
    supercls = realizeClassWithoutSwift(remapClass(cls->getSuperclass()), nil);
    metacls = realizeClassWithoutSwift(remapClass(cls->ISA()), nil);

#if SUPPORT_NONPOINTER_ISA
    if (isMeta) {
        // A metaclass's isa is a pure pointer
        cls->setInstancesRequireRawIsa();
    } else {
        // Whether isa is a pure pointer: bit 13 of the flags
        bool instancesRequireRawIsa = cls->instancesRequireRawIsa();
        bool rawIsaIsInherited = false;
        static bool hackedDispatch = false;
        // This is OBJC_DISABLE_NONPOINTER_ISA configured in the environment variable
        if (DisableNonpointerIsa) {
            // Non-pointer isa disabled by environment or app SDK version
            // isa is a pure pointer once the environment variable is set to YES.
            instancesRequireRawIsa = true;
        }
        // OS_object uses a pure pointer
        else if (!hackedDispatch && 0 == strcmp(ro->getName(), "OS_object"))
        {
            // hack for libdispatch et al - isa also acts as vtable pointer
            hackedDispatch = true;
            instancesRequireRawIsa = true;
        }
        // The superclass uses a pure pointer and itself has a superclass, so this class must use one too.
        // rawIsaIsInherited records that the pure pointer requirement is inherited
        else if (supercls  &&  supercls->getSuperclass()  &&
                 supercls->instancesRequireRawIsa())
        {
            instancesRequireRawIsa = true;
            rawIsaIsInherited = true;
        }
        // Recursively set isa to a pure pointer for this class and its subclasses.
        // (If the superclass uses a pure pointer, so do its subclasses.)
        // rawIsaIsInherited only controls logging
        if (instancesRequireRawIsa) {
            cls->setInstancesRequireRawIsaRecursively(rawIsaIsInherited);
        }
    }
// SUPPORT_NONPOINTER_ISA
#endif

    // Associate the parent class with the metaclass. So inheritance chain and ISA pointing chain
    cls->setSuperclass(supercls);
    cls->initClassIsa(metacls);
  • Realize the superclass and metaclass recursively.
  • Decide whether isa must be a pure pointer.
    • A metaclass's isa is a pure pointer.
    • Whether a class's isa is a pure pointer depends on bit 13 of the flags.
    • DisableNonpointerIsa corresponds to the OBJC_DISABLE_NONPOINTER_ISA environment variable, which can force isa to be a pure pointer.
    • The isa of OS_object is a pure pointer.
    • If the superclass's isa is a pure pointer and the superclass itself has a superclass, this class must also use a pure pointer. rawIsaIsInherited (used only to control print statements) indicates that the pure pointer is inherited.
    • Recursively set the isa of subclasses to a pure pointer (if the superclass's isa is a pure pointer, so is the subclass's).
  • Associate the superclass and metaclass, i.e. the inheritance chain and the isa chain.
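
The decision order in the bullets above can be written as a small pure function. This is a sketch under assumptions: RawIsaInputs, RawIsaDecision and decideRawIsa are hypothetical names, and the one-shot hackedDispatch guard from the source is left out for brevity.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical snapshot of the inputs the runtime consults.
struct RawIsaInputs {
    bool isMeta;
    bool alreadyRequiresRawIsa;  // current value of the class's own flag
    bool nonpointerIsaDisabled;  // e.g. via OBJC_DISABLE_NONPOINTER_ISA
    const char *name;
    bool superHasSuper;          // superclass exists and itself has a superclass
    bool superRequiresRawIsa;
};

struct RawIsaDecision {
    bool requiresRawIsa;
    bool inherited;
};

inline RawIsaDecision decideRawIsa(const RawIsaInputs &in) {
    if (in.isMeta) return {true, false};            // metaclasses always use a pure pointer
    RawIsaDecision d = {in.alreadyRequiresRawIsa, false};
    if (in.nonpointerIsaDisabled) {
        d.requiresRawIsa = true;                    // disabled by environment
    } else if (std::strcmp(in.name, "OS_object") == 0) {
        d.requiresRawIsa = true;                    // libdispatch special case
    } else if (in.superHasSuper && in.superRequiresRawIsa) {
        d.requiresRawIsa = true;                    // inherited from the superclass
        d.inherited = true;
    }
    return d;
}
```

The superHasSuper condition encodes the special case from the source comment: the requirement does not propagate from the root class to the root metaclass.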

1.4: Adjusting the offsets/layout of ivars

    // Adjust ivar offsets; this may reallocate class_ro_t and update ro
    if (supercls && !isMeta) reconcileInstanceVariables(cls, supercls, ro);

    // Set the instance size
    cls->setInstanceSize(ro->instanceSize);
  • If the class has a superclass and is not a metaclass, reconcileInstanceVariables is called to reconcile member variable offsets/layout.
  • setInstanceSize is called to set instanceSize and, if the cache's fastInstanceSize has not been set yet, to set it.

Member variable handling will be analyzed in the next article; this article focuses on method loading.

1.5: Synchronizing flag bits from ro to rw

// Copy some flags from ro to rw
// RO_HAS_CXX_STRUCTORS: bit 2 of flags, C++ constructor/destructor
// RO_HAS_CXX_DTOR_ONLY: bit 8 of flags, C++ destructor only
if (ro->flags & RO_HAS_CXX_STRUCTORS) {
    // There is at least a destructor
    cls->setHasCxxDtor();
    // There is more than just a destructor
    if (!(ro->flags & RO_HAS_CXX_DTOR_ONLY)) {
        // Constructors are also available
        cls->setHasCxxCtor();
    }
}

// Whether associated objects are forbidden: bit 20 of the rw flags forbids associated objects.
if ((ro->flags & RO_FORBIDS_ASSOCIATED_OBJECTS) ||
    (supercls && supercls->forbidsAssociatedObjects()))
{
    rw->flags |= RW_FORBIDS_ASSOCIATED_OBJECTS;
}
  • Copy flags from ro to rw (the C++ constructor/destructor flags are stored in the cache).
    • RO_HAS_CXX_STRUCTORS: bit 2 of flags, C++ constructor/destructor.
    • RO_HAS_CXX_DTOR_ONLY: bit 8 of flags, C++ destructor only.
  • If associated objects are forbidden for this class or its superclass, the class is marked with RW_FORBIDS_ASSOCIATED_OBJECTS.
    • RW_FORBIDS_ASSOCIATED_OBJECTS: bit 20 of flags, forbids associated objects.
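
The constructor/destructor part of this synchronization can be checked with a few lines. This is a minimal sketch, assuming the bit positions listed in the bullets; CxxFlags and syncCxxFlags are hypothetical names, and the real runtime stores the result in the cache rather than in a small struct.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions as described in the text.
constexpr uint32_t RO_HAS_CXX_STRUCTORS = 1u << 2; // has C++ constructor and/or destructor
constexpr uint32_t RO_HAS_CXX_DTOR_ONLY = 1u << 8; // has a destructor but no constructor

struct CxxFlags {
    bool hasDtor;
    bool hasCtor;
};

// Hypothetical helper mirroring the branch structure of the source.
inline CxxFlags syncCxxFlags(uint32_t roFlags) {
    CxxFlags out = {false, false};
    if (roFlags & RO_HAS_CXX_STRUCTORS) {
        out.hasDtor = true;                  // at least a destructor exists
        if (!(roFlags & RO_HAS_CXX_DTOR_ONLY)) {
            out.hasCtor = true;              // a constructor exists as well
        }
    }
    return out;
}
```

Note that RO_HAS_CXX_DTOR_ONLY is only consulted when RO_HAS_CXX_STRUCTORS is set, so bit 8 on its own has no effect.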

1.6: Connecting the class's inheritance chain

if (supercls) {
    // Associate subclasses
    addSubclass(supercls, cls);
} else {
    // Set the root class's nextSiblingClass to _firstRealizedClass, then make the root class the first realized class.
    addRootClass(cls);
}
  • If the class has a superclass, connect the class to its superclass's subclass list.
    • Set the sibling class of this class.
    • This class inherits the superclass's C++ constructor/destructor flags.
    • This class inherits the superclass's RW_NOPREOPT_CACHE and RW_NOPREOPT_SELS flags.
    • This class inherits the superclass's FAST_CACHE_REQUIRES_RAW_ISA/RW_REQUIRES_RAW_ISA flag.
  • If this class has no superclass, register it as a new root class.

1.6.1: addSubclass

static void addSubclass(Class supercls, Class subcls)
{
    ...
    if (supercls  &&  subcls) {
    ...
        objc_debug_realized_class_generation_count++;
        // Sibling class
        subcls->data()->nextSiblingClass = supercls->data()->firstSubclass;
        // The first subclass
        supercls->data()->firstSubclass = subcls;
        // This class inherits the superclass's C++ constructor/destructor flags
        if (supercls->hasCxxCtor()) {
            subcls->setHasCxxCtor();
        }
        if (supercls->hasCxxDtor()) {
            subcls->setHasCxxDtor();
        }

        ...
        
        // This class inherits the superclass's RW_NOPREOPT_CACHE and RW_NOPREOPT_SELS flags
        if (!supercls->allowsPreoptCaches()) {
            subcls->setDisallowPreoptCachesRecursively(__func__);
        } else if (!supercls->allowsPreoptInlinedSels()) {
            subcls->setDisallowPreoptInlinedSelsRecursively(__func__);
        }

        // Special case: instancesRequireRawIsa does not propagate
        // from root class to root metaclass
        // This class inherits the superclass's FAST_CACHE_REQUIRES_RAW_ISA/RW_REQUIRES_RAW_ISA flag
        // The root metaclass does not inherit it
        if (supercls->instancesRequireRawIsa()  &&  supercls->getSuperclass()) {
            subcls->setInstancesRequireRawIsaRecursively(true);
        }
    }
}
  • Set this class's sibling class to the superclass's current first subclass (nil if there is none).
  • Set this class as the first subclass of its superclass.
  • This class inherits the superclass's C++ constructor/destructor flags.
  • This class inherits the superclass's RW_NOPREOPT_CACHE and RW_NOPREOPT_SELS flags (preoptimized shared caches and inlined sels).
  • This class inherits the superclass's FAST_CACHE_REQUIRES_RAW_ISA/RW_REQUIRES_RAW_ISA flag (whether isa is a pure pointer). The root metaclass does not inherit it.
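
The first two bullets describe a push onto the front of a singly linked list. The sketch below shows only that part, with hypothetical names (MiniClass, addSubclassSketch, subclassNames); flag inheritance is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, minimal class record with the two link fields from the text.
struct MiniClass {
    const char *name;
    MiniClass *firstSubclass;
    MiniClass *nextSiblingClass;

    explicit MiniClass(const char *n)
        : name(n), firstSubclass(nullptr), nextSiblingClass(nullptr) {}
};

// Push sub onto the front of super's subclass list.
inline void addSubclassSketch(MiniClass &super, MiniClass &sub) {
    sub.nextSiblingClass = super.firstSubclass; // old head becomes the sibling
    super.firstSubclass = &sub;                 // new class becomes the head
}

// Walk the sibling chain and collect the subclass names in list order.
inline std::vector<const char *> subclassNames(const MiniClass &super) {
    std::vector<const char *> names;
    for (MiniClass *c = super.firstSubclass; c != nullptr; c = c->nextSiblingClass) {
        names.push_back(c->name);
    }
    return names;
}
```

Because each new subclass is inserted at the head, the most recently realized subclass is always the first one found when walking the list.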

1.6.2: addRootClass

static void addRootClass(Class cls)
{
    ...
    objc_debug_realized_class_generation_count++;
    // This class's sibling is set to the current first realized class (nil at first)
    cls->data()->nextSiblingClass = _firstRealizedClass;
    // The first class to load is set to itself
    _firstRealizedClass = cls;
}
  • If this class has no superclass, its sibling is set to the current _firstRealizedClass (nil for the first root class), and this class becomes the first realized class.

2: methodizeClass

realizeClassWithoutSwift finally calls methodizeClass, which sorts the main class's method list and attaches categories.

    // Attach categories
    methodizeClass(cls, previously);
  • The previously parameter is nil, passed in from _read_images.

        realizeClassWithoutSwift(cls, nil);

The core logic of the methodizeClass function is as follows:

static void methodizeClass(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro();
    // rwe is normally NULL here because extAllocIfNeeded has not been called yet to create it
    auto rwe = rw->ext();

    ...
    // Install methods and properties that the class implements itself.
    // Install the methods, properties, and protocols implemented by the class itself.
    // The method, property, and protocol lists are normally not added to rwe here,
    // because they are added when rwe is created.

    // Get the method list from ro
    method_list_t *list = ro->baseMethods();
    if (list) {
        // isBundleClass checks whether the class is in an unloadable bundle
        // (RO_FROM_BUNDLE, bit 29); the compiler must never set this flag
        // Sort the list of methods by sel address
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls), nullptr);
        // If there is an RWE, add the list of methods for the main class to rWE's methods
        if (rwe) rwe->methods.attachLists(&list, 1);
    }
    
    // Get the property list of ro
    property_list_t *proplist = ro->baseProperties;
    if (rwe && proplist) {
        // Rwe exists and proplist is not NULL
        // Add the main class property list to rWE properties
        rwe->properties.attachLists(&proplist, 1);
    }
    
    // Get the protocol list of ro
    protocol_list_t *protolist = ro->baseProtocols;
    if (rwe && protolist) {
        // RWE exists and the protocol list is not NULL
        // Add the main class's list of protocols to RWE's protocols
        rwe->protocols.attachLists(&protolist, 1);
    }

    // Root classes get bonus method implementations if they don't have
    // them already. These apply before category replacements.
    // If it is the root metaclass, add the initialize method
    if (cls->isRootMetaclass()) {
        // root metaclass
        addMethod(cls, @selector(initialize), (IMP)&objc_noop_imp, "", NO);
    }

    // Attach categories.
    // Attach categories to the main class; previously == nil here
    if (previously) {
        if (isMeta) {
            objc::unattachedCategories.attachToClass(cls, previously, ATTACH_METACLASS);
        } else {
            // When a class relocates, categories with class methods
            // may be registered on the class itself rather than on
            // the metaclass. Tell attachToClass to look for those.
            objc::unattachedCategories.attachToClass(cls, previously, ATTACH_CLASS_AND_METACLASS);
        }
    }
    // Has no effect when no categories are waiting for this class
    objc::unattachedCategories.attachToClass(cls, cls, isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

    ...
}
  • rwe is normally NULL here because extAllocIfNeeded has not been called yet to create it.
  • Handle the main class's methods, properties, and protocols.
    • Fix up and sort the main class's method list (sorted by sel address).
    • The following steps run only if rwe exists, which is normally not the case.
      • Add the main class's method list to rwe's methods.
      • Add the main class's property list to rwe's properties.
      • Add the main class's protocol list to rwe's protocols.
  • If the class is the root metaclass, add the initialize method.
  • Categories are handled through attachToClass. (Nothing is attached if the class has no categories.)

2.1: prepareMethodLists

// Arguments when sorting the main class's methods:
// cls, &list, 1, YES, isBundleClass(cls), nullptr
// Arguments when sorting category methods (mlists holds method_list_t * entries):
// cls, mlists, mcount, NO, fromBundle, __func__
static void 
prepareMethodLists(Class cls, method_list_t **addedLists, int addedCount,
                   bool baseMethods, bool methodsFromBundle, const char *why)
{
    ...
    // Add method lists to array.
    // Reallocate un-fixed method lists.
    // The new methods are PREPENDED to the method list array.
    
    // addedCount is 1 when sorting the main class's methods; for category methods it depends on how many lists are being attached
    for (int i = 0; i < addedCount; i++) {
        method_list_t *mlist = addedLists[i];
        ASSERT(mlist);

        // Fixup selectors if necessary
        // If the list has not been fixed up yet, fix it up and sort it
        if (!mlist->isFixedUp()) {
            // Fix up and sort
            fixupMethodList(mlist, methodsFromBundle, true/*sort*/);
        }
    }

    ...
}
  • isBundleClass checks whether the class is in an unloadable bundle (RO_FROM_BUNDLE, bit 29); the compiler must never set this flag.
  • addedLists is of type method_list_t ** (a double pointer), so mlist is a pointer to either the main class's method list or a category's method list.
  • If the list has not been fixed up, it is fixed up and sorted. For the main class this list is ro->baseMethods().

2.1.1: fixupMethodList

static void 
fixupMethodList(method_list_t *mlist, bool bundleCopy, bool sort)
{
    runtimeLock.assertLocked();
    ASSERT(!mlist->isFixedUp());

    // fixme lock less in attachMethodLists ?
    // dyld3 may have already uniqued, but not sorted, the list
    if (!mlist->isUniqued()) {
        mutex_locker_t lock(selLock);
    
        // Unique selectors in list.
        for (auto& meth : *mlist) {
            // Convert the SEL to its name string
            const char *name = sel_cname(meth.name());
            // Debug output: name and address before the SEL is fixed up
            printf("Before setName: %s address: %p\n", name, meth.name());
            // Set the SEL. __sel_registerName may end up returning the value of
            // _dyld_get_objc_selector, in which case dyld's selector takes precedence.
            meth.setName(sel_registerNameNoLock(name, bundleCopy));
            printf("After setName: %s address: %p\n", name, meth.name());
        }
    }

    // Sort by selector address.
    // Don't try to sort small lists, as they're immutable.
    // Don't try to sort big lists of nonstandard size, as stable_sort
    // won't copy the entries properly.
    // Sort by SEL address. Small lists are immutable and are not sorted.
    // This sorting is what the binary search in the slow message lookup relies on.
    if (sort && !mlist->isSmallList() && mlist->entsize() == method_t::bigSize) {
        method_t::SortBySELAddress sorter;
        std::stable_sort(&mlist->begin()->big(), &mlist->end()->big(), sorter);
    }

    // Mark method list as uniqued and sorted.
    // Can't mark small lists, since they're immutable.
    // Mark the method list as fixed up and sorted
    if (!mlist->isSmallList()) {
        mlist->setFixedUp();
    }
}
  • First, the SEL of each method is fixed up: sel_registerNameNoLock -> __sel_registerName -> search_builtins -> _dyld_get_objc_selector. dyld's selector takes precedence.
  • Then the method list is sorted by SEL address. This is what the binary search in the slow message lookup relies on; the sorting happens here.
  • Finally, the method list is marked as fixed up and sorted.

Small lists are not sorted because they are immutable.
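
Why sorting by selector address pays off later can be shown with a small stand-alone sketch. MiniMethod, sortBySelAddress and findMethod are hypothetical names; the comparison by address follows the description above, while the lookup uses std::lower_bound rather than the runtime's own search routine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical method entry: the selector address is the sort key.
struct MiniMethod {
    uintptr_t sel;  // selector address
    int imp;        // placeholder for the implementation
};

// Stable sort by selector address, as fixupMethodList does for big lists.
inline void sortBySelAddress(std::vector<MiniMethod> &list) {
    std::stable_sort(list.begin(), list.end(),
                     [](const MiniMethod &a, const MiniMethod &b) { return a.sel < b.sel; });
}

// Binary search over a list already sorted by address; nullptr if absent.
inline const MiniMethod *findMethod(const std::vector<MiniMethod> &sorted, uintptr_t sel) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), sel,
                               [](const MiniMethod &m, uintptr_t key) { return m.sel < key; });
    return (it != sorted.end() && it->sel == sel) ? &*it : nullptr;
}
```

A stable sort keeps entries with equal keys in their original relative order, and the sorted order is what allows a logarithmic lookup instead of a linear scan.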

Before and after SEL correction:

Sort verification:

3: Exploring categories

In methodizeClass, rwe is normally NULL. According to the earlier WWDC introduction, rwe data exists when a class has categories or when methods, properties, or protocols are added to it dynamically, so we add a category to explore this.

@interface XJPerson (CatA)

@property (nonatomic, copy) NSString *cata_Name;

@property (nonatomic, assign) int cata_age;

- (void)cata_instanceMethod1;

- (void)cata_instanceMethod2;

+ (void)cata_classMethod1;

@end

@implementation XJPerson (CatA)

- (void)cata_instanceMethod1
{
    NSLog(@"%s", __func__);
}

- (void)cata_instanceMethod2
{
    NSLog(@"%s", __func__);
}

+ (void)cata_classMethod1
{
    NSLog(@"%s", __func__);
}

@end

3.1: Exploring the underlying code with clang

To make the implementation of categories easier to study, clang is first used to look at the lower-level code.

Convert the .m file into a .cpp file with the clang command:

clang -rewrite-objc main.m -o main.cpp

In the .cpp file, search for the category name CatA. The following definition is at the very end of the file:

static struct _category_t *L_OBJC_LABEL_CATEGORY_$ [1] __attribute__((used, section ("__DATA, __objc_catlist,regular,no_dead_strip")))= {
    &_OBJC_$_CATEGORY_XJPerson_$_CatA,
};
  • _OBJC_$_CATEGORY_XJPerson_$_CatA is of the struct type _category_t.

Then search for _category_t:

struct _category_t {
    const char *name;
    struct _class_t *cls;
    const struct _method_list_t *instance_methods;
    const struct _method_list_t *class_methods;
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};
  • At the lower level a category is also a struct.
  • The member variable name should be the category's name, CatA.
  • cls points to the class.
  • A class has only one methods list, but a category has both instance_methods and class_methods, because a category has no metaclass of its own.
  • A category has protocols and properties.

_OBJC_$_CATEGORY_XJPerson_$_CatA definition:

static struct _category_t _OBJC_$_CATEGORY_XJPerson_$_CatA __attribute__ ((used, section ("__DATA,__objc_const"))) = 
{
    "XJPerson",
    0, // &OBJC_CLASS_$_XJPerson,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_XJPerson_$_CatA,
    (const struct _method_list_t *)&_OBJC_$_CATEGORY_CLASS_METHODS_XJPerson_$_CatA,
    0,
    (const struct _prop_list_t *)&_OBJC_$_PROP_LIST_XJPerson_$_CatA,

};
  • Here name is XJPerson rather than CatA. The class name is assigned because the category's name is not known at static compile time.
  • cls is not assigned, but the comment shows the intended value: the category is not yet associated with the class, and the association is made at runtime.
  • protocols is empty because the category does not conform to any protocol.

After making the category conform to the NSObject protocol, recompile and check again:

  • Now protocols has a value, and it is exactly the NSObject protocol.

Continue with the property list:

static struct /*_prop_list_t*/ {
    unsigned int entsize;  // sizeof(struct _prop_t)
    unsigned int count_of_properties;
    struct _prop_t prop_list[2];
} _OBJC_$_PROP_LIST_XJPerson_$_CatA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_prop_t),
    2,
    {{"cata_Name","T@\"NSString\",C,N"},
    {"cata_age","Ti,N"}}
};
  • The property list contains the properties declared in the category.

Continue to see the list of methods:

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[2];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_XJPerson_$_CatA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    2,
    {{(struct objc_selector *)"cata_instanceMethod1", "v16@0:8", (void *)_I_XJPerson_CatA_cata_instanceMethod1},
    {(struct objc_selector *)"cata_instanceMethod2", "v16@0:8", (void *)_I_XJPerson_CatA_cata_instanceMethod2}}
};

static struct /*_method_list_t*/ {
    unsigned int entsize;  // sizeof(struct _objc_method)
    unsigned int method_count;
    struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_CLASS_METHODS_XJPerson_$_CatA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
    sizeof(_objc_method),
    1,
    {{(struct objc_selector *)"cata_classMethod1", "v16@0:8", (void *)_C_XJPerson_CatA_cata_classMethod1}}
};
Copy the code
  • No setter or getter methods are generated automatically for the category's properties. They have to be written by hand, and since a category cannot add instance variables, they can only be backed by associated objects.
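In Objective-C the hand-written accessors would call objc_setAssociatedObject and objc_getAssociatedObject. The idea behind them can be pictured with a toy C++ model: the value lives in a side table keyed by the object's address and a per-property key, not inside the object. Everything below (gAssociations, setCataName, cataName, the XJPerson stand-in) is hypothetical and only illustrates the concept; it is not how the runtime implements its association tables.

```cpp
#include <string>
#include <unordered_map>

// Toy side table: object address -> (key address -> stored value).
using AssocMap = std::unordered_map<const void *, std::string>;
static std::unordered_map<const void *, AssocMap> gAssociations;

// Stand-in for the class; the "category" cannot add fields to it.
struct XJPerson { int age = 0; };

// The address of this variable serves as a unique key for the property.
static const char kCataNameKey = 0;

// Hand-written "setter": stores the value outside the object.
void setCataName(XJPerson *self, const std::string &name) {
    gAssociations[self][&kCataNameKey] = name;
}

// Hand-written "getter": returns an empty string if nothing was stored.
std::string cataName(const XJPerson *self) {
    auto obj = gAssociations.find(self);
    if (obj == gAssociations.end()) return "";
    auto it = obj->second.find(&kCataNameKey);
    return it == obj->second.end() ? "" : it->second;
}
```

Two XJPerson instances get independent values because the table is keyed by object address, which is the property a category needs when it has no storage of its own.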

3.2: category_t source verification

Search for category_t in the objc source:

struct category_t {
    const char *name;
    classref_t cls;
    WrappedPtr<method_list_t, PtrauthStrip> instanceMethods;
    WrappedPtr<method_list_t, PtrauthStrip> classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    // Fields below this point are not always present on disk.
    // This variable does not always appear on disk
    struct property_list_t* _classProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);

    protocol_list_t *protocolsForMeta(bool isMeta) {
        if (isMeta) return nullptr;
        else return protocols;
    }
};
  • The category_t structure in the objc source is basically the same as the _category_t structure in the .cpp file produced by clang, except that category_t has one more member, _classProperties. According to the comment, this field is not always present on disk, so it is not explored for now.

3.3: Official document exploration

In Xcode, press Command + Shift + 0 to open the official documentation, search for Category, and read the description:

  • In the documentation, Category is a typedef of objc_category.

Search the objc source for the objc_category structure:

  • objc_category is no longer available in Objc 2.0. Either Apple has not updated the category documentation for a long time, or Apple is hiding the underlying implementation.

3.4: Exploring category loading in the source code

After the exploration above, we roughly understand the nature of a category: like a class, a category is essentially a struct. Next, let's explore how categories are loaded.

In methodizeClass, the core functions for attaching categories to a class are attachLists and attachToClass. If rwe exists, attachLists is called to add the class's own methods, properties, and protocols to rwe, but usually rwe has not been created at this point. So let's first sort out the category loading process, starting with when rwe is created.

3.4.1: Analysis of when rwe is created

In methodizeClass, rwe is obtained like this:

auto rwe = rw->ext();

The implementation of the ext function:

struct class_rw_t {
    ...
    class_rw_ext_t *ext() const {
        return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>(&ro_or_rw_ext);
    }

    class_rw_ext_t *extAllocIfNeeded() {
        // Get rwe
        auto v = get_ro_or_rwe();
        if (fastpath(v.is<class_rw_ext_t *>())) {
            // If rwe exists, return its address
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext);
        } else {
            // If rwe does not exist, create it, set it, and return its address
            return extAlloc(v.get<const class_ro_t *>(&ro_or_rw_ext));
        }
    }

    // rwe deep copy
    class_rw_ext_t *deepCopy(const class_ro_t *ro) {
        return extAlloc(ro, true);
    }
    ...
};

class_rw_ext_t *
class_rw_t::extAlloc(const class_ro_t *ro, bool deepCopy)
{
    runtimeLock.assertLocked();

    // Allocate memory
    auto rwe = objc::zalloc<class_rw_ext_t>();

    // The version is 7 for a metaclass and 0 for a non-metaclass
    rwe->version = (ro->flags & RO_META) ? 7 : 0;

    // The main class's methods, properties, and protocols are added to rwe here,
    // as soon as rwe is created.
    // methodizeClass does not add them, since rwe usually does not exist at that time.

    // Get the main class's method list
    method_list_t *list = ro->baseMethods();
    if (list) {
        // deepCopy defaults to false. If true, a copy is made.
        // class_rw_ext_t *extAlloc(const class_ro_t *ro, bool deep = false);
        // (the default comes from the extAlloc declaration)
        if (deepCopy) list = list->duplicate();
        // Add the main class's method list to rwe's methods
        // (it was already fixed up and sorted in methodizeClass)
        rwe->methods.attachLists(&list, 1);
    }

    // See comments in objc_duplicateClass
    // property lists and protocol lists historically
    // have not been deep-copied
    //
    // This is probably wrong and ought to be fixed some day
    // Get the main class's property list
    property_list_t *proplist = ro->baseProperties;
    if (proplist) {
        // Add the main class's property list to rwe's properties
        rwe->properties.attachLists(&proplist, 1);
    }

    // Get the main class's protocol list
    protocol_list_t *protolist = ro->baseProtocols;
    if (protolist) {
        // Add the main class's protocol list to rwe's protocols
        rwe->protocols.attachLists(&protolist, 1);
    }

    // Set rwe and ro
    set_ro_or_rwe(rwe, ro);
    // Return rwe
    return rwe;
}
  • The ext function returns rwe (or nothing if it has not been created).
  • The extAllocIfNeeded function checks whether rwe exists: if it does, it is returned; if not, extAlloc is called to create it.
  • The extAlloc function creates rwe, adds the main class's method list, property list, and protocol list to the corresponding lists in rwe (they are not added in methodizeClass because rwe does not exist at that time), then sets rwe and ro and returns rwe.
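The lazy promotion from ro to rwe described in these bullets can be sketched in a few lines of C++. This is a simplified model under stated assumptions, not the runtime's code: RO, RWE, and RW are hypothetical stand-ins for class_ro_t, class_rw_ext_t, and class_rw_t, the lists are plain string vectors, and a low-bit tag on one word plays the role of the runtime's pointer union.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins with hypothetical fields.
struct RO  { std::vector<std::string> baseMethods; };
struct RWE { const RO *ro; std::vector<std::string> methods; };

class RW {
    // One word holds either a const RO* or an RWE*; the low bit tags an RWE*.
    uintptr_t ro_or_rwe;

    bool isRWE() const { return (ro_or_rwe & 1) != 0; }

    // Creates the extension and copies the base list into it once.
    RWE *extAlloc(const RO *ro) {
        RWE *rwe = new RWE{ro, ro->baseMethods};
        ro_or_rwe = reinterpret_cast<uintptr_t>(rwe) | 1;
        return rwe;
    }

public:
    explicit RW(const RO *ro) : ro_or_rwe(reinterpret_cast<uintptr_t>(ro)) {
        assert((ro_or_rwe & 1) == 0); // the tag bit must be free
    }
    RW(const RW &) = delete;
    RW &operator=(const RW &) = delete;
    ~RW() { delete ext(); }

    // Like ext(): returns nullptr while only ro is stored.
    RWE *ext() const {
        return isRWE() ? reinterpret_cast<RWE *>(ro_or_rwe & ~uintptr_t(1))
                       : nullptr;
    }

    // Like extAllocIfNeeded(): creates the extension on first use only.
    RWE *extAllocIfNeeded() {
        if (RWE *rwe = ext()) return rwe;
        return extAlloc(reinterpret_cast<const RO *>(ro_or_rwe));
    }

    // ro stays reachable whether or not the extension exists.
    const RO *ro() const {
        if (RWE *rwe = ext()) return rwe->ro;
        return reinterpret_cast<const RO *>(ro_or_rwe);
    }
};
```

A class that is never modified at runtime never pays for the extension: ext() keeps returning nullptr and lookups read straight from ro. The first call to extAllocIfNeeded() is what makes the base lists appear in the extension.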

To sum up, rwe is created only when extAllocIfNeeded is called. A global search shows that extAllocIfNeeded is called from the following functions (the cases in which rwe is created):

  • The attachCategories function.
  • The objc_class::demangledName function (when isRealized() || isFuture()).
  • The class_setVersion function.
  • The addMethods_finish function.
  • The class_addProtocol function.
  • The _class_addProperty function.
  • The objc_duplicateClass function.

You can see that, apart from attachCategories, rwe is created either when a class is modified dynamically or when a class is fixed up. This matches what was presented at WWDC. Since we are exploring how categories are loaded, the core logic is clearly in the attachCategories function.

3.4.2: A first look at the category loading process

A global search shows that attachCategories is called from the attachToClass function and the load_categories_nolock function.

The attachToClass function calls the attachCategories function.

The load_categories_nolock function calls the attachCategories function.

3.4.2.1: Category loading process 1

A global search for attachToClass shows that it is called only from methodizeClass.

// Only the code related to loading categories is posted here
static void methodizeClass(Class cls, Class previously)
{
    ...
    // Attach categories.
    // previously is passed in as nil
    if (previously) {
        if (isMeta) {
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_METACLASS);
        } else {
            // When a class relocates, categories with class methods
            // may be registered on the class itself rather than on
            // the metaclass. Tell attachToClass to look for those.
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_CLASS_AND_METACLASS);
        }
    }
    // This is the path that is normally taken
    objc::unattachedCategories.attachToClass(cls, cls,
                                             isMeta ? ATTACH_METACLASS
                                                    : ATTACH_CLASS);
    ...
}
  • As analyzed earlier, methodizeClass is called by realizeClassWithoutSwift.
  • previously is passed as nil in the source, so the if (previously) branch is not entered; it may be a spare parameter for Apple's internal debugging.

Based on the previous analysis, category loading process 1 is:

  • map_images –> map_images_nolock –> _read_images –> realizeClassWithoutSwift –> methodizeClass –> attachToClass –> attachCategories –> attachLists.
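The final step of this chain, attachLists, places the newly attached lists in front of the lists that are already there, which is why a category method with the same selector as a class method is found first. A minimal sketch of that ordering, assuming method lists are reduced to vectors of selector names (ListArray, MethodList, and indexOfListContaining are hypothetical names, and the real function works on raw arrays of list pointers):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplification: a method list is just its selector names.
using MethodList = std::vector<std::string>;

struct ListArray {
    std::vector<MethodList> lists;

    // New lists are inserted ahead of the existing ones,
    // keeping the order they were passed in.
    void attachLists(const std::vector<MethodList> &added) {
        lists.insert(lists.begin(), added.begin(), added.end());
    }

    // Walks the lists front to back and returns the index of the
    // first list that contains the selector, or -1 if none does.
    int indexOfListContaining(const std::string &sel) const {
        for (std::size_t i = 0; i < lists.size(); ++i)
            for (const auto &m : lists[i])
                if (m == sel) return static_cast<int>(i);
        return -1;
    }
};
```

If the class's own list is attached first and a category's list second, a selector defined in both resolves to index 0, the category's list, while the class's original implementation is still present further back.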

3.4.2.2: Category loading process 2

A global search for load_categories_nolock shows that it is called from the loadAllCategories and _read_images functions. As we saw earlier, categories are loaded only after load_images has been called, so we only need to explore loadAllCategories.

static void loadAllCategories() {
    mutex_locker_t lock(runtimeLock);

    for (auto *hi = FirstHeader; hi != NULL; hi = hi->getNext()) {
        load_categories_nolock(hi);
    }
}

Searching further for loadAllCategories shows that it is called by the load_images function.

static bool  didInitialAttachCategories = false;

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        // _read_images goes into load_categories_nolock only when
        // didInitialAttachCategories is true,
        // but the code below calls loadAllCategories directly.
        // Setting the flag to true here ensures loadAllCategories is called only once
        didInitialAttachCategories = true;
        // Load the categories
        loadAllCategories();
    }

    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}
  • didInitialAttachCategories defaults to false. It is set to true just before loadAllCategories is executed, which ensures that loadAllCategories is called only once.
  • In _objc_init, after the callbacks are registered with _dyld_objc_notify_register(&map_images, load_images, unmap_image), didCallDyldNotifyRegister is set to true.
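The interplay of the two flags can be isolated in a small sketch. It mirrors only the guard at the top of load_images; the counter loadAllCategoriesCalls and the function name load_images_guard are hypothetical additions for illustration, and the real loadAllCategories body is replaced by a counter increment.

```cpp
// Stand-ins for the runtime's two globals.
static bool didCallDyldNotifyRegister = false;   // set after registration in _objc_init
static bool didInitialAttachCategories = false;  // set on the first qualifying load_images

// Hypothetical counter so the effect of the guard can be observed.
static int loadAllCategoriesCalls = 0;

static void loadAllCategories() { ++loadAllCategoriesCalls; }

// Only the guard portion of load_images.
void load_images_guard() {
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        // Flip the flag first so later images skip this branch.
        didInitialAttachCategories = true;
        loadAllCategories();
    }
}
```

Calling load_images_guard() before registration does nothing; after didCallDyldNotifyRegister becomes true, the first call runs loadAllCategories and every later call skips it.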

From this analysis, category loading process 2 is:

  • load_images –> loadAllCategories –> load_categories_nolock –> attachCategories –> attachLists.

The detailed process of classification loading will be further analyzed below.

conclusion

The core logic of realizeClassWithoutSwift is shown in the form of mind map as follows: