Preface

The previous article, on the underlying principles of OC class loading, analyzed how classes are loaded, explored the different flows for lazily and non-lazily loaded classes, and tentatively identified the loading flows for categories. This article analyzes category loading in detail, along with the relationship and differences between category loading and main-class loading.

Preparation

  • objc4-818.2 source code.
  • MachOView.

One: category loading

The previous article on the underlying principles of OC class loading analyzed and identified two flows for loading categories:

  1. map_images –> map_images_nolock –> _read_images –> realizeClassWithoutSwift –> methodizeClass –> attachToClass –> attachCategories –> attachLists.

  2. load_images –> loadAllCategories –> load_categories_nolock –> attachCategories –> attachLists.

However, these two flows were only inferred from the code. This article analyzes them in practice and verifies whether they are correct.

1.1: attachCategories

Both flows end by calling the attachCategories function, which attaches the category's method, property, and protocol lists to rwe.

static void
attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count, int flags)
{
    ...
    /*
     * Only a few classes have more than 64 categories during launch.
     * This uses a little stack, and avoids malloc.
     *
     * Categories must be added in the proper order, which is back
     * to front. To do that with the chunking, we iterate cats_list
     * from front to back, build up the local buffers backwards,
     * and call attachLists on the chunks. attachLists prepends the
     * lists, so the final result is in the expected order.
     */
    
    // Buffer size. If there are more than 64 lists, attach the first 64 to the class, then collect the rest
    // The buffer is filled from back to front: 63, 62, 61 ... 2, 1, 0
    constexpr uint32_t ATTACH_BUFSIZ = 64;
    method_list_t   *mlists[ATTACH_BUFSIZ];
    property_list_t *proplists[ATTACH_BUFSIZ];
    protocol_list_t *protolists[ATTACH_BUFSIZ];

    uint32_t mcount = 0;
    uint32_t propcount = 0;
    uint32_t protocount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    // Get or create rwe, which receives the methods, properties, and protocols for the class
    auto rwe = cls->data()->extAllocIfNeeded();

    // Iterate over all categories
    for (uint32_t i = 0; i < cats_count; i++) {
        auto& entry = cats_list[i];
        
        // Get the list of methods
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            // Once 64 method lists are buffered, prepare (fix up and sort) them and attach them to the class,
            // then reset the count to 0 and keep collecting the rest
            if (mcount == ATTACH_BUFSIZ) {
                // The method list is fixed and sorted (sorted by sel address)
                prepareMethodLists(cls, mlists, mcount, NO, fromBundle, __func__);
                rwe->methods.attachLists(mlists, mcount);
                mcount = 0;
            }
            // If mcount is 0, ATTACH_BUFSIZ - ++mcount is 63,
            // so the storage index runs from 63 down to 0
            mlists[ATTACH_BUFSIZ - ++mcount] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        // Property processing
        property_list_t *proplist =
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            if (propcount == ATTACH_BUFSIZ) {
                rwe->properties.attachLists(proplists, propcount);
                propcount = 0;
            }
            proplists[ATTACH_BUFSIZ - ++propcount] = proplist;
        }

        // Protocol processing
        protocol_list_t *protolist = entry.cat->protocolsForMeta(isMeta);
        if (protolist) {
            if (protocount == ATTACH_BUFSIZ) {
                rwe->protocols.attachLists(protolists, protocount);
                protocount = 0;
            }
            protolists[ATTACH_BUFSIZ - ++protocount] = protolist;
        }
    }

    // Whatever remains in the buffers (at most 64 lists each) is attached to the class below
    if (mcount > 0) {
        // Prepare (fix up and sort) the category method lists before attaching them to the class.
        // Let mlists = d and n = ATTACH_BUFSIZ - mcount;
        // then mlists + ATTACH_BUFSIZ - mcount = d + n.
        // mlists + ATTACH_BUFSIZ - mcount is a pointer to a pointer: the address of the first filled slot
        prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount,
                           NO, fromBundle, __func__);
        // Attach the category method lists to rwe's methods
        rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
        if (flags & ATTACH_EXISTING) {
            flushCaches(cls, __func__, [](Class c){
                // constant caches have been dealt with in prepareMethodLists
                // if the class still is constant here, it's fine to keep
                return !c->cache.isConstantOptimizedCache();
            });
        }
    }
    
    // Attach the category property lists to rwe's properties
    rwe->properties.attachLists(proplists + ATTACH_BUFSIZ - propcount, propcount);
    
    // Attach the category protocol lists to rwe's protocols
    rwe->protocols.attachLists(protolists + ATTACH_BUFSIZ - protocount, protocount);
}
  • Get or create rwe for the main class; the category's methods, properties, and protocols are added to it.
  • Traverse all categories, fetch each one's method list, property list, and protocol list, and store them in the corresponding buffer. Each buffer holds 64 entries and is filled from back to front.
  • If more than 64 lists accumulate, call attachLists to attach the 64 buffered method lists, property lists, or protocol lists to rwe first, then keep collecting the rest.
  • After all categories have been traversed, call attachLists to attach whatever remains in the buffers (at most 64 lists each) to rwe.
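
The back-to-front buffering can be hard to picture, so here is a minimal C++ sketch of the same chunking scheme. It is not the runtime's code. It assumes a hypothetical ListStore whose attachLists prepends (as the source comment says attachLists does), uses plain ints in place of method lists, and takes the buffer size as a template parameter so that a small value can stand in for 64.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for rwe->methods: attachLists prepends the new chunk.
struct ListStore {
    std::vector<int> lists;
    void attachLists(const int *added, std::size_t addedCount) {
        lists.insert(lists.begin(), added, added + addedCount);
    }
};

// Same chunking shape as attachCategories; BUFSIZE plays the role of ATTACH_BUFSIZ.
template <std::size_t BUFSIZE>
std::vector<int> attachInChunks(const std::vector<int> &cats) {
    static_assert(BUFSIZE > 0, "buffer must hold at least one list");
    ListStore store;
    int buf[BUFSIZE];
    std::size_t count = 0;
    for (int cat : cats) {             // front to back, in load order
        if (count == BUFSIZE) {        // buffer full: flush it, then start over
            store.attachLists(buf, count);
            count = 0;
        }
        buf[BUFSIZE - ++count] = cat;  // fill slots BUFSIZE-1, BUFSIZE-2, ...
        assert(count <= BUFSIZE);
    }
    if (count > 0) {                   // flush the partially filled tail
        store.attachLists(buf + BUFSIZE - count, count);
    }
    return store.lists;
}
```

With a buffer of 2, attachInChunks<2>({1, 2, 3, 4, 5}) yields 5, 4, 3, 2, 1: the category loaded last ends up first, whether or not the buffer overflowed along the way.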

1.2: attachLists

The attachCategories function collects the category data and then calls attachLists to attach it to the main class, so the core logic lives in attachLists.

    void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;
        // Enter this branch if array() already exists
        if (hasArray()) {
            // many lists -> many lists
            // oldCount = number of lists already in array()
            uint32_t oldCount = array()->count;
            // newCount = oldCount + addedCount
            uint32_t newCount = oldCount + addedCount;
            // Allocate an array_t sized for newCount lists; array_t::lists holds pointers to lists
            array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount));
            // Set the new array's count to newCount
            newArray->count = newCount;
            // Set the old array's count to newCount as well
            array()->count = newCount;
            // Copy the lists from the old array()->lists into newArray->lists,
            // starting with the last one
            for (int i = oldCount - 1; i >= 0; i--)
                newArray->lists[i + addedCount] = array()->lists[i];
            // Copy the lists pointed to by addedLists into newArray->lists,
            // starting at index 0
            for (unsigned i = 0; i < addedCount; i++)
                newArray->lists[i] = addedLists[i];
            // Free the original array()
            free(array());
            // Install newArray
            setArray(newArray);
            validate();
        }
        // If the main class has methods, the first list to arrive is the main class's method list;
        // if it has none, the first to arrive is a category's method list
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
            validate();
        }
        // Otherwise the single list is promoted to an array_t;
        // array_t::lists stores a pointer to each method list
        else {
            // 1 list -> many lists
            // Save the existing single list as oldList
            Ptr<List> oldList = list;
            // oldCount is 1 if oldList exists, otherwise 0
            uint32_t oldCount = oldList ? 1 : 0;
            // newCount = oldCount + addedCount
            uint32_t newCount = oldCount + addedCount;
            // Allocate an array_t sized for newCount lists and install it
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            // Set the number of arrays
            array()->count = newCount;
            // Put the original list at the end of the array
            if (oldList) array()->lists[addedCount] = oldList;
            // Copy the lists pointed to by addedLists into array()->lists, starting at index 0.
            // attachCategories already filled its buffer back to front, so a plain forward copy works here.
            // The later a category is loaded, the nearer the front of the class's lists it sits
            for (unsigned i = 0; i < addedCount; i++)
                array()->lists[i] = addedLists[i];
            validate();
        }
    }

There are three cases of attachLists:

0 lists -> 1 list

  • Assign addedLists[0] to list.

If the main class has methods, the first list to arrive is the main class's method list; if it has none, the first to arrive is a category's method list. The same goes for properties and protocols.

1 list -> many lists

  • Compute the old count: there is at most one existing list, so oldCount is 1 if it exists and 0 otherwise.
  • The new count is the old count plus the number being added: newCount = oldCount + addedCount.
  • Allocate memory for the new count, typed as array_t, and install it with setArray.
  • Set the array's count to newCount.
  • Put the original list at the end of the array. There was at most one, so no loop is needed.
  • Iterate over addedLists and store each list into the array's lists, starting at index 0.

Many lists -> many lists (array() exists)

  • Compute the old array's count.
  • The new count is the old count plus the number being added: newCount = oldCount + addedCount.
  • Allocate memory for the new count, typed as array_t.
  • Set the new array's count to newCount.
  • Set the old array's count to newCount.
  • Copy each list from the old array into the new array's lists, starting from the end.
  • Iterate over addedLists and store each list into the new array's lists, starting at index 0.
  • Free the old array.
  • Install the new array.
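
The three branches can be traced with a toy model. The following C++ sketch is an illustration under assumptions, not the runtime's implementation: ListArray, grow, and lastBranch are hypothetical names, strings stand in for method-list pointers, and a std::vector replaces the malloc'd array_t. Only the branch selection and the resulting order are meant to match the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: strings stand in for method-list pointers.
struct ListArray {
    std::vector<std::string> lists;  // front of the vector = front of lists
    bool hasArray = false;           // true once promoted to the array form
    std::string lastBranch;          // records which branch ran (illustration only)

    void attachLists(const std::vector<std::string> &added) {
        if (added.empty()) return;
        if (hasArray) {
            lastBranch = "many lists -> many lists";
            grow(added);
        } else if (lists.empty() && added.size() == 1) {
            lastBranch = "0 lists -> 1 list";
            lists = added;
        } else {
            lastBranch = "1 list -> many lists";
            grow(added);
            hasArray = true;
        }
    }

private:
    // Old entries shift toward the back; new entries take the front.
    void grow(const std::vector<std::string> &added) {
        std::vector<std::string> grown(lists.size() + added.size());
        for (std::size_t i = lists.size(); i-- > 0;)
            grown[i + added.size()] = lists[i];
        for (std::size_t i = 0; i < added.size(); i++)
            grown[i] = added[i];
        assert(grown.size() == lists.size() + added.size());
        lists = grown;
    }
};
```

Attaching "XJPerson", then "XJB", then "XJA" (one list per call) walks through the three branches in order and leaves lists as XJA, XJB, XJPerson.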

Note: List* const * addedLists is a pointer to a pointer. By analogy, given XJPerson *p = [XJPerson alloc], p is a single-level pointer and &p is a pointer to a pointer.

array_t structure

class list_array_tt {
    struct array_t {
        uint32_t count;
        Ptr<List> lists[0];

        static size_t byteSize(uint32_t count) {
            return sizeof(array_t) + count*sizeof(lists[0]);
        }
        size_t byteSize() {
            return byteSize(count);
        }
    };

 protected:
    ...
 private:
    union {
        Ptr<List> list;
        uintptr_t arrayAndFlag;
    };

    bool hasArray() const {
        return arrayAndFlag & 1;
    }

    array_t *array() const {
        return (array_t *)(arrayAndFlag & ~1);
    }

    void setArray(array_t *array) {
        arrayAndFlag = (uintptr_t)array | 1;
    }

    void validate() {
        for (auto cursor = beginLists(), end = endLists(); cursor != end; cursor++)
            cursor->validate();
    }
    
    ...
}
  • setArray stores arrayAndFlag = (uintptr_t)array | 1, so bit 0 of arrayAndFlag is always 1 afterwards.
  • hasArray evaluates arrayAndFlag & 1: it returns YES if bit 0 of arrayAndFlag is 1 and NO otherwise.
  • So once setArray has been called, hasArray returns YES until the array is released.
  • list and arrayAndFlag share a union, so they are mutually exclusive: if there is a list there is no array, and if there is an array there is no list.
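
The low-bit tagging can be tried in isolation. This is a small C++ sketch, assuming a hypothetical TaggedSlot type that models only the arrayAndFlag member (the union with list is left out to keep the example free of type punning). The three member functions follow the bodies shown in the source above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Models only the arrayAndFlag half of the union: bit 0 marks "this is an array".
struct TaggedSlot {
    std::uintptr_t arrayAndFlag = 0;

    bool hasArray() const { return (arrayAndFlag & 1) != 0; }

    void *array() const {
        return reinterpret_cast<void *>(arrayAndFlag & ~std::uintptr_t{1});
    }

    void setArray(void *p) {
        // The trick relies on allocations being at least 2-byte aligned.
        assert((reinterpret_cast<std::uintptr_t>(p) & 1) == 0);
        arrayAndFlag = reinterpret_cast<std::uintptr_t>(p) | 1;
    }
};
```

Because malloc returns suitably aligned memory, bit 0 of a real pointer is always 0, which is what frees it up to serve as the flag; array() masks the flag off again to recover the original pointer.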

1.2.1: attachLists flow chart:

1.3: Verifying attachCategories and attachLists by example

With the flow of attachCategories and attachLists analyzed, the following example verifies them (using methods as the example).

Create the main class XJPerson and the categories XJPerson+XJA, XJPerson+XJB.

@implementation XJPerson

+ (void)load
{
    NSLog(@"__%s__", __func__);
}

- (void)instanceMethod1
{
    NSLog(@"%s", __func__);
}

- (void)instanceMethod2
{
    NSLog(@"%s", __func__);
}

@end

@implementation XJPerson (XJA)

+ (void)load
{
    NSLog(@"%s", __func__);
}

- (void)xja_instanceMethod1
{
    NSLog(@"%s", __func__);
}

- (void)xja_instanceMethod2
{
    NSLog(@"%s", __func__);
}

@end

@implementation XJPerson (XJB)

+ (void)load
{
    NSLog(@"%s", __func__);
}

- (void)xjb_instanceMethod1
{
    NSLog(@"%s", __func__);
}

- (void)xjb_instanceMethod2
{
    NSLog(@"%s", __func__);
}

@end

1.3.1: attachCategories verification

Add the debug code, run the source code, and locate the attachCategories function to debug.

  • The category being loaded is XJA.

Then print the variable mlists to inspect its data.

  • The last slot of mlists holds the address of category XJA's method list.

mlists has type method_list_t *[] (an array of pointers); each element is a method_list_t *.

  • The address of the first element of mlists is therefore a pointer to a pointer.
  • mlists + ATTACH_BUFSIZ - mcount is pointer arithmetic: mlists is the base address and ATTACH_BUFSIZ - mcount is the offset of the first filled slot, so the result is also a pointer to a pointer.
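
The pointer arithmetic can be checked with a tiny C++ sketch. firstFilledSlot is a hypothetical helper, not a runtime function; it assumes a buffer filled from back to front, as mlists is, and returns the same address that mlists + ATTACH_BUFSIZ - mcount computes.

```cpp
#include <cassert>
#include <cstddef>

// Returns the address of the first filled slot in a buffer filled back to front.
// N plays the role of ATTACH_BUFSIZ and count the role of mcount.
template <typename T, std::size_t N>
T **firstFilledSlot(T *(&buf)[N], std::size_t count) {
    assert(count <= N);
    return buf + N - count;  // the array decays to T **, then the offset is applied
}
```

For a 4-slot buffer holding two entries in slots 2 and 3, the helper returns the address of slot 2; indexing from there walks the entries in the order attachLists will see them.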

1.3.2: attachLists verification

The verification cross-checks dynamic source debugging against the Mach-O file (the executable), so first look at how the source reads the Mach-O file. The source calls _getObjc2NonlazyClassList and _getObjc2NonlazyCategoryList; their implementation is as follows.

#define GETSECT(name, type, sectname)                                   \
    type *name(const headerType *mhdr, size_t *outCount) {              \
        return getDataSection<type>(mhdr, sectname, nil, outCount);     \
    }                                                                   \
    type *name(const header_info *hi, size_t *outCount) {               \
        return getDataSection<type>(hi->mhdr(), sectname, nil, outCount); \
    }

//      function name                 content type     section name
// Sections ending in "refs" hold references (selectors, classes) that need fixing up
GETSECT(_getObjc2SelectorRefs,        SEL,             "__objc_selrefs");

GETSECT(_getObjc2MessageRefs,         message_ref_t,   "__objc_msgrefs");

GETSECT(_getObjc2ClassRefs,           Class,           "__objc_classrefs");

GETSECT(_getObjc2SuperRefs,           Class,           "__objc_superrefs");
// Mach-O section __objc_classlist: list of all classes (categories excluded)
GETSECT(_getObjc2ClassList,           classref_t const,      "__objc_classlist");
// Mach-O section __objc_nlclslist: list of non-lazily loaded classes
GETSECT(_getObjc2NonlazyClassList,    classref_t const,      "__objc_nlclslist");
// Mach-O section __objc_catlist: list of categories
GETSECT(_getObjc2CategoryList,        category_t * const,    "__objc_catlist");
// Mach-O section __objc_catlist2: list of categories
GETSECT(_getObjc2CategoryList2,       category_t * const,    "__objc_catlist2");
// Mach-O section __objc_nlcatlist: list of non-lazily loaded categories
GETSECT(_getObjc2NonlazyCategoryList, category_t * const,    "__objc_nlcatlist");
// Mach-O section __objc_protolist: list of protocols
GETSECT(_getObjc2ProtocolList,        protocol_t * const,    "__objc_protolist");
// Mach-O section __objc_protorefs: protocol references to fix up
GETSECT(_getObjc2ProtocolRefs,        protocol_t *,    "__objc_protorefs");
// Mach-O section __objc_init_func: list of libobjc initializer functions
GETSECT(getLibobjcInitializers,       UnsignedInitializer, "__objc_init_func");

A string such as __objc_nlclslist is the name of a section in the Mach-O file; in MachOView, the data of the selected section is shown on the right.

1.3.2.1: 0 lists -> 1 list

After creating rwe, the extAlloc function calls attachLists to add the main class's methods to the corresponding rwe list if the main class has any, and does nothing otherwise (properties and protocols work the same way; methods are used as the example here). That gives two cases: the main class has methods, or it has none (by default the example class has methods).

1.3.2.1.1: Main class has methods

Run the source, break in the attachCategories function, and step to the line that obtains rwe.

Enter the extAllocIfNeeded function to get or create an RWE.

Since we don’t have an RWE yet, we’ll go into extAlloc to create an RWE.

  • The main class has methods here, so list has a value and attachLists is called to add the main class's methods to the corresponding rwe list.

  • rwe has just been created, so its list and array are both empty and execution enters the 0 lists -> 1 list branch.
  • LLDB shows that the list added at this point is the main class's method list.

1.3.2.1.2: Main class has no methods

Comment out the main class method and repeat the debugging steps above to enter the extAlloc function.

  • The main class now has no methods, so list is NULL and attachLists is not called.

Continue debugging back in the attachCategories function and use LLDB to inspect cats_list.

  • What is being loaded at this point is category XJA, wrapped in a locstamped_category_t *.

Continue debugging.

  • After the category data has been read, attachLists is called to add it to rwe (only methods are analyzed here; properties and protocols are similar).

  • The main class has no methods, so list and array are both empty and the category is loaded through the 0 lists -> 1 list branch.
  • LLDB shows that the list added at this point is category XJA's method list.

There are two categories, XJA and XJB, so why is XJA loaded first rather than XJB?

The load order follows the order in which the category files are compiled.

If category XJB is placed before category XJA in the compile order, XJB is compiled first and loaded first.

1.3.2.1.3: 0 lists -> 1 list conclusion

0 lists -> 1 list

  • Main class has methods: the main class's method list is added right after rwe is created.
  • Main class has no methods: the first list added to rwe belongs to the first category loaded.

1.3.2.2: 1 list -> many lists

Restore the main class's methods, place category XJB before XJA (to verify the load order), and run the source.

  • Category XJB is ahead of XJA, so XJB is loaded first.
  • After category XJB's data has been read, attachLists is called to add it to rwe (only methods are analyzed here; properties and protocols are similar).

  • list points to the main class's method list, and addedLists holds the pointer to category XJB's method list.

Next, inspect the result after the merge.

  • array()->lists contains the addresses of the two method lists, with the category's method list first.

1.3.2.3: many lists -> many lists

Continue debugging as category XJA loads; execution enters the many lists -> many lists branch.

  • After category XJA's data has been read, attachLists is called to add it to rwe (only methods are analyzed here; properties and protocols are similar).

  • addedLists holds the pointer to category XJA's method list.

Next, inspect the result after the merge.

  • After the merge, newArray->lists holds 3 method list addresses, in order from front to back: category XJA, category XJB, the main class. The main class is at the end of lists, and the last category compiled is at the front.

Two: loading classes and categories together

Classes divide into lazily and non-lazily loaded classes depending on whether they implement +load. Is category loading also tied to +load, and what happens when a class and its categories are loaded together? The following cases are explored:

  1. Both the class and the category implement +load.
  2. The class implements +load, the category does not.
  3. Neither the class nor the category implements +load.
  4. The class does not implement +load, the category does.
  5. Multiple categories.

To facilitate tracing, add debugging code and breakpoints to the key functions of the two routes analyzed earlier.

2.1: Both the class and the category implement +load

Non-lazily loaded class data is read from the Mach-O file by _getObjc2NonlazyClassList, and non-lazily loaded category data is read from the Mach-O file by _getObjc2NonlazyCategoryList.

Schematic of reading non-lazily loaded class data:

Schematic of reading non-lazily loaded category data:

  • Non-lazily loaded class flow: map_images -> map_images_nolock -> _read_images -> realizeClassWithoutSwift -> methodizeClass -> attachToClass.
  • Non-lazily loaded category flow: load_images -> loadAllCategories -> load_categories_nolock -> attachCategories -> attachLists.

Log output:

In this case the class is non-lazily loaded; inspect its ro data in the realizeClassWithoutSwift function.

  • ro contains only the main class's methods and none of the category's.

After the main class is loaded, the load_images function is called.

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // didInitialAttachCategories ensures loadAllCategories runs only once.
    // didCallDyldNotifyRegister is set to true in _objc_init after the callbacks are registered.
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        // Load all categories
        loadAllCategories();
    }
    ...
}
  • didInitialAttachCategories ensures that loadAllCategories runs only once (load_images itself is called many times); didCallDyldNotifyRegister is set to true in _objc_init after the callbacks are registered.
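
The once-only guard is easy to model. In this C++ sketch, LoaderState is a hypothetical struct that mirrors the two flags named in the source, and a counter stands in for the real loadAllCategories call; nothing else from load_images is reproduced.

```cpp
#include <cassert>

// Hypothetical mirror of the two flags that load_images checks.
struct LoaderState {
    bool didInitialAttachCategories = false;
    bool didCallDyldNotifyRegister = false;
    int loadAllCategoriesCalls = 0;  // counts how often the guarded body ran

    void loadImages() {
        if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
            didInitialAttachCategories = true;
            ++loadAllCategoriesCalls;  // stands in for loadAllCategories()
            assert(loadAllCategoriesCalls == 1);
        }
        // ... the rest of load_images would run on every call
    }
};
```

Calls made before didCallDyldNotifyRegister is set do nothing; the first call afterwards runs the guarded body, and every later call skips it.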

2.1.1: load_categories_nolock

The loadAllCategories function loops over each header_info and calls load_categories_nolock for it. The core code is as follows:

static void load_categories_nolock(header_info *hi) {
    bool hasClassProperties = hi->info()->hasCategoryClassProperties();

    size_t count;
    // Declare and define the processCatlist lambda
    auto processCatlist = [&](category_t * const *catlist) {
        for (unsigned i = 0; i < count; i++) { // count = number of categories
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);
            // Wrap cat and hi in a locstamped_category_t
            locstamped_category_t lc{cat, hi};
            ...
            // Process this category.
            if (cls->isStubClass()) {
                ...
            } else {
                // Instance methods / protocols / instance properties
                if (cat->instanceMethods ||  cat->protocols
                    ||  cat->instanceProperties)
                {
                    if (cls->isRealized()) { // Class already realized (non-lazy class): attach now
                        attachCategories(cls, &lc, 1, ATTACH_EXISTING);
                    } else { // Class not realized yet (lazy class): stash the category for later
                        objc::unattachedCategories.addForClass(lc, cls);
                    }
                }
                // Class methods / protocols / class properties
                if (cat->classMethods  ||  cat->protocols
                    ||  (hasClassProperties && cat->_classProperties))
                {
                    if (cls->ISA()->isRealized()) { // Metaclass already realized: attach now
                        attachCategories(cls->ISA(), &lc, 1, ATTACH_EXISTING | ATTACH_METACLASS);
                    } else { // Metaclass not realized yet: stash the category for later
                        objc::unattachedCategories.addForClass(lc, cls->ISA());
                    }
                }
            }
        }
    };

    // Call processCatlist
    // Read the __objc_catlist section from the Mach-O file; count receives the number of categories
    processCatlist(hi->catlist(&count));
    // __objc_catlist2
    processCatlist(hi->catlist2(&count));
}
  • Declare and define the processCatlist lambda.
  • Call processCatlist on the Mach-O sections named __objc_catlist and __objc_catlist2 (__objc_catlist2 is generated only when there is a shared cache).
  • processCatlist loops over the categories it reads and processes each one.
  • Non-lazily loaded categories are attached one at a time, so the count passed to attachCategories is always 1.

Viewing the category information in LLDB:

The attachCategories and attachLists functions were analyzed in sections 1.1 and 1.2 and are not repeated here.

If you vary the number of categories that implement +load, you find that as long as at least one category implements +load, every category goes through the non-lazy category loading flow and none is merged into the main class's ro.

Conclusion: when the class implements +load and at least one category implements +load, none of the categories is merged into the main class's ro; they are all attached to rwe through the non-lazy loading flow.

2.2: The class implements +load, the category does not

  • Non-lazily loaded class flow: map_images -> map_images_nolock -> _read_images -> realizeClassWithoutSwift -> methodizeClass -> attachToClass

Log output:

  • The non-lazily loaded class still goes through the map_images flow.
  • The lazily loaded category does not go through the attachCategories flow.

So when is the category data loaded?

  • The category list in the Mach-O file has no data, which shows that the category data is not loaded dynamically.

If it is not loaded dynamically, it was most likely merged at compile time. So when realizeClassWithoutSwift loads the class, fetch the ro data and check whether it contains the category's data.

  • ro contains not only the main class's methods but also the category's methods.

ro is fixed at compile time, which means the data of a lazily loaded category has already been merged into the main class at compile time, with the category's methods placed before the main class's methods.

2.3: Neither the class nor the category implements +load

Log output:

The log shows that neither the class loading flow nor the category loading flow ran. Check the Mach-O file.

  • The Mach-O file has no sections for non-lazily loaded classes or non-lazily loaded categories, and the category list has no data.

Instantiate XJPerson in the main function and call an instance method, break in realizeClassWithoutSwift (reached through the slow message lookup flow), and look at the call stack.

  • The call stack shows that realizeClassWithoutSwift is called from the slow message lookup flow, which ties in with the slow message lookup analyzed in an earlier article.

Viewing the ro data in LLDB:

  • The ro data printed by LLDB shows that the category's data was also merged into ro at compile time.

A lazily loaded class is not loaded until it receives its first message, while the data of a lazily loaded category is merged into ro at compile time.
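
As a rough illustration of "realize on first message", here is a C++ sketch built on assumptions: LazyClass, realize, and respondsTo are hypothetical names, ro is reduced to a list of method names, and the lookup is a plain linear search. It only shows the ordering described above, in which the class is realized on demand and the category's methods are already sitting in ro, ahead of the main class's.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a lazily loaded class.
struct LazyClass {
    std::vector<std::string> ro;  // method names fixed at compile time
    bool realized = false;
    int realizeCount = 0;

    // Stands in for realizeClassWithoutSwift.
    void realize() {
        realized = true;
        ++realizeCount;
    }

    // Stands in for the slow message lookup: realize on demand, then search ro.
    bool respondsTo(const std::string &sel) {
        if (!realized) realize();
        assert(realized);
        for (const std::string &m : ro) {
            if (m == sel) return true;
        }
        return false;
    }
};
```

Nothing happens to the class until the first respondsTo call; that call realizes it exactly once, and later calls reuse the realized state.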

2.4: The class does not implement +load, the category does

This case is more complicated, because the number of non-lazily loaded categories affects the whole loading flow.

2.4.1: The class does not implement +load, one category does

Run the source and view the log output:

  • The flow in this case is the same as in the case where the class implements +load and the category does not.

The non-lazily loaded category forces the lazily loaded class to become a non-lazily loaded class, and the category's data is merged into the main class.

  • The XJPerson class is forced to become a non-lazily loaded class.

  • The category list has no data; the data was merged into ro at compile time.

Run the source, stop after entering realizeClassWithoutSwift, and check the ro data in LLDB:

  • The lazily loaded class is forced to become a non-lazily loaded class, and the category's data was merged into ro at compile time.

2.4.2: The class does not implement +load, multiple categories do

Add category XJB back, run the source, and view the log output:

  • The main class is not loaded through the map_images flow.
  • load_categories_nolock is called twice to load the two categories, but attachCategories is not called afterwards. Instead, realizeClassWithoutSwift is called to load the main class, and only then is attachCategories called.

This raises two questions:

  1. Why was attachCategories not called after the two categories were loaded?
  2. How does realizeClassWithoutSwift get called?

  • In the Mach-O file, both the category list and the non-lazily loaded category list contain categories XJA and XJB, but there is no non-lazily loaded class list.

Enable the breakpoint in the debugging code inside load_categories_nolock, run the source, and see which path the two calls take.

  • If cls is already realized, execution takes the attachCategories path; otherwise it takes the unattachedCategories.addForClass path.

Here execution takes the addForClass path.

void addForClass(locstamped_category_t lc, Class cls)
{
    runtimeLock.assertLocked();

    if (slowpath(PrintConnecting)) {
        _objc_inform("CLASS: found category %c%s(%s)",
                     cls->isMetaClassMaybeUnrealized() ? '+' : '-',
                     cls->nameForLogging(), lc.cat->name);
    }
    // Try to insert an entry keyed by cls that holds lc;
    // result.second is true only if a new entry was inserted
    auto result = get().try_emplace(cls, lc);
    // If cls already had an entry, append lc to its existing list
    if (!result.second) {
        result.first->second.append(lc);
    }
}
  • First try to insert lc into the hash table with cls as the key; if cls has no entry yet, a new one is created.
  • If result.second is false, cls already had an entry, so lc is appended to it.

At this point the category data is only stored in the hash table keyed by cls; it has not yet been attached to rwe.
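
The insert-or-append pattern in addForClass can be reproduced with the standard library. This C++ sketch is an approximation under stated assumptions: PendingCategories is a hypothetical stand-in for objc::unattachedCategories, class and category names are plain strings, and std::unordered_map::emplace takes the place of try_emplace (both return an iterator plus a flag saying whether a new entry was inserted). takeForClass is a hypothetical helper, loosely modeled on the later attach step, that hands back and clears what is pending for a class.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for objc::unattachedCategories: class name -> pending categories.
using PendingCategories = std::unordered_map<std::string, std::vector<std::string>>;

void addForClass(PendingCategories &pending, const std::string &cls, const std::string &cat) {
    // emplace returns (iterator, inserted), like try_emplace in the source.
    auto result = pending.emplace(cls, std::vector<std::string>{cat});
    if (!result.second) {
        // cls already had an entry: append to its existing list.
        result.first->second.push_back(cat);
    }
    assert(!pending[cls].empty());
}

// Hypothetical helper: hand back and clear everything pending for cls.
std::vector<std::string> takeForClass(PendingCategories &pending, const std::string &cls) {
    auto it = pending.find(cls);
    if (it == pending.end()) return {};
    std::vector<std::string> cats = it->second;
    pending.erase(it);
    return cats;
}
```

Adding XJA and then XJB for XJPerson leaves a single entry holding both categories in arrival order, which is the state the debugger shows before the class is realized.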

To explore how the class gets loaded, enable the breakpoint in the debugging code inside realizeClassWithoutSwift and continue debugging.

  • The call stack shows the flow: load_images -> prepare_load_methods -> realizeClassWithoutSwift.

In the call stack, select load_images to see how prepare_load_methods is called.

  • After loadAllCategories finishes, load_images calls hasLoadMethods to check whether there are any +load methods. If there are none it returns; if there are, it calls prepare_load_methods.

bool hasLoadMethods(const headerType *mhdr)
{
    size_t count;
    if (_getObjc2NonlazyClassList(mhdr, &count)  &&  count > 0) return true;
    if (_getObjc2NonlazyCategoryList(mhdr, &count)  &&  count > 0) return true;
    return false;
}
  • Call _getObjc2NonlazyClassList to read the Mach-O section named __objc_nlclslist; return true if it has data.
  • Call _getObjc2NonlazyCategoryList to read the Mach-O section named __objc_nlcatlist; return true if it has data.
  • Otherwise return false.

Categories XJA and XJB both implement +load, so hasLoadMethods returns true and prepare_load_methods is called.

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    // Get the list of non-lazily loaded classes from the Mach-O file
    classref_t const *classlist =
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        // Add the remapped class to the loadable_classes list
        schedule_class_load(remapClass(classlist[i]));
    }

    // Get the list of non-lazily loaded categories from the Mach-O file
    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        // Remap the class the category belongs to
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        // Realize (load) the class
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        // Add the category to the loadable_categories list
        add_category_to_loadable_list(cat);
    }
}
  • Get the list of non-lazily loaded classes from the Mach-O file.
  • Call schedule_class_load to add each remapped class to the loadable_classes list.
  • Get the list of non-lazily loaded categories from the Mach-O file.
  • Remap the class that each category belongs to.
  • Call realizeClassWithoutSwift to realize (load) the class.
  • Call add_category_to_loadable_list to add the category to the loadable_categories list.

Check the schedule_class_load implementation.

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    ASSERT(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;
    // Recurse into the superclass of cls first
    // Ensure superclass-first ordering
    schedule_class_load(cls->getSuperclass());
    // Add the class to the loadable list (its superclasses have already been processed)
    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED);
}

  • schedule_class_load recurses up the superclass chain, so add_class_to_loadable_list is called for the superclass before it is called for the class itself.
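The superclass-first ordering can be tried out in isolation. The following is only a minimal sketch, not the runtime's code: MiniClass, scheduleClassLoad, loadableClasses and the class XJStudent are hypothetical names made up for the illustration, and the scheduled flag plays the role of RW_LOADED.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature class record; the real objc_class is far richer.
struct MiniClass {
    std::string name;
    MiniClass *superclass;  // nullptr for a root class
    bool hasLoad;           // whether the class implements +load
    bool scheduled;         // plays the role of the RW_LOADED flag
};

// Plays the role of the loadable_classes list.
std::vector<std::string> loadableClasses;

// Recurse into the superclass first so that parents are queued before children.
void scheduleClassLoad(MiniClass *cls) {
    if (!cls) return;
    assert(cls->superclass != cls);  // a class cannot be its own superclass
    if (cls->scheduled) return;      // already handled, like the RW_LOADED check
    scheduleClassLoad(cls->superclass);
    // Only classes that implement +load end up in the list.
    if (cls->hasLoad) loadableClasses.push_back(cls->name);
    cls->scheduled = true;
}
```

Scheduling a subclass before its parent still queues the parent first, and scheduling the parent again afterwards is a no-op because of the flag.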

The add_class_to_loadable_list function has similar logic to the add_category_to_loadable_list function.

static struct loadable_class *loadable_classes = nil;
static int loadable_classes_used = 0;
static int loadable_classes_allocated = 0;

struct loadable_class {
    Class cls;  // may be nil
    IMP method;
};

void add_class_to_loadable_list(Class cls)
{
    IMP method;
    
    loadMethodLock.assertLocked();

    // Get the +load method
    method = cls->getLoadMethod();
    if (!method) return;  // Don't bother if cls has no +load method.

    // Grow the list when it is full; the first growth allocates 16 slots
    if (loadable_classes_used == loadable_classes_allocated) {
        loadable_classes_allocated = loadable_classes_allocated*2 + 16;
        loadable_classes = (struct loadable_class *)
            realloc(loadable_classes,
                              loadable_classes_allocated *
                              sizeof(struct loadable_class));
    }

    // Store the class and its +load IMP
    loadable_classes[loadable_classes_used].cls = cls;
    loadable_classes[loadable_classes_used].method = method;
    loadable_classes_used++;
}

static struct loadable_category *loadable_categories = nil;
static int loadable_categories_used = 0;
static int loadable_categories_allocated = 0;

struct loadable_category {
    Category cat;  // may be nil
    IMP method;
};

void add_category_to_loadable_list(Category cat)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = _category_getLoadMethod(cat);

    // Don't bother if cat has no +load method
    if (!method) return;

    // Grow the list when it is full; the first growth allocates 16 slots
    if (loadable_categories_used == loadable_categories_allocated) {
        loadable_categories_allocated = loadable_categories_allocated*2 + 16;
        loadable_categories = (struct loadable_category *)
            realloc(loadable_categories,
                              loadable_categories_allocated *
                              sizeof(struct loadable_category));
    }
    // Store the category and its +load IMP
    loadable_categories[loadable_categories_used].cat = cat;
    loadable_categories[loadable_categories_used].method = method;
    loadable_categories_used++;
}
  • Both functions store the class or category together with the IMP of its +load method in the corresponding list.
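Both functions share the same growth rule, allocated * 2 + 16, so the capacity goes 0, 16, 48, 112 and so on. Here is a small standalone sketch of that pattern. LoadEntry, LoadableList, appendLoadable and sampleLoad are hypothetical names, and unlike the runtime code above this sketch checks the result of realloc.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical entry: an owner (class or category) plus its +load function.
struct LoadEntry {
    const void *owner;
    void (*method)();
};

// Hypothetical stand-in for the loadable_classes / loadable_categories globals.
struct LoadableList {
    LoadEntry *items = nullptr;
    int used = 0;
    int allocated = 0;
};

// A do-nothing function standing in for a +load IMP.
void sampleLoad() {}

// Append an entry, growing the buffer to allocated * 2 + 16 slots when it is full.
bool appendLoadable(LoadableList &list, const void *owner, void (*method)()) {
    if (!method) return false;  // nothing to record without a +load method
    assert(list.used <= list.allocated);
    if (list.used == list.allocated) {
        int newCapacity = list.allocated * 2 + 16;
        void *grown = std::realloc(list.items, newCapacity * sizeof(LoadEntry));
        if (!grown) return false;  // assumption: fail softly, the runtime does not check
        list.items = static_cast<LoadEntry *>(grown);
        list.allocated = newCapacity;
    }
    list.items[list.used].owner = owner;
    list.items[list.used].method = method;
    list.used++;
    return true;
}
```

The caller owns the buffer in this sketch and should release list.items with std::free when done.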

Continue debugging: realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories -> attachLists.

void attachToClass(Class cls, Class previously, int flags)
{
    runtimeLock.assertLocked();
    ASSERT((flags & ATTACH_CLASS) ||
           (flags & ATTACH_METACLASS) ||
           (flags & ATTACH_CLASS_AND_METACLASS));

    auto &map = get();
    // Look up the entry that addForClass inserted, using previously (cls) as the key
    auto it = map.find(previously);
    // map.end() is the "not found" marker, so this branch runs only when an entry exists
    if (it != map.end()) {
        // Get the category list stored for the class
        category_list &list = it->second;
        if (flags & ATTACH_CLASS_AND_METACLASS) {
            // Attach the category data to both the class and its metaclass
            int otherFlags = flags & ~ATTACH_CLASS_AND_METACLASS;
            attachCategories(cls, list.array(), list.count(), otherFlags | ATTACH_CLASS);
            attachCategories(cls->ISA(), list.array(), list.count(), otherFlags | ATTACH_METACLASS);
        } else {
            // Attach the category data to the class
            attachCategories(cls, list.array(), list.count(), flags);
        }
        // Erase the entry for cls from the hash table
        map.erase(it);
    }
}

  • Look up the hash table with previously (cls) as the key to find the category data that addForClass inserted.
  • If the returned iterator is not equal to the table's end marker, an entry was found: call attachCategories (the cats_count passed in is the number of all categories stored for the class), then erase the entry for cls from the hash table.
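The "park the categories, then hand them over and erase" pattern of addForClass and attachToClass can be sketched with a standard container. This is an illustration under assumptions, not the runtime's code: std::unordered_map stands in for the runtime's hash table, emplace stands in for try_emplace (both report through .second whether a new entry was inserted), and classes and categories are represented by plain names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

using CategoryNames = std::vector<std::string>;

// Plays the role of the table of categories that are waiting for their class.
std::unordered_map<std::string, CategoryNames> unattached;

// Like addForClass: create the entry for cls, or append to the existing one.
void addForClass(const std::string &cls, const std::string &category) {
    assert(!cls.empty());
    auto result = unattached.emplace(cls, CategoryNames{category});
    if (!result.second) {
        // cls already had an entry, so the new category joins its list
        result.first->second.push_back(category);
    }
}

// Like attachToClass: hand over everything parked for cls, then erase the entry.
CategoryNames attachToClass(const std::string &cls) {
    CategoryNames attached;
    auto it = unattached.find(cls);
    if (it != unattached.end()) {
        attached = it->second;  // the real code calls attachCategories here
        unattached.erase(it);
    }
    return attached;
}
```

Calling attachToClass a second time for the same class returns an empty list, which mirrors why the entry is erased after the categories have been attached.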

Inspect the category data with LLDB:

  • The condition it != map.end() holds, so attachCategories is called.

Call process: load_images -> loadAllCategories -> load_categories_nolock -> addForClass -> prepare_load_methods -> realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories -> attachLists.

Conclusion: if the main class does not implement +load and at least two categories implement +load, then after loadAllCategories finishes, the prepare_load_methods step of load_images forces the main class to load, and the category data is then attached to rwe.

2.5: Summary

  • The class implements +load (a non-lazily loaded class):

    • At least one category implements +load: loadAllCategories in the load_images process attaches all category data to rwe; it is not merged into the main class's ro.

    • No category implements +load: the category methods are merged into the class's ro.

  • The class does not implement +load:

    • Exactly one category implements +load: the category methods are merged into the class's ro, and because of the merge the class is forced to become a non-lazily loaded class.

    • At least two categories implement +load: the category methods are not merged into the main class's ro. After loadAllCategories finishes, the prepare_load_methods step of load_images forces the main class to load and then attaches the categories to rwe. (Because nothing was merged into ro, the class itself is lazily loaded; the categories cause it to be loaded.)

  • Neither the class nor any category implements +load: all category methods are merged into the class's ro, and the class is loaded when it receives its first message (a lazily loaded class).

  • A class that is non-lazily loaded itself, or whose subclass is, is realized during the map_images process.

  • A class that is lazily loaded itself but is realized because its categories, or its subclass's categories, are non-lazily loaded is realized in the prepare_load_methods step of load_images. (When ro is merged, the class becomes non-lazily loaded, just as if it had a +load method of its own.)

  • In essence, only a class that implements +load itself is a non-lazily loaded class. In all other cases the class is forced to load and is not non-lazily loaded per se.

  • Whether the method lists of the class and its categories are merged depends on the total number of +load methods: with zero or one they are merged, otherwise they are not.

  • Empty classes will not be loaded.
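The rule summarized above can be written down as a tiny decision function. It only restates this article's observations for one class and its own categories (subclasses are ignored), and LoadPlan and planFor are hypothetical names, so treat it as a memory aid rather than as runtime behavior guaranteed by Apple.

```cpp
#include <cassert>

// Hypothetical summary of what happens to one class and its categories.
struct LoadPlan {
    bool mergedIntoRo;  // category methods are folded into the class's ro
    bool lazilyLoaded;  // the class is realized only when it gets its first message
};

// classHasLoad: the main class implements +load.
// categoriesWithLoad: how many of its categories implement +load.
LoadPlan planFor(bool classHasLoad, int categoriesWithLoad) {
    assert(categoriesWithLoad >= 0);
    int totalLoads = (classHasLoad ? 1 : 0) + categoriesWithLoad;
    LoadPlan plan;
    plan.mergedIntoRo = totalLoads <= 1;  // zero or one +load: merged
    plan.lazilyLoaded = totalLoads == 0;  // any +load forces early loading
    return plan;
}
```

For example, planFor(false, 2) describes the case from the conclusion above: nothing is merged, and the class is loaded early because of its categories.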

⚠️ Why are the categories not merged into ro at compile time when the class and its categories implement two or more +load methods in total?

Because every +load method is called in load_images, and the +load method of each class or category may do its own work, so they should not be forcibly merged. In practice, implement +load as rarely as possible: with too many +load methods, each one has to be invoked separately, which prevents the categories from being merged into ro. It also moves the loading of classes and categories forward to program startup, which slows down launch.

Three: lookup of methods with the same name in a class and its categories

3.1: Method lookup process analysis

lookUpImpOrForward calls getMethodNoSuper_nolock to look up the method.

NEVER_INLINE
IMP lookUpImpOrForward(id inst, SEL sel, Class cls, int behavior)
{
    ...
    for (unsigned attempts = unreasonableClassCount();;) {
        if (curClass->cache.isConstantOptimizedCache(/* strict */true)) {
            ...
        } else {
            // curClass method list.
            // Search the class's method list (binary search). If found, jump to done, where the method is cached
            Method meth = getMethodNoSuper_nolock(curClass, sel);
            if (meth) {
                imp = meth->imp(false);
                goto done;
            }
            ...
        }
        ...
    }
    ...
}

3.1.1: getMethodNoSuper_nolock

static method_t *
getMethodNoSuper_nolock(Class cls, SEL sel)
{
    ...
    // Get the method lists
    auto const methods = cls->data()->methods();
    // Walk the method lists. methods may hold several method_list_t pointers
    // (two levels of indirection) because methods and categories can be added dynamically
    for (auto mlists = methods.beginLists(),
              end = methods.endLists();
         mlists != end;
         ++mlists)
    {
        // Search this list for the method
        method_t *m = search_method_list_inline(*mlists, sel);
        if (m) return m;
    }

    return nil;
}

  • Get the methods of cls (a two-level pointer structure).
  • Iterate over methods to get each stored method list of type method_list_t * (mlists) and call search_method_list_inline on it.
  • Iteration starts at beginLists, that is, at the front of the array, so categories that were attached later are searched first.
const Ptr<List>* beginLists() const {
    if (hasArray()) {
        // Return the address of the lists array inside array(), not array() itself
        return array()->lists;
    } else {
        // Single list: return its address so the caller gets the same pointer-to-pointer type
        return &list;
    }
}

const Ptr<List>* endLists() const {
    if (hasArray()) {
        return array()->lists + array()->count;
    } else if (list) {
        return &list + 1;
    } else {
        return &list;
    }
}

  • Depending on hasArray(), beginLists returns array()->lists or &list.
  • &list adds one level of indirection (* -> **), so the for loop works on the same data type in both cases.
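Why a later-attached category wins when ro is not merged can be shown with a small model. It is a sketch under assumptions: MiniMethod, MethodArray, attach and findOwner are hypothetical names, the lists are searched linearly instead of with binary search, and attach puts new lists at the front the way attachLists prepends them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical method record: selector name plus the class or category that owns it.
struct MiniMethod {
    std::string sel;
    std::string owner;
};

using MethodList = std::vector<MiniMethod>;

// Plays the role of the array of method lists stored in rwe.
struct MethodArray {
    std::vector<const MethodList *> lists;  // front = most recently attached

    // New lists go in front of the existing ones.
    void attach(const MethodList *added) {
        assert(added != nullptr);
        lists.insert(lists.begin(), added);
    }

    // Walk the lists front to back and return the owner of the first match.
    std::string findOwner(const std::string &sel) const {
        for (const MethodList *list : lists) {
            for (const MiniMethod &m : *list) {
                if (m.sel == sel) return m.owner;
            }
        }
        return "";  // not found
    }
};
```

If the main class's list is attached first, then XJA's and then XJB's, a lookup of sameMethod returns the XJB entry, because that list sits at the front of the array.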

3.1.2: Method list structure verification:

3.1.2.1: ro merged (using the case from section 2.2):

  • There is no array. &list adds one level of indirection (* -> **) so that the data type is uniform in the for loop of getMethodNoSuper_nolock.
  • ro is merged, so there is only list and no array.

  • ro is merged, so all the methods are in list.

3.1.2.2: ro not merged (using the case from section 2.1):

  • ro is not merged, so there is an array, which holds pointers to the method lists of the class and its categories.

  • Category XJB is loaded later and is placed in front of category XJA, so it is searched first. The last category loaded goes to the front of the array and is searched first; the main class's list, if it has any data, goes last.

3.1.3: findMethodInSortedMethodList

Call flow after getMethodNoSuper_nolock: getMethodNoSuper_nolock -> search_method_list_inline -> findMethodInSortedMethodList -> findMethodInSortedMethodList.

findMethodInSortedMethodList is the core of method lookup; it uses binary search.

ALWAYS_INLINE static method_t *
findMethodInSortedMethodList(SEL key, const method_list_t *list, const getNameFunc &getName)
{
    ASSERT(list);

    auto first = list->begin();  // Position of the first method
    auto base = first;
    decltype(first) probe;

    // Convert key to uintptr_t; the entries of a fixed-up method_list_t are sorted by this value
    uintptr_t keyValue = (uintptr_t)key;
    uint32_t count;

    // count starts as the number of methods; count >>= 1 halves it on every pass.
    // Example 1 (count = 8, target at index 2): probe = 4; count = 4, probe = 2, a match.
    // Example 2 (count = 8, target at index 7): probe = 4; base = 5, count = 7 >> 1 = 3, probe = 6;
    //                                           base = 7, count = 2 >> 1 = 1, probe = 7, a match.
    for (count = list->count; count != 0; count >>= 1) {
        // Probe the middle of the current range
        probe = base + (count >> 1);

        // uintptr_t value of the selector at probe
        uintptr_t probeValue = (uintptr_t)getName(probe);

        if (keyValue == probeValue) {
            // `probe` is a match.
            // Rewind looking for the *first* occurrence of this value.
            // This is required for correct category overrides.

            // While probe is not the first entry and the previous entry has the same
            // selector, step back: a category overrides this method, and the category
            // compiled last comes first
            while (probe > first && keyValue == (uintptr_t)getName((probe - 1))) {
                probe--;
            }
            return &*probe;
        }

        // The target is in the upper half: move base past probe
        if (keyValue > probeValue) {
            base = probe + 1;
            count--;
        }
    }

    return nil;
}
  • The loop while (probe > first && keyValue == (uintptr_t)getName((probe - 1))) handles the case where a category and the main class have methods with the same name that were merged into ro at compile time: it rewinds to the entry that belongs to the last-compiled category.

This is why a category method overrides a main-class method with the same name.
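The binary search with the rewind step is easy to reproduce outside the runtime. The sketch below follows the same loop shape on a plain sorted vector; SortedEntry and findFirstMatch are hypothetical names, the keys are arbitrary integers instead of selector addresses, and the owner strings are only labels for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sorted entry: a key (the selector address in the runtime) and a label.
struct SortedEntry {
    std::uintptr_t key;
    std::string owner;
};

// Binary search that returns the FIRST entry with a matching key, or nullptr.
const SortedEntry *findFirstMatch(const std::vector<SortedEntry> &list,
                                  std::uintptr_t key) {
    const SortedEntry *first = list.data();
    const SortedEntry *base = first;
    for (std::size_t count = list.size(); count != 0; count >>= 1) {
        const SortedEntry *probe = base + (count >> 1);
        assert(probe < first + list.size());  // the probe always stays inside the list
        if (key == probe->key) {
            // Rewind to the first of several equal keys (the category override case).
            while (probe > first && key == (probe - 1)->key) {
                probe--;
            }
            return probe;
        }
        if (key > probe->key) {
            // Continue in the upper half, just past the probe.
            base = probe + 1;
            count--;
        }
    }
    return nullptr;
}
```

With entries whose keys are 10, 20, 20, 20, 30, a search for 20 first lands on the middle entry and then rewinds to the entry at index 1, which is how the last-compiled category's implementation is picked in the merged ro.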

3.1.4: Binary search verification

Add and implement the method - (void)sameMethod in the main class XJPerson and in the categories XJA and XJB.

3.1.4.1: ro merged

As in section 2.2, only the main class implements +load, and sameMethod is called in the main function.

Run the source and first verify the ro data in realizeClassWithoutSwift during class loading:

  • The methods of the main class and the categories are now all merged into ro.

For the lookup to work, methods with the same name should be adjacent in the list.

sel before and after fix-up:

Method order before and after sorting:

  • After fix-up and sorting, the methods with the same name are indeed adjacent.

That is why the binary search in findMethodInSortedMethodList keeps looking backward after it finds a match.

As analyzed earlier, the sameMethod of category XJB should be the first one found during method lookup.

  • In the merged ro, the class and its categories have a method with the same name; the class's method comes last and the method of the last-loaded category comes first.

3.1.4.2: ro not merged

As in section 2.1, the main class and any one category implement +load, and sameMethod is called in the main function.

  • Because ro is not merged, the method lists are now in array, which is a two-level pointer, so the first list searched is the method list of category XJB.

Continue running to the binary search function findMethodInSortedMethodList to verify.

  • The matching sameMethod is found and returned; the while loop is never entered.

Four: initialize process analysis

We know that the +load method is called in the load_images process. When is the +initialize method called? Let's explore.

We know that +initialize is called before the first method call on a class, so create a new project (not the objc source project) and add the XJPerson class. Add a +initialize method to XJPerson, then instantiate XJPerson in the main function and call a method on it.

  • +initialize is called from lookUpImpOrForward, the slow message lookup path.

In the assembly of the step just before +initialize is called, a symbol from the objc source shows up.

Searching the objc source shows that CALLING_SOME_+initialize_METHOD is the symbol name of the callInitialize function.

A global search for callInitialize shows that it is called only from initializeNonMetaClass.

Continue the global search for initializeNonMetaClass to find its caller, initializeAndMaybeRelock.

Continue searching for initializeAndMaybeRelock to find its caller, initializeAndLeaveLocked.

Continue searching for initializeAndLeaveLocked to find its caller, realizeAndInitializeIfNeeded_locked.

Continue searching for realizeAndInitializeIfNeeded_locked to find its caller, lookUpImpOrForward.

The chain ends at the slow message lookup path, which confirms that +initialize is indeed called when the first message is sent.

+initialize method call process:

  • lookUpImpOrForward -> realizeAndInitializeIfNeeded_locked -> initializeAndLeaveLocked -> initializeAndMaybeRelock -> initializeNonMetaClass -> callInitialize -> initialize.

+initialize is called when the first message is sent, so it does not affect the loading of classes and categories.
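The "once, right before the first message" behavior can be modeled in a few lines. This is a deliberately simplified sketch: MiniClass, sendMessage and callLog are hypothetical names, and the real runtime also handles superclasses, threads and an intermediate "initializing" state, none of which appear here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature class with an "already initialized" flag.
struct MiniClass {
    std::string name;
    bool initialized;
};

// Records the order of calls so that it can be inspected afterwards.
std::vector<std::string> callLog;

// Stand-in for the slow lookup path: run +initialize once, before the first message.
void sendMessage(MiniClass &cls, const std::string &selector) {
    assert(!selector.empty());
    if (!cls.initialized) {
        cls.initialized = true;  // simplification: the flag is set up front
        callLog.push_back("+[" + cls.name + " initialize]");
    }
    callLog.push_back("-[" + cls.name + " " + selector + "]");
}
```

Sending two messages to the same class logs +initialize only once, ahead of the first message, which matches the call flow traced above.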

Five: summary

  • Merging of a class and its categories: depends on whether more than one +load method is implemented in total (+initialize does not affect this). Because every +load method is called in load_images, and the +load method of each class or category may do its own work, they are not forcibly merged.

    • Merged: with 0 or 1 +load implementations, the category method lists are merged into the main class's ro, and among methods with the same name the one from the category compiled later comes first.

      • 0 +load implementations: the class is lazily loaded.
      • 1 +load implementation: the class is non-lazily loaded (because of the merge, it does not matter who implements +load).
    • Not merged: 2 or more +load implementations. The category method lists are attached to rwe.

      • The main class implements +load: loadAllCategories in the load_images process attaches the category data to rwe.
      • The main class does not implement +load: the prepare_load_methods step of load_images eventually realizes the class and attaches the category methods to rwe.
  • Class realization (loading)

    • A +load method in a category or in a subclass's category causes the class to be realized in the load_images process. The class itself is lazily loaded and is forced to be realized.
    • A +load method in the class itself or in a subclass causes the class to be realized in map_images.
    • In all other cases the class is lazily loaded and is realized in lookUpImpOrForward, the slow message lookup path.
  • Lazy vs. non-lazy loading of classes

    • Lazy loading: if the class, its subclasses, its categories, and its subclasses' categories do not implement +load, the class is lazily loaded.

    • Non-lazy loading

      • Fully non-lazy loading: the class itself implements +load; in this case the class is non-lazily loaded.

      • Dependent non-lazy loading: the class itself does not implement +load.

        • A subclass implements +load: realization is recursive, so realizing the subclass also realizes the parent class. The parent is essentially a lazily loaded class, but in this case it behaves like a non-lazily loaded one.
        • A category or a subclass's category implements +load: prepare_load_methods realizes the class because the category is non-lazily loaded, which also turns the class into a non-lazily loaded class.

Class and category loading process: