Preface

In our previous exploration of class loading, we saw that dyld starts the program and loads the image files, that callbacks are registered in _objc_init, and that the images are then loaded into memory through a series of operations in map_images. Inside map_images, _read_images first creates the class tables and maps every class into them, then adds SELs and protocols to their own tables, initializes the non-lazy-loaded classes, and assigns ro and rw along the way. Today we start with some interview questions about ro and rw, and then look at how lazy-loaded classes, non-lazy-loaded classes and categories are loaded.

1. Runtime interview questions

Q: Can member variables be added to a class dynamically? Why or why not?

A: A dynamically created class can have member variables added before it is registered. A class that has already been registered cannot have member variables added dynamically.

Analysis is as follows:

First, we write the following code using the Runtime API:

// 1: dynamically create the class
Class LGPerson = objc_allocateClassPair([NSObject class], "LGPerson", 0);
// 2: add a member variable (ivar -> ro -> ivar list)
class_addIvar(LGPerson, "lgName", sizeof(NSString *), log2(sizeof(NSString *)), "@");
// 3: register the class in memory
objc_registerClassPair(LGPerson);

This code dynamically creates a class, adds a member variable, and then registers the class in memory. The program runs normally.
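The fourth argument of class_addIvar is the alignment expressed as a power-of-two exponent, which is why the call above passes log2(sizeof(NSString *)) rather than the size itself. Below is a minimal C++ sketch of the rounding applied to the ivar offset. The helper name alignedOffset is hypothetical; the arithmetic follows the alignMask lines in the class_addIvar source quoted later in this article.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round `offset` up to a multiple of 2^alignmentLog2,
// using the same mask arithmetic as class_addIvar.
inline uint32_t alignedOffset(uint32_t offset, uint8_t alignmentLog2) {
    assert(alignmentLog2 < 32);                     // keep the shift well defined
    uint32_t alignMask = (1u << alignmentLog2) - 1; // e.g. 3 -> 0b111
    return (offset + alignMask) & ~alignMask;
}
```

With an 8-byte pointer the exponent is 3, so an unaligned instance size of 12 would be rounded up to 16 before the new ivar is placed.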

If we swap steps 2 and 3, registering the class first and adding the member variable afterwards, the program crashes. Let's analyze why through the source code.

Member variables are stored in the ivars field (const ivar_list_t *ivars) of the ro that class_rw_t *data() points to. The relevant source follows.

objc_class source:

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
    ...
}

class_rw_t source:

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    ...
}

class_ro_t source:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;

    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    ...
}

ro is assigned at compile time; after that it can only be read, not modified. rw is assigned when the class is initialized at runtime, and rw->ro is assigned at the same time.

Next, look at objc_registerClassPair, the runtime API that registers a class:

/***********************************************************************
* objc_registerClassPair
* fixme
* Locking: acquires runtimeLock
**********************************************************************/
void objc_registerClassPair(Class cls)
{
    mutex_locker_t lock(runtimeLock);

    checkIsKnownClass(cls);

    if ((cls->data()->flags & RW_CONSTRUCTED)  ||
        (cls->ISA()->data()->flags & RW_CONSTRUCTED)) 
    {
        _objc_inform("objc_registerClassPair: class '%s' was already "
                     "registered!", cls->data()->ro->name);
        return;
    }

    if (!(cls->data()->flags & RW_CONSTRUCTING)  ||
        !(cls->ISA()->data()->flags & RW_CONSTRUCTING))
    {
        _objc_inform("objc_registerClassPair: class '%s' was not "
                     "allocated with objc_allocateClassPair!", 
                     cls->data()->ro->name);
        return;
    }

    // Clear "under construction" bit, set "done constructing" bit
    cls->ISA()->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);
    cls->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);

    // Add to named class table.
    addNamedClass(cls, cls->data()->ro->name);
}

Key steps

cls->ISA()->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);
cls->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);

On registration, RW_CONSTRUCTING | RW_REALIZING is cleared and replaced by RW_CONSTRUCTED.
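The effect of changeInfo(set, clear) can be sketched in C++ as plain bit arithmetic. This is a simplified, non-atomic model written for illustration; the flag values below are assumptions for the sketch, not values copied from the runtime headers.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag bits (assumed values for this sketch).
constexpr uint32_t RW_REALIZING    = 1u << 19;
constexpr uint32_t RW_CONSTRUCTED  = 1u << 25;
constexpr uint32_t RW_CONSTRUCTING = 1u << 26;

// Simplified changeInfo: turn on the `set` bits, turn off the `clear` bits,
// and return the new flag word.
inline uint32_t changeInfo(uint32_t flags, uint32_t set, uint32_t clear) {
    assert((set & clear) == 0);  // a bit cannot be both set and cleared
    return (flags | set) & ~clear;
}
```

Applied to a class that is still under construction, the call used in objc_registerClassPair leaves only the "constructed" bit of these three set, which is exactly what the RW_CONSTRUCTING check in class_addIvar later tests for.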

Now look at class_addIvar, the API that adds a member variable dynamically:

BOOL 
class_addIvar(Class cls, const char *name, size_t size, 
              uint8_t alignment, const char *type)
{
    if (!cls) return NO;

    if (!type) type = "";
    if (name  &&  0 == strcmp(name, "")) name = nil;

    mutex_locker_t lock(runtimeLock);

    checkIsKnownClass(cls);
    assert(cls->isRealized());

    // No class variables
    if (cls->isMetaClass()) {
        return NO;
    }

    // Can only add ivars to in-construction classes.
    if (!(cls->data()->flags & RW_CONSTRUCTING)) {
        return NO;
    }

    // Check for existing ivar with this name, unless it's anonymous.
    // Check for too-big ivar.
    // fixme check for superclass ivar too?
    if ((name  &&  getIvar(cls, name))  ||  size > UINT32_MAX) {
        return NO;
    }

    class_ro_t *ro_w = make_ro_writeable(cls->data());

    // fixme allocate less memory here
    
    ivar_list_t *oldlist, *newlist;
    if ((oldlist = (ivar_list_t *)cls->data()->ro->ivars)) {
        size_t oldsize = oldlist->byteSize();
        newlist = (ivar_list_t *)calloc(oldsize + oldlist->entsize(), 1);
        memcpy(newlist, oldlist, oldsize);
        free(oldlist);
    } else {
        newlist = (ivar_list_t *)calloc(sizeof(ivar_list_t), 1);
        newlist->entsizeAndFlags = (uint32_t)sizeof(ivar_t);
    }

    uint32_t offset = cls->unalignedInstanceSize();
    uint32_t alignMask = (1<<alignment)-1;
    offset = (offset + alignMask) & ~alignMask;

    ivar_t& ivar = newlist->get(newlist->count++);
#if __x86_64__
    // Deliberately over-allocate the ivar offset variable. 
    // Use calloc() to clear all 64 bits. See the note in struct ivar_t.
    ivar.offset = (int32_t *)(int64_t *)calloc(sizeof(int64_t), 1);
#else
    ivar.offset = (int32_t *)malloc(sizeof(int32_t));
#endif
    *ivar.offset = offset;
    ivar.name = name ? strdupIfMutable(name) : nil;
    ivar.type = strdupIfMutable(type);
    ivar.alignment_raw = alignment;
    ivar.size = (uint32_t)size;

    ro_w->ivars = newlist;
    cls->setInstanceSize((uint32_t)(offset + size));

    // Ivar layout updated in registerClass.

    return YES;
}


The key code in the source above is:

    // Can only add ivars to in-construction classes.
    if (!(cls->data()->flags & RW_CONSTRUCTING)) {
        return NO;
    }

When the class is registered, cls->ISA()->changeInfo and cls->changeInfo clear the RW_CONSTRUCTING flag, so the check above returns NO immediately and the ivar is never written into ro. That is why you cannot dynamically add a member variable to a registered class.
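The whole life cycle can be modelled in a few lines of C++. This is a toy sketch, not the real objc_class layout: ToyClass, allocateClassPair, addIvar and registerClassPair are hypothetical stand-ins that only reproduce the flag logic discussed above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a class pair's life cycle; not the real objc_class layout.
struct ToyClass {
    bool constructing = false;        // stands in for RW_CONSTRUCTING
    bool constructed  = false;        // stands in for RW_CONSTRUCTED
    std::vector<std::string> ivars;   // stands in for ro->ivars
};

// Like objc_allocateClassPair: the new class starts "under construction".
inline ToyClass allocateClassPair() {
    ToyClass cls;
    cls.constructing = true;
    return cls;
}

// Like class_addIvar: refuse unless the class is still under construction.
inline bool addIvar(ToyClass &cls, const std::string &name) {
    if (!cls.constructing) return false;
    for (const std::string &existing : cls.ivars) {
        if (existing == name) return false;   // an ivar with this name exists
    }
    cls.ivars.push_back(name);
    return true;
}

// Like objc_registerClassPair: flip "constructing" to "constructed".
// (The real function logs and returns instead of asserting.)
inline void registerClassPair(ToyClass &cls) {
    assert(cls.constructing && !cls.constructed);
    cls.constructing = false;
    cls.constructed  = true;
}
```

In this model, calling addIvar before registerClassPair succeeds, and calling it afterwards returns false without touching the ivar list, which mirrors the order dependence of steps 2 and 3.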

Next question: if we dynamically add a property to the LGPerson class, assign to it, and print it, will the value be printed? Why or why not?

Define the following function:

void lg_class_addProperty(Class targetClass , const char *propertyName){
    
    objc_property_attribute_t type = { "T", [[NSString stringWithFormat:@"@\"%@\"", NSStringFromClass([NSString class])] UTF8String] }; // type
    objc_property_attribute_t ownership0 = { "C", "" }; // C = copy
    objc_property_attribute_t ownership = { "N", "" }; // N = nonatomic
    objc_property_attribute_t backingivar  = { "V", [NSString stringWithFormat:@"_%@", [NSString stringWithCString:propertyName encoding:NSUTF8StringEncoding]].UTF8String };  // variable name
    objc_property_attribute_t attrs[] = {type, ownership0, ownership, backingivar};

    class_addProperty(targetClass, propertyName, attrs, 4);

}

Add the property to rw, then set and read it through KVC:

// add the property (rw)
lg_class_addProperty(LGPerson, "subject");

[person setValue:@"master" forKey:@"subject"];
NSLog(@"%@",[person valueForKey:@"subject"]);

Answer: No. We only added the subject property to rw. Because it was added dynamically, the system did not generate a setter or getter, and assigning and printing amount to calling those two methods, so nothing can be printed.

We need to add the following code to add setters and getters to the list of methods.

// Add setter + getter
class_addMethod(LGPerson, @selector(setSubject:), (IMP)lgSetter, "v@:@");
class_addMethod(LGPerson, @selector(subject), (IMP)lgName, "@@:");

Summary:

1. Dynamically created classes can have member variables and properties added dynamically, but classes that are already created and registered cannot have member variables added.
2. Dynamically added properties get no setter or getter by default. You need to add the setter and getter to the method list yourself.

2. Loading of lazy-loaded and non-lazy-loaded classes

2.1 Lazy-loaded and non-lazy-loaded classes

As we learned in the previous article, after classes are added to the table, the _read_images() method carries out a series of operations: registering SELs in the hash table, adding protocols to the table, initializing non-lazy-loaded classes, initializing lazy-loaded classes, and processing categories.

So what is a lazy-loaded class and what is a non-lazy-loaded class?

Create three classes, LGTeacher, LGStudent and LGPerson, and implement the load method in LGTeacher and LGStudent only:

+(void)load
{
    NSLog(@"%s",__func__);
}

Then, in the _read_images() method, at the point where non-lazy-loaded classes are initialized, add a print of the class information, run the code, and look at the console.

Only the two classes that implement the load method are printed. LGPerson, which does not implement load, does not appear there. It is only realized later, when the main function calls it to instantiate an object.

Hence: implementing the load method makes a class load earlier, bringing its realization forward to the point where the image data is loaded. A class that implements load is a non-lazy-loaded class. A lazy-loaded class is realized only when it is first needed.
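The distinction can be sketched with a toy C++ model. Everything here is hypothetical and simplified: ToyClass, readImages and sendFirstMessage are illustration-only names standing in for the class structure, the non-lazy step of _read_images, and the first message send.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy class record; `hasLoad` stands in for "implements +load".
struct ToyClass {
    std::string name;
    bool hasLoad;
    bool realized;
};

// Hypothetical stand-in for realizeClassWithoutSwift.
inline void realize(ToyClass &cls) {
    cls.realized = true;
}

// Startup (the _read_images step): only non-lazy classes are realized.
inline void readImages(std::vector<ToyClass> &classes) {
    for (ToyClass &cls : classes) {
        if (cls.hasLoad) realize(cls);
    }
}

// First message to a class: realize it on demand if that has not happened.
inline void sendFirstMessage(ToyClass &cls) {
    if (!cls.realized) realize(cls);
    assert(cls.realized);
}
```

Running readImages over LGTeacher and LGStudent (with load) and LGPerson (without) realizes only the first two; LGPerson stays unrealized until sendFirstMessage is called on it.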

Next, let’s examine the loading of non-lazy-loaded classes and the loading of lazy-loaded classes.

2.2 Loading of non-lazy-loaded classes

As you saw in the previous analysis, when you enter the loading step of the non-lazy-loading class in the _read_images() method, the flow is as follows:

  • Get all the non-lazy-loaded classes: classref_t *classlist = _getObjc2NonlazyClassList(hi, &count)
  • Loop over the non-lazy-loaded classes and add each one to memory: addClassTableEntry(cls)
  • Realize every non-lazy-loaded class, instantiating part of the class object's information such as rw: realizeClassWithoutSwift(cls)
  • In realizeClassWithoutSwift(cls), assign the cls's supClass, isa, rw->ro and so on, then enter methodizeClass(cls), which assigns the data in ro to rw.


2.3 Loading of lazy-loaded classes

Lazy-loaded classes are loaded only when they are first called, so create an LGPerson in main and set a breakpoint to analyze.

When we call the alloc method, we enter the method lookup flow, which inevitably leads to the lookUpImpOrForward method.

There, a check on whether the class is realized leads into the realizeClassMaybeSwiftAndLeaveLocked method.

The source of realizeClassMaybeSwiftAndLeaveLocked:

static Class
realizeClassMaybeSwiftAndLeaveLocked(Class cls, mutex_t& lock)
{
    return realizeClassMaybeSwiftMaybeRelock(cls, lock, true);
}

realizeClassMaybeSwiftAndLeaveLocked() calls realizeClassMaybeSwiftMaybeRelock(), whose source is as follows:

/***********************************************************************
* realizeClassMaybeSwift (MaybeRelock / AndUnlock / AndLeaveLocked)
* Realize a class that might be a Swift class.
* Returns the real class structure for the class. 
* Locking: 
*   runtimeLock must be held on entry
*   runtimeLock may be dropped during execution
*   ...AndUnlock function leaves runtimeLock unlocked on exit
*   ...AndLeaveLocked re-acquires runtimeLock if it was dropped
* This complication avoids repeated lock transitions in some cases.
**********************************************************************/
static Class
realizeClassMaybeSwiftMaybeRelock(Class cls, mutex_t& lock, bool leaveLocked)
{
    lock.assertLocked();

    // Check whether it is a Swift class
    if (!cls->isSwiftStable_ButAllowLegacyForNow()) {
        // Non-Swift class. Realize it now with the lock still held.
        // fixme wrong in the future for objc subclasses of swift classes
        realizeClassWithoutSwift(cls);
        if (!leaveLocked) lock.unlock();
    } else {
        // Swift class. We need to drop locks and call the Swift
        // runtime to initialize it.
        lock.unlock();
        cls = realizeSwiftClass(cls);
        assert(cls->isRealized());    // callback must have provoked realization
        if (leaveLocked) lock.lock();
    }

    return cls;
}


As the realizeClassMaybeSwiftMaybeRelock source shows, it checks whether the class is a Swift class. If it is not, it calls realizeClassWithoutSwift(cls) to initialize the class and assign superclass, isa and rw, the same as for the non-lazy-loaded classes above.

Once the class has been realized, control returns to lookUpImpOrForward, which continues with class initialization, space allocation, and so on.

Note: when a class inherits from another class and the subclass implements the load method, the subclass becomes a non-lazy-loaded class, and so does the parent class, because the realizeClassWithoutSwift method realizes supercls recursively.
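The recursion in that note can be sketched as follows. This is a hypothetical C++ model, not the runtime code: ToyClass and realize are illustration-only names, and the only behaviour modelled is "realize the superclass chain before the class itself".

```cpp
#include <cassert>

// Toy class node; `superclass` is null for a root class.
struct ToyClass {
    ToyClass *superclass;
    bool realized;
};

// Hypothetical model of the recursion in realizeClassWithoutSwift:
// realize the superclass chain first, then the class itself.
inline ToyClass *realize(ToyClass *cls) {
    if (!cls) return nullptr;        // walked past the root class
    if (cls->realized) return cls;   // already done, nothing to do
    realize(cls->superclass);
    assert(!cls->superclass || cls->superclass->realized);
    cls->realized = true;
    return cls;
}
```

Realizing only the most-derived class of a three-level chain therefore marks all three classes as realized, which is why a parent of a class with load ends up loaded early too.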

3. Category loading

First, we create an LGTeacher class and an LGTeacher (test) category as follows:

@interface LGTeacher : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *subject;
@property (nonatomic, assign) int age;

- (void)sayHello;

+ (void)sayMaster;
@end

#import "LGTeacher.h"


@implementation LGTeacher

//+ (void)load{
//    NSLog(@"%s",__func__);
//}
@end

@interface LGTeacher (test)
@property (nonatomic, copy) NSString *cate_p1;
@property (nonatomic, copy) NSString *cate_p2;

- (void)cate_instanceMethod1;
- (void)cate_instanceMethod2;

+ (void)cate_classMethod1;
+ (void)cate_classMethod2;

@end

@implementation LGTeacher (test)

//+ (void)load{
//    NSLog(@"Category load");
//}

- (void)setCate_p1:(NSString *)cate_p1{
}

- (NSString *)cate_p1{
    return @"cate_p1";
}

- (void)cate_instanceMethod2{
    NSLog(@"%s",__func__);
}

+ (void)cate_classMethod2{
    NSLog(@"%s",__func__);
}
@end


3.1 A first look at the category structure with clang

Use clang to generate the C++ file and look at it.

_category_t structure:

// C++ (clang output)
struct _category_t {
    const char *name;
    struct _class_t *cls;  // class
    const struct _method_list_t *instance_methods;  // list of object (instance) methods
    const struct _method_list_t *class_methods;  // list of class methods
    const struct _protocol_list_t *protocols;
    const struct _prop_list_t *properties;
};

// OC (objc runtime source)
struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    // Fields below this point are not always present on disk.
    struct property_list_t *_classProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);
};

Why are there two method lists? Instance (object) methods live in the class, while class methods live in the metaclass, so when attachLists adds the methods they have to go to two different class objects.

3.2 Loading combinations of classes and categories

1. Lazy-loaded categories (load method not implemented)

We know that, starting from the _read_images method, when a class is initialized methodizeClass assigns rw from ro and attachCategories adds category methods to the method list.

Debug with a breakpoint that stops when the class is LGTeacher. At this point category_list is NULL, so attachCategories has nothing to attach.

So have the category's methods been added? Let's check with LLDB: methods does have a value.

The category's methods were already present when rw->methods was filled from ro->baseMethods(), which is the code below:

    // Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls));
        rw->methods.attachLists(&list, 1);
    }

So when a lazy-loaded category (one that does not implement the load method) is paired with a lazy-loaded class or with a non-lazy-loaded class, the two cases work the same way.

The category is processed at compile time and added directly to the ro of the corresponding data(). The only difference is how the main class is loaded. As described above, a lazy-loaded class is loaded when a message is sent to it, through the lookUpImpOrForward -> realizeClassWithoutSwift -> methodizeClass flow, whereas a non-lazy-loaded class is loaded through the _read_images -> realizeClassWithoutSwift -> methodizeClass flow.

2. Non-lazy-loaded categories (load method implemented)

When a category implements load, its loading is also brought forward, into the category processing of _read_images. The key code is:

        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) { }
                continue;
            }

            // Process this category. 
            // First, register the category with its target class. 
            // Then, rebuild the class's method lists (etc) if 
            // the class is realized. 
            bool classExists = NO;
            // ✅ check for instance methods, protocols or instance properties
            if (cat->instanceMethods || cat->protocols || cat->instanceProperties) {
                // ✅ add the category to the class's unattached categories
                addUnattachedCategoryForClass(cat, cls, hi);
                const char *cname = cls->demangledName();
                const char *oname = "LGTeacher";
                if (cname && (strcmp(cname, oname) == 0)) {
                    printf("_getObjc2CategoryList :%s \n",cname);
                }
                // ✅ check whether the class is already realized (non-lazy)
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) { }
            }

            // ✅ check for class methods, protocols or class properties
            if (cat->classMethods || cat->protocols || (hasClassProperties && cat->_classProperties)) {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                // ✅ check whether the metaclass is already realized (non-lazy)
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) { }
            }
        }
    }

Each category is read in turn. If it has instance methods, addUnattachedCategoryForClass records it as an unattached category of its class. Then, if the class is already realized (that is, it is a non-lazy-loaded class), remethodizeClass is called to add the category's methods to the method list of the main class. Class methods are handled the same way on the metaclass.

remethodizeClass method:

static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;

    runtimeLock.assertLocked();

    isMeta = cls->isMetaClass();

    // Re-methodizing: check for more categories
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s", 
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }

        // ✅ attach the categories to the class
        attachCategories(cls, cats, true /*flush caches*/);
        free(cats);
    }
}

Lazy-loaded classes, by contrast, are loaded through the lookUpImpOrForward -> realizeClassWithoutSwift -> methodizeClass flow when a message is sent.

So when a non-lazy-loaded category is paired with a non-lazy-loaded class, the remethodizeClass method is called and attachCategories() attaches the category to the main class.

When a non-lazy-loaded category is paired with a lazy-loaded class, the category is loaded early but the main class is not yet loaded, so the category's methods have nowhere to be attached. Does that make loading the category early meaningless?

Of course not. Next, prepare_load_methods runs and calls realizeClassWithoutSwift. The source is as follows:

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();

    classref_t *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    // map_images finished
    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class

        // Debug print added for this analysis. It runs after the nil check
        // so that cls is never dereferenced when it is nil. Treating data()
        // as class_ro_t is only valid while cls is not yet realized.
        const class_ro_t *ro = (const class_ro_t *)cls->data();
        const char *cname = ro->name;
        const char *oname = "LGTeacher";
        if (strcmp(cname, oname) == 0) {
           printf("prepare_load_methods :%s \n",cname);
        }

        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        realizeClassWithoutSwift(cls);
        assert(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}

Inside realizeClassWithoutSwift, methodizeClass is called, and this time cats in methodizeClass is not empty, so attachCategories() is called to attach the category's methods to the main class.

category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
attachCategories(cls, cats, false /*don't flush caches*/);
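The two non-lazy-category cases can be compared in a toy C++ model. All names here (ToyClass, Unattached, discoverCategory, attachPending, realize) are hypothetical; the model only captures "record the category, attach now if the class is realized, otherwise attach when the class is realized later".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct ToyClass {
    std::string name;
    bool realized;
    std::vector<std::string> methods;   // stands in for rw->methods
};

// Categories waiting for their class, keyed by class name
// (stands in for the runtime's unattached-category table).
using Unattached = std::map<std::string, std::vector<std::string>>;

// Stand-in for attachCategories: move every pending category method onto
// the class, in front of the methods it already has.
inline void attachPending(ToyClass &cls, Unattached &table) {
    auto it = table.find(cls.name);
    if (it == table.end()) return;
    cls.methods.insert(cls.methods.begin(), it->second.begin(), it->second.end());
    table.erase(it);
}

// Stand-in for the category loop in _read_images: record the category and
// attach it right away only if the class is already realized.
inline void discoverCategory(ToyClass &cls, const std::string &method, Unattached &table) {
    table[cls.name].push_back(method);
    if (cls.realized) attachPending(cls, table);   // the remethodizeClass path
}

// Stand-in for prepare_load_methods -> realizeClassWithoutSwift -> methodizeClass.
inline void realize(ToyClass &cls, Unattached &table) {
    if (cls.realized) return;
    cls.realized = true;
    attachPending(cls, table);
    assert(table.count(cls.name) == 0);   // nothing is left pending for cls
}
```

For a class that is already realized, discoverCategory attaches the method immediately. For an unrealized class the method stays in the table until realize runs, which corresponds to the prepare_load_methods path described above.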

Conclusion

This article covered how lazy-loaded and non-lazy-loaded classes are loaded, and how the four combinations of lazy-loaded or non-lazy-loaded classes with lazy-loaded or non-lazy-loaded categories are loaded.

  • Classes that implement the load method are non-lazy-loaded classes; they are initialized at startup

  • Classes that do not implement the load method are lazy-loaded classes; they are initialized when first called

  • Loading of non-lazy-loaded classes:

    • _read_images
    • Loop over the non-lazy-loaded classes and add them to memory, addClassTableEntry(cls)
    • In realizeClassWithoutSwift(cls), assign the cls's supClass, isa, rw->ro and so on, then enter methodizeClass(cls), which assigns the data in ro to rw.
  • Loading of lazy-loaded classes: initialized when first called, by way of lookUpImpOrForward

    • Enter the method lookup flow, lookUpImpOrForward
    • !cls->isRealized() checks whether the class is a lazy-loaded class that still has to be realized; if not, method lookup starts
    • If it is a lazy-loaded class, enter realizeClassMaybeSwiftAndLeaveLocked
    • Call realizeClassWithoutSwift
  • Loading of lazy-loaded categories: handled at compile time and added directly to the ro of the corresponding data(). At class initialization, rw->methods is assigned directly from ro->baseMethods().

  • Non-lazy-loaded category + lazy-loaded class

    • Category processing in _read_images
    • Check whether the category has object methods or class methods.
    • Add the category to the class's unattached categories, addUnattachedCategoryForClass
    • Check cls->isRealized(): the class is a lazy-loaded class, so the branch is not entered
    • Enter prepare_load_methods
    • Enter realizeClassWithoutSwift
    • Enter methodizeClass; at this point category_list *cats is not empty
    • Call attachCategories() to attach the category's methods to the main class.
  • Non-lazy-loaded category + non-lazy-loaded class

    • Category processing in _read_images
    • Check whether the category has object methods or class methods.
    • Add the category to the class's unattached categories, addUnattachedCategoryForClass
    • Check cls->isRealized(): the class is a non-lazy-loaded class, so the branch is entered
    • Enter remethodizeClass
    • Call attachCategories() to attach the category's methods to the main class.