Welcome to the iOS low-level series (best read in order)

iOS low-level – Exploring alloc and init

iOS low-level – isa for everything

iOS low-level – Analysis of the nature of classes

iOS low-level – cache_t process analysis

iOS low-level – Method lookup process analysis

iOS low-level – Analysis of the message forwarding process

iOS low-level – How dyld loads an app

iOS low-level – Class loading analysis

iOS low-level – Category loading analysis

1. Overview

This article completes the picture of class loading by analyzing how categories are loaded, and how classes and categories behave when each is lazily or non-lazily loaded.

2. Exploring categories

2.1 A first look at categories

The previous article, on class loading, analyzed the main flow of map_images. The last part of that flow loads categories, so let’s go back and analyze it.

It reads the categories from the __objc_catlist section of the Mach-O file and then iterates over them. In both steps the data has type category_t.

2.2 The category data structure

Jump to the definition of category_t to see its structure:

struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    struct property_list_t *_classProperties;
};
  • name: the name of the category, that is, the identifier inside the parentheses.
  • cls: the class the category belongs to; the runtime resolves it to the actual class through remapClass.
  • instanceMethods: instance methods
  • classMethods: class methods
  • protocols: adopted protocols
  • instanceProperties: instance properties
  • _classProperties: class properties

Why both instanceMethods and classMethods?

A category can add instance methods as well as class methods. When a class is loaded, its metaclass is loaded too. Instance methods live in the class and class methods live in the metaclass, so instanceMethods is consumed while the class is being loaded and classMethods while the metaclass is being loaded. Because the two are handled at different points, they need to be stored separately.

What is _classProperties?

Since Xcode 8, LLVM has supported explicitly declared class properties in Objective-C, primarily for interoperability with type properties in Swift.

Adding the class attribute to a property declaration makes it a class property, whose getter and setter must be implemented by hand. Class properties can help when you need to reduce coupling.

2.3 Verifying that classes and categories can be lazy or non-lazy

The previous article analyzed class loading at startup, but noted that it applies only to non-lazy classes. If classes are divided into lazy and non-lazy, are categories divided the same way?

2.3.1 Verifying lazy and non-lazy loading of classes

In the class-loading source you can see this comment:

// Realize non-lazy classes (for +load methods and static constructors)

Create three classes, CJFStudent, CJFTeacher, and CJFPerson, and implement +load in only some of them:

+ (void)load{
    NSLog(@"%s",__func__);
}

Add verification code to the runtime source:

After running, the console shows:

Classes that implement +load are loaded at startup; those that do not are not.

Verified:

Both lazy and non-lazy classes exist, and a class that implements +load is a non-lazy class.

2.3.2 Verifying lazy and non-lazy loading of categories

Since what separates lazy from non-lazy loading is +load, categories can be verified in the same way.

Create the category CJFPerson+text and add verification code to the part of the source that handles categories:

Then run once with +load implemented in the category and once without, and compare the console output:

As you can see, whether a category is loaded at startup depends on whether it implements +load. This also confirms that, in the category data structure, name is the name of the category.

Verified:

For both categories and classes, implementing +load does change how they are loaded at startup.

Is +load (or a static constructor) really the only thing that decides whether a class or category is lazy? When is this crucial +load method executed? Do classes and categories influence each other’s loading? Let’s take these questions one at a time.

2.4 load_images() analysis

Since +load is so important, it deserves a closer look.

Of the three callbacks that libobjc registers with dyld, the previous article analyzed only map_images. Now let’s analyze load_images.

Reading the source directly:

It is simple, just two steps:

  • prepare_load_methods: prepares the +load methods

  • call_load_methods: calls the +load methods

2.4.1 prepare_load_methods()

  • schedule_class_load: schedules the +load methods of classes.

static void schedule_class_load(Class cls)
{
    ...
    schedule_class_load(cls->superclass);
    add_class_to_loadable_list(cls);
    ...
}

Each class’s +load method is added to the loadable table recursively, superclass first, which explains why a superclass’s +load is called before its subclass’s.

The table holds loadable_class structures:

struct loadable_class {
    Class cls;  // may be nil
    IMP method;
};

The structure stores the IMP for the later call.

  • add_category_to_loadable_list: adds each category’s +load method to its own loadable list (a separate table from the one for classes). Before adding, realizeClassWithoutSwift is called in case the class has not been realized yet (this call has another use too; make a note of it, we will come back to it later).

The table holds loadable_category structures:

struct loadable_category {
    Category cat;  // may be nil
    IMP method;
};

The structure likewise stores the IMP for the later call.
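The superclass-first ordering can be illustrated with a small model in plain C. This is only a sketch under stated assumptions: toy_class, toy_schedule_class_load, and the fixed-size table are hypothetical stand-ins rather than the runtime's own types, and the real function does more bookkeeping than shown here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LOADABLE 16

/* Hypothetical stand-in for a class: a name, a superclass link, and a
   flag recording whether it has already been scheduled. */
typedef struct toy_class {
    const char *name;
    struct toy_class *superclass;
    int scheduled;
} toy_class;

/* Toy version of the loadable table for classes. */
static toy_class *toy_loadable_classes[TOY_MAX_LOADABLE];
static size_t toy_loadable_count = 0;

/* Same shape as schedule_class_load: recurse into the superclass first,
   then append the class itself, so parents always precede children. */
void toy_schedule_class_load(toy_class *cls)
{
    if (cls == NULL || cls->scheduled) return;  /* nothing to do, or already queued */
    toy_schedule_class_load(cls->superclass);
    assert(toy_loadable_count < TOY_MAX_LOADABLE);
    toy_loadable_classes[toy_loadable_count++] = cls;
    cls->scheduled = 1;
}
```

In this model, scheduling a subclass before its superclass still leaves the superclass first in the table, because the recursion reaches it before the subclass is appended.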

2.4.2 call_load_methods()

  • call_class_loads: calls the +load methods in the class table

Following the order of the table, it takes out each IMP in turn and calls +load directly through the function pointer, which is fast.

  • call_category_loads: calls the +load methods in the category table

Likewise, it takes out each IMP in table order and calls +load directly through the function pointer.

This call flow also explains why a class’s +load is called before the +load of its categories.

For classes with no inheritance relationship, the order of +load calls depends on the order in which they were added to the table, that is, the order in Compile Sources. The +load calls of several categories of the same class also depend on the order in which they were added to the table, which is again the Compile Sources order.

Calling through function pointers is easy to justify: the app is still starting up, and going through message sending would cost more time.
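The two-table call order described above can be sketched in plain C. All names here (toy_load_imp, toy_add_class_load, toy_call_load_methods, and so on) are hypothetical, and the loop is a simplification of the real function; it only shows classes being drained before categories, with each entry invoked through a stored function pointer.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

typedef void (*toy_load_imp)(void);   /* stand-in for the IMP of a +load method */

static toy_load_imp toy_class_loads[TOY_MAX];
static size_t toy_class_load_count = 0;
static toy_load_imp toy_category_loads[TOY_MAX];
static size_t toy_category_load_count = 0;

/* Call log so the order can be inspected afterwards. */
enum { TOY_PERSON_LOAD = 1, TOY_PERSON_TEXT_LOAD = 2 };
static int toy_call_log[2 * TOY_MAX];
static size_t toy_call_log_count = 0;

static void toy_log_call(int id)
{
    assert(toy_call_log_count < 2 * TOY_MAX);
    toy_call_log[toy_call_log_count++] = id;
}

/* Hypothetical +load bodies for a class and one of its categories. */
void toy_person_load(void)      { toy_log_call(TOY_PERSON_LOAD); }
void toy_person_text_load(void) { toy_log_call(TOY_PERSON_TEXT_LOAD); }

void toy_add_class_load(toy_load_imp imp)
{
    assert(toy_class_load_count < TOY_MAX);
    toy_class_loads[toy_class_load_count++] = imp;
}

void toy_add_category_load(toy_load_imp imp)
{
    assert(toy_category_load_count < TOY_MAX);
    toy_category_loads[toy_category_load_count++] = imp;
}

/* Drain the class table first, then the category table. Each entry is
   invoked directly through its function pointer, in table order. */
void toy_call_load_methods(void)
{
    for (size_t i = 0; i < toy_class_load_count; i++) toy_class_loads[i]();
    toy_class_load_count = 0;
    for (size_t i = 0; i < toy_category_load_count; i++) toy_category_loads[i]();
    toy_category_load_count = 0;
}
```

Even if the category entry is registered before the class entry, the class entry runs first, because the tables are separate and the class table is always processed first.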

2.5 How lazy and non-lazy classes and categories behave

With two subjects and two loading types, there are four combinations.

2.5.1 Non-lazy class with non-lazy category

This is the case covered by the class loading analysis in the previous article: because the class implements +load, it must be loaded at startup.

So the sequence _read_images -> realizeClassWithoutSwift -> methodizeClass is fixed. But when methodizeClass is about to attach categories, the category has not yet been added to the hash table:

category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
attachCategories(cls, cats, false /*don't flush caches*/);

At this point unattachedCategoriesForClass returns nothing. The category is not attached in methodizeClass; instead it is handled in the part of _read_images that processes categories.

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    ...
    addUnattachedCategoryForClass(cat, cls, hi);
    if (cls->isRealized()) {
        remethodizeClass(cls);
        classExists = YES;
    }
    ...
}

Here the category is not attached right away. First addUnattachedCategoryForClass adds it to the hash table, then cls->isRealized() checks whether the class has been realized. Because the class is non-lazy, it must already have been realized, so remethodizeClass is called to attach the category.

2.5.2 Non-lazy class with lazy category

Start by adding two methods, categoryInstanceMethod and categoryClassMethod, to the category created earlier. Then analyze:

The sequence _read_images -> realizeClassWithoutSwift -> methodizeClass is still fixed.

static void methodizeClass(Class cls)
{
    ...
    category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
    attachCategories(cls, cats, false /*don't flush caches*/);
    ...
}

When methodizeClass is about to attach categories, the category, being lazy, is never stored in the hash table, so unattachedCategoriesForClass returns nothing. (The first case also got nothing back, but for a different reason: there the category had not been added to the table yet, while here it never will be.)

So where does a lazy category’s data live at this point?

Inspecting the class’s ro in methodizeClass shows it:

categoryInstanceMethod and categoryClassMethod are already in the ro of CJFPerson’s class and metaclass respectively. This also confirms why the category data structure needs separate instanceMethods and classMethods fields.

So when the class is non-lazy and the category is lazy, the category’s data is read from the ro produced at compile time and then copied into the class’s rw.

2.5.3 Lazy class with lazy category

Because the category is lazy, its data is again placed in ro first.

Because the class is lazy, it is not loaded at startup.

So when is the class loaded?

As you would expect, a lazy class is loaded the first time it is used. Try calling a method on it. The call goes through the message sending flow and reaches lookUpImpOrForward, which was analyzed in the method lookup article, although one step was skipped there:

IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    ...
    if (!cls->isRealized()) {
        cls = realizeClassMaybeSwiftAndLeaveLocked(cls, runtimeLock);
    }
    ...
}

When a message is sent, the runtime checks whether the receiving class has been realized. If it has not, the call chain eventually reaches realizeClassWithoutSwift to realize the class.

The order realizeClassWithoutSwift -> methodizeClass is still fixed.

In methodizeClass, the category data is read from ro into the class’s rw.
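The realize-on-first-message behavior can be modeled in a few lines of plain C. This is a minimal sketch with hypothetical names (toy_lazy_class, toy_realize, toy_send_message); it reproduces only the isRealized check, not the actual lookup or the copying of data from ro to rw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lazy class. */
typedef struct toy_lazy_class {
    const char *name;
    int realized;          /* 0 until the first message arrives */
    int realize_count;     /* how many times realization ran */
    int messages_handled;  /* how many messages were dispatched */
} toy_lazy_class;

/* Stand-in for realizeClassWithoutSwift -> methodizeClass. In the
   article's description this is where ro data is copied into rw. */
void toy_realize(toy_lazy_class *cls)
{
    cls->realized = 1;
    cls->realize_count++;
}

/* Stand-in for the check at the top of lookUpImpOrForward: the class
   is realized on first use and never again. */
void toy_send_message(toy_lazy_class *cls)
{
    assert(cls != NULL);
    if (!cls->realized) {
        toy_realize(cls);
    }
    cls->messages_handled++;
}
```

Sending two messages to the same toy class realizes it exactly once: the first call pays the realization cost, and the second finds the flag already set.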

2.5.4 Lazy class with non-lazy category

Because the class is lazy, the non-lazy class handling in _read_images is skipped.

Because the category is non-lazy, the category handling in _read_images does run.

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    ...
    addUnattachedCategoryForClass(cat, cls, hi);
    if (cls->isRealized()) {
        remethodizeClass(cls);
        classExists = YES;
    }
    ...
}

At this point addUnattachedCategoryForClass has added the category’s data to the hash table, but the class has not been realized, so cls->isRealized() is false and remethodizeClass is not called to attach the category.

That makes sense: the class is not ready yet, so there is nothing to attach the category to.

So how does the class get loaded? If it also waited for its first message, there would be a problem: the category’s +load has to run during startup, and running it while the class itself has not been loaded would be unreasonable.

Recall the +load call flow described above.

Before a category’s +load is called, realizeClassWithoutSwift is called to make sure the class has been loaded. Besides acting as a safeguard, its main purpose is to handle exactly this case of a lazy class with a non-lazy category.

So the category data is not taken from ro here. Instead, realizeClassWithoutSwift -> methodizeClass is called to load the class before the category’s +load runs.

static void methodizeClass(Class cls)
{
    ...
    category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
    attachCategories(cls, cats, false /*don't flush caches*/);
    ...
}

This time unattachedCategoriesForClass does find the category in the hash table, and attachCategories attaches it to the class.
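The deferred attachment in this case can be sketched as a toy model in plain C. Everything here is hypothetical (toy_cat_class, the pending and attached arrays, and the function names); the arrays merely stand in for the unattached-category hash table and the class's method data, and only the control flow follows the description above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CATS 4

/* Hypothetical class record with a toy "unattached categories" table. */
typedef struct toy_cat_class {
    const char *name;
    int realized;
    const char *pending[TOY_MAX_CATS];   /* discovered but not yet attached */
    size_t pending_count;
    const char *attached[TOY_MAX_CATS];  /* merged into the class */
    size_t attached_count;
} toy_cat_class;

/* Toy attachCategories: move every pending category onto the class. */
void toy_attach_pending(toy_cat_class *cls)
{
    for (size_t i = 0; i < cls->pending_count; i++) {
        assert(cls->attached_count < TOY_MAX_CATS);
        cls->attached[cls->attached_count++] = cls->pending[i];
    }
    cls->pending_count = 0;
}

/* Toy realizeClassWithoutSwift -> methodizeClass: realize once, then
   attach whatever is waiting in the pending table. */
void toy_realize_class(toy_cat_class *cls)
{
    if (cls->realized) return;
    cls->realized = 1;
    toy_attach_pending(cls);
}

/* Toy category step of _read_images: always record the category, but
   attach immediately only if the class is already realized. */
void toy_discover_category(toy_cat_class *cls, const char *category)
{
    assert(cls->pending_count < TOY_MAX_CATS);
    cls->pending[cls->pending_count++] = category;
    if (cls->realized) {
        toy_attach_pending(cls);   /* plays the role of remethodizeClass */
    }
}

/* Toy category part of prepare_load_methods: make sure the class is
   realized before the category's +load is queued. */
void toy_prepare_category_load(toy_cat_class *cls)
{
    toy_realize_class(cls);
}
```

For an unrealized class, toy_discover_category leaves the category pending, and it is attached only when toy_prepare_category_load realizes the class. For an already realized class, the same discovery step attaches it immediately, which corresponds to case 2.5.1.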

2.5.5 Summary

These four cases are intertwined and take some patience to work through. Here are the conclusions:

  • Whether a class or category is lazy depends on whether it implements +load.
  • If the category is lazy, its data is added to the class’s ro at compile time.
  • If the category is non-lazy, how it is attached depends on whether the class is lazy. If the class is lazy, the class is loaded and the category attached just before the category’s +load is called; if the class is non-lazy, the category is attached earlier, in _read_images.
  • If the class is lazy, when it is loaded depends on whether the category is lazy. If the category is lazy, the class is loaded the first time it is used; if the category is non-lazy, the class is loaded before the category’s +load is called.
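The four combinations can also be written down as a small decision function in plain C. The enum labels and the function name are hypothetical and simply restate the conclusions listed above; nothing here comes from the runtime source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four outcomes summarized above. */
typedef enum {
    TOY_ATTACH_AT_READ_IMAGES,       /* non-lazy class, non-lazy category */
    TOY_FROM_RO_AT_STARTUP,          /* non-lazy class, lazy category     */
    TOY_FROM_RO_ON_FIRST_MESSAGE,    /* lazy class, lazy category         */
    TOY_ATTACH_BEFORE_CATEGORY_LOAD  /* lazy class, non-lazy category     */
} toy_category_path;

/* Implementing +load is what makes a class or a category non-lazy. */
toy_category_path toy_category_path_for(bool class_has_load, bool category_has_load)
{
    if (category_has_load) {
        return class_has_load ? TOY_ATTACH_AT_READ_IMAGES
                              : TOY_ATTACH_BEFORE_CATEGORY_LOAD;
    }
    return class_has_load ? TOY_FROM_RO_AT_STARTUP
                          : TOY_FROM_RO_ON_FIRST_MESSAGE;
}
```

Reading the function top to bottom mirrors the summary: the category's own +load decides whether its data comes from the hash table or from ro, and the class's +load decides when that happens.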

2.6 Categories and class extensions

With categories covered, let’s also look briefly at class extensions.

What is the difference between a category and a class extension? This is probably one of the most frequently asked interview questions.

A class extension can add properties; a category cannot add properties directly. That answer is simple enough, but why can’t a category add them directly?

Create a class extension of CJFPerson in CJFPerson+Extension, import its header, and add a property extensionName to it; also add a property categoryName to the earlier CJFPerson category.

Then run:

clang -rewrite-objc CJFPerson.m -o CJFPerson.cpp

Open the generated source file and search for the two property names:

Clearly, extensionName is recognized at compile time, but categoryName is not.

So a class extension is compiled in as part of the class, whereas a category’s data is attached to the class at run time.

At run time, the ivar that a property would generate cannot be added to ro, and rw has no ivar list field. For a category property the system only declares the setter and getter; it does not implement them.

So a category cannot add properties directly, but they can be added dynamically with associated objects.

Note that a class extension is not compiled if its header is not imported.

2.7 +initialize analysis

+initialize and +load are often compared, because both are called early by the system.

A brief look at the source that implements +initialize:

In short, before calling a class’s +initialize, the system recursively calls the +initialize of its superclass, and then calls callInitialize for the class itself:

void callInitialize(Class cls)
{
    ((void(*)(Class, SEL))objc_msgSend)(cls, SEL_initialize);
    asm("");
}

Internally, callInitialize sends the SEL_initialize message through objc_msgSend.

So when does +initialize get called?

Create the CJFPerson class, its superclass, and a category of CJFPerson, implement +initialize in all three, instantiate CJFPerson, and run:

Then try three things:

  • Remove the CJFPerson instantiation and run: nothing is printed.
  • Instantiate CJFPerson several times and run: nothing is printed more than once.
  • Comment out +initialize in the class and the category and run: the superclass’s +initialize is printed twice.

From these tests we can draw the following conclusions:

  • +initialize is executed the first time the class is sent a message
IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    ...
    if (initialize  &&  !cls->isInitialized()) {
        initializeNonMetaClass (_class_getNonMetaClass(cls, inst));
    }
    ...
}

Looking at lookUpImpOrForward again, there is another detail: before the message is dispatched, it checks whether the class has been initialized. If not, initializeNonMetaClass is called, which confirms the conclusion.

  • A category’s +initialize overrides the main class’s +initialize.

This is because +initialize is invoked by sending a message, which searches the method list, and a category’s methods are placed ahead of the main class’s in that list. The previous article analyzed this in detail.

  • If a subclass does not implement +initialize but its superclass does, the superclass’s +initialize is called more than once when the subclass is initialized.

initializeNonMetaClass recurses up to the superclass, so every superclass’s +initialize runs first. Then, because +initialize is sent as a message, calling it on the subclass finds nothing in the subclass’s method list and falls through to the superclass’s method list, so the superclass’s implementation is called again.
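Both effects, the category taking precedence and the superclass implementation running again for a subclass without its own +initialize, can be reproduced in a toy model in plain C. All names here are hypothetical (including the superclass name CJFBase and the subclass CJFPlain used in the usage note), and the model simplifies the real initialization state handling to a single flag.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_init_imp)(void);   /* stand-in for the IMP of +initialize */

#define TOY_MAX_IMPLS 4

typedef struct toy_init_class {
    const char *name;
    struct toy_init_class *superclass;
    /* Candidate +initialize implementations, searched front to back.
       A category's implementation sits ahead of the class's own. */
    toy_init_imp initialize_impls[TOY_MAX_IMPLS];
    size_t impl_count;
    int initialized;
} toy_init_class;

/* Counters incremented by the hypothetical +initialize bodies. */
static int toy_super_calls, toy_class_calls, toy_category_calls;
void toy_super_initialize(void)    { toy_super_calls++; }
void toy_class_initialize(void)    { toy_class_calls++; }
void toy_category_initialize(void) { toy_category_calls++; }

/* Message-style lookup: take the first implementation found on the
   class, otherwise keep walking up the superclass chain. */
toy_init_imp toy_lookup_initialize(const toy_init_class *cls)
{
    for (; cls != NULL; cls = cls->superclass) {
        if (cls->impl_count > 0) return cls->initialize_impls[0];
    }
    return NULL;
}

/* Same shape as the initialization path: superclass first, then send
   +initialize to the class itself, at most once per class. */
void toy_initialize(toy_init_class *cls)
{
    if (cls == NULL || cls->initialized) return;
    toy_initialize(cls->superclass);
    cls->initialized = 1;
    toy_init_imp imp = toy_lookup_initialize(cls);
    if (imp != NULL) imp();
}
```

With a CJFPerson record whose list holds the category implementation ahead of the class's own, only the category version runs. With a subclass record such as CJFPlain that has an empty list, the lookup falls through to the superclass, so the superclass counter increases a second time.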

3. Closing notes

Categories are used constantly in day-to-day development and come up often in interviews, so they are worth mastering. Categories and classes are inseparable, so this article is best read together with the class loading analysis in the previous one.

I have recently been studying reverse engineering, which has slowed down updates to this low-level series. Upcoming articles will cover the internals of multithreading, locks, and blocks, as well as a hands-on componentization walkthrough. Stay tuned.