An in-depth analysis of the Class structure

Runtime Series

Runtime Principle Exploration (I) — An in-depth understanding of ISA (Apple isa optimization)

OC Class cache_t

Runtime principle exploration (4) — To investigate the bottom message mechanism

Exploring the Runtime principle (5) — The nature of Super

Runtime theory exploration (6) — Runtime in the interview question

In The nature of the OC object (part 2): the isa and superclass pointers, I described the memory structure of the Class object with a diagram. This article uses that as a starting point to take a closer look at what is inside a Class.

(1) A brief description of the structure of Class

First, let's look at how the objc source code defines Class:

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable; the method cache
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags; used to obtain specific class information

    class_rw_t *data() {
        return bits.data();
    }
    void setData(class_rw_t *newData) {
        bits.setData(newData);
    }
    // ... The remaining methods/functions can be ignored for the moment
};

The structure is clear:

Class superclass; points to the parent class (for a metaclass, the parent metaclass), which is itself of type Class.
cache_t cache; is the method cache.
class_data_bits_t bits; is used to get specific information about the class. If you have read my article Runtime isa insights (Apple isa optimizations), the name bits will already be familiar.
Then there is a class_rw_t *data() function, which returns the class's read-write information by calling bits.data() on the class_data_bits_t member. Stepping into that method shows:

class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

As you can see, this is similar to how the class address is extracted from isa. Here, class_data_bits_t bits is ANDed (&) with FAST_DATA_MASK, and the result is returned as a pointer of type class_rw_t *. The source code for this type is as follows:
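The masking step can be sketched in plain C. This is only an illustration under stated assumptions: DEMO_DATA_MASK and the demo_ functions are hypothetical names, and the mask value is an assumed stand-in for FAST_DATA_MASK on a 64-bit platform, not something taken from the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FAST_DATA_MASK (assumed value): keeps the
   address bits and clears the 3 low flag bits and the unused high bits. */
#define DEMO_DATA_MASK 0x00007ffffffffff8ULL

/* Pack an 8-byte-aligned address together with up to 3 flag bits. */
static uint64_t demo_pack_bits(uint64_t address, uint64_t flags) {
    assert((address & ~DEMO_DATA_MASK) == 0); /* address must fit the mask */
    assert(flags < 8);                        /* only 3 low bits are free */
    return address | flags;
}

/* Recover the address, mirroring `bits & FAST_DATA_MASK`. */
static uint64_t demo_data(uint64_t bits) {
    return bits & DEMO_DATA_MASK;
}

/* Recover the flags stored in the low bits. */
static uint64_t demo_flags(uint64_t bits) {
    return bits & 0x7ULL;
}
```

Because the address is 8-byte aligned, its three lowest bits are always zero, so they can carry flags without losing any address information.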

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;        //⚠️⚠️⚠️ Method list
    property_array_t properties;   //⚠️⚠️⚠️ Property list
    protocol_array_t protocols;    //⚠️⚠️⚠️ Protocol list

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
};

As we know, the methods, properties, and protocols of an OC class can be added dynamically, that is, they are readable and writable. The source above does contain members that store method, property, and protocol information, and the structure name class_rw_t (rw for read-write) also implies that this information is readable and writable. In addition, there is a pointer const class_ro_t *ro; whose name suggests read-only: it points to the read-only part of the class information. Again, look at the source code:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;            // Memory occupied by an instance object
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;

    const char * name;                // Class name
    method_list_t * baseMethodList;   // Method list
    protocol_list_t * baseProtocols;  // Protocol list
    const ivar_list_t * ivars;        // Member variable list

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;  // Property list

    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

This structure contains some read-only information about the class:

uint32_t instanceSize; is the memory occupied by an instance object.
const char * name; is the class name.
const ivar_list_t * ivars; is the list of the class's member variables.

The remaining members, method_list_t * baseMethodList;, protocol_list_t * baseProtocols;, and property_list_t * baseProperties;, correspond to the method_array_t methods;, property_array_t properties;, and protocol_array_t protocols; that we saw in class_rw_t. How are they related, and how do they differ? With these questions in mind, the analysis continues.

First of all, the parts covered above can be summarized in a diagram, which can be read as the internal structure of a Class in its steady state. During program startup and initialization, however, a Class does not yet have this structure, as the source code shows. In The underlying implementation of Objective-C categories, I traced the source execution path from objc initialization to the loading of category information, so I will not repeat it here and only give a brief outline. If anything is unclear, that article fills in the details.

First, in objc-os.mm, find objc's initialization function void _objc_init(void).
From there, follow the call _dyld_objc_notify_register(&map_images, load_images, unmap_image); into the map_images function.
From map_images, step into the map_images_nolock function.
At the end of map_images_nolock, step into the _read_images function.
In the implementation of _read_images, the code above the comment // Discover categories, that is, the code that runs before category information is processed, is the class initialization step:

    // Realize newly-resolved future classes, in case CF manipulates them
    if (resolvedFutureClasses) {
        for (i = 0; i < resolvedFutureClassCount; i++) {
            realizeClass(resolvedFutureClasses[i]);
            resolvedFutureClasses[i]->setInstancesRequireRawIsa(false/*inherited*/);
        }
        free(resolvedFutureClasses);
    }

    ts.log("IMAGE TIMES: realize future classes");

Step into the core function realizeClass:

static Class realizeClass(Class cls)
{
    runtimeLock.assertWriting();

    const class_ro_t *ro;
    class_rw_t *rw;
    Class supercls;
    Class metacls;
    bool isMeta;

    if (!cls) return nil;
    if (cls->isRealized()) return cls;
    assert(cls == remapClass(cls));

    // fixme verify class is not in an un-dlopened part of the shared cache?

    ro = (const class_ro_t *)cls->data(); //-----⚠️⚠️⚠️ In the beginning, the class's data() points directly to class_ro_t
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        //-----⚠️⚠️⚠️ If rw has already been allocated, rw is taken from cls->data(), and ro is read back from rw's ro pointer
        rw = cls->data();
        ro = cls->data()->ro;
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Allocate writeable class data.
        //-----⚠️⚠️⚠️ If rw has not been allocated yet
        rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1); //-----⚠️⚠️⚠️ Allocate memory for rw
        rw->ro = ro; //-----⚠️⚠️⚠️ Point rw's ro pointer at the initial ro
        rw->flags = RW_REALIZED|RW_REALIZING;
        cls->setData(rw); //-----⚠️⚠️⚠️ Update the class's data()
    }

    isMeta = ro->flags & RO_META;
    ...
}

From this function, we can see that:

In the beginning, a Class has no read-write part, only the read-only part, i.e. ro = (const class_ro_t *)cls->data();

Next, space is allocated for the read-write information, i.e. rw, and rw->ro = ro; points the ro pointer inside class_rw_t at the real ro information.

Finally, cls->setData(rw); changes what cls->data() returns, so that masking bits with FAST_DATA_MASK now yields rw.
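The three steps above can be modelled in a few lines of C. This is a minimal sketch, not the runtime's code: the demo_ types, the DEMO_RW_REALIZED value, and the plain void * data field (instead of masked bits) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_RW_REALIZED 0x80000000u  /* hypothetical flag value */

typedef struct demo_ro {
    uint32_t flags;        /* first member, like class_ro_t */
    const char *name;
} demo_ro_t;

typedef struct demo_rw {
    uint32_t flags;        /* first member, like class_rw_t */
    const demo_ro_t *ro;
} demo_rw_t;

typedef struct demo_class {
    void *data;            /* points to ro before realization, rw after */
} demo_class_t;

/* Both structs start with flags, so the first word can be read either way. */
static int demo_is_realized(const demo_class_t *cls) {
    return (*(const uint32_t *)cls->data & DEMO_RW_REALIZED) != 0;
}

/* Swap the class's data pointer from ro to a freshly allocated rw. */
static demo_rw_t *demo_realize(demo_class_t *cls) {
    assert(cls != NULL && cls->data != NULL);
    if (demo_is_realized(cls)) return (demo_rw_t *)cls->data;

    const demo_ro_t *ro = (const demo_ro_t *)cls->data; /* initially ro */
    demo_rw_t *rw = calloc(1, sizeof(*rw));
    if (rw == NULL) return NULL;
    rw->ro = ro;                      /* rw keeps a pointer to the ro */
    rw->flags = DEMO_RW_REALIZED;
    cls->data = rw;                   /* data now yields rw */
    return rw;
}
```

After demo_realize returns, the read-only data is still reachable, but only through rw->ro, which matches the layout described above.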

After these steps, the runtime is ready to load category information. In the rest of this analysis I only cover the method list; properties and protocols are handled with similar logic. Going back to our earlier question: class_ro_t and class_rw_t both have method lists. What is the difference between them?

(2) class_ro_t

In fact, class_ro_t represents the read-only information of a Class, the information inherent to the Class itself. Put plainly, it is the methods, properties, and other information written between @interface and @end. Its most important role is to store the class's member variable information, ivars, which is qualified with const and therefore cannot be modified. This is why the Runtime cannot dynamically add member variables to an existing class: the underlying structure rules it out. I think of this part as the static information of OC. The method_list_t * baseMethodList; in class_ro_t is a one-dimensional array that contains the Class's own methods.

(3) class_rw_t

After class_rw_t is created, categories are processed. The Class's own method list and the categories' method lists are all placed into method_array_t methods of class_rw_t. The Class's own method list is put in first and ends up at the tail of the list; category method lists are added in the order in which the category files are compiled. I explain this process in detail in The underlying implementation of Objective-C categories. As a result, method_array_t methods; is a two-dimensional array: an outer array whose elements are method lists.
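A rough C model of this two-dimensional layout is sketched below. It assumes, in line with the description above, that newly attached lists go in front of the lists already present, which is what leaves the class's own list at the tail. The demo_ names are hypothetical, and each method list is reduced to an array of method names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: each "method list" is just an array of names. */
typedef const char *const *demo_method_list_t;

typedef struct demo_method_array {
    demo_method_list_t *lists;  /* outer array: one entry per method list */
    size_t count;
} demo_method_array_t;

/* Insert `added` lists at the front; existing lists shift toward the tail.
   Returns 1 on success and 0 if memory could not be allocated. */
static int demo_attach_lists(demo_method_array_t *array,
                             const demo_method_list_t *added,
                             size_t added_count) {
    assert(array != NULL);
    if (added_count == 0) return 1;
    size_t new_count = array->count + added_count;
    demo_method_list_t *grown =
        realloc(array->lists, new_count * sizeof(*grown));
    if (grown == NULL) return 0;
    /* Move the old lists back, then copy the new ones into the gap. */
    memmove(grown + added_count, grown, array->count * sizeof(*grown));
    memcpy(grown, added, added_count * sizeof(*grown));
    array->lists = grown;
    array->count = new_count;
    return 1;
}
```

A lookup that walks the outer array from index 0 would reach a category's method before a same-named method in the class's own list, since the own list sits last.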

(4) method_t

Above we analyzed the structure of class_rw_t and class_ro_t, focusing on the method list. The most basic and important unit in that analysis is method_t, which holds everything needed to describe a method. Let's examine it thoroughly, starting with its definition in the source code:

struct method_t {
    SEL name;
    const char *types;
    IMP imp;
};

IMP imp: a pointer to a function, that is, the address of the method's implementation. Its type is defined in objc.h: typedef id _Nullable (*IMP)(id _Nonnull, SEL _Nonnull, ...);

SEL name: we all know that SEL is a method selector, which can be understood as the method name, but what exactly is it? It is defined in objc.h: typedef struct objc_selector *SEL; It is a pointer to a struct objc_selector, whose definition Apple unfortunately does not open source. But look at the code below:
@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"%s", @selector(test));
}

- (void)test {
    NSLog(@"%s", __func__);
}
@end

*************************************

2019-08-05 21:37:11.603121+0800 iOS-Runtime[2093:302816] test
The output suggests that struct objc_selector contains a char * member, which should be the first member of the structure. So we can think of SEL as a string whose main purpose is to represent the method name. A SEL and a string can be converted to each other as follows:

@selector() and sel_registerName() produce a SEL from a method name.

sel_getName() and NSStringFromSelector() convert a SEL to a string.

Methods with the same name in different classes have the same SEL. That is, for a method XXX, every SEL for it points to the same struct objc_selector object in memory, no matter how many classes define an XXX method.
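The "one SEL per name" behaviour can be illustrated with a small string-interning table in C. This is a sketch of the idea only: demo_register_name and the fixed-size table are hypothetical, and it does not claim to reproduce how sel_registerName() is actually implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_SELECTORS 64

/* Hypothetical selector table: one stored copy per distinct name. */
static char *demo_selector_table[DEMO_MAX_SELECTORS];
static size_t demo_selector_count = 0;

/* Return the unique stored pointer for `name`, registering it if new.
   Returns NULL if the table is full or memory runs out. */
static const char *demo_register_name(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < demo_selector_count; i++) {
        if (strcmp(demo_selector_table[i], name) == 0) {
            return demo_selector_table[i];   /* already registered */
        }
    }
    if (demo_selector_count == DEMO_MAX_SELECTORS) return NULL;
    size_t length = strlen(name) + 1;
    char *copy = malloc(length);
    if (copy == NULL) return NULL;
    memcpy(copy, name, length);
    demo_selector_table[demo_selector_count++] = copy;
    return copy;
}
```

Once every name maps to a single pointer, two selectors can be compared with a pointer comparison instead of a string comparison.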

const char *types: the encoding of the function's types (the return type and the parameter types). Objective-C provides the @encode directive, which represents a given type as a string encoding. Its main purpose is to let the runtime describe and store a function's return and parameter types as strings. Take a look at the following code example:
NSLog(@"%s", @encode(int));
NSLog(@"%s", @encode(float));
NSLog(@"%s", @encode(int *));
NSLog(@"%s", @encode(id));
NSLog(@"%s", @encode(void));
NSLog(@"%s", @encode(SEL));
NSLog(@"%s", @encode(float *));

*************************************

2019-08-06 16:13:22.136917+0800 iOS-Runtime[8904:779780] i
2019-08-06 16:13:22.137461+0800 iOS-Runtime[8904:779780] f
2019-08-06 16:13:22.137549+0800 iOS-Runtime[8904:779780] ^i
2019-08-06 16:13:22.137639+0800 iOS-Runtime[8904:779780] @
2019-08-06 16:13:22.137718+0800 iOS-Runtime[8904:779780] v
2019-08-06 16:13:22.137832+0800 iOS-Runtime[8904:779780] :
2019-08-06 16:13:22.137912+0800 iOS-Runtime[8904:779780] ^f
The printout above shows the string encoding for each type. Now consider the method - (int)test:(int)age height:(float)height. We know that the first two parameters of the underlying function for every OC method are the default parameters id self and SEL _cmd. So, from left to right, the return value and parameter types are int -> id -> SEL -> int -> float, which as type codes is i-@-:-i-f. The actual types string is i24@0:8i16f20. The numbers in it describe the length and position of the function's arguments. From left to right:

i is the return type of the function, int.

24 is the total length of the arguments (24 bytes).

@ is the first parameter, id.

0 is the starting memory offset of the first parameter (0 bytes, i.e. it starts at byte 0).

: is the second parameter, SEL.

8 is the starting memory offset of the second parameter (8 bytes, i.e. it starts at byte 8, so the id parameter above occupies the preceding 8 bytes).

i is the third parameter, int.

16 is the starting memory offset of the third parameter (16 bytes, i.e. it starts at byte 16, so the SEL parameter above occupies the preceding 8 bytes).

f is the fourth parameter, float.

20 is the starting memory offset of the fourth parameter (20 bytes, i.e. it starts at byte 20, so the int parameter above occupies the preceding 4 bytes; since the total length is 24, the last 4 bytes belong to the float parameter).

So, for any OC method, the value of the types string in its method_t can be derived with the procedure above. Searching for Type Encodings in Apple's documentation turns up a more detailed introduction, including a table that maps every type to its string encoding.
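The left-to-right reading of i24@0:8i16f20 can be mechanised with a small C helper. This is a minimal sketch with hypothetical names (demo_parse_types, demo_type_entry_t). It only handles simple encodings in which each type code consists of non-digit characters followed by a decimal number, so struct or array encodings are out of scope.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* One decoded entry: a type code and the number that follows it. */
typedef struct demo_type_entry {
    char code[8];   /* e.g. "i", "@", ":", "^i" */
    long number;    /* total size for entry 0, byte offset for the rest */
} demo_type_entry_t;

/* Split a simple types string such as "i24@0:8i16f20" into entries.
   Returns the number of entries written to `out`. */
static size_t demo_parse_types(const char *types,
                               demo_type_entry_t *out, size_t capacity) {
    assert(types != NULL && out != NULL);
    size_t count = 0;
    const char *p = types;
    while (*p != '\0' && count < capacity) {
        size_t n = 0;
        /* Collect the non-digit characters that form the type code. */
        while (*p != '\0' && !isdigit((unsigned char)*p)) {
            if (n + 1 < sizeof(out[count].code)) out[count].code[n++] = *p;
            p++;
        }
        out[count].code[n] = '\0';
        /* Read the decimal number that follows the code. */
        char *end = NULL;
        out[count].number = strtol(p, &end, 10);
        p = end;
        count++;
    }
    return count;
}
```

For the string discussed above, the first entry holds the return type with the total argument size, and each later entry holds one parameter with its byte offset.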

Based on the above, the three member variables of method_t provide all the information needed for an OC method. That concludes the look at method_t.
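To see how the three members work together, here is a simplified C model. Everything in it is hypothetical (the demo_ names, the int return type, the sample demo_age_imp function). It assumes selectors are unique pointers, as described earlier, so a lookup can compare pointers rather than strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a selector is a unique name pointer, and an
   implementation takes (self, _cmd) like the first two IMP parameters. */
typedef const char *demo_sel_t;
typedef int (*demo_imp_t)(void *self, demo_sel_t cmd);

typedef struct demo_method {
    demo_sel_t name;     /* selector */
    const char *types;   /* type encoding string */
    demo_imp_t imp;      /* address of the implementation */
} demo_method_t;

/* Sample implementation: treats self as a pointer to an int "age". */
static int demo_age_imp(void *self, demo_sel_t cmd) {
    (void)cmd;           /* unused, present to mirror the IMP signature */
    return *(int *)self;
}

/* Linear search by selector pointer; returns NULL when not found. */
static const demo_method_t *demo_find_method(const demo_method_t *list,
                                             size_t count, demo_sel_t sel) {
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (list[i].name == sel) return &list[i];  /* pointer comparison */
    }
    return NULL;
}
```

The selector identifies the entry, the types string describes how to pass arguments, and the imp pointer is what finally gets called with self and the selector.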

(5) cache_t method cache

To keep this article from getting too long, method caching is explored in the next article, Runtime Notes (3): OC Class's method cache cache_t.
