This is an original article by Liu Xiaozhuang on Jianshu. Please credit the source when reposting:

Jianshu, Liu Xiaozhuang: https://www.jianshu.com/p/5b7e7c8075ef



NSObject

The previous definition

In OC 1.0, many Runtime definitions were written in the NSObject.h file. If you have studied Runtime before, you may have seen the following definitions, which declare some basic types.

// Declare Class and id
typedef struct objc_class *Class;
typedef struct objc_object *id;

// Declare common variables
typedef struct objc_method *Method;
typedef struct objc_ivar *Ivar;
typedef struct objc_category *Category;
typedef struct objc_property *objc_property_t;

// objc_object and objc_class
struct objc_object {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;
};

struct objc_class {  
    Class isa  OBJC_ISA_AVAILABILITY;
    
#if ! __OBJC2__
    Class super_class                                        OBJC2_UNAVAILABLE;
    const char *name                                         OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list *ivars                             OBJC2_UNAVAILABLE;
    struct objc_method_list **methodLists                    OBJC2_UNAVAILABLE;
    struct objc_cache *cache                                 OBJC2_UNAVAILABLE;
    struct objc_protocol_list *protocols                     OBJC2_UNAVAILABLE;
#endif
    
} OBJC2_UNAVAILABLE;

The earlier Runtime structures were relatively simple, straightforward struct definitions. The new Runtime instead relies heavily on address offsets and bit operations.

Subsequent definitions

Later, Apple probably did not want developers to know about the internal implementation of Runtime, so it moved the source definitions out of NSObject and into Runtime. The previous definitions are no longer needed: they are guarded by the OBJC_TYPES_DEFINED preprocessor directive, so the old code is discarded.

The definition in NSObject is now very simple: just an isa variable of type Class, with everything else hidden.

@interface NSObject <NSObject> {
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wobjc-interface-ivars"
    Class isa  OBJC_ISA_AVAILABILITY;
#pragma clang diagnostic pop
}

Below are some of the commonly used definitions in the current Runtime. They differ from the earlier ones: they use the new structures, and the earlier structures are deprecated.

typedef struct objc_class *Class;
typedef struct objc_object *id;

typedef struct method_t *Method;
typedef struct ivar_t *Ivar;
typedef struct category_t *Category;
typedef struct property_t *objc_property_t;

Object structure

objc_object definition

Every object in OC is a structure whose first member variable is isa. The isa member used to be of type Class; Apple later changed it to isa_t.

struct objc_object {
private:
    isa_t isa;
};

The same goes for classes and metaclasses in OC, which are also structures. Because the class structure inherits from objc_object, a class is also an object and has an isa like any other object.

So isa_t can be used to find the corresponding class or metaclass: a bitwise operation on its bits member, of type uintptr_t, yields the address of the class that isa_t points to.

The methods of an instance or a class object are not stored in the object itself but in the class that its isa_t points to. Once the class is found, the methods are looked up through its bits member, of type class_data_bits_t. Objects, classes, and metaclasses all follow the same lookup principle.
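As a rough illustration of this lookup principle, here is a minimal C++ sketch. The types `DemoObject` and `DemoClass`, the `methods` map, and the `lookUp` function are all hypothetical stand-ins invented for this example, not Runtime structures. The walk up the superclass chain is an added assumption about how a miss in the first class is handled.

```cpp
#include <map>
#include <string>

struct DemoClass;

// Hypothetical object: the only thing every object carries is isa.
struct DemoObject {
    DemoClass *isa = nullptr;
};

// A class is itself an object, so it inherits the isa member.
struct DemoClass : DemoObject {
    DemoClass *superclass = nullptr;
    std::map<std::string, int> methods;  // selector name -> fake implementation id
};

// Search the class that isa points to, then its superclasses (assumed behavior).
// Returns -1 when the selector is not found anywhere in the chain.
int lookUp(const DemoObject &receiver, const std::string &selector) {
    for (const DemoClass *cls = receiver.isa; cls != nullptr; cls = cls->superclass) {
        auto it = cls->methods.find(selector);
        if (it != cls->methods.end()) return it->second;
    }
    return -1;
}
```

In this sketch an instance finds its methods in its class, and a class finds its own methods in whatever its isa points to (the metaclass), using the same function in both cases.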

isa_t definition

isa_t is a union. In C++ a union, like a struct, can define both member variables and functions. isa_t defines cls, bits, and an anonymous struct, and that struct describes the layout of isa_t.

The struct in isa_t is declared with bit-fields, arranged from nonpointer to extra_rc, from the lowest bit to the highest. A bit-field declaration also describes the memory layout of the structure, so each part can be manipulated directly through the struct's member variables. The bit-fields occupy 8 bytes in total: all the widths add up to exactly 64 bits.

Tip: The bits in a union can operate on the entire memory area, whereas the bitfield can only operate on the corresponding bit.

The following code is incomplete, only the ARM64 part is retained, the rest is ignored.

union isa_t 
{
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

# if __arm64__
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;  // 0: isa is a plain class pointer; 1: isa also packs the fields below
        uintptr_t has_assoc         : 1;  // Whether the object has, or has ever had, an associated reference. Without one, the object can be released faster
        uintptr_t has_cxx_dtor      : 1;  // Whether there is a C++ destructor or an OC destructor
        uintptr_t shiftcls          : 33; // Memory address of the class the object belongs to, that is, the address isa points to
        uintptr_t magic             : 6;  // Whether the object is initialized
        uintptr_t weakly_referenced : 1;  // Whether the object is, or has ever been, weakly referenced
        uintptr_t deallocating      : 1;  // Whether the object is being released
        uintptr_t has_sidetable_rc  : 1;  // Whether the reference count is too large for extra_rc and is also kept in the side table
        uintptr_t extra_rc          : 19; // Object reference count
#       define RC_ONE   (1ULL<<45)
#       define RC_HALF  (1ULL<<18)
    };
# elif __x86_64__
    // ...
# else
    // ...
# endif
};

Under the arm64 architecture, isa_t is laid out as in the structure above. The layout varies with the CPU architecture, but the fields are the same.
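To make the arm64 layout more concrete, the following C++ sketch reads and writes the fields with explicit shifts and masks rather than compiler bit-fields, so the result does not depend on how a given compiler orders bit-fields. The names `DemoField`, `kFields`, `totalBits`, `getField`, and `setField` are invented for this example; only the field names and widths come from the arm64 struct in the article.

```cpp
#include <cstdint>

// Hypothetical description of one field in the arm64 isa_t layout.
struct DemoField {
    const char *name;
    unsigned shift;  // position of the lowest bit of the field
    unsigned width;  // number of bits in the field
};

// Field order and widths copied from the arm64 struct, lowest bit first.
const DemoField kFields[] = {
    {"nonpointer",         0,  1},
    {"has_assoc",          1,  1},
    {"has_cxx_dtor",       2,  1},
    {"shiftcls",           3, 33},
    {"magic",             36,  6},
    {"weakly_referenced", 42,  1},
    {"deallocating",      43,  1},
    {"has_sidetable_rc",  44,  1},
    {"extra_rc",          45, 19},
};

// Sum of all field widths; expected to cover the whole 64-bit word.
inline unsigned totalBits() {
    unsigned sum = 0;
    for (const DemoField &f : kFields) sum += f.width;
    return sum;
}

// Read one field out of the raw 64-bit value.
inline uint64_t getField(uint64_t bits, const DemoField &f) {
    return (bits >> f.shift) & ((1ULL << f.width) - 1);
}

// Return a copy of bits with one field replaced by value.
inline uint64_t setField(uint64_t bits, const DemoField &f, uint64_t value) {
    const uint64_t mask = ((1ULL << f.width) - 1) << f.shift;
    return (bits & ~mask) | ((value << f.shift) & mask);
}
```

Setting extra_rc to 1 on an all-zero word gives 1ULL<<45, which matches the RC_ONE macro: extra_rc begins at bit 45.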

Class structure

objc_class structure

In Runtime, a class is also an object. The class structure objc_class inherits from objc_object and has all the characteristics of an object. objc_class defines three member variables. superclass is a pointer to the objc_class structure of the parent class. cache holds the cache of methods that have been called.

bits is the key member of objc_class. Its type, class_data_bits_t, defines only a single uintptr_t bits member variable, which stores the address of class_rw_t. It also defines some basic operations, such as getting class_rw_t, the raw isa status, whether the class is Swift, and so on. Some functions defined in the objc_class structure are implemented internally through bits.

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             
    class_data_bits_t bits;    

    class_rw_t *data() { 
        return bits.data();
    }
    void setData(class_rw_t *newData) {
        bits.setData(newData);
    }
    // ...
};

As the objc_class source shows, the class_rw_t pointer is obtained through the data() function of bits. Inside class_data_bits_t, data() applies FAST_DATA_MASK to the bits variable of type uintptr_t to extract the pointer.

class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

uintptr_t is essentially a typedef of unsigned long, which takes up 8 bytes on a 64-bit processor, exactly 64 binary digits. Converting FAST_DATA_MASK to binary shows that bits 3 through 46 of bits are where the class_rw_t pointer is stored.

A pointer in OC is 47 bits long. For example, the printed address of a UIViewController is 0x7faf1b580450, which in binary is 11111111010111100011011010110000000010001010000. The last three bits are placeholders, so they are ignored when retrieving the address.

// Bit 0: whether the class is a Swift class
#define FAST_IS_SWIFT           (1UL<<0)
#define FAST_HAS_DEFAULT_RR     (1UL<<1)
#define FAST_REQUIRES_RAW_ISA   (1UL<<2)
// Pointer to the data
#define FAST_DATA_MASK          0x00007ffffffffff8UL
// 11111111111111111111111111111111111111111111000, 47 bits in total

Because the last three bits are unused by the address, they can store other information. The three macros defined alongside class_data_bits_t operate on these last three bits.
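The idea of sharing one word between a pointer and three flags can be sketched in a few lines of C++. `DemoDataBits` is a hypothetical stand-in for class_data_bits_t written for this example, and the address is handled as a plain integer rather than a real class_rw_t pointer; the four constants use the values listed in the article.

```cpp
#include <cstdint>

// Constants with the values listed in the article (64-bit layout).
constexpr uint64_t FAST_IS_SWIFT         = 1ULL << 0;
constexpr uint64_t FAST_HAS_DEFAULT_RR   = 1ULL << 1;
constexpr uint64_t FAST_REQUIRES_RAW_ISA = 1ULL << 2;
constexpr uint64_t FAST_DATA_MASK        = 0x00007ffffffffff8ULL;

// Hypothetical stand-in for class_data_bits_t: one word, pointer plus flags.
struct DemoDataBits {
    uint64_t bits = 0;

    // The data address is what remains after masking off the flag bits.
    uint64_t data() const { return bits & FAST_DATA_MASK; }

    // Replace the address while keeping the three low flag bits untouched.
    void setData(uint64_t address) {
        bits = (bits & ~FAST_DATA_MASK) | (address & FAST_DATA_MASK);
    }

    bool getBit(uint64_t flag) const { return (bits & flag) != 0; }
    void setBit(uint64_t flag) { bits |= flag; }
};
```

With the address 0x7faf1b580450 from the example above and FAST_IS_SWIFT set, the stored word becomes 0x7faf1b580451, while data() still returns 0x7faf1b580450.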

class_ro_t and class_rw_t

There are two important structures associated with class_data_bits_t, class_rw_t and class_ro_t, which define key information such as method list, protocol list, and property list.

struct class_rw_t {
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
};

After compilation, class_data_bits_t points to the address of a class_ro_t. This structure is immutable (read-only). At run time, the realizeClass function makes bits point to a class_rw_t instead.

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
    uint32_t reserved;

    const uint8_t * ivarLayout;
    
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
};

Classes are initialized after the program starts running. In this process, the class_ro_t stored by the compiler is taken out of bits, a class_rw_t is created, and ro is assigned to rw as one of its member variables. Finally, rw is stored in bits, replacing the ro that was there. Besides these operations, there are some other assignments. Here is a simplified version of the code for initializing a class.

static Class realizeClass(Class cls) 
{
    const class_ro_t *ro;
    class_rw_t *rw;
    Class supercls;
    Class metacls;
    bool isMeta;

    if (!cls) return nil;
    if (cls->isRealized()) return cls;

    ro = (const class_ro_t *)cls->data();
    rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
    rw->ro = ro;
    rw->flags = RW_REALIZED|RW_REALIZING;
    cls->setData(rw);

    isMeta = ro->flags & RO_META;
    rw->version = isMeta ? 7 : 0;

    supercls = realizeClass(remapClass(cls->superclass));
    metacls = realizeClass(remapClass(cls->ISA()));

    cls->superclass = supercls;
    cls->initClassIsa(metacls);
    cls->setInstanceSize(ro->instanceSize);

    if (supercls) {
        addSubclass(supercls, cls);
    } else {
        addRootClass(cls);
    }

    methodizeClass(cls);
    return cls;
}

The code above also calls two functions, addRootClass and addSubclass, which chain the subclasses of a class into a list, roughly in the following order. This means all subclasses of the current class can be reached through class_rw_t.

superClass.firstSubclass -> subClass1.nextSiblingClass -> subClass2.nextSiblingClass -> ...
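The chain above is an ordinary singly linked list threaded through the classes themselves. A minimal C++ sketch follows, using a hypothetical `DemoClass` that holds the two link fields directly instead of inside class_rw_t. It assumes each new subclass is pushed onto the head of the list, which may differ in detail from the real addSubclass.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified class node; only the fields the list needs.
struct DemoClass {
    std::string name;
    DemoClass *superclass = nullptr;
    DemoClass *firstSubclass = nullptr;
    DemoClass *nextSiblingClass = nullptr;
};

// Push subcls onto the front of supercls's subclass list (assumed ordering).
void addSubclass(DemoClass &supercls, DemoClass &subcls) {
    subcls.superclass = &supercls;
    subcls.nextSiblingClass = supercls.firstSubclass;
    supercls.firstSubclass = &subcls;
}

// Collect the names of the direct subclasses by walking the sibling chain.
std::vector<std::string> directSubclassNames(const DemoClass &cls) {
    std::vector<std::string> names;
    for (const DemoClass *c = cls.firstSubclass; c != nullptr; c = c->nextSiblingClass) {
        names.push_back(c->name);
    }
    return names;
}
```

Because the links live in the class data itself, enumerating subclasses needs no separate table: start at firstSubclass and follow nextSiblingClass until it is null.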

After rw is created, the methodizeClass function called at the end of realizeClass fills it in. The function takes the lists out of ro and attaches them to rw. Any dynamic modification at run time is also made to rw. So ro stores the original data determined at compile time, while rw holds the data that is modified dynamically at run time.

static void methodizeClass(Class cls)
{
    bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro;

    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls));
        rw->methods.attachLists(&list, 1);
    }

    property_list_t *proplist = ro->baseProperties;
    if (proplist) {
        rw->properties.attachLists(&proplist, 1);
    }

    protocol_list_t *protolist = ro->baseProtocols;
    if (protolist) {
        rw->protocols.attachLists(&protolist, 1);
    }

    if (cls->isRootMetaclass()) {
        // root metaclass
        addMethod(cls, SEL_initialize, (IMP)&objc_noop_imp, "", NO);
    }

    // Attach categories.
    category_list *cats = unattachedCategoriesForClass(cls, true /*realizing*/);
    attachCategories(cls, cats, false /*don't flush caches*/);
}
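The division of labor between ro and rw can be summarized in a small C++ sketch. Everything here (`DemoRO`, `DemoRW`, the `realized` flag, and the string-based method lists) is a hypothetical simplification written for this example; it only mirrors the order of operations described above, not the real data structures.

```cpp
#include <string>
#include <vector>

// Hypothetical compile-time data: fixed once the class has been compiled.
struct DemoRO {
    std::string name;
    std::vector<std::string> baseMethods;
};

// Hypothetical run-time data: starts empty and refers back to ro.
struct DemoRW {
    const DemoRO *ro = nullptr;
    std::vector<std::string> methods;
};

struct DemoClass {
    void *data = nullptr;   // DemoRO* before realization, DemoRW* afterwards
    bool realized = false;  // stands in for the RW_REALIZED flag
};

// Swap the ro stored in the class for a new rw that refers to it, then
// copy the compile-time list into rw (the methodizeClass step).
// The rw is never freed in this sketch.
DemoRW *realizeClass(DemoClass &cls) {
    if (cls.realized) return static_cast<DemoRW *>(cls.data);
    DemoRO *ro = static_cast<DemoRO *>(cls.data);
    DemoRW *rw = new DemoRW();
    rw->ro = ro;
    cls.data = rw;
    cls.realized = true;
    rw->methods = ro->baseMethods;
    return rw;
}

// Run-time additions only touch rw; ro stays exactly as compiled.
void addMethod(DemoClass &cls, const std::string &name) {
    realizeClass(cls)->methods.push_back(name);
}
```

Adding a method after realization grows rw->methods while ro->baseMethods keeps its compile-time contents, which is the distinction the paragraph above draws.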

Suppose you create a class, LXZObject, that inherits from NSObject, and add a testMethod method to it without doing anything else. Since the bits of objc_class corresponds to a class_ro_t structure after compilation, we can print the member variables of that structure to see what class_ro_t looks like after compilation.

struct class_ro_t {
  flags = 128
  instanceStart = 8
  instanceSize = 8
  reserved = 0
  ivarLayout = 0x0000000000000000 <no value available>
  name = 0x0000000100000f7a "LXZObject"
  baseMethodList = 0x00000001000010c8
  baseProtocols = 0x0000000000000000
  ivars = 0x0000000000000000
  weakIvarLayout = 0x0000000000000000 <no value available>
  baseProperties = 0x0000000000000000
}

After printing, it can be seen that class_ro_t of a class only contains information about the current class, not its parent class. LXZObject contains only name and baseMethodList, while baseMethodList has only one testMethod method. It follows that the class_rw_t structure is the same.

Initialization process

Here is the layout of the isa_t structure after initialization, along with the positions of the individual members within it.

A union is often used together with a structure, and one use of the union is to initialize the whole structure area at once. When an object is initialized, the bits field of isa_t is assigned ISA_MAGIC_VALUE, which is how the union is initialized.

// No longer available in objc-723
inline void objc_object::initIsa(Class cls, bool indexed, bool hasCxxDtor) 
{
    if (!indexed) {
        isa.cls = cls;
    } else {
        isa.bits = ISA_MAGIC_VALUE;
        isa.has_cxx_dtor = hasCxxDtor;
        isa.shiftcls = (uintptr_t)cls >> 3;
    }
}

When an object is initialized through initIsa(), isa is initialized with ISA_MAGIC_VALUE. ISA_MAGIC_VALUE is a hexadecimal value; converting it to binary shows that it initializes the nonpointer and magic fields.

nonpointer keeps compatibility with the earlier plain isa pointer, as used on 32-bit processors. When it is 0, isa is the address of the class itself; when it is 1, isa is the packed isa_t structure used on 64-bit.

# define ISA_MAGIC_VALUE 0x000001a000000001ULL
Binary: 11010000000000000000000000000000000000001
Full 64-bit binary: 23 zeros + 11010000000000000000000000000000000000001
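The claim that ISA_MAGIC_VALUE sets only nonpointer and magic can be checked with a few lines of C++. The helper names `nonpointerOf`, `magicOf`, and `hasExpectedMagic` are invented for this example; the bit positions assume the arm64 layout shown earlier (bit 0 for nonpointer, bits 36 to 41 for magic).

```cpp
#include <cstdint>

constexpr uint64_t ISA_MAGIC_MASK  = 0x000003f000000001ULL;
constexpr uint64_t ISA_MAGIC_VALUE = 0x000001a000000001ULL;

// Bit 0 of the arm64 layout.
constexpr uint64_t nonpointerOf(uint64_t bits) { return bits & 1; }

// Bits 36..41 of the arm64 layout (above 1 + 1 + 1 + 33 lower bits).
constexpr uint64_t magicOf(uint64_t bits) { return (bits >> 36) & 0x3f; }

// True when the nonpointer and magic fields hold the expected pattern;
// all other bits are ignored by the mask.
constexpr bool hasExpectedMagic(uint64_t bits) {
    return (bits & ISA_MAGIC_MASK) == ISA_MAGIC_VALUE;
}
```

Here nonpointerOf(ISA_MAGIC_VALUE) is 1 and magicOf(ISA_MAGIC_VALUE) is 0x1a, and no bit of ISA_MAGIC_VALUE falls outside ISA_MAGIC_MASK, so every other field starts out as zero.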

has_cxx_dtor and shiftcls are then initialized through the bit-fields, so four fields have been initialized at this point. has_cxx_dtor indicates whether there is a C++ or OC destructor. When printing a method list, you will often see a method named .cxx_destruct, which is associated with this field.

To make storage (memory or disk) easier to read, data is aligned when it is written and read. Alignment is usually done in units of bytes, which also optimizes read and write speed. When shiftcls is assigned, the Class pointer is shifted three bits to the right. Because of alignment, the last three bits of a class pointer are always filled with zeros, so those three bits carry no information.

isa structure:
0000000001011101100000000000000100000000001110101110000011111001
0x5d8001003ae0f8

Class object address:
100000000001110101110000011111000
0x1003ae0f8

Class object address shifted right by three bits:
100000000001110101110000011111
This matches the shiftcls part of the isa_t value exactly; any missing high bits are filled with zeros.

Code outside the object should get the Class through the ISA() function rather than accessing the isa pointer directly as before. ISA() performs a bitwise AND on the bits of isa_t with the ISA_MASK macro; converting the macro to binary shows that it extracts exactly the shiftcls address.

inline Class 
objc_object::ISA() 
{
    return (Class)(isa.bits & ISA_MASK);
}

#define ISA_MASK 0x0000000ffffffff8ULL
111111111111111111111111111111111000
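Packing a class address into isa and getting it back with ISA_MASK is a round trip that can be sketched in C++. The function names `packIsa` and `classOf` are invented for this example, addresses are treated as plain integers, and the constants are the arm64 values quoted in the article.

```cpp
#include <cstdint>

constexpr uint64_t ISA_MASK        = 0x0000000ffffffff8ULL;
constexpr uint64_t ISA_MAGIC_VALUE = 0x000001a000000001ULL;

// Build a nonpointer isa: start from the magic value, set has_cxx_dtor (bit 2),
// and store classAddress >> 3 in shiftcls, which begins at bit 3.
constexpr uint64_t packIsa(uint64_t classAddress, bool hasCxxDtor) {
    return ISA_MAGIC_VALUE
         | (hasCxxDtor ? (1ULL << 2) : 0)
         | (((classAddress >> 3) << 3) & ISA_MASK);
}

// Recover the class address the same way ISA() does: isa.bits & ISA_MASK.
constexpr uint64_t classOf(uint64_t isaBits) { return isaBits & ISA_MASK; }
```

Because shiftcls starts at bit 3, shifting the address right by three and storing it there puts every address bit back in its original position, so a single AND with ISA_MASK returns the class address without any shift. For the class address 0x1003ae0f8 used above, classOf(packIsa(0x1003ae0f8, true)) gives 0x1003ae0f8 again.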

Tagged Pointer

Starting with the iPhone 5s, iOS devices moved to 64-bit processors; before that they had always been 32-bit.

On 64-bit processors, however, the length of a pointer and the memory occupied by some variables changed. A pointer takes up 4 bytes on 32-bit but 8 bytes on 64-bit. A long takes up 4 bytes on 32-bit and 8 bytes on 64-bit, and so on, so the same data occupies considerably more memory on 64-bit.

Apple addressed this issue with a feature called Tagged Pointer. With Tagged Pointer, the pointer itself represents the value. NSNumber is an example.

NSNumber *number1 = @1;
NSNumber *number2 = @3;
NSNumber *number3 = @54;

// Output
(lldb) p number1
(__NSCFNumber *) $3 = 0xb000000000000012 (int)1
(lldb) p number2
(__NSCFNumber *) $4 = 0xb000000000000032 (int)3
(lldb) p number3
(__NSCFNumber *) $5 = 0xb000000000000362 (int)54

The output shows that with Tagged Pointer, an object's value is stored in the pointer itself. For example, for an NSNumber with a value of 1, the pointer is 0xb000000000000012. If you discard the leading 0xb and the trailing 2, what remains in the middle is exactly the value in hexadecimal.

With Tagged Pointer, Apple significantly improves execution efficiency and saves a lot of memory. On 64-bit processors, the memory footprint is cut nearly in half. Because the malloc and free steps are skipped and the object is represented directly by the pointer, objects are created and destroyed dozens of times faster.
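The "discard the first and last hex digit" reading of these pointers can be written as a short C++ sketch. The function names `payloadOf` and `topNibbleOf` are invented here, and the layout (4 tag bits on top, 4 marker bits at the bottom, payload in between) is only an assumption inferred from the three values printed above; the real encoding differs between OS versions and architectures.

```cpp
#include <cstdint>

// Assumed layout, inferred from the printed values only:
// top 4 bits = tag, bottom 4 bits = marker, everything in between = payload.
constexpr uint64_t payloadOf(uint64_t taggedPointer) {
    // The left shift drops the top nibble; the right shift by 8 undoes
    // that shift and drops the bottom nibble as well.
    return (taggedPointer << 4) >> 8;
}

// The leading hex digit (0xb in the printed examples).
constexpr uint64_t topNibbleOf(uint64_t taggedPointer) {
    return taggedPointer >> 60;
}
```

Applied to the three printed pointers, payloadOf yields 1, 3, and 54 (0x36), matching the NSNumber values.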

isa_t

The same is true for object pointers. In OC 1.0, isa was a real pointer to a heap address. In the OC 2.0 era, a pointer is eight bytes long, or 64 bits, and stores information about the object directly. When looking up the class an object belongs to, the bit operation is done directly on the isa pointer, and since the operation is performed in the stack area, the lookup is very fast.

struct {
    uintptr_t nonpointer        : 1;
    uintptr_t has_assoc         : 1;
    uintptr_t has_cxx_dtor      : 1;
    uintptr_t shiftcls          : 33;
    uintptr_t magic             : 6;
    uintptr_t weakly_referenced : 1;
    uintptr_t deallocating      : 1;
    uintptr_t has_sidetable_rc  : 1;
    uintptr_t extra_rc          : 19;
};

isa_t is essentially a structure. If you created a separate structure and pointed to it with a pointer, the memory footprint would be large. With the Tagged Pointer approach, the values of the structure are stored directly in the pointer, which saves a lot of memory.

Apple does not allow direct access to the isa pointer. With Tagged Pointer, isa does not point to another memory area but directly represents the object's data, so information obtained by accessing isa directly is incorrect.



Because of typesetting problems on this page (layout, image display, code formatting, and so on), the reading experience is not ideal. You can download the Runtime PDF collection from GitHub instead. It gathers all of the Runtime articles in one PDF, with a table of contents on the left for easy reading.

Please give me a thumbs up, thank you! 😁