Runtime (2): Data Structures

Runtime Series

Data structure, Message mechanism, The nature of super

1. objc_object

Objective-C's object orientation is implemented on top of C/C++ data structures (structs). Every object we normally use is of type id, which corresponds to the runtime's objc_object structure.

// A pointer to an instance of a class.
typedef struct objc_object *id;

struct objc_object {
private:
    isa_t isa;
    /* ... isa operations, weak references, associated objects, memory management ... */
};

2. objc_class

The Class type is a pointer to an objc_class structure, so class objects and meta-class objects share the same underlying objc_class structure. objc_class inherits from objc_object, so it also has an isa pointer, which means that a class is itself an object.

// An opaque type that represents an Objective-C class.
typedef struct objc_class *Class;

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;          // points to the parent class
    cache_t cache;             // method cache (formerly cache pointer and vtable)
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
};

2.1 class_data_bits_t

class_data_bits_t is mainly a wrapper around class_rw_t; the class_rw_t pointer is obtained with bits & FAST_DATA_MASK.

struct class_data_bits_t {
    // Values are the FAST_ flags above.
    uintptr_t bits;
public:
    class_rw_t* data() {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
};
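The masking idea can be tried outside the runtime. The following is a minimal sketch, assuming a 3-bit flag area; ToyRW, ToyDataBits, kToyFlagMask and kToyDataMask are hypothetical names, and the real FAST_DATA_MASK and flag layout differ by platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for class_rw_t; aligned so the low 3 pointer bits are zero.
struct alignas(8) ToyRW { int version; };

constexpr std::uintptr_t kToyFlagMask = 0x7;            // assumed: low 3 bits hold flags
constexpr std::uintptr_t kToyDataMask = ~kToyFlagMask;  // remaining bits hold the pointer

struct ToyDataBits {
    std::uintptr_t bits = 0;

    // Store a pointer and flag bits in one word.
    void set(ToyRW *rw, std::uintptr_t flags) {
        std::uintptr_t p = reinterpret_cast<std::uintptr_t>(rw);
        assert((p & kToyFlagMask) == 0);                // pointer must leave the flag bits free
        bits = p | (flags & kToyFlagMask);
    }

    // Same shape as class_data_bits_t::data(): mask the flags away.
    ToyRW *data() const { return reinterpret_cast<ToyRW *>(bits & kToyDataMask); }

    std::uintptr_t flags() const { return bits & kToyFlagMask; }
};
```

Because the pointer is aligned, its low bits are always zero, so they can carry flags without a second field.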

class_rw_t represents the readable and writable information of a class and wraps class_ro_t;
class_rw_t mainly stores the class's method list, property list and protocol list;
The methods, properties and protocols members of class_rw_t inherit from list_array_tt, a two-dimensional array that is readable and writable and holds both the class's initial contents and the contents added by categories.

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;       // List of methods
    property_array_t properties;  // Property list
    protocol_array_t protocols;   // Protocol list

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
};

class_ro_t represents the read-only information of a class;
class_ro_t mainly stores the class's member variable list, class name, etc.;
The baseMethodList, baseProtocols, ivars and baseProperties members of class_ro_t are one-dimensional arrays that are read-only and hold the class's initial contents;
At first, the class information is stored in class_ro_t. When the program runs, after a series of function calls, realizeClass() merges the contents of class_ro_t with the category contents into a class_rw_t and makes bits point to that class_rw_t.

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;  // The memory space occupied by the instance object
#ifdef __LP64__
    uint32_t reserved;
#endif
    const uint8_t * ivarLayout;    
    const char * name;  // The name of the class
    method_list_t * baseMethodList;  
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;  // List of member variables
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

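method_array_t and method_list_t: method_array_t (in class_rw_t) is the two-dimensional array whose elements are method_list_t lists, while class_ro_t holds a single method_list_t.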
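The relationship between the one-dimensional base list and the two-dimensional array can be sketched with standard containers. ToyMethodList, ToyRO and ToyRWLists are hypothetical names, and placing category lists in front of the existing ones is an assumption of this sketch, not something the text above states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using ToyMethodList = std::vector<std::string>;   // stand-in for method_list_t

struct ToyRO {                                    // stand-in for class_ro_t
    ToyMethodList baseMethodList;                 // one-dimensional, the class's initial contents
};

struct ToyRWLists {                               // stand-in for class_rw_t
    const ToyRO *ro;
    std::vector<ToyMethodList> methods;           // two-dimensional: a list of lists

    // Start with the base list copied from the read-only data.
    explicit ToyRWLists(const ToyRO *r) : ro(r) { methods.push_back(r->baseMethodList); }

    // Assumption: a category's list goes in front, so it is searched first.
    void attachCategory(const ToyMethodList &list) { methods.insert(methods.begin(), list); }

    // Index of the first list that contains `name`, or -1 if none does.
    int findList(const std::string &name) const {
        for (std::size_t i = 0; i < methods.size(); ++i)
            for (const auto &m : methods[i])
                if (m == name) return static_cast<int>(i);
        return -1;
    }
};
```

In this sketch the read-only data is never modified; only the read-write side grows when a category is attached.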

2.2 cache_t

It is used to find method implementations quickly;
It is an incrementally extensible hash table that caches methods that have already been used, which speeds up method lookup (trading space for time: memory is sacrificed for execution efficiency);
It is a good application of the principle of locality (frequently called methods are stored in the cache, so later calls to them have a higher hit rate);
The hash function is f(@selector()) = index, computed as @selector() & _mask;
When we call a method, the runtime caches it in the cache; the next time the method is called, the runtime looks in the cache first.

struct cache_t {
    struct bucket_t *_buckets;  // Hash table
    mask_t _mask;               // Hash table length -1
    mask_t _occupied;           // The number of methods already cached
};

struct bucket_t {
private:
    cache_key_t _key;  // SEL
    IMP _imp;          // IMP function memory address
};

2.2.1 Cache Search process

// objc-cache.mm (objc4)
bucket_t * cache_t::find(cache_key_t k, id receiver)  // Look up by k, which is the @selector
{
    assert(k != 0);

    bucket_t *b = buckets();          // Get _buckets
    mask_t m = mask();                // Get _mask
    mask_t begin = cache_hash(k, m);  // Calculate the starting index
    mask_t i = begin;
    do {
        // Take the bucket_t at index i of the _buckets hash table
        // If its _key is 0, nothing has been cached at this index yet; the empty bucket_t is returned (cache_fill_nolock() uses it for insertion)
        // If its _key is k, the lookup succeeded and the bucket_t is returned
        if (b[i].key() == 0  ||  b[i].key() == k) {
            return &b[i];
        }
        // Otherwise move on to index i-1 (on __arm64__)
        // When i reaches the first element (index 0), i is set to mask so that it points to the last element of the hash table, and the backward scan continues
        // If the scan returns to begin without finding the bucket_t for k or an empty bucket_t, the loop ends, the lookup has failed, and bad_cache() is called
        // (on a cache miss, lookup continues in the methods of class_rw_t)
    } while ((i = cache_next(i, m)) != begin);

    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}


static inline mask_t cache_hash(cache_key_t key, mask_t mask) 
{
    return (mask_t)(key & mask);
}
static inline mask_t cache_next(mask_t i, mask_t mask) {
    // return (i+1) & mask; // __arm__ || __x86_64__ || __i386__
    return i ? i-1 : mask;   // __arm64__
}
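To see the order in which slots are visited, the two helpers above can be replayed in isolation. This is a small sketch that reuses the same arithmetic; toy_probe_order and the toy_ prefixed names are hypothetical and not part of objc4.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using toy_mask_t = std::uint32_t;

// Same arithmetic as cache_hash() above: keep only the low bits of the key.
inline toy_mask_t toy_cache_hash(std::uintptr_t key, toy_mask_t mask) {
    return static_cast<toy_mask_t>(key & mask);
}

// Same arithmetic as the __arm64__ cache_next() above: step backwards and wrap to mask.
inline toy_mask_t toy_cache_next(toy_mask_t i, toy_mask_t mask) {
    return i ? i - 1 : mask;
}

// Hypothetical helper: every index visited for `key`, in visiting order.
std::vector<toy_mask_t> toy_probe_order(std::uintptr_t key, toy_mask_t mask) {
    std::vector<toy_mask_t> order;
    toy_mask_t begin = toy_cache_hash(key, mask);
    toy_mask_t i = begin;
    do {
        order.push_back(i);
    } while ((i = toy_cache_next(i, mask)) != begin);
    return order;
}
```

With a mask of 7 and a key whose low three bits are 6, the scan starts at index 6, walks down to 0, wraps to 7, and stops when it would reach 6 again, so each slot is visited exactly once.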

2.2.2 Cache Adding Process

// objc-cache.mm (objc4)
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;   // If the class has not finished +initialize, return immediately

    // Make sure the entry wasn't added to the cache by some other thread 
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;  // Another thread may already have cached this method, so check the cache first and return if it is there

    cache_t *cache = getCache(cls);  // Fetch the cache_t of the class
    cache_key_t key = getKey(sel);   // Derive the _key from sel

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;  // _occupied + 1: the number of cached methods after this insert, used only to decide whether the cache is too full
    mask_t capacity = cache->capacity();  // Cache capacity = _mask + 1
    if (cache->isConstantEmptyCache()) {  // If the cache is the read-only empty cache, allocate new cache space
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);  // Allocate new cache space and free the old one
    }
    else if (newOccupied <= capacity / 4 * 3) {  // If cached methods + 1 <= 3/4 of the capacity, use the cache as-is
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {  // Otherwise the cache is too full: expand it
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);       // find() is guaranteed to return a usable bucket_t here, normally an empty one
    if (bucket->key() == 0) cache->incrementOccupied();  // If the bucket_t is empty, increment _occupied (the number of cached methods)
    bucket->set(key, imp);  // Add the entry to the cache
}

void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}

2.2.3 Cache Capacity Expansion Process

① Allocate a new bucket_t array whose capacity is twice the old one;
② Set the new _mask = length of the bucket_t array - 1;
③ Release the old cache (the cache is also released when method implementations are swapped dynamically at runtime).

// objc-cache.mm (objc4)
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    // Double the capacity. On the first call, use the initial capacity of 4
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);  // Allocate new cache space and free the old one
}

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();  // Whether the old buckets need freeing; the constant empty cache does not

    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}

bool cache_t::canBeFreed()
{
    return !isConstantEmptyCache();
}

bool cache_t::isConstantEmptyCache()
{
    return 
        occupied() == 0  &&  
        buckets() == emptyBucketsForCapacity(capacity(), false);
}
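The fill and expansion rules can be exercised together in a toy model. This is a simplified sketch and not the objc4 implementation: ToyCache and ToyBucket are hypothetical, keys are plain integers, an int stands in for an IMP, and -1 is an assumed "miss" value. It keeps the behaviours listed above: initial capacity 4, growth once an insert would exceed 3/4 of the capacity, doubling, and dropping the old entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyBucket { std::uintptr_t key = 0; int imp = 0; };   // key 0 means "empty"

class ToyCache {
    std::vector<ToyBucket> buckets_;   // empty until the first fill
    std::size_t occupied_ = 0;

    // Backward linear probe: returns the bucket holding `key` or the first empty one.
    ToyBucket *find(std::uintptr_t key) {
        std::size_t mask = buckets_.size() - 1;
        std::size_t begin = key & mask, i = begin;
        do {
            if (buckets_[i].key == 0 || buckets_[i].key == key) return &buckets_[i];
        } while ((i = (i ? i - 1 : mask)) != begin);
        return nullptr;   // not reached while the table is at most 3/4 full
    }

public:
    std::size_t capacity() const { return buckets_.size(); }
    std::size_t occupied() const { return occupied_; }

    // Returns the cached imp, or -1 on a miss.
    int lookup(std::uintptr_t key) {
        if (buckets_.empty()) return -1;
        ToyBucket *b = find(key);
        return (b && b->key == key) ? b->imp : -1;
    }

    void fill(std::uintptr_t key, int imp) {
        assert(key != 0);
        if (lookup(key) != -1) return;                       // already cached
        std::size_t newOccupied = occupied_ + 1;
        if (buckets_.empty()) {
            buckets_.assign(4, ToyBucket{});                 // initial capacity 4
        } else if (newOccupied > capacity() / 4 * 3) {
            buckets_.assign(capacity() * 2, ToyBucket{});    // double; old entries are dropped
            occupied_ = 0;
        }
        ToyBucket *b = find(key);
        if (b->key == 0) ++occupied_;
        b->key = key;
        b->imp = imp;
    }
};
```

Filling keys 1, 2 and 3 leaves the capacity at 4; filling a fourth key grows the table to 8 and empties it first, so the earlier three keys miss on their next lookup and have to be cached again.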

For more about cache_t, see:

Simple Runtime (3): message mechanism

3. isa pointer

The isa pointer maintains the relationship between objects and classes, and lets objects and classes find their methods, instance variables, properties, protocols, etc. through it.
Before the arm64 architecture, isa was an ordinary pointer that pointed directly to an objc_class and stored the memory address of the class or meta-class object: an instance object's isa points to its class object, and a class object's isa points to its meta-class object;
Starting with the arm64 architecture, isa was optimized into a union that uses bit fields to store more information. The 64 bits hold many things, of which 33 bits store the memory address of the class or meta-class object. That address is obtained with a bit operation: the isa value & ISA_MASK.

struct objc_object {
    Class isa;  // Before arm64 architecture
};

struct objc_object {
private:
    isa_t isa;  // From the arm64 architecture onward
};

union isa_t 
{
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

#if SUPPORT_PACKED_ISA

    // extra_rc must be the MSB-most field (so it matches carry/overflow flags)
    // nonpointer must be the LSB (fixme or get rid of it)
    // shiftcls must occupy the same bits that a real class pointer would
    // bits + RC_ONE is equivalent to extra_rc + 1
    // RC_HALF is the high bit of extra_rc (i.e. half of its range)

    // future expansion:
    // uintptr_t fast_rr : 1; // no r/r overrides
    // uintptr_t lock : 2; // lock for atomic property, @synch
    // uintptr_t extraBytes : 1; // allocated with extra bytes

# if __arm64__
#   define ISA_MASK        0x0000000ffffffff8ULL  // mask that extracts the class / meta-class address
#   define ISA_MAGIC_MASK  0x000003f000000001ULL
#   define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;  // 0: an ordinary pointer that stores the memory address of the class or meta-class object
                                          // 1: optimized, uses bit fields to store more information
        uintptr_t has_assoc         : 1;  // Whether an associated object has been set; if not, release is faster
        uintptr_t has_cxx_dtor      : 1;  // Whether there is a C++ destructor (.cxx_destruct); if not, release is faster
        uintptr_t shiftcls          : 33; // Memory address information of the class or meta-class object
        uintptr_t magic             : 6;  // Used during debugging to tell whether the object has finished initialization
        uintptr_t weakly_referenced : 1;  // Whether the object has been weakly referenced; if not, release is faster
        uintptr_t deallocating      : 1;  // Whether the object is being released
        uintptr_t has_sidetable_rc  : 1;  // If 1, the reference count is too large to fit in isa, and the excess is stored in the RefCountMap hash table of a SideTable
        uintptr_t extra_rc          : 19; // Stores the reference count minus 1 (retainCount - 1)
#       define RC_ONE   (1ULL<<45)
#       define RC_HALF  (1ULL<<18)
    };
# endif  // other architectures omitted
#endif
};
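The bit layout can be checked with plain integer arithmetic. In this sketch the mask and RC_ONE value are copied from the __arm64__ listing; the helper names (toy_pack_isa and so on) and the sample class address used with them are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the __arm64__ listing above.
constexpr std::uint64_t kIsaMask = 0x0000000ffffffff8ULL;   // bits 3..35 (shiftcls)
constexpr std::uint64_t kRcOne   = 1ULL << 45;              // lowest bit of extra_rc

// Hypothetical helper: build a packed isa value from a class address and two fields.
constexpr std::uint64_t toy_pack_isa(std::uint64_t clsAddress, bool hasAssoc, std::uint64_t extraRc) {
    return 1ULL                                             // nonpointer, bit 0
         | (static_cast<std::uint64_t>(hasAssoc) << 1)      // has_assoc, bit 1
         | (clsAddress & kIsaMask)                          // shiftcls, bits 3..35
         | (extraRc * kRcOne);                              // extra_rc, bits 45..63
}

// isa & ISA_MASK recovers the class address.
constexpr std::uint64_t toy_class_address(std::uint64_t isaBits) { return isaBits & kIsaMask; }

// extra_rc occupies the most significant bits, so a shift is enough.
constexpr std::uint64_t toy_extra_rc(std::uint64_t isaBits) { return isaBits >> 45; }

constexpr bool toy_has_assoc(std::uint64_t isaBits) { return (isaBits >> 1) & 1; }
```

Because class objects are aligned, the low three bits of their address are zero, which is why the flag bits can share the same word without losing address information.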

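3.1 Where the isa and superclass pointers point

[Figure: diagram of the isa and superclass pointer relationships among instance, class and meta-class objects]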

3.2 Class and Meta-Class Objects

The underlying structure of both class and meta-class objects is objc_class; objc_class inherits from objc_object, so it also has an isa pointer and is therefore also an object;
A class object stores instance methods, member variables, properties, protocols, etc.; a meta-class object stores class methods and related information;
The isa and superclass pointers point as shown in the figure above;
The superclass of the base class's meta-class points to the base class's class object. This determines the following behavior: when we call a class method, the runtime follows the class's isa pointer to its meta-class and searches there; if the method is not there, it follows the meta-class's superclass pointer up through the parent meta-classes to the base class's meta-class; if the method still has not been found, it looks for an instance method of the same name in the base class's class object.
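The class-method lookup path described above can be modelled with a few linked structs. This is only an illustrative sketch: ToyClass, toy_find_implementer and toy_find_class_method are hypothetical, method lists are reduced to sets of names, and caching is ignored.

```cpp
#include <cassert>
#include <set>
#include <string>

struct ToyClass {
    ToyClass *isa = nullptr;          // for a class object: its meta-class
    ToyClass *superclass = nullptr;   // parent class, or parent meta-class
    std::set<std::string> methods;    // instance methods (class) or class methods (meta-class)
};

// Walk the superclass chain from `cls`; return the object that implements `name`, or nullptr.
const ToyClass *toy_find_implementer(const ToyClass *cls, const std::string &name) {
    for (; cls != nullptr; cls = cls->superclass)
        if (cls->methods.count(name)) return cls;
    return nullptr;
}

// A class-method call starts at the receiver's isa, i.e. at its meta-class.
const ToyClass *toy_find_class_method(const ToyClass *cls, const std::string &name) {
    return toy_find_implementer(cls->isa, name);
}
```

If the base meta-class's superclass is wired to the base class object, a class-method call that no meta-class implements ends up at an instance method of the base class with the same name, as the last point above describes.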

3.3 How to get a class or meta-class

There are three ways to get a class

- (Class)class;
+ (Class)class;
Class object_getClass(id obj);  // Pass parameters: instance object

There is only one way to get a meta-class

Class object_getClass(id obj);  // Pass parameter: Class object

Sample code is as follows

    NSObject *object1 = [[NSObject alloc] init];
    NSObject *object2 = [[NSObject alloc] init];
    // objectClass1 ~ objectClass5 are all the class object of NSObject
    Class objectClass1 = [object1 class];
    Class objectClass2 = [object2 class];
    Class objectClass3 = [NSObject class];
    Class objectClass4 = object_getClass(object1);
    Class objectClass5 = object_getClass(object2);  
    // objectMetaClass1 ~ objectMetaClass4 are all the meta-class object of NSObject
    Class objectMetaClass1 = object_getClass([object1 class]);    
    Class objectMetaClass2 = object_getClass([NSObject class]);    
    Class objectMetaClass3 = object_getClass(object_getClass(object1));    
    Class objectMetaClass4 = object_getClass(objectClass5);    

Method implementation

- (Class)class {
    return object_getClass(self);
}

+ (Class)class {
    return self;
}

Class object_getClass(id obj)
{
    if (obj) return obj->getIsa();
    else return Nil;
}
inline Class 
objc_object::getIsa() 
{
    if (!isTaggedPointer()) return ISA();
    // ...
}

inline Class 
objc_object::ISA() 
{
    assert(!isTaggedPointer()); 
#if SUPPORT_INDEXED_ISA
    if (isa.nonpointer) {
        uintptr_t slot = isa.indexcls;
        return classForIndex((unsigned)slot);
    }
    return (Class)isa.bits;
#else
    return (Class)(isa.bits & ISA_MASK);
#endif
}
#if __ARM_ARCH_7K__ >= 2
# define SUPPORT_INDEXED_ISA 1
#else
# define SUPPORT_INDEXED_ISA 0
#endif

3.4 Why design meta-class?

The goal is to keep the method lists and construction information of instances separate from those of classes, so that each does its own job, in line with the single-responsibility design principle.

4. method_t

Method is a pointer to a method_t;
method_t encapsulates a method/function (the four elements of a function: name, return value, parameters and body).

typedef struct method_t *Method;

struct method_t {
    SEL name;  // The method name
    const char *types;  // Encoding (return type, parameter types)
    IMP imp;   // The address/implementation of the method
};

4.1 SEL

SEL, also called a "selector", is a pointer to a method selector and represents the method/function name;
SELs are maintained in a global map, so they are globally unique: methods with the same name in different classes share the same SEL.

typedef struct objc_selector *SEL;

SEL can be obtained in the following ways

    SEL sel1 = @selector(selector);
    SEL sel2 = sel_registerName("selector");
    SEL sel3 = NSSelectorFromString(@"selector");

SEL can be converted to a string in the following ways

    const char *string1 = sel_getName(sel1);
    NSString *string2 = NSStringFromSelector(sel1);

4.2 IMP

IMP is a function pointer to a method's implementation;
Calling a method really means looking up the IMP by its SEL;
method_t is in effect a mapping between SEL and IMP.

#if !OBJC_OLD_DISPATCH_PROTOTYPES
typedef void (*IMP)(void /* id, SEL, ... */ ); 
#else
typedef id _Nullable (*IMP)(id _Nonnull, SEL _Nonnull, ...); 
#endif
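The two ideas in sections 4.1 and 4.2, globally unique selectors and a SEL to IMP mapping per class, can be sketched with standard containers. All names here (ToySEL, ToyIMP, toy_register_name, ToyMethodTable) are hypothetical stand-ins; the real runtime stores selectors and method lists differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

using ToySEL = const std::string *;     // stand-in for SEL: a pointer to an interned name
using ToyIMP = int (*)(int);            // stand-in for IMP: a plain function pointer

// Intern a selector name: equal names always yield the same pointer.
ToySEL toy_register_name(const std::string &name) {
    static std::unordered_set<std::string> table;   // plays the role of the global map
    return &*table.insert(name).first;              // element addresses stay valid on rehash
}

struct ToyMethodTable {
    std::unordered_map<ToySEL, ToyIMP> methods;     // SEL -> IMP for one class

    // Find the implementation for a selector, or nullptr if the class has none.
    ToyIMP lookup(ToySEL sel) const {
        auto it = methods.find(sel);
        return it == methods.end() ? nullptr : it->second;
    }
};

// Two sample implementations that different classes can register under one selector.
int toy_double(int x) { return x * 2; }
int toy_negate(int x) { return -x; }
```

Since a selector is a unique pointer, two classes can register different implementations under the same selector, and comparing selectors is a pointer comparison rather than a string comparison.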

4.3 Type Encodings

Type Encodings is a technique that works with the runtime to describe a method's return type and parameter types as a string;
The @encode() directive converts a type into its Type Encodings string, for example @encode(int) = i;
Objective-C methods take two implicit parameters, the receiver (id)self and the method name (SEL)_cmd, which is why self and _cmd can be used inside a method;
For example, the encoding of -(void)test is "v16@0:8", which can be abbreviated to "v@:".

v: indicates that the return type is void

@: indicates that parameter 1 is of type id

: (a colon): indicates that parameter 2 is of type SEL

16: indicates the total number of bytes of all parameters

0: indicates the byte offset at which parameter 1 starts

8: indicates the byte offset at which parameter 2 starts

The following figure shows the Type Encodings corresponding to each type:

Type Encodings are used by the runtime during message forwarding;
For more information about Type Encodings, see the official Type Encodings documentation.
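The breakdown of "v16@0:8" given above can be reproduced with a small parser. This is a minimal sketch that only handles single-character type codes such as v, @, : and i; toy_parse_encoding is a hypothetical helper, and real encodings also contain structs, pointers and qualifiers that it does not cover.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Split an encoding into (type code, number) pairs.
// The number is -1 when no digits follow the type code (the abbreviated form).
std::vector<std::pair<char, int>> toy_parse_encoding(const std::string &enc) {
    std::vector<std::pair<char, int>> parts;
    std::size_t i = 0;
    while (i < enc.size()) {
        char type = enc[i++];
        int number = -1;
        if (i < enc.size() && std::isdigit(static_cast<unsigned char>(enc[i]))) {
            number = 0;
            while (i < enc.size() && std::isdigit(static_cast<unsigned char>(enc[i])))
                number = number * 10 + (enc[i++] - '0');
        }
        parts.emplace_back(type, number);
    }
    return parts;
}
```

For "v16@0:8" the first pair carries the total parameter size (16) next to the return type, and the remaining pairs carry each parameter's type with its starting byte offset.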