Preface

If you want to go deep in a programming language, a solid grasp of its foundations is essential. In Objective-C, classes and objects are the foundation of those foundations, the building blocks that connect everything else.

This article therefore explores the nature of classes, their structure, lazy-loading concepts, and what they go through from compile time to run time, to get to the bottom of it.

Source code to prepare

  • The objc source code.

  • objc4-756.2, compiled and debuggable (the latest source at the time of writing).

Prerequisite knowledge

  • The process of creating OC objects

  • Isa’s past and present lives

  • OC class object/instance object/metaclass resolution

What is a class?

Objective-C is an object-oriented programming language. Each object is an instance of its class and is called an instance object, and every object has a pointer named isa that points to that object's class.

And a class itself is also an object. Why do we say that?

Take a look at the source code:

typedef struct objc_object *id;
typedef struct objc_class *Class;

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    /* ... */
}

struct objc_object {
private:
    isa_t isa;
public:
    // ISA() assumes this is NOT a tagged pointer object
    Class ISA();

    // getIsa() allows this to be a tagged pointer object
    Class getIsa();
    /* ... */
}

First, an object is of type id, which is a pointer to the objc_object structure. We can also see that Class is a pointer to the objc_class structure, and objc_class inherits from objc_object, so a class is itself an object.

An object is an instance of a class, and a class is an instance of its metaclass object. Instance methods are stored in the class, and class methods are stored in the metaclass. We can also conclude that Objective-C objects are implemented with C structures.
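
To make this concrete, here is a minimal runnable sketch (it only uses NSObject and the public runtime functions object_getClass and class_isMetaClass) that walks the isa chain just described:

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

int main(void) {
    @autoreleasepool {
        NSObject *obj = [[NSObject alloc] init];

        Class cls  = object_getClass(obj);   // the instance's isa -> the class object
        Class meta = object_getClass(cls);   // the class object's isa -> the metaclass

        // cls is a plain class, meta is a metaclass: prints "0 1"
        NSLog(@"%d %d", class_isMetaClass(cls), class_isMetaClass(meta));
    }
    return 0;
}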

Conclusion:

  • In Objective-C, every object (essentially a pointer to an objc_object structure) has a pointer named isa that points to the object's class (essentially an objc_class structure).

  • Each class is itself an object (because objc_class inherits from objc_object), so a class can also receive messages, i.e. class method calls, and the receiver of those is the metaclass that the class object's isa points to.

  • NSObject and NSProxy are the two root classes; both adopt the NSObject protocol to provide common interfaces and capabilities for the subclasses that inherit from them. (A quick check of this follows the list.)
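
A quick check of that last point, sketched with the public runtime function class_conformsToProtocol (the helper name is ours):

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Both root classes adopt the NSObject protocol.
static void checkRootClasses(void) {
    BOOL a = class_conformsToProtocol([NSObject class], @protocol(NSObject));
    BOOL b = class_conformsToProtocol([NSProxy class],  @protocol(NSObject));
    NSLog(@"NSObject: %d, NSProxy: %d", a, b);   // prints 1 and 1
}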

The structure of the class

Tip:

The objc_class definition below has been deprecated since Objective-C 2.0:

struct objc_class {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;

#if !__OBJC2__
    Class _Nullable super_class                              OBJC2_UNAVAILABLE;
    const char * _Nonnull name                               OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list * _Nullable ivars                  OBJC2_UNAVAILABLE;
    struct objc_method_list * _Nullable * _Nullable methodLists                    OBJC2_UNAVAILABLE;
    struct objc_cache * _Nonnull cache                       OBJC2_UNAVAILABLE;
    struct objc_protocol_list * _Nullable protocols          OBJC2_UNAVAILABLE;
#endif
} OBJC2_UNAVAILABLE;

The class structure's source code is as follows:

typedef struct objc_class *Class;

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    /* ... */
}

First of all, a class is essentially a structure, which is why we say that an OC object is essentially a structure pointer.

Its members are as follows:

  • Class isa: points to the associated class; inherited from objc_object. See Isa's past and present lives.
  • Class superclass: the superclass pointer; the article above also explores where it points in detail.
  • cache_t cache: the data structure that stores the method cache.
  • class_data_bits_t bits: bits stores the class's data, such as properties, methods, and so on.

Let’s explore class_data_bits_t and cache_t one by one.

1. class_data_bits_t – the data source of the class's information

struct class_data_bits_t {
    uintptr_t bits;
    
    class_rw_t* data() {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
    /* some other methods omitted */
}

typedef unsigned long           uintptr_t;

// data pointer
#define FAST_DATA_MASK          0x00007ffffffffff8UL

Jumping into it, we can see that bits works in the same way as isa does when non-pointer isa optimization is enabled.

That is, bits uses eight bytes (64 bits in total) to pack in more content, and the stored pointer data is read back out with a bitmask operation.

For example:

class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

bits & FAST_DATA_MASK extracts the relevant bits and casts them to a pointer of type class_rw_t (the same masking design that non-pointer isa uses).
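
As a minimal sketch of that masking idea (the bits value and macro name below are made up for illustration; a real value comes from a live class), the low bits carry flags and the mask strips them off, leaving the class_rw_t pointer:

#include <stdint.h>
#include <stdio.h>

#define FAST_DATA_MASK_SKETCH 0x00007ffffffffff8UL

int main(void) {
    uintptr_t bits = 0x0000000101237e37UL;              // hypothetical raw bits value
    uintptr_t data = bits & FAST_DATA_MASK_SKETCH;      // -> 0x101237e30, the class_rw_t*
    printf("0x%lx\n", data);
    return 0;
}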

1.1 class_rw_t

The class_rw_t data structure is as follows:

struct class_rw_t {
    uint32_t flags;
    uint32_t version;
    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;
    char *demangledName;

#if SUPPORT_INDEXED_ISA
    uint32_t index;
#endif
    /* ... */
}

So this is the data that we’re all familiar with: method lists, property lists, protocol lists, etc.

Class objects store object methods, and metaclass objects store class methods.
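
As a sketch of that split, the snippet below (it assumes the LBPerson class defined later in section 1.6, and the helper name is ours) prints the method list of a class and of its metaclass separately using the public runtime API:

#import <objc/runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Dump every method stored directly on the given class object.
static void dumpMethods(Class cls) {
    unsigned int count = 0;
    Method *list = class_copyMethodList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        printf("%s -> %s\n", class_getName(cls), sel_getName(method_getName(list[i])));
    }
    free(list);
}

// dumpMethods([LBPerson class]);                   // instance methods live here
// dumpMethods(object_getClass([LBPerson class]));  // class methods live in the metaclass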

One other thing worth mentioning is const class_ro_t *ro;. Its source code is as follows:

1.2 class_ro_t

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
}

As you can see, ro also stores baseMethodList, baseProtocols, ivars, and so on, but ro is declared const, i.e. immutable.

So what is the relationship between rw and ro, and why is the data stored twice?

1.3 Relationship between RW and RO

First, the conclusions:

  • rw stands for read-write and ro stands for read-only. Because of OC's dynamic nature, the compiler determines the class's structural data and stores one copy in ro; at runtime another copy is loaded into rw, which the runtime can then modify dynamically.

  • ro is immutable, while the methods, properties, and protocols in rw are mutable. This is why an existing class can have methods added dynamically but not properties (adding a property also adds a member variable, i.e. an ivar, and ivars are stored in ro); see the sketch after this list.

  • The same is true for categories, which cannot add properties either (associated properties are stored separately in the ObjectAssociationMap, unlike regular class storage).
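
Here is a small sketch of that consequence, using only public runtime functions (the selector and ivar names are made up): on an already-registered class, adding a method succeeds because it lands in rw, while adding an ivar fails because ivars live in ro.

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

static void dynamicHello(id self, SEL _cmd) {
    NSLog(@"hello from a dynamically added method");
}

static void tryModifyRegisteredClass(void) {
    Class cls = [NSObject class];

    // Goes into class_rw_t: succeeds at runtime.
    BOOL methodAdded = class_addMethod(cls, sel_registerName("lb_hello"),
                                       (IMP)dynamicHello, "v@:");
    // Would have to change class_ro_t / the instance layout: fails on a registered class.
    BOOL ivarAdded = class_addIvar(cls, "lb_newIvar", sizeof(id),
                                   3 /* log2(alignof(id)) */, @encode(id));

    NSLog(@"method added: %d, ivar added: %d", methodAdded, ivarAdded);  // 1, 0
}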

1.4 Verifying the relationship between RW and RO

As mentioned in the article on the dyld loading process, libobjc is initialized from _objc_init, which registers map_images, load_images, and unmap_image.

  • What does that mean?

dyld is responsible for loading the application from disk into memory, and it is also at this point that the data and memory structures of classes and metaclasses are registered. That happens in map_images.

Tip:

  • 1. When dyld loads and links the main program, the recursiveInitialization function is called recursively.

  • 2. This function first initializes libsystem: doInitialization -> doModInitFunctions -> libSystemInitialized.

  • 3. libsystem's initialization calls libdispatch_init, and libdispatch's init calls _os_object_init, which in turn calls _objc_init.

  • 4. _objc_init registers and saves the addresses of the map_images, load_images, and unmap_image callbacks.

  • 5. After registration, execution returns to the next recursiveInitialization call, e.g. for libobjc. When libobjc reaches its recursiveInitialization call, it triggers the callback that libsystem registered in _objc_init, and map_images gets called.

void map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

void map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }
}


void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    for (EACH_HEADER) {
        classref_t *classlist = _getObjc2ClassList(hi, &count);
        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);
        }
    }
}

Class readClass(Class cls, bool headerIsBundle, bool headerIsPreoptimized)
{
    Class replacing = nil;
    if (Class newCls = popFutureNamedClass(mangledName)) {
        class_rw_t *rw = newCls->data();
        const class_ro_t *old_ro = rw->ro;
        memcpy(newCls, cls, sizeof(objc_class));
        rw->ro = (class_ro_t *)newCls->data();
        newCls->setData(rw);
        freeIfMutable((char *)old_ro->name);
        free((void*)old_ro);

        addRemappedClass(cls, newCls);
        replacing = cls;
        cls = newCls;
    }
}

Conclusion:

  • As you can see, during dyld's class loading, data from ro is copied into rw. The same is also visible in realizeClassWithoutSwift, which is not shown here.

  • Before any of that, the ro data has already been produced; that is, the structure of the class was determined at compile time.

So how do we verify that the ro data is determined at compile time?

The answer is obvious: clang + MachOView.

1.5 Verifying ro data at compile time

1. clang

Create a new class, rewrite it with clang (clang -rewrite-objc main.m), and open the generated main.cpp. It contains the following:

static struct _class_ro_t _OBJC_METACLASS_RO_$_LBPerson __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	1, sizeof(struct _class_t), sizeof(struct _class_t), 
	(unsigned int)0, 
	0, 
	"LBPerson",
	0, 0, 0, 0, 0,
};

static struct _class_ro_t _OBJC_CLASS_RO_$_LBPerson __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	0, __OFFSETOFIVAR__(struct LBPerson, name), sizeof(struct LBPerson_IMPL), 
	(unsigned int)0, 
	0, 
	"LBPerson",
	(const struct _method_list_t *)&_OBJC_$_INSTANCE_METHODS_LBPerson,
	0, 
	(const struct _ivar_list_t *)&_OBJC_$_INSTANCE_VARIABLES_LBPerson,
	0, 
	(const struct _prop_list_t *)&_OBJC_$_PROP_LIST_LBPerson,
};

2. MachO verification

ro is stored in the __objc_const section of the __DATA segment. Open the Mach-O file with MachOView:

This verifies that the data in ro is determined and stored at compile time and cannot be modified at run time.

So let’s use LLDB to debug and actually see the memory layout of the data in the class.

1.6 Exploring the memory layout with LLDB

The code:

@interface LBPerson : NSObject{
    @public
    NSString *ivarName;
}
@property (nonatomic, copy) NSString *propertyName;
+ (void)testClassMethod;
- (void)testInstanceMethod;
@end

@implementation LBPerson
+ (void)testClassMethod{
    NSLog(@"%s",__func__);
}
- (void)testInstanceMethod{
    NSLog(@"%s",__func__);
}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        LBPerson * person = [[LBPerson alloc] init];
        person->ivarName = @"ivar";
        person.propertyName = @"property";
        [person propertyName];
        [person testInstanceMethod];
        NSLog(@"123");
    }
    return 0;
}

Add a breakpoint to NSLog and run the project.

In LLDB, enter the command p/x LBPerson.class to print the class address, then offset that address by 0x20 (32 bytes) to reach the bits member.

  • This is because, in the class structure, the isa union takes 8 bytes, the superclass pointer takes 8 bytes, and the cache_t struct takes 16 bytes, so bits starts at offset 0x20:
struct objc_class : objc_object {
    // Class ISA;          // 8 bytes
    Class superclass;      // 8 bytes
    cache_t cache;         // 16 bytes
    class_data_bits_t bits; 
}

struct cache_t {
    struct bucket_t* _buckets;  // 8 bytes
    mask_t _mask;       // 4 bytes
    mask_t _occupied;   // 4 bytes
}

Note also: the LLDB debugging described above must be done inside the compiled objc source code; otherwise the value cannot be cast to class_data_bits_t.

Having found rw, let's look at where member variables, properties, and methods are stored.

Member variables

First of all, in rw we see that there is no ivar list, so we turn to ro.

The protocols, properties, and methods in rw are the same as those in ro, which indicates that these three lists are shallow-copied when ro is read into rw at runtime.

Continuing, we fetch the ivars: the list contains ivarName and _propertyName (for a property, the compiler synthesizes a backing ivar named "_" plus the property name).

Properties

Methods

  • You can see that the raw instance variable gets no getter or setter generated; this is the difference between a property and a member variable.
  • Combined with the ivar results above: a property = a member variable + a getter and a setter.
  • In addition, the class method is not in the class's method list; it actually lives in the metaclass. Interested readers can verify this, for example with the sketch below.
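
A sketch verifying those three bullets with public runtime functions and the LBPerson class above (the helper name is ours):

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

static void inspectLBPerson(void) {
    Class cls = objc_getClass("LBPerson");

    // Ivars: ivarName plus the property-backing _propertyName.
    unsigned int count = 0;
    Ivar *ivars = class_copyIvarList(cls, &count);
    for (unsigned int i = 0; i < count; i++) {
        printf("ivar: %s\n", ivar_getName(ivars[i]));
    }
    free(ivars);

    // The property generated accessors; the raw ivar did not.
    BOOL hasPropGetter = class_getInstanceMethod(cls, sel_registerName("propertyName")) != NULL;  // YES
    BOOL hasIvarGetter = class_getInstanceMethod(cls, sel_registerName("ivarName")) != NULL;      // NO

    // The class method lives in the metaclass, not in the class itself.
    BOOL inClass = class_getInstanceMethod(cls, sel_registerName("testClassMethod")) != NULL;                   // NO
    BOOL inMeta  = class_getInstanceMethod(object_getClass(cls), sel_registerName("testClassMethod")) != NULL;  // YES

    NSLog(@"%d %d %d %d", hasPropGetter, hasIvarGetter, inClass, inMeta);
}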

That wraps up bits. Next, let's take a look at cache_t and see how method caching works.

2. cache_t – the data source of the method cache

cache_t is the data structure that stores the method cache, so let's explore how method caching works.

struct cache_t {
    struct bucket_t* _buckets; // the cache array, i.e. the hash buckets
    mask_t _mask;   // the cache array's capacity threshold (capacity - 1)
    mask_t _occupied;   // the number of methods currently cached in the array
    /* ... */
}

#if __LP64__
typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits
#else
typedef uint16_t mask_t;
#endif

struct bucket_t {
private:
#if __arm64__
    uintptr_t _imp;
    SEL _sel;
#else
    SEL _sel;
    uintptr_t _imp;
#endif
}

According to the source code, cache_t occupies 16 bytes on 64-bit, while the sel and imp are stored in bucket_t structures.

Next, let’s use LLDB to actually explore the principle of method caching.

3. Explore the principle of method caching

3.1 Code Preparation

@interface LBObj : NSObject
- (void)testFunc1;
- (void)testFunc2;
- (void)testFunc3;
@end

@implementation LBObj
- (void)testFunc1{
    NSLog(@"%s",__FUNCTION__);
}
- (void)testFunc2{
    NSLog(@"%s",__FUNCTION__);
}
- (void)testFunc3{
    NSLog(@"%s",__FUNCTION__);
}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        LBObj * obj = [[LBObj alloc] init];
        [obj testFunc1];
        [obj testFunc2];
        [obj testFunc3];
    }
    return 0;
}

3.2 Start exploration

Tip: run the project inside the compiled objc source code, otherwise the LLDB cast to cache_t will fail.

Add a breakpoint before the object is created, and let's look at the cache buckets.

Step past the object-creation breakpoint into the method calls and look again.

  • Looking here, first of all the sel and imp of init have been cached in the buckets; _occupied has become 1 and _mask has become 3.

This raises two other questions:

1. Why is the alloc method not cached?

2. Why is the init method not in the first slot of _buckets?

A:

  • Why is alloc not cached in the class's cache_t? The answer is that class methods and their caches are stored in the metaclass. If you're interested, you can verify this yourself.

  • Why the entries are not stored in call order comes down to hash table design, which this article won't go into. If you're familiar with hash tables, you know this also involves key hashing, table expansion, and collision handling, a classic topic that comes up a lot in iOS.

Execute testFunc1 over the breakpoint. Continue looking.

  • The sel and imp of init and testFunc1 have been cached in the buckets; _occupied has become 2 and _mask is still 3.

Pass the breakpoint and execute testFunc2. Continue looking.

  • The sel and imp of init, testFunc1, and testFunc2 have been cached in the buckets; _occupied has become 3 and _mask is still 3.

Pass the breakpoint and execute testFunc3. Continue looking.

The sel and imp of testFunc3 are cached in the buckets, but _occupied is now 1 and _mask is 7, and the earlier entries are gone. We'll explore why in detail below.

3.3 About Cache Capacity

Rather than reading the _mask member directly, cache_t provides a mask_t capacity() method:

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}
mask_t cache_t::mask() 
{
    return _mask; 
}

If no method is called, _mask is 0, and capacity is also 0. If _mask has a value, the actual capacity is _mask + 1.

Now that we have a better understanding of cache_t's memory structure, let's look at how the cache is actually searched when OC calls a method, and what the expansion (or cache-eviction) strategy is once the cache reaches its threshold.

In The nature of OC methods and OC method lookup and message forwarding we explored the nature of methods and the complete lookup process in detail. The first part of that process is objc_msgSend's assembly cache lookup.

3.4 Assembly lookup cache

Take ARM64 for example (different architectures correspond to different assembly instruction sets).

ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	cmp	p0, #0			// nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
	b.eq	LReturnZero
#endif
	ldr	p13, [x0]		// p13 = isa
	GetClassFromIsa_p16 p13		// p16 = class
LGetIsaDone:
	CacheLookup NORMAL		// calls imp or objc_msgSend_uncached

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
	b.eq	LReturnZero		// nil check

	// tagged
	adrp	x10, _objc_debug_taggedpointer_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
	ubfx	x11, x0, #60, #4
	ldr	x16, [x10, x11, LSL #3]
	adrp	x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGE
	add	x10, x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGEOFF
	cmp	x10, x16
	b.ne	LGetIsaDone

	// ext tagged
	adrp	x10, _objc_debug_taggedpointer_ext_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_ext_classes@PAGEOFF
	ubfx	x11, x0, #52, #8
	ldr	x16, [x10, x11, LSL #3]
	b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
	// x0 is already zero
	mov	x1, #0
	movi	d0, #0
	movi	d1, #0
	movi	d2, #0
	movi	d3, #0
	ret

	END_ENTRY _objc_msgSend

  • After getting a detailed understanding of isa, these assembly instructions are fairly easy to follow. If you don't know isa well yet, it's worth reading up on it first.

  • If you're not familiar with assembly, tools such as Hopper and IDA can lift assembly into higher-level pseudocode to help you read it.

The whole assembly part of _objc_msgSend is really the cache lookup, which we call the fast path (the cache lookup). Its specific steps are as follows:

1. Null check on the object
ENTRY _objc_msgSend
UNWIND _objc_msgSend, NoFrame

cmp	p0, #0

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
	b.eq	LReturnZero		// nil check

The message receiver is the hidden first parameter of every OC method. In arm64 assembly, a method's first parameter and its return value are kept in register x0.

So this checks whether the message receiver in x0 is nil. If it is, the function returns immediately, which is why sending a message to nil neither crashes nor calls anything.
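
A quick illustration of that behavior, as a fragment you could drop into the main function from section 1.6:

LBPerson *person = nil;

// Messaging nil is a no-op: nothing is called, nothing crashes,
// and the return value is zero/nil for the common return types.
[person testInstanceMethod];              // silently does nothing
NSString *name = [person propertyName];   // name == nil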

2. Get the isa from the message receiver
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
	b.eq	LReturnZero
#endif
	ldr	p13, [x0]		// p13 = isa
	GetClassFromIsa_p16 p13		// p16 = class

Since every object is fundamentally an objc_object, each object has an isa, and fetching the isa from an object has to account for the isa optimizations (whether it is a non-pointer isa, and whether the object is a tagged pointer).

If you're not familiar with this part, please read Isa's past and present lives, which covers it in detail.

The assembly that obtains the isa is as follows:

.macro GetClassFromIsa_p16 /* src */

#if SUPPORT_INDEXED_ISA
	// Indexed isa
	mov	p16, $0			// optimistically set dst = src
	tbz	p16, #ISA_INDEX_IS_NPI_BIT, 1f	// done if not non-pointer isa
	// isa in p16 is indexed
	adrp	x10, _objc_indexed_classes@PAGE
	add	x10, x10, _objc_indexed_classes@PAGEOFF
	ubfx	p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index
	ldr	p16, [x10, p16, UXTP #PTRSHIFT]	// load class from array
1:

#elif __LP64__
	// 64-bit packed isa
	and	p16, $0, #ISA_MASK
#else
	// 32-bit raw isa
	mov	p16, $0
#endif
.endmacro

This leaves the class (taken from isa) in register p16. It is essentially the same as getIsa in the objc source code, except that one is implemented in assembly and the other in C:

inline Class objc_object::getIsa() 
{
    if (!isTaggedPointer()) return ISA();
    uintptr_t ptr = (uintptr_t)this;
    if (isExtTaggedPointer()) {
        uintptr_t slot = 
            (ptr >> _OBJC_TAG_EXT_SLOT_SHIFT) & _OBJC_TAG_EXT_SLOT_MASK;
        return objc_tag_ext_classes[slot];
    } else {
        uintptr_t slot = 
            (ptr >> _OBJC_TAG_SLOT_SHIFT) & _OBJC_TAG_SLOT_MASK;
        return objc_tag_classes[slot];
    }
}
3. Walk the cache's hash buckets and look for the method implementation

Assembly source code: CacheLookup

.macro CacheLookup
	// p1 = SEL, p16 = isa
	ldp	p10, p11, [x16, #CACHE]	// p10 = buckets, p11 = occupied|mask
#if !__LP64__
	and	w11, w11, 0xffff	// p11 = mask
#endif
	and	w12, w1, w11		// x12 = _cmd & mask
	add	p12, p10, p12, LSL #(1+PTRSHIFT)
		             // p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			//     scan more
	CacheHit $0			// call or return imp

2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

3:	// wrap: p12 = first bucket, w11 = mask
	add	p12, p12, w11, UXTW #(1+PTRSHIFT)
		                        // p12 = buckets + (mask << 1+PTRSHIFT)

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			//     scan more
	CacheHit $0			// call or return imp

2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

3:	// double wrap
	JumpMiss $0

.endmacro

CacheLookup has three modes: NORMAL, GETIMP, and LOOKUP. We'll go through the NORMAL case first.

Recall the class structure we covered earlier: an 8-byte isa plus an 8-byte superclass, followed by cache_t, which is why the cache sits at offset 16 (#CACHE).

ldp	p10, p11, [x16, #CACHE]	// p10 = buckets, p11 = occupied|mask
and	w12, w1, w11		// x12 = _cmd & mask
add	p12, p10, p12, LSL

  • The ldp instruction loads buckets and occupied|mask, which sit at offset 16 (#CACHE) in the class, into p10 and p11. The method name (cast to cache_key_t) is then ANDed with the mask (the and instruction is a bitwise &):

    • To explain a little: we already know that the value of mask is the number of buckets in the cache minus 1. A class's cache starts with 4 buckets, and the count doubles every time it grows.

    • That means every binary bit of mask is 1, so ANDing the method name with mask keeps only its low bits, which is what indexes an element in the hash buckets.

    • Therefore the index produced by the hash algorithm is always less than the number of buckets in the cache and can never cross the boundary.

    • And if that is how a method is found, it must also be how it was stored. More on this later.

  • After the index is obtained from the hash algorithm, the method name being looked up is compared in a loop against the method name stored in each bucket, decrementing the index (index--) on every step. A simplified C sketch of this loop follows the list.

    • If it is found, CacheHit $0 directly calls or returns the imp.
    • If the key is not found and $0 == NORMAL: cbz p9, __objc_msgSend_uncached.
    • If a key is found but it has no matching imp (a hash collision), index-- and keep looking.
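
To make the scan-and-wrap logic easier to follow, here is a simplified C sketch of the same lookup loop (an illustration only, with made-up names; it is not the runtime's actual code):

#include <objc/objc.h>
#include <stdint.h>
#include <stddef.h>

typedef struct { SEL sel; IMP imp; } sketch_bucket_t;

// Probe the buckets the way the assembly above does: start at _cmd & mask,
// then step backwards (wrapping once) until we hit the sel or an empty slot.
static IMP sketch_cache_lookup(sketch_bucket_t *buckets, uintptr_t mask, SEL _cmd) {
    uintptr_t begin = (uintptr_t)_cmd & mask;   // index is always <= mask, never out of bounds
    uintptr_t i = begin;
    do {
        if (buckets[i].sel == _cmd) return buckets[i].imp;  // CacheHit
        if (buckets[i].sel == NULL) return NULL;            // CheckMiss: empty slot, give up
        i = i ? i - 1 : mask;                               // index--, wrapping to the last bucket
    } while (i != begin);
    return NULL;                                            // scanned every bucket: JumpMiss
}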

4. On a cache miss, call __objc_msgSend_uncached.

STATIC_ENTRY __objc_msgSend_uncached
    UNWIND __objc_msgSend_uncached, FrameWithNoSaves
    MethodTableLookup
    TailCallFunctionPointer x17
END_ENTRY __objc_msgSend_uncached

MethodTableLookup:

.macro MethodTableLookup
    SignLR
    stp	fp, lr, [sp, #-16]!
    mov	fp, sp

    /* register saving omitted */
    // receiver and selector already in x0 and x1
    mov	x2, x16
    bl	__class_lookupMethodAndLoadCache3
.endmacro

_class_lookupMethodAndLoadCache3 brings us into the C function flow, which is what we call the method lookup and message forwarding process through the class and its superclasses.

If you're interested in this part, read The nature of OC methods and OC method lookup and message forwarding; those two articles explore the nature of methods and the complete lookup process in detail.

3.5 Storing into the cache

Now that we know how the cache is searched when a method is called, let's look at how a method is stored into the cache after a cache miss, and how the cache expands once it reaches its threshold.

As we said in OC method lookup and message forwarding, after searching the method lists of the class itself and its superclasses:

  • If an imp is found, log_and_fill_cache is called.
  • If no imp is found, there is first a chance for dynamic method resolution, and ultimately the imp returned by resolveInstanceMethod is taken and passed to cache_fill.

The source code is as follows:

/** Find this class */
{
    Method meth = getMethodNoSuper_nolock(cls, sel);
    if (meth) {
        log_and_fill_cache(cls, meth->imp, sel, inst, cls);
        imp = meth->imp;
        goto done;
    }
}

/** Walk up and search the superclasses */
unsigned attempts = unreasonableClassCount();
for (Class curClass = cls->superclass; curClass != nil; curClass = curClass->superclass) {
    /* ... */
    if (imp) {
        if (imp != (IMP)_objc_msgForward_impcache) {
            // Found the method in a superclass. Cache it in this class.
            log_and_fill_cache(cls, imp, sel, inst, curClass);
            goto done;
        }
    }
}

/* After dynamic method resolution */
imp = (IMP)_objc_msgForward_impcache;
cache_fill(cls, sel, imp, inst);

log_and_fill_cache calls cache_fill internally, and cache_fill calls cache_fill_nolock, so let's go straight to that implementation.

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    if (!cls->isInitialized()) return;

    // Check again to make sure no other threads happen to have stored the method
    if (cache_getImp(cls, sel)) return;
    // fetch cache_t for CLS class/metaclass
    cache_t *cache = getCache(cls);

    // The new occupation is the old + 1
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // First time this class caches a method;
        // cache_t starts as a read-only empty cache, so allocate real storage.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // The new occupancy does not exceed three quarters of the capacity: do nothing.
    }
    else {
        // The new occupancy exceeds three quarters of the capacity: expand.
        cache->expand();
    }

    // Find the bucket for this sel; if the sel is already cached, occupancy
    // does not increase and its imp is simply updated.
    // If not found, sel and imp are set into the returned (empty) bucket.
    bucket_t *bucket = cache->find(sel, receiver);
    if (bucket->sel() == 0) cache->incrementOccupied();
    bucket->set<Atomic>(sel, imp);
}

The cache-filling logic is explained fairly clearly by the comments in the source above.

Now let's look at expansion, to see why _occupied became 1 and _mask became 7 after we called testFunc3 in the earlier exploration.

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};

You can see the new capacity is determined as follows:

  • When the old capacity is 0 (the mask is 0), the new capacity is INIT_CACHE_SIZE, i.e. 1 << 2 = 4.
  • When oldCapacity is not 0, the new capacity is twice the old capacity.
  • If the new capacity would overflow mask_t (4 bytes), it is reset to the old capacity.

After the capacity is worked out, reallocate resets the hash buckets, which is why the cache entries stored before testFunc3 were gone after we called it above. The sketch below walks through the same arithmetic.
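
A small simulation (illustration only; it re-implements just the growth rule above, not the real cache) reproduces the numbers we saw in LLDB:

#include <stdint.h>
#include <stdio.h>

typedef uint32_t mask_t;
enum { INIT_CACHE_SIZE = 1 << 2 };   // 4, as in the enum above

static uint32_t next_capacity(uint32_t oldCapacity) {
    uint32_t newCapacity = oldCapacity ? oldCapacity * 2 : INIT_CACHE_SIZE;
    if ((uint32_t)(mask_t)newCapacity != newCapacity) newCapacity = oldCapacity;  // mask_t overflow
    return newCapacity;
}

int main(void) {
    uint32_t capacity = INIT_CACHE_SIZE, occupied = 0;
    const char *sels[] = { "init", "testFunc1", "testFunc2", "testFunc3" };
    for (int i = 0; i < 4; i++) {
        if (occupied + 1 > capacity / 4 * 3) {   // past three quarters: expand
            capacity = next_capacity(capacity);
            occupied = 0;                         // reallocate() drops the old buckets
        }
        occupied++;
        printf("%-10s _occupied=%u _mask=%u\n", sels[i], occupied, capacity - 1);
    }
    return 0;
}

Running it prints _occupied/_mask of 1/3, 2/3, 3/3, and finally 1/7 for testFunc3, matching the LLDB output.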

3.6 Questions

🔐 1. Why are the hash buckets reset on every expansion?

🔑 answer:

  • 1. Because of how a hash table maps keys to addresses, every time the total capacity changes the mapping of all existing elements becomes invalid, since the index derived from the hash depends on the capacity.

  • 2. If every previously cached method had to be rehashed and re-stored, the cost would be rather high.

  • 3. So, as long as the most recent method can still be cached, re-storing all the old methods on every expansion would conflict with the very purpose of the cache (improving method-lookup efficiency).

🔐 2. Why is the cache expanded when three quarters of it is occupied? Many eviction strategies in iOS use this three-quarters rule.

🔑 answer:

  • 1. Expansion is not guaranteed to succeed, so expanding early guarantees there is still room to store. If the cache filled up completely and only then tried to expand, a failed expansion would mean the newest method could not be cached, and by the common LRU principle (least recently used) the newest method is exactly the one that should not be dropped.

  • 2. From the perspective of space/time utilization and hash collisions, 3/4 is a good load-factor trade-off.

  • 3. With the system always keeping at least a quarter of the buckets empty, the lookup loop is guaranteed either to hit the cache or to exit at a slot whose key == NULL, which avoids an infinite loop.

3.7 Cache read/write safety under multithreading

This part references the following articles:

  • Deconstruct the implementation of the objc_msgSend function in depth
  • Caching principle of OC source analysis method
3.7.1 Multiple threads reading the cache

First of all, reading the cache performs no writes, so multiple threads reading the cache is not a thread-safety problem by itself. For efficiency, the _objc_msgSend assembly therefore reads the cache without any locking.

3.7.2 Multiple threads writing the cache

Both expand and cache_fill_nolock begin with

cacheUpdateLock.assertLocked();

/***********************************************************************
* Lock management
***********************************************************************/
mutex_t runtimeLock;
mutex_t selLock;
mutex_t cacheUpdateLock;
recursive_mutex_t loadMethodLock;

The runtime uses a global mutex, and when filling the cache it checks again before writing to make sure no other thread has just stored the same method; this keeps multithreaded cache writes safe.

3.7.3 Concurrent reads and writes

First, let's think about what can go wrong when the cache is read and written concurrently.

  • 🔐 1. Deletion: when one thread's write reaches the threshold and triggers expansion, the old buckets are discarded and only the most recently called method is kept; other threads reading at that moment may access freed memory.
  • 🔐 2. Modification: when one thread is modifying the imp that a bucket's sel maps to, other threads reading that bucket can also run into trouble.
  • 🔐 3. Growth: when one thread expands the buckets, mask gets bigger; if another thread reads the old buckets together with the new mask, it will index out of bounds.

With that in mind, let's explore how libobjc meets these thread-safety requirements as far as possible without sacrificing performance.

Question 1:

🔑 A: To ensure that zeroing the old buckets during expansion does not affect other threads' reads, the system saves the start and end addresses of the six functions that access a Class object's cache data into two global arrays:

extern "C" uintptr_t objc_entryPoints[];
extern "C"  uintptr_t objc_exitPoints[];

  • When a writer thread expands a class's hash buckets, it first puts the old bucket memory block that needs to be destroyed into a global garbage-collection array, garbage_refs.

  • It then iterates over every thread in the current process and checks whether the value of each thread's PC register falls within the ranges defined by objc_entryPoints and objc_exitPoints.

  • In other words, it checks whether any thread is currently executing one of the functions in the objc_entryPoints list.

    • If none is, no function is accessing any Class object's cache data at that moment, and all the bucket memory blocks waiting in garbage_refs can safely be destroyed.
    • If some thread is executing one of the objc_entryPoints functions, the blocks are left alone and checked again later, to be destroyed when it is safe.

  • This guarantees that reader threads never get a memory-access fault while accessing the buckets in a Class object's cache.

Question 2/3:

🔑 A: In the _objc_msgSend cache-lookup assembly above, we see

ldp x10, x11, [x16, #CACHE]	// x10 = buckets, x11 = occupied|mask

  • In the entire cache-lookup path, buckets and occupied|mask are read once into registers x10 and x11, and all subsequent processing works with the values in those two registers.

  • An assembly instruction is itself the smallest unit of execution ("atomic" in the original sense of indivisible), so in most cases a single assembly instruction is atomic.

    • Even so, the system can still interrupt execution between instructions, and ordinary instruction sequences can be interleaved unexpectedly; at the assembly level the lock instruction prefix exists to secure data at the instruction-execution layer.
    • A single instruction carrying the lock prefix, and certain other special instructions, are true atomic operations.
    • For example: lock addl $0x1, %r8d
  • So as long as the _buckets and _mask read into x10 and x11 match each other (i.e. they are both pre-expansion or both post-expansion values), the call is unaffected even if the buckets are freed mid-call, because the values are already held in registers.

How do we ensure that _buckets and _mask are a matching pair and are not reordered by compiler optimization?

With a compiler memory barrier.

First of all, we know that _mask is always a new value in each expansion and may be larger than the old value.

// Update the cache's hash bucket storage and mask values.
void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.

    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    buckets = newBuckets;
    
    // ensure other threads see new buckets before new mask
    mega_barrier();
    
    mask = newMask;
    occupied = 0;
}

As the comments in the code above make clear, the compiler memory barrier ensures that the assignment to _buckets happens before the assignment to _mask. That is, after the instruction

ldp	x10, x11, [x16, #CACHE]

executes, the registers hold either the old buckets with the old mask, or the new buckets with the old mask, but never the old buckets with the new mask, so no out-of-bounds access can occur. Together, these measures keep multithreaded cache reads and writes safe.

At this point, we have finished exploring the method cache's lookup as well as its reads and writes.

4. Summary of the method caching principle

  • 1. OC methods are cached: class methods and instance methods are stored separately, in the cache_t buckets (hash buckets) of the metaclass and of the class respectively. A message goes to the cache first; on a hit, the method lookup and forwarding process does not continue.
  • 2. cache_t uses capacity (mask ? mask + 1 : 0) to report the current total capacity.
  • 3. cache_t uses occupied to record the capacity currently in use.
  • 4. When the used capacity reaches three quarters of the total, the hash buckets are expanded to double the current capacity (unless the new capacity would overflow mask_t, in which case no expansion happens). During expansion only the most recent sel/imp is retained; the older cache entries are discarded.
  • 5. Cache reads are thread-safe.

That covers the basic knowledge of classes and the points involved. More articles on iOS internals will follow, so stay tuned.