OC Source Code Analysis: The Method Lookup Principle

Preface

If you want to be a solid iOS developer, you have to read source code. The following is a short series that I put together while exploring the OC source code, covering classes and objects. You are welcome to read and correct it, and I hope it helps you.

OC Source Code Analysis: Object Creation

OC Source Code Analysis: isa

OC Source Code Analysis: Class Structure

OC Source Code Analysis: The Method Caching Principle

OC Source Code Analysis: The Method Lookup Principle

OC Source Code Analysis: The Method Resolution and Forwarding Principle

In Objective-C, when the compiler encounters a method call, it turns the call into a call to one of the following functions:

objc_msgSend, objc_msgSend_stret, objc_msgSendSuper, and objc_msgSendSuper_stret.

Messages sent to the object's parent class (using the super keyword) are sent with objc_msgSendSuper; other messages are sent with objc_msgSend. If the method returns a data structure, it is sent with objc_msgSendSuper_stret or objc_msgSend_stret.

All four functions send messages: they do some preparatory work and then carry out method lookup, resolution, and forwarding. The topic of this article is method lookup. Starting from a method call, I will walk step by step through the implementation of objc_msgSend and the method lookup process.

Let's get straight to the point.

Note that the source code I used is objc4-756.2.

1 `objc_msgSend` analysis

1.1 For example

Let's start with a simple example. The code is as follows

@interface Person : NSObject

- (void)personInstanceMethod1;

@end

@implementation Person

- (void)personInstanceMethod1 {
    NSLog(@"%s", __FUNCTION__);
}

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *person = [Person alloc];
        [person personInstanceMethod1];
    }
    return 0;
}

Rewrite the main.m file as C++ with the clang command

clang -rewrite-objc main.m -o main.cpp

Open the generated main.cpp file

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 
        Person *person = ((Person *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("Person"), sel_registerName("alloc"));
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("personInstanceMethod1"));
    }
    return 0;
}

It is much easier to read after removing the type casts

Person *person = objc_msgSend(objc_getClass("Person"), sel_registerName("alloc"));
objc_msgSend(person, sel_registerName("personInstanceMethod1"));

It turns out that the calls to the +alloc and personInstanceMethod1 methods are actually calling the objc_msgSend function.
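
The casts in the rewritten code are there because objc_msgSend has no single concrete signature, so every call site has to call it through a pointer of the method's real type. The following is a minimal C sketch of that pattern. The names generic_fn, add_two, half, call_add and call_half are made up for illustration and are not part of the runtime.

```c
#include <assert.h>

/* A deliberately generic function-pointer type, standing in for the way
   objc_msgSend is used without one fixed signature. (Hypothetical name.) */
typedef void (*generic_fn)(void);

/* Two "implementations" with different real signatures. */
static int add_two(int a, int b) { return a + b; }
static double half(double x) { return x / 2.0; }

/* Each call site casts the generic pointer back to the real type before
   calling it, just as the rewritten code casts objc_msgSend at every call. */
static int call_add(generic_fn f, int a, int b) {
    assert(f != 0);
    return ((int (*)(int, int))f)(a, b);
}

static double call_half(generic_fn f, double x) {
    assert(f != 0);
    return ((double (*)(double))f)(x);
}
```

Calling call_add((generic_fn)add_two, 2, 3) goes through the same kind of cast as the two rewritten lines above; calling through a pointer of the wrong type would be undefined behavior, which is why the casts cannot simply be dropped in real code.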

1.2 Nature of method invocation

A declaration of the objc_msgSend function can be found in the objc-msg-arm64.s file

id objc_msgSend(id self, SEL _cmd, ...)

The first argument, self, is the caller itself, which is also the receiver; the second argument, _cmd, is the method's selector; the remaining variadic arguments are the method's own arguments.

A quick explanation:

id refers to an OC object. An object's layout in memory is not fixed, but its first address holds the object's isa, through which the objc_class can be obtained at run time.

objc_class represents an object's Class; its structure is determined at compile time.

SEL represents a selector, which can usually be thought of as a string. OC maintains a SEL table at run time, which maps method names with the same string to a single unique SEL.

The same method name in any class maps to the same SEL (so a SEL is approximately equivalent to a method name).

A SEL can be obtained with the C function sel_registerName(char *name). OC also provides the syntactic sugar @selector to make this more convenient.

IMP is a function pointer. All OC methods are ultimately C functions, and an IMP is the address of such a function.

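One conclusion can be drawn from the above example: the essence of a method call is to send a SEL to the receiver via objc_msgSend, find the concrete function address IMP, and execute that function.

In other words, the message expression [person personInstanceMethod1] and the direct call objc_msgSend(person, sel_registerName("personInstanceMethod1")) have the same effect.

To call the objc_msgSend function directly, you need to change one build setting (the original post showed it in a figure; in Xcode this is typically Enable Strict Checking of objc_msgSend Calls, set to NO).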
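
To make the "SEL in, IMP out, then call" idea concrete, here is a small C sketch of a toy selector table and a toy message send. It only imitates the idea; every name with a toy_ prefix is hypothetical, and the real runtime uses hashed tables and caches rather than these linear scans.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *toy_sel;                        /* an interned selector name */
typedef int (*toy_imp)(void *self, toy_sel cmd);    /* an implementation */

#define TOY_MAX_SELS 16
static const char *toy_sel_table[TOY_MAX_SELS];
static size_t toy_sel_count;

/* Return the unique pointer for a name, registering it on first use.
   Equal names always yield the same toy_sel, so selectors can then be
   compared by pointer. The caller's string must outlive the table. */
static toy_sel toy_register_name(const char *name) {
    for (size_t i = 0; i < toy_sel_count; i++) {
        if (strcmp(toy_sel_table[i], name) == 0) return toy_sel_table[i];
    }
    assert(toy_sel_count < TOY_MAX_SELS);
    toy_sel_table[toy_sel_count] = name;
    return toy_sel_table[toy_sel_count++];
}

typedef struct { toy_sel name; toy_imp imp; } toy_method;
typedef struct { const toy_method *methods; size_t count; } toy_class;
typedef struct { const toy_class *isa; int value; } toy_object;

/* A sample implementation: returns the object's value field. */
static int toy_get_value(void *self, toy_sel cmd) {
    (void)cmd;
    return ((toy_object *)self)->value;
}

/* Find the IMP for cmd in the receiver's class and call it. */
static int toy_msg_send(toy_object *self, toy_sel cmd) {
    if (self == NULL) return 0;                     /* messaging nil yields zero */
    const toy_class *cls = self->isa;
    for (size_t i = 0; i < cls->count; i++) {
        if (cls->methods[i].name == cmd) return cls->methods[i].imp(self, cmd);
    }
    return -1;  /* not found; the real runtime would try resolution and forwarding */
}
```

The -1 return value is only a placeholder chosen for this sketch; what the real runtime does when nothing is found is the subject of the later sections.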

2 Method lookup

Method lookup, also known as message lookup, starts in objc_msgSend, which does some preparation first; the lookup itself begins only once that preparation is done.

2.1 `objc_msgSend` source code analysis

Taking the arm64 architecture as an example, the objc_msgSend source code and its analysis are as follows

    // ENTRY marks the function entry point
    ENTRY _objc_msgSend
    UNWIND _objc_msgSend, NoFrame

    // p0 holds the first argument of objc_msgSend, i.e. the receiver.
    // Check the receiver against nil here
    cmp	p0, #0			    // nil check and tagged pointer check
    // SUPPORT_TAGGED_POINTERS says whether tagged pointers are supported; it is 1 on 64-bit CPUs
#if SUPPORT_TAGGED_POINTERS 
    // Tagged pointers supported: if p0 <= 0 (le = less than or equal), branch to LNilOrTagged
    b.le	LNilOrTagged		// (MSB tagged pointer looks negative)
#else
    // Tagged pointers not supported: if p0 == 0 (eq = equal), branch to LReturnZero
    b.eq	LReturnZero
#endif
    // Read the isa of the receiver (instance object, class object, metaclass object) to p13
    ldr	p13, [x0]		        // p13 = isa
    // Get class according to isa
    GetClassFromIsa_p16 p13		// p16 = class
LGetIsaDone:
    CacheLookup NORMAL	    // calls imp or objc_msgSend_uncached

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
    // p0 == 0, i.e. receiver is nil, jump to LReturnZero
    b.eq	LReturnZero		    // nil check

    // tagged
    adrp	x10, _objc_debug_taggedpointer_classes@PAGE
    add	x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
    ubfx	x11, x0, #60, #4
    ldr	x16, [x10, x11, LSL #3]
    adrp	x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGE
    add	x10, x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGEOFF
    cmp	x10, x16
    b.ne	LGetIsaDone

    // ext tagged
    adrp	x10, _objc_debug_taggedpointer_ext_classes@PAGE
    add	x10, x10, _objc_debug_taggedpointer_ext_classes@PAGEOFF
    ubfx	x11, x0, #52, #8
    ldr	x16, [x10, x11, LSL #3]
    b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
    // x0 is already zero
    mov	x1, #0
    movi	d0, #0
    movi	d1, #0
    movi	d2, #0
    movi	d3, #0
    ret
    
    // END_ENTRY indicates the end of the function
    END_ENTRY _objc_msgSend

As you can see, the main job of objc_msgSend is to obtain the receiver's isa.
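
The receiver checks at the top of objc_msgSend can be sketched in C as follows. This is only an illustration of the bit tests in the assembly above: the enum and function names are hypothetical, and the sketch assumes a 64-bit pointer whose most significant bit marks a tagged pointer.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RECEIVER_NIL, RECEIVER_TAGGED, RECEIVER_OBJECT } receiver_kind;

/* The assembly does this with one signed compare (cmp p0, #0 then b.le):
   zero is nil, and a pointer with its top bit set "looks negative".
   Here the two tests are written separately for clarity. */
static receiver_kind classify_receiver(uint64_t p) {
    if (p == 0) return RECEIVER_NIL;           /* goes to LReturnZero */
    if (p >> 63) return RECEIVER_TAGGED;       /* goes to the tagged-pointer path */
    return RECEIVER_OBJECT;                    /* load isa from the object */
}

/* ubfx x11, x0, #60, #4: extract 4 bits starting at bit 60. */
static unsigned tagged_index(uint64_t p) {
    assert(p >> 63);
    return (unsigned)((p >> 60) & 0xF);
}

/* ubfx x11, x0, #52, #8: extract 8 bits starting at bit 52. */
static unsigned ext_tagged_index(uint64_t p) {
    assert(p >> 63);
    return (unsigned)((p >> 52) & 0xFF);
}
```

The extracted index is what the assembly uses to load a class from the _objc_debug_taggedpointer_classes or _objc_debug_taggedpointer_ext_classes table.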

Consider: Why is objc_msgSend written in assembly?

2.2 `GetClassFromIsa_p16`

GetClassFromIsa_p16 is reached in two cases:

When tagged pointers are supported (64-bit architectures) and the receiver is neither nil nor a tagged pointer object;
When tagged pointers are not supported and the receiver is not nil.

In both cases GetClassFromIsa_p16 is executed; its source code is shown below

.macro GetClassFromIsa_p16 /* src */

#if SUPPORT_INDEXED_ISA     // armv7k or arm64_32
	// Indexed isa
	mov	p16, $0			// optimistically set dst = src
	tbz	p16, #ISA_INDEX_IS_NPI_BIT, 1f	// done if not non-pointer isa
	// isa in p16 is indexed
	adrp	x10, _objc_indexed_classes@PAGE
	add	x10, x10, _objc_indexed_classes@PAGEOFF
	ubfx	p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index
	ldr	p16, [x10, p16, UXTP #PTRSHIFT]	// load class from array
1:

#elif __LP64__
	// 64-bit packed isa
	and	p16, $0, #ISA_MASK

#else
	// 32-bit raw isa
	mov	p16, $0

#endif

.endmacro

On 64-bit architectures, isa & ISA_MASK yields the class pointer stored in the isa. The result is a class or a metaclass, depending on whether the receiver is an instance object or a class object.
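
As a small illustration of the masking step, the C sketch below packs a class address together with some extra bits and recovers it with the mask. The mask value is the arm64 ISA_MASK as I read it in objc4-756.2's isa.h (other architectures use a different value); the helper names and the way the extra bits are packed are assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* arm64 value of ISA_MASK in objc4-756.2 (assumed here; x86_64 differs). */
#define TOY_ISA_MASK 0x0000000ffffffff8ULL

/* Equivalent of: and p16, $0, #ISA_MASK */
static uint64_t class_from_isa(uint64_t isa_bits) {
    return isa_bits & TOY_ISA_MASK;
}

/* Build a packed isa for the demo: the class address occupies the masked
   bits, the 3 low bits and the bits above bit 35 carry other information. */
static uint64_t make_packed_isa(uint64_t class_addr, uint64_t low_flags,
                                uint64_t high_bits) {
    assert((class_addr & ~TOY_ISA_MASK) == 0);   /* aligned and small enough */
    return (high_bits << 36) | class_addr | (low_flags & 0x7);
}
```

Whatever the surrounding bits contain, class_from_isa returns the original class address, which is exactly what objc_msgSend needs before it can look at the class's cache.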

That is, the main job of objc_msgSend so far is to obtain the receiver's class from its isa; once it has the class, it runs CacheLookup:

LGetIsaDone:
    CacheLookup NORMAL	    // calls imp or objc_msgSend_uncached

2.3 `CacheLookup`

When CacheLookup is entered from objc_msgSend, p1 = SEL and p16 = isa (the class).

The purpose of CacheLookup is to find the method implementation in the cache. It has three modes: NORMAL, GETIMP, and LOOKUP.

Take a look at the source code of CacheLookup

.macro CacheLookup
	// p1 = SEL, p16 = isa
	ldp	p10, p11, [x16, #CACHE]	// p10 = buckets, p11 = occupied|mask
#if !__LP64__
	and	w11, w11, 0xffff	// p11 = mask
#endif
	and	w12, w1, w11		// x12 = _cmd & mask
	add	p12, p10, p12, LSL #(1+PTRSHIFT)
		             // p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			// scan more
	CacheHit $0			// call or return imp
	
2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

3:	// wrap: p12 = first bucket, w11 = mask
	add	p12, p12, w11, UXTW #(1+PTRSHIFT)
		                        // p12 = buckets + (mask << 1+PTRSHIFT)

	// Clone scanning loop to miss instead of hang when cache is corrupt.
	// The slow path may detect any corruption and halt later.

	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
	b.ne	2f			// scan more
	CacheHit $0			// call or return imp
	
2:	// not hit: p12 = not-hit bucket
	CheckMiss $0			// miss if bucket->sel == 0
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b			// loop

3:	// double wrap
	JumpMiss $0
	
.endmacro

CacheLookup operates on the class's cache member, of type cache_t, which caches the methods that have been called.

The analysis of CacheLookup is as follows:

About ldp p10, p11, [x16, #CACHE]

Find the definition of CACHE

#define CACHE            (2 * __SIZEOF_POINTER__)
#define CLASS            __SIZEOF_POINTER__

On a 64-bit CPU a pointer is 8 bytes, so CACHE is 16: the cache member sits 16 bytes into the class, after the isa and superclass pointers. Within the cache_t structure, the 8-byte buckets pointer is loaded into p10, and the 4-byte mask together with the 4-byte occupied is loaded into p11; the low 32 bits of p11 hold the mask.
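
The layout that this load relies on can be sketched with plain C structs. This is a simplified stand-in, not the runtime's real declarations: the toy_ names are hypothetical, and the split of the 64-bit pair into mask and occupied assumes a little-endian machine, as on arm64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bucket;                 /* contents do not matter for the layout */

struct toy_cache {
    struct toy_bucket *buckets;    /* first word: ends up in p10 */
    uint32_t mask;                 /* low half of the second word (p11) */
    uint32_t occupied;             /* high half of the second word (p11) */
};

struct toy_class {
    void *isa;                     /* offset 0 */
    void *superclass;              /* offset 1 * sizeof(void *) */
    struct toy_cache cache;        /* offset 2 * sizeof(void *), i.e. CACHE */
};

#define TOY_CACHE_OFFSET (2 * sizeof(void *))

/* Split the 64-bit value loaded into p11 into its two 32-bit fields. */
static uint32_t mask_from_pair(uint64_t p11) {
    return (uint32_t)(p11 & 0xffffffffu);
}

static uint32_t occupied_from_pair(uint64_t p11) {
    return (uint32_t)(p11 >> 32);
}
```

With this layout, offsetof(struct toy_class, cache) equals TOY_CACHE_OFFSET, which is the same arithmetic as the CACHE definition above.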

Find the target bucket

#if __LP64__
// arm64
#define PTRSHIFT 3
#else
// arm64_32
#define PTRSHIFT 2
#endif

The index is computed with the hash SEL & mask, which locates the corresponding bucket; the bucket's IMP and SEL are then loaded into p17 and p9 respectively.

Consider: why is the index shifted left by 1 + PTRSHIFT bits?

CacheHit, CheckMiss, and JumpMiss

Once the bucket is found, the flow is as follows:

If the bucket's sel equals the method's sel, execute CacheHit, which directly returns or calls the imp;
If the bucket's sel does not equal the method's sel, execute {imp, sel} = *--bucket, i.e. step backwards through the buckets, comparing each bucket's sel with the method's sel;
If the bucket's sel is 0, i.e. the bucket is empty, execute CheckMiss;
If the scan reaches the first bucket, it wraps around to the last bucket and continues; if it reaches the first bucket a second time, execute JumpMiss.
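
The flow above can be sketched in C as a backwards linear probe over an array of buckets. This is a simplified model under stated assumptions: the toy_ names are hypothetical, selectors are plain integers with 0 meaning "empty", and the loop visits each bucket at most once instead of reproducing the assembly's cloned second scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t toy_sel;                 /* 0 marks an empty bucket */
typedef int (*toy_imp)(void);

typedef struct { toy_imp imp; toy_sel sel; } toy_bucket;   /* {imp, sel} */
typedef struct { toy_bucket *buckets; uint32_t mask; } toy_cache;  /* capacity = mask + 1 */

static int toy_imp_a(void) { return 1; }
static int toy_imp_b(void) { return 2; }

/* Return the cached IMP for sel, or NULL on a miss. */
static toy_imp toy_cache_lookup(const toy_cache *cache, toy_sel sel) {
    assert(sel != 0);
    size_t capacity = (size_t)cache->mask + 1;
    size_t i = (size_t)(sel & cache->mask);          /* hash: _cmd & mask */
    for (size_t visited = 0; visited < capacity; visited++) {
        const toy_bucket *b = &cache->buckets[i];
        if (b->sel == sel) return b->imp;            /* CacheHit */
        if (b->sel == 0) return NULL;                /* CheckMiss: empty bucket */
        i = (i == 0) ? cache->mask : i - 1;          /* *--bucket, wrapping to the end */
    }
    return NULL;                                     /* JumpMiss: scanned everything */
}
```

In NORMAL mode the real macro jumps to the IMP on a hit and to __objc_msgSend_uncached on a miss; the sketch just returns the pointer or NULL.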

Take a look at the source code for the CacheHit, CheckMiss, and JumpMiss macros

// CacheHit: x17 = cached IMP, x12 = address of cached IMP, x1 = SEL
.macro CacheHit
.if $0 == NORMAL
	TailCallCachedImp x17, x12, x1	// authenticate and call imp
.elseif $0 == GETIMP
	mov	p0, p17
	cbz	p0, 9f			// don't ptrauth a nil imp
	AuthAndResignAsIMP x0, x12, x1	// authenticate imp and re-sign as IMP
9:	ret				// return IMP
.elseif $0 == LOOKUP
	// No nil check for ptrauth: the caller would crash anyway when they
	// jump to a nil IMP. We don't care if that jump also fails ptrauth.
	AuthAndResignAsIMP x17, x12, x1	// authenticate imp and re-sign as IMP
	ret				// return imp via x17
.else
.abort oops
.endif
.endmacro

// CheckMiss
.macro CheckMiss
	// miss if bucket->sel == 0
.if $0 == GETIMP
	cbz	p9, LGetImpMiss
.elseif $0 == NORMAL
	cbz	p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP
	cbz	p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

// JumpMiss
.macro JumpMiss
.if $0 == GETIMP
	b	LGetImpMiss
.elseif $0 == NORMAL
	b	__objc_msgSend_uncached
.elseif $0 == LOOKUP
	b	__objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

The three macros can be summarized as follows:

In NORMAL mode, CacheHit calls the IMP it found; in GETIMP and LOOKUP modes it only returns the IMP without calling it.
JumpMiss and CheckMiss behave essentially the same way in each of the three modes:
- in NORMAL mode, both branch to __objc_msgSend_uncached;
- in GETIMP mode, both branch to LGetImpMiss, which returns nil;
- in LOOKUP mode, both branch to __objc_msgLookup_uncached.

2.4 `__objc_msgSend_uncached` and `__objc_msgLookup_uncached`

Again, look at the source code

// __objc_msgSend_uncached
STATIC_ENTRY __objc_msgSend_uncached
UNWIND __objc_msgSend_uncached, FrameWithNoSaves

// THIS IS NOT A CALLABLE C FUNCTION
// Out-of-band p16 is the class to search
	
MethodTableLookup
TailCallFunctionPointer x17

END_ENTRY __objc_msgSend_uncached

// __objc_msgLookup_uncached

STATIC_ENTRY __objc_msgLookup_uncached
UNWIND __objc_msgLookup_uncached, FrameWithNoSaves

// THIS IS NOT A CALLABLE C FUNCTION
// Out-of-band p16 is the class to search
	
MethodTableLookup
ret

END_ENTRY __objc_msgLookup_uncached

As you can see, the main job of both functions is to call MethodTableLookup

2.5 `MethodTableLookup`

The source code is as follows

.macro MethodTableLookup
	
	// push frame
	SignLR
	stp	fp, lr, [sp, #-16]!
	mov	fp, sp

	// save parameter registers: x0..x8, q0..q7
	sub	sp, sp, #(10*8 + 8*16)
	stp	q0, q1, [sp, #(0*16)]
	stp	q2, q3, [sp, #(2*16)]
	stp	q4, q5, [sp, #(4*16)]
	stp	q6, q7, [sp, #(6*16)]
	stp	x0, x1, [sp, #(8*16+0*8)]
	stp	x2, x3, [sp, #(8*16+2*8)]
	stp	x4, x5, [sp, #(8*16+4*8)]
	stp	x6, x7, [sp, #(8*16+6*8)]
	str	x8,     [sp, #(8*16+8*8)]

	// receiver and selector already in x0 and x1
	mov	x2, x16
	bl	__class_lookupMethodAndLoadCache3

	// IMP in x0
	mov	x17, x0
	
	// restore registers and return
	ldp	q0, q1, [sp, #(0*16)]
	ldp	q2, q3, [sp, #(2*16)]
	ldp	q4, q5, [sp, #(4*16)]
	ldp	q6, q7, [sp, #(6*16)]
	ldp	x0, x1, [sp, #(8*16+0*8)]
	ldp	x2, x3, [sp, #(8*16+2*8)]
	ldp	x4, x5, [sp, #(8*16+4*8)]
	ldp	x6, x7, [sp, #(8*16+6*8)]
	ldr	x8,     [sp, #(8*16+8*8)]

	mov	sp, fp
	ldp	fp, lr, [sp], #16
	AuthenticateLR

.endmacro

The MethodTableLookup macro mainly saves the argument registers and then calls the _class_lookupMethodAndLoadCache3 function. Reaching this point means that the cache lookup for the message is over and it is time to search the method lists. The method-list lookup is implemented in C/C++ and is slower than the cache lookup, so it is also called the slow lookup path.

2.6 `_class_lookupMethodAndLoadCache3`

_class_lookupMethodAndLoadCache3 is a simple C/C++ function with only one line of code: a call to lookUpImpOrForward

IMP _class_lookupMethodAndLoadCache3(id obj, SEL sel, Class cls)
{        
    return lookUpImpOrForward(cls, sel, obj, 
                              YES/*initialize*/, NO/*cache*/, YES/*resolver*/);
}

Next, look at the lookUpImpOrForward function.

2.7 `lookUpImpOrForward`

When called from _class_lookupMethodAndLoadCache3, the initialize and resolver arguments are both YES (the resolver flag determines whether dynamic method resolution is attempted later), and cache is NO. Take a look at the source code for lookUpImpOrForward

IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    IMP imp = nil;
    bool triedResolver = NO;

    runtimeLock.assertUnlocked();

    // Optimistic cache lookup
    // When we arrive here from the assembly cache lookup, cache is NO;
    // callers on the resolution and forwarding paths pass cache = YES.
    // This cache lookup runs without the lock to keep it fast
    if (cache) {
        // cache_getImp is also written in assembly
        imp = cache_getImp(cls, sel);
        if (imp) return imp;
    }

    // runtimeLock is held during isRealized and isInitialized checking
    // to prevent races against concurrent realization.

    // runtimeLock is held during method search to make
    // method-lookup + cache-fill atomic with respect to method addition.
    // Otherwise, a category could be added but ignored indefinitely because
    // the cache was re-filled with the old value after the cache flush on
    // behalf of the category.
    // Take the lock so that method lookup + cache fill are atomic across threads,
    // and so that a method added meanwhile is not lost to a stale cache refill.
    runtimeLock.lock();
    checkIsKnownClass(cls);
    // Realize the class if needed: attach its information (properties, methods, protocols, etc.)
    // Lazily loaded classes are realized here
    if (!cls->isRealized()) {
        cls = realizeClassMaybeSwiftAndLeaveLocked(cls, runtimeLock);
        // runtimeLock may have been dropped but is now locked again
    }

    // If the class is not initialized, it needs to be initialized
    if (initialize && !cls->isInitialized()) {
        cls = initializeAndLeaveLocked(cls, inst, runtimeLock);
        // runtimeLock may have been dropped but is now locked again

        // If sel == initialize, class_initialize will send +initialize and 
        // then the messenger will send +initialize again after this 
        // procedure finishes. Of course, if this is not being called 
        // from the messenger then it won't happen. 2778172
    }


 retry:    
    runtimeLock.assertLocked();

    // Try this class's cache.
    // Look it up in the current class cache
    imp = cache_getImp(cls, sel);
    if (imp) goto done;

    // Try this class's method lists.
    // Find the method list of the current class
    {
        Method meth = getMethodNoSuper_nolock(cls, sel);
        if (meth) {
            // If found, cache first
            log_and_fill_cache(cls, meth->imp, sel, inst, cls);
            imp = meth->imp;
            goto done;
        }
    }

    // Try superclass caches and method lists.
    // Search the caches and method lists of the superclasses
    {
        unsigned attempts = unreasonableClassCount();
        for (Class curClass = cls->superclass;
             curClass != nil;
             curClass = curClass->superclass)
        {
            // Halt if there is a cycle in the superclass chain.
            if (--attempts == 0) {
                _objc_fatal("Memory corruption in class list.");
            }
            
            // Superclass cache.
            imp = cache_getImp(curClass, sel);
            if (imp) {
                if (imp != (IMP)_objc_msgForward_impcache) {
                    // Found the method in a superclass. Cache it in this class.
                    log_and_fill_cache(cls, imp, sel, inst, curClass);
                    goto done;
                }
                else {
                    // Found a forward:: entry in a superclass.
                    // Stop searching, but don't cache yet; call method 
                    // resolver for this class first.
                    break;
                }
            }

            // Superclass method list.
            Method meth = getMethodNoSuper_nolock(curClass, sel);
            if (meth) {
                log_and_fill_cache(cls, meth->imp, sel, inst, curClass);
                imp = meth->imp;
                goto done;
            }
        }
    }

    // No implementation found. Try method resolver once.
    // No IMP was found in the caches or method lists up to the root class
    if (resolver && !triedResolver) {
        runtimeLock.unlock();
        resolveMethod(cls, sel, inst);
        runtimeLock.lock();
        // Don't cache the result; we don't hold the lock so it may have 
        // changed already. Re-do the search from scratch instead.
        triedResolver = YES;
        goto retry;
    }

    // No implementation found, and method resolver didn't help. 
    // Use forwarding.
    // Message forwarding
    imp = (IMP)_objc_msgForward_impcache;
    cache_fill(cls, sel, imp, inst);

 done:
    runtimeLock.unlock();

    return imp;
}

Here is a quick overview. For a particular cls, the IMP search always starts in the cache of cls (by calling cache_getImp). If nothing is found there, it searches the method list of cls by calling getMethodNoSuper_nolock. If it is still not found, the same search is repeated in the superclass of cls (then the superclass's superclass, all the way up to the root class) until the IMP is found. If nothing is found even in the root class, the process moves on to method resolution and then message forwarding.

Note that when the IMP is not in the current class's cache or method list but is found in a superclass's cache, it is first checked against the forwarding entry (_objc_msgForward_impcache): if it is not the forwarding entry, it is filled into the current class's cache to speed up the next call; if it is, the loop exits without caching. An IMP found in a superclass's method list is cached in the current class directly.
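
The overall search order can be sketched in C with a toy class hierarchy. This is a simplified model, not the runtime's code: the toy_ names are hypothetical, the cache is a tiny linear array, selectors are compared as strings, and locking, class realization, +initialize and the method resolver step are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_imp)(void);
typedef struct { const char *name; toy_imp imp; } toy_method;

#define TOY_CACHE_SLOTS 8
typedef struct toy_class {
    struct toy_class *superclass;
    const toy_method *methods;
    size_t method_count;
    toy_method cache[TOY_CACHE_SLOTS];
    size_t cached;
} toy_class;

static int toy_forward(void) { return -1; }   /* stand-in for _objc_msgForward_impcache */
static int toy_hello(void) { return 7; }
static int toy_describe(void) { return 3; }

static toy_imp toy_cache_get(const toy_class *cls, const char *sel) {
    for (size_t i = 0; i < cls->cached; i++)
        if (strcmp(cls->cache[i].name, sel) == 0) return cls->cache[i].imp;
    return NULL;
}

static void toy_cache_fill(toy_class *cls, const char *sel, toy_imp imp) {
    assert(imp != NULL);
    if (cls->cached < TOY_CACHE_SLOTS) {
        cls->cache[cls->cached].name = sel;
        cls->cache[cls->cached].imp = imp;
        cls->cached++;
    }
}

static toy_imp toy_method_list_get(const toy_class *cls, const char *sel) {
    for (size_t i = 0; i < cls->method_count; i++)
        if (strcmp(cls->methods[i].name, sel) == 0) return cls->methods[i].imp;
    return NULL;
}

static toy_imp toy_lookup_imp_or_forward(toy_class *cls, const char *sel) {
    toy_imp imp = toy_cache_get(cls, sel);            /* 1. this class's cache */
    if (imp) return imp;
    imp = toy_method_list_get(cls, sel);              /* 2. this class's method list */
    if (imp) { toy_cache_fill(cls, sel, imp); return imp; }
    for (toy_class *cur = cls->superclass; cur; cur = cur->superclass) {
        imp = toy_cache_get(cur, sel);                /* 3. superclass cache */
        if (imp == toy_forward) break;                /* forwarding entry: stop searching */
        if (!imp) imp = toy_method_list_get(cur, sel);/* 4. superclass method list */
        if (imp) { toy_cache_fill(cls, sel, imp); return imp; }  /* cache in cls itself */
    }
    toy_cache_fill(cls, sel, toy_forward);            /* 5. fall back to forwarding */
    return toy_forward;
}
```

Note that a method found in a superclass is cached in the class the search started from, not in the superclass, which mirrors the log_and_fill_cache(cls, ...) calls in the source above.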

lookUpImpOrForward is the message dispatch center: it covers not only message lookup but also method resolution and message forwarding. For reasons of space, this article only covers message lookup; the rest will be covered in another article.

Next, look at the cache_getImp and getMethodNoSuper_nolock functions.

2.8 `cache_getImp`

cache_getImp is also written in assembly; its source code is as follows:

STATIC_ENTRY _cache_getImp

	GetClassFromIsa_p16 p0
	CacheLookup GETIMP

LGetImpMiss:
	mov	p0, #0
	ret

	END_ENTRY _cache_getImp

Here CacheLookup runs in GETIMP mode (in GETIMP mode, if the cache lookup misses, LGetImpMiss is executed and nil is returned).

GetClassFromIsa_p16 and CacheLookup have already been analyzed above.

2.9 `getMethodNoSuper_nolock`

getMethodNoSuper_nolock walks the method lists of cls, looks for a method whose name is sel, and returns it (as a method_t structure, which contains the IMP). The source code is:

static method_t *
getMethodNoSuper_nolock(Class cls, SEL sel)
{
    runtimeLock.assertLocked();

    assert(cls->isRealized());
    // fixme nil cls? 
    // fixme nil sel?

    for (auto mlists = cls->data()->methods.beginLists(), 
              end = cls->data()->methods.endLists(); 
         mlists != end;
         ++mlists)
    {
        method_t *m = search_method_list(*mlists, sel);
        if (m) return m;
    }

    return nil;
}

cls->data()->methods is a method_array_t structure, which may hold either a single list (one-dimensional) or an array of lists (two-dimensional). Iterating from beginLists() to endLists() guarantees that each iteration yields a one-dimensional list of method_t. Each list is then searched by calling the search_method_list function.
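
The "one list or an array of lists" idea can be sketched in C as below. This is an illustrative model only: the toy_ types are hypothetical, method names are compared as strings, and the begin/end helpers imitate the trick of treating a single list as a one-element range.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *name; int imp_id; } toy_method;
typedef struct { const toy_method *items; size_t count; } toy_method_list;

typedef struct {
    const toy_method_list *single;         /* one list (one-dimensional case), or NULL */
    const toy_method_list *const *lists;   /* array of lists (two-dimensional case), or NULL */
    size_t list_count;                     /* number of entries in lists */
} toy_method_array;

/* Both helpers return pointers into a range of "pointer to list", so the
   caller can use the same loop for either representation. */
static const toy_method_list *const *toy_begin_lists(const toy_method_array *a) {
    return a->lists ? a->lists : &a->single;
}

static const toy_method_list *const *toy_end_lists(const toy_method_array *a) {
    if (a->lists) return a->lists + a->list_count;
    return a->single ? &a->single + 1 : &a->single;   /* empty range when there is no list */
}

/* Search every list in order and return the first method named name. */
static const toy_method *toy_get_method(const toy_method_array *a, const char *name) {
    assert(name != NULL);
    const toy_method_list *const *end = toy_end_lists(a);
    for (const toy_method_list *const *it = toy_begin_lists(a); it != end; ++it) {
        const toy_method_list *list = *it;
        for (size_t i = 0; i < list->count; i++) {
            if (strcmp(list->items[i].name, name) == 0) return &list->items[i];
        }
    }
    return NULL;
}
```

Because the lists are searched in order, a method in an earlier list hides a method with the same name in a later list.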

2.10 `search_method_list`

static method_t *search_method_list(const method_list_t *mlist, SEL sel)
{
    int methodListIsFixedUp = mlist->isFixedUp();
    int methodListHasExpectedSize = mlist->entsize() == sizeof(method_t);
    
    if (__builtin_expect(methodListIsFixedUp && methodListHasExpectedSize, 1)) {
        // In general, mList is ordered, so binary lookup is performed on it
        return findMethodInSortedMethodList(sel, mlist);
    } else {
        // Linear search of unsorted method list
        // The mlist is unordered, so we have to traverse the match
        for (auto& meth : *mlist) {
            if (meth.name == sel) return &meth;
        }
    }

#if DEBUG
    // sanity-check negative results
    if (mlist->isFixedUp()) {
        for (auto& meth : *mlist) {
            if (meth.name == sel) {
                _objc_fatal("linear search worked when binary search did not");
            }
        }
    }
#endif

    return nil;
}

__builtin_expect() states the expected result of the condition: it is true most of the time, so the function usually calls findMethodInSortedMethodList to run a binary search on mlist; otherwise it falls back to a linear traversal.
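
The two search strategies can be sketched in C as follows. This is a sketch in the spirit of the runtime's approach rather than a copy of it: the toy_ names are hypothetical, method names are plain integers standing in for selector addresses, and the TOY_LIKELY macro wraps __builtin_expect only where the compiler is assumed to provide it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define TOY_LIKELY(x) __builtin_expect(!!(x), 1)   /* hint: x is usually true */
#else
#define TOY_LIKELY(x) (x)
#endif

typedef struct { uintptr_t name; int imp_id; } toy_method;

/* Binary search over a list sorted by name. If several entries share the
   same name, step back to the first of them. */
static const toy_method *toy_find_sorted(const toy_method *list, size_t count,
                                         uintptr_t key) {
    const toy_method *first = list;
    const toy_method *base = list;
    for (; count != 0; count >>= 1) {
        const toy_method *probe = base + (count >> 1);
        if (key == probe->name) {
            while (probe > first && probe[-1].name == key) probe--;
            return probe;
        }
        if (key > probe->name) {       /* continue in the upper half */
            base = probe + 1;
            count--;
        }
    }
    return NULL;
}

/* Use binary search when the list is known to be sorted, otherwise scan. */
static const toy_method *toy_search(const toy_method *list, size_t count,
                                    int sorted, uintptr_t key) {
    assert(list != NULL || count == 0);
    if (TOY_LIKELY(sorted)) return toy_find_sorted(list, count, key);
    for (size_t i = 0; i < count; i++) {
        if (list[i].name == key) return &list[i];
    }
    return NULL;
}
```

The hint does not change the result of the condition; it only tells the compiler which branch to lay out as the common path.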

Note that sorting is triggered when a method list changes (only the affected method list is sorted, not every method list in the rw; that is, in the two-dimensional case only the current list is re-sorted). A sort of mlist is triggered in the following calls:

methodizeClass
attachCategories
addMethod
addMethods

2.11 `log_and_fill_cache`

The log_and_fill_cache function fills the IMP that was found into the cache to speed up the next call. This function is called in three cases:

The IMP is not found in the cache of the class, but is found in its method list;
The IMP is found in the cache of one of the class's superclasses ("parent class ... root class") and is not the forwarding entry;
The IMP is found in the method list of one of the class's superclasses ("parent class ... root class")

static void
log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
{
#if SUPPORT_MESSAGE_LOGGING
    if (objcMsgLogEnabled) {
        bool cacheIt = logMessageSend(implementer->isMetaClass(), 
                                      cls->nameForLogging(),
                                      implementer->nameForLogging(), 
                                      sel);
        if (!cacheIt) return;
    }
#endif
    cache_fill (cls, sel, imp, receiver);
}

void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}

The log_and_fill_cache function is relatively simple. It calls cache_fill, which in turn calls cache_fill_nolock. In other words, the key function for filling the cache is cache_fill_nolock.

For a detailed analysis of cache_fill_nolock, see the method caching article in this series.
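
As a rough picture of what filling the cache means, here is a C sketch that stores a selector and IMP into the same kind of bucket array that CacheLookup reads, probing backwards from sel & mask. It is a simplification under stated assumptions: the toy_ names are hypothetical, there is no locking, and the sketch simply reports failure when the table is full, whereas the real function also takes care of growing the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t toy_sel;                 /* 0 marks an empty bucket */
typedef int (*toy_imp)(void);

typedef struct { toy_imp imp; toy_sel sel; } toy_bucket;
typedef struct { toy_bucket *buckets; uint32_t mask; uint32_t occupied; } toy_cache;

static int toy_some_imp(void) { return 1; }

/* Store (sel, imp) in the first free bucket found by probing backwards
   from sel & mask. Returns 1 if the entry is in the cache afterwards,
   0 if the table was full. */
static int toy_cache_fill(toy_cache *cache, toy_sel sel, toy_imp imp) {
    assert(sel != 0 && imp != NULL);
    size_t capacity = (size_t)cache->mask + 1;
    size_t i = (size_t)(sel & cache->mask);
    for (size_t visited = 0; visited < capacity; visited++) {
        toy_bucket *b = &cache->buckets[i];
        if (b->sel == sel) return 1;               /* already cached, nothing to do */
        if (b->sel == 0) {                         /* free slot: claim it */
            b->sel = sel;
            b->imp = imp;
            cache->occupied++;
            return 1;
        }
        i = (i == 0) ? cache->mask : i - 1;        /* same probe order as the lookup */
    }
    return 0;
}
```

Using the same probe order for filling and for lookup is what lets the lookup stop at the first empty bucket.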

3 Summary

Now that message lookup has been covered, it's time to wrap up.

3.1 The nature of method calls

Method calls are translated by the compiler into one of four functions: objc_msgSend, objc_msgSendSuper, objc_msgSend_stret, and objc_msgSendSuper_stret, all of which are implemented in assembly.
- If the method returns a data structure, the call is converted to the corresponding objc_msgSend_stret or objc_msgSendSuper_stret function
- A method call made with the super keyword is sent using objc_msgSendSuper
- All other method calls are sent using the objc_msgSend function.
A method call sends a SEL to the receiver through objc_msgSend (or objc_msgSendSuper, objc_msgSend_stret, or objc_msgSendSuper_stret), finds the corresponding function address IMP, and then executes that function.

3.2 Method search

The method lookup flow is as follows:

Starting from objc_msgSend, the lookup first searches the cache of the class (for instance methods) or metaclass (for class methods). If the IMP is found, it is returned and called; otherwise the method list of the class or metaclass is searched.
The "cache + method list" search from step 1 is repeated along the class hierarchy (the class, its parent class, ..., the root class) until the IMP is found.
- If the IMP is not in the current class's cache but is found in its method list (or in the cache or method list of a "parent class ... root class"), it is checked against the forwarding entry: if it is not the forwarding entry, it is filled into the current class's cache to speed up the next lookup; if it is, it is not cached in the current class
If no IMP has been found by the end of the walk, method resolution or message forwarding begins.

4 Discussion

4.1 Why is `objc_msgSend` written in assembly?

A: The reasons are as follows

C is a static language and cannot express "jump to an arbitrary function whose number and types of arguments are unknown"; assembly can do this directly with registers
Assembly executes more efficiently than C
Using assembly makes it harder to hook system functions, so it is more secure.

4.2 Why is the index shifted left by `1 + PTRSHIFT` bits?

A: I didn't find the answer in objc4-756.2, but there is an explanation in the cache_t structure of objc4-779.1. Part of the source code is as follows:

// How much the mask is shifted by.
static constexpr uintptr_t maskShift = 48;
    
// Additional bits after the mask which must be zero. msgSend
// takes advantage of these additional bits to construct the value
// `mask << 4` from `_maskAndBuckets` in a single instruction.
static constexpr uintptr_t maskZeroBits = 4;
    
// The largest mask value we can store.
static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
    
// The mask applied to `_maskAndBuckets` to retrieve the buckets pointer.
static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1;

In short, on 64-bit the mask is shifted by 48, so the highest 16 bits of _maskAndBuckets store the mask, the low 44 bits store the buckets pointer (see bucketsMask), and the 4 bits in between are the additional bits that must be zero. In the CacheLookup source code, the _cmd & mask hash gives the index (index <= maxMask, i.e. (1 << 16) - 1). Each bucket holds two pointer-sized fields (sel and imp), which is 16 bytes on 64-bit, so to reach the bucket at that position the index must be shifted left by 4 bits (1 + PTRSHIFT with PTRSHIFT = 3) to turn it into a byte offset. Adding that offset to the buckets pointer gives the address of the target bucket.
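
The packing and the shift can be checked with a few lines of C. The constants below are copied from the excerpt above; the function names and the sample addresses are made up for this sketch, and pointers are modeled as plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MASK_SHIFT     48
#define TOY_MASK_ZERO_BITS 4
#define TOY_MAX_MASK     ((((uint64_t)1) << (64 - TOY_MASK_SHIFT)) - 1)
#define TOY_BUCKETS_MASK ((((uint64_t)1) << (TOY_MASK_SHIFT - TOY_MASK_ZERO_BITS)) - 1)
#define TOY_PTRSHIFT       3     /* 64-bit: 1 << 3 == sizeof(pointer) */

/* Pack mask (high 16 bits) and buckets pointer (low 44 bits) into one word. */
static uint64_t pack_mask_and_buckets(uint64_t buckets, uint64_t mask) {
    assert((buckets & ~TOY_BUCKETS_MASK) == 0);
    assert(mask <= TOY_MAX_MASK);
    return (mask << TOY_MASK_SHIFT) | buckets;
}

static uint64_t unpack_buckets(uint64_t v) { return v & TOY_BUCKETS_MASK; }
static uint64_t unpack_mask(uint64_t v)    { return v >> TOY_MASK_SHIFT; }

/* Address of the bucket for sel: a bucket is two pointers wide, so the
   index is scaled by 1 << (1 + PTRSHIFT) == 16 bytes. */
static uint64_t bucket_address(uint64_t buckets, uint64_t sel, uint64_t mask) {
    return buckets + ((sel & mask) << (1 + TOY_PTRSHIFT));
}
```

Because of the four zero bits, shifting the packed word right by 44 yields mask << 4 directly, which is the single-instruction trick that the source comment mentions.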

5 Reference Materials

Messages and message forwarding in Objective-C
Objective-C low-level summary

6 PS

The source code project has been put on GitHub; please see objc4-756.2 source code
You can also download Apple's official objc4 source code to study.
Please credit the source when reprinting. Thank you very much!
