Preface

The basic structure of cache_t was examined in the earlier article on OC underlying principles, which raises the question: when is the cache read, and when are entries inserted? This article explores reading from the cache.

From the previous analysis we already know that cache_t caches methods, so cache insertion and reading must happen while sending messages. Let's examine ordinary method calls and performSelector separately:

Method call analysis

1. Method call: create an XQPerson object as shown below and call its eatFood method:

We compile it to C++ with clang -rewrite-objc main.m -o main.cpp:

As you can see, the method call is compiled into a call to the objc_msgSend function, which sends the message.

2. performSelector: the source code of the performSelector method is as follows:

As you can see, performSelector actually calls objc_msgSend to send a message.

Summary: Method calls essentially call objc_msgSend to send messages.

Next, we can focus our exploration on objc_msgSend.

objc_msgSend assembly analysis (this article covers the ARM64 device environment only)

Preparations:

Download the objc4-818.2 source.

Search the project globally for objc_msgSend (hold the command key and click the small arrow to the left of a search result to collapse it), open the objc-msg-arm64.s file, and find ENTRY _objc_msgSend:

	ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	// compare p0 (the message receiver) with 0
	cmp	p0, #0			// nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
	// tagged pointer check (MSB tagged pointer looks negative)
	b.le	LNilOrTagged
#else
	// if the cmp result is 0 (receiver is nil) go to LReturnZero, otherwise fall through
	b.eq	LReturnZero
#endif
	// load the data at [x0]: p13 now holds the object's isa
	ldr	p13, [x0]		// p13 = isa
	// get the class
	GetClassFromIsa_p16 p13, 1, x0	// p16 = class
LGetIsaDone:
	// calls imp or objc_msgSend_uncached
	CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
	b.eq	LReturnZero		// nil check
	GetTaggedClass
	b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
	// x0 is already zero
	mov	x1, #0
	movi	d0, #0
	movi	d1, #0
	movi	d2, #0
	movi	d3, #0
	ret

	END_ENTRY _objc_msgSend
  • cmp p0, #0 compares the message receiver with 0 (nil check)

  • b.le LNilOrTagged: if the result is less than or equal to 0, jump to LNilOrTagged to check for a tagged pointer (the MSB of a tagged pointer makes the value look negative)

  • Without tagged-pointer support, b.eq LReturnZero: if the cmp result is 0 (receiver is nil), jump to LReturnZero; otherwise fall through

  • ldr p13, [x0] loads the data at [x0] into a register, so p13 now holds the object's isa

  • GetClassFromIsa_p16 p13, 1, x0: isa & ISA_MASK yields the class in p16

.macro GetClassFromIsa_p16 src, needs_auth, auth_address /* note: auth_address is not required if !needs_auth */

#if SUPPORT_INDEXED_ISA
	// Indexed isa
	mov	p16, \src			// optimistically set dst = src
	tbz	p16, #ISA_INDEX_IS_NPI_BIT, 1f	// done if not non-pointer isa
	// isa in p16 is indexed
	adrp	x10, _objc_indexed_classes@PAGE
	add	x10, x10, _objc_indexed_classes@PAGEOFF
	ubfx	p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index
	ldr	p16, [x10, p16, UXTP #PTRSHIFT]	// load class from array
1:

#elif __LP64__
.if \needs_auth == 0 // _cache_getImp takes an authed class already
	mov	p16, \src
.else
	// 64-bit packed isa
	ExtractISA p16, \src, \auth_address
.endif
#else
	// 32-bit raw isa
	mov	p16, \src

#endif

.endmacro

.macro ExtractISA
	and	$0, $1, #ISA_MASK	// $1 (isa) & ISA_MASK = $0 (p16)
.endmacro

CacheLookup function parsing

Arguments are passed in: NORMAL, _objc_msgSend, __objc_msgSend_uncached

// Mode: NORMAL, Function: _objc_msgSend, MissLabelDynamic: __objc_msgSend_uncached, MissLabelConstant: empty
.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant
	//
	// Restart protocol:
	//
	//   As soon as we're past the LLookupStart\Function label we may have
	//   loaded an invalid cache pointer or mask.
	//
	//   When task_restartable_ranges_synchronize() is called,
	//   (or when a signal hits us) before we're past LLookupEnd\Function,
	//   then our PC will be reset to LLookupRecover\Function which forcefully
	//   jumps to the cache-miss codepath which have the following
	//   requirements:
	//
	//   GETIMP:
	//     The cache-miss is just returning NULL (setting x0 to 0)
	//
	//   NORMAL and LOOKUP:
	//   - x0 contains the receiver
	//   - x1 contains the selector
	//   - x16 contains the isa
	//   - other registers are set as per calling conventions
	//

	mov	x15, x16			// stash the original isa
LLookupStart\Function:
	// p1 = SEL, p16 = isa
        // simulator or macOS
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS 
	ldr	p10, [x16, #CACHE]				// p10 = mask|buckets
	lsr	p11, p10, #48			// p11 = mask
	and	p10, p10, #0xffffffffffff	// p10 = buckets
	and	w12, w1, w11			// x12 = _cmd & mask
        // device (real machine)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        // #define CACHE (2 * __SIZEOF_POINTER__)
        // offset x16 (the class) by CACHE (16 bytes) to reach the cache; the first
        // member there is _bucketsAndMaybeMask, so p11 = _bucketsAndMaybeMask
	ldr	p11, [x16, #CACHE]			// p11 = mask|buckets
        // 64-bit device
#if CONFIG_USE_PREOPT_CACHES
        // A12 and later processors (pointer authentication)
#if __has_feature(ptrauth_calls)
	tbnz	p11, #0, LLookupPreopt\Function
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
#else
        // devices before the A12
        // p11 & 0x0000fffffffffffe yields buckets
	and	p10, p11, #0x0000fffffffffffe	// p10 = buckets
        // test bit 0 of p11; if it is 0, continue with the flow below
	tbnz	p11, #0, LLookupPreopt\Function
#endif
        // p12 = p1(_cmd) ^ (p1(_cmd) >> 7)
	eor	p12, p1, p1, LSR #7
        // p12 = the cache index, matching the cache_hash function in cache_t::insert
	and	p12, p12, p11, LSR #48		// x12 = (_cmd ^ (_cmd >> 7)) & mask
#else   
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
	and	p12, p1, p11, LSR #48		// x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	ldr	p11, [x16, #CACHE]				// p11 = mask|buckets
	and	p10, p11, #~0xf			// p10 = buckets
	and	p11, p11, #0xf			// p11 = maskShift
	mov	p12, #0xffff
	lsr	p11, p12, p11			// p11 = mask = 0xffff >> p11
	and	p12, p1, p11			// x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif
        /**
        #define PTRSHIFT 3
        p10 = buckets, p12 = index (the first probed index)
        A bucket_t occupies 16 bytes (two pointers: sel and imp),
        so shifting p12 (index) left by 4 (= 1 + PTRSHIFT) bits is index * 16.
        This is just the buckets base address plus index bucket_t sizes:
        p13 = p10 + (p12 << 4) = buckets + index * 16 (pointer arithmetic),
        so p13 = the bucket at index
        */
	add	p13, p10, p12, LSL #(1+PTRSHIFT)
						// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

						// do {
        /**
        #define BUCKET_SIZE      (2 * __SIZEOF_POINTER__)
        load the bucket at x13 into p17 (imp) and p9 (sel),
        then offset x13 by -BUCKET_SIZE to reach the previous bucket
        */
1:	ldp	p17, p9, [x13], #-BUCKET_SIZE	//     {imp, sel} = *bucket--
        // compare p9 (sel) with p1 (_cmd)
	cmp	p9, p1				//     if (sel != _cmd) {
        // not equal: branch to 3: for the next iteration
	b.ne	3f				//         scan more
						//     } else {
        // equal: cache hit
2:	CacheHit \Mode				// hit:    call or return imp
						//     }
        // cbz: compare with 0 and branch (forward only) if the result is zero
3:	cbz	p9, \MissLabelDynamic		//     if (sel == 0) goto Miss;
        // b.hs: unsigned higher-or-same; while p13 >= p10, loop back to 1:
	cmp	p13, p10			// } while (bucket >= buckets)
	b.hs	1b

	// wrap-around:
	//   p10 = first bucket
	//   p11 = mask (and maybe other bits on LP64)
	//   p12 = _cmd & mask
	//
	// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.
	// So stop when we circle back to the first probed bucket
	// rather than when hitting the first bucket again.
	//
	// Note that we might probe the initial bucket twice
	// when the first probed slot is the last entry.


#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
	add	p13, p10, w11, UXTW #(1+PTRSHIFT)
						// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        // compute the last bucket
	add	p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
						// p13 = buckets + (mask << 1+PTRSHIFT)
						// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	add	p13, p10, p11, LSL #(1+PTRSHIFT)
						// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
        // compute the first probed bucket
	add	p12, p10, p12, LSL #(1+PTRSHIFT)
						// p12 = first probed bucket

						// do {
4:	ldp	p17, p9, [x13], #-BUCKET_SIZE	//     {imp, sel} = *bucket--
	cmp	p9, p1				//     if (sel == _cmd)
	b.eq	2b				//         goto hit
	cmp	p9, #0				// } while (sel != 0 &&
	ccmp	p13, p12, #0, ne		//     bucket > first_probed)
	b.hi	4b

LLookupEnd\Function:
LLookupRecover\Function:
	b	\MissLabelDynamic

#if CONFIG_USE_PREOPT_CACHES
#if CACHE_MASK_STORAGE != CACHE_MASK_STORAGE_HIGH_16
#error config unsupported
#endif
LLookupPreopt\Function:
#if __has_feature(ptrauth_calls)
	and	p10, p11, #0x007ffffffffffffe	// p10 = buckets
	autdb	x10, x16			// auth as early as possible
#endif

	// x12 = (_cmd - first_shared_cache_sel)
	adrp	x9, _MagicSelRef@PAGE
	ldr	p9, [x9, _MagicSelRef@PAGEOFF]
	sub	p12, p1, p9

	// w9  = ((_cmd - first_shared_cache_sel) >> hash_shift & hash_mask)
#if __has_feature(ptrauth_calls)
	// bits 63..60 of x11 are the number of bits in hash_mask
	// bits 59..55 of x11 is hash_shift

	lsr	x17, x11, #55			// w17 = (hash_shift, ...)
	lsr	w9, w12, w17			// >>= shift

	lsr	x17, x11, #60			// w17 = mask_bits
	mov	x11, #0x7fff
	lsr	x11, x11, x17			// p11 = mask (0x7fff >> mask_bits)
	and	x9, x9, x11			// &= mask
#else
	// bits 63..53 of x11 is hash_mask
	// bits 52..48 of x11 is hash_shift
	lsr	x17, x11, #48			// w17 = (hash_shift, hash_mask)
	lsr	w9, w12, w17			// >>= shift
	and	x9, x9, x11, LSR #53		// &=  mask
#endif

	ldr	x17, [x10, x9, LSL #3]		// x17 == sel_offs | (imp_offs << 32)
	cmp	x12, w17, uxtw

.if \Mode == GETIMP
	b.ne	\MissLabelConstant		// cache miss
	sub	x0, x16, x17, LSR #32		// imp = isa - imp_offs
	SignAsImp x0
	ret
.else
	b.ne	5f				// cache miss
	sub	x17, x16, x17, LSR #32		// imp = isa - imp_offs
.if \Mode == NORMAL
	br	x17
.elseif \Mode == LOOKUP
	orr x16, x16, #3 // for instrumentation, note that we hit a constant cache
	SignAsImp x17
	ret
.else
.abort  unhandled mode \Mode
.endif

5:	ldursw	x9, [x10, #-8]			// offset -8 is the fallback offset
	add	x16, x16, x9			// compute the fallback isa
	b	LLookupStart\Function		// lookup again with a new isa
.endif
#endif // CONFIG_USE_PREOPT_CACHES

.endmacro


Resolution:

  • mov x15, x16 — stash the original isa in x15
  • CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 — the device (real machine) branch
  • ldr p11, [x16, #CACHE] — offset x16 (the class) by #CACHE (16 bytes) to reach the cache; the first member there is _bucketsAndMaybeMask, so p11 = _bucketsAndMaybeMask
  • CONFIG_USE_PREOPT_CACHES — 64-bit devices; the __has_feature(ptrauth_calls) optimization for A12 and later is not analyzed in this article
  • and p10, p11, #0x0000fffffffffffe — p11 & mask yields buckets in p10
  • tbnz p11, #0, LLookupPreopt\Function — test whether bit 0 of p11 is nonzero; if it is 0, continue with the flow below
  • eor p12, p1, p1, LSR #7 — p12 = _cmd ^ (_cmd >> 7)
  • and p12, p12, p11, LSR #48 — p12 becomes the index into the method cache, matching the cache_hash function used by cache_t::insert
  • add p13, p10, p12, LSL #(1+PTRSHIFT) — on 64-bit, PTRSHIFT is 3; p10 = buckets, p12 = index (the first probed index). A bucket_t occupies 16 bytes (two pointers: sel and imp), so shifting the index left by 4 (= 1 + PTRSHIFT) bits multiplies it by 16. This is plain pointer arithmetic: p13 = p10 + (p12 << 4) = buckets + index * 16, so p13 points to the bucket at index
  • 1: ldp p17, p9, [x13], #-BUCKET_SIZE — BUCKET_SIZE is (2 * __SIZEOF_POINTER__); load the current bucket into p17 (imp) and p9 (sel), then offset x13 by -BUCKET_SIZE to reach the previous bucket
  • cmp p9, p1 — compare p9 (sel) with p1 (_cmd); if they are equal, fall through to 2: CacheHit \Mode; otherwise branch to 3: cbz p9, \MissLabelDynamic
  • 2: CacheHit \Mode — the cache was found; execute CacheHit with Mode = NORMAL
  • 3: cbz p9, \MissLabelDynamic — if p9 (sel) is zero the slot is empty, so jump to MissLabelDynamic (the __objc_msgSend_uncached parameter). Otherwise cmp p13, p10 followed by b.hs 1b: while p13 >= p10 (unsigned higher-or-same), loop back to 1: ldp p17, p9, [x13], #-BUCKET_SIZE
  • If the loop ends without finding the cache, continue with the wrap-around path, again selected by architecture (CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16)
  • add p13, p10, p11, LSR #(48 - (1+PTRSHIFT)) — compute the address of the last bucket
  • add p12, p10, p12, LSL #(1+PTRSHIFT) — compute the first probed bucket
  • 4: ldp p17, p9, [x13], #-BUCKET_SIZE — walk backwards again, loading imp into p17 and sel into p9; cmp p9, p1 compares p9 (sel) with p1 (_cmd), and b.eq 2b branches to the hit path on a match; cmp p9, #0 checks whether p9 (sel) is empty; ccmp p13, p12, #0, ne compares p13 (bucket) with p12 (first_probed), and b.hi 4b keeps looping while p13 > p12
  • If this scan also fails, fall through to MissLabelDynamic (__objc_msgSend_uncached)

Diagram of the conditional-compilation branches:

CacheHit resolution:

Passing in the parameter NORMAL

// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa
.macro CacheHit
.if $0 == NORMAL
	// x17 = imp, x10 = bucket, x1 = _cmd, x16 = class
	TailCallCachedImp x17, x10, x1, x16	// authenticate and call imp
.elseif $0 == GETIMP
	mov	p0, p17
	cbz	p0, 9f			// don't ptrauth a nil imp
	AuthAndResignAsIMP x0, x10, x1, x16	// authenticate imp and re-sign as IMP
9:	ret				// return IMP
.elseif $0 == LOOKUP
	// No nil check for ptrauth: the caller would crash anyway when they
	// jump to a nil IMP. We don't care if that jump also fails ptrauth.
	AuthAndResignAsIMP x17, x10, x1, x16	// authenticate imp and re-sign as IMP
	cmp	x16, x15
	cinc	x16, x16, ne		// x16 += 1 when x15 != x16 (for instrumentation; fallback to the parent class)
	ret				// return imp via x17
.else
.abort oops
.endif
.endmacro

When $0 == NORMAL, TailCallCachedImp x17, x10, x1, x16 is executed, with x17 = imp, x10 = bucket, x1 = _cmd, x16 = class.

TailCallCachedImp resolution:

.macro TailCallCachedImp
	// $0 = cached imp, $1 = address of cached imp, $2 = SEL, $3 = isa
	// imp = cached imp ^ cls, undoing the encodeImp function
	eor	$0, $0, $3
	// jump to the imp and execute it
	br	$0
.endmacro

__objc_msgSend_uncached resolution:

	STATIC_ENTRY __objc_msgSend_uncached
	UNWIND __objc_msgSend_uncached, FrameWithNoSaves

	// THIS IS NOT A CALLABLE C FUNCTION
	// Out-of-band p15 is the class to search

	MethodTableLookup
	TailCallFunctionPointer x17

	END_ENTRY __objc_msgSend_uncached

MethodTableLookup resolution:

.macro MethodTableLookup
	SAVE_REGS MSGSEND

	// lookUpImpOrForward(obj, sel, cls, LOOKUP_INITIALIZE | LOOKUP_RESOLVER)
	// receiver and selector already in x0 and x1
	// x0 = receiver (self), x1 = _cmd, x2 = class, x3 = 3
	mov	x2, x16
	mov	x3, #3
	bl	_lookUpImpOrForward

	// IMP in x0
	mov	x17, x0

	RESTORE_REGS MSGSEND
.endmacro

lookUpImpOrForward receives the receiver (self) in x0, _cmd in x1, the class in x2, and 3 in x3.

3. Cache lookup flow chart:

Conclusion: the fast path of method lookup calls objc_msgSend, obtains the class from the receiver's isa, iterates over the cache buckets, and compares each bucket's sel with _cmd to decide whether to jump to the cached imp or fall through to the uncached lookup.