iOS Underlying Principles + Reverse Engineering: Article Summary

The main purpose of this article is to understand the method lookup process of objc_msgSend

In the previous article, iOS Underlying Principles 11: Analysis of Cache Principles in objc_class, we analyzed the cache write process. Before a write happens there is a cache read process, namely objc_msgSend and cache_getImp

Before the analysis, let's first look at what Runtime is

Introduction to Runtime

Runtime refers to run time, as distinct from compile time

  • Run time is the phase in which the code is running and has been loaded into memory. If an error occurs at this point, the program crashes. It is a dynamic phase

  • Compile time is the phase in which source code is translated into code the machine can recognize. It mainly performs the most basic language checks and error reporting, such as lexical analysis and syntax analysis. It is a static phase

Runtime can be used in the following three ways, and the relationship between the three implementations and the compiler layer and the underlying layer is shown in the figure

  • Through OC code, such as [person sayNB]

  • Through NSObject methods, such as isKindOfClass

  • Through the Runtime API, such as class_getInstanceSize

Here, the compiler is the compiler we already know, i.e., LLVM. For example, OC's alloc corresponds to objc_alloc at the lower level. The runtime system library is the underlying library

Explore the nature of the method

Nature of method

In the article iOS Underlying Principles 07: isa and Class Association, we used clang to compile the source and understand the nature of an OC object. Similarly, we use clang to generate the main.cpp file and look at how the method calls in the main function are implemented, as shown below

// Method calls in the main function
LGPerson *person = [LGPerson alloc];
[person sayNB];
[person sayHello];

// The underlying implementation after clang compilation
LGPerson *person = ((LGPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("LGPerson"), sel_registerName("alloc"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("sayNB"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("sayHello"));

As you can see from the above code, the essence of the method is objc_msgSend message sending

To verify this, make the [person sayNB] call through objc_msgSend as well and check whether both print the same output

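Note:

1. To call objc_msgSend directly, import the header file that declares it: #import <objc/message.h>

2. The call below passes arguments to objc_msgSend, so in the target's Build Settings set Enable Strict Checking of objc_msgSend Calls to NO; otherwise the compiler rejects the arguments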

LGPerson *person = [LGPerson alloc];   
objc_msgSend(person,sel_registerName("sayNB"));
[person sayNB];

The printed results are as follows and are consistent, so [person sayNB] is equivalent to objc_msgSend(person, sel_registerName("sayNB"))

Object method calls: what actually executes is the superclass implementation

Beyond this verification, we can also have person's call execute the superclass implementation by using objc_msgSendSuper

  • Define two classes, LGPerson and LGTeacher, where LGTeacher is the parent class of LGPerson and only the parent class implements the sayHello method

  • Calls in main
LGPerson *person = [LGPerson alloc];
LGTeacher *teacher = [LGTeacher alloc];
[person sayHello];

struct objc_super lgsuper;
lgsuper.receiver = person; // The receiver of the message is still person
lgsuper.super_class = [LGTeacher class];
objc_msgSendSuper(&lgsuper, sel_registerName("sayHello"));

The objc_msgSendSuper method takes two parameters (a struct and a sel). The struct is of type objc_super, and its two fields, receiver and super_class, must both be set

  • objc_msgSendSuper method parameters

  • objc_super source definition

The print result is as follows

It turns out that both [person sayHello] and objc_msgSendSuper execute the implementation of sayHello in the parent class. So we can make a guess: a method call first looks in the class itself and, if the method is not found there, looks in the parent class.

With that in mind, let’s explore the source implementation of objc_msgSend

objc_msgSend quick lookup process analysis

In the objc4-781 source code, search for objc_msgSend. Because everyday development targets the arm64 architecture, look for the objc_msgSend implementation in the file with the arm64.s suffix. It turns out to be implemented in assembly; the flowchart of the overall assembly implementation is as follows

objc_msgSend assembly source code

objc_msgSend is the entry point of message sending and is implemented in assembly. The _objc_msgSend source code is as follows

//---- Message sending, assembly entry point: objc_msgSend starts by obtaining the receiver's isa
ENTRY _objc_msgSend 
//---- no stack frame
	UNWIND _objc_msgSend, NoFrame 
	
//---- p0 is the first parameter of objc_msgSend, i.e. the message receiver
	cmp	p0, #0			// nil check and tagged pointer check 
//---- le: less than or equal to zero. With tagged pointer support (small object types), take the LNilOrTagged path
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		// (MSB tagged pointer looks negative)
#else
//---- p0 == 0: return nil
	b.eq	LReturnZero 
#endif 
//---- from here on, p0 (the receiver) is known to be non-nil
//---- load isa from the object, that is, read isa from the address held in register x0 and store it in register p13
	ldr	p13, [x0]    	// p13 = isa 
//---- in 64-bit architecture p16 = isa (p13) & ISA_MASK, take out shiftcls information, get the class information
	GetClassFromIsa_p16 p13		// p16 = class 
LGetIsaDone:
	// calls imp or objc_msgSend_uncached 
//---- if isa exists, go to the CacheLookup process, which is called sel-imp quick lookup process
	CacheLookup NORMAL, _objc_msgSend

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
//---- equal to zero (nil): return nil
	b.eq	LReturnZero		// nil check 

	// tagged
	adrp	x10, _objc_debug_taggedpointer_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
	ubfx	x11, x0, #60, #4
	ldr	x16, [x10, x11, LSL #3]
	adrp	x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGE
	add	x10, x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGEOFF
	cmp	x10, x16
	b.ne	LGetIsaDone

	// ext tagged
	adrp	x10, _objc_debug_taggedpointer_ext_classes@PAGE
	add	x10, x10, _objc_debug_taggedpointer_ext_classes@PAGEOFF
	ubfx	x11, x0, #52, #8
	ldr	x16, [x10, x11, LSL #3]
	b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
	// x0 is already zero
	mov	x1, #0
	movi	d0, #0
	movi	d1, #0
	movi	d2, #0
	movi	d3, #0
	ret

	END_ENTRY _objc_msgSend

The main steps are as follows

  • [Step 1] Check whether receiver, the first argument of objc_msgSend, is nil
    • If tagged pointers are supported, jump to LNilOrTagged
      • If the small object is nil, return nil directly, i.e. LReturnZero
      • If the small object is not nil, obtain the small object's isa (class) and go to [Step 2]
    • If it is not a small object and receiver is not nil, there are two steps
      • Load isa from receiver and store it in the p13 register
      • Through GetClassFromIsa_p16, which on the arm64 architecture computes isa & ISA_MASK to obtain the shiftcls bitfield, i.e. the class. The GetClassFromIsa_p16 assembly is implemented as follows; then go to [Step 2]
.macro GetClassFromIsa_p16 /* src */
//---- Used here for watchOS
#if SUPPORT_INDEXED_ISA 
	// Indexed isa
//---- Store the value of isa in register p16
	mov	p16, $0			// optimistically set dst = src 
	tbz	p16, #ISA_INDEX_IS_NPI_BIT, 1f	// done if not non-pointer isa
	// isa in p16 is indexed
//---- read the base address of the page containing _objc_indexed_classes into register x10
	adrp	x10, _objc_indexed_classes@PAGE 
//---- x10 = x10 + page offset of _objc_indexed_classes, i.e. move x10 from the page base to the actual address of the array
	add	x10, x10, _objc_indexed_classes@PAGEOFF
//---- Start from the ISA_INDEX_SHIFT bit of p16 and extract the ISA_INDEX_BITS to register p16. The remaining high bits are supplemented by 0
	ubfx	p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index 
	ldr	p16, [x10, p16, UXTP #PTRSHIFT]	// load class from array
1:

//-- for 64-bit systems
#elif __LP64__ 
	// 64-bit packed isa
//---- p16 = class = isa & ISA_MASK(bit operation & get shiftcls information in isa)
	and	p16, $0, #ISA_MASK 

#else
	// 32-bit raw isa ---- Used for 32-bit systems
	mov	p16, $0

#endif

.endmacro
  • [Step 2] With the class obtained from isa, enter the quick lookup process CacheLookup NORMAL

CacheLookup: cache lookup assembly source code

//!!!!!!!!! Key !!!!!!!!!!!!
.macro CacheLookup 
	//
	// Restart protocol:
	//
	// As soon as we're past the LLookupStart$1 label we may have loaded
	// an invalid cache pointer or mask.
	//
	// When task_restartable_ranges_synchronize() is called,
	// (or when a signal hits us) before we're past LLookupEnd$1,
	// then our PC will be reset to LLookupRecover$1 which forcefully
	// jumps to the cache-miss codepath which have the following
	// requirements:
	//
	// GETIMP:
	// The cache-miss is just returning NULL (setting x0 to 0)
	//
	// NORMAL and LOOKUP:
	// - x0 contains the receiver
	// - x1 contains the selector
	// - x16 contains the isa
	// - other registers are set as per calling conventions
	//
LLookupStart$1:

//---- p1 = SEL, p16 = isa -- #define CACHE (2 * __SIZEOF_POINTER__), where __SIZEOF_POINTER__ is the size of a pointer
//---- p11 = mask|buckets: offset x16 (the class obtained from isa) by 16 bytes to reach the cache and load it into p11. The cache is 16 bytes from the start of the class: isa (8 bytes) + superclass (8 bytes), then cache (mask in the high 16 bits + buckets in the low 48 bits)
	ldr	p11, [x16, #CACHE]				
//---- 64-bit real device
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 
//-- p11 (cache) & 0x0000ffffffffffff clears the high 16 bits (the mask) to get buckets, which is stored in register p10
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets 
	
//-- p11 (cache) shifted right by 48 bits gives mask; mask & p1 (_cmd, the second parameter of objc_msgSend, i.e. sel) gives the hash index of the sel-imp pair, stored in p12 (cache insertion computes the hash index as sel & mask, so reading must compute it the same way)
	and	p12, p1, p11, LSR #48		// x12 = _cmd & mask 

//-- not a 64-bit real device
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4 
	and	p10, p11, #~0xf			// p10 = buckets
	and	p11, p11, #0xf			// p11 = maskShift
	mov	p12, #0xffff
	lsr	p11, p12, p11				// p11 = mask = 0xffff >> p11
	and	p12, p1, p11				// x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif

//-- p12 = p10 (first address of buckets) + (index << 4): index * 16 bytes is the offset of the bucket, and the resulting bucket address is stored in register p12
//-- LSL #(1+PTRSHIFT) multiplies by the memory size of one bucket; since mask = capacity - 1, _cmd & mask is the remainder of _cmd divided by capacity
	add	p12, p10, p12, LSL #(1+PTRSHIFT)   
		             // p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT)) -- PTRSHIFT is 3
		             
//-- load the bucket at x12: imp goes into p17 and sel goes into p9
	ldp	p17, p9, [x12]		// {imp, sel} = *bucket 
	
//-- compare the bucket's sel (p9) with p1 (the _cmd passed in)
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
//-- if they differ, jump to 2f
	b.ne	2f			// scan more
//-- if they are equal, it is a cache hit: CacheHit calls or returns the imp
	CacheHit $0			// call or return imp 
	
2:	// not hit: p12 = not-hit bucket
//-- if the bucket's sel is 0 (an empty slot), CheckMiss jumps to the miss path: __objc_msgSend_uncached for NORMAL
	CheckMiss $0			// miss if bucket->sel == 0
//-- check whether p12 (the current bucket) equals p10 (the first element of buckets)
	cmp	p12, p10		// wrap if bucket == buckets
//-- if so, jump to 3f to wrap around to the last element
	b.eq	3f
//-- otherwise step x12 back by BUCKET_SIZE to the previous bucket and load its imp and sel into p17 and p9
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
//-- go back to label 1 and compare sel with _cmd again
	b	1b			// loop 

3:	// wrap: p12 = first bucket, w11 = mask
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
//-- manually move to the last element
//-- p11 still holds mask in its high 16 bits, so shifting p11 right by 44 bits equals shifting mask left by 4 bits, which locates the last element of buckets directly
	add	p12, p12, p11, LSR #(48 - (1+PTRSHIFT)) 
					// p12 = buckets + (mask << 1+PTRSHIFT) 
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	add	p12, p12, p11, LSL #(1+PTRSHIFT)
					// p12 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif

	// Clone scanning loop to miss instead of hang when cache is corrupt.
	// The slow path may detect any corruption and halt later.
//-- scan the cache again, this time starting from the last bucket
//-- load the imp and sel of the bucket at x12 into p17 and p9
	ldp	p17, p9, [x12]		// {imp, sel} = *bucket
	
//-- compare the bucket's sel (p9) with p1 (the _cmd passed in)
1:	cmp	p9, p1			// if (bucket->sel != _cmd)
//-- if they differ, jump to 2f
	b.ne	2f			// scan more
//-- if they are equal, it is a cache hit: call or return the imp
	CacheHit $0			// call or return imp
	
2:	// not hit: p12 = not-hit bucket
//-- if the bucket's sel is 0 (an empty slot), CheckMiss jumps to the miss path
	CheckMiss $0			// miss if bucket->sel == 0
//-- check whether p12 equals p10 (the first element of buckets) again: if so, the whole array has been scanned without a match
	cmp	p12, p10		// wrap if bucket == buckets
	b.eq	3f			// if so, jump to label 3 (double wrap)
//-- otherwise step x12 back by BUCKET_SIZE to the previous bucket and load its imp and sel into p17 and p9
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
//-- go back to label 1 and compare sel with _cmd again
	b	1b			// loop 

LLookupEnd$1:
LLookupRecover$1:
3:	// double wrap
//-- JumpMiss: because $0 is NORMAL here, jump to __objc_msgSend_uncached

	JumpMiss $0 
.endmacro

// Below are the macros used for the final jumps
.macro CacheHit
.if $0 == NORMAL
	TailCallCachedImp x17, x12, x1, x16	// authenticate and call imp
.elseif $0 == GETIMP
	mov	p0, p17
	cbz	p0, 9f			// don't ptrauth a nil imp
	AuthAndResignAsIMP x0, x12, x1, x16	// authenticate imp and re-sign as IMP
9:	ret				// return IMP
.elseif $0 == LOOKUP
	// No nil check for ptrauth: the caller would crash anyway when they
	// jump to a nil IMP. We don't care if that jump also fails ptrauth.
	AuthAndResignAsIMP x17, x12, x1, x16	// authenticate imp and re-sign as IMP
	ret				// return imp via x17
.else
.abort oops
.endif
.endmacro

.macro CheckMiss
	// miss if bucket->sel == 0
.if $0 == GETIMP 
//-- If it is GETIMP, jump to LGetImpMiss
	cbz	p9, LGetImpMiss
.elseif $0 == NORMAL 
//-- If the value is NORMAL, jump to __objc_msgSend_uncached
	cbz	p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP 
//-- For LOOKUP, jump to __objc_msgLookup_uncached
	cbz	p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

.macro JumpMiss
.if $0 == GETIMP
	b	LGetImpMiss
.elseif $0 == NORMAL
	b	__objc_msgSend_uncached
.elseif $0 == LOOKUP
	b	__objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

The process is mainly divided into the following steps

  • [Step 1] Offset the first address of the class by 16 bytes to reach the cache (in objc_class, the cache is exactly 16 bytes from the first address, because isa takes 8 bytes and superclass takes 8 bytes). The high 16 bits of the cache store the mask and the low 48 bits store buckets, i.e. p11 = cache

  • [Step 2] Extract buckets and mask from the cache, and use mask to compute the hash index according to the hash algorithm

    • ANDing the cache with the bit mask 0x0000ffffffffffff clears the high 16 bits (the mask) and yields the address of the buckets pointer, i.e. p10 = buckets

    • Shifting the cache right by 48 bits yields mask, i.e. mask = p11 >> 48 (the shift is applied inline as an operand, so p11 itself still holds the full cache value)

    • AND the objc_msgSend parameter p1 (the second parameter, _cmd) with mask to get the index of the bucket holding the sel-imp pair being searched for, i.e. p12 = index = _cmd & mask. Why this way? Because sel-imp pairs are stored at a hash index computed with the same algorithm, reads must compute the index in the same way, as shown below

  • [Step 3] Use the hash index and the first address of buckets to fetch the bucket at that index

    • PTRSHIFT equals 3, so the left shift is 1 + PTRSHIFT = 4 bits (2^4 = 16 bytes), which is the actual size of one bucket: in the bucket_t structure, sel takes 8 bytes and imp takes 8 bytes

    • Multiply the computed hash index by the memory size of a single bucket to get the offset of the bucket from the first address of buckets

    • Add this offset to the first address of buckets to get the bucket at the hash index

  • [Step 4] From the bucket obtained, load imp into p17, i.e. p17 = imp, and load sel into p9, i.e. p9 = sel

  • [Step 5] The first scanning loop

    • Compare sel in the obtained bucket with _cmd (that is, p1), the second argument of objc_msgSend

    • If they are equal, jump directly to CacheHit, which calls or returns the imp

    • If they are not equal, there are three cases

      • If the bucket's sel is 0 (an empty slot, so the selector is not cached), CheckMiss jumps to __objc_msgSend_uncached because $0 is NORMAL, that is, it enters the slow lookup process

      • If the current bucket is the first element of buckets, set the current bucket to the last element of buckets by adding p11 shifted right by 44 bits to the first address of buckets (equivalent to adding mask shifted left by 4 bits), then continue with the second scanning loop, i.e. [Step 6]

      • If the current bucket is not the first element of buckets, step back to the previous bucket and continue the first scanning loop

  • [Step 6] The second scanning loop: repeat the comparisons of [Step 5], starting from the last element of buckets. If the scan reaches the first element of buckets again without a match, jump to JumpMiss; because $0 is NORMAL, this jumps to __objc_msgSend_uncached, that is, it enters the slow lookup process

The following shows how the values change throughout the quick lookup process