In the last article, we explored how the insert method writes a SEL and IMP into the cache when an instance method is called on an object. Now let's look at the other direction: how a method is retrieved from the cache when it is called, i.e. the fast sel-to-IMP lookup.
objc_msgSend groundwork
1. View the source code
How do we end up in insert when a method is called?

In the source file objc-cache.mm, besides the insert method from the previous article, you can also find cache_fill; it is inside cache_fill that insert is called.

Searching for cache_fill in turn shows how the file is organized: cache_fill lives in the Cache writers section, the cache-write path, and before it comes the Cache readers section, the cache-read path, which contains objc_msgSend and cache_getImp.
2. Clang
Use Clang to compile the following code:
@interface Person : NSObject
- (void)sayHello;
- (int)addNumber:(int)number;
@end

@implementation Person
- (void)sayHello {
    NSLog(@"Hello world");
}
- (int)addNumber:(int)number {
    return number + 1;
}
@end

Person *p = [Person alloc];
[p sayHello];
int result = [p addNumber:2];
In the generated .cpp file we can see the compiled result: whether it is the class method alloc or the instance method sayHello, the call is compiled down to objc_msgSend(message receiver, selector, method arguments...).

The significance of the message receiver is that the lookup starts from it: the receiver determines the class hierarchy that will be searched for the method.
3. Exploration
#import <objc/message.h>

@interface Tercher : Person
@end

@implementation Tercher
@end

Person *p = [Person alloc];
[p sayHello];
objc_msgSend(p, sel_registerName("sayHello"));

Tercher *t = [Tercher alloc];
[t sayHello];

struct objc_super xsuper;
xsuper.receiver = t;
xsuper.super_class = [Person class];
objc_msgSendSuper(&xsuper, sel_registerName("sayHello"));
To call objc_msgSend directly like this, set Build Settings -> Enable Strict Checking of objc_msgSend Calls to NO.
objc_msgSend quick lookup
objc_msgSend is implemented in assembly; for arm64 the code lives in objc-msg-arm64.s.

objc_msgSend is the core engine of every Objective-C method call: it is responsible for finding a method's implementation and jumping to it. Because it is called extremely frequently, its internals have a large impact on performance, which is why it is written in assembly: assembly is fast, and it can cope with calls whose arguments are not known statically.
Analysis:
1.
cmp p0, #0 // nil check and tagged pointer check
The first instruction, cmp, compares p0 with 0; as the comment says, this is the nil check and tagged-pointer check. p0 is the first parameter of objc_msgSend, the message receiver.
2.
#if SUPPORT_TAGGED_POINTERS
b.le LNilOrTagged // (MSB tagged pointer looks negative)
#else
b.eq LReturnZero
#endif
SUPPORT_TAGGED_POINTERS determines whether small-object (tagged pointer) types are supported. If they are, b.le jumps to LNilOrTagged; otherwise b.eq LReturnZero returns nil.

Both branches reuse the flags set by cmp p0, #0, and even when tagged pointers are supported, a nil receiver still ends up in LReturnZero.

le = less than or equal; eq = equal. On arm64 the most significant bit of a tagged pointer is set, so viewed as a signed value it is negative; b.le therefore catches both nil (zero) and tagged pointers (negative), which is what the comment "(MSB tagged pointer looks negative)" means.
3.
ldr p13, [x0] // p13 = isa
This loads the object's isa into register p13.
4.
GetClassFromIsa_p16 p13 // p16 = class
On a 64-bit device, GetClassFromIsa_p16 performs an AND between $0 (the p13 passed in, i.e. the isa) and ISA_MASK; the result is the class information, placed in p16. Once the class has been found, the code can offset into it to reach cache and perform the method lookup there, i.e. CacheLookup NORMAL, the quick lookup.
5.
LGetIsaDone: // The ISA has been obtained
// calls imp or objc_msgSend_uncached
CacheLookup NORMAL, _objc_msgSend
CacheLookup NORMAL
Source:
.macro CacheLookup
	//
	// Restart protocol:
	//
	//   As soon as we're past the LLookupStart$1 label we may have loaded
	//   an invalid cache pointer or mask.
	//
	//   When task_restartable_ranges_synchronize() is called,
	//   (or when a signal hits us) before we're past LLookupEnd$1,
	//   then our PC will be reset to LLookupRecover$1 which forcefully
	//   jumps to the cache-miss codepath which have the following
	//   requirements:
	//
	//   GETIMP:
	//     The cache-miss is just returning NULL (setting x0 to 0)
	//
	//   NORMAL and LOOKUP:
	//   - x0 contains the receiver
	//   - x1 contains the selector
	//   - x16 contains the isa
	//   - other registers are set as per calling conventions
	//
LLookupStart$1:

	// p1 = SEL, p16 = isa
	ldr	p11, [x16, #CACHE]		// p11 = mask|buckets

#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
	and	p12, p1, p11, LSR #48		// x12 = _cmd & mask
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	and	p10, p11, #~0xf			// p10 = buckets
	and	p11, p11, #0xf			// p11 = maskShift
	mov	p12, #0xffff
	lsr	p11, p12, p11			// p11 = mask = 0xffff >> p11
	and	p12, p1, p11			// x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif

	add	p12, p10, p12, LSL #(1+PTRSHIFT)
					// p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

	ldp	p17, p9, [x12]			// {imp, sel} = *bucket
1:	cmp	p9, p1				// if (bucket->sel != _cmd)
	b.ne	2f				//     scan more
	CacheHit $0				// call or return imp

2:	// not hit: p12 = not-hit bucket
	CheckMiss $0				// miss if bucket->sel == 0
	cmp	p12, p10			// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b				// loop

3:	// wrap: p12 = first bucket, w11 = mask
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	add	p12, p12, p11, LSR #(48 - (1+PTRSHIFT))
					// p12 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	add	p12, p12, p11, LSL #(1+PTRSHIFT)
					// p12 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif

	// Clone scanning loop to miss instead of hang when cache is corrupt.
	// The slow path may detect any corruption and halt later.

	ldp	p17, p9, [x12]			// {imp, sel} = *bucket
1:	cmp	p9, p1				// if (bucket->sel != _cmd)
	b.ne	2f				//     scan more
	CacheHit $0				// call or return imp

2:	// not hit: p12 = not-hit bucket
	CheckMiss $0				// miss if bucket->sel == 0
	cmp	p12, p10			// wrap if bucket == buckets
	b.eq	3f
	ldp	p17, p9, [x12, #-BUCKET_SIZE]!	// {imp, sel} = *--bucket
	b	1b				// loop

LLookupEnd$1:
LLookupRecover$1:
3:	// double wrap
	JumpMiss $0

.endmacro
1.
// p1 = SEL, p16 = isa
ldr p11, [x16, #CACHE] // p11 = mask|buckets
Here #CACHE == 2*8 == 16: in the class structure, isa (8 bytes) and superclass (8 bytes) come first, so offsetting the class by 16 bytes reaches cache, hence p11 = cache. But why does the comment say p11 = mask|buckets? On 64-bit systems the mask and the buckets pointer are packed into a single word, which saves memory and lets both be fetched with one load.
2.
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
and p10, p11, #0x0000ffffffffffff // p10 = buckets
and p12, p1, p11, LSR #48 // x12 = _cmd & mask
ANDing p11 (mask|buckets) with 0x0000ffffffffffff zeroes its high 16 bits; the result is the buckets pointer, stored in p10.

and p12, p1, p11, LSR #48 reads in two steps. First, p11, LSR #48 logically shifts p11 right by 48 bits, leaving the mask from the cache. Then p1 is ANDed with that mask and the result is stored in p12. p1 is the sel (_cmd); see the comment at the top of the CacheLookup source. The final value in p12 is the index of the method's bucket inside buckets.

This matches the previous article: the insert method computes the insertion position with the same sel & mask hash.
3.
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT))
This also reads in two parts.

First, p12, LSL #(1+PTRSHIFT). A global search for PTRSHIFT shows that on a 64-bit device PTRSHIFT = 3, so this operand logically shifts the method's index left by 4 bits, i.e. multiplies it by 2^4 = 16:

0000 0001 << 4 = 0001 0000 = 16 = 2^4

Second, add p12, p10, .... p10 holds the first address of buckets; adding index * 16 bytes to it yields the address of the method's bucket_t, stored in p12.

Why multiply the index by 16 bytes? Because a bucket_t stores a SEL and an IMP, each 8 bytes, so one bucket_t occupies 16 bytes; multiplying the index by the size of each bucket_t locates the target bucket_t. (bucket_t in the C++ source and bucket in the assembly comments refer to the same thing.)
4.
ldp p17, p9, [x12] // {imp, sel} = *bucket
Loading from the bucket address computed above fetches its imp and sel, which are stored in p17 and p9 respectively.
5.
1: cmp p9, p1 // if (bucket->sel != _cmd)
b.ne 2f // scan more
CacheHit $0 // call or return imp
Label 1 compares the sel just loaded from the bucket (p9) with p1 (_cmd). If they are equal, the cache hits and CacheHit calls or returns the imp; if not, b.ne jumps forward to label 2 to keep scanning.
6.
2: // not hit: p12 = not-hit bucket
CheckMiss $0 // miss if bucket->sel == 0
cmp p12, p10 // wrap if bucket == buckets
b.eq 3f
ldp p17, p9, [x12, #-BUCKET_SIZE]! // {imp, sel} = *--bucket
b 1b // loop
If the bucket's sel is non-zero but does not equal p1 (_cmd), and the current bucket is not the first one, ldp p17, p9, [x12, #-BUCKET_SIZE]! steps one bucket backwards and reloads {imp, sel}; b 1b then jumps back to label 1 to compare again. If the sel is 0 the slot is empty and CheckMiss reports a miss; if the first bucket has been reached, b.eq 3f wraps around via label 3.
7.
3: // wrap: p12 = first bucket, w11 = mask
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
add p12, p12, p11, LSR #(48 - (1+PTRSHIFT))
// p12 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
add p12, p12, p11, LSL #(1+PTRSHIFT)
// p12 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
add p12, p12, p11, LSR #(48 - (1+PTRSHIFT)): we know from the start that p11 = mask|buckets, so logically shifting p11 right by 44 bits is the same as taking the 16-bit mask and shifting it left by 4, which matches the comment (mask << (1+PTRSHIFT)), i.e. mask * 2^4.

From the previous article, mask equals capacity - 1, the number of bucket slots in buckets minus one, so buckets + mask * 16 is the address of the last bucket.

In other words, when the backward scan reaches the first bucket without a hit, it wraps around to the last bucket and continues comparing backwards from there; if it wraps a second time ("double wrap" at label 3 of the cloned loop), JumpMiss reports a cache miss.
Quick search process summary:
Recommended reading:

Deconstruct the implementation of the objc_msgSend function in depth
objc_msgSend quick-lookup process analysis