1. Cache insertion process analysis
How methods are cached in a class was explored in the earlier Cache_t Analysis article of this OC low-level exploration series. But when is a method actually stored in the class cache? Let's explore when methods are inserted into the cache!
Open the objc source code and set a breakpoint on the method call. Once it breaks, set another breakpoint in the insert method, then run bt in LLDB to view the call stack. The stack shows the entire flow from the object calling the method to entering insert:
_objc_msgSend_uncached -> lookUpImpOrForward -> log_and_fill_cache -> insert
You can see that objc_msgSend is the first function called when an object calls a method, and objc_msgSend is part of the Runtime!
2. Understanding the runtime
1. Static and dynamic languages
The first thing to know is that programming languages can be static or dynamic.
In a static language, all type checks are resolved before the program runs, and the memory addresses of a class's members and methods are fixed at compile time. A class object can only access its own member variables and methods; anything else is a compiler error. Common static languages include Java, C++, and C.
In a dynamic language, by contrast, types, member variables, and method addresses are resolved at runtime, and members and methods can be added dynamically. This means that calling a method that does not exist still compiles, and an object's apparent type may differ from its true type, which can only be determined at runtime. Dynamic languages are more flexible and extensible than static ones. Objective-C is a dynamic language.
2. Compile time
Compile time, as the name implies, is when the code is being compiled. What is compilation? A compiler translates your source code into code the machine can recognize (in the general sense; it may also translate into some intermediate representation).
So compilation is essentially translation work: checking whether you accidentally miswrote a keyword, lexical analysis, syntax analysis, and so on. It is like a teacher checking a student's essay for typos and broken sentences: the compiler will tell you if it finds any errors.
If you use an IDE such as Microsoft Visual Studio, clicking Build starts compilation, and the errors and warnings displayed below are produced by the compiler. These are compile-time errors, and the type checking done during this process is compile-time type checking, or static type checking (static because the code is not actually running in memory; it is merely scanned as text).
So when people say that objects are allocated memory at compile time, they are mistaken: nothing has been loaded into memory yet.
3. Runtime
At runtime, the code is loaded into memory and executed. Your code is dead on disk until it is loaded into memory; only then does it come alive. Runtime type checking differs from the compile-time (static) type checking described above: instead of merely scanning code as text, it operates on real objects in memory and makes its judgments there.
More details about the runtime can be found in the official documentation. (Objective-C Runtime Programming Guide)
4. Ways of interacting with the runtime
1. Objective-C methods.
2. NSObject methods.
3. The objc dynamic library API.
The hierarchy can be represented as a diagram:
5. The difference between runtime and compile time
Create a class that declares two methods but implements only one, then run: compilation succeeds, but an error is thrown as soon as the unimplemented method is called. That is the difference between compile time and runtime!
6. Underlying analysis via clang
Next, let's look at the underlying implementation of the OC code by rewriting it with clang, and find the main function:
int main(int argc, const char * argv[]) {
/* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool;
HPerson * p = ((HPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("HPerson"), sel_registerName("alloc"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)p, sel_registerName("saySix"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)p, sel_registerName("sayHello"));
}
return 0;
}
After the rewrite we can see what the upper-level code compiles down to: calling a method is really a call to the objc_msgSend function, i.e. message sending!
From the underlying code, objc_msgSend takes two parameters:
one is (id)objc_getClass("HPerson") or (id)p, the message receiver;
the other is sel_registerName("xxx"), the SEL.
So far every method has been called without arguments. What happens when a method takes arguments?
We add a parameter and run clang again:
int main(int argc, const char * argv[]) {
/* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool;
HPerson * p = ((HPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("HPerson"), sel_registerName("alloc"));
((void (*)(id, SEL, NSString *))(void *)objc_msgSend)((id)p, sel_registerName("saySomething:"), (NSString *)&__NSConstantStringImpl__var_folders_1h_55lzq4fd39b0mz94wmqthqf860mpgy_T_main_189844_mi_2);
((void (*)(id, SEL))(void *)objc_msgSend)((id)p, sel_registerName("saySix"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)p, sel_registerName("sayHello"));
}
return 0;
}
You can see an NSString argument is now passed as well! So the message-send form is:
objc_msgSend(message receiver, message body (SEL + arguments))
7. Calling objc_msgSend directly
Can we call objc_msgSend directly in code? Set Enable Strict Checking of objc_msgSend Calls to NO in Build Settings so that the compiler relaxes its checking of objc_msgSend's parameters. Then try calling it directly from code: the result is exactly the same as a normal method call!
8. NSObject method calls
So how are NSObject's methods invoked? Open NSObject.h and you can see the related methods. performSelector clearly has to do with method calls, so let's try it out: the result is the same as a normal call!
3. Viewing the objc_msgSend source code
1. Assembly debugging
First, enable assembly display in Xcode (Debug -> Debug Workflow -> Always Show Disassembly). Then place a breakpoint before the method call and run. Create a breakpoint at objc_msgSend and hold down Control to step into it: objc_msgSend is part of the objc source code.
2. Viewing the source code
Open the objc source and search for objc_msgSend. Since the underlying objc_msgSend is written in assembly, look directly at the .s files! Since we mostly target real devices, look at arm64 and find ENTRY _objc_msgSend:
ENTRY _objc_msgSend
UNWIND _objc_msgSend, NoFrame
cmp p0, #0 // nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
b.le LNilOrTagged // (MSB tagged pointer looks negative)
#else
b.eq LReturnZero
#endif
ldr p13, [x0] // p13 = isa
GetClassFromIsa_p16 p13, 1, x0 // p16 = class
LGetIsaDone:
// calls imp or objc_msgSend_uncached
CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
We can see that _objc_msgSend is written in assembly, whereas the source code we read before was C or C++. Why? Because assembly is fast and well suited to dynamic dispatch!
4. Assembly source code analysis
1. Common assembly instructions
b: jump to a label and continue execution there.
bl: jump to a label, saving the return address (branch with link).
b.le: if the preceding cmp result is less than or equal, jump to the label; otherwise fall through.
b.ge: greater than or equal -> jump to the label, otherwise fall through.
b.lt: less than -> jump to the label, otherwise fall through.
b.gt: greater than -> jump to the label, otherwise fall through.
b.eq: equal -> jump to the label, otherwise fall through.
b.hi: unsigned greater than -> jump, otherwise fall through.
b.hs: unsigned greater than or equal -> jump, otherwise fall through.
b.ls: unsigned less than or equal -> jump, otherwise fall through.
b.lo: unsigned less than -> jump, otherwise fall through.
ret: return from the subroutine.
mov x0, #0x10: x0 = 0x10.
str w10, [sp]: store the value of register w10 to stack memory at sp.
stp x0, x1, [sp, #0x10]: store x0 and x1 to sp + 0x10.
orr x0, wzr, #0x1: x0 = wzr | 0x1.
stur w10, [sp]: store w10 to stack memory at sp (unscaled-offset form of str).
ldr w10, [sp]: w10 = the value in stack memory at sp.
ldp x0, x1, [sp]: x0, x1 = the values in stack memory at sp.
adrp: compute an address from a page base plus offset (used for globals and strings).
cbz: compare and branch if zero.
cbnz: compare and branch if non-zero.
cmp: comparison; e.g. cmp OPR1, OPR2 computes (OPR1) - (OPR2) and sets the condition flags.
2. Start analyzing
cmp p0, #0 // nil check and tagged pointer check
Search for p0:
#if __LP64__
// true arm64
#define SUPPORT_TAGGED_POINTERS 1
#define PTR .quad
#define PTRSIZE 8
#define PTRSHIFT 3 // 1<<PTRSHIFT == PTRSIZE
// "p" registers are pointer-sized
#define UXTP UXTX
#define p0 x0
#define p1 x1
#define p2 x2
#define p3 x3
#define p4 x4
#define p5 x5
#define p6 x6
#define p7 x7
#define p8 x8
#define p9 x9
#define p10 x10
#define p11 x11
#define p12 x12
#define p13 x13
#define p14 x14
#define p15 x15
#define p16 x16
#define p17 x17
// true arm64
#else
// arm64_32
#define SUPPORT_TAGGED_POINTERS 0
#define PTR .long
#define PTRSIZE 4
#define PTRSHIFT 2 // 1<<PTRSHIFT == PTRSIZE
// "p" registers are pointer-sized
#define UXTP UXTW
#define p0 w0
#define p1 w1
#define p2 w2
#define p3 w3
#define p4 w4
#define p5 w5
#define p6 w6
#define p7 w7
#define p8 w8
#define p9 w9
#define p10 w10
#define p11 w11
#define p12 w12
#define p13 w13
#define p14 w14
#define p15 w15
#define p16 w16
#define p17 w17
// arm64_32
#endif
You can see that p0 is register x0, which holds the first argument we pass in: the receiver p.
cmp compares p0 with 0 to check whether p0 is nil, i.e. whether the message receiver exists. If it does not:
#if SUPPORT_TAGGED_POINTERS
b.le LNilOrTagged // (MSB tagged pointer looks negative)
#else
b.eq LReturnZero
#endif
If SUPPORT_TAGGED_POINTERS is 1, LNilOrTagged is executed:
#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
b.eq LReturnZero // nil check
GetTaggedClass
b LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif
Similarly, when SUPPORT_TAGGED_POINTERS is 0, LReturnZero is executed:
LReturnZero:
// x0 is already zero
mov x1, #0
movi d0, #0
movi d1, #0
movi d2, #0
movi d3, #0
ret
END_ENTRY _objc_msgSend
That is, zero out the return registers and return.
In the normal case, of course, execution continues downward:
ldr p13, [x0] // p13 = isa
This loads the value at the address in x0, the message receiver (our argument p), into p13.
The comment says that isa is placed in p13. Why isa?
Because isa is stored at the first address of p!
3. Getting the class: GetClassFromIsa_p16
Continue:
GetClassFromIsa_p16 p13, 1, x0 // p16 = class
The comment says that the class is placed in p16!
Let's explore GetClassFromIsa_p16, which is called with p13, 1, and x0. Enter GetClassFromIsa_p16:
/********************************************************************
 * GetClassFromIsa_p16 src, needs_auth, auth_address
 * src is a raw isa field. Sets p16 to the corresponding class pointer.
 * The raw isa might be an indexed isa to be decoded, or a
 * packed isa that needs to be masked.
 *
 * On exit:
 *   src is unchanged
 *   p16 is a class pointer
 *   x10 is clobbered
 ********************************************************************/
.macro GetClassFromIsa_p16 src, needs_auth, auth_address /* note: auth_address is not required if !needs_auth */
#if SUPPORT_INDEXED_ISA
// Indexed isa
mov p16, \src // optimistically set dst = src
tbz p16, #ISA_INDEX_IS_NPI_BIT, 1f // done if not non-pointer isa
// isa in p16 is indexed
adrp x10, _objc_indexed_classes@PAGE
add x10, x10, _objc_indexed_classes@PAGEOFF
ubfx p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS // extract index
ldr p16, [x10, p16, UXTP #PTRSHIFT] // load class from array
1:
#elif __LP64__
.if \needs_auth == 0 // _cache_getImp takes an authed class already
mov p16, \src
.else
// 64-bit packed isa
ExtractISA p16, \src, \auth_address
.endif
#else
// 32-bit raw isa
mov p16, \src
#endif
.endmacro
.macro means this is a macro definition! First look at SUPPORT_INDEXED_ISA:
// Define SUPPORT_INDEXED_ISA=1 on platforms that store the class in the isa
// field as an index into a class table.
// Note, keep this in sync with any .s files which also define it.
// Be sure to edit objc-abi.h as well.
#if __ARM_ARCH_7K__ >= 2 || (__arm64__ && !__LP64__)
# define SUPPORT_INDEXED_ISA 1
#else
# define SUPPORT_INDEXED_ISA 0
#endif
We are on 64-bit, so focus on the branch where SUPPORT_INDEXED_ISA is 0 and __LP64__ is defined:
.if \needs_auth == 0 // _cache_getImp takes an authed class already
mov p16, \src
.else
// 64-bit packed isa
ExtractISA p16, \src, \auth_address
.endif
needs_auth is the second argument, which is 1! So execution goes to ExtractISA, called with p16, p13 (the isa), and x0:
.macro ExtractISA
	and	$0, $1, #ISA_MASK
.endmacro
AND $1 with #ISA_MASK and assign the result to $0. This matches the source we saw before: isa & ISA_MASK yields the class!
So this step stores the class in p16!
Why retrieve the class? Because the cache that methods are inserted into lives in the class.
With that, GetClassFromIsa_p16 is finished!
4. Finding the cache: CacheLookup
4.1 Overview
After the class is obtained:
LGetIsaDone:
// calls imp or objc_msgSend_uncached
CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
Next comes CacheLookup, which looks up the cache:
/********************************************************************
 *
 * CacheLookup NORMAL|GETIMP|LOOKUP MissLabelDynamic MissLabelConstant
 *
 * MissLabelConstant is only used for the GETIMP variant.
 *
 * Locate the implementation for a selector in a class method cache.
 *
 * When this is used in a function that doesn't hold the runtime lock,
 * this represents the critical section that may access dead memory.
 * If the kernel causes one of these functions to go down the recovery
 * path, we pretend the lookup failed by jumping the JumpMiss branch.
 *
 * Takes:
 *   x1 = selector
 *   x16 = class to be searched
 *
 * Kills:
 *   x9,x10,x11,x12,x13,x15,x17
 *
 * Untouched:
 *   x14
 *
 * On exit: (found) calls or returns IMP
 *                  with x16 = class, x17 = IMP
 *                  In LOOKUP mode, the two low bits are set to 0x3
 *                  if we hit a constant cache (used in objc_trace)
 *          (not found) jumps to LCacheMiss
 *                  with x15 = class
 *                  For constant caches in LOOKUP mode, the low bit
 *                  of x16 is set to 0x1 to indicate we had to fallback.
 *                  In addition, when LCacheMiss is __objc_msgSend_uncached or
 *                  __objc_msgLookup_uncached, 0x2 will be set in x16
 *                  to remember we took the slowpath.
 *                  So the two low bits of x16 on exit mean:
 *                    0: dynamic hit
 *                    1: fallback to the parent class, when there is a preoptimized cache
 *                    2: slowpath
 *                    3: preoptimized cache hit
 *
 ********************************************************************/
#define NORMAL 0
#define GETIMP 1
#define LOOKUP 2
.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant
//
// Restart protocol:
//
// As soon as we're past the LLookupStart\Function label we may have
// loaded an invalid cache pointer or mask.
//
// When task_restartable_ranges_synchronize() is called,
// (or when a signal hits us) before we're past LLookupEnd\Function,
// then our PC will be reset to LLookupRecover\Function which forcefully
// jumps to the cache-miss codepath which have the following
// requirements:
//
// GETIMP:
// The cache-miss is just returning NULL (setting x0 to 0)
//
// NORMAL and LOOKUP:
// - x0 contains the receiver
// - x1 contains the selector
// - x16 contains the isa
// - other registers are set as per calling conventions
//
mov x15, x16 // stash the original isa
LLookupStart\Function:
// p1 = SEL, p16 = isa
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
ldr p10, [x16, #CACHE] // p10 = mask|buckets
lsr p11, p10, #48 // p11 = mask
and p10, p10, #0xffffffffffff // p10 = buckets
and w12, w1, w11 // x12 = _cmd & mask
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
ldr p11, [x16, #CACHE] // p11 = mask|buckets
#if CONFIG_USE_PREOPT_CACHES
#if __has_feature(ptrauth_calls)
tbnz p11, #0, LLookupPreopt\Function
and p10, p11, #0x0000ffffffffffff // p10 = buckets
#else
and p10, p11, #0x0000fffffffffffe // p10 = buckets
tbnz p11, #0, LLookupPreopt\Function
#endif
eor p12, p1, p1, LSR #7
and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd >> 7)) & mask
#else
and p10, p11, #0x0000ffffffffffff // p10 = buckets
and p12, p1, p11, LSR #48 // x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
ldr p11, [x16, #CACHE] // p11 = mask|buckets
and p10, p11, #~0xf // p10 = buckets
and p11, p11, #0xf // p11 = maskShift
mov p12, #0xffff
lsr p11, p12, p11 // p11 = mask = 0xffff >> p11
and p12, p1, p11 // x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif
add p13, p10, p12, LSL #(1+PTRSHIFT)
// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))
// do {
1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
	cmp	p9, p1			// if (sel != _cmd) {
	b.ne	3f			//     scan more
					// } else {
2:	CacheHit \Mode			// hit: call or return imp
					// }
3:	cbz	p9, \MissLabelDynamic	// if (sel == 0) goto Miss;
cmp p13, p10 // } while (bucket >= buckets)
b.hs 1b
// wrap-around:
// p10 = first bucket
// p11 = mask (and maybe other bits on LP64)
// p12 = _cmd & mask
//
// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.
// So stop when we circle back to the first probed bucket
// rather than when hitting the first bucket again.
//
// Note that we might probe the initial bucket twice
// when the first probed slot is the last entry.
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
add p13, p10, w11, UXTW #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
// p13 = buckets + (mask << 1+PTRSHIFT)
// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
add p13, p10, p11, LSL #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = first probed bucket
// do {
4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel == _cmd)
b.eq 2b // goto hit
	cmp	p9, #0			// } while (sel != 0 &&
ccmp p13, p12, #0, ne // bucket > first_probed)
b.hi 4b
LLookupEnd\Function:
LLookupRecover\Function:
b \MissLabelDynamic
#if CONFIG_USE_PREOPT_CACHES
#if CACHE_MASK_STORAGE != CACHE_MASK_STORAGE_HIGH_16
#error config unsupported
#endif
LLookupPreopt\Function:
#if __has_feature(ptrauth_calls)
and p10, p11, #0x007ffffffffffffe // p10 = buckets
autdb x10, x16 // auth as early as possible
#endif
// x12 = (_cmd - first_shared_cache_sel)
adrp x9, _MagicSelRef@PAGE
ldr p9, [x9, _MagicSelRef@PAGEOFF]
sub p12, p1, p9
// w9 = ((_cmd - first_shared_cache_sel) >> hash_shift & hash_mask)
#if __has_feature(ptrauth_calls)
	// bits 63..60 of x11 are the number of bits in hash_mask
	// bits 59..55 of x11 is hash_shift
lsr x17, x11, #55 // w17 = (hash_shift, ...)
lsr w9, w12, w17 // >>= shift
lsr x17, x11, #60 // w17 = mask_bits
mov x11, #0x7fff
lsr x11, x11, x17 // p11 = mask (0x7fff >> mask_bits)
and x9, x9, x11 // &= mask
#else
	// bits 63..53 of x11 is hash_mask
	// bits 52..48 of x11 is hash_shift
lsr x17, x11, #48 // w17 = (hash_shift, hash_mask)
lsr w9, w12, w17 // >>= shift
and x9, x9, x11, LSR #53 // &= mask
#endif
ldr x17, [x10, x9, LSL #3] // x17 == sel_offs | (imp_offs << 32)
cmp x12, w17, uxtw
.if \Mode == GETIMP
b.ne \MissLabelConstant // cache miss
sub x0, x16, x17, LSR #32 // imp = isa - imp_offs
SignAsImp x0
ret
.else
b.ne 5f // cache miss
sub x17, x16, x17, LSR #32 // imp = isa - imp_offs
.if \Mode == NORMAL
br x17
.elseif \Mode == LOOKUP
orr x16, x16, #3 // for instrumentation, note that we hit a constant cache
SignAsImp x17
ret
.else
.abort unhandled mode \Mode
.endif
5:	ldursw	x9, [x10, #-8]	// offset -8 is the fallback offset
add x16, x16, x9 // compute the fallback isa
b LLookupStart\Function // lookup again with a new isa
.endif
#endif // CONFIG_USE_PREOPT_CACHES
.endmacro
4.2 Getting buckets
First look at the arguments passed in: NORMAL, _objc_msgSend, and __objc_msgSend_uncached, corresponding to Mode, Function, and MissLabelDynamic. The macro takes four parameters, so the last one, MissLabelConstant, is left at its default (it is only used by the GETIMP variant).
Then follow along:
mov x15, x16 // stash the original isa
So here we stash x16 (the class) into x15. Continue:
LLookupStart\Function: // p1 = SEL, p16 = isa
Function here is _objc_msgSend, so this label is LLookupStart_objc_msgSend. Next comes a conditional:
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
	// ...
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	// ...
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	// ...
#else
#error Unsupported cache mask storage for ARM64.
#endif
Let’s start with CACHE_MASK_STORAGE:
#if defined(__arm64__) && __LP64__
#if TARGET_OS_OSX || TARGET_OS_SIMULATOR
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16 // True 64-bit
#endif
#elif defined(__arm64__) && !__LP64__
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED
#endif
Since we’re mainly looking at the 64-bit mode of the real machine, we just need to look at CACHE_MASK_STORAGE_HIGH_16:
ldr p11, [x16, #CACHE] // p11 = mask|buckets
#if CONFIG_USE_PREOPT_CACHES
#if __has_feature(ptrauth_calls)
tbnz p11, #0, LLookupPreopt\Function
and p10, p11, #0x0000ffffffffffff // p10 = buckets
#else
and p10, p11, #0x0000fffffffffffe // p10 = buckets
tbnz p11, #0, LLookupPreopt\Function
#endif
eor p12, p1, p1, LSR #7
and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd >> 7)) & mask
#else
and p10, p11, #0x0000ffffffffffff // p10 = buckets
and p12, p1, p11, LSR #48 // x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
ldr loads the value at address x16 + #CACHE into p11. What is #CACHE?
#define CACHE (2 * __SIZEOF_POINTER__)
__SIZEOF_POINTER__ is the pointer size, 8, so CACHE is 16!
So x16 (the class) is offset by 16 bytes to reach the cache: p11 now holds the first member of the cache, _bucketsAndMaybeMask!
Next look at CONFIG_USE_PREOPT_CACHES:
#if defined(__arm64__) && TARGET_OS_IOS && !TARGET_OS_SIMULATOR && !TARGET_OS_MACCATALYST
#define CONFIG_USE_PREOPT_CACHES 1
#else
#define CONFIG_USE_PREOPT_CACHES 0
#endif
On a real device it is 1, so just look at this passage:
#if __has_feature(ptrauth_calls)
tbnz p11, #0, LLookupPreopt\Function
and p10, p11, #0x0000ffffffffffff // p10 = buckets
#else
and p10, p11, #0x0000fffffffffffe // p10 = buckets
tbnz p11, #0, LLookupPreopt\Function
__has_feature: checks whether the compiler supports a given feature.
ptrauth_calls: pointer authentication, used by the arm64e architecture. Devices with an Apple A12 or later A-series processor (iPhone XS, iPhone XS Max, iPhone XR, and newer) support arm64e.
Let's look at the most common case, devices below the A12, i.e. the #else branch:
p11 & 0x0000fffffffffffe is assigned to p10! So what is p10?
Recall how buckets are obtained in the C source:
struct bucket_t *cache_t::buckets() const
{
uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
return (bucket_t *)(addr & bucketsMask);
}
p11 is _bucketsAndMaybeMask (the cache's first member), 0x0000fffffffffffe is the bucket mask, so p10 is buckets!
After getting buckets:
tbnz p11, #0, LLookupPreopt\Function
tbnz: test the given bit and branch if it is not zero. For an ordinary buckets pointer, bit 0 is 0, so no jump occurs.
4.3 Starting the search
Keep going:
eor p12, p1, p1, LSR #7
and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd >> 7)) & mask
lsr: logical shift right.
This is the same hash used by the insert method we saw before:
static inline mask_t cache_hash(SEL sel, mask_t mask)
{
uintptr_t value = (uintptr_t)sel;
#if CONFIG_USE_PREOPT_CACHES
value ^= value >> 7;
#endif
return (mask_t)(value & mask);
}
Copy the code
So this recomputes the hash to get the index; p12 holds the starting index (begin)!
After getting the hash subscript:
add p13, p10, p12, LSL #(1+PTRSHIFT)
// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))
lsl: logical shift left.
Then look up PTRSHIFT:
#if __LP64__
#define PTRSHIFT 3 // 1<<PTRSHIFT == PTRSIZE
#else
#define PTRSHIFT 2 // 1<<PTRSHIFT == PTRSIZE
#endif
So PTRSHIFT is 3!
p12 (the hash index) shifted left by 1 + PTRSHIFT = 4 bits multiplies it by 16, the size of one bucket_t (an 8-byte sel plus an 8-byte imp).
Adding that offset to p10 (buckets) lands on the bucket at that index.
So p13 is the bucket currently being examined!
And then:
// do {
1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
	cmp	p9, p1			// if (sel != _cmd) {
b.ne 3f // scan more
// } else {
2: CacheHit \Mode // hit: call or return imp
					// }
3: cbz p9, \MissLabelDynamic // if (sel == 0) goto Miss;
cmp p13, p10 // } while (bucket >= buckets)
b.hs 1b
ldp x0, x1, [sp]: x0, x1 = the values in stack memory at sp.
cbz: compare and branch if zero.
Analysis:
1. Load the current bucket's imp and sel into p17 and p9, then subtract 16 from x13 (x13 now points at the previous bucket). Compare p9 (sel) with the _cmd we passed in; if they differ, jump to 3, otherwise fall through to 2.
2. The selector was found: CacheHit.
3. If p9 (sel) is 0, jump to MissLabelDynamic (__objc_msgSend_uncached, the third argument passed to CacheLookup). Otherwise check whether the bucket address x13 is still greater than or equal to the first bucket address p10; if so, jump back to 1.
These three steps form a do...while loop!
Why does an empty p9 (sel) mean a miss?
1. The hash algorithm used to read the cache is the same one used to store it.
2. On a real device, when there is a hash collision, the next probe index is:
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return i ? i - 1 : mask;
}
Because insertion probes downward along this same sequence, hitting an empty slot while scanning backward means the probe sequence is exhausted and the selector was never cached: a miss!
Similarly, if the bucket address drops below buckets, this pass over the cache is exhausted.
4.4 Cache hit: CacheHit
When the selector we passed in is found, we have hit the cache and enter CacheHit:
// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa
.macro CacheHit
.if $0 == NORMAL
TailCallCachedImp x17, x10, x1, x16 // authenticate and call imp
.elseif $0 == GETIMP
mov p0, p17
cbz p0, 9f // don't ptrauth a nil imp
AuthAndResignAsIMP x0, x10, x1, x16 // authenticate imp and re-sign as IMP
9: ret // return IMP
.elseif $0 == LOOKUP
// No nil check for ptrauth: the caller would crash anyway when they
// jump to a nil IMP. We don't care if that jump also fails ptrauth.
AuthAndResignAsIMP x17, x10, x1, x16 // authenticate imp and re-sign as IMP
cmp x16, x15
	cinc	x16, x16, ne		// x16 += 1 when x15 != x16 (for instrumentation; fallback to the parent class)
ret // return imp via x17
.else
.abort oops
.endif
.endmacro
$0 is the first value passed to CacheLookup, which is NORMAL! So just look at:
TailCallCachedImp x17, x10, x1, x16 // authenticate and call imp
Then enter the TailCallCachedImp macro:
#if __has_feature(ptrauth_calls)
// JOP
.macro TailCallCachedImp
// $0 = cached imp, $1 = address of cached imp, $2 = SEL, $3 = isa
eor $1, $1, $2 // mix SEL into ptrauth modifier
eor $1, $1, $3 // mix isa into ptrauth modifier
brab $0, $1
.endmacro
#else
// not JOP
.macro TailCallCachedImp
// $0 = cached imp, $1 = address of cached imp, $2 = SEL, $3 = isa
eor $0, $0, $3
br $0
.endmacro
#endif
Let's look at the non-ptrauth case, i.e. devices below the A12:
.macro TailCallCachedImp
// $0 = cached imp, $1 = address of cached imp, $2 = SEL, $3 = isa
eor $0, $0, $3
br $0
.endmacro
eor: bitwise exclusive or.
We passed in x17 (imp), x10 (buckets), x1 (sel), x16 (isa)!
So XOR $0 (the imp) with $3 (the isa, i.e. the class) and store the result in $0.
Why is there an XOR here?
Because when the cache was inserted, the imp was encoded:
// Sign newImp, with &_imp, newSel, and cls as modifiers.
uintptr_t encodeImp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, IMP newImp, UNUSED_WITHOUT_PTRAUTH SEL newSel, Class cls) const {
    if (!newImp) return 0;
#if CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_PTRAUTH
return (uintptr_t)
ptrauth_auth_and_resign(newImp,
ptrauth_key_function_pointer, 0,
ptrauth_key_process_dependent_code,
modifierForSEL(base, newSel, cls));
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_ISA_XOR
return (uintptr_t)newImp ^ (uintptr_t)cls;
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_NONE
return (uintptr_t)newImp;
#else
#error Unknown method cache IMP encoding.
#endif
}
So this is the decode step that recovers the imp! Finally, branch to the imp!
That is the whole process by which objc_msgSend finds an imp by sel!
4.5 Continuing the search
If p13 (the bucket) drops below p10 (buckets), the first loop exits and execution continues downward:
// wrap-around:
// p10 = first bucket
// p11 = mask (and maybe other bits on LP64)
// p12 = _cmd & mask
//
// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.
// So stop when we circle back to the first probed bucket
// rather than when hitting the first bucket again.
//
// Note that we might probe the initial bucket twice
// when the first probed slot is the last entry.
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
add p13, p10, w11, UXTW #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
// p13 = buckets + (mask << 1+PTRSHIFT)
// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
add p13, p10, p11, LSL #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
Just look at CACHE_MASK_STORAGE_HIGH_16:
add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
// p13 = buckets + (mask << 1+PTRSHIFT)
// see comment about maskZeroBits
The mask occupies the high 16 bits of p11, so shifting p11 right by 48 - (1 + PTRSHIFT) both extracts the mask and multiplies it by 16 (one bucket). Adding that to p10 (buckets) gives p13 = buckets + (mask << 4).
So p13 is buckets' last bucket!
And then:
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = first probed bucket
p12 (begin, the hash index) is shifted left by 4 bits (begin * 16) and added to p10 (buckets).
So p12 is now the bucket at begin, the first probed bucket!
Continue to:
// do {
4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel == _cmd)
b.eq 2b // goto hit
	cmp	p9, #0			// } while (sel != 0 &&
ccmp p13, p12, #0, ne // bucket > first_probed)
b.hi 4b
This loop is similar to the one above:
1. Load imp and sel into p17 and p9, then bucket--.
2. Compare p9 (sel) with the incoming sel; if they match, jump back to CacheHit (label 2).
3. Check that p9 (sel) is not 0 and that the bucket is still above p12 (the bucket at begin).
4. If both hold, jump back to 4 (the start of the loop).
4.6 Not found
If p9 (sel) is 0, or the bucket is no longer above p12 (the bucket at begin), execution continues:
LLookupEnd\Function:
LLookupRecover\Function:
b \MissLabelDynamic
That is, jump to MissLabelDynamic (__objc_msgSend_uncached), which begins the slow-path method lookup.