1. Review
I introduced the concepts of runtime and compile time in my previous post, iOS Low-level Exploration Runtime (Part 1): Runtime & The Nature of Methods. You also know that OC method calls are essentially message sends, implemented at the low level by the objc_msgSend function. So how does that work underneath?
2. How message sending is implemented at the low level
Note 1
There are two versions of the Runtime: Legacy (earlier versions) and Modern (the current version).
- Legacy: the Objective-C 1.0 programming interface, used by 32-bit programs on Mac OS X.
- Modern: the Objective-C 2.0 programming interface, used by iPhone applications and by 64-bit programs on Mac OS X v10.5 and later.
See the Objective-C Runtime Programming Guide.
The code below should look familiar. The two calls have the same effect: one is the high-level OC method call on an object, the other is the low-level message send.
JPStudent *stu = [[JPStudent alloc]init];
[stu test];
objc_msgSend(stu, sel_registerName("test"));
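Note: calling objc_msgSend directly like this only compiles if strict checking of objc_msgSend calls is disabled in the build settings; otherwise the usual workaround is to cast it to a concrete function-pointer type first. A minimal sketch, reusing the post's own JPStudent/test names:
#import <objc/runtime.h>
#import <objc/message.h>

@class JPStudent;   // the demo class from above (assumed to exist in this project)

// Hedged sketch: with Xcode's default "Enable Strict Checking of objc_msgSend Calls"
// setting, objc_msgSend must be cast before it can be called directly.
static void callTestDirectly(JPStudent *stu) {
    ((void (*)(id, SEL))objc_msgSend)(stu, sel_registerName("test"));
}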
sel_registerName is a C function that takes a C string (our method name) and returns the corresponding SEL.
objc_msgSend(<#id _Nullable self#>, <#SEL _Nonnull op#>, ...)
sel_registerName(<#const char * _Nonnull str#>)
// C function that takes a C string
sel_registerName("test") is equivalent to @selector(test), so we can print both addresses to verify.
NSLog(@"%p---%p",sel_registerName("test"),@selector(test));
// Print the output
2021-06-29 12:58:50.610720+0800 Exploration of the nature of the method[42704:741799] 0x7fff7b9f5ddc---0x7fff7b9f5ddc
As the output shows, the two addresses are exactly the same 😁
Note 2
A method invocation in OC is converted at the low level into a call to the message-sending function objc_msgSend. Its execution can be roughly divided into three stages:
- Message sending process
- Dynamic method resolution process
- Message forwarding process
2.1 Searching the Source Code
If you want to see the underlying objc_msgSend, you have to dig into Apple's source code (objc4).
Search the source project for objc_msgSend
Oh my god! What is this 👻? There are so many files, some in assembly, some in C/C++. Which one should we look at? And the architectures differ.
We are definitely looking for the arm architecture, don't ask why, just look for it, haha! The objc-msg-arm64.s file is basically the one, because real iPhones run on the ARM architecture, and OC is implemented in C, C++ and assembly.
2.2 Viewing the Source Code
Now that you have found it, don't stay outside, go in and have a look.
Oh my god! This is the familiar yet strange assembly we learned at university! A demon 😈, a nightmare! Back then I only half understood it!
Hold on, hold on! Assembly is indeed hard to chew, but not impossible. You can't swallow it in one bite, so chew it slowly. Let's get it done!
3. Analyzing the Assembly
The assembly source starts at ENTRY _objc_msgSend and ends at END_ENTRY _objc_msgSend.
ENTRY _objc_msgSend
UNWIND _objc_msgSend, NoFrame
cmp p0, #0 // nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
b.le LNilOrTagged // (MSB tagged pointer looks negative)
#else
b.eq LReturnZero
#endif
ldr p13, [x0] // p13 = isa
GetClassFromIsa_p16 p13, 1, x0 // p16 = class
LGetIsaDone:
// calls imp or objc_msgSend_uncached
CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
b.eq LReturnZero // nil check
GetTaggedClass
b LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif
LReturnZero:
// x0 is already zero
mov x1, #0
movi d0, #0
movi d1, #0
movi d2, #0
movi d3, #0
ret
END_ENTRY _objc_msgSend
3.1 _objc_msgSend
- p0 is compared with nil, i.e. a check on whether the receiver exists; p0 holds the first parameter of objc_msgSend, the message receiver.
- The #if / #else checks whether tagged pointers are supported. If they are, jump to LNilOrTagged: when the tagged pointer (small object) is nil, return zero via LReturnZero; when it is not nil, extract its isa and continue to CacheLookup NORMAL.
- GetClassFromIsa_p16 is a macro that finds the corresponding class from the isa. ExtractISA is also a macro: it computes isa & ISA_MASK to obtain the class and assigns the class to p16.
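Before the macro definitions, here is a rough C rendering of the nil/tagged-pointer checks, assuming the arm64 MSB-tagged-pointer scheme that the comment "(MSB tagged pointer looks negative)" refers to; the bit layout is a private implementation detail and may change:
#include <stdbool.h>
#include <stdint.h>

// "cmp p0, #0 / b.le LNilOrTagged" in C terms (sketch only):
// a nil receiver is 0, and an MSB-tagged pointer looks negative when treated as signed.
static bool my_isNilOrTagged(const void *receiver) {
    return (intptr_t)receiver <= 0;
}

static bool my_isTaggedPointer(const void *receiver) {
    return (intptr_t)receiver < 0;   // MSB set => tagged pointer on arm64
}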
The GetClassFromIsa_p16 macro definition:
// p13(isa), 1, x0(isa)
//GetClassFromIsa_p16 macro definition
.macro GetClassFromIsa_p16 src, needs_auth, auth_address /* note: auth_address is not required if !needs_auth */
#if SUPPORT_INDEXED_ISA
// Indexed isa
mov p16, \src // optimistically set dst = src
tbz p16, #ISA_INDEX_IS_NPI_BIT, 1f // done if not non-pointer isa
// isa in p16 is indexed
adrp x10, _objc_indexed_classes@PAGE
add x10, x10, _objc_indexed_classes@PAGEOFF
ubfx p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS // extract index
ldr p16, [x10, p16, UXTP #PTRSHIFT] // load class from array
1:
#elif __LP64__
.if \needs_auth == 0 // _cache_getImp takes an authed class already
mov p16, \src
.else
// 64-bit packed isa
ExtractISA p16, \src, \auth_address
.endif
#else
// 32-bit raw isa
mov p16, \src
#endif
.endmacro
The ExtractISA macro definition:
.macro ExtractISA
and $0, $1, #ISA_MASK
.endmacro
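What ExtractISA does can be mirrored in C with a single mask. A minimal sketch, assuming the arm64 ISA_MASK value found in the objc4 source; in real code, object_getClass is the supported way to get the class:
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

#define DEMO_ISA_MASK 0x0000000ffffffff8ULL   // arm64 ISA_MASK in objc4 (assumption: matches your runtime version)

static void demoExtractISA(void) {
    NSObject *obj = [NSObject new];
    uintptr_t rawIsa = *(uintptr_t *)(__bridge void *)obj;        // first 8 bytes of the object are the isa
    Class cls = (__bridge Class)(void *)(rawIsa & DEMO_ISA_MASK); // isa & ISA_MASK -> class, like "and $0, $1, #ISA_MASK"
    NSLog(@"%@ --- %@", cls, object_getClass(obj));               // both should print NSObject
}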
3.2 CacheLookup
The core code of CacheLookup:
// NORMAL, _objc_msgSend, __objc_msgSend_uncached , MissLabelConstant
.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant
//
// Restart protocol:
//
// As soon as we're past the LLookupStart\Function label we may have
// loaded an invalid cache pointer or mask.
//
// When task_restartable_ranges_synchronize() is called,
// (or when a signal hits us) before we're past LLookupEnd\Function,
// then our PC will be reset to LLookupRecover\Function which forcefully
// jumps to the cache-miss codepath which have the following
// requirements:
//
// GETIMP:
// The cache-miss is just returning NULL (setting x0 to 0)
//
// NORMAL and LOOKUP:
// - x0 contains the receiver
// - x1 contains the selector
// - x16 contains the isa
// - other registers are set as per calling conventions
//
mov x15, x16 // stash the original isa
LLookupStart\Function:
// p1 = SEL, p16 = isa
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
ldr p10, [x16, #CACHE] // p10 = mask|buckets
lsr p11, p10, #48 // p11 = mask
and p10, p10, #0xffffffffffff // p10 = buckets
and w12, w1, w11 // x12 = _cmd & mask
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
ldr p11, [x16, #CACHE] // p11 = mask|buckets
#if CONFIG_USE_PREOPT_CACHES
#if __has_feature(ptrauth_calls)
tbnz p11, #0, LLookupPreopt\Function
and p10, p11, #0x0000ffffffffffff // p10 = buckets
#else
and p10, p11, #0x0000fffffffffffe // p10 = buckets
tbnz p11, #0, LLookupPreopt\Function
#endif
eor p12, p1, p1, LSR #7
and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd >> 7)) & mask
#else
// p11 cache -> p10 = buckets
// p11, LSR #48 -> mask
// p1(_cmd) & mask = index -> p12
and p10, p11, #0x0000ffffffffffff // p10 = buckets
and p12, p1, p11, LSR #48 // x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
ldr p11, [x16, #CACHE] // p11 = mask|buckets
and p10, p11, #~0xf // p10 = buckets
and p11, p11, #0xf // p11 = maskShift
mov p12, #0xffff
lsr p11, p12, p11 // p11 = mask = 0xffff >> p11
and p12, p1, p11 // x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif
// objc - source code debugging + assembly
// p11 cache -> p10 = buckets
// p1(_cmd) & mask = index -> p12
// (_cmd & mask) << 4 ->int 1 2 3 4 5 address ->int
// buckets (1, 2, 3, 4)
// b[i] -> b + i
// p13 = the bucket currently being searched
add p13, p10, p12, LSL #(1+PTRSHIFT)
// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))
// do {
// *bucket-- p17, p9
// imp -> p17, sel -> p9
// compare sel (p9) with p1 (_cmd)
1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel != _cmd) {
b.ne 3f // scan more
// } else {
2: CacheHit \Mode // hit: call or return imp
// }
3: cbz p9, \MissLabelDynamic // if (sel == 0) goto Miss;
cmp p13, p10 // } while (bucket >= buckets)
b.hs 1b
// wrap-around:
// p10 = first bucket
// p11 = mask (and maybe other bits on LP64)
// p12 = _cmd & mask
//
// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.
// So stop when we circle back to the first probed bucket
// rather than when hitting the first bucket again.
//
// Note that we might probe the initial bucket twice
// when the first probed slot is the last entry.
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
add p13, p10, w11, UXTW #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
// p13 = buckets + (mask << 1+PTRSHIFT)
// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
add p13, p10, p11, LSL #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = first probed bucket
// do {
4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel == _cmd)
b.eq 2b // goto hit
cmp p9, #0 // } while (sel != 0 &&
ccmp p13, p12, #0, ne // bucket > first_probed)
b.hi 4b
LLookupEnd\Function:
LLookupRecover\Function:
b \MissLabelDynamic
#if CONFIG_USE_PREOPT_CACHES
#if CACHE_MASK_STORAGE != CACHE_MASK_STORAGE_HIGH_16
#error config unsupported
#endif
LLookupPreopt\Function:
#if __has_feature(ptrauth_calls)
and p10, p11, #0x007ffffffffffffe // p10 = buckets
autdb x10, x16 // auth as early as possible
#endif
// x12 = (_cmd - first_shared_cache_sel)
adrp x9, _MagicSelRef@PAGE
ldr p9, [x9, _MagicSelRef@PAGEOFF]
sub p12, p1, p9
// w9 = ((_cmd - first_shared_cache_sel) >> hash_shift & hash_mask)
#if __has_feature(ptrauth_calls)
// bits 63..60 of x11 are the number of bits in hash_mask
// bits 59..55 of x11 is hash_shift
lsr x17, x11, #55 // w17 = (hash_shift, ...)
lsr w9, w12, w17 // >>= shift
lsr x17, x11, #60 // w17 = mask_bits
mov x11, #0x7fff
lsr x11, x11, x17 // p11 = mask (0x7fff >> mask_bits)
and x9, x9, x11 // &= mask
#else
// bits 63..53 of x11 is hash_mask
// bits 52..48 of x11 is hash_shift
lsr x17, x11, #48 // w17 = (hash_shift, hash_mask)
lsr w9, w12, w17 // >>= shift
and x9, x9, x11, LSR #53 // &= mask
#endif
ldr x17, [x10, x9, LSL #3] // x17 == sel_offs | (imp_offs << 32)
cmp x12, w17, uxtw
.if \Mode == GETIMP
b.ne \MissLabelConstant // cache miss
sub x0, x16, x17, LSR #32 // imp = isa - imp_offs
SignAsImp x0
ret
.else
b.ne 5f // cache miss
sub x17, x16, x17, LSR #32 // imp = isa - imp_offs
.if \Mode == NORMAL
br x17
.elseif \Mode == LOOKUP
orr x16, x16, #3 // for instrumentation, note that we hit a constant cache
SignAsImp x17
ret
.else
.abort unhandled mode \Mode
.endif
5: ldursw x9, [x10, #-8] // offset -8 is the fallback offset
add x16, x16, x9 // compute the fallback isa
b LLookupStart\Function // lookup again with a new isa
.endif
#endif // CONFIG_USE_PREOPT_CACHES
.endmacro
- From the class's first address, offset by 16 bytes to reach cache (in objc_class, cache is exactly 16 bytes from the start: isa occupies 8 bytes and superclass occupies 8 bytes). In cache, the high 16 bits store mask and the low 48 bits store buckets, i.e. p11 = mask|buckets.
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
ldr p11, [x16, #CACHE] // p11 = mask|buckets
- From cache, extract buckets and mask separately, and compute the hash index from mask using the hash algorithm. In the arm64 environment, mask and buckets are packed together into 8 bytes (64 bits): mask occupies the high 16 bits and buckets the low 48 bits. AND-ing (&) with 0x0000fffffffffffe erases the high 16 bits and yields buckets, which is assigned to p10, i.e. p10 = buckets. Shifting cache right by 48 bits yields mask.
#if CONFIG_USE_PREOPT_CACHES
#if __has_feature(ptrauth_calls)
tbnz p11, #0, LLookupPreopt\Function
and p10, p11, #0x0000ffffffffffff // p10 = buckets
#else
// Get buckets
and p10, p11, #0x0000fffffffffffe // p10 = buckets
tbnz p11, #0, LLookupPreopt\Function
#endif
// This part is in the cache_hash algorithm
eor p12, p1, p1, LSR #7
and p12, p12, p11, LSR #48 // x12 = (_cmd ^ (_cmd >> 7)) & mask
#else
and p10, p11, #0x0000ffffffffffff // p10 = buckets
and p12, p1, p11, LSR #48 // x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
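In C terms the extraction is just a load at offset 16 plus two bit operations. A hedged sketch mirroring the CACHE_MASK_STORAGE_HIGH_16 case above (the field layout is private to objc4 and may change; names here are placeholders):
#include <stdint.h>

// Sketch of "ldr p11, [x16, #CACHE]" and the masks/shifts that follow.
// cls points at an objc_class: isa (8 bytes) + superclass (8 bytes) + cache.
typedef struct demo_split { uintptr_t buckets; uintptr_t mask; } demo_split;

static demo_split demoSplitCache(uintptr_t cls) {
    uintptr_t cache = *(uintptr_t *)(cls + 16);   // #CACHE == 16: skip isa and superclass
    demo_split s;
    s.buckets = cache & 0x0000ffffffffffffULL;    // low 48 bits -> buckets pointer
    s.mask    = cache >> 48;                      // high 16 bits -> mask
    return s;
}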
- AND objc_msgSend's parameter p1 (the second parameter, _cmd) with mask to obtain, via the hash algorithm, the index of the bucket where the sel-imp pair being looked up is stored, i.e. p12 = index = _cmd & mask. This is because when the system stores a sel-imp pair it first hashes the selector to get the index and then stores it there, so reading has to go through the same path.
static inline mask_t cache_hash(SEL sel, mask_t mask)
{
uintptr_t value = (uintptr_t)sel;
#if CONFIG_USE_PREOPT_CACHES
value ^= value >> 7;
#endif
return (mask_t)(value & mask);
}
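A hedged illustration of the index arithmetic: with a capacity of 8 the mask is 7, and the SEL value below is made up purely for the demonstration:
#include <stdint.h>
#include <stdio.h>

typedef uint32_t mask_t;

// Same arithmetic as cache_hash above, with a fake SEL value.
static mask_t demo_cache_hash(uintptr_t sel, mask_t mask) {
    uintptr_t value = sel;
    value ^= value >> 7;                 // the CONFIG_USE_PREOPT_CACHES mixing step
    return (mask_t)(value & mask);
}

int main(void) {
    uintptr_t fakeSel = 0x7fff7b9f5ddcUL;                 // made-up address, for illustration only
    printf("index = %u\n", demo_cache_hash(fakeSel, 7));  // capacity 8 => mask 7
    return 0;
}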
Obtain the bucket corresponding to the hash index from the starting address plus the actual offset.
- We now know the index and the starting address of buckets, but where exactly is the bucket for _cmd? We can translate by memory address: a bucket_t stores an imp and a sel, 8 + 8 = 16 bytes.
add p13, p10, p12, LSL #(1+PTRSHIFT)
// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))
Shift the index (p12) left by (1 + PTRSHIFT) = 4 bits (PTRSHIFT = 3), i.e. multiply it by 16, and add it to the buckets base address to get the address of the current bucket. From the obtained bucket, imp is loaded into p17 (p17 = imp) and sel into p9 (p9 = sel).
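The same address computation in C, as a sketch (the two 8-byte fields of a bucket follow the post's own description; names here are placeholders):
#include <stdint.h>
#include <objc/objc.h>

// One cache slot: imp + sel, 8 + 8 = 16 bytes, so index << 4 is the byte offset.
typedef struct demo_bucket { uintptr_t imp; SEL sel; } demo_bucket;

static demo_bucket *demoBucketAt(demo_bucket *buckets, uintptr_t index) {
    // equivalent to: p13 = buckets + (index << (1 + PTRSHIFT)), with PTRSHIFT == 3
    return (demo_bucket *)((uintptr_t)buckets + (index << 4));
    // or simply: return buckets + index;  -- pointer arithmetic does the *16 for us
}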
1: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel != _cmd) {
b.ne 3f // scan more
// } else {
2: CacheHit \Mode // hit: call or return imp
// }
3: cbz p9, \MissLabelDynamic // if (sel == 0) goto Miss;
cmp p13, p10 // } while (bucket >= buckets)
b.hs 1b
- cmp p9, p1: if the sel just loaded matches the sel being looked up, the cache is hit, i.e. CacheHit.
- If it does not match, enter flow 3: check whether the loaded sel (p9) is empty; if it is empty, the cache misses (Miss).
- If the loaded sel is not empty, there is an index collision. Compare the current bucket's address with the first bucket's address; as long as it is greater than or equal to the first address, keep comparing, searching forward (toward lower addresses) in a loop until the first address is reached.
- If the loop above still finds nothing, the following flow is entered. In the CACHE_MASK_STORAGE_HIGH_16 environment, p11 is again shifted right to recover mask. Since mask equals the total allocated capacity minus 1, the last slot sits at the first address plus mask * 16, so p13 now points at the largest, i.e. last, slot.
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
add p13, p10, w11, UXTW #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
add p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
// p13 = buckets + (mask << 1+PTRSHIFT)
// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
add p13, p10, p11, LSL #(1+PTRSHIFT)
// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
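In other words, since mask = capacity - 1 and each slot is 16 bytes, the last slot sits at buckets + mask * 16. A one-line sketch:
#include <stdint.h>

// p13 = buckets + (mask << (1 + PTRSHIFT)), i.e. the address of the last slot
static uintptr_t demoLastBucket(uintptr_t buckets, uintptr_t mask) {
    return buckets + (mask << 4);
}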
p12 holds the storage index of _cmd. Add the offset index * 16 to the buckets base address and store the result in p12: this is the first probed bucket.
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = first probed bucket
This loop starts from the last slot and scans backwards toward the bucket that was probed first.
#endif
add p12, p10, p12, LSL #(1+PTRSHIFT)
// p12 = first probed bucket
// do {
4: ldp p17, p9, [x13], #-BUCKET_SIZE // {imp, sel} = *bucket--
cmp p9, p1 // if (sel == _cmd)
b.eq 2b // goto hit
cmp p9, #0 // } while (sel != 0 &&
ccmp p13, p12, #0, ne // bucket > first_probed)
b.hi 4b
LLookupEnd\Function:
LLookupRecover\Function:
b \MissLabelDynamic
cmp p9, p1: if the sel just loaded is the same as the sel being looked up, jump to flow 2, i.e. CacheHit.
If not, check whether sel is empty. If it is not empty and the current bucket address is still greater than p12 (the first probed bucket), continue the loop.
If none of the above hits the cache, the MissLabelDynamic flow is entered.
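Putting the two loops together, here is a rough C rendering of the scan logic above, a sketch of the control flow only, not the real objc4 code (demoCacheHit and demoUncached are placeholders standing in for CacheHit and __objc_msgSend_uncached):
#include <stdint.h>
#include <objc/objc.h>

typedef struct demo_bucket { uintptr_t imp; SEL sel; } demo_bucket;

extern uintptr_t demoCacheHit(demo_bucket *b);   // placeholder for CacheHit
extern uintptr_t demoUncached(void);             // placeholder for __objc_msgSend_uncached

static uintptr_t demoCacheLookup(demo_bucket *buckets, uintptr_t mask, SEL _cmd, uintptr_t first) {
    // first pass: from the hashed slot down to slot 0
    for (intptr_t i = (intptr_t)first; i >= 0; i--) {
        if (buckets[i].sel == _cmd) return demoCacheHit(&buckets[i]);  // 2: CacheHit
        if (buckets[i].sel == 0)    return demoUncached();             // empty slot => miss
    }
    // wrap-around: from the last slot (index == mask) down to just above the first probed slot
    for (uintptr_t i = mask; i > first; i--) {
        if (buckets[i].sel == _cmd) return demoCacheHit(&buckets[i]);
        if (buckets[i].sel == 0)    break;                             // sel == 0 => give up
    }
    return demoUncached();                                             // MissLabelDynamic
}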
3.3 CacheHit
Here is an analysis of CacheHit.
// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa
.macro CacheHit
.if $0 == NORMAL
TailCallCachedImp x17, x10, x1, x16 // authenticate and call imp
.elseif $0 == GETIMP
mov p0, p17
cbz p0, 9f // don't ptrauth a nil imp
AuthAndResignAsIMP x0, x10, x1, x16 // authenticate imp and re-sign as IMP
9: ret // return IMP
.elseif $0 == LOOKUP
// No nil check for ptrauth: the caller would crash anyway when they
// jump to a nil IMP. We don't care if that jump also fails ptrauth.
AuthAndResignAsIMP x17, x10, x1, x16 // authenticate imp and re-sign as IMP
cmp x16, x15
cinc x16, x16, ne // x16 += 1 when x15 != x16 (for instrumentation; fallback to the parent class)
ret // return imp via x17
.else
.abort oops
.endif
.endmacro
// call the imp
.macro TailCallCachedImp
// $0 = cached imp, $1 = address of cached imp, $2 = SEL, $3 = isa
eor $0, $0, $3
br $0
.endmacro
In CacheLookup, Mode is passed in as NORMAL, so TailCallCachedImp is executed. Because the imp was encoded when it was stored, it has to be decoded before the call can be made.
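The encoding is just an XOR with the class, as the eor $0, $0, $3 in TailCallCachedImp shows; a minimal sketch (builds with pointer authentication do more than this):
#include <stdint.h>

// Storing:  cached = imp ^ (uintptr_t)cls
// Calling:  imp    = cached ^ (uintptr_t)cls   (XOR with the same value undoes it)
static uintptr_t demoDecodeImp(uintptr_t cachedImp, uintptr_t cls) {
    return cachedImp ^ cls;
}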
If the cache does not hit, the MissLabelDynamic flow is entered. MissLabelDynamic is the third parameter of CacheLookup:
// NORMAL, _objc_msgSend, __objc_msgSend_uncached , MissLabelConstant
.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant
It is the __objc_msgSend_uncached passed in by _objc_msgSend.
Globally search the source project for __objc_msgSend_uncached
Analysis: in this function the macro MethodTableLookup is executed, so we continue tracing MethodTableLookup. In its assembly implementation we see the key _lookUpImpOrForward call. A global search for an assembly implementation of _lookUpImpOrForward turns up nothing, which means the method is not implemented in assembly and has to be found in the C/C++ source.
At this point, the analysis of the assembly fast lookup in the message sending process ends, because lookUpImpOrForward is implemented not in assembly but in C/C++ and therefore belongs to the slow lookup. The lookUpImpOrForward slow lookup will be analyzed next time.
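For reference, in recent objc4 releases the C-side entry point is declared roughly like this (parameter names and the behavior flags may differ between versions, so treat it as an approximation):
#include <objc/runtime.h>

// objc-runtime-new.mm (objc4): rough shape of the slow-path entry point.
// `behavior` is a bitmask such as LOOKUP_INITIALIZE | LOOKUP_RESOLVER.
IMP lookUpImpOrForward(id inst, SEL sel, Class cls, int behavior);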
4. To summarize
- Why is this low-level path written in assembly rather than C or C++?
- Assembly is closer to machine language and operates on registers directly, so the lookup is highly efficient.
- Some methods have unknown parameters; assembly can handle unknown parameters and is a bit more dynamic.
- An objc_msgSend function call's execution can be roughly divided into three stages:
- Message sending process (1. assembly fast lookup, 2. slow lookup)
- Dynamic method resolution process
- Message forwarding process
- Flow chart of message sending:
More content continues to be updated
🌹 If you like this, give it a thumbs up 👍🌹
🌹 If you learned something, bookmark + follow, comment + share, so you can find me next time 😁🌹
🌹 Everyone is welcome to leave comments, exchange ideas, and point out mistakes, so we can learn from each other 😁 and improve 🌹