Welcome to the iOS low-level series (reading in order is recommended)
iOS Low-Level – Exploring alloc and init
iOS Low-Level – isa ties everything together
iOS Low-Level – The nature of classes
iOS Low-Level – cache_t process analysis
iOS Low-Level – Method lookup process analysis
iOS Low-Level – Message forwarding process analysis
iOS Low-Level – How dyld loads an app
iOS Low-Level – Class loading analysis
iOS Low-Level – Category loading analysis
About this article
This article focuses on the underlying nature of methods, the different ways a message can be sent, and the method lookup process. Combined with cache_t, it aims to give a fuller picture of how message sending works.
An interview pitfall
Here is an interview question:
Why can calling a class method on a subclass end up running an instance method of NSObject?
Without digging into the method lookup process, this question is easy to get stuck on. The analysis of the lookup process follows, and the answer is given at the end.
A brief look at the runtime
The previous article explained that cache_t caches methods. But what exactly is a method, and what happens when one is called? Both questions are closely tied to the runtime.
A. What is the runtime
Objective-C has dynamic, runtime features, but Objective-C code is ultimately compiled down to static languages such as C and C++, which have no such features of their own. To fill the gap, a set of APIs written in C, C++, and assembly provides these runtime capabilities to Objective-C. That set of APIs is what we call the runtime.
B. Runtime versions
The runtime comes in two versions:
- legacy
- modern
The underlying source code uses !__OBJC2__ and __OBJC2__ to tell them apart. Current systems generally use __OBJC2__, so the legacy version can mostly be ignored.
C. Ways to call into the runtime
There are only three ways to call into the runtime:
- Objective-C code (example: @selector())
- NSObject methods (example: performSelector:)
- Runtime API (example: sel_registerName())
The nature of a method
A method is really just a piece of code sitting quietly in class_rw_t. Strictly speaking, what we want to understand here is the nature of calling a method.
Create a CJPerson class, initialize an instance, call a method on it, and then call a custom C function.
void play() {
    NSLog(@"%s", __func__);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        CJPerson *person = [CJPerson alloc];
        [person work];
        play();
    }
    return 0;
}
When exploring the nature of classes, we compiled the source with clang, and we will do the same here:
clang -rewrite-objc main.m -o main.cpp
Open main.cpp and go straight to the end
int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool;
        CJPerson *person = ((CJPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("CJPerson"), sel_registerName("alloc"));
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("work"));
        play();
    }
    return 0;
}
Tidy it up by removing the type casts:
int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool;
        CJPerson *person = objc_msgSend(objc_getClass("CJPerson"), sel_registerName("alloc"));
        objc_msgSend(person, sel_registerName("work"));
        play();
    }
    return 0;
}
As you can see, calling the method sends a message through objc_msgSend, but calling the play() function does not send a message.
In fact, sending a message is the process of finding the function implementation, the imp. The call to play() goes through a function pointer straight to the implementation, so no message needs to be sent.
objc_msgSend has two fixed parameters:
- id: the message receiver
- SEL: the method selector
With these two parameters, the runtime can reach the class (cls) through the receiver's isa, then use key & mask, where the key is derived from the SEL, to look up the method in that class's cache_t, assuming it has already been cached. This should be familiar from the cache_t process analysis article.
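To make the contrast concrete, here is a minimal C sketch of the idea. It is not the runtime's actual code: toy_object, toy_class, toy_method, and toy_msg_send are hypothetical names invented for illustration, and selectors are modeled as plain strings. A message send first has to find the imp in the receiver's class, while play() is reached directly by its address.

```c
#include <string.h>

/* Hypothetical stand-ins for the runtime's types; not the real objc API. */
typedef void (*toy_imp)(void *self, const char *sel);

struct toy_method { const char *sel; toy_imp imp; };
struct toy_class  { const struct toy_method *methods; int count; };
struct toy_object { const struct toy_class *isa; int work_calls; };

/* The "implementation" behind the selector "work". */
static void person_work(void *self, const char *sel) {
    (void)sel;
    ((struct toy_object *)self)->work_calls++;
}

/* A plain C function: the call site already knows its address. */
int play(void) { return 42; }

/* Toy message send: find the imp for sel in the receiver's class, then call it.
 * Returns 1 if an implementation was found, 0 otherwise. */
int toy_msg_send(struct toy_object *receiver, const char *sel) {
    const struct toy_class *cls = receiver->isa;
    for (int i = 0; i < cls->count; i++) {
        if (strcmp(cls->methods[i].sel, sel) == 0) {
            cls->methods[i].imp(receiver, sel);
            return 1;
        }
    }
    return 0; /* the real runtime would continue with the slow path here */
}

const struct toy_method person_methods[] = { { "work", person_work } };
const struct toy_class  person_class     = { person_methods, 1 };
```

Calling toy_msg_send(&p, "work") on an object whose isa points at person_class runs person_work after a search, whereas play() involves no search at all.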
Variations in message sending
In day-to-day development, methods are generally called in four ways: an instance method of this class, a class method of this class, an instance method of the parent class, and a class method of the parent class.
To verify each one, create a CJStudent class that inherits from CJPerson, with both classes declaring their own instance and class methods. Make the calls inside CJStudent, then compile with clang. (Ignore the infinite recursion; we only want the compiled output and will not run the code.)
- (void)study {
    [super work];     // parent class instance method
    [self study];     // this class instance method
}

+ (void)play {
    [super buy];      // parent class class method
    [CJStudent play]; // this class class method
}
The relevant part of the clang output (simplified):
static void _I_CJStudent_study(CJStudent * self, SEL _cmd) {
    // parent class instance method
    objc_msgSendSuper({self, class_getSuperclass(objc_getClass("CJStudent"))}, sel_registerName("work"));
    // this class instance method
    objc_msgSend(self, sel_registerName("study"));
}

static void _C_CJStudent_play(Class self, SEL _cmd) {
    // parent class class method
    objc_msgSendSuper({self, class_getSuperclass(objc_getMetaClass("CJStudent"))}, sel_registerName("buy"));
    // this class class method
    objc_msgSend(objc_getClass("CJStudent"), sel_registerName("play"));
}
The results are summarized below:
Method type | Underlying call | Message receiver | Superclass passed |
---|---|---|---|
Instance method of this class | objc_msgSend | self | None |
Class method of this class | objc_msgSend | self.class | None |
Instance method of the parent class | objc_msgSendSuper | self | Superclass of the class |
Class method of the parent class | objc_msgSendSuper | self | Superclass of the metaclass |
As you can see, the most obvious difference is between objc_msgSend and objc_msgSendSuper.
So the variation in message sending comes mainly from objc_msgSendSuper. We talk about objc_msgSend all the time, but what is objc_msgSendSuper? Jumping to its declaration shows that, compared with objc_msgSend, the main difference is the first parameter, objc_super:
struct objc_super {
    __unsafe_unretained _Nonnull id receiver;
#if !defined(__cplusplus) && !__OBJC2__
    __unsafe_unretained _Nonnull Class class;
#else
    __unsafe_unretained _Nonnull Class super_class;
#endif
};
objc_super is a struct with two members. The first is the receiver, of type id. Because the current runtime is __OBJC2__, the second is Class super_class.
Now that the parameters are clear, the conclusions above are easy to understand:
- An instance method of the parent class is looked up in the parent class's method list.
- A class method of the parent class is looked up in the method list of the parent class's metaclass.
- For a class method of this class, the receiver is the class.
- Note that a call through super uses objc_msgSendSuper, which tells the system to start looking in the parent class's method list, but the receiver is still self.
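The last point is easy to misread, so here is a small C model of it. This is only a sketch under simplifying assumptions: toy_class, toy_object, toy_super, toy_send, and toy_send_super are hypothetical names, each class carries a single method, and selectors are plain strings. toy_super mirrors the shape of objc_super (a receiver plus the class where the search starts).

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical miniature model; names are illustrative, not the objc API. */
struct toy_object;
typedef const char *(*toy_imp)(struct toy_object *self);

struct toy_class {
    const char *name;
    const struct toy_class *superclass;
    const char *sel;   /* one method per class keeps the sketch short */
    toy_imp imp;
};

struct toy_object { const struct toy_class *isa; };

/* Same shape as objc_super: the receiver plus the class to start searching from. */
struct toy_super { struct toy_object *receiver; const struct toy_class *super_class; };

static toy_imp lookup_from(const struct toy_class *cls, const char *sel) {
    for (; cls != NULL; cls = cls->superclass)
        if (cls->sel != NULL && strcmp(cls->sel, sel) == 0) return cls->imp;
    return NULL;
}

/* Like objc_msgSend: the search starts at the receiver's own class. */
const char *toy_send(struct toy_object *self, const char *sel) {
    toy_imp imp = lookup_from(self->isa, sel);
    return imp ? imp(self) : NULL;
}

/* Like objc_msgSendSuper: the search starts at super_class, but self is unchanged. */
const char *toy_send_super(struct toy_super *sup, const char *sel) {
    toy_imp imp = lookup_from(sup->super_class, sel);
    return imp ? imp(sup->receiver) : NULL;
}

/* The parent's implementation reports the class of whatever self it was given. */
static const char *person_work(struct toy_object *self)  { return self->isa->name; }
static const char *student_work(struct toy_object *self) { (void)self; return "student override"; }

const struct toy_class CJPerson  = { "CJPerson",  NULL,      "work", person_work };
const struct toy_class CJStudent = { "CJStudent", &CJPerson, "work", student_work };
```

With a student object, toy_send picks the student's own override, while toy_send_super runs the parent's implementation and that implementation still sees a self whose class is CJStudent.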
Method lookup process
1. Find an entry point
At this point, everything traces back to objc_msgSend. But a new problem arises: there are so many source trees, so which one should we read?
Here is one approach:
From what we know so far, calling a method executes objc_msgSend, so set a symbolic breakpoint on objc_msgSend right before the method is called and see where it stops.
The breakpoint shows that objc_msgSend lives in the objc source code.
Happily open the objc source and search for objc_msgSend. There are more than 600 matches, which is overwhelming. This road looks blocked, so try a different search keyword.
Think about it another way: objc_msgSend is something that gets called, and a call generally has the form name(), so search for objc_msgSend( instead. The results now fall into only two groups: .h files and assembly files.
The .h files can be ruled out first, because the implementation and its call sites cannot live in a header. That leaves the assembly. Could objc_msgSend really be implemented in assembly at the bottom layer?
Thinking it over, objc_msgSend takes a variable number of arguments, which a static language such as C cannot handle effectively, so an assembly implementation is plausible.
After researching this from several angles, it is confirmed that the fast lookup of objc_msgSend is implemented in assembly, for the following reasons:
- In C, a function cannot preserve unknown arguments and then jump to an arbitrary function pointer.
- objc_msgSend is a low-level routine that is called extremely often, so its performance requirements are high and it must be fast enough.
- Implementing it in assembly makes it harder to hook system functions, which improves security.
2. Fast lookup
Now that we know objc_msgSend is implemented in assembly, there is no choice but to read the assembly. Here we pick the common arm64 architecture. Reading assembly generally starts from an ENTRY, so find ENTRY _objc_msgSend and start exploring from there.
1. x0 to x7 hold the arguments, and x0 also holds the return value.
2. Check whether self is nil or a tagged pointer. A nil receiver returns immediately, and a tagged pointer takes a separate path to find its class.
3. Load the value at the address in x0 (the first argument, id) into p13. This is the isa.
4. Get the class from the isa through ISA_MASK and put it in p16, because methods are looked up in the class.
5. Once the class has been obtained from the isa, check the cache first. This is where the fast lookup begins.
Let's expand on steps 4 and 5.
(4): GetClassFromIsa_p16 obtains the class internally through ISA_MASK.
(5): CacheLookup NORMAL
 * CacheLookup NORMAL|GETIMP|LOOKUP
 *
 * Locate the implementation for a selector in a class method cache.
 *
 * Takes:
 *   x1 = selector                 // the second argument, sel
 *   x16 = class to be searched    // obtained through isa
CacheLookup takes one of three modes: NORMAL, GETIMP, or LOOKUP.
#define SUPERCLASS   __SIZEOF_POINTER__
#define CACHE        (2 * __SIZEOF_POINTER__)
1. Offset x16 (the class) by CACHE, a macro that evaluates to 16 bytes, to reach cache_t, and load its contents into p10 and p11. p10 holds buckets; p11 holds mask in its low 4 bytes and occupied in its high 4 bytes.
struct cache_t {
    struct bucket_t *_buckets;  // 8 bytes
    mask_t _mask;               // 4 bytes
    mask_t _occupied;           // 4 bytes
};
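The offsets used by the assembly follow directly from the struct layout. The C sketch below restates that layout with hypothetical types (toy_class, toy_cache, and toy_bucket are illustrative, and mask_t is assumed to be a 32-bit unsigned integer) and lets the compiler report the offsets, so the 16-byte CACHE offset on a 64-bit target can be checked rather than taken on faith.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t mask_t;           /* assumed 32-bit, matching the w registers */

struct toy_bucket;                 /* opaque in this sketch */

struct toy_cache {
    struct toy_bucket *buckets;    /* one pointer: 8 bytes on a 64-bit target */
    mask_t mask;                   /* 4 bytes */
    mask_t occupied;               /* 4 bytes */
};

struct toy_class {
    void *isa;                     /* offset 0 */
    void *superclass;              /* offset __SIZEOF_POINTER__ (SUPERCLASS) */
    struct toy_cache cache;        /* offset 2 * __SIZEOF_POINTER__ (CACHE) */
};

/* Offsets as the compiler sees them. */
size_t superclass_offset(void)    { return offsetof(struct toy_class, superclass); }
size_t cache_offset(void)         { return offsetof(struct toy_class, cache); }
size_t mask_offset_in_cache(void) { return offsetof(struct toy_cache, mask); }
```

On a 64-bit target cache_offset() evaluates to 16, and the mask sits right after the 8-byte buckets pointer, which is why a single ldp can fetch buckets into one register and mask plus occupied into the other.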
2. Compute w1 (_cmd) & w11 (mask) to get w12, the hash index of the method. The w registers are used because mask is 32 bits wide; on a little-endian machine the low 4 bytes of x11 are the mask.
static inline mask_t cache_hash(cache_key_t key, mask_t mask) {
    return (mask_t)(key & mask);
}
3. Shift the index to compute the address of the target bucket, then load from the bucket at x12: the imp goes into p17 and the sel into p9.
4. Compare the sel in the bucket with the _cmd passed in. If they are not equal, branch to 2f, run CheckMiss, and keep scanning the buckets; if they are equal, it is a CacheHit.
5. CheckMiss uses cbz to test whether the sel is 0. If it is 0, the method is not cached, and the slow path __objc_msgSend_uncached begins. If it is not 0, the code checks whether bucket == buckets and branches back to 1b to look again, which guards against another thread updating the cache in the meantime.
.macro CheckMiss
    // miss if bucket->sel == 0
.if $0 == GETIMP
    cbz p9, LGetImpMiss
.elseif $0 == NORMAL
    cbz p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP
    cbz p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro
If the scan ends without a hit, branch to JumpMiss, which in turn branches to __objc_msgSend_uncached.
.macro JumpMiss
.if $0 == GETIMP
    b LGetImpMiss
.elseif $0 == NORMAL
    b __objc_msgSend_uncached
.elseif $0 == LOOKUP
    b __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro
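The probing loop is easier to follow in C than in assembly. The sketch below is a simplified model, not a transcription of the assembly: toy_cache, toy_bucket, toy_cache_lookup, and toy_cache_fill are hypothetical names, the selector is modeled as an integer key, and the loop is bounded by the table size so it always terminates. It keeps the three outcomes described above: a hit, a miss on an empty bucket, and a miss after scanning everything.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t mask_t;
typedef uintptr_t toy_key;                         /* stands in for the SEL address */
typedef void (*toy_imp)(void);

struct toy_bucket { toy_imp imp; toy_key sel; };   /* loaded as the pair {imp, sel} */
struct toy_cache  { struct toy_bucket *buckets; mask_t mask; mask_t occupied; };

static mask_t toy_cache_hash(toy_key key, mask_t mask) {
    return (mask_t)(key & mask);
}

/* Returns the cached imp, or NULL on a miss (where the assembly would branch
 * to the _uncached slow path). */
toy_imp toy_cache_lookup(const struct toy_cache *cache, toy_key sel) {
    mask_t i = toy_cache_hash(sel, cache->mask);
    for (mask_t probes = 0; probes <= cache->mask; probes++) {
        const struct toy_bucket *b = &cache->buckets[i];
        if (b->sel == sel) return b->imp;          /* CacheHit */
        if (b->sel == 0)   return NULL;            /* CheckMiss: empty bucket */
        i = (i == 0) ? cache->mask : i - 1;        /* step backwards, wrap at the start */
    }
    return NULL;                                   /* JumpMiss: scanned every bucket */
}

/* Stores an entry using the same probe sequence. Returns 0 if the table is full. */
int toy_cache_fill(struct toy_cache *cache, toy_key sel, toy_imp imp) {
    mask_t i = toy_cache_hash(sel, cache->mask);
    for (mask_t probes = 0; probes <= cache->mask; probes++) {
        struct toy_bucket *b = &cache->buckets[i];
        if (b->sel == 0 || b->sel == sel) {
            if (b->sel == 0) cache->occupied++;
            b->sel = sel;
            b->imp = imp;
            return 1;
        }
        i = (i == 0) ? cache->mask : i - 1;
    }
    return 0;
}

/* Two placeholder implementations to store in the cache. */
void sample_work(void) {}
void sample_play(void) {}
```

With a mask of 3, the keys 0x10 and 0x14 both hash to index 0, so the second one is stored after wrapping around to index 3, and a later lookup finds it by following the same backwards probe.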
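__objc_msgLookup_uncached
When the cache has no match, control reaches one of the _uncached entry points. The listing below is __objc_msgLookup_uncached; __objc_msgSend_uncached, which the NORMAL path branches to, is likewise built around MethodTableLookup.
STATIC_ENTRY __objc_msgLookup_uncached
UNWIND __objc_msgLookup_uncached, FrameWithNoSaves
    MethodTableLookup
    ret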
The core of __objc_msgLookup_uncached is MethodTableLookup. Is the method table lookup also done in assembly? Keep reading:
.macro MethodTableLookup

    // push frame
    SignLR
    stp fp, lr, [sp, #-16]!
    mov fp, sp

    // save parameter registers: x0..x8, q0..q7
    sub sp, sp, #(10*8 + 8*16)
    stp q0, q1, [sp, #(0*16)]
    stp q2, q3, [sp, #(2*16)]
    stp q4, q5, [sp, #(4*16)]
    stp q6, q7, [sp, #(6*16)]
    stp x0, x1, [sp, #(8*16+0*8)]
    stp x2, x3, [sp, #(8*16+2*8)]
    stp x4, x5, [sp, #(8*16+4*8)]
    stp x6, x7, [sp, #(8*16+6*8)]
    str x8,     [sp, #(8*16+8*8)]

    // receiver and selector already in x0 and x1
    mov x2, x16
    bl __class_lookupMethodAndLoadCache3
It is fairly long, and much of the first part is hard to follow in detail, but it is evidently saving registers and preparing arguments, which does not affect the overall flow. Once the arguments are ready, it branches straight to __class_lookupMethodAndLoadCache3.
As before, keep searching, this time for __class_lookupMethodAndLoadCache3.
All the results are call sites; there is no implementation. You might conclude that Apple has not open-sourced it, in which case the exploration would end here.
When you feel stuck, calm down and think: __class_lookupMethodAndLoadCache3 comes after __objc_msgLookup_uncached, which in turn comes after objc_msgSend. So set the objc_msgSend symbolic breakpoint again and check whether the assembly really makes these calls.
Stepping on from objc_msgSend, we do arrive at _objc_msgLookup_uncached. Next, check whether _objc_msgLookup_uncached calls __class_lookupMethodAndLoadCache3.
It does make the call, but on closer inspection the symbol is _class_lookupMethodAndLoadCache3, with one fewer leading underscore, at objc-runtime-new.mm line 4846.
This feels like discovering a new continent: what is really being called is _class_lookupMethodAndLoadCache3. Search for it directly, and its implementation turns up as expected.
This is where the code jumps from assembly into C, because the slow lookup is about to begin. It also explains, in hindsight, why the arguments were prepared before calling _class_lookupMethodAndLoadCache3:
- C and C++ are static languages that need a fixed parameter list, so the arguments must be set up in advance.
That is the fast lookup of objc_msgSend, also known as the cache lookup. In short, the fast lookup searches the cache in cache_t, and a cache hit ends the process immediately. If nothing is found after scanning the cache, it prepares for the slow lookup and jumps into it.
3. Slow lookup
Go straight to _class_lookupMethodAndLoadCache3, which does nothing more than call lookUpImpOrForward.
IMP _class_lookupMethodAndLoadCache3(id obj, SEL sel, Class cls)
{
    return lookUpImpOrForward(cls, sel, obj,
                              YES/*initialize*/, NO/*cache*/, YES/*resolver*/);
}
lookUpImpOrForward is fairly long. Below it is split into a preparation part and a lookup part, followed by verification and conclusions.
A. Preparation
1. Check the cache parameter. If a cache lookup is requested, try to get the imp directly from cls and sel, and return it if found.
2. Check the class information:
a. Check the given class against the list of all known classes. If there is a problem, an exception is raised internally.
b. Determine whether the class has been realized, and realize it if not. This part is analyzed in detail in the later article on class loading.
c. Check whether the class has been initialized, and initialize it if not.
B. Lookup
The lookup code is still fairly long and does not fit on one screen, so it is split into two parts.
1. With the class ready, check the cache again (Objective-C is a dynamic language, so things can change at any time). If there is a hit, get the imp directly from cls and sel and return it.
2. Search the method list of this class (using binary search). If meth is found, fill it into the cache and return. The extra {} around this code forms a local scope so that the meth variable name does not clash.
3. If this class has no match, walk up the superclass chain and check each superclass's cache first:
a. If an imp exists and is not the message-forwarding imp, fill it into the cache first, then return.
b. If an imp exists but is the message-forwarding imp, stop searching. Do not cache it; move on to method resolution instead.
4. If the superclass's cache has no match, search the superclass's method list. If meth is found, fill it into the cache first, then return.
5. If no imp is found after walking through all the superclasses, start the method resolution and forwarding process, which is attempted only once. Forwarding is analyzed in detail in the next article.
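Step 2 mentions binary search over the method list. The sketch below shows the general technique on a list sorted by selector value; it is an illustration under stated assumptions, not the runtime's own function. toy_method and toy_find_sorted are hypothetical names, and the selector is modeled as an integer so that entries can be ordered. It returns the first matching entry when duplicates exist.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*toy_imp)(void);

/* The list is assumed to be sorted by sel in ascending order. */
struct toy_method { uintptr_t sel; toy_imp imp; };

/* Returns the first entry whose sel matches, or NULL if there is none. */
const struct toy_method *toy_find_sorted(const struct toy_method *list,
                                         size_t count, uintptr_t sel) {
    size_t lo = 0, hi = count;              /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid].sel < sel) {
            lo = mid + 1;                   /* target is strictly to the right */
        } else {
            hi = mid;                       /* target is at mid or to the left */
        }
    }
    return (lo < count && list[lo].sel == sel) ? &list[lo] : NULL;
}
```

Because the range shrinks towards the leftmost candidate, a list holding the same selector twice yields the earlier entry, and the number of comparisons grows only logarithmically with the size of the list.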
That is the slow lookup of objc_msgSend. In short, the slow lookup walks a chain from this class up through its superclasses to NSObject. If the method is found in this class's method list, it is filled into the cache. Otherwise each superclass's cache and method list are searched in turn, and a hit is filled into the cache. If nothing is found anywhere, the message is finally forwarded.
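The whole chain can be condensed into a few lines of C. This is a deliberately simplified sketch, not lookUpImpOrForward itself: toy_class, toy_lookup_or_forward, and toy_forward are hypothetical names, selectors are strings, the cache is a tiny linear array, and class realization, initialization, locking, and method resolution are all left out.

```c
#include <string.h>
#include <stddef.h>

typedef int (*toy_imp)(void);

struct toy_method { const char *sel; toy_imp imp; };

struct toy_class {
    struct toy_class *superclass;
    const struct toy_method *methods;
    size_t method_count;
    struct toy_method cache[4];      /* tiny linear cache, enough for a sketch */
    size_t cached;
};

/* Stands in for the forwarding imp returned when nothing is found. */
int toy_forward(void) { return -1; }

static toy_imp search(const struct toy_method *list, size_t n, const char *sel) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].sel, sel) == 0) return list[i].imp;
    return NULL;
}

static void fill_cache(struct toy_class *cls, const char *sel, toy_imp imp) {
    if (cls->cached < 4) {
        cls->cache[cls->cached].sel = sel;
        cls->cache[cls->cached].imp = imp;
        cls->cached++;
    }
}

toy_imp toy_lookup_or_forward(struct toy_class *cls, const char *sel) {
    toy_imp imp = search(cls->cache, cls->cached, sel);        /* own cache */
    if (imp) return imp;

    imp = search(cls->methods, cls->method_count, sel);        /* own method list */
    if (imp) { fill_cache(cls, sel, imp); return imp; }

    for (struct toy_class *c = cls->superclass; c; c = c->superclass) {
        imp = search(c->cache, c->cached, sel);                 /* superclass cache */
        if (!imp) imp = search(c->methods, c->method_count, sel); /* then its list */
        if (imp) { fill_cache(cls, sel, imp); return imp; }     /* cache in the receiver's class */
    }
    return toy_forward;                                         /* nothing found anywhere */
}

/* Two sample implementations for trying the sketch out. */
int person_work(void)        { return 1; }
int object_description(void) { return 2; }
```

Note that a hit in a superclass is cached in the class the search started from, so the next send of the same selector is answered by that class's own cache without walking the chain again.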
C. Verification and conclusion
The conclusions below come straight from development experience; only the interview pitfall is verified here.
Instance methods:
1. The class has the method: success.
2. The class does not have it, the parent does: success.
3. Neither the class nor the parent has it, but the parent's parent, NSObject, does: success.
4. Neither the class, the parent, nor NSObject has it: crash.
Class methods:
1. The class has the method: success.
2. The class does not have it, the parent does: success.
3. Neither the class, the parent, nor NSObject has the class method, and NSObject has no instance method of that name either: crash.
4. Neither the class, the parent, nor NSObject has the class method, but NSObject has an instance method of that name: success.
All of this matches what we know about the lookup process, except for class-method cases 3 and 4. How can the search end up calling an instance method? That does not fit the usual Objective-C worldview: a class method call ends up running an instance method.
Here is one way to verify it:
Declare and implement an instance method in a category of NSObject, then call that instance method using the class name of any class.
// Category on NSObject
- (void)instanceMethod {
    NSLog(@"I am an instance method - %s", __func__);
}

// Call it through a class name
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        [CJPerson instanceMethod];
    }
    return 0;
}
After running it, the call succeeds. How can this seemingly unreasonable result be explained? The isa and superclass diagram holds the answer.
Back to the interview question: why can calling a class method on a subclass end up running an instance method of NSObject?
A call made through the class name searches the metaclass's method list and works its way up to the root metaclass's method list without finding a matching class method. At that point, the superclass of the root metaclass points to the root class, NSObject, so the search continues in NSObject's method list. Because NSObject's method list holds instance methods, the search finds the instance method named instanceMethod.
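The chain described in this answer can be modeled in a few lines of C. This is a sketch with hypothetical names (toy_class, owner_for_class_call, owner_for_instance_call); each toy class lists at most two selectors as strings, and only the isa and superclass links are modeled. The one detail that matters is that the root metaclass's superclass is the root class.

```c
#include <string.h>
#include <stddef.h>

struct toy_class {
    const char *name;
    const struct toy_class *isa;          /* class -> metaclass */
    const struct toy_class *superclass;
    const char *methods[2];               /* selectors this class implements */
};

/* Walks the superclass chain and returns the name of the class that holds sel. */
static const char *find_owner(const struct toy_class *cls, const char *sel) {
    for (; cls != NULL; cls = cls->superclass)
        for (int i = 0; i < 2; i++)
            if (cls->methods[i] && strcmp(cls->methods[i], sel) == 0)
                return cls->name;
    return NULL;
}

/* [obj sel]: the search starts from the object's class. */
const char *owner_for_instance_call(const struct toy_class *cls, const char *sel) {
    return find_owner(cls, sel);
}

/* [Class sel]: the search starts from the metaclass, reached through isa. */
const char *owner_for_class_call(const struct toy_class *cls, const char *sel) {
    return find_owner(cls->isa, sel);
}

extern const struct toy_class NSObject_class, NSObject_meta, CJPerson_class, CJPerson_meta;

/* The root metaclass's superclass is the root class itself. */
const struct toy_class NSObject_meta  = { "NSObject(meta)", &NSObject_meta, &NSObject_class, { NULL, NULL } };
const struct toy_class NSObject_class = { "NSObject", &NSObject_meta, NULL, { "instanceMethod", NULL } };
const struct toy_class CJPerson_meta  = { "CJPerson(meta)", &NSObject_meta, &NSObject_meta, { "buy", NULL } };
const struct toy_class CJPerson_class = { "CJPerson", &CJPerson_meta, &NSObject_class, { "work", NULL } };
```

A class-method style call for instanceMethod on CJPerson_class misses in both metaclasses and is finally answered by NSObject, while a class-method style call for work finds nothing, because CJPerson's own instance methods are never on the metaclass chain.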
Final words
objc_msgSend is one of the key topics in iOS low-level development, and its flow is closely tied to the cache_t process analysis. The next article covers the last part of message sending: the message forwarding process. Stay tuned.