Introduction
Study hard; don't be anxious, don't be impatient.
The previous article, Cache Analysis, covered the class cache, the internal structure of cache_t, and how data is inserted into the cache. What it did not cover is when that insertion happens and what triggers it. That is today's topic.
What triggers a cache write
Set a breakpoint in the insert method. The call stack at the breakpoint shows that cache_t::insert is called from log_and_fill_cache():
static void
log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
{
#if SUPPORT_MESSAGE_LOGGING
    // When message logging is enabled, the logger decides whether to cache.
    if (slowpath(objcMsgLogEnabled && implementer)) {
        bool cacheIt = logMessageSend(implementer->isMetaClass(),
                                      cls->nameForLogging(),
                                      implementer->nameForLogging(),
                                      sel);
        if (!cacheIt) return;
    }
#endif
    // Write the sel/imp pair into the class's method cache.
    cls->cache.insert(sel, imp, receiver);
}
Inside log_and_fill_cache, cache.insert() is called. Following the call stack further up, you can see that the whole chain starts from objc_msgSend. That takes us into new territory: objc_msgSend.
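To make the timing concrete, here is a minimal sketch in plain C of a cache that is filled on the first send of a selector and reused afterwards. It is a toy model, not objc4 code: every name in it (toy_class, toy_send, toy_fill_cache, toy_slow_lookup) is hypothetical, and selectors are compared as strings only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: every name below is hypothetical, not an objc4 symbol. */
typedef void (*toy_imp)(void);

typedef struct {
    const char *sel;   /* selector name, NULL when the slot is empty */
    toy_imp     imp;
} toy_entry;

enum { TOY_CACHE_SLOTS = 4 };

typedef struct {
    toy_entry cache[TOY_CACHE_SLOTS]; /* per-class method cache */
    int       inserts;                /* how many times the cache was filled */
    int       slow_lookups;           /* how many times the slow path ran */
} toy_class;

void toy_sayNB(void) {}

/* Slow path stand-in: pretend to search the class's method list. */
toy_imp toy_slow_lookup(toy_class *cls, const char *sel) {
    cls->slow_lookups++;
    return strcmp(sel, "sayNB") == 0 ? toy_sayNB : NULL;
}

/* Plays the role of log_and_fill_cache -> cache.insert.
   If every slot is taken, this toy version simply does not cache. */
void toy_fill_cache(toy_class *cls, const char *sel, toy_imp imp) {
    assert(imp != NULL);
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (cls->cache[i].sel == NULL) {
            cls->cache[i].sel = sel;
            cls->cache[i].imp = imp;
            cls->inserts++;
            return;
        }
    }
}

/* Message send stand-in: try the cache first, fill it on a miss. */
toy_imp toy_send(toy_class *cls, const char *sel) {
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (cls->cache[i].sel != NULL && strcmp(cls->cache[i].sel, sel) == 0)
            return cls->cache[i].imp;          /* cache hit: no insert */
    }
    toy_imp imp = toy_slow_lookup(cls, sel);   /* cache miss: slow path */
    if (imp != NULL)
        toy_fill_cache(cls, sel, imp);
    return imp;
}
```

Sending "sayNB" twice to a zero-initialized toy_class runs the slow path and the insert once; the second send is served from the cache.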
Runtime
Before we talk about objc_msgSend, we need to understand what the Runtime is, because objc_msgSend is part of the Runtime. Here is a brief introduction; a later article will cover the Runtime in detail.
As you know, Objective-C is a dynamic language. Turning Objective-C source code into a running program takes three steps: compile, link, and run.
Compile phase: the compiler cannot know which function a call will reach or the concrete type of a variable. It only translates the code into a language the machine can recognize and is not responsible for memory allocation.
Run phase: the required methods, variables, and so on are loaded from disk into memory, since a program can only run once it is in memory. Under the hood, dyld links the code and loads it into memory. The Runtime is the foundation of this run-time mechanism.
Ways to call the Runtime
1, Through Objective-C methods, which call the Runtime under the hood;
2, Through NSObject methods, which call the Runtime-related functions;
3, Through the API provided by the underlying objc library, called directly.
Runtime API calls
As mentioned in the previous article, you can inspect how Objective-C code is executed by looking at the generated .cpp file, so today we will use the .cpp file to see how the underlying Runtime functions are called.
In the project, call -(void)sayNB and generate the .cpp file to see how the sayNB call is compiled.
In the .cpp file, look at the main() function. You can see that the Objective-C code has been compiled into C++ that calls Runtime functions.
The call has the form objc_msgSend(message receiver, message body), where the message body contains the sel.
So calling the -(void)sayNB method is the same as calling objc_msgSend(person, sel_registerName("sayNB")).
As you can see, this is the Runtime at work: at its core, an Objective-C method call is a message sent through objc_msgSend.
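The rewrite from a method call to a function call can be imitated in plain C. The sketch below is not the Objective-C runtime: toy_object, toy_sel_register, and toy_msg_send are hypothetical stand-ins, and the dispatcher knows only one selector. It illustrates two ideas from the text: a selector registered under the same name always yields the same pointer, and the method body receives the receiver and the selector as its first two arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; none of these are the real Objective-C runtime types. */
typedef const char *toy_sel;

typedef struct {
    int nb_count;   /* state touched by the sayNB implementation */
} toy_object;

enum { TOY_MAX_SELS = 8 };
const char *toy_sel_table[TOY_MAX_SELS];
int toy_sel_count;

/* Like sel_registerName in spirit: the same name always yields the same pointer.
   The name must outlive the table (string literals do). */
toy_sel toy_sel_register(const char *name) {
    for (int i = 0; i < toy_sel_count; i++)
        if (strcmp(toy_sel_table[i], name) == 0)
            return toy_sel_table[i];
    assert(toy_sel_count < TOY_MAX_SELS);
    toy_sel_table[toy_sel_count] = name;
    return toy_sel_table[toy_sel_count++];
}

/* The "method body": receiver and selector arrive as the first two arguments. */
void toy_person_sayNB(toy_object *self, toy_sel cmd) {
    (void)cmd;
    self->nb_count++;
}

/* [person sayNB] corresponds to toy_msg_send(person, toy_sel_register("sayNB")). */
void toy_msg_send(toy_object *receiver, toy_sel sel) {
    if (receiver == NULL) return;              /* a message to nil does nothing */
    if (sel == toy_sel_register("sayNB"))      /* pointer comparison, not strcmp */
        toy_person_sayNB(receiver, sel);
}
```

Because selectors are unique pointers, the dispatcher can compare them with == instead of comparing strings, which is part of what makes a cache keyed by sel cheap to probe.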
objc_msgSend
objc_msgSend assembly
In the source code, objc_msgSend has a different implementation for each system architecture. We use the real-device architecture (arm64) to analyze how objc_msgSend works.
Look at the source code for objc_msgSend on the arm64 architecture.
The objc_msgSend source is short and written in assembly 🤣🤣🤣. You may wonder why Apple uses assembly here instead of C/C++, and what p0, x0, and so on mean.
On the simulator, the registers are named rax, rbx, rcx, rdx, and so on, and they hold self, sel, and other data. As shown in the figure below:
On arm64, the registers are x0 through x30. self is stored in x0 and sel in x1, as shown below:
objc_msgSend source code analysis
cmp p0, #0:
cmp is the compare instruction; it compares p0 with #0.
p0 holds the address of the message receiver. Here the receiver is person, so p0 is the address of person.
#0 is the value compared against: the constant 0.
    cmp     p0, #0                  // compare the receiver with 0: is there a message receiver at all?
#if SUPPORT_TAGGED_POINTERS         // build supports tagged pointers
    b.le    LNilOrTagged            // receiver <= 0: nil, or a tagged pointer (MSB tagged pointer looks negative)
#else
    b.eq    LReturnZero             // receiver == 0: there is no message receiver, so return zero (an empty message)
#endif
    // the message receiver exists, so continue below
    ldr     p13, [x0]               // p13 = isa, loaded from the address in x0 (the receiver)
    GetClassFromIsa_p16 p13, 1, x0  // p16 = class
LGetIsaDone:                        // the class has been obtained; continue below
    // calls imp or objc_msgSend_uncached
    CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
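The cmp / b.le pair can be read as a three-way classification of the receiver. The sketch below expresses that in C, assuming a 64-bit register and the most-significant-bit tagging scheme that the "(MSB tagged pointer looks negative)" comment refers to. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RECEIVER_NIL, RECEIVER_TAGGED, RECEIVER_OBJECT } receiver_kind;

/* Mirrors "cmp p0, #0" followed by "b.le LNilOrTagged", assuming the tag
   lives in the most significant bit of a 64-bit pointer value. */
receiver_kind classify_receiver(uint64_t p0) {
    /* The signed view below relies on two's-complement conversion. */
    assert((int64_t)UINT64_MAX == -1);
    int64_t as_signed = (int64_t)p0;             /* cmp compares p0 as a signed value */
    if (as_signed > 0)  return RECEIVER_OBJECT;  /* falls through to ldr p13, [x0] */
    if (as_signed == 0) return RECEIVER_NIL;     /* ends up at LReturnZero */
    return RECEIVER_TAGGED;                      /* top bit set: handled in LNilOrTagged */
}
```

A single signed comparison therefore covers both special cases: zero is nil, and any value with the top bit set looks negative.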
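ldr p13, [x0]:
ldr is the load instruction; it loads the value stored at address [x0] into p13.
p13: receives the isa.
[x0]: the memory at the address held in x0, that is, the message receiver.
GetClassFromIsa_p16 p13, 1, x0: takes p13 (the isa), 1, and x0 (the receiver) and leaves the class in p16.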
GetClassFromIsa_p16
__has_feature: checks whether the compiler supports the feature named in the parentheses. ptrauth_calls: pointer authentication, used on the arm64e architecture. Devices using Apple A12 or later A-series processors (such as the iPhone XS, iPhone XS Max, and iPhone XR or newer devices) support the arm64e architecture (source: developer.apple.com).
So, before you send a message, you need to get the class from the message receiver. So we need to get the class LGPerson from the Person object;
objc_msgSend looks for the method's IMP in the cache, and the cache is stored in the class. Therefore, the class must be obtained before the lookup can proceed.
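Getting the class from a packed isa comes down to a bitwise AND with a mask. The sketch below assumes the mask value 0x0000000ffffffff8, which is the arm64 ISA_MASK in some objc4 versions; other architectures and versions use different values, so treat the constant as an assumption. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: an arm64 ISA_MASK as found in some objc4 versions.
   Other architectures and versions use different masks. */
#define TOY_ISA_MASK 0x0000000ffffffff8ULL

/* Builds a packed isa word from a class address and extra flag bits. */
uint64_t toy_pack_isa(uint64_t class_addr, uint64_t flag_bits) {
    assert((class_addr & ~TOY_ISA_MASK) == 0);  /* class address must fit the mask */
    assert((flag_bits & TOY_ISA_MASK) == 0);    /* flags must stay outside the mask */
    return class_addr | flag_bits;
}

/* Extracts the class address: this is the step that leaves the class in p16. */
uint64_t toy_class_from_isa(uint64_t isa_bits) {
    return isa_bits & TOY_ISA_MASK;
}
```

For example, packing the class address 0x100008360 with the flag bits 0x011d800000000001 and then masking returns 0x100008360 again.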
CacheLookup
We know that objc_msgSend is a process of finding the imp through the sel, so let's analyze that process.
The previous section covered checking the message receiver, loading its isa, and getting the class from the isa. Now let's look at why we need the class and what it is used for.
Getting the index
In the source code, after LGetIsaDone, CacheLookup is executed.
In the previous article, we learned that sel and imp are stored in a bucket, and that buckets is stored in cache_t. So, in the figure above, buckets is obtained by applying a mask to the cache_t value.
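As a rough illustration of "masking cache_t", the sketch below assumes one particular packed layout: the low 48 bits of a 64-bit word hold the buckets address and the high 16 bits hold the mask. That layout matches one arm64 configuration of objc4 as far as I know, but it is an assumption here, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: low 48 bits = buckets address, high 16 bits = mask. */
#define TOY_BUCKETS_MASK 0x0000ffffffffffffULL
#define TOY_MASK_SHIFT   48

typedef struct {
    uint64_t buckets;  /* address of the first bucket */
    uint16_t mask;     /* capacity - 1 */
} toy_cache_view;

/* Packs a buckets address and a mask into one 64-bit word. */
uint64_t toy_pack_cache(uint64_t buckets, uint16_t mask) {
    assert((buckets & ~TOY_BUCKETS_MASK) == 0);  /* address must fit in 48 bits */
    return ((uint64_t)mask << TOY_MASK_SHIFT) | buckets;
}

/* Splits the word again: AND with the mask for buckets, shift right for mask. */
toy_cache_view toy_unpack_cache(uint64_t word) {
    toy_cache_view view;
    view.buckets = word & TOY_BUCKETS_MASK;
    view.mask    = (uint16_t)(word >> TOY_MASK_SHIFT);
    return view;
}
```

Packing both values into one word lets the lookup fetch the buckets pointer and the mask with a single load.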
On the ARM architecture, however, the assembly is more complex, so we switch to the Mac architecture to analyze the next steps. After all, we are working on a Mac, and reading the source together with the assembly makes the analysis simpler and more direct.
We need to use the sel to obtain the method's imp. The imp lives in a bucket, and buckets contains many such buckets. So we need to find the bucket that holds the imp, and to find that bucket we need its index. The index is obtained by hashing.
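The hashing step can be sketched as a single AND: the selector's address is reduced to a slot number by the mask, where mask is capacity minus one and capacity is a power of two. Some objc4 versions mix the selector bits before masking; this sketch leaves that out. toy_cache_hash is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Reduces a selector address to a bucket index in the range [0, mask]. */
uint32_t toy_cache_hash(uint64_t sel, uint32_t mask) {
    assert(((mask + 1) & mask) == 0);  /* mask must be capacity - 1, capacity a power of two */
    return (uint32_t)(sel & mask);
}
```

With a capacity of 4 (mask 3), a selector at address 0x100003f5a lands in slot 2, because the low two bits of 0x5a are binary 10. Two selectors whose low bits agree collide on the same slot, which is why the lookup that follows has to probe.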
Through the above process, we get the data such as buckets and index, and then we start to get the corresponding bucket.
Finding the target bucket
The target bucket is also found by repeatedly stepping through memory addresses: starting from the indexed bucket, the lookup moves from bucket to bucket until it finds the one whose sel matches. Once we have the bucket, we need the imp that corresponds to the sel, so CacheHit is executed.
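The search can be sketched as a probe loop over an array of buckets. This is a simplified model, not the assembly: the types and names are hypothetical, an empty bucket is marked by sel == 0, and the probe direction (backwards with wrap-around) is an assumption modeled on the arm64 path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*toy_imp)(void);

typedef struct {
    uint64_t sel;  /* 0 marks an empty bucket */
    toy_imp  imp;
} toy_bucket;

/* Two sample implementations to store in the cache. */
void toy_imp_a(void) {}
void toy_imp_b(void) {}

/* Returns the cached imp for sel, or NULL on a miss (the real code would
   then fall back to the uncached slow path). */
toy_imp toy_cache_lookup(const toy_bucket *buckets, uint32_t mask, uint64_t sel) {
    assert(sel != 0);
    assert(((mask + 1) & mask) == 0);            /* capacity must be a power of two */
    uint32_t index = (uint32_t)(sel & mask);     /* starting slot from the hash */
    for (uint64_t probes = 0; probes <= mask; probes++) {
        const toy_bucket *b = buckets + index;   /* address = buckets + index * sizeof(toy_bucket) */
        if (b->sel == sel) return b->imp;        /* match: this is the CacheHit case */
        if (b->sel == 0)   return NULL;          /* empty slot: sel is not cached */
        index = index ? index - 1 : mask;        /* step backwards, wrapping around */
    }
    return NULL;                                 /* every slot probed, no match */
}
```

In a 4-slot table where sel 0x1002 sits in slot 2 and the colliding sel 0x2006 was placed in slot 1, looking up 0x2006 starts at slot 2, sees a different sel, steps back to slot 1, and returns toy_imp_b.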
CacheHit
Conclusion
Working from the code, we analyzed how objc_msgSend sends a message. Finally, a brief summary:
After the message receiver is checked, its isa is loaded, and the class object is obtained from the isa. The cache is reached by offsetting from the class's memory address. The sel and imp are stored in a bucket structure, buckets contains many bucket structures, and buckets is part of the cache structure. buckets is obtained by applying a mask to cache_t, and the target bucket is then located inside buckets by its index. Once the bucket is found, the sel and imp can be read out of it.
Shortcut tip
How to quickly browse search results when a keyword search returns many hits:
Hold Command and click a drop-down arrow to collapse the results.
The result looks like this:
This way, you can see at a glance which files contain the search keyword.