This is the 9th day of my participation in Gwen Challenge
Preface
We’ve already explored the insertion process of the method cache. Today we’ll dig into the details of the method cache and into what triggers an insertion higher up the call stack.
Key details of the method cache
Why is the cache expanded at 3/4 of capacity?
- A load factor of 3/4 is a common choice for hash-based data structures: at 0.75 the space is still well utilized.
- When an entry is stored in the cache, the value produced by the hash function is used as the storage index. How much free space remains largely determines whether indexes collide, and with a load factor of 3/4 the probability of a hash collision stays relatively low.
The role of _bucketsAndMaybeMask
- _bucketsAndMaybeMask stores the first address of the buckets memory and, depending on the architecture, the mask as well;
Official explanation:
// _bucketsAndMaybeMask is a buckets_t pointer;
// _maybeMask is the buckets mask
- Once we have _bucketsAndMaybeMask, the address of the next bucket can be reached by offsetting in memory. The buckets() method simply operates on the value stored in _bucketsAndMaybeMask to obtain a bucket_t pointer;
The code is as follows:
struct bucket_t *cache_t::buckets() const
{
    uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
    return (bucket_t *)(addr & bucketsMask); // bitwise AND of _bucketsAndMaybeMask with bucketsMask
}
static constexpr uintptr_t bucketsMask = ~0ul; // every bit of bucketsMask is 1
How the mask is stored, and therefore the value of bucketsMask, varies by architecture;
#if defined(__arm64__) && __LP64__ // 64-bit arm64
#if TARGET_OS_OSX || TARGET_OS_SIMULATOR // macOS or simulator
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS // mask in the high 16 bits, big addresses
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16 // mask in the high 16 bits
#endif
#elif defined(__arm64__) && !__LP64__ // arm64 but not LP64
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4 // mask in the low 4 bits
#else
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED // mask stored separately
#endif
Reading values from buckets
- The buckets() method operates on the value stored in _bucketsAndMaybeMask to obtain a bucket_t pointer;
- In LLDB, the data returned by buckets() is a bucket_t pointer, $n. When reading values we often use p $n[1], and this subscript access makes it look like an array. In fact buckets() does not return an array: the bucket_t it points to is just a struct. The bucket_t entries, however, are allocated in one contiguous block of memory, so knowing the address of the current bucket_t is enough to reach the adjacent one by offsetting the pointer;
- p $n[1] and p *($n+1) have the same effect;
- Array subscripting is also a pointer offset: + 1 moves forward by one element, and the size of that step is determined by the element type.
Hash function (cache_hash) and secondary hash function (cache_next)
- First hash function: cache_hash
mask_t m = capacity - 1;           // capacity - 1, e.g. 4 - 1 = 3
mask_t begin = cache_hash(sel, m); // hash sel against capacity - 1 to get the starting index
mask_t i = begin;
static inline mask_t cache_hash(SEL sel, mask_t mask)
{
    uintptr_t value = (uintptr_t)sel; // convert sel to an unsigned integer
#if CONFIG_USE_PREOPT_CACHES
    value ^= value >> 7;
#endif
    return (mask_t)(value & mask);    // bitwise AND of the sel value with the mask gives the index
}
- Second hash function, used when a collision occurs: cache_next
cache_next(i, m) // hash again, using the current index and capacity - 1 as arguments
static inline mask_t cache_next(mask_t i, mask_t mask) {
    // If i is not 0, return i - 1; otherwise return mask (capacity - 1).
    // On a collision the probe steps backward one slot at a time; at index 0 it
    // wraps to capacity - 1 and keeps stepping backward until it reaches begin again.
    return i ? i - 1 : mask;
}
The call flow that leads to insert()
- To find out when an insertion happens, we stop in the insert() method and view the call stack information;
- The stack is a LIFO data structure, so index = 0 is the most recent call. Reading the frames in order starting from 0, we find that the log_and_fill_cache method calls the cache.insert method;
static void
log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
{
#if SUPPORT_MESSAGE_LOGGING
    if (slowpath(objcMsgLogEnabled && implementer)) {
        bool cacheIt = logMessageSend(implementer->isMetaClass(),
                                      cls->nameForLogging(),
                                      implementer->nameForLogging(),
                                      sel);
        if (!cacheIt) return;
    }
#endif
    cls->cache.insert(sel, imp, receiver);
}
- Looking at the objc-cache.mm file, the comments show that, at the topmost level, the insertion performed by insert() is triggered by objc_msgSend. Let’s take a look at objc_msgSend;
objc_msgSend
Compile time
- Compile time, as the name suggests, is the time when the code is being compiled. The compiler translates the source code into code that the machine can recognize (this is only the general idea; in practice it may be translated only into an intermediate language). At compile time, lexical analysis, syntax analysis and compile-time type checking (static type checking) are used to find errors and warnings in the code;
- Static checking does not run the code in memory; it scans the code as text. So it is wrong to say that memory is allocated at compile time.
Runtime
- Runtime is when the code has been loaded into memory by dyld and is executing. Runtime type checking is different from the compile-time type checking (static type checking) described above: it does not simply scan the code, it performs operations and makes decisions in memory.
Runtime has two versions: the Legacy version (early version) and the Modern version (current version)
- Programming interface of the early version: Objective-C 1.0
- Programming interface of the current version: Objective-C 2.0
- The early version is used for Objective-C 1.0 on 32-bit Mac OS X platforms
- The current version is used for iPhone programs and for 64-bit programs on Mac OS X v10.5 and later
- More information about OC can be found in the Apple documentation
- Objective-C Runtime Programming Guide
- Apple Official Documentation
There are three ways to call into the Runtime
- OC methods: [p sayNB];
- APIs provided by NSObject: isKindOfClass ...;
- The lower-level objc APIs: objc_msgSend, class_getInstanceSize ...;
[Figure: initiators of the three Runtime calling modes]
objc_msgSend
- Use clang to rewrite the OC file into a C++ file; in the resulting .cpp file we can see how the runtime is called;
- An OC method call becomes a message send at the runtime layer: objc_msgSend(id _Nullable self, SEL _Nonnull op, ...). The call carries two elements:
- The message receiver
- The message body: sel + parameters
// The OC method call before rewriting
GCPerson *person = [GCPerson alloc];
[person sayHello];
// After rewriting: the method implementation is emitted as a C function that takes self and _cmd
static void _I_GCTeacher_sayHello(GCTeacher * self, SEL _cmd) {
    NSLog((NSString *)&__NSConstantStringImpl__var_folders_fw_k53y7bbd2rx7sdzz_s85z7840000gn_T_main_cd505b_mi_0,__func__);
}
objc_msgSendSuper
- In the rewritten .cpp file we find that, besides objc_msgSend, there is also an objc_msgSendSuper function. Our guess is that it sends a message directly to the parent class;
- In the objc source code, objc_msgSendSuper is declared as follows:
OBJC_EXPORT id _Nullable
objc_msgSendSuper(struct objc_super * _Nonnull super, SEL _Nonnull op, ...)
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);
The objc_super parameter of objc_msgSendSuper is defined as follows:
struct objc_super {
    /// Specifies an instance of a class.
    __unsafe_unretained _Nonnull id receiver; // message receiver
    /* super_class is the first class to search */
    // Method lookup starts at super_class; if it is not found there, the search continues in super_class's parent
    __unsafe_unretained _Nonnull Class super_class;
};
- By building an objc_super struct and passing it as the argument, we can call objc_msgSendSuper ourselves:
struct objc_super gc_objc_super;
gc_objc_super.receiver = person;            // message receiver
gc_objc_super.super_class = GCPerson.class; // method lookup starts at super_class; if it is not found there, the search continues in super_class's parent
objc_msgSendSuper(&gc_objc_super, @selector(sayHello));
objc_msgSend at the assembly level
Because of the dynamic nature of the OC language, virtually every method call goes through objc_msgSend. For a low-level routine that is called this often, efficiency favors a language as close to the machine as possible, namely assembly; the author also considers assembly, which works directly on addresses, to be safer. For these reasons the system implements it in assembly. Let’s take a look at objc_msgSend at the assembly level in the libobjc library
- Search for objc_msgSend in the objc source
- Collapse all the files and look at the assembly files with the .s suffix
- Find the assembly implementation of objc_msgSend
Analysis of the _objc_msgSend assembly source
    ENTRY _objc_msgSend
    UNWIND _objc_msgSend, NoFrame
    cmp p0, #0                      // nil check and tagged pointer check: compare p0 (the receiver) with 0
#if SUPPORT_TAGGED_POINTERS         // are tagged pointers supported?
    b.le LNilOrTagged               // (MSB tagged pointer looks negative) if p0 <= 0, continue at LNilOrTagged
#else
    b.eq LReturnZero                // tagged pointers not supported: if p0 == 0, return nil
#endif
    ldr p13, [x0]                   // p13 = isa, loaded from the address in x0
    // src, needs_auth, auth_address
    GetClassFromIsa_p16 p13, 1, x0  // p16 = class, obtained from the isa
    // with the receiver's class in hand, look up its method cache
LGetIsaDone:
    // calls imp or objc_msgSend_uncached
    CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
    END_ENTRY _objc_msgSend
Reading the IMP from a bucket
Bitwise XOR
- ^ is the bitwise exclusive-or (XOR) operator. If c = a ^ b, then a = c ^ b. Here b is called the salt: it has no meaning of its own and exists only to produce the XOR result;
Method encoding
- bucket_t stores the IMP as an unsigned long integer value, explicit_atomic<uintptr_t> _imp;, while the IMP itself is an address, so the set method encodes it before storing
void bucket_t::set(bucket_t *base, SEL newSel, IMP newImp, Class cls)
{
    // The IMP is encoded with the Class here
    uintptr_t newIMP = (impEncoding == Encoded
                        ? encodeImp(base, newImp, newSel, cls)
                        : (uintptr_t)newImp);
    if (atomicity == Atomic) {
        _imp.store(newIMP, memory_order_relaxed);
        if (_sel.load(memory_order_relaxed) != newSel) {
            mega_barrier();
            _sel.store(newSel, memory_order_relaxed);
        }
    } else {
        _imp.store(newIMP, memory_order_relaxed);
        _sel.store(newSel, memory_order_relaxed);
    }
}
// IMP encoding method
uintptr_t encodeImp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, IMP newImp, UNUSED_WITHOUT_PTRAUTH SEL newSel, Class cls) const {
    if (!newImp) return 0;
    return (uintptr_t)newImp ^ (uintptr_t)cls; // bitwise XOR of the IMP with the class
}
- The IMP getter, IMP imp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, Class cls), likewise has to decode: it XORs the stored IMP value with the class
return (IMP)(imp ^ (uintptr_t)cls);
- class is the salt used in the encoding and decoding above;
Protocols (delegates) of a class
- As a member of the class, a delegate is not much different from a method or a property, and it can likewise be obtained from the class through LLDB debugging;
- However, what we finally get from the protocols() method is a protocol_array_t. protocol_array_t stores protocol_ref_t, and protocol_ref_t is a uintptr_t type whose comment says that it has not been remapped: // protocol_t *, but unremapped. Its information cannot be printed directly; to view it we need the protocol_t type;
- Based on this, we search for protocol_ref_t again and find remapProtocol(), the function used to map a protocol_ref_t. Internally it is a direct cast, so we can cast protocol_ref_t to protocol_t * ourselves;
static ALWAYS_INLINE protocol_t *remapProtocol(protocol_ref_t proto)
{
    runtimeLock.assertLocked();
    // Protocols in shared cache images have a canonical bit to mark that they
    // are the definition we should use
    if (((protocol_t *)proto)->isCanonical())
        return (protocol_t *)proto; // cast to protocol_t * and return
    protocol_t *newproto = (protocol_t *)
        getProtocol(((protocol_t *)proto)->mangledName);
    return newproto ? newproto : (protocol_t *)proto; // cast to protocol_t * and return
}
LLDB debug code
2021-06-26 21:59:08.559358+0800 KCObjcBuild[81571:7062064] GCTeacher
(lldb) p/x pClass // Get the address of the current class
(Class) $0 = 0x0000000100004908 GCTeacher
(lldb) p/x 0x0000000100004908+0x20 // Offset the address by 0x20 to reach bits
(long) $1 = 0x0000000100004928
(lldb) p (class_data_bits_t *)$1 // Cast to class_data_bits_t *
(class_data_bits_t *) $2 = 0x0000000100004928
(lldb) p $2->data() // Get data from bits
(class_rw_t *) $3 = 0x0000000101131b70
(lldb) p *$3 // Print the class_rw_t that data points to
(class_rw_t) $4 = {
flags = 2148007936
witness = 0
ro_or_rw_ext = {
std::__1::atomic<unsigned long> = {
Value = 4294984032
}
}
firstSubclass = nil
nextSiblingClass = NSUUID
}
(lldb) p $4.protocols()// Get protocols from class_rw_t
(const protocol_array_t) $5 = {
list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
= {
list = {
ptr = 0x0000000100004270
}
arrayAndFlag = 4294984304
}
}
}
(lldb) p $5.list // Get the list of protocols
(const RawPtr<protocol_list_t>) $6 = {
ptr = 0x0000000100004270
}
(lldb) p $6.ptr // Get ptr from list
(protocol_list_t *const) $7 = 0x0000000100004270
(lldb) p *$7 // Dereference ptr to get the protocol_list_t
(protocol_list_t) $8 = (count = 1, list = protocol_ref_t [] @ 0x00007f922ab3a268)
(lldb) p $8.list[0] // Get the protocol_ref_t at index 0 from protocol_list_t
(protocol_ref_t) $10 = 4294986152
(lldb) p (protocol_t *)$10 // Cast protocol_ref_t to protocol_t *
(protocol_t *) $11 = 0x00000001000049a8
(lldb) p *$11// Retrieve data stored in protocol_t
(protocol_t) $12 = {
objc_object = {
isa = {
bits = 4298453192
cls = Protocol
= {
nonpointer = 0
has_assoc = 0
has_cxx_dtor = 0
shiftcls = 537306649
magic = 0
weakly_referenced = 0
unused = 0
has_sidetable_rc = 0
extra_rc = 0
}
}
}
mangledName = 0x0000000100003e89 "TestDelegate"
protocols = 0x0000000100004338
instanceMethods = 0x0000000000000000
classMethods = 0x0000000000000000
optionalInstanceMethods = 0x0000000100004350
optionalClassMethods = 0x0000000000000000
instanceProperties = 0x0000000000000000
size = 96
flags = 0
_extendedMethodTypes = 0x0000000100004370
_demangledName = 0x0000000000000000
_classProperties = 0x0000000000000000
}
(lldb) p *$12.optionalInstanceMethods // Get optionalInstanceMethods from protocol_t
(method_list_t) $15 = {
entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 24, count = 1)
}
(lldb) p $15.get(0).big()// Print data with index=0 in optionalInstanceMethods
(method_t::big) $17 = {
name = "testDelegate"
types = 0x0000000100003ec6 "v16@0:8"
imp = 0x0000000000000000
}
(lldb)
The protocol_t was retrieved successfully, which supplements the Class content explored earlier.