Runtime Series
Runtime Principle Exploration (1) — An In-Depth Understanding of isa (Apple's isa Optimization)
Runtime Principle Exploration (2) — An In-Depth Analysis of the Class Structure
Runtime Principle Exploration (3) — OC Class cache_t
Runtime Principle Exploration (4) — Investigating the Underlying Message Mechanism
Runtime Principle Exploration (5) — The Nature of Super
Runtime Principle Exploration (6) — Runtime in Interview Questions
☕️☕️ This article is fairly long. It was not written to farm likes and views on Jianshu, but to serve as a future review of this material for myself. If you happen to come across it and it sparks your interest, please get some rest, calm down, and start reading with plenty of energy. I hope it helps you. If you find any mistakes, please leave a comment to correct them, thank you. ☕️☕️
Class Definition Review
Picking up where we left off, let’s go back to the definition of Class
struct objc_class : objc_object {
// Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable method cache
class_data_bits_t bits; // class_rw_t * plus custom RR /alloc flags is used to obtain specific class information
};
There is also a cache_t cache member that we have not looked at yet, so let's take a look at it. The name says it all: it is a cache. A cache of what? Of methods.
An overview of cache_t
Under the hood, cache_t is implemented with a hash table, which caches methods that have already been called in order to speed up method lookup. First, let's review the normal flow of a method call. Suppose we call an instance method [obj xxxx];
obj -> isa -> obj's class object -> method_array_t methods
-> traverse the method list and call the method if it is found; if it is not found:
class object -> superclass -> the parent class object -> method_array_t methods
-> traverse the parent's method list and call the method if it is found; if it is not found, repeat this step one level up
- ... and so on, layer by layer up the superclass chain ...
- until
NSObject's class object -> method_array_t methods ......
When xxxx is called frequently in an application, this layer-by-layer lookup is inefficient. Therefore Apple designed the cache_t cache: when xxxx is called for the first time, it is looked up in the usual way and, once found, it is added to cache_t; the next time it is invoked, it is found directly in cache_t. This greatly improves lookup efficiency.
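To make this concrete, here is a minimal sketch; Person and the method xxxx are hypothetical names, not from the runtime source:

// Person and -xxxx are made up; the point is only when the cache comes into play.
Person *obj = [[Person alloc] init];

[obj xxxx];                  // 1st call: method lists are traversed, then the IMP is cached in cache_t
for (int i = 0; i < 1000; i++) {
    [obj xxxx];              // subsequent calls: the IMP is found directly in cache_t
}

The code you write does not change at all; the speed-up comes entirely from the runtime filling and consulting cache_t behind the scenes.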
A hash table in cache_t
Given that cache_t is implemented with a hash table, let's focus on how methods are cached in it. Most iOS developers have at least heard of hash tables; in fact, the NSDictionary we use every day is a hash-table data structure. Take a look at the definition of cache_t:
struct cache_t {
struct bucket_t* _buckets;
mask_t _mask;
mask_t _occupied;
};
struct bucket_t *_buckets;
The hash/hash table used to cache methodsmask_t _mask;
— This value = hash length -1mask_t _occupied;
Represents the number of methods that have been cached
The storage unit in the _buckets hash table is bucket_t
struct bucket_t {
private:
cache_key_t _key;
IMP _imp;
};
- cache_key_t _key; : the key is actually the method's SEL, i.e. the method name
- IMP _imp; : the memory address of the function that implements the method
Think about how we normally use NSDictionary: it stores a bunch of key-value pairs, and as mentioned above its underlying structure is also a hash table. For the method cache, the key is cache_key_t _key and the value is the bucket_t structure described above.
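As a loose analogy in API terms, this is just ordinary Foundation usage (the dictionary contents are made up), not the runtime's own code:

// NSDictionary is also backed by a hash table: the key is hashed to locate the value.
// In cache_t, the "key" is the method's SEL and the "value" is the bucket_t holding the IMP.
NSDictionary *methodCacheAnalogy = @{ @"viewDidLoad" : @"<IMP of -viewDidLoad>" };
id cachedImp = methodCacheAnalogy[@"viewDidLoad"];   // O(1) lookup by key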
How a hash table works is really a data-structures topic, so it is only briefly described here. A hash table is essentially an array. When a member is added, an index is first computed from its key, and the element is then inserted into the hash table at that index. Reading from the hash table is just as direct: compute the index from the key, then take the value out of the table at that index.
The time complexity of such a lookup (the read operation) is O(1), compared with O(n) for the traversal of the method list described at the beginning. Caching methods therefore greatly improves the efficiency of method lookup, and with it the efficiency of the whole method invocation mechanism.
The algorithm that computes the index from the key is called the hash algorithm. You can design it yourself, but the goals are to minimize the number of different keys that produce the same index (a hash collision) and to keep the index within a reasonable range: the larger the maximum index, the longer the hash table, and the table occupies real physical memory, which is limited. Hash tables are a design that trades space for time.
The index computed from a key is effectively random and unordered, so methods are inserted into the cache in no particular order, and the hash table is bound to contain empty slots. For example, a table of length 16 might hold only 10 methods, and a table of length 64 might hold only 40; some space is ultimately wasted, but lookup becomes much faster. This is what "trading space for time" means.
Parsing the hash algorithm in cache_t with the source code
To recap, the hash algorithm Apple uses here is actually quite simple:

index = @selector(xxxx) & mask

From the nature of the & (bitwise AND) operation we know that index <= mask, and since mask = hash table length - 1, it follows that 0 <= index <= hash table length - 1, which exactly covers the valid index range of the hash table.
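For a concrete feel for the numbers, here is a tiny worked example in plain C; the SEL address is made up purely for illustration:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uintptr_t sel  = 0x7fff6a8e;     // made-up @selector(xxxx) address
    uintptr_t mask = 8 - 1;          // a table of capacity 8 has mask = 7 = 0b111
    printf("index = %lu\n", (unsigned long)(sel & mask));   // prints 6, always within [0, 7]
    return 0;
}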
We also mentioned hash collisions, where different keys produce the same index. Let's look at the source code: searching for cache_t in the objc4 source, you can find a method related to lookup.
bucket_t * cache_t::find(cache_key_t k, id receiver)   // find the bucket for key k
{
    assert(k != 0);

    bucket_t *b = buckets();
    mask_t m = mask();
    mask_t begin = cache_hash(k, m); // cache_hash computes [begin = k & m], the start index for key k
    mask_t i = begin;                // i is the index that moves during probing
    do {
        // If key() == k, the cache is hit and this bucket_t is returned.
        // If key() == 0, nothing has been cached at index i yet; this empty
        // bucket is returned so the caller can fill it.
        if (b[i].key() == 0  ||  b[i].key() == k) {
            return &b[i];
        }
        // cache_next moves to the previous slot (i = i - 1); when i == 0 it wraps
        // around to mask, i.e. the last slot of the table. Probing therefore walks
        // the whole table in a circle and stops when it gets back to begin.
    } while ((i = cache_next(i, m)) != begin);

    // If the whole table has been probed without finding key k or an empty
    // bucket, the search has failed and bad_cache is called.
    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}
*********************************** cache_hash(k, m)
static inline mask_t cache_hash(cache_key_t key, mask_t mask)
{
    return (mask_t)(key & mask);
}

*********************************** cache_next(i, m)
static inline mask_t cache_next(mask_t i, mask_t mask) {
    // return (i+1) & mask;   // not arm64
    return i ? i-1 : mask;    // arm64
}
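To make the wrap-around probing easier to follow, here is a deliberately simplified toy model of the same find-then-fill idea in plain C. The types and function names (toy_bucket_t, toy_find, toy_fill) are invented, there is no locking or growth, and it is a sketch of the idea rather than Apple's implementation:

#include <stdint.h>
#include <stddef.h>

typedef struct { uintptr_t key; void *imp; } toy_bucket_t;   // stand-in for bucket_t

// Mirrors cache_hash / cache_next (arm64 flavour): start at key & mask,
// then walk backwards, wrapping from 0 back to mask.
static uint32_t toy_hash(uintptr_t key, uint32_t mask) { return (uint32_t)(key & mask); }
static uint32_t toy_next(uint32_t i, uint32_t mask)    { return i ? i - 1 : mask; }

// Returns the bucket holding key, or the first empty bucket met while probing,
// or NULL if the table is full and the key is absent (where find() calls bad_cache).
static toy_bucket_t *toy_find(toy_bucket_t *buckets, uint32_t mask, uintptr_t key) {
    uint32_t begin = toy_hash(key, mask);
    uint32_t i = begin;
    do {
        if (buckets[i].key == 0 || buckets[i].key == key) return &buckets[i];
    } while ((i = toy_next(i, mask)) != begin);
    return NULL;
}

// Filling works the same way: locate the slot first, then store key and imp into it.
static void toy_fill(toy_bucket_t *buckets, uint32_t mask, uintptr_t key, void *imp) {
    toy_bucket_t *b = toy_find(buckets, mask, key);
    if (b) { b->key = key; b->imp = imp; }
}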
cache_t::find is also called by another function in the source, cache_fill_nolock, which performs the cache fill (insert) operation, as shown below.
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;

    // Make sure the entry wasn't added to the cache by some other thread
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;

    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}
If the bucket returned by cache->find has key() == 0, that slot is unused and no method has been cached there yet, so occupied is incremented; bucket->set(key, imp) then caches the method into that slot.
Based on the above analysis, the following diagrams summarize the overall flow of methods being stored into and read from cache_t.
Storing methods to cache_t
Query methods from cache_t
One other question you might have: if methods keep being added to the cache, what happens when it fills up? Going back to the cache_fill_nolock function above, when the number of cached methods reaches three quarters of the current cache capacity, Apple calls expand() to grow the cache. Its source code is as follows:
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();

    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}
Look at the line uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE; each expansion doubles the capacity, and if this is the first call (the old capacity is still 0), the initial value INIT_CACHE_SIZE is used to set the size of the cache.
enum {
INIT_CACHE_SIZE_LOG2 = 2,
INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2)
};
The definition of INIT_CACHE_SIZE shows that it has a value of 4, which means that Apple’s initial capacity for cache_t is 4.
The last function called inside expand() is reallocate(oldCapacity, newCapacity); let's step into its source code.
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();

    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated.
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);

    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}
Clearly the old cache space is freed, but only on the condition that freeOld is true; freeOld comes from the call to canBeFreed() at the beginning of the function.
bool cache_t::isConstantEmptyCache()
{
    return
        occupied() == 0  &&
        buckets() == emptyBucketsForCapacity(capacity(), false);
}

bool cache_t::canBeFreed()
{
    return !isConstantEmptyCache();
}
The canBeFreed() function is actually very simple: it just checks whether the cache is still the constant empty cache. If it is, there is nothing to release; if the old cache is not empty, it is freed directly. So expand() allocates a new cache space and then frees the old one, which means that every expansion throws away all previously cached methods. Back in cache_fill_nolock, after expand() returns, only the current method is put into the new cache space, so the methods that were cached before the expansion have to be looked up and cached again the next time they are called. This detail is worth savoring.
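A small stand-alone simulation makes the consequence easy to see. This is not Apple's code, just the fill/expand policy described above (initial capacity 4, 3/4 threshold, doubling, old contents discarded) applied to ten distinct methods:

#include <stdio.h>

int main(void) {
    unsigned capacity = 4, occupied = 0;            // INIT_CACHE_SIZE == 4
    for (int call = 1; call <= 10; call++) {        // 10 distinct methods being cached
        if (occupied + 1 > capacity / 4 * 3) {      // same 3/4 threshold as cache_fill_nolock
            capacity *= 2;                          // expand(): capacity doubles...
            occupied = 0;                           // ...and the old buckets are thrown away
        }
        occupied++;                                 // cache the current method
        printf("method %2d cached: capacity %u, occupied %u\n", call, capacity, occupied);
    }
    return 0;
}

Running it shows occupied dropping back to 1 right after each expansion (at the 4th and 10th method here), which is exactly why methods cached before an expansion have to be looked up and cached again later.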
How are methods of the parent class cached when called?
By now we know that when a message is sent to an object, the runtime finds its class object via the object's isa, looks for the method first in the class object's method cache cache_t and, if it is not there, traverses the class object's method list; once found, the method is cached into the cache_t of the object's class object.
If the method is not found in the current class object at all, the lookup moves into the parent's class object through the superclass pointer. Again, the parent's cache_t is searched first, and if nothing is found there, its method list is traversed. If the method is found in the parent's method list, will it be cached into the cache_t of that parent class object, or into the cache_t of the class object of the message receiver?
The answer lies in the function we saw earlier, static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver): it takes a Class cls as a parameter, so we only need to figure out which class that cls actually is.
Going one level up, the caller of cache_fill_nolock is cache_fill. Inspecting cache_fill in the same way, its callers are log_and_fill_cache() and lookUpMethodInClassAndLoadCache(), and log_and_fill_cache() is itself called from the lookUpImpOrForward() function. Let's look at lookUpMethodInClassAndLoadCache() first: obviously it does not deal with the superclass case, so it is not what we are looking for.
Finally, let's look at the remaining lookUpImpOrForward function; pay attention to the annotations marked with ⚠️⚠️⚠️.
/***********************************************************************
* lookUpImpOrForward.
* The standard IMP lookup. ----------------> ⚠️⚠️⚠️ the standard IMP lookup process
* initialize==NO tries to avoid +initialize (but sometimes fails)
* cache==NO skips optimistic unlocked lookup (but uses cache elsewhere)
* Most callers should use initialize==YES and cache==YES.
* inst is an instance of cls or a subclass thereof, or nil if none is known.
*   If cls is an un-initialized metaclass then a non-nil inst is faster.
* May return _objc_msgForward_impcache. IMPs destined for external use
*   must be converted to _objc_msgForward or _objc_msgForward_stret.
*   If you don't want forwarding at all, use lookUpImpOrNil() instead.
**********************************************************************/
IMP lookUpImpOrForward(Class cls, SEL sel, id inst,
                       bool initialize, bool cache, bool resolver)
{
    IMP imp = nil;
    bool triedResolver = NO;

    runtimeLock.assertUnlocked();

    // Optimistic cache lookup
    if (cache) { //----------------> ⚠️⚠️⚠️ query the current Class object's cache; if the method is found, return it
        imp = cache_getImp(cls, sel);
        if (imp) return imp;
    }

    // runtimeLock is held during isRealized and isInitialized checking
    // to prevent races against concurrent realization.

    // runtimeLock is held during method search to make
    // method-lookup + cache-fill atomic with respect to method addition.
    // Otherwise, a category could be added but ignored indefinitely because
    // the cache was re-filled with the old value after the cache flush on
    // behalf of the category.
    runtimeLock.read();

    if (!cls->isRealized()) { //----------------> ⚠️⚠️⚠️ if the current Class has not been realized yet, realize it
        // Drop the read-lock and acquire the write-lock.
        // realizeClass() checks isRealized() again to prevent
        // a race while the lock is down.
        runtimeLock.unlockRead();
        runtimeLock.write();

        realizeClass(cls);

        runtimeLock.unlockWrite();
        runtimeLock.read();
    }

    if (initialize && !cls->isInitialized()) { //--------------> ⚠️⚠️⚠️ if the current Class has not been initialized yet, initialize it
        runtimeLock.unlockRead();
        _class_initialize(_class_getNonMetaClass(cls, inst));
        runtimeLock.read();
        // If sel == initialize, _class_initialize will send +initialize and
        // then the messenger will send +initialize again after this
        // procedure finishes. Of course, if this is not being called
        // from the messenger then it won't happen. 2778172
    }

 retry:
    runtimeLock.assertReading();

    // Try this class's cache. //----------------> ⚠️⚠️⚠️ try the class object's cache; if found, jump to done and return the method
    imp = cache_getImp(cls, sel);
    if (imp) goto done;

    // Try this class's method lists. //--------------> ⚠️⚠️⚠️ if found in the method list, cache it into this class's cache_t and jump to done to return it
    {
        Method meth = getMethodNoSuper_nolock(cls, sel);
        if (meth) {
            log_and_fill_cache(cls, meth->imp, sel, inst, cls);
            imp = meth->imp;
            goto done;
        }
    }

    // Try superclass caches and method lists. //------> ⚠️⚠️⚠️ move on to the superclass objects of the current Class object
    {
        unsigned attempts = unreasonableClassCount();
        for (Class curClass = cls->superclass; //------> ⚠️⚠️⚠️ each iteration of the for loop moves one level up the superclass chain and runs the lookup below
             curClass != nil;
             curClass = curClass->superclass)
        {
            // Halt if there is a cycle in the superclass chain.
            if (--attempts == 0) {
                _objc_fatal("Memory corruption in class list.");
            }

            // Superclass cache. //------> ⚠️⚠️⚠️ search the current superclass object's cache
            imp = cache_getImp(curClass, sel);
            if (imp) {
                if (imp != (IMP)_objc_msgForward_impcache) {
                    // Found the method in a superclass. Cache it in this class.
                    //------> ⚠️⚠️⚠️ if the method is found in the superclass cache, log_and_fill_cache caches it.
                    // Note that the first argument is cls, i.e. the method is cached into the cache_t of
                    // the Class object that received the message. Then jump to done and return the method.
                    log_and_fill_cache(cls, imp, sel, inst, curClass);
                    goto done;
                }
                else {
                    // Found a forward:: entry in a superclass.
                    // Stop searching, but don't cache yet; call method
                    // resolver for this class first.
                    //----> ⚠️⚠️⚠️ if what the superclass cache holds is _objc_msgForward_impcache,
                    // stop searching (break) and fall through to the method resolver below
                    break;
                }
            }

            // Superclass method list. //----> ⚠️⚠️⚠️ if nothing is found in the superclass cache, search the current superclass's method list
            Method meth = getMethodNoSuper_nolock(curClass, sel);
            if (meth) {
                //------> ⚠️⚠️⚠️ if the method is found in the superclass's method list, log_and_fill_cache caches it.
                // Again, the first argument is cls, so the method is cached into the cache_t of the Class
                // object that received the message. Then jump to done and return the method.
                log_and_fill_cache(cls, meth->imp, sel, inst, curClass);
                imp = meth->imp;
                goto done;
            }
        }
    }

    // No implementation found. Try method resolver once. //------> ⚠️⚠️⚠️ dynamic method resolution
    if (resolver && !triedResolver) {
        runtimeLock.unlockRead();
        _class_resolveMethod(cls, sel, inst);
        runtimeLock.read();
        // Don't cache the result; we don't hold the lock so it may have
        // changed already. Re-do the search from scratch instead.
        triedResolver = YES;
        goto retry;
    }

    // No implementation found, and method resolver didn't help. //------> ⚠️⚠️⚠️ message forwarding
    // Use forwarding.
    imp = (IMP)_objc_msgForward_impcache;
    cache_fill(cls, sel, imp, inst);

 done:
    runtimeLock.unlockRead();

    return imp;
}
One more thing worth saying about Method meth = getMethodNoSuper_nolock(cls, sel); let's step into its implementation.
static method_t *
getMethodNoSuper_nolock(Class cls, SEL sel)
{
    runtimeLock.assertLocked();

    assert(cls->isRealized());
    // fixme nil cls?
    // fixme nil sel?

    for (auto mlists = cls->data()->methods.beginLists(),
              end = cls->data()->methods.endLists();
         mlists != end;
         ++mlists)
    {
        method_t *m = search_method_list(*mlists, sel); //-- ⚠️⚠️⚠️ core function
        if (m) return m;
    }

    return nil;
}
Enter its core function search_method_list(*mlists, sel)
static method_t *search_method_list(const method_list_t *mlist, SEL sel)
{
    int methodListIsFixedUp = mlist->isFixedUp();
    int methodListHasExpectedSize = mlist->entsize() == sizeof(method_t);

    if (__builtin_expect(methodListIsFixedUp && methodListHasExpectedSize, 1)) {
        //-- ⚠️⚠️⚠️ if the method list is sorted, use binary search
        return findMethodInSortedMethodList(sel, mlist);
    } else {
        // Linear search of unsorted method list
        //-- ⚠️⚠️⚠️ otherwise, linear search of the unsorted method list
        for (auto& meth : *mlist) {
            if (meth.name == sel) return &meth;
        }
    }

#if DEBUG
    // sanity-check negative results
    if (mlist->isFixedUp()) {
        for (auto& meth : *mlist) {
            if (meth.name == sel) {
                _objc_fatal("linear search worked when binary search did not");
            }
        }
    }
#endif

    return nil;
}
According to the logic of the code, if the method list is sorted, findMethodInSortedMethodList is used to look the method up, and it does so with a binary search. Its code is as follows:
static method_t *findMethodInSortedMethodList(SEL key, const method_list_t *list)
{
    assert(list);

    const method_t * const first = &list->first;
    const method_t *base = first;
    const method_t *probe;
    uintptr_t keyValue = (uintptr_t)key;
    uint32_t count;

    //-- ⚠️⚠️⚠️ count >>= 1 is equivalent to count /= 2: each pass probes the middle of the remaining range
    for (count = list->count; count != 0; count >>= 1) {
        probe = base + (count >> 1);

        uintptr_t probeValue = (uintptr_t)probe->name;

        if (keyValue == probeValue) {
            // `probe` is a match.
            // Rewind looking for the *first* occurrence of this value.
            // This is required for correct category overrides.
            while (probe > first && keyValue == (uintptr_t)probe[-1].name) {
                probe--;
            }
            return (method_t *)probe;
        }

        if (keyValue > probeValue) {
            base = probe + 1;
            count--;
        }
    }

    return nil;
}
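The same halving search is easier to digest on a plain sorted array. Below is a stand-alone sketch over integers instead of method_t (the function name find_sorted and the data are made up), including the "rewind to the first occurrence" step that lets category overrides win:

#include <stdint.h>
#include <stdio.h>

static const int *find_sorted(int key, const int *list, uint32_t count) {
    const int *base = list;
    const int *probe;
    for (; count != 0; count >>= 1) {       // halve the remaining range on every pass
        probe = base + (count >> 1);        // probe the middle of the range
        if (key == *probe) {
            // rewind to the first occurrence of the value, like the runtime does
            while (probe > list && key == probe[-1]) probe--;
            return probe;
        }
        if (key > *probe) {                 // continue in the upper half
            base = probe + 1;
            count--;
        }
    }
    return NULL;
}

int main(void) {
    int sorted[] = {3, 7, 7, 9, 12, 15, 20, 31};
    const int *hit = find_sorted(7, sorted, 8);
    printf("found %d at index %ld\n", *hit, (long)(hit - sorted));   // index 1, the first 7
    return 0;
}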
Runtime message sending process summary
Now that we have a basic understanding of the details of method query and method caching, we can combine the method lookup and method caching processes to describe the process of sending messages in the Runtime messaging mechanism
- (1) When an object receives a message [obj message], the runtime first follows obj's isa pointer into its class object cls.
- (2) Inside cls, the cache cache_t is queried first for the method message; if it is found there, the corresponding function is called directly.
- (3) If the previous step finds nothing, the method list of cls is searched; if the method is found, it is cached into the cache_t of obj's class object cls and the function is then called.
- (4) Before every caching operation, the cache capacity is checked first. If the number of cached methods would exceed the critical value (3/4 of the current capacity), the cache is first expanded to twice its size, all previously cached methods are discarded, and the current method is stored into the new, larger cache.
- (5) If the method message can be found in neither the cache nor the method list of obj's class object cls, the lookup follows cls's superclass pointer into its parent class object f_cls.
- (6) Inside f_cls, its cache_t is searched first for message; if the method is found there, it is cached into the cache_t of cls, the class object of the message receiver obj, and the corresponding function is then called.
- (7) If the previous step finds nothing, the method list of f_cls is searched (binary search if sorted, linear otherwise); if message is found, it is again cached into the cache_t of the message receiver's class object cls and the function is called. Note that the method is not cached into the cache_t of the current superclass object f_cls.
- (8) If the method still has not been found, the lookup follows f_cls's superclass pointer up to the next parent class object and repeats steps (6) -> (7) -> (8). If the root class object NSObject has been reached and message is still not found, go to step (9).
- (9) Next comes the dynamic method resolution phase of the messaging mechanism.
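The whole flow above can be condensed into a short pseudocode sketch. The helper names cache_lookup, method_list_lookup, cache_insert and resolve_or_forward are invented for readability; the real entry point is the lookUpImpOrForward function analyzed earlier, and locking, realization and initialization are omitted:

// Pseudocode only; helper names are invented, locking and error handling omitted.
IMP lookup(id obj, SEL sel) {
    Class cls = object_getClass(obj);                         // (1) follow obj's isa
    IMP imp = cache_lookup(cls, sel);                         // (2) receiver's cache_t
    if (imp) return imp;
    imp = method_list_lookup(cls, sel);                       // (3) receiver's method list
    if (imp) { cache_insert(cls, sel, imp); return imp; }     // (3)(4) cache into cls, then call
    for (Class cur = class_getSuperclass(cls); cur != nil; cur = class_getSuperclass(cur)) {
        imp = cache_lookup(cur, sel);                         // (6) superclass cache
        if (!imp) imp = method_list_lookup(cur, sel);         // (7) superclass method list
        if (imp) { cache_insert(cls, sel, imp); return imp; } // cache into cls, NOT cur
    }
    return resolve_or_forward(obj, sel);                      // (9) dynamic resolution / forwarding
}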
At this point, the message sending process and method caching strategy in the OC Runtime are analyzed.
🦋🦋🦋 Portal 🦋🦋🦋
Runtime Principle Exploration (1) — An In-Depth Understanding of isa (Apple's isa Optimization)
Runtime Principle Exploration (2) — An In-Depth Analysis of the Class Structure
Runtime Principle Exploration (3) — OC Class cache_t
Runtime Principle Exploration (4) — Investigating the Underlying Message Mechanism
Runtime Principle Exploration (5) — The Nature of Super
Runtime Principle Exploration (6) — Runtime in Interview Questions