IOS OC method cache source analysis

preface

The author sorted out a series of low-level articles about OC, hoping to help you. This article focuses on the underlying source code analysis of method caching.

1. Alloc principle of OC object creation in iOS

2. Align the OC object memory of iOS

3. The underlying principle of iOS OC ISA

4. Structural analysis of OC source code of iOS

In daily development, when we call methods, do we think about the question, if we call methods frequently, does Apple cache the methods that we use for efficiency? If so, how does it work? To understand this, this article does a source code analysis of cache_t.

1.cache_t

Cache_t is a 16-byte object in the objc_class structure. The source code for cache_t is:

struct cache_t { struct bucket_t *_buckets; mask_t _mask; mask_t _occupied; . }; struct bucket_t { private: // IMP-first is betterfor arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif
}

using MethodListIMP = IMP;
typedef uintptr_t cache_key_t;
Copy the code

From the source, we know that by caching the method number SEL and the function address IMP in bucket_t (also known as the hash bucket). A TestObject class is defined as follows:

#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface TestObject : NSObject{
    NSString *nickName;
}

@property(nonatomic,copy) NSString *name;

-(void)sayName;
-(void)sayHello;
-(void)sayTest;
+(void)sayNickName;

@end

NS_ASSUME_NONNULL_END

#import "TestObject.h"

@implementation TestObject

-(void)sayName{
    NSLog(@"%p",__func__);
}

-(void)sayHello{
    NSLog(@"%p",__func__);
}

-(void)sayTest{
    NSLog(@"%p",__func__);
}

+(void)sayNickName{
    NSLog(@"%p",__func__); } @end // Implement the code TestObject *testObject = [TestObject alloc];
Class tClass = object_getClass(testObject);
[testObject sayName];
[testObject sayHello];
NSLog(@"% @".testObject);

Copy the code

Since methods in instance objects are called in class, to verify that instance methods are in cache_T, we can use the LLDB directive to find cache_t and drill down, as shown in the figure below

TestObject
_mask
_occupied

TestObject *testObject = [[TestObject alloc] init];
Class tClass = object_getClass(testObject);
[testObject sayName];
[testObject sayHello];
[testObject sayTest];
NSLog(@"% @".testObject);
Copy the code

lldb
_mask
_occupied
buckets
sayTest

1.1 Cache_t Attribute value

_buckets:bucket_tAn array of structures,bucket_tIs used to store method SEL memory address and IMP.
_mask: indicates the size of the array as the mask. (since the array sizes maintained here are all integer powers of 2, the binary bit of _mask 000011,000111,001111) is just the right mask to hash the remainder. Just enough to keep it within cache size.
_occupied: is the number of methods currently cached, that is, the number of positions used in the array.

2. Principle analysis of method cache

The essence of the OC method is message sending (that is, objc_msgSend), and the bottom line is to find the IMP through the SEL of the method. Cache_t cache is read using objc_msgSend. Cache_t cache is first written using cache_fill.

 * Cache readers (PC-checked by collecting_in_critical())
 * objc_msgSend*
 * cache_getImp
 *
 * Cache writers (hold cacheUpdateLock while reading or writing; not PC-checked)
 * cache_fill         (acquires lock)
 * cache_expand       (only called from cache_fill)
 * cache_create       (only called from cache_expand)
 * bcopy               (only called from instrumented cache_expand)
 * flush_caches        (acquires lock)
 * cache_flush        (only called from cache_fill and flush_caches)
 * cache_collect_free (only called from cache_expand and cache_flush)
Copy the code

2.1 cache_fill

The method is first cached by cache_fill, the source code below

void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if ! DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}
Copy the code

The cache_fill method passes in the CLS Class Class and SEL,IMP for the method.

2.2 cache_fill_nolock

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Never cache before +initialize is done
    if(! cls->isInitialized())return;

    // Make sure the entry wasn't added to the cache by some other thread // before we grabbed the cacheUpdateLock. if (cache_getImp(cls, sel)) return; cache_t *cache = getCache(cls); cache_key_t key = getKey(sel); // Use the cache as-is if it is less than 3/4 full mask_t newOccupied = cache->occupied() + 1; mask_t capacity = cache->capacity(); if (cache->isConstantEmptyCache()) { // Cache is read-only. Replace it. cache->reallocate(capacity, capacity ? : INIT_CACHE_SIZE); } else if (newOccupied <= capacity / 4 * 3) { // Cache is less than 3/4 full. Use it as-is. } else { // Cache is too full. Expand it. cache->expand(); } // Scan for the first unused slot and insert there. // There is guaranteed to be an empty slot because the // minimum size is 4 and we resized at 3/4 full. bucket_t *bucket = cache->find(key, receiver); if (bucket->key() == 0) cache->incrementOccupied(); bucket->set(key, imp); } cache_t *getCache(Class cls) { assert(cls); return &cls->cache; } cache_key_t getKey(SEL sel) { assert(sel); return (cache_key_t)sel; } /* Initial cache bucket count. INIT_CACHE_SIZE must be a power of two. */ enum { INIT_CACHE_SIZE_LOG2 = 2, INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2) }; #if __LP64__ typedef uint32_t mask_t; // x86_64 & arm64 asm are less efficient with 16-bits #else typedef uint16_t mask_t; #endif typedef uintptr_t cache_key_t;Copy the code

Looking at the various methods in the source code, getCache(CLS) fetches the class cache_t from CLS. GetKey (sel) converts sel to cache_KEY_t. Cache -> Occupied (); cache-> Capacity ();

mask_t cache_t::occupied() 
{
    return _occupied;
}

mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}

mask_t cache_t::mask() 
{
    return _mask; 
}

Copy the code

Capacity () is the number of methods used to fetch the cache. The default value is also 0. If mask() has a value, add 1 to this value. This is equivalent to capturing the capacity of the method. IsConstantEmptyCache () determines whether the number of methods occupied is less than or equal to 3/4 of the capacity. If so, do nothing. Otherwise you need to start expanding. If there is no cache, you need to execute the Reallocate function. INIT_CACHE_SIZE in realLocate is 4, so the initial realLocate values are 0 and 4.

2.2.1 reallocate

Reinitialize the cache. The source code for this function is as follows:

Void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity) {// indicates whether the old cache canBeFreed bool freeOld = canBeFreed(); // buckets bucket_t = buckets(); // create a new bucket bucket_t *newBuckets = allocateBuckets(newCapacity); // Cache's old contents are not propagated. // This is thought to save cache memory at the cost of extra cache fills. // fixme re-measure this assert(newCapacity > 0); assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1); SetBucketsAndMask (newBuckets, newCapacity - 1); If (freeOld) {// Drop buckets cache_collect_free(oldBuckets, oldCapacity); cache_collect(false); } } bool cache_t::canBeFreed() { return ! isConstantEmptyCache(); } bucket_t *allocateBuckets(mask_t newCapacity) { // Allocate one extra bucket to mark the end of the list. // This can't overflow mask_t because newCapacity is a power of 2.
    // fixme instead put the end mark inline when +1 is malloc-inefficient
    bucket_t *newBuckets = (bucket_t *)
        calloc(cache_t::bytesForCapacity(newCapacity), 1);

    bucket_t *end = cache_t::endMarker(newBuckets, newCapacity);

#if __arm__
    // End marker's key is 1 and imp points BEFORE the first bucket. // This saves an instruction in objc_msgSend. end->setKey((cache_key_t)(uintptr_t)1); end->setImp((IMP)(newBuckets - 1)); #else // End marker's key is 1 and imp points to the first bucket.
    end->setKey((cache_key_t)(uintptr_t)1);
    end->setImp((IMP)newBuckets);
#endif
    
    if (PrintCaches) recordNewCache(newCapacity);

    return newBuckets;
}


void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds). // It is unsafe for objc_msgSend to see old buckets and new mask. // Therefore we write new buckets, wait a lot, then write new mask. // objc_msgSend reads mask first, then buckets. // ensure other threads see buckets contents before buckets pointer mega_barrier(); _buckets = newBuckets; // ensure other threads see new buckets before new mask mega_barrier(); _mask = newMask; _occupied = 0; }Copy the code

From the source code, you can see that reallocate takes the old buckets and creates the new buckets, because the old buckets need to be erased before they can be judged free. Creating a new buckets is known in the allocateBuckets function, which uses the calloc function to request cache_T memory space, with default values for both key and IMP. The setBucketsAndMask function assigns buckets and _mask because newMask is passed in as 3 and _occupied is 0 because the method has not been cached and is only initialized. This is a good indication of the mask 3 obtained when the LLDB directive is first used.

2.2.2 expand

If the value of newOccupied is greater than 3/4 of capacity, expand() is required

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if((uint32_t)(mask_t)newCapacity ! = newCapacity) { // mask overflow - can't grow further // fixme this wastes one bit of mask newCapacity = oldCapacity; } reallocate(oldCapacity, newCapacity); } mask_t cache_t::capacity() { return mask() ? mask()+1 : 0; }Copy the code

When capacity() needs to be expanded, oldCapacity is 4 and newCapacity is 8. Then the reallocate function continues to execute, passing in parameters 4 and 8. The reallocate function will empty the old buckets, change the mask value to 7, and prepare the value to 0. But why is it that the LLDB directive reads “Occupied 1”? After this process has gone through, after executing the process of judgment, it will execute to

// Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
    
    void cache_t::incrementOccupied() 
{
    _occupied++;
}

Copy the code

Find looks for bucket_t using the key and receiver above. If key() is 0, the _occupied number will be +1. And the bucket key and IMP are populated.

2.2.3 the find function

bucket_t * cache_t::find(cache_key_t k, id receiver) { assert(k ! = 0); bucket_t *b = buckets(); mask_t m = mask(); Mask_t BEGIN = cache_hash(k, m); mask_t begin = cache_hash(k, m); mask_t begin = cache_hash(k, m); // begin assigns the value to I, which is used to switch the index mask_t I = begin;do {
        if(b [I]. The key () = = 0 | | b [I] key () = = k) {/ / use the I from the hash values, if out bucket_t key = k, the query is successful, returns the bucket_t, / / if the key = 0, No method has been cached on index I. Return bucket_t to abort the cached query.return&b[i]; }}while((i = cache_next(i, m)) ! = begin); // This step is actually the same as I = I -1, back updoCycle, equivalent to locate a hash table cell elements in it, and once again are compared, and the key value of k / / when I = 0, is I to hash the first element index will mask was assigned to the I, make its point to hash the last element, start to reverse traversal hash, / / it is equivalent to around, // If bucket_t is empty and bucket_t is not found, then the loop ends, indicating that the search failed. Call the bad_cache method. // hack Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache)); cache_t::bad_cache(receiver, (SEL)k, cls); } static inline mask_t cache_hash(cache_key_t key, mask_t mask) {return (mask_t)(key & mask);
}
Copy the code

We can see from find that the address of bucket_t is obtained by the size of the mask and the key obtained by the begin subscript of the hash function. Since the hash function is unordered, the location of bucket_t in buckets is also unordered.

Class methods are not found in a class cache_t because they are cached in the metaclass. Therefore, if you want to find a metaclass using an LLDB directive, you can first find a metaclass using isa. You can follow the above procedure to verify that the metaclass holds a class method.

3. The last

The essence of the OC method is message sending (that is, objc_msgSend), and the bottom line is to find the IMP through the SEL of the method.

1. Method cachingcache_tIn, respectivelybucketsPointer address to store the method array,maskTo hold the size of the method array,occupiedTo store the current method usage.
2. In the methodnewOccupiedThe new method occupies more than the current methodcapacity()Three-quarters of them need to be expanded.
3. This parameter is set during capacity expansionmaskforcapacity() * 2 - 1That is, 2 times the number of methods minus 1, for example 3 the first time, 7 the second time. The end will be the oldbucketsThe list is cleared. However, the methods that need to be expanded will eventually be addedbucketsThe inside.