preface

In the previous articleIOS underlying exploration ——– type of principle analysis (ii)We exploredIsa, Superclass, bits, then the entire class structure (as shown in the figure below) remainscacheThere is no exploration. So today, I’m going to complete this.

Cache literally means cache. So what is the underlying cache? Let’s explore.

From the previous article, we obtained bits by memory translation, so can we obtain cache by the same method, explore what cache is? What is his basic data structure?

cacheThe flow chart

1, to explorecacheData structure of

1.1. Preliminary study on cache memory structure through LLDB debugging

The code to explore is stillobjcSource code, create oneTestPersonClass, and get it in main.mClassAnd then through thelldbDebug view, as shown below:

The LLDB debugging result is as follows:

The contents of the class structure are: ISA (8 bytes), superclass (8 bytes), cache (16 bytes), and bits. Cache_t; cache_t; cache_t;

struct cache_t { private: explicit_atomic<uintptr_t> _bucketsAndMaybeMask; union { struct { explicit_atomic<mask_t> _maybeMask; #if __LP64__ uint16_t _flags; #endif uint16_t _occupied; }; explicit_atomic<preopt_cache_t *> _originalPreoptCache; }; // omit code... // Static constexpr Uintptr_t maxMask = ((uintptr_t)1 << (64-maskshift)) -1; static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << maskShift) - 1; // omit code... Static bucket_t *emptyBuckets(); // omit code... // Insert SEL SEL, IMP IMP, id receiver void insert(SEL SEL, IMP IMP, id receiver); // omit code... }Copy the code

$3 = LLDB; $3 = LLDB; $3 = LLDB; $3 = LLDB; Here, there is a question, since the cache is a cache, then the cache should also be variables and methods, should be related to SEL or IMP, but, through the source code, but did not see related to it. According to the printed result, _bucketsAndMaybeMask and _originalPreoptCache contain values, respectively.

Since both have values, we need to look at the function methods of _bucketsAndMaybeMask and _originalPreoptCache.

Cache can cache, then corresponding to it, there should be increase, delete, change, check function, read and write must be indispensable. (uintptr_t)1 << (64-maskshift)) -1); Clearing and creating bucket_t; There are even insert operations (SEL SEL, IMP IMP, ID receiver). So we need to figure out which is the key.

1.2,cacheInside the cache method is inbucket_tinside

So if there’s an insert, then there’s a read and write, so let’s go into the insert and see what’s going on, okay?

Void cache_t::insert(SEL SEL, IMP IMP, id receiver) { Bucket_t *b = buckets(); mask_t m = capacity - 1; mask_t begin = cache_hash(sel, m); mask_t i = begin; // omit code...Copy the code

A quick scan shows that buckets is inserted and bucket_t is the receiver. So it’s time to look at the bucket_t implementation.

Struct bucket_t {private: x86_64. #if __arm64__ // explicit_atomic<SEL> _sel; #else // Explicit_atomic <SEL> _sel; explicit_atomic<uintptr_t> _imp; #endif // omit code...Copy the code

Into thebucket_tInside, it’s pretty clear that it’s an arraybucketsIt’s just a bunch of groupsimpandselThe cache of,_impand_selThe one-to-one correspondence is as follows:

  • To sum up, we can get a general data structure relationship chart:

2,cacheThe underlyinglldbAnalysis of the

2.1,lldbDelve into

Now that we have this structure in place, we need to go into the code and verify it. YeslldbDebug to see if the method is stored inside, through the frontlldbDebugging to$3And then we go down_bucketsAndMaybeMaskContents stored:In the end it diderrorThe error. It means that_bucketsAndMaybeMaskThe route we’re looking for doesn’t work. Keep changing_originalPreoptCacheTry:Found the same way.

When we find that none of them work, we go directly to the source code to find out if there is a relevant method. Combined with the general data structure of the cache we just analyzed, we can lock down bucket_t related methods. The combination of SEL and IMP is stored in the buckets array: buckets

// omit code... struct bucket_t *buckets() const; // omit code...Copy the code

You can viewbuckets()Store the case againlldbDebugging, and executing firstTestPersonThe inside of thesaySomethingMethod, there will be a cache, so in the beginning, executesaySomethingMethods:

As a result of this debugging, we can get _sel and _IMP information, although their value is 0, but the direction is correct. In the $3 column, _occupied is 0. Because the method is not called, caching is not started. So before debugging, you need to call.

According to the debugging result, it is found that the value stored by $5 is empty, because the value of _sel and _IMP is 0, and we clearly get the value from buckets’ array. This happens because apple hashes _SEL and _IMP into the buckets array.

With aHash chain table. What’s the hash list? Let’s see what happens nextlldbDebug the steps throughp $3.buckets()[2], is to go tobucketsThe first of an array of2The value of each corner mark:So there we have it.

2.2,HashThe list

B: You don't like buckets()[2]?

So that’s what a hash list is, right?

  • There are two kinds of computer storage, one is sequential storage, the other is chain storage.

A hash list is a data structure that combines an array with a linked list, for example, a set of key numbers{12,13,25,23,38,34,6,84,91}.The Hash tableLong for14.HashFunction foraddress(key) = key % 11; And there is aarray[13]The array structure of. In this set of key numbers, the hash function calculates12, 23, 34The hashes of phi are the same1And then,12, 23, 34It’s stored in an array as a linked list structure1As shown in the figure below:

The other number processing method is the same, according to the hash function calculated by the hash value, and then the corresponding number into the corresponding number inside.

2.3. Verify the compatibility of Apple systembucketsThe array is hashed

If the data is stored in the cache, it may not be stored in the 0 bit, but may be stored in the other bit. For example, I found this method, which I found in position two.

Now what we want to know is what is this method that we just found? So you need to go to bucket_t and see if there’s a method for that. See the source code:

inline SEL sel() const { return _sel.load(memory_order_relaxed); }
Copy the code

There is asel()Method output. So, inlldbDebug inside, addsel()To print the method name:That’ll give us what we need to find.

So the corresponding is to find the output result of _IMP, so still want to find a method in the source code:

inline IMP imp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, Class cls)
Copy the code

According to the source code, you need to pass in onebaseandclsBecause we don’t knowbaseWhat was it, so it went viralnil.clsisTestPersonClass, passpClass, the result can be obtained:

Next, use code to analyze.

3. Break away from source code analysiscache

3.1, copy the source code, simplified version implementationClass structureandThe cache structure

The reason for using code analysis:

  • The first reason: it’s easier to download unfamiliar code or source code that you can’t run debugging directly.
  • Second reason: when in uselldb, add or subtract some attributes, methods, need to perform more repeated steps again, more tedious;
  • Third reason: the way you sample on a small scale gives you a clearer picture of what’s underneath.

We will use the source code to define a similar class data structure, to see the situation in the cache.

Create a project and create a TestPerson class with the following code:

//TestTestPerson class #import NS_ASSUME_NONNULL_BEGIN @interface TestPerson: NSObject - (void)say1; - (void)say2; - (void)say3; - (void)say4; - (void)say5; - (void)say6; - (void)say7; + (void)sayHappy; @end #import "TestPerson.h" @implementation TestPerson - (void)say1{ NSLog(@"TestPerson say : %s",__func__); } - (void)say2{ NSLog(@"TestPerson say : %s",__func__); } - (void)say3{ NSLog(@"TestPerson say : %s",__func__); } - (void)say4{ NSLog(@"TestPerson say : %s",__func__); } - (void)say5{ NSLog(@"TestPerson say : %s",__func__); } - (void)say6{ NSLog(@"TestPerson say : %s",__func__); } - (void)say7{ NSLog(@"TestPerson say : %s",__func__); } + (void)sayHappy{ NSLog(@"TestPerson say : %s",__func__); H > #import "testPerson. h" #import <objc/runtime.h> typedef uint32_t mask_t; Struct sc_bucket_t {SEL _sel; struct sc_bucket_t {SEL _sel; IMP _imp; }; Struct sc_cache_t {//uintptr_t _bucketsAndMaybeMask Uintptr_t addr = _bucketsandmaybemask. load(memory_order_relaxed);) struct sc_bucket_t *_bukets; // 8 mask_t _maybeMask; // 4 uint16_t _flags; // 2 uint16_t _occupied; / / 2}; Struct sc_class_data_bits_t {uintptr_t bits; }; Struct sc_objc_class {Class isa; Class superclass; struct sc_cache_t cache; // formerly cache pointer and vtable struct sc_class_data_bits_t bits; }; int main(int argc, const char * argv[]) { @autoreleasepool { TestPerson *p = [TestPerson alloc]; Class pClass = p.class; struct sc_objc_class *sc_class = (__bridge struct sc_objc_class *)(pClass); NSLog(@"%hu - %u",sc_class->cache._occupied,sc_class->cache._maybeMask); } return 0; }Copy the code

Now, do not callTestPersonClass instance method and class method, run, check the result:

Now callTestPersonthesay1Method,[p say1];

And then we can add moreSay2, say3, say4, say5, say6, say7Example method, see the print result:It’s kind of down. What’s going on?

Class methods do not existcacheinside

3.2.1 The methods in the class are stored in another newly created memory

To find the answer, you need to enter the source code inside.In the structurecache_tInside, one of his addresses is stored in_bucketsAndMaybeMask, and the_bucketsAndMaybeMaskIs made up of two parts:bucketsandmaskThat means it’s storing two pieces of data. So how do we get these two data sets?

One entry point is the insertion of the cache method we mentioned earlier.

We know thatcacheIt’s a cache, so there’s a cache, there’s an insert, there’s a write. So we can be incache_tFind the way to insert:

From the point of view of the insertion method, the insertion parameters are SEL, IMP, and receiver message receiver (who called the message). So look at the implementation inside the insert method:

Void cache_t::insert(SEL SEL, IMP IMP, id receiver) {// Omit code... // Use the cache as-is if we expect to fill ratio. Mask_t newOccupied = occupied() + 1; unsigned oldCapacity = capacity(), capacity = oldCapacity; if (slowpath(isConstantEmptyCache())) { // Cache is read-only. Replace it. if (! capacity) capacity = INIT_CACHE_SIZE; reallocate(oldCapacity, capacity, /* freeOld */false); } else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) { // Cache is less than 3/4 or 7/8 full. Use it as-is. } #if CACHE_ALLOW_FULL_UTILIZATION else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied +  CACHE_END_MARKER <= capacity) { // Allow 100% cache utilization for small buckets. Use it as-is. } #endif else { capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; if (capacity > MAX_CACHE_SIZE) { capacity = MAX_CACHE_SIZE; } reallocate(oldCapacity, capacity, true); } bucket_t *b = buckets(); mask_t m = capacity - 1; mask_t begin = cache_hash(sel, m); mask_t i = begin; // Scan for the first unused slot and insert there. // There is guaranteed to be an empty slot. do { if (fastpath(b[i].sel() == 0)) { incrementOccupied(); b[i].set<Atomic, Encoded>(b, sel, imp, cls()); return; } if (b[i].sel() == sel) { // The entry was added to the cache by some other thread // before we grabbed the cacheUpdateLock. return; } } while (fastpath((i = cache_next(i, m)) ! = begin)); bad_cache(receiver, (SEL)sel); }Copy the code

The final value of this method is “occupied” : 1; if the final value is “occupied”, the final value is “1”; if the final value is “occupied”, the final value is “1”; When compiling for the first time, the cache is empty, so enter the following judgment:

if (slowpath(isConstantEmptyCache())) { // Cache is read-only. Replace it. if (! capacity) capacity = INIT_CACHE_SIZE; //INIT_CACHE_SIZE = (1 << INIT_CACHE_SIZE_LOG2), INIT_CACHE_SIZE_LOG2 = 2, so shift 2 to the left. //0x001, offset by 2 bits, 0x100, so INIT_CACHE_SIZE is 4, reallocate(oldCapacity, capacity, /* freeOld */false); }Copy the code

INIT_CACHE_SIZE = 4 (capacity = 4) Reallocate () opens up the realLocate () method:

Void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld) {// Buckets (), Assign to old oldBuckets bucket_t *oldBuckets = buckets(); // newBuckets bucket_t *newBuckets = allocateBuckets(newCapacity); // Cache's old contents are not propagated. // This is thought to save cache memory at the cost of extra cache fills. //  fixme re-measure this ASSERT(newCapacity > 0); ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1); // The first address of cache_t is _bucketsAndMaybeMask. This is the first compilation, so an empty container is inserted, SetBucketsAndMask (newBuckets, newCapacity-1) is a loaded container for the content to be inserted next; if (freeOld) { collect_free(oldBuckets, oldCapacity); }}Copy the code

Let’s take a look at the buckets creation process and enter the allocateBuckets() method to see how it is implemented:

size_t cache_t::bytesForCapacity(uint32_t cap) { return sizeof(bucket_t) * cap; Cache_t ::endMarker(struct bucket_t *b, struct bucket_t *b, // (uint32_t cap) {return (bucket_t *)((uintptr_t)b + bytesForCapacity(cap)) -1; } bucket_t *cache_t::allocateBuckets(mask_t newCapacity) { // Allocate one extra bucket to mark the end of the list. // This can't overflow mask_t because newCapacity is a power of 2. bucket_t *newBuckets = (bucket_t *)calloc(bytesForCapacity(newCapacity), 1); Bucket_t *end = endMarker(newBuckets, newCapacity); End marker's sel is 1 and IMP points BEFORE the first bucket instruction in objc_msgSend. end->set<NotAtomic, Raw>(newBuckets, (SEL)(uintptr_t)1, (IMP)(newBuckets - 1), nil); #else // End marker's sel is 1 and imp points to the first bucket.  end->set<NotAtomic, Raw>(newBuckets, (SEL)(uintptr_t)1, (IMP)newBuckets, nil); #endif if (PrintCaches) recordNewCache(newCapacity); // Record the new cache return newBuckets; }Copy the code

This is how newBuckets started.

Then we go to the setBucketsAndMask() method:

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask) { #ifdef __arm__ // // ensure other threads see buckets contents before buckets pointer mega_barrier(); / / here store newBuckets into _bucketsAndMaybeMask _bucketsAndMaybeMask. Store (uintptr_t newBuckets, memory_order_relaxed); // ensure other threads see new buckets before new mask // set memory barrier to prevent multiple threads from accessing mega_barrier(); _maybeMask.store(newMask, memory_order_relaxed); _occupied = 0; # elif __x86_64__ | | i386 / / / / macOS and simulator ensure other threads see buckets contents before buckets pointer / / here store newBuckets into _bucketsAndMaybeMask _bucketsAndMaybeMask. Store (uintptr_t newBuckets, memory_order_release); // ensure other threads see new buckets before new mask // write _maybemask. store(newMask, memory_order_release); _occupied = 0; #else #error Don't know how to do setBucketsAndMask on this architecture. #endif}Copy the code

3.3.2 rainfall distribution on 10-12,cacheThrough the stored address to the newly opened space read and write method

  • Through source code analysis, it is known that in the cache, only pointer addresses are stored, and the method is called, but there is another open space (source code analysis, said the container). The sel and IMP of each method are stored in a bucket, which in turn is stored in the newly created memory. This is where buckets array comes from. When a method is read, it simply goes to the pointer address in the cache and finds the corresponding bucket in the newly created space to fetch the method.

  • The advantage of this is that no matter how many methods are stored in the cache, the size of the cache does not increase as the number of methods increases, because the cache contains only pointer addresses, each of which is only 8 bytes. This is how Apple optimizes memory. Here is the illustration:

3.3.3,bucketsArray storage passesHashTo deal with

Void cache_t::insert(SEL SEL, IMP IMP, id receiver); After opening a new storage space:

// buckets bucket_t *b = buckets(); M = 3 mask_t m = capacity - 1; // Mask_t m = capacity - 1; // Buckets is an array and starts with a space of 4, so the first method is hashed to an address that is the original cache location. mask_t begin = cache_hash(sel, m); mask_t i = begin;Copy the code

TestPerson: mask_t m = 3; TestPerson: say1; mask_t m = 3; TestPerson: say1;

The cache_hash() method is used to calculate the starting point of storage once you get the buckets array:

static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    uintptr_t value = (uintptr_t)sel;
#if CONFIG_USE_PREOPT_CACHES
    value ^= value >> 7;
#endif
    return (mask_t)(value & mask);
}
Copy the code

Once the hash value is calculated, caching begins. A do-while loop is then performed to find the empty position.

Do {if (fastPath (b[I].sel() == 0)) {// incrementOccupied(); // The method is: _occupied++; B [I]. Set <Atomic, Encoded>(b, sel, IMP, CLS ()); // add sel and IMP to buckets' I; // add sel and IMP to buckets' I; } if (b[I].sel() == sel) { Upon hearing The entry added to The cache by some other thread // before we cacheUpdateLock. Return; } while (fastPath ((I = cache_next(I, m))! = begin));Copy the code

Cache_next () : cache_next() : cache_next() : cache_next() : cache_next();

#if CACHE_END_MARKER // __arm__ || __x86_64__ || __i386__ static inline mask_t cache_next(mask_t i, mask_t mask) { return (i+1) & mask; } #elif __arm64__ static inline mask_t cache_next(mask_t I, mask_t mask) {return I? i-1 : mask; }Copy the code

So this is the first time that you do an insert.

3.3.4 According to the increase or decrease of methods in the class, the memory space will be re-created

The next method inserts:

  • When cached methods occupy less capacity than the current total capacityThree quarters of(capacity * 3 / 4), then perform the judgment:else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity)))And thecache_fill_ratio(capacity) = capacity * 3 / 4:
static inline mask_t cache_fill_ratio(mask_t capacity) {
    return capacity * 3 / 4;
}
Copy the code
  • When the storage is full. Apple supports full storage and does nothing when it is full
#if CACHE_ALLOW_FULL_UTILIZATION
    else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied + CACHE_END_MARKER <= capacity) {
        // Allow 100% cache utilization for small buckets. Use it as-is.
    }
Copy the code
  • When the cached method takes up more capacity than the current total capacityThree quarters of(capacity * 3 / 4), then it needs to be expandedcapacity * 2, the implementation ofelseJudgment. I just figured it out,capacity4Double the capacity8, butmask_t m = capacity - 1, so,m = 7.elseJudge source code:
else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);
    }
Copy the code

How to expand the capacity? I need to go inside the Reallocate () method:

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        collect_free(oldBuckets, oldCapacity);
    }
}
Copy the code

The collect_free() method creates a new space, and since the freeOld passed in is true, it also frees the old space into the collect_free() method:

void cache_t::collect_free(bucket_t *data, mask_t capacity) { #if CONFIG_USE_CACHE_LOCK cacheUpdateLock.assertLocked(); #else runtimeLock.assertLocked(); #endif if (PrintCaches) recordDeadCache(capacity); _garbage_make_room (); Garbage_byte_size += cache_t::bytesForCapacity(capacity); Garbage_refs [garbage_count++] = data; Cache_t ::collectNolock(false); // Free memory;Copy the code

So, explains the above, in engineering, and the second printing, why only say7 address, while the other six address 0, also indirectly as you can see, expansion of the operation for two times, is the first time in 4, the expansion to 8, when eight also cannot meet the expanded to 16, so just have 1-15 of the results. Then why is Say7 still around?

Analysis, step by step according to the result of just source code analysis, when say7 into came in, the system found that memory is not enough, then it will be expanding operations, the old space memory is stored say1 to say6 method, expansion at this time, save say7 in new space memory inside, and then to release the old space memory, so, When printing the address from say1 to say6, it is 0x0.

  • The apple system does this because the original memory can not be modified. You can only create a new memory to replace the original memory. At the same time, panning an array is performance-costly. When new memory is created, apple’s system treats the data stored in the old memory as inactive data that can be removed.

As shown in the figure above, before depositingsay3, the original memory has been freed, when executed againSay1, say2When:There may be space between method stores because of hash function calculations.

4,cacheWhat was done before the method was inserted?

The flow that the cache performs when a method is about to be cached, but who initiates the cached message? Next, use the instance method to explore how the TestPerson class calls the – (void)saySomething method to insert().

4.1, exploreinsert()Previous Execution process

Since we want to explorecacheWhat is done before the method is inserted, so we are not incache_tIn the structure, the method of insertion is tracked. Then ininsert()Method to create a breakpoint in the implementation. The diagram below:When run to breakpoint, can passlldb, the inputBt instructionView the stack message (actually, you can also see it on the left side of the project) :Based on the stack message, we know that an insert is being performedinsert()Method is preceded by the method flow that executes: frommain()Function of theautoreleasepool –> _objc_msgSend_uncached –> lookUpImpOrForward –> log_and_fill_cache –> cache_t::insert.

4.2. View information through assembly

Just from the stack information, you know how to call instance methods[p saySomething], there will be from_objc_msgSend_uncachedPerform tocache_t::insertThis is part of the process, but from the start of calling the method, to_objc_msgSend_uncachedThis part of the process, we don’t know yet, so we can use the ultimate —assembly.perform[p saySomething]After, is to goobjc_msgSend()Methods. Then trace the breakpoint (it’s too much, two screenshots) : objc_msgSend()Method, and then execute it_objc_msgSend_uncached()Methods. At this point, you are connected to the previous step.

  • callinsert()Method flow:[p saySomething]The underlying implementationobjc_msgSend –> _objc_msgSend_uncached –> lookUpImpOrForward –> log_and_fill_cache –> cache_t::insert

theninsert()Flow chart of is:

Ok, cache analysis is now complete.