Ios-underlying Principle 11: Analysis of cache principles in objC_class

IOS underlying principles + reverse article summary

The main purpose of this article is to understand cache_T and sel-IMP caching principles

The overall analysis

In the previous ios-Underlying Principle 07: Isa and Class Association and IOS-Underlying Principle 08: Class & Class Structure analysis, isa and bits in objc_class were analyzed. This time, the cache property in objc_calss was analyzed

What is stored in the cache?

First of all, we need to know what is stored in the cache?

Looking at the cache_t source code, we see that there are three architectural processes. In the real architecture,Mask and the bucketIt was written together for the purpose ofTo optimize theCan be passed through the respectivemaskTo get the corresponding data
- CACHE_MASK_STORAGE_OUTLINEDRepresents the running environmentThe simulatorormacOS
- CACHE_MASK_STORAGE_HIGH_16Indicates that the running environment is64bitA:
- CACHE_MASK_STORAGE_LOW_4Indicates that the running environment isThe 64bitA:

Struct cache_t {#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED//macOS, explicit_atomic Struct bucket_t * _buckets; // This is equivalent to a struct bucket_t * _buckets; Buckets () explicit_atomic<struct bucket_t *> _buckets; explicit_atomic<mask_t> _mask; #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 // Explicit_atomic <uintptr_t> _maskAndBuckets; // They are written together to optimize mask_t _mask_unused; // The mask is similar to the MASK of ISA, that is, the bit field // mask omits.... #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4 // Explicit_atomic <uintptr_t> _maskAndBuckets; mask_t _mask_unused; // The mask is similar to the MASK of ISA, that is, the bit field // mask omits.... #else #error Unknown cache mask storage type. #endif #if __LP64__ uint16_t _flags; #endif uint16_t _occupied; // The method is omitted..... }Copy the code

To viewbucket_tSource code, also divided into two versions,A: 和 The real machine, the difference is thatsel 和 impIs not in the same order

Struct bucket_t {private: #if __arm64__ // 主 机 //explicit_atomic <uintptr_t> _imp; explicit_atomic<SEL> _sel; Explicit_atomic <SEL> _sel; explicit_atomic<uintptr_t> _imp; #endif // omit other parts of the method}Copy the code

Therefore, through the above two structure source code, cache cache is SEL-IMP

The overall structure is shown in the figure below

Look for sel-IMP in the cache

There are two ways to look for stored SEL-IMPs in cache_T

Search through source code
Look in the project without the source code

The preparatory work

To define aLGPersonClass, and define twoattributeAnd the fiveInstance methodsAnd its implementation

inmainDefined in theLGPersonThe object of the classp, and call three of the instance methods, adding a breakpoint where p calls the first method

Search through source code

Run execute, break in[p sayHello];Section, where the following LLDB debugging process is executed

– Cache attributeIs obtained throughpclassThe first address of is shifted by 16 bytes, i.eThe first address + 0 x10Obtain the address of the cache

- We know that 'sel-imp' is in the '_buckets' property of' cache_t '(currently in' macOS 'environment), and that' sel-IMP 'is in the' _buckets' property of 'cache_t'. And the cache_T structure provides a method to get the _buckets property, as well as the sel-IMP, when you get the _buckets property, These two acquisition methods also provide the corresponding acquisition methods' sel() 'and' IMP (pClass) 'in the' bucket_t 'structureCopy the code

As you can see from the figure above, when no method call is made, the cache is not cached. After a method call is made, the cache has a cache, that is, every call to the method will cache the method.

We now know how to obtain sel-IMP from cache, okvalidationIs that what we call sel and IMP for printing? Can be achieved bymachoViewOpen the targetExecutable fileIn theMethods listTo see theimpThe value of is consistent, as shown below, is found to be consistent, so print thissel-impisLGPersontheInstance methods

Following the steps above, we call a method again, this time we want to get the second SEL, whose debug LLDB is as follows

The first method is simply fetched by calling the corresponding method at the first address of _buckets. How about the second? In the previousIOS – Underlying Principle 08: Class & Class structure analysisThere is a concept mentioned in the articleThe pointer offsetSo we can offset this by the first address of the _buckets attribute, i.ep *($9+1)To obtain the second methodSel and impIf there are multiple methods that need to be fetched, and so on, for examplep *($9+i)

Out of the source code through the project search

When you leave the source environment, you copy the required portion of the source code into the project. The complete code is as follows

typedef uint32_t mask_t; // x86_64 & arm64 asm are less efficient with 16-bits struct lg_bucket_t { SEL _sel; IMP _imp; }; struct lg_cache_t { struct lg_bucket_t * _buckets; mask_t _mask; uint16_t _flags; uint16_t _occupied; }; struct lg_class_data_bits_t { uintptr_t bits; }; struct lg_objc_class { Class ISA; Class superclass; struct lg_cache_t cache; // formerly cache pointer and vtable struct lg_class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags }; int main(int argc, const char * argv[]) { @autoreleasepool { LGPerson *p = [LGPerson alloc]; Class pClass = [LGPerson class]; // objc_clas [p say1]; [p say2]; //[p say3]; //[p say4]; struct lg_objc_class *lg_pClass = (__bridge struct lg_objc_class *)(pClass); NSLog(@"%hu - %u",lg_pClass->cache._occupied,lg_pClass->cache._mask); for (mask_t i = 0; i<lg_pClass->cache._mask; Struct lg_bucket_t bucket = lg_pClass->cache._buckets[I]; NSLog(@"%@ - %p",NSStringFromSelector(bucket._sel),bucket._imp); } NSLog(@"Hello, World!" ); } return 0; }Copy the code

There’s a problem here, in the source code,objc_classtheISAProperties are inherited fromobjc_objectYes, but when we copied it over, we removed itobjc_class, this attribute needs to be made clear, otherwise the print result will be problematic, as shown in the figure below.

After adding the ISA attribute, adding two method calls, the correct print should look like this

Add the call of two methods, that is, unannotate say3 and say4, and the print result is as follows

In view of the above printout, there are the following questions

1._maskWhat is?
2,_occupiedWhat is?
3. Why does it print occupied and mask with more method callsWill change?
4,bucketWhy is the data thereMissing case? For example, in 2-7, only say3 and say4 methods have function Pointers
5. Why is the printing order of SAY3 and SAY4 in 2-7 that “SAY4” is printed first and “SAY3” is printed later, and they are still next to each otherThe order is out of order?
6. Printedcache_tIn the_ocupiedWhy is it from2To start?

With these questions mentioned above, let’s explore the underlying principles of cache

Cache_t Underlying principles

First of all, from thecache_tIn the_maskAttribute start analysis, findcache_tThe function that causes the change inincrementOccupied()function

The concrete implementation of this function is

void incrementOccupied(); Void cache_t::incrementOccupied() {_occupied++; }Copy the code

Source code, global searchincrementOccupied()Function, found only incache_ttheinsertMethods have calls

insertMethod, understood ascache_tInsert, andcacheIs stored insel-imp, so the principle of cache is frominsertMethod. The following is a flowchart of cache analysis

Global searchinsert(Methods, found onlycache_fillMethod

Global searchcache_fillBefore writing, there is another step, that is, cache read, that is, look for SEL-IMP, as shown in the following

While the focus of this article is on the principles of cache storage, the insert method is followed by a flowchart for writing to cache_T

Insert method analysis

ininsertMethod, its source implementation is as followsIt is mainly divided into the following parts

[Step 1]To calculateOut of the currentCache usage
[Step 2] According toCache usage judgmentperformoperation
[Step 3] For the storagebucketAn internalImp and SEL assignment

The first step is to calculate the current cache occupation according to the occupied value. When the occupied property is not assigned and there is no method call, the occupied() is 0 and the newOccupied is 1, as shown below

mask_t newOccupied = occupied() + 1;
Copy the code

On the calculation of the cache usage, there are the following points:

If you alloc the occupied space and the object is already created, the occupied method will add 1 if you call init again
The occupied method is implicitly called when there is a property assignment, and the set method is added, meaning the occupied function adds several occupied assignments to its current level
The occupied also increases when there are method calls, so if there are a few calls, the occupied adds a few to the occupied list

[Step 2] Determine the operation to be performed according to the cache usage

If it isFirst creationIs opened by default4a

If (slowPath (isConstantEmptyCache())) {// Occupied () = 0; // Cache is read-only. Replace it. If (! capacity) capacity = INIT_CACHE_SIZE; Reallocate (1<<2 -- 100) Reallocate (oldCapacity, capacity, /* freeOld */false); // So far, the process of if has been initialized to create space}Copy the code

If the cache is occupiedLess than or equal to 3/4, no treatment is performed

else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) { 
    // Cache is less than 3/4 full. Use it as-is.
}
Copy the code

If the cache is occupiedMore than three-quartersIs requiredTwice the capacityAs well asOpen up space

Else {// If it exceeds 3/4, expand the capacity (double capacity) // Expand the capacity algorithm: If there is a cap, expand the capacity twice; if there is no cap, initialize it as 4 capacity = capacity? capacity * 2 : INIT_CACHE_SIZE; If (capacity > MAX_CACHE_SIZE) {capacity = MAX_CACHE_SIZE; } reallocate(oldCapacity, capacity, true);} Reallocate (oldCapacity, capacity, true); // Memory expansion is complete}Copy the code

Realloc method: open up space

The method, inFirst creationAs well asTwice the capacity, whose source code implementation is shown in the figureThere are mainly the following steps

AllocateBuckets method: The system is asked to open the memory, that is, to open the bucket, when the bucket is only a temporary variable
The setBucketsAndMask method stores a temporary bucket in the cache, which can be stored in two cases:
- If it isA:, according to theBucket and mask location storeAnd willoccupiedOccupancy is set to0
- ifNot a real machine.Buckets and masks are stored normally“And set the occupied to 0
If you have old buckets, you need to clear the previous cache by calling the cache_collect_free method, which is implemented as follows in the source code

The implementation of the method mainly has the following steps: –_garbage_make_roomMethod: Create garbage collection space– if it isFor the first time,, you need toAllocating reclaim Space

- If 'it's not the first time', increase the memory segment, i.e. 'old memory *2' - Record 'store' the 'bucket' this time - 'cache_collect' method: garbage collection, clean up the old bucket! [cache_collect source implementation] (HTTP: / / https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/99cd5d125e35427289b2d4ba1a4e75bf~tplv-k3u1fbpfcp-z oom-1.image)Copy the code

[Step 3] Assign internal IMP and SEL values to buckets that need to be stored

This part is mainly based on the cache_hash method, namely the hash algorithm, to calculate the hash subscript stored by SEL-IMP, divided into the following three cases:

If the location of the hashscript does not store SEL, that is, if the location of the hashscript gets SEL equal to 0, store SEL-IMP and increase the occupied size by 1
If the sel stored in the current hash subscript is equal to the SEL to be inserted, return directly
If the SEL stored by the current hash subscript is not equal to the SEL to be inserted, re-hash through the cache_next method, namely the hash conflict algorithm, to get the new subscript, and then compare and store it

The two hash algorithms involved, the source code is as follows

cache_hash: hash algorithm

static inline mask_t cache_hash(SEL sel, mask_t mask) { return (mask_t)(uintptr_t)sel & mask; // Pass sel & mask (mask = cap-1)}Copy the code

cache_next: Hash conflict algorithm

#if __arm__ || __x86_64__ || __i386__ // objc_msgSend has few registers available. // Cache scan increments and wraps at  special end-marking bucket. #define CACHE_END_MARKER 1 static inline mask_t cache_next(mask_t i, mask_t mask) { return (i+1) & mask; // (subscript the current hash +1) & mask, re-hash, } #elif __arm64__ // objc_msgSend has lots of registers available. // Cache scan decrements. No end marker needed. #define CACHE_END_MARKER 0 static inline mask_t cache_next(mask_t i, mask_t mask) { return i ? i-1 : mask; // if I is null, then mask = cap-1, if I is not null, then i-1, insert sel-imp forward}Copy the code

At this point, the basic analysis of cache_t’s rationale is complete, and we now have answers to the questions mentioned earlier

Question answer

1._maskWhat is?

_mask is the mask data used to calculate the hash subscript in a hash algorithm or hash collision algorithm, where mask is capacity – 1

2,_occupiedWhat is?

_occupied indicates the number of SEL-IMPs occupied in the final hash table (meaning the number of SEL-IMPs stored in the occupied memory);

Init causes occupied changes
Attribute assignment, which is also implicitly called, causes the occupied change
Method calls, resulting in the occupied change

3. Why does it print occupied and mask with more method callsWill change?

Since the occupied space is 4 during cache initialization, as the number of method calls increases, the number of SEL-IMps stored (newOccupied + CACHE_END_MARKER (equal to 1)) exceeds 3/4 of the occupied capacity. For example, if there are 4, and the occupied is 2, You need to double the capacity of the cache

4,bucketWhy is the data thereMissing case? For example, in 2-7, only say3 and say4 methods have function Pointers

The reason is that all the original memory is cleared and memory is applied for again during capacity expansion

5. Why is the printing order of SAY3 and SAY4 in 2-7 printed first and then printed after SAY3, and they are still next to each other? That is, the order is wrong?

Since the storage of SEL-IMP calculates subscripts through the hash algorithm, the calculated subscripts may have stored SEL, so the hash subscripts need to be recalculated through the hash conflict algorithm, so the subscripts are random and not fixed

6. Why does the ocupied in the printed cache_t start at 2?

The reason why LGPerson creates an alloc object and assigns values to two of its properties is that assignment implicitly calls the set method, which also causes the occupied change