Ios-underlying Principles: Analysis of cache Principles in OBJC_class (11)

(Summary of articles on underlying principles of iOS)

(iOS)

The main purpose of this article is to understand cache_T and sel-IMP caching principles

The overall analysis

In the previous ios-underlying Principles: Isa and class Association principles (7) and ios-underlying Principles: Class & Class Structure Analysis (8), isa and bits in objc_class are analyzed. This time, the cache property in objc_calss is mainly analyzed

What is stored in the cache?

First of all, we need to know what is stored in the cache?

Looking at the cache_t source code, you can see that the processing is divided into three architectures. In the real machine’s architecture, the mask and bucket are written together for optimization, and the corresponding data can be obtained by their respective masks
- CACHE_MASK_STORAGE_OUTLINEDRepresents the running environmentThe simulatorormacOS
- CACHE_MASK_STORAGE_HIGH_16Indicates that the running environment is64bitA:
- CACHE_MASK_STORAGE_LOW_4Indicates that the running environment isThe 64bitA:

Struct cache_t {#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED//macOS, explicit_atomic Struct bucket_t * _buckets; // This is equivalent to a struct bucket_t * _buckets; Buckets () explicit_atomic<struct bucket_t *> _buckets; explicit_atomic<mask_t> _mask; #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 // Explicit_atomic <uintptr_t> _maskAndBuckets; // They are written together to optimize mask_t _mask_unused; // The mask is similar to the MASK of ISA, that is, the bit field // mask omits.... #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4 // Explicit_atomic <uintptr_t> _maskAndBuckets; mask_t _mask_unused; // The mask is similar to the MASK of ISA, that is, the bit field // mask omits.... #else #error Unknown cache mask storage type. #endif #if __LP64__ uint16_t _flags; #endif uint16_t _occupied; // The method is omitted..... } copy codeCopy the code

To viewbucket_tSource code, also divided into two versions,A: 和 The real machine, the difference is thatsel 和 impIs not in the same order

Struct bucket_t {private: #if __arm64__ // 主 机 //explicit_atomic <uintptr_t> _imp; explicit_atomic<SEL> _sel; Explicit_atomic <SEL> _sel; explicit_atomic<uintptr_t> _imp; #endif // method and other parts omitted} copy codeCopy the code

Therefore, through the above two structure source code, cache cache is SEL-IMP

The overall structure is shown in the figure below

Look for sel-IMP in the cache

There are two ways to look for stored SEL-IMPs in cache_T

Search through source code
Look in the project without the source code

The preparatory work

To define aLGPersonClass, and define twoattributeAnd the fiveInstance methodsAnd its implementation

inmainDefined in theLGPersonThe object of the classp, and call three of the instance methods, adding a breakpoint where p calls the first method

Search through source code

Run execute, break in[p sayHello];Section, where the following LLDB debugging process is executed

– Cache attributeIs obtained throughpclassThe first address of is shifted by 16 bytes, i.eThe first address + 0 x10Obtain the address of the cache

- We know that 'sel-imp' is in the '_buckets' property of' cache_t '(currently in' macOS 'environment), and that' sel-IMP 'is in the' _buckets' property of 'cache_t'. And the cache_T structure provides a method to get the _buckets property, as well as the sel-IMP, when you get the _buckets property, The fetching of these two methods also provides the corresponding fetching method 'sel()' and 'IMP (pClass)' copy code in the 'bucket_t' structureCopy the code

As you can see from the figure above, when no method call is made, the cache is not cached. After a method call is made, the cache has a cache, that is, every call to the method will cache the method.

We now know how to obtain sel-IMP from cache, okvalidationIs that what we call sel and IMP for printing? Can be achieved bymachoViewOpen the targetExecutable fileIn theMethods listTo see theimpThe value of is consistent, as shown below, is found to be consistent, so print thissel-impisLGPersontheInstance methods

Following the steps above, we call a method again, this time we want to get the second SEL, whose debug LLDB is as follows

The first method is simply fetched by calling the corresponding method at the first address of _buckets. How about the second? In the previousIOS – Underlying Principles: Class & Class Structure Analysis (8)There is a concept mentioned in the articleThe pointer offsetSo we can offset this by the first address of the _buckets attribute, i.ep *($9+1)To obtain the second methodSel and impIf there are multiple methods that need to be fetched, and so on, for examplep *($9+i)

Out of the source code through the project search

When you leave the source environment, you copy the required portion of the source code into the project. The complete code is as follows

typedef uint32_t mask_t; // x86_64 & arm64 asm are less efficient with 16-bits struct lg_bucket_t { SEL _sel; IMP _imp; }; struct lg_cache_t { struct lg_bucket_t * _buckets; mask_t _mask; uint16_t _flags; uint16_t _occupied; }; struct lg_class_data_bits_t { uintptr_t bits; }; struct lg_objc_class { Class ISA; Class superclass; struct lg_cache_t cache; // formerly cache pointer and vtable struct lg_class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags }; int main(int argc, const char * argv[]) { @autoreleasepool { LGPerson *p = [LGPerson alloc]; Class pClass = [LGPerson class]; // objc_clas [p say1]; [p say2]; //[p say3]; //[p say4]; struct lg_objc_class *lg_pClass = (__bridge struct lg_objc_class *)(pClass); NSLog(@"%hu - %u",lg_pClass->cache._occupied,lg_pClass->cache._mask); for (mask_t i = 0; i<lg_pClass->cache._mask; Struct lg_bucket_t bucket = lg_pClass->cache._buckets[I]; NSLog(@"%@ - %p",NSStringFromSelector(bucket._sel),bucket._imp); } NSLog(@"Hello, World!" ); } return 0; } copy codeCopy the code

There’s a problem here, in the source code,objc_classtheISAProperties are inherited fromobjc_objectYes, but when we copied it over, we removed itobjc_class, this attribute needs to be made clear, otherwise the print result will be problematic, as shown in the figure below.

After adding the ISA attribute, adding two method calls, the correct print should look like this

Add the call of two methods, that is, unannotate say3 and say4, and the print result is as follows

In view of the above printout, there are the following questions

1._maskWhat is?
2,_occupiedWhat is?
3. Why does it print occupied and mask with more method callsWill change?
4,bucketWhy is the data thereMissing case? For example, in 2-7, only say3 and say4 methods have function Pointers
5. Why is the printing order of SAY3 and SAY4 in 2-7 that “SAY4” is printed first and “SAY3” is printed later, and they are still next to each otherThe order is out of order?
6. Printedcache_tIn the_ocupiedWhy is it from2To start?

With these questions mentioned above, let’s explore the underlying principles of cache

Cache_t Underlying principles

First of all, from thecache_tIn the_maskAttribute start analysis, findcache_tThe function that causes the change inincrementOccupied()function

The concrete implementation of this function is

void incrementOccupied(); Void cache_t::incrementOccupied() {_occupied++; } copy codeCopy the code

Source code, global searchincrementOccupied()Function, found only incache_ttheinsertMethods have calls

insertMethod, understood ascache_tInsert, andcacheIs stored insel-imp, so the principle of cache is frominsertMethod. The following is a flowchart of cache analysis

Global searchinsert(Methods, found onlycache_fillMethod

Global searchcache_fillBefore writing, there is another step, that is, cache read, that is, look for SEL-IMP, as shown in the following

While the focus of this article is on the principles of cache storage, the insert method is followed by a flowchart for writing to cache_T

Insert method analysis

ininsertMethod, its source implementation is as followsIt is mainly divided into the following parts

[Step 1]To calculateOut of the currentCache usage
[Step 2] According toCache usage judgmentperformoperation
[Step 3] For the storagebucketAn internalImp and SEL assignment

The first step is to calculate the current cache occupation according to the occupied value. When the occupied property is not assigned and there is no method call, the occupied() is 0 and the newOccupied is 1, as shown below

mask_t newOccupied = occupied() + 1; Copy the codeCopy the code

On the calculation of the cache usage, there are the following points:

allocWhen applying for spaceObject has been createdIf called againinitMethod,occupiedWill also be+ 1
whenThere's attribute assignmentIs implicitly calledsetMethod,occupiedWill also increase, i.eWith a few attribute assignments, the occupied adds a few to the occupied list
whenThere are method callsWhen,occupiedWill also increase, i.eAfter a few calls, the occupied will add a few to the occupied list

[Step 2] Determine the operation to be performed according to the cache usage

If it isFirst creationIs opened by default4a

If (slowPath (isConstantEmptyCache())) {// Occupied () = 0; // Cache is read-only. Replace it. If (! capacity) capacity = INIT_CACHE_SIZE; Reallocate (1<<2 -- 100) Reallocate (oldCapacity, capacity, /* freeOld */false); // So far, the process of if has been to initialize the creation of} copy codeCopy the code

If the cache is occupiedLess than or equal to 3/4, no treatment is performed

else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) { // Cache is less than 3/4 full. Use it as-is. } Copy the codeCopy the code

If the cache is occupiedMore than three-quartersIs requiredTwice the capacityAs well asOpen up space

Else {// If it exceeds 3/4, expand the capacity (double capacity) // Expand the capacity algorithm: If there is a cap, expand the capacity twice; if there is no cap, initialize it as 4 capacity = capacity? capacity * 2 : INIT_CACHE_SIZE; If (capacity > MAX_CACHE_SIZE) {capacity = MAX_CACHE_SIZE; } reallocate(oldCapacity, capacity, true);} Reallocate (oldCapacity, capacity, true); // Memory expansion is complete} Copy codeCopy the code

Realloc method: open up space

The method, inFirst creationAs well asTwice the capacity, whose source code implementation is shown in the figureThere are mainly the following steps

AllocateBuckets method: The system is asked to open the memory, that is, to open the bucket, when the bucket is only a temporary variable
The setBucketsAndMask method stores a temporary bucket in the cache, which can be stored in two cases:
- If it isA:, according to theBucket and mask location storeAnd willoccupiedOccupancy is set to0
- ifNot a real machine.Buckets and masks are stored normally“And set the occupied to 0
If you have old buckets, you need to clear the previous cache by calling the cache_collect_free method, which is implemented as follows in the source code

The implementation of the method mainly has the following steps: –_garbage_make_roomMethod: Create garbage collection space– if it isFor the first time,, you need toAllocating reclaim Space

- If 'it's not the first time', increase the memory segment, i.e. 'old memory *2' - Record 'store' the 'bucket' this time - 'cache_collect' method: garbage collection, clean up the old bucket! [cache_collect source implementation] (HTTP: / / https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/99cd5d125e35427289b2d4ba1a4e75bf~tplv-k3u1fbpfcp-z Ooom 1.image) copy the codeCopy the code

[Step 3] Assign internal IMP and SEL values to buckets that need to be stored

This part is mainly based on the cache_hash method, namely the hash algorithm, to calculate the hash subscript stored by SEL-IMP, divided into the following three cases:

If the location of the hash subscriptNot stored selIs the position of the subscriptGet sel is equal to 0At this time willSel - imp storageGo in, and willoccupiedTake up the sizeAdd 1
If the current hash subscript stores selIs equal to theThe SEL to be inserted is returned directly
If the current hash subscript stores selIs not equal toThe SEL to be inserted is reroutedcache_nextMethods theHash collision algorithm, redo the hash calculation, get the new subscript, and then compare and store

The two hash algorithms involved, the source code is as follows

cache_hash: hash algorithm

static inline mask_t cache_hash(SEL sel, mask_t mask) { return (mask_t)(uintptr_t)sel & mask; // Copy the code with sel & mask (mask = cap-1)}Copy the code

cache_next: Hash conflict algorithm

#if __arm__ || __x86_64__ || __i386__ // objc_msgSend has few registers available. // Cache scan increments and wraps at  special end-marking bucket. #define CACHE_END_MARKER 1 static inline mask_t cache_next(mask_t i, mask_t mask) { return (i+1) & mask; // (subscript the current hash +1) & mask, re-hash, } #elif __arm64__ // objc_msgSend has lots of registers available. // Cache scan decrements. No end marker needed. #define CACHE_END_MARKER 0 static inline mask_t cache_next(mask_t i, mask_t mask) { return i ? i-1 : mask; // If I is null, then mask, mask = cap-1, if not, then i-1, insert sel-IMP} forward copy codeCopy the code

At this point, the basic analysis of cache_t’s rationale is complete, and we now have answers to the questions mentioned earlier

Question answer

1._maskWhat is?

_mask is the mask data used to calculate the hash subscript in a hash algorithm or hash collision algorithm, where mask is capacity – 1

2,_occupiedWhat is?

_occupied indicates the number of SEL-IMPs occupied in the final hash table (meaning the number of SEL-IMPs stored in the occupied memory);

initCan lead tooccupiedchange
Attribute assignmentIs also implicitly called, resulting inoccupiedchange
The method call, resulting in occupied changes

3. Why does it print occupied and mask with more method callsWill change?

Since the occupied space is 4 during cache initialization, as the number of method calls increases, the number of SEL-IMps stored (newOccupied + CACHE_END_MARKER (equal to 1)) exceeds 3/4 of the occupied capacity. For example, if there are 4, and the occupied is 2, You need to double the capacity of the cache

4,bucketWhy is the data thereMissing case? For example, in 2-7, only say3 and say4 methods have function Pointers

The reason is that all the original memory is cleared and memory is applied for again during capacity expansion

5. Why is the printing order of SAY3 and SAY4 in 2-7 printed first and then printed after SAY3, and they are still next to each other? That is, the order is wrong?

Since the storage of SEL-IMP calculates subscripts through the hash algorithm, the calculated subscripts may have stored SEL, so the hash subscripts need to be recalculated through the hash conflict algorithm, so the subscripts are random and not fixed

6. Why does the ocupied in the printed cache_t start at 2?

The reason why LGPerson creates an alloc object and assigns values to two of its properties is that assignment implicitly calls the set method, which also causes the occupied change