This is the seventh day of my participation in the August More Text Challenge. For details, see: August More Text Challenge.
In the previous article, we saw that a class object is backed by the objc_class structure, which contains the following members:
```cpp
Class ISA;
Class superclass;
cache_t cache;             // formerly cache pointer and vtable
class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
```
Today we’ll explore cache_t
1. Knowledge preparation
1.1 Arrays
An array is a collection that stores multiple pieces of data of the same type contiguously. It has the following advantages and disadvantages:
- Advantages: accessing an element by its index is very fast
- Disadvantages: inserting and deleting elements is time-consuming
1.2 Linked lists
A linked list is a storage structure whose elements need not be contiguous or sequential in physical memory; the logical order of the elements is maintained by the pointers linking the nodes. It has the following advantages and disadvantages:
- Advantages: inserting or deleting a node is easy
- Disadvantages: finding the element at a given position takes time
1.3 Hash tables
A hash table is a data structure that is accessed directly based on key values. It has the following advantages and disadvantages:
- Advantages: accessing an element is fast, and insertions and deletions are also convenient
- Disadvantages: it relies on a series of more complex operations, such as hashing, collision handling, and resizing (see the sketch below)
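To make the idea concrete, here is a tiny illustrative sketch, unrelated to the Objective-C runtime (the table size and key are made up): the slot is computed directly from the key, so access cost does not grow with the number of elements.

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    int table[8] = {0};
    uint32_t key = 42;
    uint32_t index = key % 8;      // hash function: map the key to a slot
    table[index] = 100;            // insert: store the value at the computed slot
    printf("value for key %u: %d\n", key, table[index]);  // direct access
    return 0;
}
```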
2. Cache data structure
Class structure: the objc_class structure consists of isa, superclass, cache, and bits. isa and superclass are both structure pointers, 8 bytes each, so the cache can be located by memory offset: the class's first address + 16 bytes. A rough sketch of this offset follows.
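The sketch below is only an illustration of that offset, not part of the lldb exploration: `objc_getClass` is the public runtime API, and the 16-byte offset assumes the 64-bit layout described above, with `isa` at offset 0 and `superclass` at offset 8.

```cpp
#include <objc/runtime.h>   // objc_getClass
#include <cstdint>
#include <cstdio>

int main() {
    // Assumes the 64-bit objc_class layout: isa (8 bytes) + superclass (8 bytes), then cache_t.
    Class cls = objc_getClass("NSObject");
    uintptr_t cacheAddr = (uintptr_t)cls + 16;   // first address + 16 bytes
    printf("class at %p, cache_t should live at 0x%lx\n",
           (void *)cls, (unsigned long)cacheAddr);
    return 0;
}
```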
2.1 Exploring the objc source code
Find the definition of cache_t:
```cpp
struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
    ...
};
```
- `_bucketsAndMaybeMask`: an `explicit_atomic` (a template type) instantiated with `uintptr_t`, 8 bytes
- `union`: contains an anonymous struct and the pointer `_originalPreoptCache`
- The struct contains three members, `_maybeMask`, `_flags`, and `_occupied`; since they sit in a union, they are mutually exclusive with `_originalPreoptCache` (they share the same memory)
We have found the data structure of cache_t, but we do not yet know what it is for. Looking through cache_t's methods, we can see that they mainly insert, delete, and look up bucket_t entries, so next we find the definition of bucket_t:
```cpp
struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif
    ...
};
```
- `bucket_t` contains a `sel` and an `imp`
- The order of `sel` and `imp` differs between architectures
From `sel` and `imp` it is not hard to see that what cache_t caches is methods.
2.2 cache_t structure
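This structure is easiest to summarize with a simplified model. The sketch below uses invented names (FakeCache, FakeBucket), ignores the atomics and the union, and only shows how cache_t relates to its buckets:

```cpp
#include <cstdint>

typedef uintptr_t FakeSEL;     // stand-ins for SEL and IMP
typedef uintptr_t FakeIMP;

struct FakeBucket {            // models bucket_t: one cached method
    FakeSEL sel;
    FakeIMP imp;
};

struct FakeCache {             // models cache_t
    FakeBucket *buckets;       // hash table of cached methods (_bucketsAndMaybeMask)
    uint32_t    maybeMask;     // capacity - 1                 (_maybeMask)
    uint16_t    flags;         // _flags
    uint16_t    occupied;      // number of buckets in use     (_occupied)
};
```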
3. Underlying principle of cache
3.1 The insert function
In the cache_t structure, find the insert function:
```cpp
struct cache_t {
    ...
    void insert(SEL sel, IMP imp, id receiver);
    ...
};
```
3.2 Creating buckets
When the cache list is empty, insert does the following:
```cpp
INIT_CACHE_SIZE_LOG2 = 2,
INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2),

mask_t newOccupied = occupied() + 1;
unsigned oldCapacity = capacity(), capacity = oldCapacity;
if (slowpath(isConstantEmptyCache())) {
    // Cache is read-only. Replace it.
    if (!capacity) capacity = INIT_CACHE_SIZE;
    reallocate(oldCapacity, capacity, /* freeOld */false);
}
```
- `newOccupied`: the size of the existing cache + 1
- `capacity`: 4 (1 << 2), the initial capacity of the cache list
- The `reallocate` function is called for the first creation, with `freeOld` passed as `false`
Inside reallocate, the buckets are created and setBucketsAndMask is called:
```cpp
bucket_t *newBuckets = allocateBuckets(newCapacity);
setBucketsAndMask(newBuckets, newCapacity - 1);
```
The setBucketsAndMask function differs by architecture. Take the branch that runs on the simulator/macOS (x86_64), i.e. not a real device, as an example:
```cpp
void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
#ifdef __arm__
    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();
    _bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_relaxed);
    // ensure other threads see new buckets before new mask
    mega_barrier();
    _maybeMask.store(newMask, memory_order_relaxed);
    _occupied = 0;
#elif __x86_64__ || i386
    // ensure other threads see buckets contents before buckets pointer
    _bucketsAndMaybeMask.store((uintptr_t)newBuckets, memory_order_release);
    // ensure other threads see new buckets before new mask
    _maybeMask.store(newMask, memory_order_release);
    _occupied = 0;
#else
#error Don't know how to do setBucketsAndMask on this architecture.
#endif
}
```
- The incoming `newMask` is the capacity of the cache list - 1 and is used as the mask
- The `buckets` are cast to `uintptr_t` and stored into `_bucketsAndMaybeMask`; only the structure pointer is stored, i.e. the first address of `buckets`
- `newMask` is stored into `_maybeMask`
- `_occupied` is set to 0, because the `buckets` are currently empty
3.3 Capacity expansion
If newOccupied + 1 is less than or equal to 75% of the capacity, no expansion is needed:
```cpp
#define CACHE_END_MARKER 1

if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
    // Cache is less than 3/4 or 7/8 full. Use it as-is.
}

// Historical fill ratio of 75% (since the new objc runtime was introduced).
static inline mask_t cache_fill_ratio(mask_t capacity) {
    return capacity * 3 / 4;
}
```
- `CACHE_END_MARKER`: an end marker inserted by the system to mark the boundary of the buckets (the concrete thresholds are worked out in the sketch below)
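To make the threshold concrete, here is a small sketch (assuming the branch shown above, where CACHE_END_MARKER is 1) that evaluates the check for the first few capacities:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // For capacity 4 the threshold is 3, so the cache holds 2 methods and
    // expands when the 3rd one is inserted (newOccupied + 1 = 4 > 3).
    for (uint32_t capacity = 4; capacity <= 16; capacity *= 2) {
        uint32_t threshold = capacity * 3 / 4;   // cache_fill_ratio(capacity)
        printf("capacity %2u: expand when newOccupied + 1 > %u\n",
               capacity, threshold);
    }
    return 0;
}
```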
If it exceeds 75%, the capacity is doubled:
```cpp
MAX_CACHE_SIZE_LOG2 = 16,
MAX_CACHE_SIZE      = (1 << MAX_CACHE_SIZE_LOG2),

capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
if (capacity > MAX_CACHE_SIZE) {
    capacity = MAX_CACHE_SIZE;
}
reallocate(oldCapacity, capacity, true);
```
- `capacity` is doubled, but cannot exceed `MAX_CACHE_SIZE` (1 << 16 = 65536)
- The `reallocate` function is called; when expanding, `freeOld` is passed as `true`
Inside reallocate, when freeOld is true:
```cpp
bucket_t *oldBuckets = buckets();
bucket_t *newBuckets = allocateBuckets(newCapacity);
setBucketsAndMask(newBuckets, newCapacity - 1);

if (freeOld) {
    collect_free(oldBuckets, oldCapacity);
}
```
- New `buckets` are created and replace the original `buckets`; the new capacity is the expanded capacity
- The original `buckets` are released
- All method caches in the original `buckets` are cleared
3.4 Calculating the index
In insert, the hash function is called to compute the index for `sel`:
```cpp
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;
```
- `capacity - 1` is used as the mask of the hash function to compute the index (a small sketch follows)
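A sketch of what this computes (the selector address below is made up): a SEL is a pointer, and `&`-ing it with capacity - 1 keeps only its low bits, which is equivalent to `sel % capacity` when the capacity is a power of two. Some runtime versions also mix the bits first (for example `value ^= value >> 7`) before applying the mask.

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    uintptr_t sel  = 0x00007fff8a2b3c48;   // hypothetical SEL address
    uint32_t  mask = 4 - 1;                // capacity 4 -> mask 0b11
    uint32_t  index = (uint32_t)(sel & mask);
    printf("index = %u\n", index);         // 0x...48 & 0b11 = 0
    return 0;
}
```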
3.5 Writing to the cache
In insert, get the buckets:
```cpp
bucket_t *b = buckets();
```
The buckets function performs an & operation and returns a `bucket_t` struct pointer, i.e. the first address of the buckets:
```cpp
static constexpr uintptr_t bucketsMask = ~0ul;

struct bucket_t *cache_t::buckets() const
{
    uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
    return (bucket_t *)(addr & bucketsMask);
}
```
- The value of `bucketsMask` varies with the architecture
- `~0ul` is `0b1111111111111111111111111111111111111111111111111111111111111111` (64 one-bits)
- `&` operation: a result bit is 1 only if both corresponding bits are 1
- So the result of `addr & ~0ul` is `addr` itself (verified in the tiny check below)
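A tiny check of that last point (the address is made up): on this branch the mask is all ones, so the AND is a no-op and the buckets pointer comes back unchanged.

```cpp
#include <cassert>
#include <cstdint>

int main() {
    uintptr_t addr = 0x0000600001234560;   // hypothetical buckets address
    assert((addr & ~0ul) == addr);         // ~0ul is 64 one-bits, so & changes nothing
    return 0;
}
```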
Using the index to get a bucket is equivalent to a memory offset. If no `sel` exists in that bucket, write to the cache:
```cpp
if (fastpath(b[i].sel() == 0)) {
    incrementOccupied();
    b[i].set<Atomic, Encoded>(b, sel, imp, cls());
    return;
}
```
- The `incrementOccupied` function does `_occupied++`
- The `set` function writes `sel` and `imp` into the `bucket`
If a `sel` already exists at that index and it is the same as the current `sel`, return directly:
```cpp
if (b[i].sel() == sel) {
    // The entry was added to the cache by some other thread
    // before we grabbed the cacheUpdateLock.
    return;
}
```
Otherwise, there is a hash collision.
3.6 Resolving hash collisions
The cache_next function also differs by architecture; take the branch that runs on the simulator/macOS (x86_64), i.e. not a real device, as an example:
```cpp
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return (i+1) & mask;
}
```
- Starting from the conflicting index, first add `1`, then `&` with `mask` again
Inside the do...while loop, cache_next is called repeatedly until the collision is resolved (a small sketch of the probing order follows the code):
```cpp
do {
    ...
} while (fastpath((i = cache_next(i, m)) != begin));
```
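To visualize the probing order, here is a small sketch (the slot numbers are made up) that reuses the same cache_next formula; in the real code the loop stops as soon as it finds an empty or matching slot instead of printing every step:

```cpp
#include <cstdint>
#include <cstdio>

static inline uint32_t cache_next(uint32_t i, uint32_t mask) {
    return (i + 1) & mask;                 // same formula as the x86_64 branch above
}

int main() {
    const uint32_t mask = 7;               // capacity 8
    uint32_t begin = 6, i = begin;         // assume the hash landed on slot 6
    do {
        printf("probe slot %u\n", i);      // 6, 7, 0, 1, 2, 3, 4, 5: wraps around
        i = cache_next(i, mask);
    } while (i != begin);                  // one full loop and then give up
    return 0;
}
```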
Conclusion:
- `capacity`: the capacity of the cache list
- `occupied`: the size of the existing cache
- `maybeMask`: the value `capacity - 1`, used as the mask for computing indices in the hash algorithm and during hash-collision probing
- When writing to the cache, if the occupied size after the write plus the end marker exceeds 75% of `capacity`, the cache is expanded
- Expansion: new buckets are created and the original space is released
  - All method caches in the original buckets are cleared
  - The capacity is doubled first, and then the new entry is written
- The hash function computes the index, and the index is used to find the `bucket`
- Check the `sel` in the `bucket`; if it is empty, write the cache
- If a `sel` already exists and it is the same as the current `sel`, return directly
- Hash collision
  - Different architectures use different algorithms
  - Starting from the conflicting index, first add `1`, then `&` with `mask` again
  - Loop in `do...while` until the collision is resolved

The whole flow is pulled together in the simplified simulation below.
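The sketch below is an illustrative model, not the runtime's code: all names (FakeCache, Bucket) and constants are made up for the example, and it ignores atomics, the end marker's storage, and preopt caches. It only reproduces the create / expand / probe / write flow summarized above.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

struct Bucket { uintptr_t sel; uintptr_t imp; };

struct FakeCache {
    Bucket  *buckets  = nullptr;
    uint32_t capacity = 0;
    uint32_t occupied = 0;

    void reallocate(uint32_t newCapacity) {
        free(buckets);                               // old method caches are simply dropped
        buckets  = (Bucket *)calloc(newCapacity, sizeof(Bucket));
        capacity = newCapacity;
        occupied = 0;
    }

    void insert(uintptr_t sel, uintptr_t imp) {
        uint32_t newOccupied = occupied + 1;
        if (capacity == 0) {
            reallocate(4);                                // INIT_CACHE_SIZE
        } else if (newOccupied + 1 > capacity * 3 / 4) {  // "+ 1" mimics CACHE_END_MARKER
            reallocate(capacity * 2);                     // double, discarding the old entries
        }
        uint32_t mask = capacity - 1;
        uint32_t i = (uint32_t)(sel & mask), begin = i;   // cache_hash
        do {
            if (buckets[i].sel == 0) {                    // empty slot: write the cache
                buckets[i].sel = sel;
                buckets[i].imp = imp;
                occupied++;
                return;
            }
            if (buckets[i].sel == sel) return;            // already cached
            i = (i + 1) & mask;                           // cache_next: linear probing
        } while (i != begin);
    }
};

int main() {
    FakeCache cache;
    // Insert 8 "selectors"; the cache starts at capacity 4, expands to 8, then to 16,
    // dropping the previously cached entries each time it grows.
    for (uintptr_t sel = 0x1000; sel < 0x1000 + 8 * 8; sel += 8)
        cache.insert(sel, /*imp*/ sel + 1);
    printf("capacity = %u, occupied = %u\n", cache.capacity, cache.occupied);
    free(cache.buckets);
    return 0;
}
```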
3.7 Why is 3/4 used as the expansion threshold?
A hash table has two parameters that affect its performance: the initial capacity and the load factor.
- The initial capacity is the number of buckets the hash table has when it is created
- The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased

When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the table is rehashed, that is, its internal data structure is rebuilt and it ends up with roughly twice as many buckets. Defining the load factor as 3/4 provides a good compromise between time and space costs.
- If the load factor is set to 1, expansion happens only when the table is completely full. This maximizes space utilization, but it increases hash collisions, so lookups become slower. A relatively large load factor saves space at the cost of lookup time.
- If the load factor is set to 0.5, expansion happens when the table is only half full. A smaller load factor minimizes hash collisions, but it wastes a lot of space. A relatively small load factor saves time at the cost of space.
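As a rough numeric illustration (the table size is arbitrary), here is how the three load factors compare for the same capacity:

```cpp
#include <cstdio>

int main() {
    const int capacity = 128;
    const double factors[] = {0.5, 0.75, 1.0};
    for (double f : factors) {
        int rehashAt = (int)(capacity * f);   // entries stored before a rehash
        printf("load factor %.2f: rehash after %3d entries, %3d slots still unused\n",
               f, rehashAt, capacity - rehashAt);
    }
    return 0;
}
```

A factor of 3/4 sits between the two extremes: most of the table is usable before a rehash, while a quarter of the slots stay empty to keep collision probes short.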