Exploring the underlying principles of OC: class structure summary

This document analyzes the underlying structure of a class and how operations on a class are implemented at the runtime level, including an analysis of cache and bits.

Main Contents:

Understanding the underlying structure of objc_class, and the relationship between objc_class and objc_object

Understanding isa: where isa points and the inheritance relationships between classes

Detailed interpretation of cache

Detailed interpretation of bits

1. Understanding the underlying structure of objc_class

To find out what a class looks like underneath, compile the source with Clang and inspect the output: a class is created from objc_class. This section also analyzes the relationship between objc_class and objc_object.

1.1 Create a WYPerson class

@interface WYPerson : NSObject

@property (nonatomic, assign) int age;

@end

@implementation WYPerson

- (void)eat {
    NSLog(@"eat");
}

@end

1.2 Underlying structure

First, look at what Class is.

In the objc source code, Class is defined as a pointer to the objc_class structure, which represents a class:

typedef struct objc_class *Class;

NSObject structure

Code:

#ifndef _REWRITER_typedef_NSObject
#define _REWRITER_typedef_NSObject
typedef struct objc_object NSObject;
typedef struct {} _objc_exc_NSObject;
#endif

struct NSObject_IMPL {
	Class isa;
};

Description:

NSObject is defined as a typedef of the objc_object structure.
NSObject_IMPL is the instance layout of NSObject; subclasses embed it to implement pseudo-inheritance. Its only member is isa, of type Class, and Class is a pointer to the objc_class structure.

WYPerson structure

Code:

#ifndef _REWRITER_typedef_WYPerson
#define _REWRITER_typedef_WYPerson
typedef struct objc_object WYPerson;
typedef struct {} _objc_exc_WYPerson;
#endif

extern "C" unsigned long OBJC_IVAR_$_WYPerson$_age;
struct WYPerson_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
	int _age;
};

Description:

WYPerson is likewise defined as a typedef of the objc_object structure.
WYPerson_IMPL is also implemented through pseudo-inheritance: its first member is an NSObject_IMPL, so it contains every member that NSObject has, followed by its own _age.
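The pseudo-inheritance described above can be sketched in plain C++. This is a minimal sketch, not the rewriter output itself: Class is modelled as void *, and the helper functions isa_of and as_nsobject are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in: the real Class is a pointer to objc_class.
typedef void *Class;

struct NSObject_IMPL {
    Class isa;
};

struct WYPerson_IMPL {
    struct NSObject_IMPL NSObject_IVARS;  // the "inherited" part comes first
    int _age;
};

// The embedded parent sits at offset 0, so the address of the child
// is also the address of its isa.
static_assert(offsetof(WYPerson_IMPL, NSObject_IVARS) == 0,
              "parent part must be the first member");

// Reads isa through a pointer to the parent layout.
inline Class isa_of(const NSObject_IMPL *obj) {
    return obj->isa;
}

// Views a WYPerson_IMPL as its parent layout.
inline const NSObject_IMPL *as_nsobject(const WYPerson_IMPL *p) {
    return &p->NSObject_IVARS;
}
```

Because the parent layout is the first member, code that only knows about NSObject_IMPL can still read the isa of a WYPerson_IMPL, which is the effect the rewritten structures rely on.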

1.3 Structure template

objc_object

Source:

struct objc_object {
private:
    isa_t isa;

public:
    // getIsa() allows this to be a tagged pointer object
    Class getIsa();
};

objc_object actually has many more methods; only getIsa() is shown here. See the objc source for the rest.

Description:

objc_object has only one member, isa, and nothing else.
So where are the properties, methods, and so on when we create an object? They live in the class.
An object reaches its class information through isa, and from the class information obtains the corresponding properties, member variables, methods, and protocols.
Note that what getIsa() returns is not the complete isa, only the class information (the Class).

objc_class

Source:

struct objc_class : objc_object {
    // Class ISA;              // inherited from objc_object
    Class superclass;          // superclass
    cache_t cache;
    class_data_bits_t bits;

    class_rw_t *data() const { // get the data stored in bits
        return bits.data();
    }
};

Description:

objc_class inherits from objc_object, which shows that a class is also an object and that everything originates from objc_object.
The class structure contains isa, superclass, cache, and bits.
isa is inherited from objc_object.
cache stores the cached method information, sel and imp pairs, so that message lookup is fast.
bits, through data(), holds the methods, protocols, properties, and member variables.
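To make the member order concrete, here is a small layout sketch. It assumes a 64-bit target and a 16-byte cache made of a pointer, a 32-bit mask, and two 16-bit counters; the model_ types are hypothetical mirrors for illustration, not the runtime's own definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirrors of the runtime layout; the model_ names are not runtime types.
struct model_cache_t {
    void    *buckets;
    uint32_t mask;
    uint16_t flags;
    uint16_t occupied;
};

struct model_objc_class {
    void         *isa;         // inherited from objc_object
    void         *superclass;
    model_cache_t cache;
    uintptr_t     bits;
};

// Byte offsets of the members that follow isa.
constexpr std::size_t kSuperclassOffset = offsetof(model_objc_class, superclass);
constexpr std::size_t kCacheOffset      = offsetof(model_objc_class, cache);
constexpr std::size_t kBitsOffset       = offsetof(model_objc_class, bits);
```

Under these assumptions the cache starts two pointers into the class (16 bytes on a 64-bit target) and bits follows the cache, which is why a debugger session can reach cache and bits by adding a fixed offset to the class address.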

1.4 Summary

objc_class inherits from objc_object, so a class is itself an object (a class object), and a class object is an instance of its metaclass.
The underlying structure objc_object has a single member, isa, so at the lowest level an object contains only isa.
Objects, classes, metaclasses, and protocols all have an isa member, which shows that in an object-oriented world everything is an object.
Every object is, underneath, a structure created from the objc_object template. Every class is, underneath, a structure created from the objc_class template.
A metaclass also has isa, because it has a class of its own, the root metaclass. The class of the root metaclass is itself, so the root metaclass has isa as well.
objc_class contains isa, superclass, cache, and bits.

2. Understanding isa: where isa points and the inheritance relationships between classes

isa is inherited from objc_object and was explained in detail in the earlier analysis of the underlying structure of OC objects, so it is not analyzed again here. This section only covers where isa points and how classes inherit from one another.

The diagram below visualizes the relationships between objects, classes, metaclasses, and the root metaclass.

Diagram:

2.1 Metaclass

2.1.1 What is a metaclass

A metaclass is the class of a class object. Each class has exactly one metaclass, which stores information about the class.
Metaclasses are defined and created by the system, based on our custom classes.
A metaclass has no name of its own; because it is associated with its class, it carries the same name as the class. We cannot use it or see it directly.
Metaclasses store class-level information: all class methods are stored in the metaclass.

2.1.2 Why metaclasses are needed (what do metaclasses do?)

A metaclass manages information about the class itself, such as its class methods.
In the object-oriented world everything is an object. A class is also an object, called a class object, and the class that a class object belongs to is its metaclass. The metaclass is what lets a class be treated as an object.
A metaclass in turn belongs to the root metaclass.

2.1.3 Where are class methods stored?

Instance methods and class methods are distinguished at the language level, but underneath both are just functions. Since they cannot be told apart at that level, they are distinguished by where they are stored.
Class methods are stored in the metaclass, and instance methods are stored in the class.
Within the metaclass, class methods are stored in the same form as instance methods.

2.2 Where isa points

2.2.1 Analysis based on the diagram

The diagram shows that:

The isa of an object points to its class
The isa of a class points to its metaclass
The isa of a metaclass points to the root metaclass
The isa of the NSObject class also points to the root metaclass
The isa of the root metaclass points to itself

2.2.2 Verification

My previous post on the underlying structure of objects showed that isa contains the class information, so we can verify the relationships in the diagram by inspecting the class information stored in isa.

1. Get the class information (isa of the object)

Description:

x/4gx person prints the memory of the person object, which includes its isa
p/x <isa value> & ISA_MASK extracts the class address from isa using the mask
x/4gx <class address> prints the memory of the class
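The masking step in the commands above is a single bitwise AND. The sketch below shows it in isolation. The two mask constants are the ISA_MASK values as recalled from objc4's isa.h for x86_64 and non-pointer-authenticated arm64, so treat them as assumptions and check the header of the runtime version you are using; class_address is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Assumed ISA_MASK values (recalled from objc4's isa.h; verify for your runtime).
constexpr uint64_t kIsaMaskArm64  = 0x0000000ffffffff8ULL;
constexpr uint64_t kIsaMaskX86_64 = 0x00007ffffffffff8ULL;

// Keeps only the class-pointer bits of a raw isa value.
constexpr uint64_t class_address(uint64_t raw_isa, uint64_t isa_mask) {
    return raw_isa & isa_mask;
}
```

For example, a made-up raw isa of 0x001d8001000021a9 masked with the x86_64 value gives 0x00000001000021a8: the extra bits packed into the high and low ends of isa are cleared and only the class address remains.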

2. Check the metaclass information (isa of the class)

Description:

x/4gx <class address> prints the memory of the class, which includes its isa
p/x <isa value> & ISA_MASK gives the metaclass address
The isa of the class is already a plain address, with no other information packed into it
So the isa of the class points to the metaclass of LGPerson

3. Check the root metaclass (isa of the metaclass)

Description:

Look at the isa of the metaclass: it prints as NSObject, which is actually the root metaclass.

4. Verify the isa of the root metaclass

Description:

The isa of the NSObject root metaclass points to itself
Again, the isa of a class or metaclass is just the metaclass address, with no other information

2.3 Analysis of inheritance relationships

Inheritance does not involve objects; it only describes relationships between classes and between metaclasses.
All classes inherit from NSObject (except NSProxy, which you can read about on the blog…).
Metaclasses have their own inheritance chain, which ultimately ends at the NSObject class.
The key point is that the NSObject metaclass (the root metaclass) inherits from the NSObject class, and NSObject inherits from nil. In other words NSObject has no superclass, which makes it the origin of everything.
The inheritance chain of a class contains only classes, but the inheritance chain of a metaclass does not contain only metaclasses: it also contains the NSObject class. Keep this in mind when looping over a metaclass inheritance chain.
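The isa links and superclass links described in sections 2.2 and 2.3 can be modelled with a tiny graph. This is a toy model under stated assumptions: ModelClass, ModelGraph, wire, and depth are hypothetical names, and only the two links discussed in the text are represented.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not the runtime types: each node has an isa link and a superclass link.
struct ModelClass {
    const char *name;
    ModelClass *isa;
    ModelClass *superclass;
};

struct ModelGraph {
    ModelClass rootMeta;    // NSObject metaclass (root metaclass)
    ModelClass root;        // NSObject class
    ModelClass personMeta;  // WYPerson metaclass
    ModelClass person;      // WYPerson class
};

// Wires the four nodes the way the text describes.
inline void wire(ModelGraph &g) {
    g.rootMeta   = {"NSObject(meta)", &g.rootMeta, &g.root};
    g.root       = {"NSObject", &g.rootMeta, nullptr};
    g.personMeta = {"WYPerson(meta)", &g.rootMeta, &g.rootMeta};
    g.person     = {"WYPerson", &g.personMeta, &g.root};
}

// Counts superclass hops until nil is reached.
inline int depth(const ModelClass *c) {
    int n = 0;
    while (c->superclass) { c = c->superclass; ++n; }
    return n;
}
```

In this model the superclass chain of the WYPerson metaclass is one hop longer than that of the WYPerson class, because it passes through the root metaclass and then through the NSObject class before reaching nil.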

2.4 Summary

The isa of an object points to its class, the isa of the class points to its metaclass, the isa of the metaclass points to the root metaclass, and the isa of the root metaclass points to itself

The isa of the NSObject class points to the NSObject root metaclass, whose isa points to itself

Inheritance describes relationships between classes or between metaclasses, not between objects

Classes inherit from classes, and metaclasses likewise inherit from metaclasses

Note in particular that the root metaclass inherits from the NSObject class

Finally, NSObject inherits from nil; NSObject is where everything starts

3. cache

cache is a member of the class structure that stores the sel and imp of methods that have already been called, which makes subsequent calls faster. This section covers the structure of the cache and how entries are stored. How imp is looked up in the cache is covered in detail in the message-sending flow.

3.1 Overview of the cache structure

cache is the part of the class structure that stores sel-imp key-value pairs for methods that have been called.

3.1.1 cache_t structure

Source:

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED   // simulator or macOS
    // equivalent to struct bucket_t *_buckets; read through buckets()
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16  // 64-bit real device
    // mask and buckets are stored together as an optimization
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    // mask constants (similar to the isa masks, i.e. bit fields) omitted ...
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4    // 32-bit real device
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    // mask constants omitted ...
#else
#error Unknown cache mask storage type.
#endif

#if __LP64__
    uint16_t _flags;
#endif
    uint16_t _occupied;

    // methods omitted ...
};

Looking at the source, there are three variants: 64-bit real device, 32-bit real device, and simulator or macOS. We focus on the 64-bit real device.

Members:
(1) CACHE_MASK_STORAGE_OUTLINED represents the simulator or macOS and has the members _buckets and _mask.
(2) CACHE_MASK_STORAGE_HIGH_16 represents a 64-bit real device and has the members _maskAndBuckets and _mask_unused.
(3) CACHE_MASK_STORAGE_LOW_4 represents a 32-bit real device and has the members _maskAndBuckets and _mask_unused.
(4) uint16_t _flags; (present only when __LP64__ is set)
(5) uint16_t _occupied;

The macro that selects the variant by architecture:

#if defined(__arm64__) && __LP64__          // 64-bit real device
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16
#elif defined(__arm64__) && !__LP64__       // 32-bit real device
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4
#else                                       // simulator or macOS
#define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED
#endif

Here __arm64__ means a real device and __LP64__ means 64-bit, so the first branch is the 64-bit real device, the second is the 32-bit real device, and the third is the simulator and macOS.
In terms of architecture, macOS is i386/x86_64, the simulator is x86, and a real device is arm64.

A note on the template type

The members use explicit_atomic<T>, for example explicit_atomic<uintptr_t>. This is a template wrapper: the type passed in is the type on which explicit_atomic performs atomic operations.

Therefore, on a 64-bit real device the cache_t structure contains _maskAndBuckets, _flags, and _occupied.

_maskAndBuckets: combines _mask and buckets in one word to optimize performance. The sel and imp pairs are stored in buckets. mask equals capacity - 1, that is, the current cache capacity minus 1.

_flags: a set of bit flags.

_occupied: the number of sel-imp pairs currently stored.

3.1.2 bucket_t structure

bucket_t stores imp and sel, but the order of the two members differs between a real device and the simulator or macOS.

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif
};

3.1.3 Summary

3.2 Viewing cache data changes without the source code environment

Rather than building a debuggable copy of the runtime source, we can observe how the cache members change from application code. The runtime types are plain structures underneath, so we can declare structures with the same layout in our own code, cast the class pointer to them, and read the values directly.

Code:

#import <Foundation/Foundation.h>
#import "LGPerson.h"   // assumed header declaring -say1 ... -say4

typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits

struct lg_bucket_t {
    SEL _sel;
    IMP _imp;
};

struct lg_cache_t {
    struct lg_bucket_t * _buckets;
    mask_t _mask;
    uint16_t _flags;
    uint16_t _occupied;
};

struct lg_class_data_bits_t {
    uintptr_t bits;
};

struct lg_objc_class {
    Class ISA;
    Class superclass;
    struct lg_cache_t cache;             // formerly cache pointer and vtable
    struct lg_class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        LGPerson *p = [LGPerson alloc];
        Class pClass = [LGPerson class];  // objc_class
        [p say1];
        [p say2];
        //[p say3];
        //[p say4];

        // Reinterpret the class pointer as our mirror structure
        struct lg_objc_class *lg_pClass = (__bridge struct lg_objc_class *)(pClass);
        NSLog(@"%hu - %u", lg_pClass->cache._occupied, lg_pClass->cache._mask);
        for (mask_t i = 0; i < lg_pClass->cache._mask; i++) {
            // Print each bucket
            struct lg_bucket_t bucket = lg_pClass->cache._buckets[i];
            NSLog(@"%@ - %p", NSStringFromSelector(bucket._sel), bucket._imp);
        }

        NSLog(@"Hello, World!");
    }
    return 0;
}

Print result:

Question:

What is _mask?
What is _occupied?
Why do the printed values of _occupied and _mask change as more methods are called?
Why is bucket data missing? For example, in the 2-7 output (_occupied 2, _mask 7) only say3 and say4 have function pointers.
Why, in the 2-7 output, is say4 printed before say3?
Why does _occupied in the printed cache_t start at 2?

3.3 Mechanism analysis of the cache

The member values of cache_t observed after calling methods raise questions that can only be answered by understanding the underlying mechanism. We start from the change in _occupied and work out how it happens.

3.3.1 Finding the function that stores entries

3.3.1.1 In cache_t, find incrementOccupied(), the function that changes _occupied

In the experiment, _occupied increased each time a new method was called, so we need to look at the source to see where that increment happens.
Searching cache_t for places where _occupied is incremented (occupied++, occupied + 1) leads to the incrementOccupied() function.

3.3.1.2 The implementation of this function is a plain increment

This shows that _occupied is incremented by calling incrementOccupied(), so the next step is to find out who calls this function.

3.3.1.3 A global search for incrementOccupied() finds a call in the insert() method of cache_t

3.3.1.4 A global search for insert() shows that the only matching call is in cache_fill

3.3.1.5 Before the write in cache_fill there is another step, the cache read, which looks up the sel-imp pair

How cache_fill itself gets called is not visible here; the system may not expose it.
But the comments in the source state that internally the cache is read first and written afterwards.

3.3.1.6 Summary:

After a message is sent, the cache is read first

If the entry is not in the cache, the resulting sel and imp are written to the cache

The cache write goes through the call chain cache_fill() -> insert() ->

3.3.2 insert() function analysis

The insert() function inserts a sel and imp pair; the part to focus on is the hash algorithm.

Source:

ALWAYS_INLINE
void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    ASSERT(sel != 0 && cls->isInitialized());

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1;   // entry count if this insert succeeds
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;               // initial capacity is 4
        reallocate(oldCapacity, capacity, /* freeOld */false);   // allocate space with capacity 4
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Expand: double the capacity
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
        if (capacity > MAX_CACHE_SIZE) {   // the capacity cannot exceed 2^16
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);
    }

    bucket_t *b = buckets();
    mask_t m = capacity - 1;              // mask = capacity - 1
    mask_t begin = cache_hash(sel, m);    // hash algorithm: (mask_t)(uintptr_t)sel & mask
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    do {
        // Empty slot: no conflict, store here
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);   // store
            return;
        }
        // Another thread has already added this sel
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));   // hash conflict algorithm: i ? i - 1 : mask

    cache_t::bad_cache(receiver, (SEL)sel, cls);
}

[Step 1]: Allocate space

Compute newOccupied, the number of entries the cache would hold if this insert succeeds. It is a temporary value used below to decide whether to expand.
Get the current capacity of the cache. This is the existing capacity, used to decide whether this insert requires an expansion.
Allocating space for the first time
- If the cache is the constant empty cache, space is being allocated for the first time
- First determine the capacity to allocate
- The initial capacity is 4 (INIT_CACHE_SIZE)
- Then call reallocate() to allocate it
- freeOld is false: there is no old space to free, because this is the first allocation
Storing directly without expanding
- The cache is used as-is when it would be at most 3/4 full
- The check is newOccupied + CACHE_END_MARKER <= capacity / 4 * 3. newOccupied is already occupied() + 1, so where CACHE_END_MARKER is 1 the check is effectively occupied() + 2 <= 3/4 of the capacity
- CACHE_END_MARKER is 1 on the architectures that reserve an end-marker bucket (the first branch of the cache_next source in section 3.5.4) and 0 otherwise
Expanding the capacity
- An expansion doubles the previous capacity
- The maximum capacity is 2 to the 16th power (MAX_CACHE_SIZE)
- freeOld is true, so reallocate() creates the new space and frees the old one
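The three branches of step 1 can be sketched as a pure function. This is a simplified model, not the runtime code: plan_insert and the Plan struct are hypothetical names, the two constants follow the values stated in the text (4 and 2 to the 16th power), and endMarker stands in for CACHE_END_MARKER.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kInitCacheSize = 4;         // INIT_CACHE_SIZE in the text
constexpr uint32_t kMaxCacheSize  = 1u << 16;  // MAX_CACHE_SIZE in the text

struct Plan {
    uint32_t capacity;  // capacity to use for this insert
    bool reallocate;    // whether reallocate() would be called
    bool freeOld;       // the freeOld argument that would be passed
};

// Decides what step 1 would do. endMarker plays the role of CACHE_END_MARKER (0 or 1).
inline Plan plan_insert(uint32_t occupied, uint32_t capacity,
                        bool isEmptyCache, uint32_t endMarker) {
    uint32_t newOccupied = occupied + 1;
    if (isEmptyCache) {
        // First allocation: nothing old to free.
        return {capacity ? capacity : kInitCacheSize, true, false};
    }
    if (newOccupied + endMarker <= capacity / 4 * 3) {
        // At most 3/4 full: use the cache as-is.
        return {capacity, false, false};
    }
    // Expand: double, clamp to the maximum, and free the old space.
    uint32_t grown = capacity ? capacity * 2 : kInitCacheSize;
    if (grown > kMaxCacheSize) grown = kMaxCacheSize;
    return {grown, true, true};
}
```

With a capacity of 4 and an end marker of 1, the third insert already triggers an expansion to 8, whereas with an end marker of 0 it still fits. That difference is worth keeping in mind when comparing output from macOS with behaviour on a device.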

[Step 2]: Compute the storage location

The storage location is computed with a hash algorithm. Hashing inevitably involves hash conflicts, and after a conflict the location has to be computed again. The specific hash algorithm is analyzed later.

Code:

bucket_t *b = buckets();
mask_t m = capacity - 1;              // mask = capacity - 1
mask_t begin = cache_hash(sel, m);    // hash algorithm: (mask_t)(uintptr_t)sel & mask
mask_t i = begin;

// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot because the
// minimum size is 4 and we resized at 3/4 full.
do {
    // Empty slot: no conflict, store here
    if (fastpath(b[i].sel() == 0)) {
        incrementOccupied();
        b[i].set<Atomic, Encoded>(sel, imp, cls);   // store
        return;
    }
    // Another thread has already added this sel
    if (b[i].sel() == sel) {
        // The entry was added to the cache by some other thread
        // before we grabbed the cacheUpdateLock.
        return;
    }
} while (fastpath((i = cache_next(i, m)) != begin));   // hash conflict algorithm: i ? i - 1 : mask

Code description:

The storage index is obtained with cache_hash()
If the sel at that location is empty, the entry is stored there directly
If the sel at that location is the same as the sel being inserted, nothing is stored and the function returns
Otherwise the hash conflict algorithm cache_next() computes a new location and the comparison is repeated
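The probing loop described above can be tried out on a small table. The sketch below is a toy version under stated assumptions: selectors are modelled as non-zero integers, Slot, Result, toy_hash, toy_next, and toy_insert are hypothetical names, and only the decrementing form of the conflict step is used.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using sel_t = uintptr_t;  // stand-in for SEL; 0 means "empty slot"

struct Slot { sel_t sel; int imp; };

enum class Result { Stored, AlreadyPresent, Full };

// Same shape as cache_hash(): sel & mask.
inline uint32_t toy_hash(sel_t sel, uint32_t mask) { return (uint32_t)sel & mask; }

// The decrementing form of cache_next(): i ? i - 1 : mask.
inline uint32_t toy_next(uint32_t i, uint32_t mask) { return i ? i - 1 : mask; }

// Probes from the hashed index until an empty slot or the same sel is found.
// The table size must be a power of two so that size - 1 works as a mask.
inline Result toy_insert(std::vector<Slot> &slots, sel_t sel, int imp) {
    uint32_t mask = (uint32_t)slots.size() - 1;
    uint32_t begin = toy_hash(sel, mask);
    uint32_t i = begin;
    do {
        if (slots[i].sel == 0) { slots[i] = {sel, imp}; return Result::Stored; }
        if (slots[i].sel == sel) return Result::AlreadyPresent;
    } while ((i = toy_next(i, mask)) != begin);
    return Result::Full;  // the runtime calls bad_cache() at this point
}
```

In a four-slot table, inserting 0x6 and then 0xA shows the conflict path: both hash to index 2, so the second entry moves back to index 1.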

[Step 3]: Store

Storing is done directly by set(), which writes sel, imp, and cls into the bucket. If the loop ends without finding a usable slot, bad_cache() is called:

Code:

cache_t::bad_cache(receiver, (SEL)sel, cls);

3.3.3 reallocate() function analysis

This function allocates the new space. Note that when it is called for an expansion, the old space is freed.

Description:

Allocate a new bucket_t array with the new capacity
Set the new buckets and capacity on the cache with setBucketsAndMask()
Free the old buckets with cache_collect_free()

3.3.4 setBucketsAndMask() function analysis

This function sets buckets and mask. It has three implementations: 64-bit real device, non-64-bit real device, and non-real-device.

3.3.5 cache_collect_free() function analysis

This function garbage-collects the old data

It reclaims the old buckets
After an expansion the cached entries are not migrated to the new space. This improves performance: the more methods have been called, the more entries each expansion would have to migrate. It also avoids hash conflicts to some extent.
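The effect of not migrating entries can be seen in a toy cache. This sketch is an illustration only: ToyCache is a hypothetical class, selectors are non-zero integers, and it follows the variant where CACHE_END_MARKER is 0, so its counts will not match output printed on macOS.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy cache: models only the capacity policy and the discard-on-expansion behaviour.
class ToyCache {
public:
    // Inserts a non-zero selector id; returns true if this insert caused an expansion.
    bool insert(uint32_t sel) {
        bool expanded = false;
        if (slots_.empty()) {
            slots_.assign(4, 0);                   // first allocation: capacity 4
        } else if (occupied_ + 1 > slots_.size() / 4 * 3) {
            slots_.assign(slots_.size() * 2, 0);   // new space; old entries are dropped
            occupied_ = 0;
            expanded = true;
        }
        uint32_t m = mask();
        uint32_t i = sel & m;
        // An empty slot always exists because the table is at most 3/4 full.
        while (slots_[i] != 0 && slots_[i] != sel) i = i ? i - 1 : m;
        if (slots_[i] == 0) { slots_[i] = sel; ++occupied_; }
        return expanded;
    }
    bool contains(uint32_t sel) const {
        for (uint32_t s : slots_) if (s == sel) return true;
        return false;
    }
    uint32_t occupied() const { return occupied_; }
    // Only meaningful after the first insert.
    uint32_t mask() const { return (uint32_t)slots_.size() - 1; }
private:
    std::vector<uint32_t> slots_;
    uint32_t occupied_ = 0;
};
```

After three inserts the toy cache is at its 3/4 limit; the fourth insert doubles the capacity, resets the count, and leaves only the newest entry in the table.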

3.3.6 _garbage_make_room function

Creates the space used for garbage collection
If this is the first call, the collection space is allocated
Otherwise the existing space is doubled, that is, the original size * 2

3.3.7 Answering the questions

With the mechanism analyzed, the earlier questions can now be answered:

Q: What is _mask? A: _mask is the mask value used to compute the index in the hash algorithm and in the hash conflict algorithm. mask equals capacity - 1.

Q: What is _occupied? A: _occupied is the number of sel-imp pairs stored in the cache. It is incremented each time a newly called method is inserted.

Q: Why do the printed values of _occupied and _mask change as more methods are called? A: Each newly called method is stored in the cache, so _occupied increases. As the number of stored sel-imp pairs grows, the cache has to expand, and since mask is capacity - 1, the mask grows too.

Q: Why is bucket data missing? A: When the cache expands, the old buckets are freed and their entries are not migrated, so everything cached before the expansion is lost. In the 2-7 output, only say3 and say4, which were inserted after the expansion, are still in the cache.

Q: Why, in the 2-7 output, is say4 printed before say3? A: The cache is a hash table. The index is computed by a hash algorithm, so entries are not stored sequentially.

Q: Why does _occupied in the printed cache_t start at 2? A: LGPerson assigns values to two properties of the object created by alloc. Each assignment implicitly calls a setter, and setter calls are cached as well, so they also change _occupied.

3.3.8 Summary

The cache stores sel-imp pairs in a hash table, and the index is computed by a hash algorithm

The capacity allocated the first time is 4

The cache expands when occupied + 1 + CACHE_END_MARKER is greater than 3/4 of the capacity

An expansion does not grow the existing memory in place; it frees the old space and allocates a new one

Each expansion doubles the capacity, up to a maximum of 2 to the 16th power

3.4 Mask calculation

This section describes the storage format of _maskAndBuckets and how _maskAndBuckets, mask, and buckets are each derived through mask operations.

3.4.1 Storage format:

_maskAndBuckets is 64 bits in total
mask occupies the high 16 bits
buckets occupies the low 44 bits
The 4 bits in between are always 0; they benefit message sending and are used during cache lookup.

3.4.2 Mask constants

maskShift is used to extract the mask: shifting _maskAndBuckets right by maskShift (48) bits yields the mask
maskZeroBits is the number of zero bits in the middle (4); it is used to compute the width of the buckets mask
maxMask is the largest mask value
- mask = capacity - 1
- The maximum capacity is 2 to the 16th power, so the maximum mask is 2 to the 16th power minus 1
- mask has 16 bits
bucketsMask is the mask for buckets: its low 44 bits are all 1.
- It is computed as (1 << (maskShift - maskZeroBits)) - 1
- maskShift - maskZeroBits = 48 - 4 = 44
- 1 << 44 has a 1 in the 45th bit and 0 in the low 44 bits
- (1 << 44) - 1 has all of the low 44 bits set to 1
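The arithmetic above can be checked at compile time. The sketch below reuses the constant names from the text; popcount64 is a hypothetical helper added only to count set bits.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t maskShift    = 48;
constexpr uint64_t maskZeroBits = 4;
constexpr uint64_t maxMask      = (uint64_t(1) << (64 - maskShift)) - 1;
constexpr uint64_t bucketsMask  = (uint64_t(1) << (maskShift - maskZeroBits)) - 1;

static_assert(maxMask == 0xFFFF, "mask has 16 bits");
static_assert(bucketsMask == 0x00000FFFFFFFFFFFULL, "low 44 bits are set");

// Counts the set bits of a 64-bit value.
constexpr int popcount64(uint64_t v) {
    return v == 0 ? 0 : int(v & 1) + popcount64(v >> 1);
}
```

The 16 mask bits and the 44 buckets bits add up to 60, which leaves exactly the 4 zero bits in the middle of the 64-bit word.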

3.4.3 Mask calculation:

Computing _maskAndBuckets:

The newBuckets argument is the newly allocated buckets
The newMask argument is the mask
First shift the mask left by 48 bits so that it sits in the high 16 bits
Then combine it with buckets, which places buckets in the low 44 bits
- The combination uses |: a result bit is 1 if either input bit is 1
- This stores the mask in the high 16 bits and buckets in the low 44 bits

buckets:

bucketsMask is (1 << 44) - 1
- 1 << 44 has a 1 in the 45th bit and 0 in the low 44 bits
- Subtracting 1 sets all of the low 44 bits to 1
So _maskAndBuckets & bucketsMask yields buckets from the low 44 bits

mask:

Shift _maskAndBuckets right by 48 bits, which moves the high 16 bits to the bottom and yields the mask
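The packing and unpacking steps form a round trip, sketched here on plain integers. The function names pack, unpack_buckets, and unpack_mask are hypothetical, and the buckets address used in the example is made up.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kMaskShift   = 48;
constexpr uint64_t kBucketsMask = (uint64_t(1) << 44) - 1;

// Packs mask into the high 16 bits and the buckets address into the low 44 bits.
constexpr uint64_t pack(uint64_t bucketsAddr, uint64_t mask) {
    return (mask << kMaskShift) | bucketsAddr;
}

// Recovers the buckets address by keeping only the low 44 bits.
constexpr uint64_t unpack_buckets(uint64_t maskAndBuckets) {
    return maskAndBuckets & kBucketsMask;
}

// Recovers the mask by moving the high 16 bits to the bottom.
constexpr uint64_t unpack_mask(uint64_t maskAndBuckets) {
    return maskAndBuckets >> kMaskShift;
}
```

Packing the address 0x0000000100A04F20 with a mask of 7 gives 0x0007000100A04F20, and both values come back unchanged from the two unpack functions.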

3.5 Hash algorithm

This part deserves attention: during fast message lookup, the imp is looked up in the cache by the incoming sel, and that lookup uses the hash algorithm. The fast lookup is implemented in assembly, so if the algorithm is not clear here, what follows will be even harder to understand.

3.5.1 Introduction to hashing

Storage address: buckets is a hash table, so storing a bucket requires a hash algorithm. The object to store is the input, and the hash function computes the storage address from it.

Hash conflict: if two objects hash to the same address, a hash conflict has occurred. To resolve it, a hash conflict algorithm computes a new address. If the new address still conflicts, the conflict algorithm is applied again to get another address.

3.5.2 Storage logic:

Source:

bucket_t *b = buckets();
mask_t m = capacity - 1;              // mask = capacity - 1
mask_t begin = cache_hash(sel, m);    // hash algorithm: (mask_t)(uintptr_t)sel & mask
mask_t i = begin;

// Scan for the first unused slot and insert there.
// There is guaranteed to be an empty slot because the
// minimum size is 4 and we resized at 3/4 full.
do {
    // Empty slot: no conflict, store here
    if (fastpath(b[i].sel() == 0)) {
        incrementOccupied();
        b[i].set<Atomic, Encoded>(sel, imp, cls);   // store
        return;
    }
    // Another thread has already added this sel
    if (b[i].sel() == sel) {
        // The entry was added to the cache by some other thread
        // before we grabbed the cacheUpdateLock.
        return;
    }
} while (fastpath((i = cache_next(i, m)) != begin));   // hash conflict algorithm: i ? i - 1 : mask

1. Compute the hash address with the hash algorithm.
2. If no sel is stored at that address, store the sel there directly.
3. If a sel is stored there and it is the sel being inserted, return directly (another thread has already added it).
4. Otherwise a conflict has occurred.
5. After a conflict, use the hash conflict algorithm to compute a new address.
6. If the new address differs from the first hash address, repeat from step 2.
7. If it equals the first hash address, exit.
* Each conflict address is computed from the previous i, so once it equals begin the following values would repeat. Stopping here avoids an infinite loop.
* At this point the conflict algorithm has visited every position, so the function simply exits.

3.5.3 Hash Algorithm:

Source:

static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    return (mask_t)(uintptr_t)sel & mask;
}

Here sel is ANDed with mask to get the hash address
The mask is capacity minus 1

3.5.4 Hash conflict algorithm:

Source:

#if __arm__  ||  __x86_64__  ||  __i386__
// objc_msgSend has few registers available.
// Cache scan increments and wraps at special end-marking bucket.
#define CACHE_END_MARKER 1
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return (i+1) & mask;   // move forward one index, wrapping with mask
}

#elif __arm64__
// objc_msgSend has lots of registers available.
// Cache scan decrements. No end marker needed.
#define CACHE_END_MARKER 0
static inline mask_t cache_next(mask_t i, mask_t mask) {
    // If i is non-zero, move back one index; if it is 0, wrap to the last index (mask)
    return i ? i-1 : mask;
}
#endif

On arm64 the hash conflict algorithm is very simple: move back one index
If the scan reaches index 0 and still conflicts, it wraps around to the last index (mask) and keeps moving back from there
The loop exits when the index comes back to the value originally produced by the hash algorithm

Conclusion:
1. First compute the hash address with the hash algorithm
2. If the address holds no sel, store there directly
3. If the address holds a sel and it is the one being stored, another thread has already stored it, so return directly
4. If the address holds a different sel, a hash conflict has occurred, so run the hash conflict algorithm
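To compare the two forms of cache_next shown in the source, the sketch below lists the order in which a scan visits the table before it returns to its starting index. The names next_up, next_down, and probe_order are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Incrementing form: (i + 1) & mask.
inline uint32_t next_up(uint32_t i, uint32_t mask) { return (i + 1) & mask; }

// Decrementing form: i ? i - 1 : mask.
inline uint32_t next_down(uint32_t i, uint32_t mask) { return i ? i - 1 : mask; }

// Lists every index visited before the scan returns to begin.
inline std::vector<uint32_t> probe_order(uint32_t begin, uint32_t mask, bool decrementing) {
    std::vector<uint32_t> order;
    uint32_t i = begin;
    do {
        order.push_back(i);
        i = decrementing ? next_down(i, mask) : next_up(i, mask);
    } while (i != begin);
    return order;
}
```

Starting from index 2 with a mask of 3, the decrementing form visits 2, 1, 0, 3 and the incrementing form visits 2, 3, 0, 1. Either way every slot is visited exactly once, which is why the loop can stop as soon as it is back at begin.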

Answers to a few small questions

Q: Why free the old space and allocate a new one instead of growing the existing memory? A: After an expansion the mask used by the hash algorithm changes, so the hash addresses change and the old entries would easily conflict. Growing the memory in place is also inconvenient.

Q: If the old entries are discarded anyway, why expand at all? A: Reaching the threshold indicates that this class calls many methods. The larger capacity lets that many entries be cached after the expansion without being discarded again next time.

The positions produced by the hash algorithm look random: even when only one method has been cached, it may end up at index 0, 1, or 2.

3.6 Verification

Now verify this in practice by calling a method and checking that it really gets stored: after the call, its sel and imp should be in a bucket of the cache.

Method creation:

Method call:

Analysis process:

As before, start from the memory of the class
The cache is at an offset of 16 bytes from the start of the class (after isa and superclass, 8 bytes each)
Once we have the cache, sel and imp can be read using the cache_t structure described above
From the source analysis we know that the sel-imp pairs are in the _buckets member of cache_t (on macOS, the current environment), and cache_t provides the buckets() method to get it
The bucket_t structure likewise provides sel() and imp(pClass) to read the sel and the imp

So the method can be found by following cache -> buckets -> sel and imp

3.7 summarize

Cache stores SEL and IMP, which can send messages quickly. Sel – IMP is stored in the hash table, and the subscript is calculated by hash algorithm.

The maskAndBuckets in the cache contain buckets, and buckets contain multiple buckets, each of which stores a SEL and IMP key-value pair.

MaskAndBuckets is stored in a hash table, and cache is stored in a hash table, or hash table, which calculates subscripts to write data, so there is no order.

MaskAndBuckets’ mask is stored in the first 16 bits, while Buckets is stored in the last 44 bits. The middle four bits are always 0. The middle four bits are 0 so as to jump to the last subscript more quickly when searching for a method and calculate by mask.

Capacity

The capacity allocated the first time is 4

The cache expands when occupied + 2 would exceed 3/4 of the capacity (the margin is there with multi-threading in mind)

Expansion does not grow the existing memory. The old space is released and a new space is allocated, so every expansion discards the previous cache together with the SEL and IMP entries stored in it.

Each expansion doubles the capacity, up to a maximum of 2 to the 16th power
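The capacity rules listed above can be written down as a small sketch. It only restates the numbers from this summary (initial capacity 4, threshold at 3/4, doubling, cap at 2^16); needs_expansion and next_capacity are hypothetical names and the real insert path does more work than this.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kInitialCapacity = 4;     // capacity of the first allocation
constexpr uint32_t kMaxCapacity = 1u << 16;  // upper bound on the capacity

// True when storing one more entry would cross the 3/4 threshold,
// using the "occupied + 2" rule quoted in the summary.
inline bool needs_expansion(uint32_t occupied, uint32_t capacity) {
    return occupied + 2 > capacity / 4 * 3;
}

// Capacity to allocate next: 4 for an empty cache, otherwise double,
// clamped to the maximum.
inline uint32_t next_capacity(uint32_t capacity) {
    if (capacity == 0) return kInitialCapacity;
    const uint32_t doubled = capacity * 2;
    return doubled > kMaxCapacity ? kMaxCapacity : doubled;
}
```

With a capacity of 4 the threshold is 3, so the third insertion (occupied is 2 at that point) is the one that triggers the first expansion.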

The hash algorithm

The hash index is obtained by ANDing the SEL with the mask (sel & mask)

If that slot is already taken by another SEL, the collision algorithm is applied

The collision algorithm is i ? i - 1 : mask

That is, step back one slot at a time; if the probe reaches index 0 and still conflicts, continue from the last slot (index mask)

If the probe returns to the starting index (begin) and still conflicts, the loop exits
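The hash and collision rules above can be sketched on a toy table. This is a simplified model, not the runtime's insert(): keys are integers standing in for SEL addresses, 0 marks an empty slot, and insert_key is a hypothetical function name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Key = uint64_t;  // stands in for a SEL address; 0 means "empty slot"

// Hash: AND the key with the mask.
inline uint32_t cache_hash(Key sel, uint32_t mask) {
    return static_cast<uint32_t>(sel & mask);
}

// Collision step: move back one slot, wrapping from 0 to the last slot.
inline uint32_t cache_next(uint32_t i, uint32_t mask) {
    return i ? i - 1 : mask;
}

// Stores the key and returns the slot used, or -1 when the probe
// comes back to its starting index without finding a free slot.
inline int insert_key(std::vector<Key>& slots, Key sel) {
    assert(!slots.empty() && (slots.size() & (slots.size() - 1)) == 0);
    const uint32_t mask = static_cast<uint32_t>(slots.size() - 1);
    const uint32_t begin = cache_hash(sel, mask);
    uint32_t i = begin;
    do {
        if (slots[i] == 0 || slots[i] == sel) {
            slots[i] = sel;
            return static_cast<int>(i);
        }
        i = cache_next(i, mask);
    } while (i != begin);
    return -1;
}
```

Four keys that all hash to index 1 in a table of four slots end up at 1, 0, 3 and 2 in that order, which shows the backwards probe and the wrap-around to the last slot.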

4. bits

bits stores the class information, including member variables, properties, instance methods, and protocols. To see how they are stored, we need to look at the underlying structure.

4.1 Understanding bits

The only use of bits is to obtain class_rw_t, so class_rw_t is what needs to be analyzed.

Source:

struct class_data_bits_t {
    friend objc_class;

    // Values are the FAST_ flags above.
    uintptr_t bits;

public:
    // Get data, class_rw_t
    class_rw_t* data() const {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
};
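The effect of bits & FAST_DATA_MASK can be shown with plain integers. The sketch below assumes the mask value used on 64-bit x86 macOS builds (0x00007ffffffffff8); the value differs on other platforms, and data_pointer and flag_bits are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value of FAST_DATA_MASK on x86_64; other platforms use another mask.
constexpr uint64_t kFastDataMask = 0x00007ffffffffff8ULL;

// The pointer part of bits: everything selected by the mask.
inline uint64_t data_pointer(uint64_t bits) { return bits & kFastDataMask; }

// Whatever is left over: flag bits stored alongside the pointer.
inline uint64_t flag_bits(uint64_t bits) { return bits & ~kFastDataMask; }
```

The low three bits are free for flags because the class_rw_t allocation is at least 8-byte aligned, so a pointer to it always ends in three zero bits.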

4.2 class_rw_t

Source:

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint16_t witness;
#if SUPPORT_INDEXED_ISA
    uint16_t index;
#endif

    // ro or rwe: a pointer union that holds either ro or rwe
    explicit_atomic<uintptr_t> ro_or_rw_ext;

    Class firstSubclass;
    Class nextSiblingClass;

private:
    using ro_or_rw_ext_t = objc::PointerUnion<const class_ro_t *, class_rw_ext_t *>;

    const ro_or_rw_ext_t get_ro_or_rwe() const {
        return ro_or_rw_ext_t{ro_or_rw_ext};
    }

    void set_ro_or_rwe(const class_ro_t *ro) {
        ro_or_rw_ext_t{ro}.storeAt(ro_or_rw_ext, memory_order_relaxed);
    }

    void set_ro_or_rwe(class_rw_ext_t *rwe, const class_ro_t *ro) {
        // the release barrier is so that the class_rw_ext_t::ro initialization
        // is visible to lockless readers
        rwe->ro = ro;
        ro_or_rw_ext_t{rwe}.storeAt(ro_or_rw_ext, memory_order_release);
    }

    class_rw_ext_t *extAlloc(const class_ro_t *ro, bool deep = false);

public:
    void setFlags(uint32_t set)
    {
        __c11_atomic_fetch_or((_Atomic(uint32_t) *)&flags, set, __ATOMIC_RELAXED);
    }

    void clearFlags(uint32_t clear)
    {
        __c11_atomic_fetch_and((_Atomic(uint32_t) *)&flags, ~clear, __ATOMIC_RELAXED);
    }

    // set and clear must not overlap
    void changeFlags(uint32_t set, uint32_t clear)
    {
        ASSERT((set & clear) == 0);

        uint32_t oldf, newf;
        do {
            oldf = flags;
            newf = (oldf | set) & ~clear;
        } while (!OSAtomicCompareAndSwap32Barrier(oldf, newf, (volatile int32_t *)&flags));
    }

    // get rwe
    class_rw_ext_t *ext() const {
        return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>();
    }

    // allocate the rwe space
    class_rw_ext_t *extAllocIfNeeded() {
        auto v = get_ro_or_rwe();
        if (fastpath(v.is<class_rw_ext_t *>())) {
            return v.get<class_rw_ext_t *>();
        } else {
            return extAlloc(v.get<const class_ro_t *>());
        }
    }

    class_rw_ext_t *deepCopy(const class_ro_t *ro) {
        return extAlloc(ro, true);
    }

    // get ro
    const class_ro_t *ro() const {
        auto v = get_ro_or_rwe();
        if (slowpath(v.is<class_rw_ext_t *>())) {
            return v.get<class_rw_ext_t *>()->ro;
        }
        return v.get<const class_ro_t *>();
    }

    // set ro
    void set_ro(const class_ro_t *ro) {
        auto v = get_ro_or_rwe();
        // why is class_rw_ext_t checked here?
        if (v.is<class_rw_ext_t *>()) {
            v.get<class_rw_ext_t *>()->ro = ro;
        } else {
            set_ro_or_rwe(ro);
        }
    }

    // get the array of method lists
    const method_array_t methods() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->methods;
        } else {
            return method_array_t{v.get<const class_ro_t *>()->baseMethods()};
        }
    }

    // get the array of property lists
    const property_array_t properties() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->properties;
        } else {
            return property_array_t{v.get<const class_ro_t *>()->baseProperties};
        }
    }

    // get the array of protocol lists
    const protocol_array_t protocols() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->protocols;
        } else {
            return protocol_array_t{v.get<const class_ro_t *>()->baseProtocols};
        }
    }
};

The members and functions of the structure are analyzed as follows:

ro_or_rw_ext is a PointerUnion that stores either a pointer to ro or a pointer to rwe, using the lowest bit of the value as a tag to tell the two apart. The structure provides functions for operating on the stored pointer, in particular storeAt() and get().

using ro_or_rw_ext_t = objc::PointerUnion<const class_ro_t *, class_rw_ext_t *>;

template <class PT1, class PT2>
class PointerUnion {
    uintptr_t _value;

    static_assert(alignof(PT1) >= 2, "alignment requirement");
    static_assert(alignof(PT2) >= 2, "alignment requirement");

    struct IsPT1 {
        static const uintptr_t Num = 0;
    };
    struct IsPT2 {
        static const uintptr_t Num = 1;
    };
    template <typename T> struct UNION_DOESNT_CONTAIN_TYPE {};

    uintptr_t getPointer() const {
        return _value & ~1;
    }
    uintptr_t getTag() const {
        return _value & 1;
    }

public:
    explicit PointerUnion(const std::atomic<uintptr_t> &raw)
    : _value(raw.load(std::memory_order_relaxed))
    { }
    PointerUnion(PT1 t) : _value((uintptr_t)t) { }
    PointerUnion(PT2 t) : _value((uintptr_t)t | 1) { }

    // store data
    void storeAt(std::atomic<uintptr_t> &raw, std::memory_order order) const {
        raw.store(_value, order);
    }

    template <typename T>
    bool is() const {
        using Ty = typename PointerUnionTypeSelector<PT1, T, IsPT1,
            PointerUnionTypeSelector<PT2, T, IsPT2,
            UNION_DOESNT_CONTAIN_TYPE<T>>>::Return;
        return getTag() == Ty::Num;
    }

    template <typename T> T get() const {
        ASSERT(is<T>() && "Invalid accessor called");
        return reinterpret_cast<T>(getPointer());
    }

    template <typename T> T dyn_cast() const {
        if (is<T>())
            return get<T>();
        return T();
    }
};
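The tagging idea behind PointerUnion can be reduced to a two-type sketch. This is not the runtime's template: ReadOnlyData, ExtData and RoOrExt are hypothetical stand-ins for class_ro_t, class_rw_ext_t and ro_or_rw_ext_t, and the sketch relies on both types being at least 2-byte aligned so that the lowest bit of a valid pointer is always 0.

```cpp
#include <cassert>
#include <cstdint>

struct ReadOnlyData { int method_count; };                // stand-in for class_ro_t
struct ExtData { const ReadOnlyData* ro; int added; };    // stand-in for class_rw_ext_t

static_assert(alignof(ReadOnlyData) >= 2, "low bit must be free for the tag");
static_assert(alignof(ExtData) >= 2, "low bit must be free for the tag");

// Holds exactly one of the two pointers; the lowest bit says which one.
class RoOrExt {
    uintptr_t value_;

public:
    explicit RoOrExt(const ReadOnlyData* ro_ptr)
        : value_(reinterpret_cast<uintptr_t>(ro_ptr)) {}        // tag 0
    explicit RoOrExt(ExtData* ext_ptr)
        : value_(reinterpret_cast<uintptr_t>(ext_ptr) | 1) {}   // tag 1

    bool is_ext() const { return (value_ & 1) != 0; }

    // Returns the extension pointer, or nullptr when only ro is stored.
    ExtData* ext() const {
        return is_ext()
            ? reinterpret_cast<ExtData*>(value_ & ~uintptr_t{1})
            : nullptr;
    }

    // ro is read through the extension when one exists, directly otherwise.
    const ReadOnlyData* ro() const {
        if (is_ext()) return ext()->ro;
        return reinterpret_cast<const ReadOnlyData*>(value_);
    }
};
```

Since the word can hold only one pointer, the ro pointer has to move into the extension once an extension is stored, which is why ro() checks for the extension first.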

Private functions are provided to read and write the ro_or_rw_ext member. These are the getter and the two setters:

const ro_or_rw_ext_t get_ro_or_rwe() const {
    return ro_or_rw_ext_t{ro_or_rw_ext};
}

void set_ro_or_rwe(const class_ro_t *ro) {
    ro_or_rw_ext_t{ro}.storeAt(ro_or_rw_ext, memory_order_relaxed);
}

void set_ro_or_rwe(class_rw_ext_t *rwe, const class_ro_t *ro) {
    // the release barrier is so that the class_rw_ext_t::ro initialization
    // is visible to lockless readers
    rwe->ro = ro;
    ro_or_rw_ext_t{rwe}.storeAt(ro_or_rw_ext, memory_order_release);
}

The function ext() returns class_rw_ext_t

It calls the private get_ro_or_rwe() function and extracts class_rw_ext_t from the union. If the union currently holds ro, dyn_cast returns a null pointer.

// get rwe
class_rw_ext_t *ext() const {
    return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>();
}

The function ro() returns class_ro_t

If rwe exists, ro is read from rwe; otherwise ro is read directly from rw. Do not confuse this order. The reason is that ro_or_rw_ext holds a single pointer: once rwe has been allocated, the ro pointer is kept inside rwe (see set_ro_or_rwe(rwe, ro) above).

// get ro
const class_ro_t *ro() const {
    auto v = get_ro_or_rwe();
    if (slowpath(v.is<class_rw_ext_t *>())) {
        return v.get<class_rw_ext_t *>()->ro;
    }
    return v.get<const class_ro_t *>();
}

Note also the function extAllocIfNeeded(). It allocates the rwe space and is used when category data is attached to a class, or when methods or properties are created at runtime

// allocate the rwe space
class_rw_ext_t *extAllocIfNeeded() {
    auto v = get_ro_or_rwe();
    if (fastpath(v.is<class_rw_ext_t *>())) {
        return v.get<class_rw_ext_t *>();
    } else {
        return extAlloc(v.get<const class_ro_t *>());
    }
}

Functions that return the array of method lists, the array of property lists, and the array of protocol lists

Pay attention to the order here as well: rwe is checked first, and the base list in ro is used only if there is no rwe

// get the array of method lists
const method_array_t methods() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>()->methods;
    } else {
        return method_array_t{v.get<const class_ro_t *>()->baseMethods()};
    }
}

// get the array of property lists
const property_array_t properties() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>()->properties;
    } else {
        return property_array_t{v.get<const class_ro_t *>()->baseProperties};
    }
}

// get the array of protocol lists
const protocol_array_t protocols() const {
    auto v = get_ro_or_rwe();
    if (v.is<class_rw_ext_t *>()) {
        return v.get<class_rw_ext_t *>()->protocols;
    } else {
        return protocol_array_t{v.get<const class_ro_t *>()->baseProtocols};
    }
}

Note:

rw provides no function for accessing member variables, only methods, properties, and protocols

The only member that holds class information is ro_or_rw_ext, which shows that rw does not store the data itself; the data lives in ro and rwe

rw means read-write: the data can be both read and modified

For every kind of data, first check whether rwe exists. If it does, read the data from rwe; if not, read it from ro

4.3 Understanding class_ro_t

Source:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;

    const char * name;
    // methods, protocols, properties
    method_list_t * baseMethodList;    // List of base methods
    protocol_list_t * baseProtocols;   // List of base protocols
    const ivar_list_t * ivars;         // List of member variables

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;

    // This field exists only when RO_HAS_SWIFT_INITIALIZER is set.
    _objc_swiftMetadataInitializer __ptrauth_objc_method_list_imp _swiftMetadataInitializer_NEVER_USE[0];

    _objc_swiftMetadataInitializer swiftMetadataInitializer() const {
        if (flags & RO_HAS_SWIFT_INITIALIZER) {
            return _swiftMetadataInitializer_NEVER_USE[0];
        } else {
            return nil;
        }
    }

    method_list_t *baseMethods() const {
        return baseMethodList;
    }

    class_ro_t *duplicate() const {
        if (flags & RO_HAS_SWIFT_INITIALIZER) {
            size_t size = sizeof(*this) + sizeof(_swiftMetadataInitializer_NEVER_USE[0]);
            class_ro_t *ro = (class_ro_t *)memdup(this, size);
            ro->_swiftMetadataInitializer_NEVER_USE[0] = this->_swiftMetadataInitializer_NEVER_USE[0];
            return ro;
        } else {
            size_t size = sizeof(*this);
            class_ro_t *ro = (class_ro_t *)memdup(this, size);
            return ro;
        }
    }
};

Description:

It holds a list of base methods, a list of base protocols, a list of member variables, and a list of base properties
The word "base" is used because this is where the class's own data is placed when the class is loaded into memory. Data attached from categories or created dynamically at runtime is not stored here but in rwe
In other words, ro stores only the data of the class itself
The member variable list exists only in ro and not in rwe, so it is read-only and cannot be modified at runtime
Since rw has no function for fetching member variables, they can only be obtained indirectly through ro

4.4 class_rw_ext_t

Source:

struct class_rw_ext_t {
    const class_ro_t *ro;
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};

Description:

rwe also contains ro, which means rwe holds not only category data and data added at runtime but also a reference to the base data of the class.

The method, property, and protocol accessors of rw also show that the lists in rwe include the base lists from ro. This becomes clearer when the class loading process is analyzed later: the data in ro is copied directly into rwe

In other words, ro data can be reached in three ways: 1) rw -> ro -> methods; 2) rw -> rwe -> ro -> methods; 3) rw -> rwe -> methods
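The relationship between the ro lists and the rwe lists can be modeled with ordinary containers. This is only a sketch: Ro, Rwe, make_ext and attach are hypothetical names, method names are plain strings, and placing newly attached entries in front of the base entries is an assumption about how the runtime orders attached lists.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Ro {                                   // stand-in for class_ro_t
    std::vector<std::string> base_methods;
};

struct Rwe {                                  // stand-in for class_rw_ext_t
    const Ro* ro;                             // rwe keeps a pointer back to ro
    std::vector<std::string> methods;         // starts as a copy of the base list
};

// Creating the extension copies the base list out of ro.
inline Rwe make_ext(const Ro& ro) {
    return Rwe{&ro, ro.base_methods};
}

// Attaching category or runtime methods only changes rwe; ro is untouched.
inline void attach(Rwe& rwe, const std::vector<std::string>& added) {
    rwe.methods.insert(rwe.methods.begin(), added.begin(), added.end());
}
```

After an attach, the base methods are visible both through rwe.methods and through rwe.ro->base_methods, while the attached method is visible only through rwe.methods.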

4.5 List Data

The protocol list will be analyzed later and is not covered here.

4.5.1 List Array

All list arrays share the same format, so only the method list array is used as an example. A method list is a method_list_t, and a single method is a method_t. Source:

class method_array_t :
    public list_array_tt<method_t, method_list_t>
{
    typedef list_array_tt<method_t, method_list_t> Super;

public:
    method_array_t() : Super() { }
    method_array_t(method_list_t *l) : Super(l) { }

    method_list_t * const *beginCategoryMethodLists() const {
        return beginLists();
    }

    method_list_t * const *endCategoryMethodLists(Class cls) const;

    method_array_t duplicate() {
        return Super::duplicate<method_array_t>();
    }
};

4.5.2 Method list, member variable list and property list

All three are stored in entsize_list_tt, whose layout will be examined later

// Two bits of entsize are used for fixup markers.
struct method_list_t : entsize_list_tt<method_t, method_list_t, 0x3> {
    bool isUniqued() const;
    bool isFixedUp() const;
    void setFixedUp();

    uint32_t indexOfMethod(const method_t *meth) const {
        uint32_t i =
            (uint32_t)(((uintptr_t)meth - (uintptr_t)this) / entsize());
        ASSERT(i < count);
        return i;
    }
};

struct ivar_list_t : entsize_list_tt<ivar_t, ivar_list_t, 0> {
    bool containsIvar(Ivar ivar) const {
        return (ivar >= (Ivar)&*begin()  &&  ivar < (Ivar)&*end());
    }
};

struct property_list_t : entsize_list_tt<property_t, property_list_t, 0> {
};

4.5.3 Method, attribute, member variable structure

4.5.3.1 method_t

Source:

struct method_t {
    SEL name;
    const char *types;
    MethodListIMP imp;

    struct SortBySELAddress :
        public std::binary_function<const method_t&,
                                    const method_t&, bool>
    {
        bool operator() (const method_t& lhs,
                         const method_t& rhs)
        { return lhs.name < rhs.name; }
    };
};

Description:

A method contains the SEL, the method type encoding, and the function pointer IMP
It also provides a comparator for sorting, which compares two methods by the address of their SEL. It is used when the class is loaded to decide the order of two methods, so just remember for now that it exists
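Why a sort order on the SEL address is useful can be sketched as follows: once a list is sorted by that address, a lookup can use binary search. The code is a simplified model with hypothetical names (Method, SortByName, sort_methods, find_method), integers standing in for SEL addresses and IMPs, and no claim that the runtime's lookup looks like this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Method {
    uint64_t name;       // stands in for the SEL address
    const char* types;   // type encoding string
    int imp;             // placeholder for the function pointer
};

// Same idea as SortBySELAddress: order two methods by their name address.
struct SortByName {
    bool operator()(const Method& lhs, const Method& rhs) const {
        return lhs.name < rhs.name;
    }
};

inline void sort_methods(std::vector<Method>& methods) {
    std::stable_sort(methods.begin(), methods.end(), SortByName{});
}

// Binary search in a list that has already been sorted by name address.
inline const Method* find_method(const std::vector<Method>& sorted,
                                 uint64_t name) {
    auto it = std::lower_bound(
        sorted.begin(), sorted.end(), name,
        [](const Method& m, uint64_t key) { return m.name < key; });
    return (it != sorted.end() && it->name == name) ? &*it : nullptr;
}
```

Note that std::binary_function, which the quoted comparator derives from, was removed in C++17; the sketch uses a plain function object instead.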

4.5.3.2 ivar_t

Source:

struct ivar_t {
#if __x86_64__
    // *offset was originally 64-bit on some x86_64 platforms.
    // We read and write only 32 bits of it.
    // Some metadata provides all 64 bits. This is harmless for unsigned
    // little-endian values.
    // Some code uses all 64 bits. class_addIvar() over-allocates the
    // offset for their benefit.
#endif
    int32_t *offset;         // offset
    const char *name;        // name
    const char *type;        // type
    // alignment is sometimes -1; use alignment() instead
    uint32_t alignment_raw;  // alignment, stored as a shift count
    uint32_t size;

    uint32_t alignment() const {
        // 8-byte alignment
        if (alignment_raw == ~(uint32_t)0) return 1U << WORD_SHIFT;
        return 1 << alignment_raw;
    }
};

Description:

A member variable carries only its offset, name, type encoding, alignment, and size; there is no other data
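The alignment() accessor decodes alignment_raw as a power-of-two exponent. A standalone sketch of the same rule is shown below; it assumes a 64-bit platform where WORD_SHIFT is 3, and ivar_alignment and kWordShift are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kWordShift = 3;  // assumed: 64-bit platform, 1 << 3 = 8 bytes

// alignment_raw holds log2 of the alignment; all-ones (that is, -1)
// means "unknown", in which case word alignment is used.
inline uint32_t ivar_alignment(uint32_t alignment_raw) {
    if (alignment_raw == ~uint32_t{0}) return 1u << kWordShift;
    return 1u << alignment_raw;
}
```

For example, a raw value of 2 corresponds to the 4-byte alignment of an int ivar such as _age.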

4.5.3.3 property_t

Source:

struct property_t {
    const char *name;
    const char *attributes;
};

Description:

Besides the property name, a property has an attributes string, which encodes things such as copy, strong, and so on.
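How such an attributes string can be read is sketched below. It assumes the comma-separated encoding documented for declared properties (for example "Ti,N,V_age" for a nonatomic int property backed by _age, with single-letter codes such as C for copy and N for nonatomic); split_attributes and has_flag are hypothetical helpers, and type encodings that themselves contain commas are not handled.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split an attributes string such as "Ti,N,V_age" at the commas.
inline std::vector<std::string> split_attributes(const std::string& attributes) {
    std::vector<std::string> parts;
    std::stringstream stream(attributes);
    std::string item;
    while (std::getline(stream, item, ',')) {
        parts.push_back(item);
    }
    return parts;
}

// True when a single-letter code (for example 'C' or 'N') is present.
inline bool has_flag(const std::string& attributes, char code) {
    for (const std::string& part : split_attributes(attributes)) {
        if (part.size() == 1 && part[0] == code) return true;
    }
    return false;
}
```

For the age property of WYPerson the first element would describe the type and the last one the backing ivar name.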

4.6 Validation

Use LLDB to check whether the class information in bits matches the class we defined.

Use a pointer offset to reach bits, then read rwe and ro from rw. From these, the method list, protocol list, property list, and member variable list can all be obtained.

4.6.1 Obtaining bits

isa: inherited from objc_object, 8 bytes
Class superclass: a Class, which is a pointer, 8 bytes
cache

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets;  // a pointer, 8 bytes
    explicit_atomic<mask_t> _mask;                // mask_t is an alias for unsigned int, 4 bytes
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    explicit_atomic<uintptr_t> _maskAndBuckets;   // pointer-sized, 8 bytes
    mask_t _mask_unused;                          // mask_t, 4 bytes
#endif
#if __LP64__
    uint16_t _flags;     // uint16_t is an alias for unsigned short, 2 bytes
#endif
    uint16_t _occupied;  // uint16_t is an alias for unsigned short, 2 bytes
    // (remaining members and functions omitted)

Taking the 64-bit real-device layout as the example:

_maskAndBuckets is a uintptr_t, which is pointer-sized, 8 bytes
_mask_unused is a mask_t, 4 bytes
_flags is a uint16_t, an alias for unsigned short, 2 bytes
_occupied is also a uint16_t, 2 bytes

Calculation:

Size of cache_t = 8 + 4 + 2 + 2 = 16 bytes

Conclusion: isa (8) + superclass (8) + cache (16) = 32, so bits is located 32 bytes after the start address of the class
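The byte counting above can be checked mechanically with a mock layout. The structs below are hypothetical stand-ins that only mirror the member sizes listed in this section (pointers are modeled as 64-bit integers); they are not the runtime's definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MockCache {
    uint64_t mask_and_buckets;  // 8 bytes
    uint32_t mask_unused;       // 4 bytes
    uint16_t flags;             // 2 bytes
    uint16_t occupied;          // 2 bytes
};

struct MockClass {
    uint64_t isa;         // 8 bytes
    uint64_t superclass;  // 8 bytes
    MockCache cache;      // 16 bytes
    uint64_t bits;        // starts at offset 32
};

static_assert(sizeof(MockCache) == 16, "cache is 8 + 4 + 2 + 2 bytes");
static_assert(offsetof(MockClass, cache) == 16, "cache follows isa and superclass");
static_assert(offsetof(MockClass, bits) == 32, "bits follows the cache");

// Address of bits for a class that starts at class_address.
inline uint64_t bits_address(uint64_t class_address) {
    return class_address + offsetof(MockClass, bits);
}
```

For a class at 0x100002250, bits_address returns 0x100002270, because 32 is 0x20 in hexadecimal.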

4.6.2 Obtaining the information

[Step 1] Get bits by pointer offset

Description:

The offset is 32 bytes, which is 0x20 in hexadecimal, so the address changes from 0x100002250 to 0x100002270.

[Step 2] Get rw

Description:

p *$3->data() reads the data behind bits.data(). $3 is a pointer, so data() is called through ->, and the leading * dereferences the result
p *$4 prints all the information held by this data
All of the information is stored in ro_or_rw_ext

[Step 3] Get the list of methods

p $4.methods() gets the method information
p $5.list gets the method list
p *$6 dereferences the list; what is printed is the first entry in the method list, that is, the first method
The first method can also be obtained with p $7.get(0)

[Step 4] Get the property list

The steps are the same as for the method list, and the underlying implementation of rw has been covered in detail above, so only the steps are shown here.

[Step 5] Get the list of member variables

ro has to be obtained first, and the member variable list is then read from ro, because rw has no function for fetching member variables

4.7 Summary

The underlying structure of a class in OC is objc_class, which inherits from the objc_object structure

bits in objc_class stores properties, methods, protocols, and member variables. Member variables are not reachable directly from class_rw_t; they are stored in class_ro_t, which is clean memory

class_rw_t gives access to all the information about the class, through ro and rwe

To read data from rw, first check whether rwe exists. If it does, read the data through rwe; otherwise read it through ro

ro is clean memory. The data of the class itself is stored in ro when the class is loaded

rwe is dirty memory. Members added by categories or created dynamically at runtime are all in rwe, including properties, methods, and protocols.

The property, method, and protocol lists in rwe also include the data from ro

The rwe structure contains ro

5. Brief summary

The underlying structure of a class in OC is objc_class, which inherits from the objc_object structure

The objc_class structure has four members: isa, superclass, cache, and bits

isa is inherited from objc_object and carries the metaclass information

superclass is also a class structure and represents the parent class

cache is the method cache. SEL and IMP are stored in a hash table, and during fast message sending the IMP is looked up in the cache by SEL

Both storing into the cache and reading from it compute the index with the hash algorithm

bits stores the class information, including properties, methods, protocols, and member variables. The data behind bits is rw, which is split into two parts: the clean memory ro stores only the data of the class itself, and the dirty memory rwe stores category data and data created dynamically at runtime

Metaclasses are defined and created by the compiler to describe class objects: a metaclass is the class of a class object. Class methods are stored in the metaclass as instance methods.

The metaclass itself is not visible and cannot be used directly. Its information can be viewed through the isa of the class

Inheritance exists between classes and between metaclasses, not between objects. The root metaclass inherits from NSObject, and the superclass of NSObject is nil, meaning it has no parent: NSObject is where everything starts.

The isa of an object points to its class, the isa of the class points to the metaclass, the isa of the metaclass points to the root metaclass, and the isa of the root metaclass points to itself

The isa of the NSObject class also points to the root metaclass

Pointer offsets are used throughout the LLDB verification. If anything is unclear, see the linked analysis of the pointer offset principle
