1. Review
isa was covered in iOS low-level exploration (part 1) and bits in iOS low-level exploration (part 2). That leaves cache, which has not been explored yet, so this post focuses on the cache member.
2. The structure of the cache
Our goal is to explore the cache by first understanding its structure and then analyzing it.
struct objc_class : objc_object {
objc_class(const objc_class&) = delete;
objc_class(objc_class&&) = delete;
void operator=(const objc_class&) = delete;
void operator=(objc_class&&) = delete;
// Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
// ... code omitted here ...
};
We can see from the class structure that cache is of type cache_t, so let’s take a look inside cache_t.
2.1 cache_t
struct cache_t {
private:
explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
union {
struct {
explicit_atomic<mask_t> _maybeMask;
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied;
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
// _bucketsAndMaybeMask is a buckets_t pointer
// _maybeMask is the buckets mask
static constexpr uintptr_t bucketsMask = ~0ul;
static_assert(!CONFIG_USE_PREOPT_CACHES, "preoptimized caches not supported");
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
static constexpr uintptr_t maskShift = 48;
static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << maskShift) - 1;
static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers.");
#if CONFIG_USE_PREOPT_CACHES
static constexpr uintptr_t preoptBucketsMarker = 1ul;
static constexpr uintptr_t preoptBucketsMask = bucketsMask & ~preoptBucketsMarker;
#endif
// ... code omitted here ...
};
The structure of cache_t is easy to read from the underlying source code, and it can also be inspected by debugging with LLDB. The console output shows exactly the same structure. Reading the cache_t source, we also find that it is handled differently for three architectures. On the real-device architecture, the mask and the buckets pointer are stored together in a single field as an optimization, and each value is extracted through its own mask.
CACHE_MASK_STORAGE_OUTLINED: the running environment is the simulator or macOS
CACHE_MASK_STORAGE_HIGH_16: the running environment is a 64-bit real device
CACHE_MASK_STORAGE_LOW_4: the running environment is a non-64-bit real device
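To make the "stored together, extracted through masks" idea concrete, here is a minimal sketch in C++. The constants maskShift and bucketsMask are copied from the HIGH_16_BIG_ADDRS branch quoted above. The three helper functions are hypothetical names invented for this illustration, not runtime API, and the sketch assumes a 64-bit uintptr_t.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(uintptr_t) == 8, "sketch assumes a 64-bit target");

// Constants as quoted from the CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS branch.
static constexpr uintptr_t maskShift   = 48;
static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << maskShift) - 1;

// Hypothetical helper: put the mask in the high 16 bits and the
// buckets address in the low 48 bits of a single word.
inline uintptr_t packBucketsAndMask(uintptr_t buckets, uint16_t mask) {
    return ((uintptr_t)mask << maskShift) | (buckets & bucketsMask);
}

// Hypothetical helper: recover the buckets address with bucketsMask.
inline uintptr_t unpackBuckets(uintptr_t packed) {
    return packed & bucketsMask;
}

// Hypothetical helper: recover the mask by shifting the high bits down.
inline uint16_t unpackMask(uintptr_t packed) {
    return (uint16_t)(packed >> maskShift);
}

// Small self-check showing that both values survive a round trip.
inline void checkRoundTrip(uintptr_t buckets, uint16_t mask) {
    uintptr_t packed = packBucketsAndMask(buckets, mask);
    assert(unpackBuckets(packed) == (buckets & bucketsMask));
    assert(unpackMask(packed) == mask);
}
```

For example, packing a buckets address of 0x0000000100609020 with a mask of 7 gives 0x0007000100609020. These two numbers are borrowed from the LLDB session later in this post purely for illustration. That session runs on a build where the mask is stored separately in _maybeMask.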
The following code is also found in the cache_t structure
static bucket_t *emptyBuckets();
static bucket_t *allocateBuckets(mask_t newCapacity);
static bucket_t *emptyBucketsForCapacity(mask_t capacity, bool allocate = true);
static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);
void bad_cache(id receiver, SEL sel) __attribute__((noreturn, cold));
All of these declarations operate on bucket_t, so what role does bucket_t play?
2.2 bucket_t
Here is the bucket_t core code
struct bucket_t {
private:
// IMP-first is better for arm64e ptrauth and no worse for arm64.
// SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
explicit_atomic<uintptr_t> _imp;
explicit_atomic<SEL> _sel;
#else
explicit_atomic<SEL> _sel;
explicit_atomic<uintptr_t> _imp;
#endif
// Compute the ptrauth signing modifier from &_imp, newSel, and cls.
uintptr_t modifierForSEL(bucket_t *base, SEL newSel, Class cls) const {
return (uintptr_t)base ^ (uintptr_t)newSel ^ (uintptr_t)cls;
}
// ... code omitted here ...
};
From the bucket_t source above we can see that bucket_t stores a SEL and an IMP. There are two versions, one for real devices (arm64) and one for everything else, and the only difference between them is the order of _sel and _imp. So the cache is a method cache 😁.
A simple structure can be drawn as follows
So is it really a method cache? And how are methods cached? Let's explore further.
2.2.1 buckets()
As you can see from the source, there is a buckets() method to get bucket_t
(lldb) p $2.buckets()
(bucket_t *) $3 = 0x00000001003623c0
(lldb) p *$3
(bucket_t) $4 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) {
Value = nil
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 0
}
}
}
(lldb)
How embarrassing, there is nothing! _sel is nil. Isn't this supposed to be a method cache? Where did the method go? The reason is that we have not called any method yet, so there is nothing to cache. Let's call a method and look again.
(lldb) p [p sayHello]
2021-06-25 15:37:47.401935+0800 JPBuild[18788:5401308] -[JPPerson sayHello]
(lldb) p/x pClass
(Class) $5 = 0x0000000100008688 JPPerson
(lldb) p (cache_t*)0x0000000100008698
(cache_t *) $6 = 0x0000000100008698
(lldb) p *$6
(cache_t) $7 = {
_bucketsAndMaybeMask = {
std::__1::atomic<unsigned long> = {
Value = 4301295648
}
}
= {
= {
_maybeMask = {
std::__1::atomic<unsigned int> = {
Value = 7
}
}
_flags = 32808
_occupied = 1
}
_originalPreoptCache = {
std::__1::atomic<preopt_cache_t *> = {
Value = 0x0001802800000007
}
}
}
}
(lldb) p *$7.buckets()
(bucket_t) $8 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) {
Value = nil
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 0
}
}
}
(lldb)
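A side note on the output above: _originalPreoptCache shares a union with the struct that holds _maybeMask, _flags and _occupied, so its Value is the same 8 bytes read as a single word. The sketch below splits that word back into the three fields. It assumes the little-endian LP64 layout of this session, and CacheFields and splitUnionWord are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// The three fields that overlay the 64-bit union word.
struct CacheFields {
    uint32_t maybeMask;  // low 32 bits
    uint16_t flags;      // next 16 bits
    uint16_t occupied;   // top 16 bits
};

// Hypothetical helper: split the union word into its fields.
// Assumes little-endian field order, as on x86_64 and arm64.
inline CacheFields splitUnionWord(uint64_t word) {
    CacheFields f;
    f.maybeMask = (uint32_t)(word & 0xFFFFFFFFu);
    f.flags     = (uint16_t)((word >> 32) & 0xFFFFu);
    f.occupied  = (uint16_t)((word >> 48) & 0xFFFFu);
    return f;
}

// Self-check against the values printed in the LLDB session.
inline void checkSessionValues() {
    CacheFields f = splitUnionWord(0x0001802800000007);
    assert(f.maybeMask == 7);
    assert(f.flags == 32808);   // 0x8028
    assert(f.occupied == 1);
}
```

Reading 0x0001802800000007 this way gives _maybeMask = 7, _flags = 0x8028 (32808) and _occupied = 1, which matches the fields LLDB printed one by one.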
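2.2.2 sel() and imp()
What 👻? Still nothing! But notice that _maybeMask, _flags and _occupied now have values, so something must have been stored. buckets() returns a pointer to an array of bucket_t, and the first slot is not necessarily the occupied one. So let's index into the other slots and read each one with the sel() method that bucket_t provides.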
(lldb) p $7.buckets()[1]
(bucket_t) $10 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) {
Value = nil
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 0
}
}
}
(lldb) p $7.buckets()
(bucket_t *) $11 = 0x0000000100609020
(lldb) p *$11
(bucket_t) $12 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) {
Value = nil
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 0
}
}
}
(lldb) p $12.sel()
(SEL) $13 = (null)
(lldb) p $7.buckets()[2]
(bucket_t) $14 = {
_sel = {
std::__1::atomic<objc_selector *> = (null) {
Value = nil
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 0
}
}
}
(lldb) p $7.buckets()[3]
(bucket_t) $15 = {
_sel = {
std::__1::atomic<objc_selector *> = "" {
Value = ""
}
}
_imp = {
std::__1::atomic<unsigned long> = {
Value = 49128
}
}
}
(lldb) p $15.sel()
(SEL) $16 = "sayHello"
(lldb)
The LLDB session above finds the method we called, sayHello, in slot 3 of the bucket array. The IMP can be read as well, through the following bucket_t method:
inline IMP imp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, Class cls) const {
uintptr_t imp = _imp.load(memory_order_relaxed);
if (!imp) return nil;
// ... code omitted here ...
}
The output is as follows
(lldb) p $15.imp(nil,pClass)
(IMP) $17 = 0x0000000100003960 (JPBuild`-[JPPerson sayHello])
(lldb)
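The manual steps above (probe slots until one has a selector, then decode its IMP) can be sketched in plain C++. DemoBucket, firstOccupied and decodeImp are hypothetical names for this illustration. The decoding rule is an assumption: the body of imp() is cut off in the listing above, but the session's numbers are consistent with the stored value being the real IMP XOR-ed with the class address (49128 is 0xBFE8, and 0xBFE8 ^ 0x100008688 = 0x100003960).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for bucket_t on this build (SEL first, then IMP).
struct DemoBucket {
    const char *sel;  // selector name; nullptr marks an empty slot
    uintptr_t   imp;  // encoded IMP, as stored in _imp
};

// Returns the index of the first occupied slot, or -1 if all are empty.
inline int firstOccupied(const DemoBucket *buckets, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        if (buckets[i].sel != nullptr) return (int)i;
    }
    return -1;
}

// Assumption: stored value = real IMP XOR class address.
// An empty slot (0) decodes to 0, mirroring "if (!imp) return nil".
inline uintptr_t decodeImp(uintptr_t stored, uintptr_t cls) {
    return stored == 0 ? 0 : (stored ^ cls);
}

// Self-check using the four slots inspected in the LLDB session.
inline void checkSessionBuckets() {
    DemoBucket slots[4] = {
        {nullptr, 0}, {nullptr, 0}, {nullptr, 0}, {"sayHello", 49128}
    };
    int idx = firstOccupied(slots, 4);
    assert(idx == 3);
    assert(decodeImp(slots[idx].imp, 0x0000000100008688) == 0x0000000100003960);
}
```

This also explains why p $15.imp(nil, pClass) needs the class argument: without it, the raw 49128 in _imp cannot be turned back into the address of -[JPPerson sayHello].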
3. Analysis without the source project
So far we have inspected the structures inside the runtime source project and analyzed them with LLDB. Can we do the same without a debuggable copy of the source? We can: the next step is to replicate the source structures in our own code and analyze the cache through them directly.
3.1 Small-scale sampling: imitating the source structures
typedef uint32_t mask_t; // x86_64 & arm64 asm are less efficient with 16-bits
struct jp_bucket_t {
SEL _sel;
IMP _imp;
};
struct jp_cache_t {
struct jp_bucket_t *_bukets; // 8 bytes
mask_t _maybeMask; // 4 bytes
uint16_t _flags; // 2 bytes
uint16_t _occupied; // 2 bytes
};
struct jp_class_data_bits_t {
uintptr_t bits;
};
// cache class
struct jp_objc_class {
Class isa; // In the source, objc_class inherits its isa member from objc_object.
// Our copy drops that inheritance, so isa has to be declared explicitly.
// Without it every later member shifts by 8 bytes and the printed values are wrong.
Class superclass;
struct jp_cache_t cache; // formerly cache pointer and vtable
struct jp_class_data_bits_t bits;
};
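Why does the explicit isa matter? A quick layout check shows it. The sketch below is a plain C++ stand-in for the structs above, with void * replacing Class, SEL and IMP so it compiles without the Objective-C runtime. All demo_ names and cacheAddress are hypothetical, and the sketch assumes an LP64 target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint32_t mask_t;

// Stand-ins for the jp_ structs; void * replaces Class, SEL and IMP.
struct demo_bucket_t { void *_sel; void *_imp; };
struct demo_cache_t {
    demo_bucket_t *_buckets;  // 8 bytes
    mask_t   _maybeMask;      // 4 bytes
    uint16_t _flags;          // 2 bytes
    uint16_t _occupied;       // 2 bytes
};
struct demo_class_data_bits_t { uintptr_t bits; };
struct demo_objc_class {
    void *isa;                // 8 bytes, declared explicitly
    void *superclass;         // 8 bytes
    demo_cache_t cache;       // starts at offset 16
    demo_class_data_bits_t bits;
};

static_assert(sizeof(void *) == 8, "sketch assumes an LP64 target");

// Hypothetical helper: address of the cache member for a class address.
inline uintptr_t cacheAddress(uintptr_t cls) {
    return cls + offsetof(demo_objc_class, cache);
}

// Self-check: the layout matches the offsets used in the LLDB session.
inline void checkLayout() {
    assert(sizeof(demo_cache_t) == 16);
    assert(offsetof(demo_objc_class, cache) == 16);
    assert(cacheAddress(0x0000000100008688) == 0x0000000100008698);
}
```

With isa and superclass taking 8 bytes each, cache sits at offset 16. That is the same 0x10 offset used earlier to go from the class address 0x0000000100008688 to the cache_t address 0x0000000100008698.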
Methods of the test class
@implementation LGPerson
- (void)say1{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say2{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say3{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say4{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say5{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say6{
NSLog(@"LGPerson say : %s",__func__);
}
- (void)say7{
NSLog(@"LGPerson say : %s",__func__);
}
+ (void)sayHappy{
NSLog(@"LGPerson say : %s",__func__);
}
@end
3.2 Code Testing
Call two methods, then print the cache to see whether they were cached.
Both of the called methods are printed, so let's call a few more methods.
Comparing the output of the two tests, the values of _occupied and _maybeMask change as the number of method calls grows. So what exactly are _occupied and _maybeMask?
See the analysis in the next post:
iOS Low-level exploration structures – Cache Analysis (part 2)