This is the 9th day of my participation in Gwen Challenge

Preface

We've already explored how entries are inserted into the method cache. Today we'll dig into some details of the method cache and into what triggers the insertion from the upper layers.

Key details of the method cache

Why is capacity expansion done at 3/4 of capacity?

  1. A load factor of 3/4 (0.75) is a common convention in hash table implementations; at 0.75 the space utilization is still fairly high.
  2. When an entry is stored in the cache, the value computed by the hash function is used as the storage index, so how much free space remains is critical to whether indexes collide. With a load factor of 3/4, the probability of hash collisions stays relatively low.
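To make the 3/4 rule concrete, here is a minimal C++ sketch of the expansion check. The helper names fill_limit and needs_expansion are hypothetical, and the sketch leaves out details such as any end-marker slot the real cache may reserve. It only shows the arithmetic: with a capacity of 4, the third entry still fits and the fourth triggers growth.

```cpp
#include <cstdint>

// Hypothetical helper: the highest occupancy allowed before the cache grows,
// assuming a 3/4 load factor.
static inline uint32_t fill_limit(uint32_t capacity) {
    return capacity * 3 / 4;
}

// Returns true if storing one more entry would exceed the 3/4 limit.
static inline bool needs_expansion(uint32_t occupied, uint32_t capacity) {
    return occupied + 1 > fill_limit(capacity);
}
```

For example, needs_expansion(2, 4) is false (the third entry fits), while needs_expansion(3, 4) is true.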

The role of _bucketsAndMaybeMask

  1. _bucketsAndMaybeMask holds the first address of the buckets memory and, depending on the architecture, the buckets mask as well;

Official explanation:

// _bucketsAndMaybeMask is a buckets_t pointer;
// _maybeMask is the buckets mask
  2. Once we have _bucketsAndMaybeMask, we can reach the next bucket's address by offsetting in memory;
  3. The buckets() method simply operates on the value stored in _bucketsAndMaybeMask to obtain a bucket_t pointer;

The code is as follows:

struct bucket_t *cache_t::buckets() const
{
    uintptr_t addr = _bucketsAndMaybeMask.load(memory_order_relaxed);
    return (bucket_t *)(addr & bucketsMask); // bitwise AND of _bucketsAndMaybeMask and bucketsMask
}
static constexpr uintptr_t bucketsMask = ~0ul; // every bit of bucketsMask is 1
  4. How the mask is stored, and therefore the value of bucketsMask, depends on the architecture;
#if defined(__arm64__) && __LP64__ // arm64 with 64-bit pointers
    #if TARGET_OS_OSX || TARGET_OS_SIMULATOR // macOS or simulator
    #define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS // mask in the high 16 bits, large addresses
    #else
    #define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_HIGH_16 // mask in the high 16 bits
    #endif
#elif defined(__arm64__) && !__LP64__ // arm64 without 64-bit pointers
    #define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_LOW_4 // mask in the low 4 bits
#else
    #define CACHE_MASK_STORAGE CACHE_MASK_STORAGE_OUTLINED // value 1: mask kept in a separate field
#endif
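As a rough illustration of how one word can carry both the buckets address and the mask, the sketch below assumes a layout with the mask in the top 16 bits and the address in the low 48 bits. The constants kMaskShift and kBucketsMask and the pack/unpack helpers are made up for this example; the real bit layout differs between the CACHE_MASK_STORAGE modes.

```cpp
#include <cstdint>

// Assumed layout, for illustration only: mask in the top 16 bits,
// buckets address in the low 48 bits.
constexpr unsigned kMaskShift = 48;
constexpr uint64_t kBucketsMask = (1ULL << kMaskShift) - 1;

// Combine an address and a mask into a single 64-bit word.
static inline uint64_t pack(uint64_t bucketsAddr, uint16_t mask) {
    return (static_cast<uint64_t>(mask) << kMaskShift) | (bucketsAddr & kBucketsMask);
}

// Recover the buckets address by ANDing with the address mask.
static inline uint64_t unpack_buckets(uint64_t word) {
    return word & kBucketsMask;
}

// Recover the mask by shifting the high bits down.
static inline uint16_t unpack_mask(uint64_t word) {
    return static_cast<uint16_t>(word >> kMaskShift);
}
```

When the mask lives in a separate field instead, the address mask is all ones and the AND leaves the address unchanged, which matches the bucketsMask = ~0ul case shown earlier.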

Reading values from buckets

  1. The buckets() method simply operates on the value stored in _bucketsAndMaybeMask to obtain a bucket_t pointer;
  2. The $n returned by buckets() is of type bucket_t *. When reading values in LLDB we often write p $n[1]; fetching by subscript makes it look very much like reading from an array, but buckets() does not actually return an array;
  3. The bucket_t returned by buckets() is just a struct, but the space allocated for the bucket_t entries is one contiguous block of memory, so knowing the location of the current bucket_t is enough to reach the adjacent one by offsetting in memory;
  4. p $n[1] and p *($n+1) have the same effect;
  5. Arrays are read by offsetting in memory too: indexing is a +1 operation on the address, and the size of each step is determined by the element type of the array;
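The points above can be checked with a small C++ sketch. Bucket is a hypothetical stand-in for bucket_t, and make_buckets is a helper invented for this example; it only demonstrates that subscripting a pointer into contiguous memory is the same as offsetting the pointer by one element.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for bucket_t: two pointer-sized fields.
struct Bucket {
    uintptr_t sel;
    uintptr_t imp;
};

// Allocates `count` contiguous buckets and returns a pointer to the first one.
// Each bucket's sel is set to 100 + index so entries can be told apart.
static Bucket *make_buckets(std::size_t count) {
    Bucket *b = static_cast<Bucket *>(std::calloc(count, sizeof(Bucket)));
    for (std::size_t i = 0; b && i < count; ++i) {
        b[i].sel = 100 + i;
    }
    return b;
}
```

With Bucket *b = make_buckets(4), the expressions b[1] and *(b + 1) name the same element, and b + 1 sits exactly sizeof(Bucket) bytes after b.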

The hash function (cache_hash) and the second hash function (cache_next)

  1. The first hash function, cache_hash
mask_t m = capacity - 1; // capacity - 1, e.g. 4 - 1 = 3
mask_t begin = cache_hash(sel, m); // Compute the hash from sel and capacity - 1
mask_t i = begin;

static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    uintptr_t value = (uintptr_t)sel;// Convert sel to an unsigned long integer
#if CONFIG_USE_PREOPT_CACHES
    value ^= value >> 7;
#endif
    return (mask_t)(value & mask); // Bitwise AND of the transformed sel value and the mask
}
  2. The second hash function, cache_next

cache_next(i, m) // On a collision, cache_next computes the next slot from the current index and capacity - 1

static inline mask_t cache_next(mask_t i, mask_t mask) {
    // If i is not 0, return i - 1; otherwise return mask (capacity - 1).
    // In other words: if the collision is not at the start of buckets, step back by one slot;
    // if it is at the start of buckets, jump to slot capacity - 1 and keep stepping back
    // until the probe reaches begin again.
    return i ? i - 1 : mask;
}
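To see the two functions working together, here is a small self-contained sketch that records the order in which slots are visited. It copies cache_hash without the CONFIG_USE_PREOPT_CACHES branch and uses a plain uintptr_t in place of SEL; probe_order is a helper invented for this example.

```cpp
#include <cstdint>
#include <vector>

using mask_t = uint32_t;

// Simplified cache_hash: the selector address ANDed with the mask.
static inline mask_t cache_hash(uintptr_t sel, mask_t mask) {
    return static_cast<mask_t>(sel & mask);
}

// Same rule as cache_next above: step back by one, wrapping from 0 to mask.
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return i ? i - 1 : mask;
}

// Hypothetical helper: returns the slots visited for `sel`, starting at the
// hashed slot and stopping when the probe wraps around to it again.
static std::vector<mask_t> probe_order(uintptr_t sel, mask_t capacity) {
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;
    std::vector<mask_t> order;
    do {
        order.push_back(i);
    } while ((i = cache_next(i, m)) != begin);
    return order;
}
```

With a capacity of 4 and a selector value of 0x1001, the hash is 0x1001 & 3 = 1, so the slots are visited in the order 1, 0, 3, 2. Every slot is tried exactly once before the probe returns to begin.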

The call flow of insert()

  1. To find out when insert() is triggered, we look at the call stack from inside the insert() method;
  2. The stack is a LIFO data structure, so index = 0 is the most recent call. Reading the frames in order starting from 0, we find that log_and_fill_cache calls cache.insert;

static void
log_and_fill_cache(Class cls, IMP imp, SEL sel, id receiver, Class implementer)
{
#if SUPPORT_MESSAGE_LOGGING
    if (slowpath(objcMsgLogEnabled && implementer)) {
        bool cacheIt = logMessageSend(implementer->isMetaClass(), 
                                      cls->nameForLogging(),
                                      implementer->nameForLogging(), 
                                      sel);
        if (!cacheIt) return;
    }
#endif
    cls->cache.insert(sel, imp, receiver);
}
  3. Looking at the objc-cache.mm file, the comments show that at the topmost layer insert() is triggered through objc_msgSend. Let's take a look at objc_msgSend;

objc_msgSend

Compile time

  1. Compile time, as the name suggests, is when the code is being compiled. The compiler translates the source code into code the machine can recognize (this is only the general sense; in practice it may first be translated into an intermediate language). During compilation, lexical analysis, syntax analysis and compile-time type checking (static type checking) find errors and warnings in the code and report them.
  2. Static checking does not run the code in memory; it scans it as text. So the claim some people make, that memory is allocated at compile time, is wrong.

Runtime

  1. Runtime is when the code has been loaded into memory by dyld and is executing. Runtime type checking differs from the compile-time (static) type checking described above: it does not simply scan the code, but performs operations and checks in memory.
  2. The runtime has two versions: a Legacy version (earlier) and a Modern version (current)
  • Programming interface of the legacy version: Objective-C 1.0
  • Programming interface of the modern version: Objective-C 2.0
  • The legacy version is used for Objective-C 1.0 on 32-bit Mac OS X platforms
  • The modern version is used by iPhone programs and by 64-bit programs on Mac OS X v10.5 and later
  3. More information about the Objective-C runtime can be found in the Apple documentation
  • Objective-C Runtime Programming Guide
  • Apple Official Documentation
  4. There are three ways to invoke the runtime
  • Objective-C method calls: [p sayNB];
  • APIs provided by NSObject: isKindOfClass: ...;
  • The lower-level objc APIs (such as objc_msgSend): class_getInstanceSize ...;
  5. The initiators of the three invocation modes are as follows

objc_msgSend

  1. Using clang to rewrite the Objective-C file as a C++ file, we can see in the generated cpp file how the runtime is invoked;
  2. At the runtime layer, an Objective-C method call becomes a message send, objc_msgSend(id _Nullable self, SEL _Nonnull op, ...). The call contains two elements:
  • The message receiver
  • The message body: sel + parameters
// The Objective-C method call before rewriting
GCPerson *person = [GCPerson alloc];
[person sayHello];

// After rewriting with clang: the generated implementation of sayHello
static void _I_GCTeacher_sayHello(GCTeacher * self, SEL _cmd) {
    NSLog((NSString *)&__NSConstantStringImpl__var_folders_fw_k53y7bbd2rx7sdzz_s85z7840000gn_T_main_cd505b_mi_0,__func__);
}

objc_msgSendSuper

  1. In the rewritten cpp file we find that, besides objc_msgSend, there is also an objc_msgSendSuper function; the guess is that it sends the message directly to the superclass;
  2. In the objc source, objc_msgSendSuper is declared as follows:
OBJC_EXPORT id _Nullable
objc_msgSendSuper(struct objc_super * _Nonnull super, SEL _Nonnull op, ...)
    OBJC_AVAILABLE(10.0, 2.0, 9.0, 1.0, 2.0);
  3. The objc_super parameter of objc_msgSendSuper is defined as follows:
struct objc_super {
    /// Specifies an instance of a class.
    __unsafe_unretained _Nonnull id receiver;// Message receiver
    /* super_class is the first class to search */
    __unsafe_unretained _Nonnull Class super_class;// The first class the method looks for is super_class. If super_class cannot find it, it looks for the super_class's parent
};
  4. By building an objc_super struct as the argument, we can call objc_msgSendSuper ourselves:
        struct objc_super gc_objc_super;
        gc_objc_super.receiver = person; // Message receiver
        gc_objc_super.super_class = GCPerson.class; // The first class searched is super_class; if the method is not found there, the search continues in super_class's parent
        // Assumes "Enable Strict Checking of objc_msgSend Calls" is set to NO; otherwise cast objc_msgSendSuper to the concrete function pointer type first
        objc_msgSendSuper(&gc_objc_super, @selector(sayHello));

objc_msgSend at the assembly level

Because Objective-C is a dynamic language, essentially every method call goes through objc_msgSend. For a low-level routine that is called this often, efficiency argues for a language as close to the machine as possible, namely assembly. Assembly also works directly on addresses, which is regarded as safer here, so the system implements objc_msgSend in assembly. Let's take a look at objc_msgSend at the assembly level in the libobjc library.

  1. Search for objc_msgSend in the objc source
  2. Collapse all the files and look at the ones with the .s suffix (the assembly files)
  3. Find the assembly implementation of objc_msgSend
  4. Analysis of part of the _objc_msgSend assembly source

ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	cmp	p0, #0			// nil check and tagged pointer check: compare p0 (the receiver) with 0
#if SUPPORT_TAGGED_POINTERS  // When tagged pointers are supported
	b.le	LNilOrTagged		// (MSB tagged pointer looks negative) p0 <= 0: continue at LNilOrTagged
#else
	b.eq	LReturnZero  // Without tagged pointer support, a nil receiver returns zero
#endif
	ldr	p13, [x0]		// p13 = isa
	// src, needs_auth, auth_address
	GetClassFromIsa_p16 p13, 1, x0	// p16 = class
// receiver -> class -> method cache
LGetIsaDone:
	// calls imp or objc_msgSend_uncached
	CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached
	END_ENTRY _objc_msgSend

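The control flow of this assembly can be approximated in C++. The sketch below is only a model under loud assumptions: Sel, Imp, Cls, Obj, lookup_uncached and msg_send are all invented stand-ins, the cache is an ordinary hash map rather than the real bucket array, tagged pointers are ignored, and an unknown selector simply returns 0 instead of going through the runtime's real slow path.

```cpp
#include <cstdint>
#include <unordered_map>

using Sel = uintptr_t;   // hypothetical stand-in for SEL
using Imp = int (*)();   // hypothetical stand-in for IMP

// Hypothetical class: a method table plus a cache in front of it.
struct Cls {
    std::unordered_map<Sel, Imp> cache;
    std::unordered_map<Sel, Imp> methods;
};

// Hypothetical object: only carries its class pointer.
struct Obj {
    Cls *isa;
};

// Slow path: search the method table and fill the cache on success.
static Imp lookup_uncached(Cls *cls, Sel sel) {
    auto it = cls->methods.find(sel);
    if (it == cls->methods.end()) return nullptr;
    cls->cache[sel] = it->second;
    return it->second;
}

// Fast path: nil check, load the class, try the cache, otherwise fall back.
static int msg_send(Obj *receiver, Sel sel) {
    if (!receiver) return 0;               // like LReturnZero
    Cls *cls = receiver->isa;              // like GetClassFromIsa_p16
    auto hit = cls->cache.find(sel);       // like CacheLookup
    Imp imp = (hit != cls->cache.end()) ? hit->second
                                        : lookup_uncached(cls, sel);
    return imp ? imp() : 0;
}

// Hypothetical method implementation used to exercise the sketch.
static int say_hello() { return 7; }
```

The first msg_send for a selector misses the cache and takes the slow path, which fills the cache; later sends of the same selector hit the cache directly. That is the same division of labor as between CacheLookup and __objc_msgSend_uncached above.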

Fetching the IMP from a bucket

Bitwise XOR

  1. ^ is the bitwise exclusive-or (XOR) operator: if c = a ^ b, then a = c ^ b;
  2. b is called the salt. It has no meaning of its own and exists mainly to produce the XOR result;

Method encoding

  1. bucket_t stores the IMP as an unsigned integer, explicit_atomic<uintptr_t> _imp;. Since an IMP is itself an address, the set method encodes it before storing:
void bucket_t::set(bucket_t *base, SEL newSel, IMP newImp, Class cls)
{
// IMP and Class are encoded here
    uintptr_t newIMP = (impEncoding == Encoded
                        ? encodeImp(base, newImp, newSel, cls)
                        : (uintptr_t)newImp);
    if (atomicity == Atomic) {
        _imp.store(newIMP, memory_order_relaxed);
        if (_sel.load(memory_order_relaxed) != newSel) {
            mega_barrier();
            _sel.store(newSel, memory_order_relaxed);
        }
    } else {
        _imp.store(newIMP, memory_order_relaxed);
        _sel.store(newSel, memory_order_relaxed);
    }
}

// IMP encoding method
uintptr_t encodeImp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, IMP newImp, UNUSED_WITHOUT_PTRAUTH SEL newSel, Class cls) const {
    if (!newImp) return 0;
    return (uintptr_t)newImp ^ (uintptr_t)cls; // Bitwise XOR
}
  2. The getter, IMP imp(UNUSED_WITHOUT_PTRAUTH bucket_t *base, Class cls), has to decode as well: it XORs the stored IMP value with the class
return (IMP)(imp ^ (uintptr_t)cls);
  3. The class is the salt used in the encoding and decoding above;
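The round trip can be reproduced outside the runtime. In this sketch Imp is a hypothetical stand-in for IMP, any address serves as the class salt, and encode_imp and decode_imp are simplified versions of the XOR branch only; pointer authentication is not modeled.

```cpp
#include <cstdint>

using Imp = void (*)();   // hypothetical stand-in for IMP

// Encode: XOR the function address with the class address (the salt).
static inline uintptr_t encode_imp(Imp imp, const void *cls) {
    if (!imp) return 0;
    return reinterpret_cast<uintptr_t>(imp) ^ reinterpret_cast<uintptr_t>(cls);
}

// Decode: XOR the stored value with the same salt to get the address back.
static inline Imp decode_imp(uintptr_t stored, const void *cls) {
    if (!stored) return nullptr;
    return reinterpret_cast<Imp>(stored ^ reinterpret_cast<uintptr_t>(cls));
}

// Hypothetical method implementation whose address gets encoded.
static void sample_method() {}
```

Decoding with the same salt returns the original function pointer, because (imp ^ cls) ^ cls == imp. Decoding with a different salt yields a different address, which is why the stored value is only meaningful together with its class.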

The delegate of a class

  1. As a member of the class, a delegate (protocol) is not much different from methods and properties; it can also be obtained from the class through LLDB debugging;
  2. However, what the protocols() method finally gives us is a protocol_array_t. A protocol_array_t stores protocol_ref_t values, and protocol_ref_t is a uintptr_t whose comment says it is not remapped: // protocol_t *, but unremapped. Its contents cannot be printed directly; to view them we need the protocol_t type;
  3. Based on this, we searched for protocol_ref_t again and found remapProtocol(), the function used to map a protocol_ref_t. Internally it is a plain cast, so we can cast a protocol_ref_t to protocol_t * ourselves:
static ALWAYS_INLINE protocol_t *remapProtocol(protocol_ref_t proto)
{
    runtimeLock.assertLocked();
    // Protocols in shared cache images have a canonical bit to mark that they
    // are the definition we should use
    if (((protocol_t *)proto)->isCanonical())
        return (protocol_t *)proto;// The protocol_t is returned
    protocol_t *newproto = (protocol_t *)
        getProtocol(((protocol_t *)proto)->mangledName);
    return newproto ? newproto : (protocol_t *)proto;// The protocol_t is returned
}

LLDB debug code

2021-06-26 21:59:08.559358+0800 KCObjcBuild[81571:7062064] GCTeacher
(lldb) p/x pClass // Get the current class address
(Class) $0 = 0x0000000100004908 GCTeacher
(lldb) p/x 0x0000000100004908+0x20 // Memory offset gets bits
(long) $1 = 0x0000000100004928
(lldb) p (class_data_bits_t *)$1 // Cast to class_data_bits_t *
(class_data_bits_t *) $2 = 0x0000000100004928
(lldb) p $2->data()// Get data in bits
(class_rw_t *) $3 = 0x0000000101131b70
(lldb) p *$3// Prints the contents of the destination address of data: class_rw_t
(class_rw_t) $4 = {
  flags = 2148007936
  witness = 0
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = {
      Value = 4294984032
    }
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
}
(lldb) p $4.protocols()// Get protocols from class_rw_t
(const protocol_array_t) $5 = {
  list_array_tt<unsigned long, protocol_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x0000000100004270
      }
      arrayAndFlag = 4294984304
    }
  }
}
(lldb) p $5.list  // Get the list of protocols
(const RawPtr<protocol_list_t>) $6 = {
  ptr = 0x0000000100004270
}
(lldb) p $6.ptr  // Get ptr from the list
(protocol_list_t *const) $7 = 0x0000000100004270
(lldb) p *$7  // Get the protocol_list_t that ptr points to
(protocol_list_t) $8 = (count = 1, list = protocol_ref_t [] @ 0x00007f922ab3a268) 
(lldb) p $8.list[0]// Get a protocol_ref_t with index 0 from protocol_list_t
(protocol_ref_t) $10 = 4294986152
(lldb) p (protocol_t *)$10 // Cast protocol_ref_t to protocol_t *
(protocol_t *) $11 = 0x00000001000049a8
(lldb) p *$11// Retrieve data stored in protocol_t
(protocol_t) $12 = {
  objc_object = {
    isa = {
      bits = 4298453192
      cls = Protocol
       = {
        nonpointer = 0
        has_assoc = 0
        has_cxx_dtor = 0
        shiftcls = 537306649
        magic = 0
        weakly_referenced = 0
        unused = 0
        has_sidetable_rc = 0
        extra_rc = 0
      }
    }
  }
  mangledName = 0x0000000100003e89 "TestDelegate"
  protocols = 0x0000000100004338
  instanceMethods = 0x0000000000000000
  classMethods = 0x0000000000000000
  optionalInstanceMethods = 0x0000000100004350
  optionalClassMethods = 0x0000000000000000
  instanceProperties = 0x0000000000000000
  size = 96
  flags = 0
  _extendedMethodTypes = 0x0000000100004370
  _demangledName = 0x0000000000000000
  _classProperties = 0x0000000000000000
}
(lldb) p *$12.optionalInstanceMethods // Get optionalInstanceMethods from protocol_t
(method_list_t) $15 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 24, count = 1)
}
(lldb) p $15.get(0).big()// Print data with index=0 in optionalInstanceMethods
(method_t::big) $17 = {
  name = "testDelegate"
  types = 0x0000000100003ec6 "v16@0:8"
  imp = 0x0000000000000000
}
(lldb) 

The protocol_t was obtained successfully, which supplements the class structure explored earlier.