Previous articles covered what happens when alloc is called on a subclass of NSObject: the actual size of the object the class creates, the size of the memory the system allocates for it, and the factors that affect object size.

They also covered why structs with the same number and types of members can occupy different amounts of memory depending on member order, the system's anti-hook handling of the alloc method, and how the alloc function allocates memory.

In this article we explore how, when alloc is called on a subclass of NSObject, the class object is bound to isa.

initIsa function

The class cls is bound to isa in the initIsa function. The code is as follows:

inline void 
objc_object::initIsa(Class cls)
{
    initIsa(cls, false, false);
}

Stepping into initIsa(cls, false, false), the implementation contains conditional branches that depend on SUPPORT_INDEXED_ISA. Check how that macro is defined so the code can be simplified.

// Define SUPPORT_INDEXED_ISA=1 on platforms that store the class in the isa
// field as an index into a class table.
// Note, keep this in sync with any .s files which also define it.
// Be sure to edit objc-abi.h as well.
#if __ARM_ARCH_7K__ >= 2  ||  (__arm64__ && !__LP64__)
#   define SUPPORT_INDEXED_ISA 1
#else
#   define SUPPORT_INDEXED_ISA 0
#endif

According to this condition, SUPPORT_INDEXED_ISA is 1 only on armv7k and on arm64 targets that are not LP64 (arm64_32). On a Mac (x86_64 or 64-bit arm64) the #else branch applies, so SUPPORT_INDEXED_ISA is 0 and the indexed-isa code is compiled out. After removing the code that does not apply, the simplified function is as follows:

inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    isa_t newisa(0);

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.setClass(cls, this);
        newisa.extra_rc = 1;
    }

    // This write must be performed in a single store in some cases
    // (for example when realizing a class because other threads
    // may simultaneously try to use the class).
    // fixme use atomics here to guarantee single-store and to
    // guarantee memory order
    // ...but not too atomic because we don't want to hurt instantiation
    isa = newisa;
}

With the function simplified, we can start analyzing its code. The first step creates a union object, newisa, so let's first look at the definition of the union isa_t:

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h; gives the meaning of each position in the isa bit field
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};

Before analyzing the code in the function, it helps to cover some background first, so that the source can be read faster and more efficiently.

Background:

  1. uintptr_t: This type name is used to declare variables in many places, so what is it? A bit of digging turns up the following definition in the header file /usr/include/stdint.h on the Linux platform:
    #if __WORDSIZE == 64
    # ifndef __intptr_t_defined
    typedef long int		intptr_t;
    #  define __intptr_t_defined
    # endif
    typedef unsigned long int	uintptr_t;
    #else
    # ifndef __intptr_t_defined
    typedef int			intptr_t;
    #  define __intptr_t_defined
    # endif
    typedef unsigned int		uintptr_t;
    #endif

    The definition tells us the following:

    • On 64-bit systems, intptr_t is an alias for long int, and uintptr_t is an alias for unsigned long int.

    • On 32-bit systems, intptr_t is an alias for int, and uintptr_t is an alias for unsigned int.

    • The reason for doing this is portability: on either kind of system, these types are integers wide enough to hold a pointer.

  2. Bit field: C allows a structure to specify the size of its members in bits; such members are called bit fields.

Similarities and differences between structs and unions

  • Similarities: structs and unions are both aggregate data types that have one or more members, and they can be nested in each other.
  • Differences: in a struct, all members coexist, each with its own storage, so they can all hold values at the same time. In a union, the members are mutually exclusive because they share the same storage: only one member holds a valid value at a time, and assigning to one member overwrites the others, leaving them with meaningless data.

Mask

In iOS development you often see bitwise operations such as & (AND) and ~ (NOT). With bit fields in particular, & and shift operations are commonly used to read the value stored at a given position. The objc source is usually built and run on a Mac, so let's look at ISA_BITFIELD for the x86_64 architecture, as shown below:

#   define ISA_BITFIELD                                                        \
      uintptr_t nonpointer        : 1;                                         \
      uintptr_t has_assoc         : 1;                                         \
      uintptr_t has_cxx_dtor      : 1;                                         \
      uintptr_t shiftcls          : 44;                                        \
      uintptr_t magic             : 6;                                         \
      uintptr_t weakly_referenced : 1;                                         \
      uintptr_t unused            : 1;                                         \
      uintptr_t has_sidetable_rc  : 1;                                         \
      uintptr_t extra_rc          : 8

The meaning of each field in ISA_BITFIELD is described below.

[Figure: isa bit layout on the x86_64 architecture]

[Figure: isa bit layout on the arm64 architecture]

nonpointer occupies bit 0 and indicates whether pointer optimization is enabled for the isa pointer. 0: a pure isa pointer that holds only the class address; 1: isa holds not only the class object's address but also class information, the object's reference count, and so on.

has_assoc occupies bit 1 and is the associated-object flag. 0: the object has no associated objects; 1: it has.

has_cxx_dtor occupies bit 2 and indicates whether the object has a C++ or Objective-C destructor. If it does, the destructor logic has to run; if not, the object can be freed faster.

shiftcls stores the value of the class pointer. On x86_64 it occupies bits 3 to 46; on arm64, with pointer optimization enabled, it occupies bits 3 to 35.

magic occupies bits 47 to 52 on x86_64 and bits 36 to 41 on arm64. The debugger uses it to determine whether the current object is a real object or uninitialized memory.

weakly_referenced occupies bit 53 on x86_64 and bit 42 on arm64. It indicates whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be released faster.

unused occupies bit 54 on x86_64 and bit 43 on arm64. As the name says, this version of the source does not use the bit; in earlier objc4 releases the same position was named deallocating and indicated whether the object was being deallocated.

has_sidetable_rc occupies bit 55 on x86_64 and bit 44 on arm64. It indicates that the reference count is too large to fit in extra_rc, so the overflow is stored in a side table.

extra_rc occupies bits 56 to 63 on x86_64 and bits 45 to 63 on arm64 and holds the object's reference count. In earlier objc4 releases the stored value was the reference count minus 1, for example 9 for an object whose reference count is 10; in the initIsa shown above, a new object starts with extra_rc = 1. When the count no longer fits in this field, has_sidetable_rc is used.

initIsa function implementation

With this background in place, go back to the isa_t union we started with, and in particular to its setClass function. The comment on it says, roughly: set the class field in an isa. It takes both the class to set and a pointer to the object where the isa will ultimately be used. This is necessary to get the pointer signing right. Note: this method does not support setting an indexed isa. When indexed isas are in use, it can only be used to set the class of a raw isa. Inside the function there is the line uintptr_t signedCls = (uintptr_t)ptrauth_sign_unauthenticated((void *)newCls, ISA_SIGNING_KEY, ptrauth_blend_discriminator(obj, ISA_SIGNING_DISCRIMINATOR)); On platforms that use pointer authentication, the class pointer has to be signed before it is stored, and the pointer being signed is the newCls that was passed in.

Now return to the initIsa function. After creating the union object newisa, it first checks whether the incoming nonpointer is false. Following the call chain into this function, you can see that the caller hard-codes false, so the branch taken is newisa.setClass(cls, this);. Next, step into the implementation of setClass. Because it has many conditional branches, the simplest (if crude) approach is to set a breakpoint in each branch and see which one is actually executed:

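shiftcls = (uintptr_t)newCls >> 3; As described above, shiftcls stores the value of the class pointer. The shift by 3 drops the three low bits, which are always zero for an 8-byte-aligned class address, so those positions in isa stay free for nonpointer, has_assoc, and has_cxx_dtor.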