The first two articles covered the alloc flow and memory alignment when OC creates an object. This article goes on to explore how OC associates an object with its class, and what an object really is.

Introduction to Basic Knowledge

  • Before analyzing the key process, let’s first cover some background knowledge that will make the later exploration and analysis easier

Unions

  • A union (declared with the union keyword) is sometimes also called a shared-storage type
  • The previous article introduced the memory alignment of structures. If you are not familiar with it, please refer to that article (OC object principle memory alignment).
  • The syntax of a union is very similar to that of a structure, but the memory layout is quite different, mainly in the following ways:
    • The members of a structure coexist: each member occupies its own memory, and they do not affect each other.
    • The members of a union are mutually exclusive: all members share the same memory. Changing the value of one member affects all the others.
    • Structure memory: greater than or equal to the sum of the memory used by all members (because of memory alignment)
    • Union memory: equal to the memory occupied by the largest member (plus any alignment padding); it can hold the value of only one member at a time
  • Example:
union LGUnion {
    char    a;
    int     b;
    double  c;
} union1;

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
    
    union1.a = 'a';
    union1.b = 10;
    union1.c = 5.5;
}
  • Memory before any assignment:
    • p union1: prints the value of union1
    • p/t union1: prints the value of union1 in binary form

  • Memory after assigning to a:

  • Memory after assigning to b:

  • Memory after assigning to c:

  • Conclusion: As the example above shows, the memory occupied by the union union1 is the size of its largest member, double c, which is 8 bytes. The member char a uses the lowest 8 bits of that memory, and the member int b uses the lowest 32 bits.
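  • The sharing described above can also be checked outside Xcode with a small plain-C sketch. The names LGUnionDemo, union_size, and low_byte_after_int_write are hypothetical, and the byte-level result assumes a little-endian target such as x86_64 or arm64.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as LGUnion in the article; the name is hypothetical. */
union LGUnionDemo {
    char   a;
    int    b;
    double c;
};

/* The union is as large as its largest member (double). */
size_t union_size(void) {
    return sizeof(union LGUnionDemo);
}

/* All members start at the same address, so writing b overwrites a.
   On a little-endian machine, a then reads the lowest byte of b. */
char low_byte_after_int_write(int value) {
    union LGUnionDemo u;
    memset(&u, 0, sizeof u);
    assert((void *)&u.a == (void *)&u.b);  /* members share one address */
    u.a = 'a';
    u.b = value;                           /* clobbers the byte that held 'a' */
    return u.a;
}
```

  • Calling low_byte_after_int_write(10) mirrors the step in viewDidLoad where union1.b = 10 replaces the 'a' stored a moment earlier.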

Bit fields

  • Some data does not need a whole byte of storage; one or a few binary bits are enough. A switch, for example, has only two states, on and off, so 0 and 1 suffice, that is, a single bit. With this in mind, C provides another data structure called a bit field.
  • Definition: when defining a structure or union, a member variable can be followed by ": number" to limit the number of bits it occupies. This is a bit field. For example:
struct LGScruct {
    bool a: 1;
    bool b: 1;
    bool c: 1;
    bool d: 1;
}struct1;

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
    
    struct1.a = 1;
    struct1.b = 0;
    struct1.c = 1;
    struct1.d = 1;
    
    NSLog(@"%lu", sizeof(struct LGScruct));
}

  • As can be seen from the printed output:
    • The first value printed is the memory occupied by the structure variable struct1: 1 byte. Without the bit-field limits, this structure would be 4 bytes in size
    • x/1bt &struct1: prints the data stored at the address of the variable struct1 as a single byte in binary form
    • The lowest four bits (from low to high, right to left) hold the values of a, b, c, and d. The total size of the structure is an integer multiple of its largest member type, here bool (1 byte), so it is 1 byte;
  • There are a few other rules for bit fields (these are fairly simple, so you can verify them yourself if you are interested):
    • The width of a bit field must not exceed the width of its declared data type; otherwise a compile-time error is generated.
    • When adjacent members have the same type and the sum of their bit widths fits within the size of that type (sizeof), the later member is stored right after the earlier one; otherwise it starts in a new storage unit, at an offset that is an integer multiple of the type size.
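  • A minimal plain-C sketch of the size difference and the bit order described above. FlagsPacked, FlagsPlain, and the helper functions are hypothetical names. The C standard leaves bit-field layout to the implementation, so the raw-byte result is an assumption that holds for Clang and GCC on little-endian targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Four one-bit flags, as in LGScruct. */
struct FlagsPacked {
    bool a : 1;
    bool b : 1;
    bool c : 1;
    bool d : 1;
};

/* The same four flags without bit fields: one byte each. */
struct FlagsPlain {
    bool a;
    bool b;
    bool c;
    bool d;
};

size_t packed_size(void) { return sizeof(struct FlagsPacked); }
size_t plain_size(void)  { return sizeof(struct FlagsPlain); }

/* Returns the low four bits of the byte behind the packed struct.
   The first declared field is expected in bit 0. */
unsigned char packed_raw_bits(bool a, bool b, bool c, bool d) {
    struct FlagsPacked f = { a, b, c, d };
    unsigned char raw;
    assert(sizeof f == 1);       /* this sketch relies on the 1-byte layout */
    memcpy(&raw, &f, 1);
    return raw & 0x0F;           /* ignore the four padding bits */
}
```

  • With the values assigned in viewDidLoad (a = 1, b = 0, c = 1, d = 1), packed_raw_bits returns binary 1101, matching the right-to-left reading of x/1bt &struct1.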

Introduction to Clang

  • Clang is a lightweight compiler for the C, C++, and Objective-C languages
  • Clang can rewrite an Objective-C source file into a C++ file: clang -rewrite-objc main.m -o main.cpp
  • Solutions when Clang fails on a file that imports UIKit (fatal error: 'UIKit/UIKit.h' file not found):
    • Specify the SDK path: clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk main.m -o main.cpp
    • Use xcrun:
      • Simulator: xcrun -sdk iphonesimulator clang -arch x86_64 -rewrite-objc main.m -o main-arm64.cpp
      • Device: xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

Using Clang to analyze the nature of objects

  • Definition in main.m:
#import <Foundation/Foundation.h>

@interface LGPerson : NSObject
@property (nonatomic) NSString *lgName;
@end

@implementation LGPerson
@end

int main(int argc, const char * argv[]) {
    LGPerson *p = [LGPerson alloc];
    NSLog(@"p:%@", p);
    return 0;
}
  • Use clang to rewrite main.m into main.cpp: clang -rewrite-objc main.m -o main.cpp
  • Key code in main.cpp:
#ifndef _REWRITER_typedef_LGPerson
#define _REWRITER_typedef_LGPerson
typedef struct objc_object LGPerson;
typedef struct {} _objc_exc_LGPerson;
#endif

extern "C" unsigned long OBJC_IVAR_$_LGPerson$_lgName;
struct LGPerson_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
	NSString *_lgName;
};

// @property (nonatomic) NSString *lgName;
/* @end */


// @implementation LGPerson

static NSString * _I_LGPerson_lgName(LGPerson * self, SEL _cmd) { return (*(NSString **)((char *)self + OBJC_IVAR_$_LGPerson$_lgName)); }
static void _I_LGPerson_setLgName_(LGPerson * self, SEL _cmd, NSString *lgName) { (*(NSString **)((char *)self + OBJC_IVAR_$_LGPerson$_lgName)) = lgName; }
// @end

int main(int argc, const char * argv[]) {
    LGPerson *p = ((LGPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("LGPerson"), sel_registerName("alloc"));
    NSLog((NSString *)&__NSConstantStringImpl__var_folders_1d__07nngks0bl536_p0nvqhjkm0000gn_T_main_06b9c6_mi_0, p);
    return 0;
}
  • As you can see, LGPerson is in fact of type struct objc_object. objc_object is defined as follows and has, by default, one isa variable of type Class:
struct objc_object {
    Class _Nonnull isa __attribute__((deprecated));
};
  • The implementation of LGPerson, struct LGPerson_IMPL, also shows that its first member variable is of type struct NSObject_IMPL, defined as follows:
struct NSObject_IMPL {
	Class isa;
};
  • The other member variable in LGPerson_IMPL is the backing variable of our custom property lgName: NSString *_lgName;
  • A property in OC has getter and setter methods; here they correspond to _I_LGPerson_lgName and _I_LGPerson_setLgName_
    • The first two parameters of both methods are self and _cmd; these two parameters are implicit by default
    • Both the getter and the setter read or change the value of the variable through self + OBJC_IVAR_$_LGPerson$_lgName. OBJC_IVAR_$_LGPerson$_lgName is defined as: extern "C" unsigned long int OBJC_IVAR_$_LGPerson$_lgName __attribute__ ((used, section ("__DATA,__objc_ivar"))) = __OFFSETOFIVAR__(struct LGPerson, _lgName); That is, the memory address of the variable is found by an offset from self, and the value is then read or written there
  • Conclusion:
    • At the bottom layer, a class definition is compiled into a structure whose first member variable is isa;
    • The nature of an object is a pointer to a structure of type struct objc_object;
    • For a property, OC automatically generates a member variable, a getter method, and a setter method;
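  • The offset-based access used by the generated getter and setter can be sketched in plain C. ObjectImpl, PersonImpl, kNameOffset, and the two accessors are hypothetical stand-ins for NSObject_IMPL, LGPerson_IMPL, OBJC_IVAR_$_LGPerson$_lgName, and the _I_LGPerson_ functions; a C string replaces NSString * so the sketch has no Foundation dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for NSObject_IMPL. */
struct ObjectImpl {
    void *isa;
};

/* Hypothetical stand-in for LGPerson_IMPL. */
struct PersonImpl {
    struct ObjectImpl base;   /* isa comes first */
    const char *name;         /* stands in for NSString *_lgName */
};

/* Plays the role of OBJC_IVAR_$_LGPerson$_lgName: a byte offset. */
static const size_t kNameOffset = offsetof(struct PersonImpl, name);

/* Setter: step kNameOffset bytes from self, then store. */
void person_set_name(void *self, const char *name) {
    assert(self != NULL);
    *(const char **)((char *)self + kNameOffset) = name;
}

/* Getter: same address computation, then load. */
const char *person_get_name(void *self) {
    assert(self != NULL);
    return *(const char **)((char *)self + kNameOffset);
}

size_t person_name_offset(void) { return kNameOffset; }
```

  • Because isa is the first member, the offset of the first custom variable equals the size of one pointer, which is why the accessors only need self and a constant.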

Analyzing the isa initialization process in the objc source

  • The previous two articles described how creating an object ends up calling _class_createInstanceFromZone, which initializes isa by calling obj->initInstanceIsa(cls, hasCxxDtor); or obj->initIsa(cls);
static ALWAYS_INLINE id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    ASSERT(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;

    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    id obj;
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        obj = (id)calloc(1, size);
    }
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }

    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (fastpath(!hasCxxCtor)) {
        return obj;
    }

    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}
  • Let’s explore what the isa initialization process does

Dynamic debugging

  • Add breakpoints at initInstanceIsa and initIsa, then run and debug
  • Print obj before the assignment:

  • Print obj after the assignment:

  • As the two prints show, after the assignment the type reported for memory address 0x000000010065dc30 changes from id to LGPerson *. In other words, this step associates the memory address with the class LGPerson, which is why the debugger can now report the object at that address as LGPerson *

Call flow analysis

  • Different architectures may take different call paths, but they all end up in initInstanceIsa or initIsa, and both of those eventually call objc_object::initIsa. The flow is as follows:

  • The flow above shows that objc_object::initIsa is always reached in the end and that isa is assigned in this method, so the analysis focuses on it
  • Before analyzing this method, let’s look at two key definitions involved:
    • ISA_BITFIELD
    • isa_t

ISA_BITFIELD analysis

  • The definition is selected by architecture, and the macros differ between architectures. The objc source is being debugged on macOS, which here uses the x86_64 architecture, so the corresponding definition is shown below (other architectures have similar definitions, analyzed in detail in the supplement):
# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL
#   define ISA_MAGIC_MASK  0x001f800000000001ULL
#   define ISA_MAGIC_VALUE 0x001d800000000001ULL
#   define ISA_HAS_CXX_DTOR_BIT 1
#   define ISA_BITFIELD                                                        \
      uintptr_t nonpointer        : 1;                                         \
      uintptr_t has_assoc         : 1;                                         \
      uintptr_t has_cxx_dtor      : 1;                                         \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      uintptr_t magic             : 6;                                         \
      uintptr_t weakly_referenced : 1;                                         \
      uintptr_t unused            : 1;                                         \
      uintptr_t has_sidetable_rc  : 1;                                         \
      uintptr_t extra_rc          : 8
#   define RC_ONE   (1ULL<<56)
#   define RC_HALF  (1ULL<<7)

  • Meaning of each field on the x86_64 architecture:
    • nonpointer: indicates whether pointer optimization is enabled for the isa pointer; 0: a pure isa pointer, 1: isa also contains class information, the object's reference count, and other information;
    • has_assoc: associated-object flag; 0: none, 1: the object has associated objects
    • has_cxx_dtor: indicates whether the object has a C++ or Objc destructor; if so, the destructor logic must run, and if not, the object can be freed faster
    • shiftcls: stores the value of the class pointer
    • magic: used by the debugger to tell whether the current object is a real, initialized object or uninitialized memory
    • weakly_referenced: indicates whether the object is, or has been, pointed to by an ARC weak variable; objects without weak references can be freed faster
    • unused: a reserved bit that is currently not used
    • has_sidetable_rc: set when the reference count is too large to fit in extra_rc, in which case the overflow is stored in a side table
    • extra_rc: holds the object's reference count. Older runtime versions stored the reference count minus 1 here (a count of 10 gave an extra_rc of 9), but in the source analyzed in this article initIsa sets extra_rc to 1 for a new object, so the field appears to hold the count itself. When the count overflows extra_rc (8 bits on x86_64), has_sidetable_rc is used
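  • The constants in the x86_64 definition follow from the field widths. The sketch below recomputes ISA_MASK and RC_ONE from those widths; the enum and function names are hypothetical, and it assumes the fields are laid out from the lowest bit upward in declaration order.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths of the x86_64 ISA_BITFIELD, lowest bit first. */
enum {
    W_NONPOINTER = 1, W_HAS_ASSOC = 1, W_HAS_CXX_DTOR = 1,
    W_SHIFTCLS = 44, W_MAGIC = 6, W_WEAKLY_REFERENCED = 1,
    W_UNUSED = 1, W_HAS_SIDETABLE_RC = 1, W_EXTRA_RC = 8
};

/* Bit offset of shiftcls: the widths of everything declared before it. */
unsigned shiftcls_offset(void) {
    return W_NONPOINTER + W_HAS_ASSOC + W_HAS_CXX_DTOR;
}

/* Bit offset of extra_rc: the widths of everything declared before it. */
unsigned extra_rc_offset(void) {
    return shiftcls_offset() + W_SHIFTCLS + W_MAGIC
         + W_WEAKLY_REFERENCED + W_UNUSED + W_HAS_SIDETABLE_RC;
}

/* A mask with `width` one-bits starting at bit `offset`. */
uint64_t field_mask(unsigned offset, unsigned width) {
    assert(width < 64 && offset + width <= 64);
    return ((UINT64_C(1) << width) - 1) << offset;
}

uint64_t derived_isa_mask(void) { return field_mask(shiftcls_offset(), W_SHIFTCLS); }
uint64_t derived_rc_one(void)   { return UINT64_C(1) << extra_rc_offset(); }

/* Largest count that fits in extra_rc before the side table is needed. */
uint64_t max_extra_rc(void)     { return (UINT64_C(1) << W_EXTRA_RC) - 1; }
```

  • The derived values agree with the macros: shiftcls starts at bit 3 and spans 44 bits, giving 0x00007ffffffffff8, and extra_rc starts at bit 56, giving RC_ONE = 1ULL<<56.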

isa_t analysis

  • Definition of isa_t:
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};
  • The source above shows that:
    • isa_t is a union, and as described earlier, the members of a union are mutually exclusive: the member variable bits uses the same block of memory as the structure;
    • At the top are two constructors. The first does nothing; the second assigns the passed-in value to the member variable bits;
    • ISA_BITFIELD in the structure below is a macro that expands to bit fields; this structure shares the same memory with the member variable bits;
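  • A simplified model of this sharing, written in plain C. The union isa_model and both helper functions are hypothetical; the real isa_t also has the private cls member and the C++ methods. The field order assumes Clang or GCC on a little-endian 64-bit target, where the first declared bit field occupies the lowest bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of isa_t using the x86_64 layout. */
union isa_model {
    uint64_t bits;
    struct {
        uint64_t nonpointer        : 1;
        uint64_t has_assoc         : 1;
        uint64_t has_cxx_dtor      : 1;
        uint64_t shiftcls          : 44;
        uint64_t magic             : 6;
        uint64_t weakly_referenced : 1;
        uint64_t unused            : 1;
        uint64_t has_sidetable_rc  : 1;
        uint64_t extra_rc          : 8;
    } f;
};

/* A value written through bits is visible through the bit fields. */
unsigned magic_after_bits_write(uint64_t value) {
    union isa_model isa;
    assert(sizeof isa == sizeof(uint64_t));  /* both views cover the same 8 bytes */
    isa.bits = value;
    return (unsigned)isa.f.magic;
}

/* A bit field written through the struct changes bits. */
uint64_t bits_after_weak_flag(void) {
    union isa_model isa;
    isa.bits = 0;
    isa.f.weakly_referenced = 1;
    return isa.bits;
}
```

  • Passing the x86_64 ISA_MAGIC_VALUE (0x001d800000000001) to magic_after_bits_write yields 0x3b, which is how a single assignment to bits can set both magic and nonpointer.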

objc_object::initIsa analysis

  • Implementation of objc_object::initIsa:
inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());

    isa_t newisa(0);

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());

#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }

    isa = newisa;
}
  • The type of the isa variable:
struct objc_object {
private:
    isa_t isa;
    // ... omitted
}
  • Here isa is a variable of type isa_t and is the first member of the objc_object structure; during underlying compilation, a class is automatically converted to the struct objc_object type
  • The value assigned to isa comes from the variable newisa
  • As the code above shows, newisa is of type isa_t. newisa(0) calls the second constructor of isa_t, which assigns 0 to the member variable bits so that no dirty data is left in it.
  • Then check nonpointer, that is, whether isa is something other than a pure isa pointer:
    • No: call newisa.setClass(cls, this); to assign the class directly
    • Yes: isa pointer optimization is enabled, so bits and the corresponding bit fields of isa are assigned;
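  • The two branches can be sketched with plain masks for the x86_64 layout. Everything prefixed demo_ or DEMO_ is hypothetical. The sketch assumes that setClass stores the class address shifted right by 3 into shiftcls, which has the same effect as OR-ing the masked address into the word, and it ignores the indexed-isa and pointer-authentication paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the x86_64 definition shown earlier. */
#define DEMO_X86_ISA_MASK        0x00007ffffffffff8ULL
#define DEMO_X86_ISA_MAGIC_VALUE 0x001d800000000001ULL
#define DEMO_X86_RC_ONE          (1ULL << 56)
#define DEMO_X86_HAS_CXX_DTOR    (1ULL << 2)

/* Hypothetical model of initIsa: returns the 64-bit isa word. */
uint64_t demo_init_isa(uint64_t cls_address, bool nonpointer, bool has_cxx_dtor) {
    /* The class address must be 8-byte aligned and fit in the mask. */
    assert((cls_address & ~DEMO_X86_ISA_MASK) == 0);

    if (!nonpointer) {
        return cls_address;                  /* raw isa: just the class address */
    }
    uint64_t bits = DEMO_X86_ISA_MAGIC_VALUE;     /* sets magic and nonpointer */
    if (has_cxx_dtor) bits |= DEMO_X86_HAS_CXX_DTOR;
    bits |= cls_address;                          /* same effect as shiftcls = cls >> 3 */
    bits |= DEMO_X86_RC_ONE;                      /* extra_rc = 1 */
    return bits;
}

/* Recovers the class address from either kind of isa word. */
uint64_t demo_get_class(uint64_t isa_bits) {
    return isa_bits & DEMO_X86_ISA_MASK;
}
```

  • For a hypothetical class address of 0x0000000100008360, the nonpointer branch produces 0x011d800100008361, and masking that value with ISA_MASK gives the class address back.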

Conclusion

  • isa is of the union type isa_t, whose member bits shares the same block of memory with the bit-field structure
  • When isa is initialized, nonpointer is checked, that is, whether pointer optimization is enabled
    • No: the class is assigned directly
    • Yes: the bit fields in the union are assigned

Supplement: ISA_BITFIELD analysis

# if __arm64__
// ARM64 simulators have a larger address space, so use the ARM64e
// scheme even when simulators build for ARM64-not-e.
#   if __has_feature(ptrauth_calls) || TARGET_OS_SIMULATOR
#     define ISA_MASK        0x007ffffffffffff8ULL
#     define ISA_MAGIC_MASK  0x0000000000000001ULL
#     define ISA_MAGIC_VALUE 0x0000000000000001ULL
#     define ISA_HAS_CXX_DTOR_BIT 0
#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t shiftcls_and_sig  : 52;                                      \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 8
#     define RC_ONE   (1ULL<<56)
#     define RC_HALF  (1ULL<<7)
#   else
#     define ISA_MASK        0x0000000ffffffff8ULL
#     define ISA_MAGIC_MASK  0x000003f000000001ULL
#     define ISA_MAGIC_VALUE 0x000001a000000001ULL
#     define ISA_HAS_CXX_DTOR_BIT 1
#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t has_cxx_dtor      : 1;                                       \
        uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
        uintptr_t magic             : 6;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t unused            : 1;                                       \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 19
#     define RC_ONE   (1ULL<<45)
#     define RC_HALF  (1ULL<<18)
#   endif

# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL
#   define ISA_MAGIC_MASK  0x001f800000000001ULL
#   define ISA_MAGIC_VALUE 0x001d800000000001ULL
#   define ISA_HAS_CXX_DTOR_BIT 1
#   define ISA_BITFIELD                                                        \
      uintptr_t nonpointer        : 1;                                         \
      uintptr_t has_assoc         : 1;                                         \
      uintptr_t has_cxx_dtor      : 1;                                         \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      uintptr_t magic             : 6;                                         \
      uintptr_t weakly_referenced : 1;                                         \
      uintptr_t unused            : 1;                                         \
      uintptr_t has_sidetable_rc  : 1;                                         \
      uintptr_t extra_rc          : 8
#   define RC_ONE   (1ULL<<56)
#   define RC_HALF  (1ULL<<7)

# else
#   error unknown architecture for packed isa
# endif
  • As the source above shows, the architecture is checked first, with two main cases: arm64 and x86_64
  • The x86_64 case is straightforward and was analyzed earlier, so let’s focus on how arm64 is handled
  • arm64 architecture:
    • There is another conditional here, on __has_feature(ptrauth_calls) and TARGET_OS_SIMULATOR
    • TARGET_OS_SIMULATOR means the simulator; here it stands for simulators running on the arm64 architecture
    • So what is __has_feature(ptrauth_calls)?
    • Its role is described next

Introduction to __has_feature(ptrauth_calls)

  • __has_feature: checks whether the compiler supports a given feature
  • ptrauth_calls: pointer authentication, which targets the arm64e architecture; devices using an Apple A12 or later A-series processor (such as iPhone XS, iPhone XS Max, iPhone XR, and newer devices) support the arm64e architecture
  • Reference link: developer.apple.com/documentati…
  • The following verifies this on two real devices, an iPhone 12 and an iPhone 8
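  • A minimal sketch of the same compile-time selection in plain C. The DEMO_ macros and demo_ functions are hypothetical, and the sketch leaves out the TARGET_OS_SIMULATOR half of the real condition. The fallback definition of __has_feature is there only so the file also builds with compilers that do not provide it.

```c
#include <assert.h>
#include <stdint.h>

/* __has_feature is a Clang extension; fall back to 0 elsewhere. */
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

#if __has_feature(ptrauth_calls)
   /* arm64e build: layout with shiftcls_and_sig. */
#  define DEMO_PTRAUTH 1
#  define DEMO_ARM64_ISA_MASK 0x007ffffffffffff8ULL
#else
   /* Build without pointer authentication: layout with a 33-bit shiftcls. */
#  define DEMO_PTRAUTH 0
#  define DEMO_ARM64_ISA_MASK 0x0000000ffffffff8ULL
#endif

/* 1 when the compiler targets pointer authentication, otherwise 0. */
int demo_uses_ptrauth(void) { return DEMO_PTRAUTH; }

/* The mask chosen at compile time. */
uint64_t demo_selected_mask(void) {
    /* The low three bits are flag bits in both layouts. */
    assert((DEMO_ARM64_ISA_MASK & 7) == 0);
    return DEMO_ARM64_ISA_MASK;
}
```

  • The choice is made when the code is compiled, not at run time, so the same source produces a different isa layout depending on the target architecture.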

Verification method 1: verify through the data stored in isa

  • Because the if and else branches for arm64 use different data layouts, the value of weakly_referenced can be used for verification
  • Test code:
LGPerson *p = [LGPerson alloc];
__weak typeof(p) weakP = p;
NSLog(@"p:%@", p);
  • The value of weakly_referenced is 0 before __weak typeof(p) weakP = p; executes and changes to 1 afterwards

  • Testing on the iPhone 8

    • Before execution:

    • After execution:

  • Testing on the iPhone 12

    • Before execution:

    • After execution:

  • Verification result: on both devices, executing __weak typeof(p) weakP = p; changes one bit of isa. On the iPhone 8 it is the 43rd bit (counting from 1, right to left) and on the iPhone 12 it is the 3rd bit, which correspond to the else branch and the if branch of the arm64 architecture respectively
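  • The comparison of the two isa values can be done mechanically. The sketch below finds which bit changed and computes where weakly_referenced sits in each arm64 layout from the field widths. All function names are hypothetical, and the isa values used to exercise it would be made-up inputs, not the values captured on the devices.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 1-based position (from the right) of the single bit that
   differs between two isa values, or 0 if they do not differ in exactly
   one bit. */
unsigned changed_bit_position(uint64_t before, uint64_t after) {
    uint64_t diff = before ^ after;
    if (diff == 0 || (diff & (diff - 1)) != 0) return 0;
    unsigned pos = 1;
    while ((diff & 1) == 0) {
        diff >>= 1;
        pos++;
    }
    assert(pos <= 64);
    return pos;
}

/* if branch: nonpointer (1) and has_assoc (1) come first, so
   weakly_referenced is the 3rd bit. */
unsigned weak_bit_ptrauth_layout(void) { return 1 + 1 + 1; }

/* else branch: nonpointer (1), has_assoc (1), has_cxx_dtor (1),
   shiftcls (33), and magic (6) come first, so it is the 43rd bit. */
unsigned weak_bit_plain_layout(void) { return 1 + 1 + 1 + 33 + 6 + 1; }
```

  • Feeding the before and after values printed in the debugger into changed_bit_position and comparing the result with the two layout functions identifies which branch a device uses.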

Verification method 2: verify through breakpoints and assembly

  • Set a breakpoint at [LGPerson alloc]
  • When execution stops at the breakpoint, add a symbolic breakpoint on _objc_rootAllocWithZone and continue running
  • iPhone 8

  • iPhone 12

  • Verification result: the iPhone 8 uses 0xffffffff8 and the iPhone 12 uses 0x7ffffffffffff8. In isa.h of the objc source, these two values correspond to the ISA_MASK of the else branch and the ISA_MASK of the if branch of the arm64 architecture respectively

Conclusion

  • __has_feature(ptrauth_calls) determines whether the compiler supports pointer authentication

  • Devices with the arm64e architecture (an Apple A12 or later A-series processor, that is, iPhone XS, iPhone XS Max, iPhone XR, and newer) support pointer authentication

  • For the arm64 architecture

    • Devices with an A12 or later processor (iPhone XS, iPhone XS Max, iPhone XR, and newer) use the following structure:
    #     define ISA_MASK        0x007ffffffffffff8ULL
    #     define ISA_MAGIC_MASK  0x0000000000000001ULL
    #     define ISA_MAGIC_VALUE 0x0000000000000001ULL
    #     define ISA_HAS_CXX_DTOR_BIT 0
    #     define ISA_BITFIELD                                                      \
            uintptr_t nonpointer        : 1;                                       \
            uintptr_t has_assoc         : 1;                                       \
            uintptr_t weakly_referenced : 1;                                       \
            uintptr_t shiftcls_and_sig  : 52;                                      \
            uintptr_t has_sidetable_rc  : 1;                                       \
            uintptr_t extra_rc          : 8
    #     define RC_ONE   (1ULL<<56)
    #     define RC_HALF  (1ULL<<7)
    • Devices with an A11 or earlier processor (iPhone X and older, such as the iPhone 8 tested above) use the following structure:
    #     define ISA_MASK        0x0000000ffffffff8ULL
    #     define ISA_MAGIC_MASK  0x000003f000000001ULL
    #     define ISA_MAGIC_VALUE 0x000001a000000001ULL
    #     define ISA_HAS_CXX_DTOR_BIT 1
    #     define ISA_BITFIELD                                                      \
            uintptr_t nonpointer        : 1;                                       \
            uintptr_t has_assoc         : 1;                                       \
            uintptr_t has_cxx_dtor      : 1;                                       \
            uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
            uintptr_t magic             : 6;                                       \
            uintptr_t weakly_referenced : 1;                                       \
            uintptr_t unused            : 1;                                       \
            uintptr_t has_sidetable_rc  : 1;                                       \
            uintptr_t extra_rc          : 19
    #     define RC_ONE   (1ULL<<45)
    #     define RC_HALF  (1ULL<<18)