Preface

Earlier articles covered how an object is created and how its memory size is calculated, but what exactly is an object? Let's analyze it.

Compiling to a C++ file

Objective-C (OC) is built on top of C/C++, and the compiler can rewrite OC source into equivalent C/C++ code, which is then lowered to assembly and finally to machine code. We can therefore explore the underlying implementation by reading that C/C++ output. There are two ways to convert OC code to C++.

1. clang

  • clang is a C/C++/Objective-C compiler based on LLVM, written under Apple's lead.
  • Converting the code to C++ (main.m into main.cpp) takes the following steps:
    • First, open the terminal and go to the directory containing the file you want to convert.
    • Then run the following command:
clang -rewrite-objc main.m -o main.cpp

If UIKit/UIKit.h cannot be found, you will see an error like this:

main.m:8:9: fatal error: 'UIKit/UIKit.h' file not found
#import <UIKit/UIKit.h>
        ^~~~~~~~~~~~~~~
1 error generated.

In that case, replace the command in the second step with:

clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.3.sdk main.m

Note 1: -o sets the output file name; in the first command, main.m is written out as main.cpp.
Note 2: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.3.sdk is the path of the iPhoneSimulator SDK on your computer. Change the version number to match the SDK actually installed.

2. xcrun

  • xcrun is installed along with Xcode. It wraps clang and is somewhat more convenient to use.
  • The steps for converting the code to C++ are the same as with clang. The commands are as follows:

For the simulator:

xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main.cpp

For a real device:

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main.cpp

Second, the structure of a class

The steps above gave us the C++ file. Let's analyze it:

  • Search main.cpp for WSPerson (a class named WSPerson is defined in main.m). We find this structure:
struct WSPerson_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
};

Now let's add a wsName property to WSPerson, compile it to C++ again, and look at the structure:

struct WSPerson_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
	NSString *_wsName;
};

The structure now also contains _wsName, so an object is essentially a structure. Its NSObject_IVARS member is itself a structure. What does it contain? Let's search for it:

struct NSObject_IMPL {
	Class isa;
};

Conclusion: NSObject_IVARS is the isa member variable.

  • Above WSPerson_IMPL, we also notice objc_object:
typedef struct objc_object WSPerson;

So at the lower level WSPerson is an alias for objc_object. In OC, classes inherit from NSObject; underneath, every object is an objc_object.

  • Now let's look at Class:
typedef struct objc_class *Class;

Class is a pointer to a structure. Right below it is id:

typedef struct objc_object *id;

id is also a structure pointer, which answers a common question:

Why is there no * in "id person"? Because id is already a pointer.

  • Let's look at the code generated for the wsName property:
extern "C" unsigned long int OBJC_IVAR_$_WSPerson$_wsName __attribute__ ((used, section ("__DATA,__objc_ivar"))) = __OFFSETOFIVAR__(struct WSPerson, _wsName);

// get
static NSString * _I_WSPerson_wsName(WSPerson * self, SEL _cmd) { return (*(NSString **)((char *)self + OBJC_IVAR_$_WSPerson$_wsName)); }

// set
static void _I_WSPerson_setWsName_(WSPerson * self, SEL _cmd, NSString *wsName) { (*(NSString **)((char *)self + OBJC_IVAR_$_WSPerson$_wsName)) = wsName; }

These two functions are the getter and the setter. Both take self and _cmd, which are the hidden arguments passed to every method.

  • So how is the property value obtained?
    • First, (char *)self is the pointer to the WSPerson object, cast to char * so that pointer arithmetic is done in bytes.
    • OBJC_IVAR_$_WSPerson$_wsName is the byte offset of the ivar.
    • The resulting address is then cast to NSString ** and dereferenced to get wsName.
  • In short: we take the start address of the WSPerson object and add the property's offset to reach the corresponding ivar.

Bit fields and unions

Before looking at isa, let's first look at bit fields and unions.

1. Bit field

Let's start with an example:

struct struct1 {
    BOOL top;
    BOOL left;
    BOOL bottom;
    BOOL right;
}s1;
  • From the memory alignment rules in the previous article, struct1 occupies 4 bytes. But does it really need 4 bytes? A BOOL is either 0 or 1, which is a single binary digit, so spending 4 bytes on four of them is quite wasteful. In essence the data only needs this much:
// 4 bytes = 4 * 8 bits = 32 bits
00000000 00000000 00000000 00001111

We only need 4 bits to represent struct1. 4 bits is half a byte, and the smallest allocation is 1 byte, so how can we make struct1 occupy just 1 byte? Bit fields can do that.

Bit field concept: a bit field divides the binary bits of a storage unit into several regions and specifies the number of bits in each region. Each region has a name, and the program operates on it by that name, so several different values can be packed into a single byte. Bit fields are a feature of C structs.

  • Following this concept, we need to specify a bit width for each member variable of the structure above. Let's verify this by defining struct2:
struct struct2 {
    BOOL top: 1;
    BOOL left: 1;
    BOOL bottom: 1;
    BOOL right: 1;
}s2;

Then print the size of both: struct1 occupies 4 bytes, while struct2 occupies only 1 byte.

Summary: bit fields save memory by specifying the number of bits each member variable occupies.

2. Union (union)

Union concept: a union is a data structure similar to a struct in some respects. Like a struct, a union can contain members of several data types, but there are clear differences.

Differences between union and struct:

  • struct: all members of a struct coexist. The advantage is that it is accommodating and comprehensive: every member has its own storage. The disadvantage is that memory is allocated coarsely, for every member, whether it is used or not.
  • union: the members of a union are mutually exclusive. The disadvantage is that it is less accommodating: only one member holds a valid value at a time. The advantage is finer-grained memory usage: the members share one block of memory, which saves space.

Let’s show this in code:

// struct
struct LGTeacher1 {
    char name;
    int age;
    double weight;
}t1;

// union
union LGTeacher2 {
    char name;
    double weight;
    int age;
}t2;

// Assign values separately
t1.name = 'K';
t1.age = 69;
t1.weight = 180;

t2.name = 'C';
t2.age = 69;
t2.weight = 179.9;

Then we print each of them with the p command:

The struct LGTeacher1 displays normally, but the union LGTeacher2 shows some anomalies. Let's print t2 with p/t (p/t prints values in binary) and take a look:

Analyzing the layout of the union:

  • The order of assignment is name, age, weight, and the weight data overwrites the age and name data. The last value assigned clobbers the earlier ones, which demonstrates the mutually exclusive nature of union members.

isa_t

4. isa structure

Next, let's analyze the structure of isa. In the earlier article on the alloc flow we analyzed object creation; its last step, initInstanceIsa, creates an isa of type isa_t. Let's look at its structure:

union isa_t {
    isa_t() { } // default constructor: leaves the union uninitialized
    isa_t(uintptr_t value) : bits(value) { } // constructor that initializes bits

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};
  • isa_t is a union: its two member variables, bits and cls, share the same memory, so they are mutually exclusive. When it is initialized with the first constructor, isa_t() { }, cls has no default value. When it is initialized with the second, isa_t(uintptr_t value) : bits(value) { }, cls takes on a value as well.
  • In objc4-818.2, isa is created the second way:
isa_t newisa(0);
  • isa_t also provides a bit field for storing additional information. This member is ISA_BITFIELD, a macro with separate definitions for __arm64__ and __x86_64__. Here we look at the __x86_64__ definition:
# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL
#   define ISA_MAGIC_MASK  0x001f800000000001ULL
#   define ISA_MAGIC_VALUE 0x001d800000000001ULL
#   define ISA_HAS_CXX_DTOR_BIT 1
#   define ISA_BITFIELD                                                        
      uintptr_t nonpointer        : 1;  // Whether pointer optimization is enabled for isa. 0: a pure isa pointer; 1: isa holds not only the class object's address but also class information, the object's reference count, and so on
      uintptr_t has_assoc         : 1;  // Associated-object flag. 0: none; 1: the object has associated objects
      uintptr_t has_cxx_dtor      : 1;  // Whether the object has a C++ or Objective-C destructor. If it does, the destructor logic must run; if not, the object can be freed faster
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/   // Stores the class pointer value. With pointer optimization on, this is 44 bits on x86_64 and 33 bits on arm64
      uintptr_t magic             : 6;  // Used by the debugger to tell whether the object is a real, initialized object or uninitialized memory
      uintptr_t weakly_referenced : 1;  // Whether the object is or was pointed to by an ARC weak variable. Objects without weak references can be freed faster
      uintptr_t unused            : 1;  // Unused
      uintptr_t has_sidetable_rc  : 1;  // Set when the reference count is too large for extra_rc; the overflow is stored in a side table
      uintptr_t extra_rc          : 8   // Reference count stored inline in isa. When it overflows, has_sidetable_rc above is set
#   define RC_ONE   (1ULL<<56)
#   define RC_HALF  (1ULL<<7)

As you can see, there are two kinds of isa:

  • nonpointer is 0: a pure isa pointer.
  • nonpointer is 1: isa holds not only the class object's address but also class information, the object's reference count, and so on.

The ISA_BITFIELD layout on x86_64 is as follows:

  • nonpointer is at bit 0, has_assoc at bit 1, has_cxx_dtor at bit 2, shiftcls at bits 3 to 46, magic at bits 47 to 52, weakly_referenced at bit 53, unused at bit 54, has_sidetable_rc at bit 55, and extra_rc at bits 56 to 63.

This layout makes it clear that shiftcls is the core data, so let's analyze it.

shiftcls:

  • In initIsa, set a breakpoint where isa is created and newisa.bits is assigned:

  • Comparing the values before and after the assignment: because newisa is a union, assigning bits also changes the other members through the shared memory, most visibly cls, nonpointer, and magic.
  • Let's print the binary representation of cls:

In the bit field ISA_BITFIELD, magic sits at bits 47 to 52, and binary 11 1011 converted to decimal gives 59. Likewise, nonpointer sits at bit 0, so nonpointer = 1.

  • After newisa.bits is assigned, execution enters the setClass method. A breakpoint shows that it reaches this step:
#else // Nonpointer isa, no ptrauth
    shiftcls = (uintptr_t)newCls >> 3;
#endif

This step shifts right by 3 bits. Why? Class pointers are aligned to 8 bytes, so a class address is always a multiple of 8 and the low three bits of its binary form are always 0. Shifting them out frees three bits of isa for other flags without losing any information.

  • After the right shift, print newisa again:
(isa_t) $43 = {
  bits = 8303516107965569
  cls = LGPerson
   = {
    nonpointer = 1
    has_assoc = 0
    has_cxx_dtor = 0
    shiftcls = 536875152
    magic = 59
    weakly_referenced = 0
    unused = 0
    has_sidetable_rc = 0
    extra_rc = 0}}

Now shiftcls has a value. Is there another way to check that isa is associated with the class? Let's verify it.

Verifying that isa is associated with the class:

1. Bitwise AND (&) with the mask ISA_MASK:
# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL

First we get the object's isa:

(lldb) x/4gx p1
0x100647de0: 0x011d800100008481 0x0000000000000000
0x100647df0: 0x0000000000000000 0x0000000000000000
// p1 is an object of class LGPerson
// 0x011d800100008481 is its isa

Next we print LGPerson.class in hexadecimal, then compute isa & ISA_MASK:

(lldb) p/x LGPerson.class
(Class) $61 = 0x0000000100008480 LGPerson

(lldb) p/x 0x011d800100008481 & 0x00007ffffffffff8ULL
(unsigned long long) $60 = 0x0000000100008480

The result of the bitwise AND is the same hexadecimal value as LGPerson.class, which tells us that isa is associated with LGPerson.

2. Through shift (>> and <<) operations:

From the ISA_BITFIELD layout above we know exactly where shiftcls sits within the 64 bits, so we can do the following:

  • First shift right by 3 bits (>> 3) to discard nonpointer, has_assoc, and has_cxx_dtor:
(lldb) p/x 0x011d800100008481 >> 3
(long) $63 = 0x0023b00020001090

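Then shift $63 left by 20 bits (<< 20) to clear magic, weakly_referenced, unused, has_sidetable_rc, and extra_rc. After the first shift, shiftcls occupies the low 44 bits, so the 20 bits above it fall off the top: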

(lldb) p/x $63 << 20
(long) $65 = 0x0002000109000000

This leaves only shiftcls, now in the top 44 bits. Shifting $65 right by 17 bits (>> 17) puts it back at its original position, bits 3 to 46:

(lldb) p/x $65 >> 17
(long) $66 = 0x0000000100008480

(lldb) p/x LGPerson.class
(Class) $67 = 0x0000000100008480 LGPerson

The whole process is easier to follow in a diagram:

The result is the same hexadecimal value as LGPerson.class, which confirms that isa is associated with the LGPerson class.