Preface

We often say that Objective-C (OC) is an object-oriented programming language. Objects are what we deal with most often when writing code. So what exactly is an object? In my last article, on structure memory alignment in iOS internals, I mentioned that objects are essentially structures. Is that true? Today we are going to find out.

Explore the underlying layer of the object

As we all know, Objective-C is a programming language that extends C with object-oriented features, and underneath it is implemented in C/C++ code. C/C++ has no notion of an OC object, so an OC object must be translated into something that does exist in C/C++. We can follow this clue in our exploration.

Preparation before exploration

Let’s first prepare a class, DMPerson, to begin our journey of discovery, and then initialize it in our main.m.

Convert to C/C++ code

Using clang, a lightweight compiler, and its -rewrite-objc option, we can translate OC code back into C/C++ code and see its underlying structure.

Aside:

Clang is a C/C++/Objective-C/Objective-C++ compiler whose development was led by Apple. It is written in C++, built on LLVM, and released under the LLVM project's open-source license.

We can convert the main.m file into C/C++ code with the following command

# Simulator
xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main.cpp
# Physical device
xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main.cpp

After opening the generated main.cpp file, you will see a large amount of C/C++ code. Search it for our DMPerson class and you will find the following code. From the member variables _dmName and _dmAge in the structure, we can confirm that this is the low-level C/C++ implementation of DMPerson we are looking for. Look more closely: yes, it is a struct, that is, a structure. At this point, we can draw the conclusion we started with:

Conclusion 1: Objects are essentially structures at the bottom

Looking at the C/C++ code above, one question stands out.

In the DMPerson class we defined, there are only two properties, dmName and dmAge.

But in the underlying C/C++ code, our DMPerson_IMPL structure contains three member variables: besides _dmName and _dmAge, there is also an NSObject_IVARS member of the NSObject_IMPL structure type.

From its name, we can guess that NSObject_IMPL is the underlying implementation of NSObject in OC, and that nesting one structure inside another resembles how inheritance works in OC. So let’s continue our exploration.

First, find the implementation of the NSObject_IMPL structure

As you can see here, NSObject is in fact objc_object. Let’s look at the implementation of objc_object next.

If this feels a bit familiar, open NSObject.h in the OC headers and we can see the following.

That is the NSObject we are looking for, so we can draw two more conclusions:

Conclusion 2: NSObject is essentially an objc_object structure at the C/C++ level.

Conclusion 3: Inheritance in OC is implemented underneath by nesting structures.

isa analysis

In the code above, we can see that NSObject has only one member variable: isa, of type Class. So what is isa? We are going to explore that, but first let’s look at the underlying definition of its type, Class.

typedef struct objc_class *Class;

struct objc_class {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;

#if !__OBJC2__
    Class _Nullable super_class                              OBJC2_UNAVAILABLE;
    const char * _Nonnull name                               OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list * _Nullable ivars                  OBJC2_UNAVAILABLE;
    struct objc_method_list * _Nullable * _Nullable methodLists                    OBJC2_UNAVAILABLE;
    struct objc_cache * _Nonnull cache                       OBJC2_UNAVAILABLE;
    struct objc_protocol_list * _Nullable protocols          OBJC2_UNAVAILABLE;
#endif

} OBJC2_UNAVAILABLE;

As you can see, Class is actually a pointer to an objc_class structure (struct objc_class *).

Aside

The definition of the id type, which we use all the time, can also be found here:

typedef struct objc_object *id;

It is a pointer to an objc_object structure (struct objc_object *), so it can point to any instance.

Now let’s look at isa, which we already touched on in the previous article, iOS – exploring the underlying alloc process. The last step of the alloc process uses the initIsa method to bind the memory we requested to our class.

inline void 
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{ 
    ASSERT(!isTaggedPointer()); 
    
    isa_t newisa(0);
    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());
#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }
    isa = newisa;
}

In the middle of all this code there is one very important thing: isa_t. But what is it?

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};

Unions and bitfields

Unions

In the code above we see a new construct, union. What is it, and what are its properties? Let’s illustrate with an example. First, look at this piece of code:

struct DMStudent1 {
    char        *name;
    int         age;
    double      height ;
};

union DMStudent2 {
    char        *name;
    int         age;
    double      height ;
};

int main(int argc, const char * argv[]) {
    @autoreleasepool {
  
        struct DMStudent1   student1;
        student1.name = "mantou";
        student1.age  = 15;
        student1.height = 190.1;

        union DMStudent2    student2;
        student2.name = "mantou";
        student2.age  = 15;
        student2.height = 190.1;
    }
    return 0;
}

Print the contents of student1 and student2 after each assignment, one by one. The following information is displayed.

Every time a member of student1 is assigned, all previously assigned data stays stored. With student2, however, the only data we can read back correctly after each assignment is the most recently assigned member; the other members hold stale data that should be treated as meaningless. So we can say that:

All variables in a struct “coexist”

Advantage: it is accommodating, like the sea that takes in every river. Whatever you assign, it keeps.

Disadvantage: memory allocation is coarse. Every member gets its own space whether you use it or not.

The variables in a union are mutually exclusive

Disadvantage: it is not “accommodating”. Only one member holds a valid value at a time.

Advantage: memory use is finer-grained and more flexible, and it saves memory space.

Bit fields

With that said, let’s look at another topic: bit fields. We can see that the union isa_t contains a structure, and inside that structure is a member called ISA_BITFIELD, which is defined as follows

// Taking the arm64 architecture as an example

#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t has_cxx_dtor      : 1;                                       \
        uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
        uintptr_t magic             : 6;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t unused            : 1;                                       \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 19

Bit fields are at work here. Let’s take an example:

@interface DMPerson : NSObject
@property (nonatomic, assign) BOOL fat;
@property (nonatomic, assign) BOOL rich;
@property (nonatomic, assign) BOOL handsome;
@end
#import "DMPerson.h"

@implementation DMPerson
{
    struct {
        char fat      : 1;        // whether they are fat
        char rich     : 1;        // whether they have money
        char handsome : 1;        // whether they are handsome
    }myself;
}

- (void)setFat:(BOOL)fat {
    myself.fat = fat;
}

- (void)setRich:(BOOL)rich {
    myself.rich = rich;
}

- (void)setHandsome:(BOOL)handsome {
    myself.handsome = handsome;
}

- (BOOL)fat {
    BOOL ret = myself.fat;
    return ret;
}

- (BOOL)rich {
    BOOL ret = myself.rich;
    return ret;
}

- (BOOL)handsome {
    BOOL ret = myself.handsome;
    return ret;
}
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        DMPerson *p = [[DMPerson alloc] init];
        p.fat = YES;
        p.rich = NO;
        p.handsome = YES;
        NSLog(@"fat : %d,rich : %d,handsome : %d",p.fat, p.rich, p.handsome);
    }
    return 0;
}

We’re going to look at the value of our “myself” structure through a breakpoint

Now let’s look at the details of the structure

Is it the same as the data that we set? This is what we call a bit field. Normally, a structure like this would need 3 bytes to hold its data, but with bit fields we only need 3 bits, so 1 byte is enough to store the content. The purpose of bit fields is to make memory use more efficient.

NonpointerIsa & TaggedPointer

With the bit fields out of the way, let’s go back to our code

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        // ... a series of assignments to the bit fields ...
        newisa.setClass(cls, this);
    }
    isa = newisa;

As you can see, when nonpointer is 0, isa stores the class address directly and nothing else, that is, a raw pointer isa. When nonpointer is 1, isa stores some extra information in addition to the class, which we call NonpointerIsa. TaggedPointer is a separate optimization discussed below: note that initIsa asserts !isTaggedPointer(), because a tagged pointer never reaches this code.

In September 2013, Apple introduced the iPhone 5s with a 64-bit processor. On a 64-bit CPU, a pointer occupies 8 bytes, or 64 bits. A memory address does not actually need all 64 bits, and many stored values are small too: a signed 32-bit integer can already represent numbers up to about 2 billion (2^31 = 2147483648, with one more bit as the sign bit). So Apple used the spare bits according to need and came up with TaggedPointer and NonpointerIsa. TaggedPointer is used for small objects, storing their values directly in the pointer. NonpointerIsa is used for ordinary heap objects: isa is used bit by bit, with part storing the class address and part storing additional information.

TaggedPointer

For small objects like NSDate and NSNumber, the vast majority of the values stored do not exceed the order of 2 billion. Allocating heap memory and pointing to it for such values wastes memory and costs performance. Instead, Apple stores the value directly in the bits of the object pointer itself, together with some tag bits that mark the pointer as a TaggedPointer. The value can then be read from the pointer without allocating heap memory, and there is no separate object holding an isa. Remember that heap memory is much slower to allocate and free than stack memory.

NonpointerIsa

isa is not just a pointer. In the arm64 architecture, for example, only 33 bits are actually used to store the class address. The rest of the bits are used to store special values.

uintptr_t nonpointer : 1; // 0: isa is a raw class pointer; 1: isa also stores extra information in bit fields

uintptr_t has_assoc : 1; // Whether the object has associated objects

uintptr_t has_cxx_dtor : 1; // Whether the object has a C++ or Objc destructor. If so, the destructor logic must run. If not, the object can be released faster

uintptr_t shiftcls : 33; // The class address (stored shifted right by 3 bits)

uintptr_t magic : 6; // Used by the debugger to tell whether the object is a real, initialized object or uninitialized memory

uintptr_t weakly_referenced : 1; // Whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be freed faster

uintptr_t unused : 1; // Unused in the ISA_BITFIELD definition shown above (earlier runtime versions named this bit deallocating, meaning the object is being released)

uintptr_t has_sidetable_rc : 1; // Whether the reference count is too large for extra_rc, so that part of it is stored in a side table

uintptr_t extra_rc : 19; // The object's reference count. 19 bits are enough in practice; overflow goes to the side table

Restoring class information from isa

Use the mask to restore the class information

Another interesting thing we saw next to the ISA_BITFIELD definition was ISA_MASK. So what is ISA_MASK? We call it a mask. We know that a NonpointerIsa contains not only the class information but also other special information. The mask lets us hide that other information and get at the class information directly, which you can picture as in the following figure

Let’s take another example:

int main(int argc, const char * argv[]) {
    @autoreleasepool {

        DMPerson *p = [DMPerson alloc];
        NSLog(@"%@", p);
    }
    return 0;
}

Again we take the DMPerson class, alloc an instance, set a breakpoint, and print out the information we need

We find that when we take the isa value and bitwise AND it with the mask, the result is our class.

Restore class information by bit operation

Since the test machine uses the x86_64 architecture, here first is the NonpointerIsa layout for x86_64

#   define ISA_BITFIELD                                                        \
      uintptr_t nonpointer        : 1;                                         \
      uintptr_t has_assoc         : 1;                                         \
      uintptr_t has_cxx_dtor      : 1;                                         \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      uintptr_t magic             : 6;                                         \
      uintptr_t weakly_referenced : 1;                                         \
      uintptr_t unused            : 1;                                         \
      uintptr_t has_sidetable_rc  : 1;                                         \
      uintptr_t extra_rc          : 8

From this layout, the class information we need is stored in shiftcls, with 3 special bits below it and 6+1+1+1+8=17 special bits above it. To get the class information stored in shiftcls, we only need to eliminate the other special bits by shifting them out, and the rest is easy.

To illustrate it graphically, it looks something like this

Conclusion

We used clang to convert OC code into the underlying C/C++ code, thus confirming:

  1. An object is essentially a structure underneath.
  2. NSObject is essentially an objc_object structure at the C/C++ level.
  3. Inheritance in OC is implemented underneath by nesting structures.

Then we found the definition of isa in the underlying code, which led us to the concepts of unions and bit fields;

Then we looked at the difference between NonpointerIsa and TaggedPointer;

Finally, we recovered the class from isa in two different ways.