We explored the nature of objects in the last article. We know that underneath, an object is an objc_object structure, and that the first member of objc_object is isa.

Where isa points

Let’s first define a class, JSPerson, that inherits from NSObject, and instantiate it in the main function.

// JSPerson.h
#import <Foundation/Foundation.h>

@interface JSPerson : NSObject
@end

// JSPerson.m
#import "JSPerson.h"

@implementation JSPerson
@end

// main.m
#import "JSPerson.h"

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // ISA_MASK on x86_64: 0x00007ffffffffff8
        JSPerson *p = [JSPerson alloc];
        NSLog(@"%@", p); // set a breakpoint here
    }
    return 0;
}

With execution stopped at the breakpoint on the NSLog line, we dump the object’s memory using LLDB:

(lldb) x/4gx p
0x10045e6e0: 0x001d8001000083a9 0x0000000000000000
0x10045e6f0: 0x6c6f6f54534e5b2d 0x7370616e53726162
(lldb) p/x 0x001d8001000083a9 & 0x00007ffffffffff8 // isa & ISA_MASK gives the address isa points to
(long) $1 = 0x00000001000083a8
(lldb) po 0x00000001000083a8 // print what that address refers to
JSPerson

The LLDB session above shows that isa points to the JSPerson class; that is, an object’s isa points to its class. In the last article we saw that a class is actually an objc_class underneath and that objc_class inherits from objc_object, which means a class must also have an isa pointer. So where does that isa point? With this question in mind, we continue to explore:

(lldb) x/4gx 0x00000001000083a8 // Class object address
0x1000083a8: 0x0000000100008380 0x00007fff8e92c118
0x1000083b8: 0x000000010055b1a0 0x0004801000000007
(lldb) p/x 0x0000000100008380 & 0x00007ffffffffff8 // isa & ISA_MASK gives the address isa points to
(long) $6 = 0x0000000100008380
(lldb) po 0x0000000100008380 // print what that address refers to
JSPerson

The address that the class’s isa points to also prints JSPerson, yet this address (0x100008380) differs from the class address that the instance’s isa points to (0x1000083a8). Does a class really have two class objects at different memory addresses? Can a class have more than one class object, the way it can have many instances? Let’s write some code to check:

void jsTestClassNum(void){
    Class class1 = [JSPerson class];
    Class class2 = [JSPerson alloc].class;
    Class class3 = object_getClass([JSPerson alloc]);
    Class class4 = [JSPerson alloc].class;
    NSLog(@"\n%p-\n%p-\n%p-\n%p",class1,class2,class3,class4);
}

The function obtains the class object in four ways and prints each address:

0x1000083a8-
0x1000083a8-
0x1000083a8-
0x1000083a8

All four approaches print the same address, which shows that a class has exactly one class object and that its address is the one the instance’s isa points to. So the object at the other address is not a second class object: the isa of a class object points to something new, called a metaclass. We can verify that the metaclass really exists by opening the compiled binary with MachOView:

We searched the symbol table for the class keyword and found _OBJC_METACLASS_$_JSPerson, indicating that the compiler does generate metaclass objects at compile time.

Now that we know the compiler creates the metaclass for us, does the metaclass also have an isa pointer? We continue exploring with LLDB.

(lldb) x/4gx JSPerson.class
0x1000083a8: 0x0000000100008380 0x00007fff8e92c118
0x1000083b8: 0x00007fff671dc140 0x0000801000000000
(lldb) p/x 0x0000000100008380 & 0x00007ffffffffff8
(long) $1 = 0x0000000100008380
(lldb) po 0x0000000100008380 // Get the metaclass address
JSPerson

(lldb) x/4gx 0x0000000100008380
0x100008380: 0x00007fff8e92c0f0 0x00007fff8e92c0f0
0x100008390: 0x00000001006058e0 0x0001e03100000007
(lldb) p/x 0x00007fff8e92c0f0 & 0x00007ffffffffff8
(long) $3 = 0x00007fff8e92c0f0
(lldb) po 0x00007fff8e92c0f0 // the address the metaclass's isa points to
NSObject

(lldb) p/x NSObject.class
(Class) $5 = 0x00007fff8e92c118 NSObject // differs from the address the metaclass's isa points to
(lldb) x/4gx 0x00007fff8e92c0f0
0x7fff8e92c0f0: 0x00007fff8e92c0f0 0x00007fff8e92c118
0x7fff8e92c100: 0x0000000100605960 0x0005e03100000007
(lldb) p/x 0x00007fff8e92c0f0 & 0x00007ffffffffff8
(long) $6 = 0x00007fff8e92c0f0
(lldb) po 0x00007fff8e92c0f0 // the root metaclass's isa points to itself
NSObject

It turns out that the isa of the metaclass points to the metaclass of NSObject, the root metaclass, and the isa of the root metaclass points to the root metaclass itself. This makes the isa chain clear: instance, class, metaclass, root metaclass, and then the root metaclass again, as drawn in the classic diagram from the official documentation.

Next, let’s verify the superclass chain. We add a subclass, JSStudent, and print the superclasses:

// JSStudent.h
@interface JSStudent : JSPerson
@end

// JSStudent.m
@implementation JSStudent
@end

// Requires #import <objc/runtime.h>
void JSTestNSObject(void) {
    // NSObject instance
    NSObject *object1 = [NSObject alloc];
    // NSObject class
    Class class = object_getClass(object1);
    // NSObject metaclass
    Class metaClass = object_getClass(class);
    // NSObject root metaclass
    Class rootMetaClass = object_getClass(metaClass);
    // isa of the root metaclass
    Class rootRootMetaClass = object_getClass(rootMetaClass);
    NSLog(@"\n%p instance\n%p class\n%p metaclass\n%p root metaclass\n%p root root metaclass", object1, class, metaClass, rootMetaClass, rootRootMetaClass);

    // JSPerson metaclass and its superclass
    Class pMetaClass = object_getClass(JSPerson.class);
    Class psuperClass = class_getSuperclass(pMetaClass);
    NSLog(@"%@ - %p", psuperClass, psuperClass);

    // JSStudent metaclass and its superclass
    Class tMetaClass = object_getClass(JSStudent.class);
    Class tsuperClass = class_getSuperclass(tMetaClass);
    NSLog(@"%@ - %p", tsuperClass, tsuperClass);

    // NSObject, the root class: a special case
    Class nsuperClass = class_getSuperclass(NSObject.class);
    NSLog(@"%@ - %p", nsuperClass, nsuperClass);

    // Superclass of the root metaclass -> NSObject
    Class rnsuperClass = class_getSuperclass(metaClass);
    NSLog(@"%@ - %p", rnsuperClass, rnsuperClass);
}

Print result:

0x1006055e0 instance
0x7fff8e92c118 class
0x7fff8e92c0f0 metaclass
0x7fff8e92c0f0 root metaclass
0x7fff8e92c0f0 root root metaclass
NSObject - 0x7fff8e92c0f0
JSPerson - 0x100008418
(null) - 0x0 // NSObject has no superclass
NSObject - 0x7fff8e92c118

The output speaks for itself. We have now explored both the isa chain and the inheritance chain, and together they add up to the classic diagram from the official documentation.

Memory translation

Before exploring the structure of a class, let’s introduce the concept of memory translation, that is, reaching data by offsetting from a base address. We define an array, array, and a pointer to it, pArray, as follows:

int array[4] = {1, 2, 3, 4};
int *pArray  = array;
NSLog(@"%p - %p - %p - %p",&array,&array[0],&array[1],&array[2]);
NSLog(@"%p - %p - %p",pArray,pArray+1,pArray+2);
// Output:
0x7ffeefbff440 - 0x7ffeefbff440 - 0x7ffeefbff444 - 0x7ffeefbff448
0x7ffeefbff440 - 0x7ffeefbff444 - 0x7ffeefbff448

We see that array and array[0] have the same address, which makes sense because an array’s address is the address of its first element. pArray, pArray+1, and pArray+2 point to array[0], array[1], and array[2], respectively: adding 1 to an int pointer advances it by sizeof(int), which is 4 bytes here. We can use memory translation to fetch any element of the array:

for (int i = 0; i < 4; i++) {
    int value = *(pArray + i);
    NSLog(@"%d", value);
}
// Prints 1, 2, 3, 4

Now that we know about memory translation, let’s move on to exploring classes.

Memory layout of the class structure

We open the objc4 source and search for objc_class to find the structure of the class:

struct objc_class : objc_object {
  objc_class(const objc_class&) = delete;
  objc_class(objc_class&&) = delete;
  void operator= (const objc_class&) = delete;
  void operator=(objc_class&&) = delete;
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    // remaining code omitted
};

The size of a structure is determined by its member variables (methods live in the code segment and take no space in the structure), so we omit the methods that follow. The class structure has four members: isa, superclass, cache, and bits. isa is inherited from objc_object, superclass points to the parent class, cache is the method cache, and bits holds the class data. From the memory translation in the previous section, we know that the address of bits is the class address plus the combined size of the first three members. isa and superclass are both of type Class, which is a pointer, so each takes 8 bytes. The key question is how many bytes cache takes, so let’s look at the cache_t source:

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask; // 4 bytes
#if __LP64__
            uint16_t                   _flags;     // 2 bytes
#endif
            uint16_t                   _occupied;  // 2 bytes
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache; // pointer, 8 bytes
    };
    // static variables and methods omitted
};
typedef unsigned long           uintptr_t; // 8 bytes on LP64
typedef uint32_t mask_t; // 4 bytes

cache_t contains so much code that it can be confusing at first, but there is a pattern. Everything after the members shown above (from line 352 of the source file onward) is static variables and methods. Static variables live in the global data area and methods live in the code segment, so neither takes up space in the structure. The size of cache_t therefore depends only on _bucketsAndMaybeMask and the union.

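_bucketsAndMaybeMask is a uintptr_t, which is 8 bytes on a 64-bit platform. A union is as large as its largest member. The anonymous struct inside it holds _maybeMask (4 bytes), _flags (2 bytes), and _occupied (2 bytes), 8 bytes in total. The other member, _originalPreoptCache, is a pointer to preopt_cache_t. For reference, here is that structure: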

struct preopt_cache_t {
    int32_t  fallback_class_offset; // 4 bytes
    union {
        struct {
            uint16_t shift       :  5;
            uint16_t mask        : 11;
        };
        uint16_t hash_params;
    }; // 2 bytes
    uint16_t occupied    : 14; // these three bit-fields total 16 bits,
    uint16_t has_inlines :  1; // so together they fit in
    uint16_t bit_one     :  1; // one uint16_t (2 bytes)
    preopt_cache_entry_t entries[];

    inline int capacity() const {
        return mask + 1;
    }
};
typedef unsigned short uint16_t; // 2 bytes

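Because _originalPreoptCache is a pointer, it takes 8 bytes regardless of how large preopt_cache_t is. The union is therefore 8 bytes, and cache_t is 8 + 8 = 16 bytes. The bits member thus sits at the class address plus 8 (isa) + 8 (superclass) + 16 (cache) = 32 bytes, or 0x20 in hexadecimal.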

bits

With that in mind, let’s look at the bits member. Let’s first look at the definition of class_data_bits_t:

struct class_data_bits_t {
    friend objc_class;

    // Values are the FAST_ flags above.
    uintptr_t bits;
private:
    bool getBit(uintptr_t bit) const
    {
        return bits & bit;
    }

    // Atomically set the bits in `set` and clear the bits in `clear`.
    // set and clear must not overlap.
    void setAndClearBits(uintptr_t set, uintptr_t clear)
    {
        ASSERT((set & clear) == 0);
        uintptr_t newBits, oldBits = LoadExclusive(&bits);
        do {
            newBits = (oldBits | set) & ~clear;
        } while (slowpath(!StoreReleaseExclusive(&bits, &oldBits, newBits)));
    }

    void setBits(uintptr_t set) {
        __c11_atomic_fetch_or((_Atomic(uintptr_t) *)&bits, set, __ATOMIC_RELAXED);
    }

    void clearBits(uintptr_t clear) {
        __c11_atomic_fetch_and((_Atomic(uintptr_t) *)&bits, ~clear, __ATOMIC_RELAXED);
    }

public:

    class_rw_t* data() const {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
    void setData(class_rw_t *newData)
    {
        ASSERT(!data()  ||  (newData->flags & (RW_REALIZING | RW_FUTURE)));
        // Set during realization or construction only. No locking needed.
        // Use a store-release fence because there may be concurrent
        // readers of data and data's contents.
        uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
        atomic_thread_fence(memory_order_release);
        bits = newBits;
    }

    // Get the class's ro data, even in the presence of concurrent realization.
    // fixme this isn't really safe without a compiler barrier at least
    // and probably a memory barrier when realizeClass changes the data field
    const class_ro_t *safe_ro() const {
        class_rw_t *maybe_rw = data();
        if (maybe_rw->flags & RW_REALIZED) {
            // maybe_rw is rw
            return maybe_rw->ro();
        } else {
            // maybe_rw is actually ro
            return (class_ro_t *)maybe_rw;
        }
    }
    // remaining code omitted
};

class_data_bits_t has two public methods that return class data, data() and safe_ro(). Let’s start with data(), which returns a class_rw_t pointer. Here is the definition of class_rw_t:

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint16_t witness;
#if SUPPORT_INDEXED_ISA
    uint16_t index;
#endif
    explicit_atomic<uintptr_t> ro_or_rw_ext;
    Class firstSubclass;
    Class nextSiblingClass;
    // code omitted
    const method_array_t methods() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
        } else {
            return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
        }
    }
    const property_array_t properties() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->properties;
        } else {
            return property_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProperties};
        }
    }
    const protocol_array_t protocols() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->protocols;
        } else {
            return protocol_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseProtocols};
        }
    }
};

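class_rw_t provides methods(), properties(), and protocols(). Are the class’s methods, properties, and protocols stored here? To find out, let’s add an instance variable, two properties, an instance method, and a class method to JSPerson: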

//  JSPerson.h
@interface JSPerson : NSObject
{
    NSString *nickName;
}
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *hobby;

- (void)sayNB;
+ (void)saySomething;

@end
//  JSPerson.m
#import "JSPerson.h"

@implementation JSPerson
- (void)sayNB{
    
}
+ (void)saySomething{
    
}

@end
//main
JSPerson *p1 = [[JSPerson alloc] init];
NSLog(@"%@",p1);


Properties

We set a breakpoint in the main function and debug with LLDB:

(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550 // address of bits
(lldb) p (class_data_bits_t *)0x0000000100008550
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2->data() //bits.data()
(class_rw_t *) $3 = 0x000000010102db50
(lldb) p *$3
(class_rw_t) $4 = {
  flags = 2148007936
  witness = 1
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = {
      Value = 4295000224
    }
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
}
(lldb) p $3.properties()// Fetch the property list
(const property_array_t) $5 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000081d0
      }
      arrayAndFlag = 4295000528
    }
  }
}
  Fix-it applied, fixed expression was: 
    $3->properties()
(lldb) p $5.list
(const RawPtr<property_list_t>) $6 = {
  ptr = 0x00000001000081d0
}
(lldb) p $6.ptr
(property_list_t *const) $7 = 0x00000001000081d0
(lldb) p *$7
(property_list_t) $8 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 2)
} // count = 2: only the two properties
(lldb) p $8.get(0)
(property_t) $9 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $8.get(1)
(property_t) $10 = (name = "hobby", attributes = "T@\"NSString\",C,N,V_hobby")

The properties() method of bits.data() returns the two declared properties, name and hobby, but not the instance variable nickName.

Member variables

The class_rw_t structure has no member or method related to ivars. Going back to class_data_bits_t, we find the safe_ro() method. Let’s look at the structure it returns, class_ro_t:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif
    union {
        const uint8_t * ivarLayout;
        Class nonMetaclass;
    };
    explicit_atomic<const char *> name;
    // With ptrauth, this is signed if it points to a small list, but
    // may be unsigned if it points to a big list.
    void *baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    /// omit code
};

We find an ivars member. Is this where instance variables are stored?

(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550
(lldb) p (class_data_bits_t *)0x0000000100008550
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2.safe_ro()
(const class_ro_t *) $3 = 0x00000001000080a0
  Fix-it applied, fixed expression was: 
    $2->safe_ro()
(lldb) p $3->ivars // get ivars
(const ivar_list_t *const) $4 = 0x0000000100008168
(lldb) p *$4
(const ivar_list_t) $5 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 3)
}
(lldb) p $5.get(0)
(ivar_t) $6 = {
  offset = 0x00000001000084d8
  name = 0x0000000100003f18 "nickName"
  type = 0x0000000100003f79 "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $5.get(1)
(ivar_t) $7 = {
  offset = 0x00000001000084e0
  name = 0x0000000100003f21 "_name"
  type = 0x0000000100003f79 "@\"NSString\""
  alignment_raw = 3
  size = 8
}
(lldb) p $5.get(2)
(ivar_t) $8 = {
  offset = 0x00000001000084e8
  name = 0x0000000100003f27 "_hobby"
  type = 0x0000000100003f79 "@\"NSString\""
  alignment_raw = 3
  size = 8
}

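As you can see, the ivars list reached through safe_ro() holds the explicitly declared nickName as well as _name and _hobby, the underscore-prefixed member variables that the compiler generates automatically for the two properties. We also see baseMethodList, baseProtocols, and baseProperties in the class_ro_t structure, which we’ll explore later.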

Instance methods

Let’s continue with the list of methods:

(lldb) p/x JSPerson.class
(Class) $0 = 0x0000000100008530 JSPerson
(lldb) p/x 0x0000000100008530+0x20
(long) $1 = 0x0000000100008550
(lldb) p (class_data_bits_t*) $1
(class_data_bits_t *) $2 = 0x0000000100008550
(lldb) p $2.data()
(class_rw_t *) $3 = 0x000000010092c330
  Fix-it applied, fixed expression was: 
    $2->data()
(lldb) p *$3
(class_rw_t) $4 = {
  flags = 2148007936
  witness = 1
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = {
      Value = 4295000224
    }
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
}
(lldb) p $4.methods()
(const method_array_t) $5 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080e8
      }
      arrayAndFlag = 4295000296
    }
  }
}
(lldb) p $5.list
(const method_list_t_authed_ptr<method_list_t>) $6 = {
  ptr = 0x00000001000080e8
}
(lldb) p $6.ptr
(method_list_t *const) $7 = 0x00000001000080e8
(lldb) p *$7
(method_list_t) $8 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 5)
} // there are 5 methods
(lldb) p $8.get(0)
(method_t) $9 = {} // unlike a property, get(0) alone prints nothing; per the method_t structure, the fields are read through big()
(lldb) p $8.get(0).big()
(method_t::big) $10 = {
  name = "sayNB"
  types = 0x0000000100003f71 "v16@0:8"
  imp = 0x0000000100003b50 (KCObjcBuild`-[JSPerson sayNB])
}
(lldb) p $8.get(1).big()
(method_t::big) $11 = {
  name = "hobby"
  types = 0x0000000100003f85 "@16@0:8"
  imp = 0x0000000100003bc0 (KCObjcBuild`-[JSPerson hobby])
}
(lldb) p $8.get(2).big()
(method_t::big) $12 = {
  name = "setHobby:"
  types = 0x0000000100003f8d "v24@0:8@16"
  imp = 0x0000000100003bf0 (KCObjcBuild`-[JSPerson setHobby:])
}
(lldb) p $8.get(3).big()
(method_t::big) $13 = {
  name = "name"
  types = 0x0000000100003f85 "@16@0:8"
  imp = 0x0000000100003b60 (KCObjcBuild`-[JSPerson name])
}
(lldb) p $8.get(4).big()
(method_t::big) $14 = {
  name = "setName:"
  types = 0x0000000100003f8d "v24@0:8@16"
  imp = 0x0000000100003b90 (KCObjcBuild`-[JSPerson setName:])
}

The list contains five methods: the getter and setter for each of the two properties, plus sayNB. There is no saySomething, though. Where are class methods stored?

Class methods

The metaclass is the natural candidate, so let’s inspect the metaclass’s method list in the same way:

(lldb) x/4gx JSPerson.class
0x100008530: 0x0000000100008508 0x0000000100357140
0x100008540: 0x000000010076a5b0 0x0001802800000003
(lldb) po 0x0000000100008508 // Find the metaclass address
JSPerson
(lldb) p/x 0x0000000100008508+0x20
(long) $2 = 0x0000000100008528
(lldb) p (class_data_bits_t *)0x0000000100008528
(class_data_bits_t *) $3 = 0x0000000100008528
(lldb) p $3.data()
(class_rw_t *) $4 = 0x000000010076a550
  Fix-it applied, fixed expression was: 
    $3->data()
(lldb) p *$4
(class_rw_t) $5 = {
  flags = 2684878849
  witness = 1
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = {
      Value = 4312049361
    }
  }
  firstSubclass = nil
  nextSiblingClass = 0x00007fff861ddcd8
}
(lldb) p $5.methods()
(const method_array_t) $6 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x0000000100008080
      }
      arrayAndFlag = 4295000192
    }
  }
}
(lldb) p $6.list
(const method_list_t_authed_ptr<method_list_t>) $7 = {
  ptr = 0x0000000100008080
}
(lldb) p $7.ptr
(method_list_t *const) $8 = 0x0000000100008080
(lldb) p *$8
(method_list_t) $9 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 1)
}
(lldb) p $9.get(0).big()
(method_t::big) $10 = {
  name = "saySomething"
  types = 0x0000000100003f71 "v16@0:8"
  imp = 0x0000000100003b40 (KCObjcBuild`+[JSPerson saySomething])
}

So the class methods of a class are stored in the list of methods of a metaclass.

Conclusion

In this section, we explored the structure of the class, where isa pointers point, and how properties, methods, and member variables are stored in the class.

  • Classes are by nature objects.
  • Instance methods are stored in classes
  • Class methods are stored in metaclasses

class_rw_t gives access to properties, methods, protocols, and other information, and class_ro_t stores member variables along with baseMethodList, baseProtocols, baseProperties, and other information. So what’s the difference between class_ro_t and class_rw_t? We’ll explore that in the next article.