Isa analysis: from class to metaclass

In the earlier article “Nature of objects explored at the bottom of OC and ISA” we learned how to find an object's class through its isa, namely isa & mask.

Does a class also have an isa?

First, mask the object's isa and print the class address in hexadecimal:

(lldb) p/x 0x011d8001000080f9 & 0x00007ffffffffff8
(long) $2 = 0x00000001000080f8
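The masking step can also be reproduced outside LLDB. The following is a minimal sketch in C++, assuming the x86_64 mask value used in the session (0x00007ffffffffff8). The names kIsaMask and classAddressFromIsa are made up for this illustration and are not runtime APIs.

```cpp
#include <cstdint>

// Mask taken from the LLDB session above (x86_64); other architectures use a different value.
constexpr std::uint64_t kIsaMask = 0x00007ffffffffff8ULL;

// Hypothetical helper: keeps only the bits of a raw isa that hold the class address.
constexpr std::uint64_t classAddressFromIsa(std::uint64_t rawIsa) {
    return rawIsa & kIsaMask;
}

// The object's isa from the session resolves to the class address LLDB printed.
static_assert(classAddressFromIsa(0x011d8001000080f9ULL) == 0x00000001000080f8ULL,
              "object isa -> class");
```

Because the check is a static_assert, the file only compiles if the arithmetic matches the value LLDB reported.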

Then print its memory:

(lldb) x/4gx 0x00000001000080f8
0x1000080f8: 0x00000001000080d0 0x00007fff8061d008
0x100008108: 0x00007fff2020eaf0 0x0000801000000000

The layout looks very much like an object's memory, so we apply the same mask to the class's isa:

The isa of a class also points to a class!

So are these two classes the same? Print the new class address in hexadecimal:

(lldb) p/x 0x00000001000080d0 & 0x00007ffffffffff8
(long) $4 = 0x00000001000080d0

The first class (0x00000001000080f8) is not the same as the second one (0x00000001000080d0)!

Could classes, like objects, be allocated over and over in memory, so that more than one copy of a class exists?

Let's verify:

The class addresses are all the same, and each one is the first class (0x00000001000080f8)!

So what is the second class (0x00000001000080d0)?

Open the compiled Mach-O file in MachOView to take a look:

The symbol table contains a METACLASS entry: the second class is the metaclass! The metaclass is generated by the system at compile time.

To summarize the process: object -> (isa) class -> (isa) metaclass.

Isa flow diagram and inheritance chain

1. Isa flow diagram

Does the isa of the metaclass point to yet another class?

We continue to explore:

(lldb) x/4gx 0x00000001000080d0
0x1000080d0: 0x00007fff8061cfe0 0x00007fff8061cfe0
0x1000080e0: 0x00000001089bab10 0x0002e03100000003
(lldb) p/x 0x00007fff8061cfe0 & 0x00007ffffffffff8
(long) $5 = 0x00007fff8061cfe0

The metaclass's isa and the class address masked out of it are exactly the same!

Print the memory at that address:

(lldb) x/4gx 0x00007fff8061cfe0
0x7fff8061cfe0: 0x00007fff8061cfe0 0x00007fff8061d008
0x7fff8061cff0: 0x0000000108b09aa0 0x0002e03100000007

The isa of this class is its own address: it points to itself!

Print this particular class to see what it is:

It is NSObject: the root metaclass!

So let's look at NSObject itself:

(lldb) p/x NSObject.class
(Class) $8 = 0x00007fff8061d008 NSObject

The address of NSObject.class (0x00007fff8061d008) differs from the class we just printed (0x00007fff8061cfe0), so keep exploring:

(lldb) x/4gx 0x00007fff8061d008
0x7fff8061d008: 0x00007fff8061cfe0 0x0000000000000000
0x7fff8061d018: 0x0000000108b09c70 0x0002801000000003
(lldb) p/x 0x00007fff8061cfe0 & 0x00007ffffffffff8
(long) $10 = 0x00007fff8061cfe0

The isa of the NSObject class points to the root metaclass (0x00007fff8061cfe0), the class whose isa points to itself!

Verify this in code:

The result is consistent with our exploration, so we can derive the isa flow diagram:

2. Inheritance chain

OC classes have inheritance chains. Do metaclasses have inheritance chains too?

Let's print the superclass of the HPerson metaclass:

The superclass of HPerson's metaclass is NSObject's metaclass, the root metaclass!

Is the superclass of every metaclass the root metaclass?

What if a class has a superclass other than NSObject? What would the superclass of its metaclass be?

Create a new KPerson class that inherits from HPerson, and print the superclass of its metaclass:

The superclass of KPerson's metaclass is HPerson's metaclass!

This means that metaclasses have an inheritance chain too!

What is the superclass of the root metaclass?

Let's print it:

The superclass of the root metaclass is NSObject, the root class!

What about the superclass of the root class?

Print it again:

The root class has no superclass!

So we get a diagram of the inheritance chain:

3. Summary

The isa flow diagram and the inheritance chain combine into the classic isa flow chart:

According to the picture, there are 3 routes:

  1. Isa route: object -> class -> metaclass -> root metaclass -> root metaclass (itself)
  2. Class inheritance: class -> superclass -> root class -> nil
  3. Metaclass inheritance: metaclass -> superclass's metaclass -> root metaclass -> root class -> nil
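The three routes can be modeled with two pointers per class record. What follows is only a sketch of the relationships, not the runtime's data structures: MockClass, MockRuntime, isaRoute and superclassRoute are hypothetical names, and the "(meta)" suffix is just a label chosen here to tell a metaclass from its class. The isa route starts at the class rather than at an object.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a class record: only the two links discussed above.
struct MockClass {
    std::string name;
    const MockClass *isa = nullptr;         // class -> metaclass -> root metaclass
    const MockClass *superclass = nullptr;  // class -> superclass -> ... -> nullptr

    explicit MockClass(std::string n) : name(std::move(n)) {}
};

// Wires up the classes from the article the way the three routes describe.
struct MockRuntime {
    MockClass rootClass{"NSObject"};
    MockClass rootMeta{"NSObject(meta)"};
    MockClass hPerson{"HPerson"};
    MockClass hPersonMeta{"HPerson(meta)"};
    MockClass kPerson{"KPerson"};
    MockClass kPersonMeta{"KPerson(meta)"};

    MockRuntime() {
        // isa: each class -> its metaclass; every metaclass -> root metaclass.
        rootClass.isa = &rootMeta;
        hPerson.isa = &hPersonMeta;
        kPerson.isa = &kPersonMeta;
        rootMeta.isa = hPersonMeta.isa = kPersonMeta.isa = &rootMeta;

        // superclass: classes and metaclasses form parallel chains.
        kPerson.superclass = &hPerson;
        hPerson.superclass = &rootClass;
        kPersonMeta.superclass = &hPersonMeta;
        hPersonMeta.superclass = &rootMeta;
        rootMeta.superclass = &rootClass;  // root metaclass -> root class
        // rootClass.superclass stays nullptr (nil).
    }
    MockRuntime(const MockRuntime &) = delete;
    MockRuntime &operator=(const MockRuntime &) = delete;
};

// Records `count` names along the isa route; a count is needed because
// the root metaclass's isa points back to itself and never reaches nullptr.
std::vector<std::string> isaRoute(const MockClass *start, int count) {
    assert(start != nullptr);
    std::vector<std::string> names;
    const MockClass *c = start;
    for (int i = 0; i < count && c != nullptr; ++i, c = c->isa) {
        names.push_back(c->name);
    }
    return names;
}

// Follows superclass links until the chain ends at nullptr (nil).
std::vector<std::string> superclassRoute(const MockClass *start) {
    std::vector<std::string> names;
    for (const MockClass *c = start; c != nullptr; c = c->superclass) {
        names.push_back(c->name);
    }
    return names;
}
```

Calling superclassRoute on kPersonMeta walks route 3, and isaRoute on kPerson walks route 1 minus the leading object.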

Analyzing the class structure in the source code

We already know that an object's memory holds isa and the member variables. What about the memory of a class? What does it store?

Searching the objc source code for objc_class shows the class structure:

/// Represents an instance of a class.
struct objc_object {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;
};

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};

objc_class inherits from objc_object and has four member variables, of which isa is inherited from objc_object.

Of the remaining three, superclass is clearly the parent class; the other two, cache and bits, still need to be explored.

Start with bits, whose comment mentions class_rw_t *! Further down in the objc_class structure we find:

class_rw_t *data() const {
        return bits.data();
    }

Then go into the class_rw_t structure:

It provides the method list, property list, protocol list, and so on!

So how do we inspect the memory structure of a class?

Pointers and memory offset

1. Pointers

  • Normal pointer:

    The two variables have different memory addresses but hold the same value: this is a value copy!

  • Object pointer:

    The pointer addresses are different, and the addresses of the objects they point to are different as well!

    These two situations can be expressed in a diagram:

  • Array pointer:

We can see that the address of the array is the address of its first element, and that moving the array pointer by one step advances it by the size of the element type.

So can we just move the pointer to get the data?

2. Memory offset

Reading values by moving the pointer address:

Memory addresses can be offset, and dereferencing the offset address gives the value stored there: this is how values are read by offset!
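Reading by offset can be shown with a plain int array. This is a small sketch rather than the article's original demo (which was a screenshot); the helper names valueAtOffset and strideInBytes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Reads the element `index` steps away from `base` by moving the pointer.
// base + index advances the address by index * sizeof(int) bytes.
int valueAtOffset(const int *base, std::size_t index) {
    assert(base != nullptr);
    return *(base + index);
}

// Byte distance between two consecutive elements of an int array.
std::ptrdiff_t strideInBytes(const int *base) {
    assert(base != nullptr);
    const char *first = reinterpret_cast<const char *>(base);
    const char *second = reinterpret_cast<const char *>(base + 1);
    return second - first;
}
```

For an array such as int c[4] = {1, 2, 3, 4}, valueAtOffset(c, 2) reads the third element, and strideInBytes(c) equals sizeof(int), which is the step size the array pointer moves by.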

We can get the memory address of a class. Can we use memory offsets to reach the data stored in the class?

Calculating the memory layout of the class structure

We open the objc source project (see “alloc Exploration of OC Underlying Exploration”) and print the memory of the HPerson class:

(lldb) x/6gx HPerson.class
0x100008238: 0x0000000100008210 0x0000000108679140
0x100008248: 0x0000000108671380 0x0000802c00000000
0x100008258: 0x0000000108f2a234 0x00000002000b9980

The first 8 bytes are isa (of type Class) and the second 8 bytes are superclass (also of type Class). Print the superclass:

(lldb) po 0x0000000108679140
NSObject

It is NSObject, the superclass of HPerson, which can also be verified:

(lldb) p/x NSObject.class
(Class) $2 = 0x0000000108679140 NSObject

The address is the same as the superclass stored in the HPerson class!

The size of cache is still unknown, so we do not yet know at which offset bits starts.

Continue by exploring the cache_t source code:

struct cache_t {
private:
    explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
    union {
        struct {
            explicit_atomic<mask_t>    _maybeMask;
#if __LP64__
            uint16_t                   _flags;
#endif
            uint16_t                   _occupied;
        };
        explicit_atomic<preopt_cache_t *> _originalPreoptCache;
    };
    // Methods, global variables, etc. are omitted: methods live in the method area
    // and global variables in the global area, so they take no space in the structure.
};

The cache_t structure consists of an explicit_atomic<uintptr_t> and a union.

Go into the explicit_atomic structure:

// Version of std::atomic that does not allow implicit conversions
// to/from the wrapped type, and requires an explicit memory order
// be passed to load() and store().
template <typename T>
struct explicit_atomic : public std::atomic<T> {
    explicit explicit_atomic(T initial) noexcept : std::atomic<T>(std::move(initial)) {}
    operator T() const = delete;
    
    T load(std::memory_order order) const noexcept {
        return std::atomic<T>::load(order);
    }
    void store(T desired, std::memory_order order) noexcept {
        std::atomic<T>::store(desired, order);
    }
    
    // Convert a normal pointer to an atomic pointer. This is a
    // somewhat dodgy thing to do, but if the atomic type is lock
    // free and the same size as the non-atomic type, we know the
    // representations are the same, and the compiler generates good
    // code.
    static explicit_atomic<T> *from_pointer(T *ptr) {
        static_assert(sizeof(explicit_atomic<T> *) == sizeof(T *),
                      "Size of atomic must match size of original");
        explicit_atomic<T> *atomic = (explicit_atomic<T> *)ptr;
        ASSERT(atomic->is_lock_free());
        return atomic;
    }
};

explicit_atomic is a template, so its actual size comes from its template argument!

The size of explicit_atomic<uintptr_t> depends on uintptr_t, i.e. 8 bytes:

(lldb) po sizeof(uintptr_t)
8

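Next the union, whose members are mutually exclusive and share the same storage: a structure and an explicit_atomic<preopt_cache_t *>. The size of explicit_atomic<preopt_cache_t *> depends on preopt_cache_t *, which is a pointer type, i.e. 8 bytes. On a 64-bit platform the structure holds a 4-byte _maybeMask and two 2-byte fields, also 8 bytes, so the union is 8 bytes!

So cache_t has a size of 8 + 8 = 16 bytes!

With isa (8), superclass (8) and cache (16) in front of it, bits sits 32 bytes (0x20) past the start of the class: 0x100008238 + 0x20 = 0x100008258. That address can be treated as a class_data_bits_t *.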
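The size calculation can be checked by the compiler with a mock layout. This sketch assumes a 64-bit platform and replaces each explicit_atomic<T> with a plain T of the same size. MockCacheFields, MockCache, MockObjcClass and bitsAddress are hypothetical names, not runtime types.

```cpp
#include <cstddef>
#include <cstdint>

static_assert(sizeof(void *) == 8, "this sketch assumes a 64-bit platform");

// Stand-in for the anonymous struct inside cache_t's union.
struct MockCacheFields {
    std::uint32_t maybeMask;  // mask_t, assumed to be 4 bytes on 64-bit
    std::uint16_t flags;
    std::uint16_t occupied;
};                            // 4 + 2 + 2 = 8 bytes

// Stand-in for cache_t: one 8-byte word plus an 8-byte union.
struct MockCache {
    std::uintptr_t bucketsAndMaybeMask;
    union {
        MockCacheFields fields;
        void *originalPreoptCache;  // shares storage with `fields`
    };
};

// Stand-in for objc_class with the four members in source order.
struct MockObjcClass {
    void *isa;
    void *superclass;
    MockCache cache;
    std::uintptr_t bits;
};

static_assert(sizeof(MockCache) == 16, "cache occupies 16 bytes");
static_assert(offsetof(MockObjcClass, bits) == 32, "bits starts 0x20 bytes in");

// Hypothetical helper: address of bits given the address of a class.
constexpr std::uint64_t bitsAddress(std::uint64_t classAddress) {
    return classAddress + offsetof(MockObjcClass, bits);
}
```

With the class address from the session, bitsAddress(0x100008238) gives 0x100008258, the third row of the x/6gx output.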

Analyzing the class structure with LLDB

We saw earlier that bits gives us a class_rw_t *:

class_rw_t *data() const {
        return bits.data();
    }

Cast the bits address to class_data_bits_t * to get the variable $4, then call data() on it:

(lldb) p $4->data()
(class_rw_t *) $5 = 0x0000000108f2a230

This gives $5, a variable of type class_rw_t *. Go on and print what $5 points to:

(lldb) p *$5
(class_rw_t) $6 = {
  flags = 2148007936
  witness = 0
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = {
      Value = 4295000184
    }
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
}

We get the class_rw_t!

But it shows none of the method list, property list, and so on that we wanted. What now?

As before, we call a method, this time to get the property list:

(lldb) p $6.properties()
(const property_array_t) $7 = {
  list_array_tt<property_t, property_list_t, RawPtr> = {
     = {
      list = {
        ptr = 0x00000001000081c0
      }
      arrayAndFlag = 4295000512
    }
  }
}

But we still don't see the properties themselves, so what do we do?

First look at property_array_t, the type returned by the properties method:

class property_array_t : 
    public list_array_tt<property_t, property_list_t, RawPtr>
{
    typedef list_array_tt<property_t, property_list_t, RawPtr> Super;

 public:
    property_array_t() : Super() { }
    property_array_t(property_list_t *l) : Super(l) { }
};

When we go to list_array_tt, we see the comment:

/***********************************************************************
* list_array_tt<Element, List, Ptr>
* Generic implementation for metadata that can be augmented by categories.
*
* Element is the underlying metadata type (e.g. method_t)
* List is the metadata's list type (e.g. method_list_t)
* List is a template applied to Element to make Element*. Useful for
* applying qualifiers to the pointer type.
*
* A list_array_tt has one of three values:
* - empty
* - a pointer to a single list
* - an array of pointers to lists
*
* countLists/beginLists/endLists iterate the metadata lists
* count/begin/end iterate the underlying metadata elements
**********************************************************************/

List is the list type of the metadata!

So let’s go ahead and look at the list in LLDB:

(lldb) p $7.list
(const RawPtr<property_list_t>) $8 = {
  ptr = 0x00000001000081c0
}

Then print ptr:

(lldb) p $8.ptr
(property_list_t *const) $9 = 0x00000001000081c0

Then dereference it to recover the property_list_t:

(lldb) p *$9
(property_list_t) $10 = {
  entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 2)
}

So property_list_t is the property list, count = 2!

Then print the properties of the class:

(lldb) p $10.get(0)
(property_t) $11 = (name = "name", attributes = "T@\"NSString\",&,N,V_name")
(lldb) p $10.get(1)
(property_t) $12 = (name = "nickName", attributes = "T@\"NSString\",&,N,V_nickName")

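The properties of the HPerson class are printed successfully!

The whole process: class address + 32 bytes -> class_data_bits_t * -> data() -> class_rw_t -> properties() -> list -> ptr -> property_list_t -> get(i).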
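The output (entsizeAndFlags = 16, count = 2) suggests that get(i) is itself a memory offset: entry i sits i * entsize bytes after the first entry. The sketch below illustrates that idea under simplifying assumptions. It is not the runtime's entsize_list_tt, and MockProperty and MockPropertyList are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Same two fields as property_t in the source.
struct MockProperty {
    const char *name;
    const char *attributes;
};

// Simplified stand-in for a property list: an entry size, a count,
// and `count` entries of `entsize` bytes stored back to back.
class MockPropertyList {
public:
    explicit MockPropertyList(const std::vector<MockProperty> &props)
        : entsize_(sizeof(MockProperty)),
          count_(static_cast<std::uint32_t>(props.size())),
          bytes_(props.size() * sizeof(MockProperty)) {
        if (!props.empty()) {
            std::memcpy(bytes_.data(), props.data(), bytes_.size());
        }
    }

    std::uint32_t entsize() const { return entsize_; }
    std::uint32_t count() const { return count_; }

    // Entry i lives i * entsize bytes after the first entry.
    MockProperty get(std::uint32_t i) const {
        assert(i < count_);
        MockProperty p;
        const unsigned char *entry =
            bytes_.data() + static_cast<std::size_t>(i) * entsize_;
        std::memcpy(&p, entry, sizeof p);
        return p;
    }

private:
    std::uint32_t entsize_;
    std::uint32_t count_;
    std::vector<unsigned char> bytes_;
};
```

On a 64-bit platform two pointers make entsize() equal to 16, which matches the entsizeAndFlags value LLDB printed for property_list_t.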

Analyzing the class's bits data

Add a class method and an instance method to the class:

In the same way we got the properties, we get the methods:

(lldb) x/6gx HPerson.class
0x100008220: 0x00000001000081f8 0x0000000108679140
0x100008230: 0x0000000108671380 0x0000802400000000
0x100008240: 0x0000000108c7d4b4 0x00000002000b9980
(lldb) p (class_data_bits_t *)0x100008240
(class_data_bits_t *) $1 = 0x0000000100008240
(lldb) p $1->data()
(class_rw_t *) $2 = 0x0000000108c7d4b0
(lldb) p $2.methods()
(const method_array_t) $3 = {
  list_array_tt<method_t, method_list_t, method_list_t_authed_ptr> = {
     = {
      list = {
        ptr = 0x00000001000080d8
      }
      arrayAndFlag = 4295000280
    }
  }
}
  Fix-it applied, fixed expression was: 
    $2->methods()
(lldb) p $3.list
(const method_list_t_authed_ptr<method_list_t>) $4 = {
  ptr = 0x00000001000080d8
}
(lldb) p $4.ptr
(method_list_t *const) $5 = 0x00000001000080d8
(lldb) p *$5
(method_list_t) $6 = {
  entsize_list_tt<method_t, method_list_t, 4294901763, method_t::pointer_modifier> = (entsizeAndFlags = 27, count = 6)
}

As you can see, there are 6 methods. Let’s print the methods:

(lldb) p $6.get(0)
(method_t) $7 = {}
(lldb) p $6.get(1)
(method_t) $8 = {}

They print as empty!

Why is that?

Go back to the source and look first at the property_list_t structure:

struct property_list_t : entsize_list_tt<property_t, property_list_t, 0> {
};

The structure body is empty, and the elements we read from it are property_t, so look at property_t next:

struct property_t {
    const char *name;
    const char *attributes;
};

It has only two member variables, name and attributes, which matches the information we printed!

Now look at the method list and find method_list_t:

// Two bits of entsize are used for fixup markers.
// Reserve the top half of entsize for more flags. We never
// need entry sizes anywhere close to 64kB.
//
// Currently there is one flag defined: the small method list flag,
// method_t::smallMethodListFlag. Other flags are currently ignored.
// (NOTE: these bits are only ignored on runtimes that support small
// method lists. Older runtimes will treat them as part of the entry
// size!)
struct method_list_t : entsize_list_tt<method_t, method_list_t, 0xffff0003, method_t::pointer_modifier> {
    bool isUniqued() const;
    bool isFixedUp() const;
    void setFixedUp();

    uint32_t indexOfMethod(const method_t *meth) const {
        uint32_t i =
            (uint32_t)(((uintptr_t)meth - (uintptr_t)this) / entsize());
        ASSERT(i < count);
        return i;
    }

    bool isSmallList() const {
        return flags() & method_t::smallMethodListFlag;
    }

    bool isExpectedSize() const {
        if (isSmallList())
            return entsize() == method_t::smallSize;
        else
            return entsize() == method_t::bigSize;
    }

    method_list_t *duplicate() const {
        method_list_t *dup;
        if (isSmallList()) {
            dup = (method_list_t *)calloc(byteSize(method_t::bigSize, count), 1);
            dup->entsizeAndFlags = method_t::bigSize;
        } else {
            dup = (method_list_t *)calloc(this->byteSize(), 1);
            dup->entsizeAndFlags = this->entsizeAndFlags;
        }
        dup->count = this->count;
        std::copy(begin(), end(), dup->begin());
        return dup;
    }
};

There is a lot in there, but the element type we need is method_t, so look at method_t next:

struct method_t {
    static const uint32_t smallMethodListFlag = 0x80000000;

    method_t(const method_t &other) = delete;

    // The representation of a "big" method. This is the traditional
    // representation of three pointers storing the selector, types
    // and implementation.
    struct big {
        SEL name;
        const char *types;
        MethodListIMP imp;
    };

public:
    big &big() const {
        ASSERT(!isSmall());
        return *(struct big *)this;
    }

    // The rest is omitted
};

method_t has no member variables that LLDB can print directly, but it has a big structure and a big() method, so we print through big():

(lldb) p $6.get(0).big()
(method_t::big) $10 = {
  name = "printName"
  types = 0x0000000100003f77 "v16@0:8"
  imp = 0x0000000100003d80 (HObjectBuild`-[HPerson printName])
}
(lldb) p $6.get(1).big()
(method_t::big) $11 = {
  name = "name"
  types = 0x0000000100003f8b "@16@0:8"
  imp = 0x0000000100003d90 (HObjectBuild`-[HPerson name])
}
(lldb) p $6.get(2).big()
(method_t::big) $12 = {
  name = ".cxx_destruct"
  types = 0x0000000100003f77 "v16@0:8"
  imp = 0x0000000100003e30 (HObjectBuild`-[HPerson .cxx_destruct])
}
(lldb) p $6.get(3).big()
(method_t::big) $13 = {
  name = "setName:"
  types = 0x0000000100003f93 "v24@0:8@16"
  imp = 0x0000000100003db0 (HObjectBuild`-[HPerson setName:])
}
(lldb) p $6.get(4).big()
(method_t::big) $14 = {
  name = "nickName"
  types = 0x0000000100003f8b "@16@0:8"
  imp = 0x0000000100003de0 (HObjectBuild`-[HPerson nickName])
}
(lldb) p $6.get(5).big()
(method_t::big) $15 = {
  name = "setNickName:"
  types = 0x0000000100003f93 "v24@0:8@16"
  imp = 0x0000000100003e00 (HObjectBuild`-[HPerson setNickName:])
}

The methods are printed correctly!

But the class method is not among them. Why? Because class methods are stored in the metaclass!
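Where each kind of method lives can be illustrated with a small model. This is a sketch of the idea only, not the runtime's lookup code: MockClass and the helper functions are hypothetical, and the class method name sayHello is invented for the example because the article does not name its class method.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified class record: links plus the selectors stored on this record.
struct MockClass {
    const MockClass *isa = nullptr;         // a class's isa is its metaclass
    const MockClass *superclass = nullptr;
    std::vector<std::string> methods;
};

// Looks for a selector on cls and then up its superclass chain.
bool hasMethod(const MockClass *cls, const std::string &selector) {
    for (const MockClass *c = cls; c != nullptr; c = c->superclass) {
        if (std::find(c->methods.begin(), c->methods.end(), selector) !=
            c->methods.end()) {
            return true;
        }
    }
    return false;
}

// Instance methods: search the class itself.
bool respondsToInstanceMethod(const MockClass &cls, const std::string &selector) {
    return hasMethod(&cls, selector);
}

// Class methods: search the class's isa, i.e. the metaclass.
bool respondsToClassMethod(const MockClass &cls, const std::string &selector) {
    assert(cls.isa != nullptr);  // every class is expected to have a metaclass
    return hasMethod(cls.isa, selector);
}
```

In this model the method list read from the class's own bits corresponds to MockClass::methods on the class record, so a class method only shows up when the same steps are repeated on the metaclass.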