Preface
We often say that Objective-C (OC) is an object-oriented programming language. Objects are what we encounter most often when writing code. So what is an object? In my previous article, which explored structure memory alignment in iOS, I mentioned that objects are essentially structures. Is that true? Let's find out today.
Explore the underlying layer of the object
As we all know, Objective-C is a programming language that extends C with object-oriented features, and its underlying layer is actually C/C++ code. There is no such thing as an OC object in C/C++, so an object in OC must be translated into something that does exist in C/C++. We can follow this clue in our exploration.
Preparation before exploration
Let's first prepare a DMPerson class to begin our journey of discovery, and then initialize it in our main.m.
Convert to C/C++ code
Clang can rewrite OC code as C/C++ code, which lets us see the underlying structure of our OC code.
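Side note:
Clang is an Apple-led C/C++/Objective-C/Objective-C++ compiler written in C++. It is based on LLVM and is open source: current releases use the Apache 2.0 License with LLVM Exceptions, while older releases used the BSD-style University of Illinois/NCSA license.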
We can convert the main.m file into C/C++ code with the following command
# Simulator
xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main.cpp
# Real device
xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main.cpp
Open the generated main.cpp file and you will see a large amount of C/C++ code. Search it for our DMPerson class and you will find the following code. From the member variables _dmName and _dmAge in the structure, we can confirm that this is the low-level C/C++ implementation of DMPerson we are looking for. Look at it more closely: yes, it is a struct, that is, a structure. At this point, we can confirm the claim we started with:
Conclusion 1: At the lowest level, an object is essentially a structure.
Looking at the C/C++ code above, one question stands out.
In the DMPerson class we defined, there are only two properties, dmName and dmAge.
But in the underlying C/C++ code, the DMPerson_IMPL structure contains three member variables: besides _dmName and _dmAge, there is also an NSObject_IVARS member of type struct NSObject_IMPL.
From the names, we can guess that NSObject_IMPL is the underlying implementation of NSObject in OC, and that nesting one structure inside another resembles inheritance in OC. Let's continue exploring.
First, find the implementation of the NSObject_IMPL structure
As you can see here, NSObject is, in fact, objc_object. Let's look at the implementation of objc_object as well.
If this feels familiar, open NSObject.h in the OC headers and we can see the following.
That is the NSObject we are looking for, so we can draw two more conclusions:
Conclusion 2: NSObject is essentially an objc_object structure in the underlying C/C++ code.
Conclusion 3: Inheritance in OC is implemented underneath by nesting structures.
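Conclusion 3 can be checked with a small plain-C sketch. The struct names follow the rewritten code described above, but the member types (`void *`, `const char *`, `int`) and the two helper functions are assumptions made only for illustration; they are not the rewriter's actual output.

```c
#include <stddef.h>

/* Hypothetical stand-ins for what the rewriter emits. Only the nesting
   mirrors the article; the member types are assumptions. */
struct NSObject_IMPL {
    void *isa;                           /* stands in for `Class isa` */
};

struct DMPerson_IMPL {
    struct NSObject_IMPL NSObject_IVARS; /* the "superclass" part comes first */
    const char *_dmName;                 /* assumed type */
    int _dmAge;                          /* assumed type */
};

/* Because the parent struct is the first member, a pointer to the child
   struct also points at the parent part. */
static struct NSObject_IMPL *as_nsobject(struct DMPerson_IMPL *person) {
    return &person->NSObject_IVARS;
}

/* Offset of the inherited part inside the child layout. */
static size_t parent_offset(void) {
    return offsetof(struct DMPerson_IMPL, NSObject_IVARS);
}
```

Since `parent_offset()` is 0, the address of a `DMPerson_IMPL` and the address of its embedded `NSObject_IMPL` are the same, which is what lets a subclass instance be treated as an instance of its superclass.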
isa analysis
In the code above, we can see that NSObject has only one member variable, isa, of type Class. So what is isa? Before exploring that, let's look at the underlying definition of its type, Class.
typedef struct objc_class *Class;

struct objc_class {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;

#if !__OBJC2__
    Class _Nullable super_class                              OBJC2_UNAVAILABLE;
    const char * _Nonnull name                               OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list * _Nullable ivars                  OBJC2_UNAVAILABLE;
    struct objc_method_list * _Nullable * _Nullable methodLists  OBJC2_UNAVAILABLE;
    struct objc_cache * _Nonnull cache                       OBJC2_UNAVAILABLE;
    struct objc_protocol_list * _Nullable protocols          OBJC2_UNAVAILABLE;
#endif

} OBJC2_UNAVAILABLE;
As you can see, Class is actually a pointer to an objc_class structure (struct objc_class *).
Side note
The definition of the id type, which we use all the time, can also be found here:
typedef struct objc_object *id;
It is a pointer to an objc_object structure (struct objc_object *), so it can point to any instance.
Now let's look at isa, which we actually touched on in the previous article on exploring the underlying alloc process in iOS. The last step of the alloc process is to bind the memory we requested to our class through the initIsa method.
inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());

    isa_t newisa(0);

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());

#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }

    isa = newisa;
}
In the middle of all this code there is one very important thing, isa_t. What is it?
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};
Unions and bitfields
Unions
In the code above, we see a new construct, union. What is it, and what are its properties? Let's illustrate with the following example. First, look at a piece of code:
struct DMStudent1 {
    char *name;
    int age;
    double height;
};

union DMStudent2 {
    char *name;
    int age;
    double height;
};

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        struct DMStudent1 student1;
        student1.name = "mantou";
        student1.age = 15;
        student1.height = 190.1;

        union DMStudent2 student2;
        student2.name = "mantou";
        student2.age = 15;
        student2.height = 190.1;
    }
    return 0;
}
Print the contents of student1 and student2 after each assignment, and the following is displayed.
Each time a member of student1 is assigned, every value assigned so far remains stored. With student2, however, the only data we can read back correctly after each assignment is the member assigned last; the other members are dirty, meaningless data. So we can say that:
All variables in a struct coexist
Advantages: inclusive, like the sea that takes in every river. Whatever you put in, it keeps for you.
Disadvantages: memory allocation is coarse. Space is allocated for every member whether you use it or not.
All variables in a union are mutually exclusive
Advantages: memory use is finer-grained and more flexible, which saves space.
Disadvantages: it is just not "inclusive" enough.
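The difference between the two can also be measured rather than only observed in the debugger. The following is a minimal C sketch that reuses the DMStudent1 and DMStudent2 definitions from the example; the helper function names are made up for illustration, and the exact sizes depend on the platform.

```c
#include <stddef.h>

struct DMStudent1 { char *name; int age; double height; };
union  DMStudent2 { char *name; int age; double height; };

/* A struct needs room for every member; a union only for its largest one. */
static size_t struct_size(void) { return sizeof(struct DMStudent1); }
static size_t union_size(void)  { return sizeof(union DMStudent2); }

/* Every union member starts at the same address, so they share storage. */
static int union_members_overlap(void) {
    union DMStudent2 s;
    return (void *)&s.name == (void *)&s.age
        && (void *)&s.age == (void *)&s.height;
}

/* After the last assignment, only that member holds a meaningful value. */
static double last_write_wins(void) {
    union DMStudent2 s;
    s.age = 15;
    s.height = 190.1;   /* overwrites the bytes that held age */
    return s.height;
}
```

On a typical 64-bit platform the struct is larger than the union, because the union only reserves space for its widest member.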
Bit fields
With that said, let's look at another concept: bit fields. We can see that the union isa_t contains a structure, and inside that structure is a member called ISA_BITFIELD, which is defined as follows:
// Using the arm64 architecture as an example
# define ISA_BITFIELD                                                      \
    uintptr_t nonpointer        : 1;                                       \
    uintptr_t has_assoc         : 1;                                       \
    uintptr_t has_cxx_dtor      : 1;                                       \
    uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
    uintptr_t magic             : 6;                                       \
    uintptr_t weakly_referenced : 1;                                       \
    uintptr_t unused            : 1;                                       \
    uintptr_t has_sidetable_rc  : 1;                                       \
    uintptr_t extra_rc          : 19
This relies on bit fields. Let's take an example:
// DMPerson.h
#import <Foundation/Foundation.h>

@interface DMPerson : NSObject
@property (nonatomic, assign) BOOL fat;
@property (nonatomic, assign) BOOL rich;
@property (nonatomic, assign) BOOL handsome;
@end

// DMPerson.m
#import "DMPerson.h"

@implementation DMPerson
{
    struct {
        // unsigned, so that a set bit reads back as 1 rather than -1
        unsigned char fat : 1;      // whether the person is fat
        unsigned char rich : 1;     // whether the person is rich
        unsigned char handsome : 1; // whether the person is handsome
    } myself;
}

- (void)setFat:(BOOL)fat {
    myself.fat = fat;
}

- (void)setRich:(BOOL)rich {
    myself.rich = rich;
}

- (void)setHandsome:(BOOL)handsome {
    myself.handsome = handsome;
}

- (BOOL)fat {
    BOOL ret = myself.fat;
    return ret;
}

- (BOOL)rich {
    BOOL ret = myself.rich;
    return ret;
}

- (BOOL)handsome {
    BOOL ret = myself.handsome;
    return ret;
}
@end

// main.m
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        DMPerson *p = [[DMPerson alloc] init];
        p.fat = YES;
        p.rich = NO;
        p.handsome = YES;
        NSLog(@"fat : %d, rich : %d, handsome : %d", p.fat, p.rich, p.handsome);
    }
    return 0;
}
Let's inspect the value of the myself structure at a breakpoint.
Now let's look at the details of the structure.
Is it the same as the data we set? This is what we call a bit field. Normally, a structure like this would take 3 bytes to represent the stored data, but with bit fields we only need three bits, which fit in 1 byte. So the purpose of bit fields is to use memory more efficiently.
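The 3 bytes versus 1 byte claim can be verified with `sizeof`. This is a rough C sketch under stated assumptions: the struct and function names are invented for illustration, bit fields of type `unsigned char` are a compiler extension that GCC and Clang accept, and the exact layout of bit fields is implementation-defined.

```c
#include <stddef.h>

/* Three flags stored as ordinary chars: one byte each. */
struct PlainFlags {
    char fat;
    char rich;
    char handsome;
};

/* The same three flags as 1-bit fields (unsigned char bit fields are a
   GCC/Clang extension; the layout is implementation-defined). */
struct PackedFlags {
    unsigned char fat : 1;
    unsigned char rich : 1;
    unsigned char handsome : 1;
};

static size_t plain_size(void)  { return sizeof(struct PlainFlags); }
static size_t packed_size(void) { return sizeof(struct PackedFlags); }

/* Set the flags the way the article's example does and read them back. */
static int roundtrip(void) {
    struct PackedFlags me = {0};
    me.fat = 1;
    me.rich = 0;
    me.handsome = 1;
    return me.fat == 1 && me.rich == 0 && me.handsome == 1;
}
```

With GCC or Clang on common 64-bit targets, `plain_size()` is 3 and `packed_size()` is 1, while the values read back from the packed struct still match what was stored.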
NonpointerIsa & TaggedPointer
With the bit fields out of the way, let’s go back to our code
if (!nonpointer) {
    newisa.setClass(cls, this);
} else {
    // ... a series of assignments to the bit fields ...
    newisa.setClass(cls, this);
}
isa = newisa;
As you can see, when nonpointer is 0, the class address is stored in isa directly, so isa is a plain class pointer. When nonpointer is 1, isa stores some extra special information in addition to the class, which we call NonpointerIsa. TaggedPointer is a separate optimization: the assertion at the top of initIsa shows that a tagged pointer never reaches this code at all.
In September 2013, Apple introduced the iPhone 5s with a 64-bit processor. On a 64-bit CPU, a pointer occupies 8 bytes (64 bits), but a memory address does not actually need all 64 bits, and many values need far fewer: 32 bits are already enough to store numbers up to about 2 billion (2^31 = 2147483648, with one more bit as the sign bit). So Apple put the spare bits to use on a need-driven basis and came up with TaggedPointer and NonpointerIsa. TaggedPointer is used for small objects and stores their values directly in the pointer. NonpointerIsa is used for ordinary heap objects and uses isa bit by bit, partly to store the class address and partly to store additional information.
TaggedPointer
For small objects such as NSDate and NSNumber, the vast majority of stored values do not exceed the order of 2 billion. Storing them as a pointer to heap memory would waste memory and hurt performance. Instead, the value is stored directly in the bits of the object pointer itself, together with some tag bits that mark the pointer as a TaggedPointer. This way no heap memory has to be allocated to hold the value, and there is no separate object whose isa needs initializing. Remember that heap memory is much slower to allocate and free than stack memory.
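To make the idea concrete, here is a toy C sketch of storing a small value inside a pointer-sized word. The layout (lowest bit as the tag, remaining bits as an unsigned payload) and all the function names are hypothetical; this is not Apple's actual tagged pointer encoding, only an illustration of the principle.

```c
#include <stdint.h>

/* Hypothetical layout, NOT Apple's real encoding: the lowest bit marks
   the word as tagged, the remaining bits hold a small unsigned payload. */
#define TAG_BIT ((uintptr_t)1)

/* Pack a small value into a pointer-sized word; no heap allocation. */
static uintptr_t tagged_make(uintptr_t value) {
    return (value << 1) | TAG_BIT;
}

/* Pointers to aligned objects have a lowest bit of 0, so the tag bit
   distinguishes a packed value from a real address. */
static int is_tagged(uintptr_t word) {
    return (word & TAG_BIT) != 0;
}

/* Recover the payload by shifting the tag bit out. */
static uintptr_t tagged_value(uintptr_t word) {
    return word >> 1;
}
```

A word produced by `tagged_make(42)` reports as tagged and yields 42 again, while the address of an ordinary aligned variable does not report as tagged.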
NonpointerIsa
isa is not just a pointer. On the arm64 architecture, for example, only 33 bits are actually used to store the class address. The remaining bits store special values.
uintptr_t nonpointer : 1; // Whether this is a nonpointer isa (1) or a plain class pointer (0)
uintptr_t has_assoc : 1; // Whether the object has associated objects
uintptr_t has_cxx_dtor : 1; // Whether the object has a C++ or ObjC destructor. If so, the destructor logic must run; if not, the object can be freed faster
uintptr_t shiftcls : 33; // The class pointer value
uintptr_t magic : 6; // Used by the debugger to tell whether the object is a real, initialized object or uninitialized memory
uintptr_t weakly_referenced : 1; // Whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be freed faster
uintptr_t unused : 1; // Named deallocating in older runtime versions, where it marked whether the object was being deallocated
uintptr_t has_sidetable_rc : 1; // Whether part of the reference count is stored in the side table because it no longer fits in extra_rc
uintptr_t extra_rc : 19; // The reference count stored inline in isa. 19 bits are enough in practice
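The field list above implies a position for every field inside the 64-bit value. The following C sketch extracts a few of them with shifts and masks instead of C bit fields. It assumes the fields are laid out from the least significant bit upward (which is how Clang allocates bit fields on little-endian arm64), and the helper names are hypothetical, not runtime functions.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits starting at bit `offset` (bit 0 = least significant). */
static uint64_t isa_field(uint64_t bits, unsigned offset, unsigned width) {
    assert(width > 0 && width < 64 && offset + width <= 64);
    return (bits >> offset) & ((UINT64_C(1) << width) - 1);
}

/* Offsets follow from the arm64 field widths listed above:
   1 + 1 + 1 + 33 + 6 + 1 + 1 + 1 + 19 = 64 bits. */
static uint64_t isa_nonpointer(uint64_t bits) { return isa_field(bits, 0, 1); }
static uint64_t isa_shiftcls(uint64_t bits)   { return isa_field(bits, 3, 33); }
static uint64_t isa_magic(uint64_t bits)      { return isa_field(bits, 36, 6); }
static uint64_t isa_extra_rc(uint64_t bits)   { return isa_field(bits, 45, 19); }

/* Build a made-up isa value from individual fields, for experimenting. */
static uint64_t isa_pack(uint64_t shiftcls, uint64_t magic, uint64_t extra_rc) {
    return UINT64_C(1)            /* nonpointer = 1 */
         | (shiftcls << 3)
         | (magic << 36)
         | (extra_rc << 45);
}
```

Packing a value with `isa_pack` and reading the fields back returns the same numbers, which confirms that the offsets add up without overlapping.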
Recovering class information from isa
Use the mask to restore the class information
Another interesting thing we saw next to the ISA_BITFIELD definition was ISA_MASK. So what is ISA_MASK? We call it a mask. We know that a NonpointerIsa contains not only the class information but also other special information. The mask lets us mask out that other information and get the class information directly, as the following figure illustrates.
Let's take another example:
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        DMPerson *p = [DMPerson alloc];
        NSLog(@"%@", p);
    }
    return 0;
}
Using the same DMPerson class, we alloc an instance, set a breakpoint, and print the values we are interested in.
We find that when we take the isa value and bitwise-AND it with the mask, the result is our class information.
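The same bitwise AND can be written out in C. The two mask constants below are the values I recall from the runtime's isa.h for x86_64 and for arm64 without pointer authentication, so they are worth checking against your runtime version; the isa value used in the usage note is made up, and the function names are hypothetical.

```c
#include <stdint.h>

/* Mask values as recalled from the runtime's isa.h (verify for your version). */
#define ISA_MASK_X86_64 UINT64_C(0x00007ffffffffff8)
#define ISA_MASK_ARM64  UINT64_C(0x0000000ffffffff8)

/* Clear every bit outside shiftcls; what remains is the class address. */
static uint64_t class_from_isa(uint64_t isa_bits, uint64_t mask) {
    return isa_bits & mask;
}

/* Number of 1 bits in a mask; it should match the width of shiftcls. */
static int mask_width(uint64_t mask) {
    int count = 0;
    while (mask != 0) {
        count += (int)(mask & 1);
        mask >>= 1;
    }
    return count;
}
```

For a made-up x86_64 isa of 0x001d8001000020f1, `class_from_isa` returns 0x00000001000020f0. The masks contain 44 and 33 set bits respectively, matching the shiftcls widths in the two ISA_BITFIELD definitions.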
Restore class information by bit operation
Since the test machine uses the x86_64 architecture, here is the NonpointerIsa layout for x86_64 first:
# define ISA_BITFIELD                                                        \
    uintptr_t nonpointer        : 1;                                         \
    uintptr_t has_assoc         : 1;                                         \
    uintptr_t has_cxx_dtor      : 1;                                         \
    uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
    uintptr_t magic             : 6;                                         \
    uintptr_t weakly_referenced : 1;                                         \
    uintptr_t unused            : 1;                                         \
    uintptr_t has_sidetable_rc  : 1;                                         \
    uintptr_t extra_rc          : 8
Our analysis: the class information we want is stored in shiftcls, with 3 special bits below it and 6+1+1+1+8=17 special bits above it. To get the class information stored in shiftcls, we only need to eliminate those other bits: shift right by 3 to drop the low bits, shift left by 20 to push the high bits out, then shift right by 17 to put shiftcls back in place.
To illustrate it graphically, it looks something like this
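In code, the shift sequence can be sketched as follows. This is a minimal C sketch for the x86_64 layout shown above; the function and constant names are hypothetical, and the isa value in the usage note is made up.

```c
#include <stdint.h>

/* x86_64 layout from the ISA_BITFIELD above: 3 special bits below shiftcls,
   6 + 1 + 1 + 1 + 8 = 17 special bits above it. */
enum { LOW_BITS = 3, HIGH_BITS = 17 };

static uint64_t class_by_shifting(uint64_t isa_bits) {
    uint64_t v = isa_bits >> LOW_BITS;  /* drop the 3 low special bits     */
    v <<= (LOW_BITS + HIGH_BITS);       /* push the 17 high bits out       */
    v >>= HIGH_BITS;                    /* move shiftcls back into place   */
    return v;
}
```

For the made-up isa 0x001d8001000020f1 this returns 0x00000001000020f0, the same result as a bitwise AND with the x86_64 mask 0x00007ffffffffff8, so the two approaches agree.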
Conclusion
We use clang to convert OC code into underlying C/C++ code, thus confirming:
- At the lowest level, an object is essentially a structure.
- NSObject is essentially an objc_object structure in the underlying C/C++ code.
- Inheritance in OC is implemented underneath by nesting structures.
Next, we found the definition of isa in the underlying code, which led us to the concepts of unions and bit fields.
Then we looked at the difference between NonpointerIsa and TaggedPointer.
Finally, we recovered the class from isa in two different ways.