Preface
In earlier articles we covered how an object is created and how its memory size is calculated, but what exactly is an object? Let's analyze it.
Compile to C++ file
As we know, OC is built on top of C/C++: the compiler turns OC code into lower-level code, then into assembly, and finally into a binary that the machine can execute. clang can also rewrite OC source as C++, so we can explore the underlying implementation in C/C++. There are two ways to convert OC code to C++.
1. clang
clang is a C/C++/Objective-C compiler based on LLVM, with development led by Apple.
- Converting the code to C++ (main.m into main.cpp) takes the following steps:
  - First open the terminal
  - Go to the directory that contains the file you want to convert
  - Then execute the following command
clang -rewrite-objc main.m -o main.cpp
If the file imports UIKit, clang may report that UIKit/UIKit.h cannot be found:
main.m:8:9: fatal error: 'UIKit/UIKit.h' file not found
#import <UIKit/UIKit.h>
        ^~~~~~~~~~~~~~~
1 error generated.
In that case, replace the command in the second step with:
clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.3.sdk main.m
Note 1: -o specifies the output file name; with -o main.cpp, the rewritten main.m is written to main.cpp.
Note 2: /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.3.sdk is the path of the iPhoneSimulator SDK on your computer. Modify the version number to match the version actually installed.
2. xcrun
- xcrun is installed along with Xcode. It wraps clang and is more convenient to use.
- The steps for converting code to C++ are the same as with clang. The commands are as follows:
For the simulator:
xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main.cpp
For a real device:
xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main.cpp
Second, the structure of a class
With the steps above we now have the C++ file. Let's analyze it:
- In main.cpp, search for WSPerson (a class we defined in main.m). We get a structure:
struct WSPerson_IMPL {
struct NSObject_IMPL NSObject_IVARS;
};
Let's define a wsName property in WSPerson, compile it into C++ code again, and look at this structure:
struct WSPerson_IMPL {
struct NSObject_IMPL NSObject_IVARS;
NSString *_wsName;
};
The structure now also contains _wsName, so an object is essentially a structure. The NSObject_IVARS member is itself a structure. What does it contain? Let's search:
struct NSObject_IMPL {
Class isa;
};
Conclusion: NSObject_IVARS contains the member variable isa.
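The nesting of NSObject_IMPL inside WSPerson_IMPL can be tried out in plain C. The following is only a sketch under stated assumptions: void * stands in for Class, const char * stands in for NSString *, and person_isa and embedding_demo are hypothetical helper names that do not come from the generated file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C stand-ins for the generated structures. */
struct NSObject_IMPL {
    void *isa; /* stands in for: Class isa */
};

struct WSPerson_IMPL {
    struct NSObject_IMPL NSObject_IVARS; /* the "inherited" part comes first */
    const char *_wsName;                 /* stands in for: NSString *_wsName */
};

/* Returns the isa stored at the very start of a WSPerson_IMPL. */
void *person_isa(const struct WSPerson_IMPL *person) {
    /* A pointer to a struct may be converted to a pointer to its first member. */
    const struct NSObject_IMPL *base = (const struct NSObject_IMPL *)person;
    return base->isa;
}

/* Usage sketch: the embedded NSObject_IMPL sits at offset 0. */
void embedding_demo(void) {
    int fake_class = 0; /* placeholder for a class object */
    struct WSPerson_IMPL p = { { &fake_class }, "Tom" };
    assert(offsetof(struct WSPerson_IMPL, NSObject_IVARS) == 0);
    assert(person_isa(&p) == (void *)&fake_class);
}
```

Because the parent structure is the first member, the isa is always found at the first address of the object, whatever ivars the subclass adds after it.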
- Above WSPerson_IMPL, we noticed an objc_object:
typedef struct objc_object WSPerson;
You can see that WSPerson is declared as an objc_object. We know that in OC classes inherit from NSObject; at the lower level, objects are objc_object structures.
- Now let's look at Class:
typedef struct objc_class *Class;
Class is a pointer to a structure. Below it there is also id:
typedef struct objc_object *id;
id is also a structure pointer, which answers a common question:
Why is there no * in id person? Because id is itself a pointer.
- Let's look at the code generated for wsName:
extern "C" unsigned long int OBJC_IVAR_$_WSPerson$_wsName __attribute__ ((used, section ("__DATA,__objc_ivar"))) = __OFFSETOFIVAR__(struct WSPerson, _wsName);
// getter
static NSString * _I_WSPerson_wsName(WSPerson * self, SEL _cmd) { return (*(NSString **)((char *)self + OBJC_IVAR_$_WSPerson$_wsName)); }
// setter
static void _I_WSPerson_setWsName_(WSPerson * self, SEL _cmd, NSString *wsName) { (*(NSString **)((char *)self + OBJC_IVAR_$_WSPerson$_wsName)) = wsName; }
These two functions are essentially the getter and setter. Both take self and _cmd, which are the hidden arguments of every method.
- So how is the value read?
  - First, (char *)self is the pointer to the WSPerson object, and OBJC_IVAR_$_WSPerson$_wsName is the offset of the ivar.
  - Then the address at that offset is cast to NSString ** and dereferenced to get wsName.
- Illustration:
We take the first address of the WSPerson object and then move by the offset of the property to reach the corresponding property.
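The "first address plus offset" access can be reproduced in plain C. This is a minimal sketch, not the generated code: struct Person, kNameOffset, person_get_name, person_set_name, and offset_demo are hypothetical names, and const char * stands in for NSString *.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C stand-in for WSPerson_IMPL. */
struct Person {
    void *isa;
    const char *name;
};

/* Plays the role of OBJC_IVAR_$_WSPerson$_wsName: the byte offset of the ivar. */
static const size_t kNameOffset = offsetof(struct Person, name);

/* Getter: first address + offset, cast to a pointer to the ivar's type, then dereference. */
const char *person_get_name(struct Person *self) {
    return *(const char **)((char *)self + kNameOffset);
}

/* Setter: the same address computation, used as the assignment target. */
void person_set_name(struct Person *self, const char *name) {
    *(const char **)((char *)self + kNameOffset) = name;
}

/* Usage sketch: the offset-based accessors read and write the same field as p.name. */
void offset_demo(void) {
    struct Person p = { NULL, "old" };
    person_set_name(&p, "Tom");
    assert(strcmp(person_get_name(&p), "Tom") == 0);
    assert(p.name == person_get_name(&p));
}
```

Casting self to char * first matters: pointer arithmetic on char * moves in single bytes, so the offset is applied as a byte count.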
Bit-field and union
Before looking at isa, let's look at bit-fields and unions.
1. Bit-field
Let’s start with an example:
struct struct1 {
BOOL top;
BOOL left;
BOOL bottom;
BOOL right;
}s1;
- According to the memory alignment rules from the previous article, struct1 occupies 4 bytes. But does it really need 4 bytes? A BOOL value is 0 or 1, and a binary bit is also 0 or 1, so storing these flags in 4 bytes wastes a lot of space. In essence, the storage could look like this:
// 4 bytes = 4 * 8 bits = 32 bits
00000000 00000000 00000000 00001111
We only need 4 bits to do the job of struct1. 4 bits is half a byte, and the smallest size a structure can occupy is 1 byte, so how can we make struct1 occupy only 1 byte? Bit-fields can do that.
Bit-field concept: A bit-field divides the binary bits of a byte into several regions and specifies the number of bits in each region. Each field has a name, which allows the program to operate on it by name, so several different values can be packed into the bits of a single byte. Bit-fields are a feature of structures in the C language.
- According to this concept, we need to specify the number of bits for each member variable in the structure above. Let's verify that by defining a structure struct2 like this:
struct struct2 {
BOOL top: 1;
BOOL left: 1;
BOOL bottom: 1;
BOOL right: 1;
}s2;
Then print the size of both:
Summary: Bit-fields can optimize memory use by specifying the number of bits of each member variable.
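The size comparison can be checked with a small C sketch. It assumes that bool can stand in for Objective-C's BOOL, and the helper names (size_without_bitfields, size_with_bitfields, bitfield_demo) are hypothetical. The exact layout of bit-fields is implementation-defined, so the sketch only checks that the bit-field version is smaller; on common compilers the sizes are typically 4 bytes and 1 byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One whole bool per flag. */
struct struct1 {
    bool top;
    bool left;
    bool bottom;
    bool right;
};

/* One bit per flag. */
struct struct2 {
    bool top : 1;
    bool left : 1;
    bool bottom : 1;
    bool right : 1;
};

size_t size_without_bitfields(void) { return sizeof(struct struct1); }
size_t size_with_bitfields(void) { return sizeof(struct struct2); }

/* Usage sketch: each 1-bit member still behaves like an independent flag. */
void bitfield_demo(void) {
    struct struct2 s = { .top = true, .right = true };
    assert(s.top && !s.left && !s.bottom && s.right);
    assert(size_with_bitfields() < size_without_bitfields());
}
```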
2. Union (union)
Union concept: A union is similar to a struct in some respects. Both can contain a variety of data types and variables, but there are obvious differences.
The differences between union and struct:
- struct: all the members of a struct coexist. The advantage is that it is inclusive and comprehensive. The disadvantage is that memory is allocated coarsely: space is reserved for every member whether or not it is used.
- union: the members of a union are mutually exclusive. The disadvantage is that it is less inclusive. The advantage is finer-grained memory use: the members of a union share the same memory, which saves space.
Let’s show this in code:
// struct
struct LGTeacher1 {
char name;
int age;
double weight;
}t1;
// union
union LGTeacher2 {
char name;
double weight;
int age;
}t2;
// Assign values separately
t1.name = 'K';
t1.age = 69;
t1.weight = 180;
t2.name = 'C';
t2.age = 69;
t2.weight = 179.9;
Then we use the p command to print them separately:
You can see that the structure LGTeacher1 is displayed normally, but the union LGTeacher2 shows some anomalies. Let's use p/t (p/t prints binary) to print t2 and take a look:
Let's use a diagram to analyze this union:
- The order of assignment is name, age, weight. As you can see, the weight data overwrites the age and name data. This shows that the last assigned value affects the earlier ones, and it demonstrates the mutually exclusive nature of a union.
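The shared storage can also be observed without a debugger. The sketch below reuses the LGTeacher1 and LGTeacher2 definitions from this section; union_demo is a hypothetical helper name. It only reads the member that was written last, because reading another member would reinterpret the overwritten bytes.

```c
#include <assert.h>
#include <stddef.h>

struct LGTeacher1 {
    char name;
    int age;
    double weight;
};

union LGTeacher2 {
    char name;
    double weight;
    int age;
};

/* Usage sketch: every union member starts at the same address. */
void union_demo(void) {
    union LGTeacher2 t2;

    t2.name = 'C';
    assert(t2.name == 'C');

    t2.age = 69; /* reuses the bytes that held name */
    assert(t2.age == 69);

    t2.weight = 179.9; /* reuses the bytes that held age */
    assert(t2.weight > 179.0 && t2.weight < 180.0);

    assert((void *)&t2.name == (void *)&t2.weight);
    assert((void *)&t2.age == (void *)&t2.weight);

    /* The union is only as large as its largest member (plus padding). */
    assert(sizeof(union LGTeacher2) < sizeof(struct LGTeacher1));
}
```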
isa_t
4. Isa structure
Next, let's analyze the structure of isa. In the previous article on the alloc flow, we analyzed object creation. Its last step, initInstanceIsa, creates an isa of type isa_t. Let's look at its structure:
union isa_t {
isa_t() { } // constructor
isa_t(uintptr_t value) : bits(value) { } // bit-field constructor
uintptr_t bits;
private:
// Accessing the class requires custom ptrauth operations, so
// force clients to go through setClass/getClass by making this
// private.
Class cls;
public:
#if defined(ISA_BITFIELD)
struct {
ISA_BITFIELD; // defined in isa.h
};
bool isDeallocating() {
return extra_rc == 0 && has_sidetable_rc == 0;
}
void setDeallocating() {
extra_rc = 0;
has_sidetable_rc = 0;
}
#endif
void setClass(Class cls, objc_object *obj);
Class getClass(bool authenticated);
Class getDecodedClass(bool authenticated);
};
- So isa_t is a union. Its two member variables, bits and cls, share the same memory, which means they are mutually exclusive. With the first constructor, isa_t() { }, cls has no default value; with the second, isa_t(uintptr_t value) : bits(value) { }, cls takes its value from the bits that were passed in.
- In objc4-818.2, isa is created using the second constructor:
isa_t newisa(0)
isa_t also provides a bit-field to store some information. This member is ISA_BITFIELD, a macro that has separate definitions for __arm64__ and __x86_64__. Here we look at the __x86_64__ definition:
# elif __x86_64__
# define ISA_MASK 0x00007ffffffffff8ULL
# define ISA_MAGIC_MASK 0x001f800000000001ULL
# define ISA_MAGIC_VALUE 0x001d800000000001ULL
# define ISA_HAS_CXX_DTOR_BIT 1
# define ISA_BITFIELD
uintptr_t nonpointer : 1; // Whether pointer optimization is enabled for isa. 0: a pure isa pointer; 1: isa holds not only the address of the class object but also information such as the object's reference count
uintptr_t has_assoc : 1; // Associated-object flag. 0: none; 1: the object has associated objects
uintptr_t has_cxx_dtor : 1; // Whether the object has a C++ or ObjC destructor. If it does, the destructor logic must run; if not, the object can be released faster
uintptr_t shiftcls : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ // Stores the class pointer value when pointer optimization is on: 44 bits here on x86_64, 33 bits on arm64
uintptr_t magic : 6; // Used by the debugger to tell whether the object is a real, initialized object or uninitialized memory
uintptr_t weakly_referenced : 1; // Whether the object is or has been pointed to by an ARC weak variable. Objects without weak references can be released faster
uintptr_t unused : 1; // Unused
uintptr_t has_sidetable_rc : 1; // Set when the reference count is too large to fit in extra_rc; the overflow is then stored in a side table
uintptr_t extra_rc : 8 // Reference count stored inline in isa. When it overflows these 8 bits, has_sidetable_rc above is used
# define RC_ONE (1ULL<<56)
# define RC_HALF (1ULL<<7)
As you can see, there are two kinds of isa:
- nonpointer is 0: a pure isa pointer.
- nonpointer is 1: isa holds not only the class object address but also class information, the object's reference count, and so on.
We use a figure to show the ISA_BITFIELD distribution:
- In the figure, nonpointer is at bit 0, has_assoc at bit 1, has_cxx_dtor at bit 2, shiftcls at bits 3~46, magic at bits 47~52, weakly_referenced at bit 53, unused at bit 54, has_sidetable_rc at bit 55, and extra_rc at bits 56~63.
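The bit positions listed above can be turned into a small decoder. This is a sketch, not runtime code: isa_fields, bits_at, decode_isa, and decode_demo are hypothetical names, and shifts and masks are used instead of a real bit-field because the ordering of bit-fields within a word is implementation-defined. The sample value 0x011d800100008481 is the isa of the LGPerson object printed with lldb in this article.

```c
#include <assert.h>
#include <stdint.h>

/* One decoded value per ISA_BITFIELD member (x86_64 layout). */
struct isa_fields {
    uint64_t nonpointer;
    uint64_t has_assoc;
    uint64_t has_cxx_dtor;
    uint64_t shiftcls;
    uint64_t magic;
    uint64_t weakly_referenced;
    uint64_t unused;
    uint64_t has_sidetable_rc;
    uint64_t extra_rc;
};

/* Extracts `width` bits starting at bit `start` (width must be below 64). */
static uint64_t bits_at(uint64_t value, unsigned start, unsigned width) {
    return (value >> start) & ((UINT64_C(1) << width) - 1);
}

struct isa_fields decode_isa(uint64_t bits) {
    struct isa_fields f;
    f.nonpointer        = bits_at(bits, 0, 1);
    f.has_assoc         = bits_at(bits, 1, 1);
    f.has_cxx_dtor      = bits_at(bits, 2, 1);
    f.shiftcls          = bits_at(bits, 3, 44);
    f.magic             = bits_at(bits, 47, 6);
    f.weakly_referenced = bits_at(bits, 53, 1);
    f.unused            = bits_at(bits, 54, 1);
    f.has_sidetable_rc  = bits_at(bits, 55, 1);
    f.extra_rc          = bits_at(bits, 56, 8);
    return f;
}

/* Usage sketch with the isa value from the lldb session. */
void decode_demo(void) {
    struct isa_fields f = decode_isa(UINT64_C(0x011d800100008481));
    assert(f.nonpointer == 1);
    assert(f.magic == 59);
    assert((f.shiftcls << 3) == UINT64_C(0x0000000100008480));
}
```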
From the figure, it is clear that shiftcls is the core data, so let's analyze shiftcls:
- In initIsa, we set breakpoints where newisa is created and where newisa.bits is assigned, and get:
- In this picture, we see the change before and after the assignment. Because newisa is a union, assigning bits also changes the other members through the shared memory, notably cls, nonpointer, and magic.
- Let's print cls in binary:
In the ISA_BITFIELD bit-field, magic is at bits 47~52, so binary 11 1011 converted to decimal gives 59. By the same token, nonpointer is at bit 0, so nonpointer = 1.
- Then, after newisa.bits is assigned, we enter the setClass method. A breakpoint shows that this branch is executed:
#else // Nonpointer isa, no ptrauth
shiftcls = (uintptr_t)newCls >> 3;
#endif
This step shifts right by 3 bits. Why? Class pointers are 8-byte aligned, so a class address is always a multiple of 8 and the last three bits of its binary form are zeros. Shifting right by 3 drops these always-zero bits, so the class address takes fewer bits inside isa.
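The reason the shift loses no information can be sketched in C. The helper names below (is_8_byte_aligned, pack_class_address, unpack_class_address, shift_demo) are hypothetical, and 0x0000000100008480 is the address that lldb prints for LGPerson.class in this article.

```c
#include <assert.h>
#include <stdint.h>

/* An 8-byte aligned address has its three low bits equal to zero. */
int is_8_byte_aligned(uint64_t address) {
    return (address & 0x7) == 0;
}

/* Drops the three low bits, as the setClass branch above does for shiftcls. */
uint64_t pack_class_address(uint64_t address) {
    return address >> 3;
}

/* Restores the address by putting the three zero bits back. */
uint64_t unpack_class_address(uint64_t shiftcls) {
    return shiftcls << 3;
}

/* Usage sketch: the round trip is lossless for an aligned address. */
void shift_demo(void) {
    uint64_t cls = UINT64_C(0x0000000100008480);
    assert(is_8_byte_aligned(cls));
    assert(pack_class_address(cls) == UINT64_C(536875152));
    assert(unpack_class_address(pack_class_address(cls)) == cls);
}
```

For an address that is not a multiple of 8, the round trip would not return the original value, which is why the alignment guarantee is what makes this packing safe.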
- After the right shift, print newisa again:
(isa_t) $43 = {
bits = 8303516107965569
cls = LGPerson
= {
nonpointer = 1
has_assoc = 0
has_cxx_dtor = 0
shiftcls = 536875152
magic = 59
weakly_referenced = 0
unused = 0
has_sidetable_rc = 0
extra_rc = 0
}
}
Now shiftcls has a value. Is there another way to check that isa is associated with the class? Let's verify.
Verifying that isa is associated with the class:
1. Bitwise AND (&) with the mask ISA_MASK:
# elif __x86_64__
# define ISA_MASK 0x00007ffffffffff8ULL
We get the object's isa first:
(lldb) x/4gx p1
0x100647de0: 0x011d800100008481 0x0000000000000000
0x100647df0: 0x0000000000000000 0x0000000000000000
// p1 is an object of class LGPerson
// 0x011d800100008481 is its isa
Then we print LGPerson.class in hexadecimal and AND the isa with ISA_MASK:
(lldb) p/x LGPerson.class
(Class) $61 = 0x0000000100008480 LGPerson
(lldb) p/x 0x011d800100008481 & 0x00007ffffffffff8ULL
(unsigned long long) $60 = 0x0000000100008480
The result of the bitwise AND is the same hexadecimal value as LGPerson.class, which tells us that isa is associated with LGPerson.
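The same masking can be written as a one-line C function. This is a sketch: class_address_from_isa and mask_demo are hypothetical names, and the ISA_MASK value is the __x86_64__ one quoted in this article.

```c
#include <assert.h>
#include <stdint.h>

/* __x86_64__ mask from isa.h: keeps bits 3~46, where shiftcls lives. */
#define ISA_MASK UINT64_C(0x00007ffffffffff8)

/* Clears the flag bits below and above shiftcls, leaving the class address. */
uint64_t class_address_from_isa(uint64_t isa_bits) {
    return isa_bits & ISA_MASK;
}

/* Usage sketch with the values from the lldb session. */
void mask_demo(void) {
    uint64_t isa = UINT64_C(0x011d800100008481);
    assert(class_address_from_isa(isa) == UINT64_C(0x0000000100008480));
}
```

No shift is needed afterwards, because the mask leaves shiftcls in place and the three low bits it clears are exactly the zeros that were dropped when the class pointer was stored.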
2. Through shift operations (>> and <<):
From the ISA_BITFIELD distribution above, we know exactly where shiftcls sits within the 64 bits, so we can perform the following operations:
- First shift right by 3 bits (>> 3) to clear nonpointer, has_assoc, and has_cxx_dtor:
(lldb) p/x 0x011d800100008481 >> 3
(long) $63 = 0x0023b00020001090
(lldb)
Then shift $63 left by 20 bits (<< 20) to clear magic, weakly_referenced, unused, has_sidetable_rc, and extra_rc:
(lldb) p/x $63 << 20
(long) $65 = 0x0002000109000000
This leaves only shiftcls. Then shift $65 right by 17 bits (>> 17) to move shiftcls back to its original position:
(lldb) p/x $65 >> 17
(long) $66 = 0x0000000100008480
(lldb) p/x LGPerson.class
(Class) $67 = 0x0000000100008480 LGPerson
The whole process is more intuitive with graphical analysis:
The result is the same hexadecimal value as LGPerson.class, which again shows that isa is associated with the LGPerson class.
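The three shifts can be replayed in C on an unsigned 64-bit value. This is a sketch with hypothetical names (class_address_by_shifting, shifting_demo); uint64_t is used so that every shift fills with zeros.

```c
#include <assert.h>
#include <stdint.h>

/* Isolates shiftcls with the same three shifts used in the lldb session. */
uint64_t class_address_by_shifting(uint64_t isa_bits) {
    uint64_t step1 = isa_bits >> 3; /* drop nonpointer, has_assoc, has_cxx_dtor */
    uint64_t step2 = step1 << 20;   /* push the 17 high flag bits out of the word */
    return step2 >> 17;             /* move shiftcls back to bits 3~46 */
}

/* Usage sketch: each intermediate value matches the lldb output. */
void shifting_demo(void) {
    uint64_t isa = UINT64_C(0x011d800100008481);
    assert((isa >> 3) == UINT64_C(0x0023b00020001090));
    assert(((isa >> 3) << 20) == UINT64_C(0x0002000109000000));
    assert(class_address_by_shifting(isa) == UINT64_C(0x0000000100008480));
}
```

The shift counts follow from the layout: 3 low flag bits, then 20 = 3 + 17 to clear the 17 bits above shiftcls, then 17 to bring the field back to its original position.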