In the last article, we looked at the method execution flow inside alloc. (PS: this article assumes a 64-bit system; please work out all type sizes in bytes on that basis.)
Materials to prepare:
LLVM source code: github.com/apple/llvm-…
Memory alignment principle
ASCII code table
Let’s start with assembly debugging. We can see that the first method executed inside alloc is the objc_alloc method (Figure 01, Figure 02).
Next, search the source globally for the objc_alloc and alloc methods and set a breakpoint in each (Figures 03 and 04).
To avoid noise, turn off assembly debugging and disable the two breakpoints in alloc and objc_alloc for now. Once execution stops on the line LGPerson *p = [LGPerson alloc]; re-enable the two breakpoints. Stepping through breakpoint by breakpoint then gives a call flow like this:
The flow chart is as follows:
alloc --> objc_alloc --> callAlloc --> objc_msgSend(alloc) --> _objc_rootAlloc --> callAlloc --> _objc_rootAllocWithZone --> _class_createInstanceFromZone --> returns the result
We can see that alloc is entered twice. This is because the compiler special-cases the alloc message. If you're interested, download the LLVM source and search globally for “GeneratePossiblySpecializedMessageSend”; you'll find a piece of code like this:
CodeGen::RValue CGObjCRuntime::GeneratePossiblySpecializedMessageSend(
    CodeGenFunction &CGF, ReturnValueSlot Return, QualType ResultType,
    Selector Sel, llvm::Value *Receiver, const CallArgList &Args,
    const ObjCInterfaceDecl *OID, const ObjCMethodDecl *Method,
    bool isClassMessage) {
  if (Optional<llvm::Value *> SpecializedResult =
          tryGenerateSpecializedMessageSend(CGF, ResultType, Receiver, Args,
                                            Sel, Method, isClassMessage)) {
    return RValue::get(SpecializedResult.getValue());
  }
  return GenerateMessageSend(CGF, Return, ResultType, Sel, Receiver, Args, OID,
                             Method);
}
There are two paths here: 1. tryGenerateSpecializedMessageSend (the specialized path) and 2. GenerateMessageSend (the ordinary path). When the if condition holds, the specialized path is taken; otherwise an ordinary message send is generated. For [LGPerson alloc] the specialized path emits a call to objc_alloc. Inside the runtime, callAlloc then sends a real alloc message through objc_msgSend, and that ordinary send reaches alloc, _objc_rootAlloc, and callAlloc a second time. That's why some special methods, like alloc, go through twice.
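The double entry can be pictured with a small toy model in C. Everything in it is a hypothetical stand-in: the function names only mirror the traced call flow, and the `class_initialized` flag is an assumption about why the first pass through callAlloc falls back to a real message send. None of this is the runtime's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Steps of the traced call flow, recorded in the order they are entered. */
enum Step {
    OBJC_ALLOC, CALL_ALLOC, MSG_SEND_ALLOC,
    ROOT_ALLOC, ROOT_ALLOC_WITH_ZONE, CREATE_INSTANCE
};

#define MAX_STEPS 16
static enum Step trace[MAX_STEPS];
static size_t trace_len = 0;

/* Assumption: the fast path is unavailable until a real message was sent. */
static bool class_initialized = false;

static void record(enum Step s) {
    assert(trace_len < MAX_STEPS);
    trace[trace_len++] = s;
}

static void toy_call_alloc(void);

static void toy_create_instance(void) { record(CREATE_INSTANCE); }

static void toy_root_alloc_with_zone(void) {
    record(ROOT_ALLOC_WITH_ZONE);
    toy_create_instance();
}

static void toy_root_alloc(void) {
    record(ROOT_ALLOC);
    toy_call_alloc();               /* second entry into callAlloc */
}

static void toy_msg_send_alloc(void) {
    record(MSG_SEND_ALLOC);
    class_initialized = true;       /* the ordinary send prepares the class */
    toy_root_alloc();
}

static void toy_call_alloc(void) {
    record(CALL_ALLOC);
    if (class_initialized) {
        toy_root_alloc_with_zone(); /* fast path: allocate directly */
    } else {
        toy_msg_send_alloc();       /* slow path: send a real alloc message */
    }
}

/* Entry point the compiler is modeled as emitting for [LGPerson alloc]. */
static void toy_objc_alloc(void) {
    record(OBJC_ALLOC);
    toy_call_alloc();
}
```

Calling `toy_objc_alloc()` once records `CALL_ALLOC` twice, in the same order as the flow chart.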
Jump into the tryGenerateSpecializedMessageSend method and you'll see a piece of code like this:
static Optional<llvm::Value *>
tryGenerateSpecializedMessageSend(CodeGenFunction &CGF, QualType ResultType,
                                  llvm::Value *Receiver,
                                  const CallArgList &Args, Selector Sel,
                                  const ObjCMethodDecl *method,
                                  bool isClassMessage) {
  auto &CGM = CGF.CGM;
  // Specialization only applies when this codegen option is enabled.
  if (!CGM.getCodeGenOpts().ObjCConvertMessagesToRuntimeCalls)
    return None;
  auto &Runtime = CGM.getLangOpts().ObjCRuntime;
  switch (Sel.getMethodFamily()) {
  case OMF_alloc:
    if (isClassMessage && Runtime.shouldUseRuntimeFunctionsForAlloc() &&
        ResultType->isObjCObjectPointerType()) {
      // A plain alloc with no arguments is emitted as a call to objc_alloc.
      if (Sel.isUnarySelector() && Sel.getNameForSlot(0) == "alloc")
        return CGF.EmitObjCAlloc(Receiver, CGF.ConvertType(ResultType));
      // ...
    }
    break;
  case OMF_autorelease:
    // ...
    break;
  case OMF_retain:
    // ...
    break;
  case OMF_release:
    // ...
    break;
  default:
    break;
  }
  return None;
}
Jump into the EmitObjCAlloc method:
/// Allocate the given objc object.
///   call i8* \@objc_alloc(i8* %value)
llvm::Value *CodeGenFunction::EmitObjCAlloc(llvm::Value *value,
                                            llvm::Type *resultType) {
  return emitObjCValueOperation(*this, value, resultType,
                                CGM.getObjCEntrypoints().objc_alloc,
                                "objc_alloc");
}
From this, we know that LLVM optimizes alloc at compile time: instead of generating an ordinary message send for SEL(alloc), it emits a direct call to the runtime function objc_alloc.
Memory alignment principle
The size in bytes of each type is as follows:
There are three ways to print memory size: sizeof, class_getInstanceSize, and malloc_size.
sizeof: the memory size occupied by a type. Its argument can be a basic data type, an object, or a pointer. If we pass in an NSObject object, what we actually pass is a pointer to a struct, so it takes up 8 bytes.
class_getInstanceSize: essentially the total size of memory used by the member variables, following the 8-byte alignment principle. If the object inherits from NSObject and has no custom properties, the size is 8. If there are custom properties, you can calculate the size from the table of type sizes in bytes.
malloc_size: the actual space allocated by the system, 16-byte aligned.
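The sizeof behavior described above can be checked in plain C. This is a minimal sketch, assuming a 64-bit platform; `ToyObject` and `print_sizes` are hypothetical names used only for illustration, and the struct merely stands in for an object with an isa pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an object: an isa-like pointer plus two fields. */
struct ToyObject {
    void *isa;
    double height;
    long long count;
};

/* Print the size of some basic types, then of a pointer versus its pointee. */
static void print_sizes(void) {
    struct ToyObject obj = {0};
    struct ToyObject *ptr = &obj;

    printf("char=%zu short=%zu int=%zu double=%zu\n",
           sizeof(char), sizeof(short), sizeof(int), sizeof(double));

    /* sizeof on the pointer measures the pointer, not what it points to. */
    printf("pointer=%zu pointee=%zu\n", sizeof ptr, sizeof *ptr);
    assert(sizeof ptr == sizeof(void *));
}
```

Calling `print_sizes()` from `main` shows that the pointer stays pointer-sized no matter how many fields the struct behind it has.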
Next, let's look at the factors that affect the size of an object's memory
The LGPerson class looks like this:
@interface LGPerson : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) NSString *nickName;
@property (nonatomic, copy) NSString *hobby;
@property (nonatomic, assign) int age;
@property (nonatomic) double height;
@property (nonatomic) char c1;
@property (nonatomic) char c2;
+ (void)sayNB;
@end
Then we print out the actual size of an LGPerson instance in memory.
I've already listed the size in bytes of each type for you, so you can look at the table and calculate the size.
The question is, 8 + 8 + 8 + 4 + 8 + 1 + 1 = 38, so why does the console print 48? Because the sum is missing isa, which takes another 8 bytes. Then another student asks: even so, 38 + 8 = 46, which still isn't 48. Are you taking advantage of my bad math?
At this point, we have to consider one more thing: memory alignment.
For properties, alignment is 8 bytes.
For an object, alignment is 16 bytes.
class_getInstanceSize essentially prints the sum of the sizes of all the properties of the instance object, aligned to 8 bytes. The 46 bytes are therefore rounded up to the next multiple of 8, which is 48.
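The rounding can be written as a small C helper. This is a sketch of the arithmetic only: `align_up`, `toy_instance_size`, and `toy_alloc_size` are hypothetical names, not the runtime's functions, and the 8-byte and 16-byte boundaries are the ones quoted in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Summed field sizes rounded to the 8-byte boundary used for properties. */
static size_t toy_instance_size(size_t raw_bytes) {
    return align_up(raw_bytes, 8);
}

/* Instance size rounded to the 16-byte boundary used for objects. */
static size_t toy_alloc_size(size_t instance_bytes) {
    return align_up(instance_bytes, 16);
}
```

With the numbers from the text, `toy_instance_size(46)` gives 48, and an 8-byte object such as a bare NSObject becomes 16 after `toy_alloc_size(8)`.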
Besides properties, do member variables, instance methods, and class methods affect the memory size of an object? Let's test them all:
1. Add member variables
2. Add a class method and an instance method
3. Add a protocol
4. Add a block
Therefore, we can conclude that the memory size of an object is affected by properties, member variables, delegates, and blocks, while class methods and instance methods do not change it.
Next we assign values to the properties of person, an instance of LGPerson:
In the console, print the output of x/8gx person:
Next, let's look at memory alignment inside a struct
[Figure: Memory alignment principle]
struct LGStruct1 {
    double a;   // 8 bytes  [0 1 2 3 4 5 6 7]
    char b;     // 1 byte   [8]
    int c;      // 4 bytes  (9 10 11) [12 13 14 15]
    short d;    // 2 bytes  [16 17], 18 bytes used, rounded up to 24
} struct1;

struct LGStruct2 {
    double a;   // 8 bytes  [0 ... 7]
    int b;      // 4 bytes  [8 9 10 11]
    char c;     // 1 byte   [12]
    short d;    // 2 bytes  (13) [14 15], total 16
} struct2;

struct LGStruct3 {
    double a;   // 8 bytes  [0 1 2 3 4 5 6 7]
    int b;      // 4 bytes  [8 9 10 11]
    char c;     // 1 byte   [12]
    short d;    // 2 bytes  (13) [14 15]
    int e;      // 4 bytes  [16 17 18 19]
    struct LGStruct1 str;   // 24 bytes (20 21 22 23) [24 ... 47]
} struct3;
So let's take struct1 as an example:
- According to rule 1, the first member is placed at offset 0. It is a double of 8 bytes, so it occupies [0, 7];
- The next free position is 8. According to rule 1, the starting position must be an integer multiple of the size of the current member. b is a char of 1 byte and 8 is a multiple of 1, so it occupies [8];
- The next free position is 9. c is an int of 4 bytes and 9 is not a multiple of 4, so keep looking: 10 and 11 fail too, and 12 works, so c occupies [12, 15];
- The next free position is 16. d is a short of 2 bytes and 16 is a multiple of 2, so it occupies [16, 17];
- The members end at position 17, so 18 bytes are used. Rule 3 requires the total size to be an integer multiple of the largest internal member (8 bytes here), and 18 is not, so the size is rounded up to 24.
(To explain: [0,7] refers to the eight locations between 0 and 7.)
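The walk-through above can be turned into a short C function. This is a sketch of rules 1 and 3 only, assuming every member is a primitive whose alignment equals its size; `toy_layout` is a hypothetical name and nested structs are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Compute member offsets for a struct made of primitive members.
   sizes[i] is the size in bytes of member i. The function fills offsets[i]
   and returns the total size of the struct. */
static size_t toy_layout(const size_t *sizes, size_t count, size_t *offsets) {
    size_t offset = 0;
    size_t largest = 1;

    for (size_t i = 0; i < count; i++) {
        assert(sizes[i] > 0);
        /* Rule 1: advance until the offset is a multiple of the member size. */
        while (offset % sizes[i] != 0) {
            offset++;
        }
        offsets[i] = offset;
        offset += sizes[i];
        if (sizes[i] > largest) {
            largest = sizes[i];
        }
    }

    /* Rule 3: round the total up to a multiple of the largest member. */
    while (offset % largest != 0) {
        offset++;
    }
    return offset;
}
```

Passing the member sizes of struct1 (8, 1, 4, 2) reproduces the offsets 0, 8, 12, 16 and the total of 24 derived by hand.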
Next, run the code to verify the inferred results:
As shown in the figure above, our calculations are correct.
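A verification along these lines might look like the following C sketch, which asks the compiler for the sizes and offsets directly. `print_struct_layout` is a hypothetical helper name, and the values in the comments assume a typical 64-bit platform where double is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct LGStruct1 { double a; char b; int c; short d; };
struct LGStruct2 { double a; int b; char c; short d; };
struct LGStruct3 {
    double a; int b; char c; short d; int e;
    struct LGStruct1 str;
};

/* Print the size of each struct and the offsets discussed in the text. */
static void print_struct_layout(void) {
    printf("LGStruct1: size=%zu b=%zu c=%zu d=%zu\n",
           sizeof(struct LGStruct1),
           offsetof(struct LGStruct1, b),
           offsetof(struct LGStruct1, c),
           offsetof(struct LGStruct1, d));
    printf("LGStruct2: size=%zu c=%zu d=%zu\n",
           sizeof(struct LGStruct2),
           offsetof(struct LGStruct2, c),
           offsetof(struct LGStruct2, d));
    printf("LGStruct3: size=%zu e=%zu str=%zu\n",
           sizeof(struct LGStruct3),
           offsetof(struct LGStruct3, e),
           offsetof(struct LGStruct3, str));

    /* Rule 3: each total is a multiple of the largest member (the double). */
    assert(sizeof(struct LGStruct1) % sizeof(double) == 0);
    assert(sizeof(struct LGStruct3) % sizeof(double) == 0);
}
```

Swapping the order of `b`, `c`, and `d` is what shrinks LGStruct2 relative to LGStruct1, even though both hold the same four members.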