In daily development, we often use code like the following to create objects:
Person *p = [[Person alloc] init];
You'll notice that to create an object, you need to call both alloc and init. So today we're going to explore what alloc actually does.
As iOS developers know, alloc allocates memory for an object. But how exactly is that memory allocated? That is the focus of today's exploration. First, download the objc4 source code, preferably the latest version. Second, the downloaded source does not compile out of the box; for that I recommend the article "iOS_OBJC4-756.2 latest source code compiler debugging", which you can follow to build the source yourself.
### Exploring the alloc source
#### 1. Create a Person object
We set a breakpoint on the line that creates the object and step into the alloc function; you'll find the following function call stack:
- “Step one”: the source of the entry function alloc:
+ (id)alloc {
return _objc_rootAlloc(self);
}
- “Step two”: as the alloc source above shows, the function calls into _objc_rootAlloc, so we follow it to take a look:
// Base class implementation of +alloc. cls is not nil.
// Calls [cls allocWithZone:nil].
id
_objc_rootAlloc(Class cls)
{
return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}
- “Step three”: in the same way, we step into the callAlloc function to find out.
static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
    if (slowpath(checkNil && !cls)) return nil;
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        return _objc_rootAllocWithZone(cls, nil);
    }
#endif

    // No shortcuts available.
    if (allocWithZone) {
        return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
    }
    return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}
At this point it gets a little confusing, because the source branches in several directions. This is where breakpoint debugging helps.
Before we start debugging, let's clarify the meaning of a few key checks in callAlloc. 1. hasCustomAWZ. Looking at hasCustomAWZ first, we find that it is implemented like this:
bool hasCustomAWZ() const {
    return !cache.getBit(FAST_CACHE_HAS_DEFAULT_AWZ);
}
Moving on to FAST_CACHE_HAS_DEFAULT_AWZ, we find what we need (note the source code comment):
// class or superclass has default alloc/allocWithZone: implementation
// Note this is is stored in the metaclass.
#define FAST_CACHE_HAS_DEFAULT_AWZ (1<<14)
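cls->ISA()->hasCustomAWZ() checks whether the class or one of its superclasses overrides alloc/allocWithZone: with a custom implementation. The flag is stored on the metaclass, which is why the call goes through ISA(). (Obviously, our Person has no such override at this point, so the check returns false.)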
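The check boils down to testing a single bit in a flags word. Below is a minimal sketch in plain C, assuming a simplified 16-bit flags field; `fake_cache` and `has_custom_awz` are hypothetical stand-ins for illustration, not the runtime's real types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as in the objc4 source quoted above. */
#define FAST_CACHE_HAS_DEFAULT_AWZ (1 << 14)

/* Hypothetical stand-in for the metaclass's cache flags. */
struct fake_cache {
    uint16_t flags;
};

/* True when the "default alloc/allocWithZone:" bit is NOT set,
   i.e. the class (or a superclass) provides its own implementation. */
static bool has_custom_awz(const struct fake_cache *cache) {
    return !(cache->flags & FAST_CACHE_HAS_DEFAULT_AWZ);
}

/* Worked values for the two cases. */
static void has_custom_awz_examples(void) {
    struct fake_cache plain = { .flags = FAST_CACHE_HAS_DEFAULT_AWZ };
    struct fake_cache custom = { .flags = 0 };
    assert(!has_custom_awz(&plain));  /* default alloc: shortcut is allowed */
    assert(has_custom_awz(&custom));  /* overridden: shortcut is skipped */
}
```

A class like Person corresponds to the `plain` case, so callAlloc can skip the message send.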
2. Now that we know what hasCustomAWZ means, let's see what fastpath means.
Following it into the source, we find:
#define fastpath(x) (__builtin_expect(bool(x), 1))
#define slowpath(x) (__builtin_expect(bool(x), 0))
fastpath and slowpath are two macros defined in the objc source. Both expand to __builtin_expect.
__builtin_expect(long exp, long c) is a hint for conditional branch optimization. The first argument is the expression being tested. The second argument is the value that expression is expected to have most of the time; in these two macros it is only ever 1 or 0. A value of 1 indicates that the Boolean expression is true in most cases, and a value of 0 that it is false most of the time. The function returns the value of its first argument.
That is, fastpath(!cls->ISA()->hasCustomAWZ()) tells the compiler that !cls->ISA()->hasCustomAWZ() is true in most cases. So the “fourth step” is to enter _objc_rootAllocWithZone inside the if. (This can also be confirmed by breakpoint debugging.)
3. There is one more check here: __OBJC2__. This macro is not defined anywhere in the objc4 source, so we cannot read its definition there; it is predefined by the compiler when targeting the modern Objective-C 2.0 runtime. Judging from the source comment that accompanies it, __OBJC2__ guards a shortcut (an optimization) that is only available on that runtime.
- “Step four”: _objc_rootAllocWithZone
NEVER_INLINE
id
_objc_rootAllocWithZone(Class cls, malloc_zone_t *zone __unused)
{
    // allocWithZone under __OBJC2__ ignores the zone parameter
    return _class_createInstanceFromZone(cls, 0, nil,
                                         OBJECT_CONSTRUCT_CALL_BADALLOC);
}
- “Step five”: _class_createInstanceFromZone
_class_createInstanceFromZone is the heart of the alloc process.
static ALWAYS_INLINE id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    // The class must already be realized
    ASSERT(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;

    // From _objc_rootAllocWithZone we know that extraBytes == 0
    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    id obj;
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        // Also from _objc_rootAllocWithZone: zone is nil, so the __OBJC2__ case ends up here
        obj = (id)calloc(1, size);
    }
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }

    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (fastpath(!hasCxxCtor)) {
        return obj;
    }

    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}
In _class_createInstanceFromZone, there are three main steps:
- size = cls->instanceSize(extraBytes); calculates how much memory needs to be allocated.
- obj = (id)calloc(1, size); requests the memory.
- obj->initInstanceIsa(cls, hasCxxDtor); associates the class with the isa pointer.
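The three steps can be mimicked in a few lines of plain C. This is only a sketch under simplifying assumptions: `toy_class`, `toy_object`, and `toy_create_instance` are hypothetical names, and the real runtime packs more than a class pointer into isa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified class and object layouts. */
struct toy_class {
    const char *name;
    size_t instance_size;  /* bytes needed by one instance */
};

struct toy_object {
    const struct toy_class *isa;  /* first field, like isa in objc_object */
};

/* Mirrors the three steps: compute the size, calloc, bind isa. */
static struct toy_object *toy_create_instance(const struct toy_class *cls) {
    assert(cls != NULL);

    /* Step 1: compute the size, never smaller than 16 bytes. */
    size_t size = cls->instance_size;
    if (size < 16) size = 16;

    /* Step 2: request zero-filled memory. */
    struct toy_object *obj = calloc(1, size);
    if (!obj) return NULL;

    /* Step 3: associate the new memory with its class. */
    obj->isa = cls;
    return obj;
}
```

The caller owns the returned memory and releases it with free(), which stands in for the runtime's own deallocation path.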
- “Step six”: object_cxxConstructFromClass
The source of this function is also easy to follow because the official comments are clear, so I only highlight the key point here rather than show the full source. The return value of object_cxxConstructFromClass tells you the outcome: if it is self, construction succeeded; if it is nil, construction failed.
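#### Alloc flow chart
Through the exploration above, we can summarize the call flow of the alloc function as follows: alloc → _objc_rootAlloc → callAlloc → _objc_rootAllocWithZone → _class_createInstanceFromZone (instanceSize, calloc, initInstanceIsa).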
Having explored the alloc process above, a few points deserve further explanation:
1: Byte alignment && memory alignment
This concerns size = cls->instanceSize(extraBytes); in _class_createInstanceFromZone. Let's step into instanceSize:
```
inline size_t instanceSize(size_t extraBytes) const {
    if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
        return cache.fastInstanceSize(extraBytes);
    }

    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}
```
When no fast instance size is cached, alignedInstanceSize() is called, so we follow it:
```
// Class's ivar size rounded up to a pointer-size boundary.
uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}
```
UnalignedInstanceSize () returns the size of the instancesize variable, depending on the size of the ivars variable:
``` // May be unaligned depending on class's ivars. uint32_t unalignedInstanceSize() const { ASSERT(isRealized()); return data()->ro()->instanceSize; } ` ` `Copy the code
At this point our Person has no custom member variables, yet the return value is 8. This is because NSObject itself contains an isa of type Class. Class is a pointer to a struct (struct objc_class *), which takes 8 bytes on a 64-bit system, and objc_class in turn inherits from objc_object.
- Byte alignment is 8-byte alignment
Stepping into the word_align function makes this clear. Note that WORD_MASK is 7; by this algorithm, the return value of word_align is always a multiple of 8.
Take x = 8 and WORD_MASK = 7. The function computes (x + WORD_MASK) & ~WORD_MASK, that is (8 + 7) & ~7, which is 15 & ~7. The binary of 15 is 0000 1111. The binary of 7 is 0000 0111, so ~7 is 1111 1000. So 15 & ~7 is 0000 1111 & 1111 1000 = 0000 1000 == 8.
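The same arithmetic can be checked mechanically. Here is a minimal C sketch, assuming word_align is the usual round-up-to-mask form described above and that WORD_MASK is 7 (the 64-bit value); `word_align_examples` is a hypothetical helper added only to record worked values.

```c
#include <assert.h>
#include <stddef.h>

/* 7 on a 64-bit target, matching the value noted above (assumption). */
#define WORD_MASK 7UL

/* Round x up to the next multiple of 8. */
static inline size_t word_align(size_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

/* Worked values, including the x = 8 case from the text. */
static void word_align_examples(void) {
    assert(word_align(0) == 0);
    assert(word_align(8) == 8);    /* (8 + 7) & ~7 = 15 & ~7 = 8 */
    assert(word_align(9) == 16);   /* (9 + 7) & ~7 = 16 & ~7 = 16 */
    assert(word_align(13) == 16);  /* (13 + 7) & ~7 = 20 & ~7 = 16 */
}
```

Adding the mask first pushes any value that is not already a multiple of 8 past the next boundary, and clearing the low three bits then snaps it down onto that boundary.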
- Memory alignment is 16 bytes aligned
In instanceSize, if there is a cache, the fastInstanceSize function is called:
size_t fastInstanceSize(size_t extra) const
{
    ASSERT(hasFastInstanceSize(extra));

    if (__builtin_constant_p(extra) && extra == 0) {
        return _flags & FAST_CACHE_ALLOC_MASK16;
    } else {
        size_t size = _flags & FAST_CACHE_ALLOC_MASK;
        // remove the FAST_CACHE_ALLOC_DELTA16 that was added
        // by setFastInstanceSize
        return align16(size + extra - FAST_CACHE_ALLOC_DELTA16);
    }
}
As you can see, the function ends with a call to align16:
static inline size_t align16(size_t x) {
return (x + size_t(15)) & ~size_t(15);
}
This is not hard to understand: the algorithm is the same as the byte-alignment one above, except that the return value is guaranteed to be a multiple of 16.
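To see how the two roundings differ, the sketch below puts them side by side in plain C. align16 follows the source shown above; the word_align body and the mask value 7 are assumptions based on the description of 8-byte alignment, and `alignment_examples` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 8-byte rounding, as described for word_align. */
static inline size_t word_align(size_t x) {
    return (x + (size_t)7) & ~(size_t)7;
}

/* 16-byte rounding, same shape as align16 in the source. */
static inline size_t align16(size_t x) {
    return (x + (size_t)15) & ~(size_t)15;
}

/* The same raw sizes rounded both ways. */
static void alignment_examples(void) {
    assert(word_align(8) == 8);
    assert(align16(8) == 16);    /* (8 + 15) & ~15 = 23 & ~15 = 16 */
    assert(word_align(24) == 24);
    assert(align16(24) == 32);   /* (24 + 15) & ~15 = 39 & ~15 = 32 */
    assert(align16(40) == 48);   /* (40 + 15) & ~15 = 55 & ~15 = 48 */
}
```

A raw size of 24 bytes is already 8-byte aligned, but still grows to 32 under 16-byte alignment.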
2: __builtin_expect(long exp, long c)
This function was briefly introduced above and is covered in more detail here (see article: Built-in functions in the LLVM compiler). Its main purpose is to help with conditional branch optimization. It takes two arguments: the first is the expression being tested, and the second is the value that expression is expected to have most of the time. In the macros used here, the second argument is only ever 1 or 0.
A value of 1 indicates that the Boolean expression is true most of the time, while a value of 0 indicates that it is false most of the time.
The return value of the function is the value of the first argument.
While executing an instruction, the CPU's pipeline lets it fetch the next instruction in advance, which improves CPU utilization. When executing a branch instruction, the CPU also prefetches the instruction that follows, but if the branch jumps elsewhere, the prefetched instruction is wasted and pipeline efficiency drops. __builtin_expect lets the compiler arrange the compiled instructions so that the likely path runs as sequentially as possible, which improves the hit rate of instruction prefetching. For example:
if (__builtin_expect (x, 0))
foo();
This means that x is expected to be false most of the time, so foo() is unlikely to be executed. When compiling this code, the compiler therefore does not place the instructions for the foo() call immediately after the if's conditional jump. Another example:
if (__builtin_expect (x, 1))
foo();
Here x is expected to be true most of the time, so foo() is likely to be executed. The compiler therefore places the instructions for the foo() call immediately after the if's conditional jump.
To simplify usage, the objc source defines two macros, fastpath and slowpath, to apply this branch optimization:
#define fastpath(x) (__builtin_expect(bool(x), 1))
#define slowpath(x) (__builtin_expect(bool(x), 0))
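The macros can be tried outside the runtime as well. The following is a small C sketch, assuming a GCC or Clang compiler (both provide __builtin_expect); the macros are respelled with `!!(x)` because plain C has no `bool(x)` cast, and `sum_non_negative` is a hypothetical function used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* C spelling of the two macros; !!(x) normalizes x to 0 or 1. */
#define fastpath(x) (__builtin_expect(!!(x), 1))
#define slowpath(x) (__builtin_expect(!!(x), 0))

/* Sums the non-negative entries; negatives are assumed to be rare,
   so that branch is marked as the slow path. */
static long sum_non_negative(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (slowpath(values[i] < 0)) {
            continue;  /* rare path */
        }
        total += values[i];
    }
    return total;
}
```

The hint never changes the result of the condition; it only influences how the compiler lays out the two branches.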