1. Start with code you’re most familiar with

Apart from NSLog(@"Hello world!"), the first code most iOS developers write is [[NSObject alloc] init], which is probably the code we’re most familiar with. But do you really understand alloc and init?

Consider the output of the following code:

Person *p1 = [[Person alloc] init];
Person *p2 = [p1 init];
Person *p3 = [p1 init];

NSLog(@"%@ - %p - %p", p1, p1, &p1);
NSLog(@"%@ - %p - %p", p2, p2, &p2);
NSLog(@"%@ - %p - %p", p3, p3, &p3);

Console output:

<Person: 0x6000003000f0> - 0x6000003000f0 - 0x7ffeee282028
<Person: 0x6000003000f0> - 0x6000003000f0 - 0x7ffeee282020
<Person: 0x6000003000f0> - 0x6000003000f0 - 0x7ffeee282018

Why do p1, p2 and p3 all point to the same object, even though the three pointer variables themselves live at different addresses? init doesn’t seem to do anything. What exactly do alloc and init do? And what can we do when we hold down the Control key and click through to alloc, only to find no source code?

With these questions in mind, we begin today’s journey of discovery.

2. Three approaches to exploration

Symbol breakpoint debugging

Where should the breakpoint go, and on which symbol? Since we don’t yet know which function to break on, we start with the one symbol we do know, alloc, and add a breakpoint of type Symbolic Breakpoint.

Hold down Control and click Step Into at the breakpoint, and you end up here:

objc_alloc is the next symbol we need to set a breakpoint on.

Repeat these steps: each time a new function appears in the flow, add a symbolic breakpoint for it and continue exploring.

Assembly tracking

Assembly tracing is the most direct way. The assembly is somewhat obscure, but the key method calls are visible. Set a breakpoint on the [[Person alloc] init]; line, then enable Debug -> Debug Workflow -> Always Show Disassembly:

You can see that objc_alloc is the next function to execute.

Run the source code

Both symbol breakpoint debugging and assembly tracing are fairly tedious, so is there a better way? Many developers think Apple is entirely closed source, but in reality Apple has been gradually opening up some of its source code.

What we need is the source code for objc4:

A version that can be built and run directly is available on GitHub: objc4

By running the source code and tracing the call flow, we no longer have to guess our way through the low-level exploration.

3. Main line call flow of alloc

Following the call to alloc, we can find these core methods in the source code:

1, alloc

+ (id)alloc {
    return _objc_rootAlloc(self);
}

2, _objc_rootAlloc

id _objc_rootAlloc(Class cls)
{
    return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}

3, callAlloc

static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
    if (slowpath(checkNil && !cls)) return nil;
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        return _objc_rootAllocWithZone(cls, nil);
    }
#endif

    // No shortcuts available.
    if (allocWithZone) {
        return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
    }
    return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}

4, _objc_rootAllocWithZone

NEVER_INLINE
id _objc_rootAllocWithZone(Class cls, malloc_zone_t *zone __unused)
{
    // allocWithZone under __OBJC2__ ignores the zone parameter
    return _class_createInstanceFromZone(cls, 0, nil,
                                         OBJECT_CONSTRUCT_CALL_BADALLOC);
}

5, _class_createInstanceFromZone

static ALWAYS_INLINE id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    ASSERT(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;

    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    // create memory space
    id obj;
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        obj = (id)calloc(1, size);
    }
    // At this point obj is not tied to the class, just a simple memory region
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }

    // obj and class binding
    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (fastpath(!hasCxxCtor)) {
        return obj;
    }

    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}
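The function above boils down to three steps: compute the size, allocate zeroed memory, and bind the memory to the class through isa. The following is only a minimal sketch of that idea in plain C, not the runtime's implementation. The names toy_class, toy_object and toy_create_instance are hypothetical stand-ins, and the isa bit-packing, the zone parameter and C++ constructors are all left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the runtime's class and object types. */
typedef struct toy_class {
    const char *name;
    size_t instance_size; /* bytes needed per instance, already aligned */
} toy_class;

typedef struct toy_object {
    const toy_class *isa; /* first field points back to the class */
} toy_object;

/* Mirrors the three steps: compute size, allocate zeroed memory, bind isa. */
toy_object *toy_create_instance(const toy_class *cls, size_t extra_bytes) {
    size_t size = cls->instance_size + extra_bytes;      /* step 1: size */
    if (size < sizeof(toy_object)) size = sizeof(toy_object);

    toy_object *obj = calloc(1, size);                    /* step 2: memory */
    if (obj == NULL) return NULL;                         /* allocation failed */

    obj->isa = cls;                                       /* step 3: bind */
    return obj;
}

/* Small usage example; call it from main to try the sketch. */
void toy_create_instance_example(void) {
    toy_class person = { "Person", 16 };
    toy_object *p = toy_create_instance(&person, 0);
    assert(p != NULL && p->isa == &person);
    printf("created a %s at %p\n", p->isa->name, (void *)p);
    free(p);
}
```

Until step 3 runs, the block returned by calloc is just zeroed memory; writing isa is what makes it an instance of a particular class.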

Note that although the source code suggests the call order shown above, the actual call order is more complex. The reason is that Apple hooks some of these methods to add functionality of its own, such as instrumentation and statistics.

Focus on the fixupMessageRef method:

You can see that when the sel of msg is alloc, its imp is replaced with objc_alloc. That explains why we call alloc, yet what we see in the assembly is objc_alloc.
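To make the idea of swapping an implementation concrete, here is a toy model in plain C. It is a sketch of the general technique only and does not reproduce the runtime's fixupMessageRef. Every name in it (toy_message_ref, toy_fixup_message_ref, toy_msg_send, toy_objc_alloc) is hypothetical, and selectors are modelled as plain strings.

```c
#include <assert.h>
#include <string.h>

/* A toy "implementation" just reports which function was reached. */
typedef const char *(*toy_imp)(void);

const char *toy_msg_send(void)   { return "objc_msgSend"; }
const char *toy_objc_alloc(void) { return "objc_alloc"; }

/* Hypothetical message reference: a selector name plus the function to call. */
typedef struct toy_message_ref {
    toy_imp imp;
    const char *sel;
} toy_message_ref;

/* If the selector is "alloc", redirect the call to the dedicated function. */
void toy_fixup_message_ref(toy_message_ref *msg) {
    if (strcmp(msg->sel, "alloc") == 0) {
        msg->imp = toy_objc_alloc;
    }
}

/* Small usage example; call it from main to try the sketch. */
void toy_fixup_example(void) {
    toy_message_ref msg = { toy_msg_send, "alloc" };
    toy_fixup_message_ref(&msg);
    assert(strcmp(msg.imp(), "objc_alloc") == 0);
}
```

The caller still writes the same selector; only the function pointer behind it changes, which is why the source says alloc while the disassembly shows a different symbol.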

4. Object size calculation

A key step in _class_createInstanceFromZone is size = cls->instanceSize(extraBytes);, which calculates how much memory an instance of the class needs.

1, instanceSize

Returns how much memory an instance of the current class requires.

inline size_t instanceSize(size_t extraBytes) const {
    if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
        return cache.fastInstanceSize(extraBytes);
    }

    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}

The line if (size < 16) size = 16; sets a floor on the allocation: every object gets at least 16 bytes.
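The slow path of instanceSize can be sketched in plain C to see what sizes it produces. This is an illustration under assumptions, not the runtime code: sketch_word_align and sketch_instance_size are hypothetical names, the cached fast path is ignored, and the unaligned size is passed in as a parameter instead of being read from the class.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of 8, as word_align does with a mask of 7. */
size_t sketch_word_align(size_t x) {
    return (x + 7) & ~(size_t)7;
}

/* Slow path only: aligned ivar size plus extra bytes, with a 16-byte floor. */
size_t sketch_instance_size(size_t unaligned_size, size_t extra_bytes) {
    size_t size = sketch_word_align(unaligned_size) + extra_bytes;
    if (size < 16) size = 16;
    return size;
}

/* Small usage example; call it from main to try the sketch. */
void sketch_instance_size_examples(void) {
    assert(sketch_instance_size(8, 0) == 16);   /* isa only: 8, raised to 16 */
    assert(sketch_instance_size(12, 0) == 16);  /* 12 aligns to 16 */
    assert(sketch_instance_size(20, 0) == 24);  /* 20 aligns to 24 */
}
```

The first case is the interesting one: an object holding nothing but an 8-byte isa still ends up at 16 bytes because of the floor.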

2, alignedInstanceSize

Aligns the instance size to 8 bytes.

// Class's ivar size rounded up to a pointer-size boundary.
uint32_t alignedInstanceSize() const {
    return word_align(unalignedInstanceSize());
}

The comment is clear: the ivar size of the class is rounded up to a pointer-size boundary. Because every object contains an isa pointer, the smallest possible ivar size is 8 bytes. unalignedInstanceSize returns the size of the instance variables in bytes before alignment.

3, word_align

The implementation of 8-byte alignment.

static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
#   define WORD_MASK 7UL

This rounds x up to a multiple of 8: an input of 7 yields 8, and an input of 12 yields 16.

5. Byte alignment algorithm

Let’s take a closer look at this formula: (x + WORD_MASK) & ~WORD_MASK.

With x = 7 and WORD_MASK = 7, it becomes:

(7 + 7) & ~7

That is, 14 & ~7

In binary:

  0000 1110    (14)
& 1111 1000    (~7)
-----------
  0000 1000    (= 8)

So we get 8. The key is ~7 (1111 1000): ANDing any number with ~7 clears its lowest three bits, leaving a multiple of 8. Adding 7 first makes the result round up instead of down. Similarly, if we later need to align x to 16 bytes, we just need (x + 15) & ~15.
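The same formula generalizes to any alignment that is a power of two, because only then is alignment - 1 a run of low bits that can be masked off. A small sketch in C, where align_up is a hypothetical helper name and not a function from objc4:

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of alignment; alignment must be a power of two. */
size_t align_up(size_t x, size_t alignment) {
    size_t mask = alignment - 1;   /* 8 -> 7 (0111), 16 -> 15 (1111) */
    return (x + mask) & ~mask;     /* add the mask, then clear the low bits */
}

/* Small usage example; call it from main to try the sketch. */
void align_up_examples(void) {
    assert(align_up(7, 8) == 8);     /* (7 + 7) & ~7 */
    assert(align_up(12, 8) == 16);
    assert(align_up(16, 8) == 16);   /* already aligned values are unchanged */
    assert(align_up(17, 16) == 32);  /* (17 + 15) & ~15 */
    assert(align_up(40, 16) == 48);
}
```

This is also why a multiple of seven would not work with this trick: 7 is not a power of two, so there is no bit mask that rounds to it.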

In addition to (x + 7) & ~7, there is another way to write 8-byte alignment:

(x + 7) >> 3 << 3

The principle is the same: shifting right by 3 and then left by 3 discards the lowest three bits.
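A quick way to convince yourself that the two forms agree is to compare them over a range of inputs. The sketch below does that in C; the function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Mask form: add 7, then clear the lowest three bits. */
size_t align8_mask(size_t x)  { return (x + 7) & ~(size_t)7; }

/* Shift form: add 7, drop the lowest three bits, then shift back. */
size_t align8_shift(size_t x) { return (x + 7) >> 3 << 3; }

/* Returns 1 if both forms give the same result for every x in [0, limit]. */
int align8_forms_agree(size_t limit) {
    for (size_t x = 0; x <= limit; x++) {
        if (align8_mask(x) != align8_shift(x)) return 0;
    }
    return 1;
}

/* Small usage example; call it from main to try the sketch. */
void align8_examples(void) {
    assert(align8_shift(12) == 16);
    assert(align8_forms_agree(4096));
}
```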

Why are they aligned in multiples of eight?

Some of you might wonder: why align to multiples of 8 rather than 16 or 32? Or why not multiples of seven?

This problem can be broken down into two small questions:

1. Why align? Memory access is expensive. If the computer had to work out the size of each data type before every read, access would be even slower. Instead, imagine memory divided into small fixed-size cells, with data of different sizes placed into those cells: the computer can then read one cell at a time without considering the size of the specific data type. Byte alignment means placing data into these fixed-size cells. Some cells waste a little space, but reads become much faster, which is a typical trade of space for time.

2. Why 8-byte alignment instead of 16-byte alignment? On arm64, the basic data types are char (1 byte), int (4 bytes), long long (8 bytes), and so on. None of them is larger than 8 bytes, and the most frequently used type, the pointer, is exactly 8 bytes. 8-byte alignment is therefore enough to cover all the basic data types, and there is no need to waste extra memory by aligning to multiples of 16 or 32.
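The space-for-time trade is easy to observe in an ordinary C struct, where the compiler inserts padding so that each member starts at an address suited to its type. The struct below is a made-up example (struct mixed and print_mixed_layout are hypothetical names); the exact numbers depend on the platform and compiler, so the function prints them instead of assuming them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Members deliberately ordered so that padding is needed between them. */
struct mixed {
    char      flag;   /* 1 byte */
    long long total;  /* 8 bytes */
    int       count;  /* 4 bytes */
};

/* Print where each member actually lives and how large the struct is. */
void print_mixed_layout(void) {
    printf("flag  at offset %zu\n", offsetof(struct mixed, flag));
    printf("total at offset %zu\n", offsetof(struct mixed, total));
    printf("count at offset %zu\n", offsetof(struct mixed, count));
    printf("sizeof(struct mixed) = %zu, sum of members = %zu\n",
           sizeof(struct mixed),
           sizeof(char) + sizeof(long long) + sizeof(int));

    /* The 8-byte member always starts on a boundary suited to its type. */
    assert(offsetof(struct mixed, total) % _Alignof(long long) == 0);
}
```

On a typical 64-bit platform such as arm64, the struct occupies 24 bytes even though its members add up to only 13: the remaining bytes are the wasted cells described above.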

6. alloc, init and new

alloc init

As you can see from the initial example, we called init twice more on p1, but the address of the object never changed. Does init do nothing at all?

- (id)init {
    return _objc_rootInit(self);
}
id _objc_rootInit(id obj)
{
    // In practice, it will be hard to rely on this function.
    // Many classes do not properly chain -init calls.
    return obj;
}

init does nothing except return the object itself. As the comments in the source note, many classes do not properly chain their -init calls, so the root implementation does no extra work. Its main purpose is to give developers a method to override for their own setup.
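This also explains the console output at the start of the article. The following C model is only an analogy with hypothetical names (toy_person, toy_alloc, toy_init): the allocation step creates the memory, and the init step hands back the very same pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical object type: just room for an isa-like pointer. */
typedef struct toy_person { void *isa; } toy_person;

/* "alloc" creates the object: 16 zeroed bytes in this model. */
toy_person *toy_alloc(void) { return calloc(1, 16); }

/* "init" does no work of its own and returns the object it was given. */
toy_person *toy_init(toy_person *obj) { return obj; }

/* Prints the object address and the address of each pointer variable. */
void toy_print_pointers(void) {
    toy_person *p1 = toy_init(toy_alloc());
    toy_person *p2 = toy_init(p1);
    toy_person *p3 = toy_init(p1);
    if (p1 == NULL) return;

    assert(p1 == p2 && p2 == p3);   /* one object ...              */
    assert(&p1 != &p2);             /* ... three separate variables */

    printf("%p - %p\n", (void *)p1, (void *)&p1);
    printf("%p - %p\n", (void *)p2, (void *)&p2);
    printf("%p - %p\n", (void *)p3, (void *)&p3);
    free(p1);
}
```

The first column is the same in every line because only one allocation ever happened; the second column differs because p1, p2 and p3 are three distinct local variables on the stack.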

new
+ (id)new {
    return [callAlloc(self, false/*checkNil*/) init];
}

new is essentially alloc followed by init, so the two forms do the same thing. Still, alloc init is the recommended form because it is more flexible: new always calls the plain init, whereas with alloc you are free to call a custom initializer instead.

7. Summary

This article covered how to explore the iOS internals, the main call flow of alloc, and some of the technical details that appear along that flow. Many more details remain uncovered. Exploring alloc and init is only the first step in a long journey, but now that you’ve chosen the iOS path, keep going.

The road is long and hard, but keep walking and you will arrive; keep going without stopping, and the future is promising.