Preface

If you want to be an iOS developer, you have to read source code. This is a short series I put together while exploring the OC source code, under the theme Class and Object. Corrections are welcome, and I hope it helps you.

  1. OC source analysis: object creation
  2. OC source analysis: isa
  3. OC source analysis: class structure
  4. OC source analysis: method caching
  5. OC source analysis: method lookup
  6. OC source analysis: method resolution and forwarding

Introduction: a question

Before getting into the subject, think about what the following code prints:

#import <Foundation/Foundation.h>

@interface Person : NSObject

@end

@implementation Person

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *p = [Person alloc];
        Person *p1 = [p init];
        Person *p2 = [p init];
        NSLog(@"p ==> %@", p);
        NSLog(@"p1 ==> %@", p1);
        NSLog(@"p2 ==> %@", p2);
    }
    return 0;
}

The output is:

Clearly, p, p1, and p2 have the same memory address, that is, they are the same object. So the question is: why do these three variables hold the same address? What is going on underneath alloc and init? With these questions in mind, let's explore the source code.

1. alloc source code analysis

1.0 Preparations

  1. Find the objc4 source code in Apple's official list of open source code and download it.

The objc version I'm using is objc4-756.2, and Xcode is Version 11.3 (11C29). Your source and Xcode versions do not need to match mine.

  2. After downloading it, you need to get the project to compile so that you can debug it. For the specific steps, see Cooci's blog post IOS_OBJC4-756.2 on compiling and debugging the latest source code.

  3. Once it compiles, you can create a new target to experiment with.

I have uploaded a compiled copy of objc4-756.2 to GitHub. Download it if you are interested.

Because of the runtime nature of the OC language, we can’t be sure that the entry is the +alloc method, which means we need to find the real entry first.

Common ways to trace the code:

  1. In the Xcode menu bar, choose Debug -> Debug Workflow -> Always Show Disassembly
  2. control + step into
  3. A symbolic breakpoint, for example on alloc

I usually use the first one, for no other reason than familiarity.

1.1 objc_alloc: the real entry point of alloc

Set a breakpoint on [Person alloc].

In the Xcode menu bar, choose Debug -> Debug Workflow -> Always Show Disassembly to see the assembly code.

As you can see, objc_alloc is what gets executed next. The source code is as follows:

Consider: why does [Person alloc] call objc_alloc? (The answer is given at the end of this article.)

1.2 callAlloc analysis: the first encounter

objc_alloc() internally calls callAlloc():

// Call [cls alloc] or [cls allocWithZone:nil], with appropriate 
// shortcutting optimizations.
static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
    if (slowpath(checkNil && !cls)) return nil;

#if __OBJC2__
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        // No alloc/allocWithZone implementation. Go straight to the allocator.
        // fixme store hasCustomAWZ in the non-meta class and 
        // add it to canAllocFast's summary
        if (fastpath(cls->canAllocFast())) {
            // No ctors, raw isa, etc. Go straight to the metal.
            bool dtor = cls->hasCxxDtor();
            id obj = (id)calloc(1, cls->bits.fastInstanceSize());
            if (slowpath(!obj)) return callBadAllocHandler(cls);
            obj->initInstanceIsa(cls, dtor);
            return obj;
        }
        else {
            // Has ctor or raw isa or something. Use the slower path.
            id obj = class_createInstance(cls, 0);
            if (slowpath(!obj)) return callBadAllocHandler(cls);
            return obj;
        }
    }
#endif

    // No shortcuts available.
    if (allocWithZone) return [cls allocWithZone:nil];
    return [cls alloc];
}

The analysis of callAlloc() is as follows:

  1. slowpath(bool) and fastpath(bool): commonly used in if-else conditions. They do not change the result of the condition; they tell the compiler which outcome is more likely so that it can optimize the branch.
// fastpath(x): x is more likely to be true (the if block usually runs)
#define fastpath(x) (__builtin_expect(bool(x), 1))
// slowpath(x): x is more likely to be false (the else block usually runs)
#define slowpath(x) (__builtin_expect(bool(x), 0))
  2. hasCustomAWZ(): the name stands for hasCustomAllocWithZone, that is, whether the class overrides the +allocWithZone: method. Its value is not easy to predict, though. Look at the source:
bool hasCustomAWZ() {
    return !bits.hasDefaultAWZ();
}

Note on the value of hasCustomAWZ():

  • A class's +initialize method is mainly used to initialize static variables. Before it has run, hasDefaultAWZ() is false, so hasCustomAWZ() is true. After it has run, hasCustomAWZ() is true if the class overrides +allocWithZone: and false otherwise.
  • A class's +initialize method is called before the class is used for the first time. Calling [cls alloc] goes through objc_msgSend, which in turn triggers +initialize. (If you are interested, add a log statement to +alloc and to +initialize to verify this.)

So the first time a class enters callAlloc(), it ends up executing [cls alloc].

  3. The source of canAllocFast() is as follows:
bool canAllocFast() {
    assert(!isFuture());
    return bits.canAllocFast();
}

Looking further into bits.canAllocFast(), you find the key macro FAST_ALLOC:

#if FAST_ALLOC
    ...
    bool canAllocFast() {
        return bits & FAST_ALLOC;
    }
#else
    ...
    bool canAllocFast() {
        return false;
    }
#endif

Digging deeper, we come to the definition of the FAST_ALLOC macro:

#if !__LP64__
// The current platform is not 64-bit
...
#elif 1
// The current platform is 64-bit
...
#else
...
#define FAST_ALLOC              (1UL<<2)
...
#endif

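FAST_ALLOC is only defined in the final #else branch, and that branch is never compiled: non-64-bit targets take the #if branch and 64-bit targets take #elif 1. As a result, canAllocFast() is always false, regardless of whether the current platform is 64-bit.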

Therefore, when hasCustomAWZ() is false, execution goes straight to class_createInstance().

1.3 alloc->_objc_rootAlloc->callAlloc->class_createInstance

From the analysis of hasCustomAWZ(), we know that the first time a class is used, callAlloc() falls through to its last line, return [cls alloc];

  1. Because [cls alloc] is executed, this time we really do arrive at the +alloc method:
+ (id)alloc {
    return _objc_rootAlloc(self);
}
  2. Next comes _objc_rootAlloc():
// Base class implementation of +alloc. cls is not nil.
// Calls [cls allocWithZone:nil].
id
_objc_rootAlloc(Class cls)
{
    return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}
  3. And then callAlloc() -> class_createInstance()

Back in callAlloc(), the value of hasCustomAWZ() now depends on whether the current class overrides the +allocWithZone: method.

Since the Person class does not override it, fastpath(!cls->ISA()->hasCustomAWZ()) is true, while canAllocFast() is always false.

Therefore, the next step is class_createInstance(), which reads as follows:

id 
class_createInstance(Class cls, size_t extraBytes)
{
    return _class_createInstanceFromZone(cls, extraBytes, nil);
}

1.4 _class_createInstanceFromZone

As the name suggests, this is where the object gets created! But if the object is already created during alloc, what is left for init to do? Let's put that question aside for the moment and analyze the source code first:

static __attribute__((always_inline)) 
id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone, 
                              bool cxxConstruct = true, 
                              size_t *outAllocatedSize = nil)
{
    if (!cls) return nil;

    assert(cls->isRealized());

    // Read the class bits at a time to improve performance
    bool hasCxxCtor = cls->hasCxxCtor();    // Whether there is a constructor
    bool hasCxxDtor = cls->hasCxxDtor();    // Whether there is a destructor
    bool fast = cls->canAllocNonpointer();
    
    // Compute memory
    size_t size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    id obj;
    if (!zone && fast) {
        // Allocate one contiguous block of memory of the computed size
        obj = (id)calloc(1, size);
        if (!obj) return nil;
        // Initializes the object's ISA
        obj->initInstanceIsa(cls, hasCxxDtor);
    } 
    else {
        if (zone) {
            obj = (id)malloc_zone_calloc ((malloc_zone_t *)zone, 1, size);
        } else {
            obj = (id)calloc(1, size);
        }
        if (!obj) return nil;

        // Use raw pointer isa on the assumption that they might be 
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    if (cxxConstruct && hasCxxCtor) {
        obj = _objc_constructOrFree(obj, cls);
    }

    return obj;
}

The analysis of _class_createInstanceFromZone() is as follows:

  1. cls->instanceSize(extraBytes) computes the memory size. Here extraBytes is 0. Its source is:
// 1.
size_t instanceSize(size_t extraBytes) {
    size_t size = alignedInstanceSize() + extraBytes;
    // CF requires all objects be at least 16 bytes.
    if (size < 16) size = 16;
    return size;
}

// 2.
uint32_t alignedInstanceSize() {
    return word_align(unalignedInstanceSize());
}

// 3. Byte alignment
static inline uint32_t word_align(uint32_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}
static inline size_t word_align(size_t x) {
    return (x + WORD_MASK) & ~WORD_MASK;
}

// 4.
#ifdef __LP64__
#   define WORD_SHIFT 3UL
#   define WORD_MASK 7UL
#   define WORD_BITS 64
#else
#   define WORD_SHIFT 2UL
#   define WORD_MASK 3UL
#   define WORD_BITS 32
#endif


As you can see, WORD_MASK is 7 on a 64-bit system and 3 otherwise, so word_align() rounds a size up to a multiple of 8 bytes on a 64-bit system and to a multiple of 4 bytes otherwise.

At the same time, the instanceSize() function imposes a minimum memory size limit of 16 bytes.

  2. canAllocNonpointer() distinguishes between the kinds of isa. Under __OBJC2__, if a class uses an isa of type isa_t (a non-pointer isa), fast is true. Also under __OBJC2__, zone is ignored, so !zone is true as well.

This is followed by calloc() and initInstanceIsa().

  3. The underlying source of calloc() is in Apple's open source libmalloc. Tracing with breakpoints shows that the size calloc actually allocates is determined by segregated_size_to_fit(); see the source below:
static MALLOC_INLINE size_t
segregated_size_to_fit(nanozone_t *nanozone, size_t size, size_t *pKey)
{
	size_t k, slot_bytes;

	if (0 == size) {
	    // Historical behavior
	    size = NANO_REGIME_QUANTA_SIZE;
	}
	// round up and shift for number of quanta
	k = (size + NANO_REGIME_QUANTA_SIZE - 1) >> SHIFT_NANO_QUANTUM; 
	// multiply by power of two quanta size
	slot_bytes = k << SHIFT_NANO_QUANTUM;							
	// Zero-based!
	*pKey = k - 1;													

	return slot_bytes;
}

#define SHIFT_NANO_QUANTUM	    4
#define NANO_REGIME_QUANTA_SIZE	    (1 << SHIFT_NANO_QUANTUM)	// 16

As you can see from the code, slot_bytes equals (size + 16 - 1) >> 4 << 4, that is, size rounded up to a multiple of 16, so calloc() always allocates an integer multiple of 16 bytes here.

  4. initInstanceIsa() initializes isa and associates it with cls.

isa is an extremely important part of the objc class structure. I will write a separate article about its structure, initialization process, inheritance relationships, and so on.

As the code above shows, _class_createInstanceFromZone() does a lot of work and really does finish creating the object. So what does init actually do? Read on.

2. init and new

1. init

- (id)init {
    return _objc_rootInit(self);
}

id
_objc_rootInit(id obj)
{
    // In practice, it will be hard to rely on this function.
    // Many classes do not properly chain -init calls.
    return obj;
}

Quite simply, init just returns the object that alloc created. Why is it designed this way? It is not hard to understand: in everyday development we often override init to do custom configuration based on business needs.

-init on NSObject is a factory-style design, a hook meant for subclasses to override.

2. new

Let’s look at new

+ (id)new {
    return [callAlloc(self, false/*checkNil*/) init];
}

Obviously, new is alloc+init.

3. Summary

So much for the source code analysis of alloc, init, and new. During alloc, the callAlloc and _class_createInstanceFromZone functions are the focus.

The analysis above is based on the objc4-756.2 source, which is the latest version at the time of writing.

The following flowchart summarizes the alloc object creation process

4. Conclusion

That's all for the source code analysis of OC object creation. Looking back, parts of the process went smoothly and parts were bumpy, and at times it was quite baffling. Still, after following alloc from start to finish, I feel as if I have completed an important task, and I am thoroughly happy.

The road of OC source analysis is a rewarding one. I hope we can both value it and encourage each other along the way!

Supplement

  1. Q: Why does [Person alloc] call objc_alloc?

    A: When the program starts, the runtime reads the image files, and the _read_images() function contains this code:
void _read_images(...) {
    ...
#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) {
        message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
        if (count == 0) continue;
        ...
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }
    ts.log("IMAGE TIMES: fix up objc_msgSend_fixup");
#endif
    ...
}

fixupMessageRef() rebinds the IMP for SEL_alloc:

static void 
fixupMessageRef(message_ref_t *msg)
{    
    msg->sel = sel_registerName((const char *)msg->sel);

    if (msg->imp == &objc_msgSend_fixup) { 
        if (msg->sel == SEL_alloc) {
            msg->imp = (IMP)&objc_alloc;
        } else if (msg->sel == SEL_allocWithZone) {
            msg->imp = (IMP)&objc_allocWithZone;
        } else if (msg->sel == SEL_retain) {
            msg->imp = (IMP)&objc_retain;
        } else if (msg->sel == SEL_release) {
            msg->imp = (IMP)&objc_release;
        } else if (msg->sel == SEL_autorelease) {
            msg->imp = (IMP)&objc_autorelease;
        } else {
            msg->imp = &objc_msgSend_fixedup;
        }
    }
    ...
}

Given that [Person alloc] ends up calling objc_alloc(), we can guess that the correspondence between SEL_alloc and objc_alloc is already established when the project is compiled into a Mach-O file.

Drag the compiled Mach-O file into MachOView to verify: objc_alloc can indeed be found in it.

Last question

  1. For the two alloc calls below, how do the underlying flows differ? What if the Person class overrides +allocWithZone:?
Person *p1 = [Person alloc];
Person *p2 = [Person alloc];

You can try it yourself, and it will help you understand the alloc process.

PS

  • The source code project is on GitHub: objc4-756.2 source code
  • You can also download Apple's official objc4 source code to study.
  • Please credit the source when reposting. Thank you very much!