Preface

The core logic of AutoreleasePool has already been explained in the article "Autorelease Behind the Scenes", but much of that article presents conclusions only. I read through the source code to explore more details and to verify some of the commonly repeated conclusions.

The source code discussed here is based on Runtime 750.

What does @autoreleasepool {} do

Code in main.m:

int main(int argc, const char * argv[]) {
    @autoreleasepool {}
    return 0;
}

Use clang -rewrite-objc main.m to see the code after the compiler front end has processed it:

struct __AtAutoreleasePool {
  __AtAutoreleasePool() { atautoreleasepoolobj = objc_autoreleasePoolPush(); }
  ~__AtAutoreleasePool() { objc_autoreleasePoolPop(atautoreleasepoolobj); }
  void * atautoreleasepoolobj;
};

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; }
    return 0;
}

As you can see, @autoreleasepool {} creates a local variable of type __AtAutoreleasePool that lives in the current scope. The constructor and destructor of __AtAutoreleasePool call two different functions, so the code simplifies to the following:

void *context = objc_autoreleasePoolPush();
// Objects that receive autorelease here are added to the autorelease pool
objc_autoreleasePoolPop(context);

You can guess that the push and pop operations are the key to automatic release.
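The constructor/destructor pairing above is ordinary C++ RAII. The following is a minimal sketch of that pattern only; toyPoolPush, toyPoolPop, and the call log are hypothetical stand-ins and do not reproduce the runtime functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for objc_autoreleasePoolPush/Pop; they only record calls.
static std::vector<std::string> g_log;

static void *toyPoolPush() {
    g_log.push_back("push");
    return &g_log;  // any opaque token is enough for this sketch
}

static void toyPoolPop(void *token) {
    assert(token == &g_log);  // the token handed out by push comes back to pop
    g_log.push_back("pop");
}

// Same shape as __AtAutoreleasePool: push in the constructor, pop in the destructor.
struct ToyAutoreleaseScope {
    ToyAutoreleaseScope() : token(toyPoolPush()) {}
    ~ToyAutoreleaseScope() { toyPoolPop(token); }
    void *token;
};

// Runs one scope and returns the order of the recorded calls.
static std::vector<std::string> runScope() {
    g_log.clear();
    {
        ToyAutoreleaseScope scope;
        g_log.push_back("body");
    }  // the destructor runs here, when the scope ends
    return g_log;
}
```

Calling runScope() records the push before the body and the pop after it, which is the ordering that the simplified code above relies on.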

Memory layout of AutoreleasePoolPage

As the official documentation mentions, the main thread and threads that are not created explicitly (such as GCD threads) have an event loop (implemented by RunLoop). At the beginning and end of each loop iteration, the autorelease pool's push and pop methods are called respectively, which is what makes automatic memory management work.

objc_autoreleasePoolPush() and objc_autoreleasePoolPop(…) actually call the push() and pop() methods of the AutoreleasePoolPage class. Let's take a look at the data structure of this class:

class AutoreleasePoolPage {
    ...
    magic_t const magic;
    id *next;
    pthread_t const thread;
    AutoreleasePoolPage * const parent;
    AutoreleasePoolPage *child;

    static void * operator new(size_t size) {
        return malloc_zone_memalign(malloc_default_zone(), SIZE, SIZE);
    }
    id * begin() {
        return (id *) ((uint8_t *)this+sizeof(*this));
    }
    id * end() {
        return (id *) ((uint8_t *)this+SIZE);
    }
    ...
}
  • parent and child point to the predecessor and successor pages, which suggests that the autorelease pool is a doubly linked list whose nodes are AutoreleasePoolPage instances (verified later).
  • thread is the thread that the current page belongs to.
  • magic is used to check whether the memory has been corrupted.
  • next points to the address where the next object will be inserted.

Memory alignment

The class overrides operator new to allocate memory with malloc_zone_memalign(…):

extern void *malloc_zone_memalign(malloc_zone_t *zone, size_t alignment, size_t size) ;
    /* 
     * Allocates a new pointer of size size whose address is an exact multiple of alignment.
     * alignment must be a power of two and at least as large as sizeof(void *).
     * zone must be non-NULL.
     */

As the comment makes clear, this function allocates size bytes at an address aligned to alignment. Both arguments are passed the SIZE macro, which is essentially the size of a virtual memory page:

#define I386_PGBYTES 4096

If a page were too small, new pages would have to be allocated more often, which is inefficient, and the large number of parent/child pointers would take up considerable memory. If a page were too large, it could be underused and waste memory. Setting it to 4096 is a more elegant choice: it makes full use of one virtual memory page, keeps memory aligned, and avoids fragmentation. It also means that a page's address is always a multiple of 4096, which makes some operations easier (such as finding the page that a pointer belongs to).
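The alignment property is easy to observe outside the runtime. The sketch below assumes a POSIX system and uses posix_memalign in place of the macOS-specific malloc_zone_memalign; kToyPageSize, allocToyPage, and isPageAligned are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Assumed page size for this sketch; the runtime uses its SIZE macro instead.
static const std::size_t kToyPageSize = 4096;

// True when the address is an exact multiple of the page size.
static bool isPageAligned(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % kToyPageSize == 0;
}

// Allocates one page-sized block whose address is a multiple of the page size.
// Alignment and size are the same value, as in the overridden operator new.
static void *allocToyPage() {
    void *p = nullptr;
    if (posix_memalign(&p, kToyPageSize, kToyPageSize) != 0) return nullptr;
    assert(isPageAligned(p));
    return p;  // release with free()
}
```

Because the block is exactly one aligned page, any address inside it shares the same page start, which is the property that the pointer-to-page lookup depends on.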

begin() and end()

AutoreleasePoolPage itself is much smaller than 4096 bytes, and the remaining space is used to store the objects that are to be managed automatically. The begin() and end() methods mark the bounds of that space.

sizeof(*this) is the size of the AutoreleasePoolPage itself, so (uint8_t *)this+sizeof(*this) is the lowest usable address and (uint8_t *)this+SIZE is the highest. As objects are inserted one by one, the next pointer moves from begin() toward end(). Accordingly, the full() method means next == end() and the empty() method means next == begin().

Note that next, end(), and begin() are all of type id *, i.e. pointers to pointers, so adding or subtracting 1 moves them by the size of one id.
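The pointer arithmetic can be tried on a stand-in structure. This is a sketch under assumptions: ToyPage only imitates the kinds of fields listed earlier, void * replaces id, and kToySize is an assumed page size, so the exact slot count may differ from the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static const std::size_t kToySize = 4096;  // assumed page size

// Hypothetical header with the same kinds of fields as AutoreleasePoolPage.
struct ToyPage {
    std::uint32_t magic[4];
    void **next;
    void *thread;
    ToyPage *parent;
    ToyPage *child;

    // Slots start right after the header and run to the end of the page.
    void **begin() { return (void **)((std::uint8_t *)this + sizeof(*this)); }
    void **end()   { return (void **)((std::uint8_t *)this + kToySize); }
    bool empty() { return next == begin(); }
    bool full()  { return next == end(); }

    // Stores one pointer and advances next by one slot.
    void **add(void *obj) {
        assert(!full());
        void **slot = next;
        *next++ = obj;
        return slot;
    }
};

// Number of pointer-sized slots between begin() and end().
static std::size_t toyCapacity() {
    return (kToySize - sizeof(ToyPage)) / sizeof(void *);
}
```

Each add() moves next by sizeof(void *) bytes, so a page becomes full after toyCapacity() insertions.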

Push logic

The push() method calls autoreleaseFast(POOL_BOUNDARY):

    static inline id *autoreleaseFast(id obj)
    {
        AutoreleasePoolPage *page = hotPage();
        if (page && !page->full()) {
            return page->add(obj);
        } else if (page) {
            return autoreleaseFullPage(obj, page);
        } else {
            return autoreleaseNoPage(obj);
        }
    }

hotPage is the page into which objects can currently be inserted. There are three cases.

1. When the page exists and is not full, add the object directly:

    id *add(id obj)
    {
        assert(!full());
        unprotect();
        id *ret = next;  // faster than `return next-1` because of aliasing
        *next++ = obj;
        protect();
        return ret;
    }

add() wraps the write in unprotect() and protect(), which call int mprotect(void *a, size_t b, int c) to set the access permissions of the page's memory:

    inline void protect() {
#if PROTECT_AUTORELEASEPOOL
        mprotect(this, SIZE, PROT_READ);
        check();
#endif
    }
    inline void unprotect() {
#if PROTECT_AUTORELEASEPOOL
        check();
        mprotect(this, SIZE, PROT_READ | PROT_WRITE);
#endif
    }

unprotect() makes the page readable and writable, and protect() makes it read-only, so the goal is to keep writes to the page safe. However, #define PROTECT_AUTORELEASEPOOL 0 shows that this protection is not enabled in the current version.
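To see what the guard would do if it were enabled, here is a minimal sketch of the same unprotect/write/protect sequence on a POSIX system. GuardedPage and its methods are hypothetical; only mmap, munmap, mprotect, and sysconf are real system calls, and the page comes from mmap because mprotect requires page-aligned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical wrapper around one anonymous page of memory.
struct GuardedPage {
    void *base;
    std::size_t size;

    GuardedPage() : size(static_cast<std::size_t>(sysconf(_SC_PAGESIZE))) {
        base = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    ~GuardedPage() { if (valid()) munmap(base, size); }

    bool valid() const { return base != MAP_FAILED; }
    bool protect()   { return mprotect(base, size, PROT_READ) == 0; }
    bool unprotect() { return mprotect(base, size, PROT_READ | PROT_WRITE) == 0; }

    // Mirrors add(): open the page, write one slot, close the page again.
    bool guardedWrite(std::size_t index, int value) {
        assert(index < size / sizeof(int));
        if (!unprotect()) return false;
        static_cast<int *>(base)[index] = value;
        return protect();
    }
};
```

While the page is read-only, reads still succeed, but a stray write from elsewhere would fault immediately instead of silently corrupting the page.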

2. When the page exists and is full, extend the list with a new page node and add the object:

    static __attribute__((noinline))
    id *autoreleaseFullPage(id obj, AutoreleasePoolPage *page)
    {   ...
        do {
            if (page->child) page = page->child;
            else page = new AutoreleasePoolPage(page);
        } while (page->full());

        setHotPage(page);
        return page->add(obj);
    }

The logic of the loop: walk in the child direction to find a page that is not full, and if there is none, create a new page and append it to the end of the list (the AutoreleasePoolPage constructor takes the page passed in as its parent, i.e. its predecessor). Then set that page as hotPage and add obj to it.

3. When no page exists, create the first page:

    static __attribute__((noinline))
    id *autoreleaseNoPage(id obj) 
    {   ...
        AutoreleasePoolPage *page = new AutoreleasePoolPage(nil);
        setHotPage(page);
        ...
        return page->add(obj);
    }

The core of this method is to create the first page and add it to thread-local storage.
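The three cases can be exercised together on a miniature model. This is a sketch under assumptions: MiniPage holds 4 slots instead of a 4096-byte block, MiniPool::hot is a plain member standing in for the thread-local hotPage, and pagesAfterAdding is a hypothetical helper added only to observe when pages are created.

```cpp
#include <cassert>

// Hypothetical miniature page with 4 slots.
struct MiniPage {
    enum { kSlots = 4 };
    const void *slots[kSlots];
    int count;
    MiniPage *parent;
    MiniPage *child;

    explicit MiniPage(MiniPage *p) : count(0), parent(p), child(nullptr) {
        if (parent) parent->child = this;  // link the new page after its parent
    }
    bool full() const { return count == kSlots; }
    const void **add(const void *obj) {
        assert(!full());
        slots[count] = obj;
        return &slots[count++];
    }
};

struct MiniPool {
    MiniPage *hot;      // stands in for hotPage()
    int pagesCreated;

    MiniPool() : hot(nullptr), pagesCreated(0) {}
    ~MiniPool() {
        MiniPage *p = hot;
        while (p && p->parent) p = p->parent;  // rewind to the first page
        while (p) { MiniPage *next = p->child; delete p; p = next; }
    }

    const void **autoreleaseFast(const void *obj) {
        if (hot && !hot->full()) return hot->add(obj);  // case 1: room in the hot page
        if (hot) {                                      // case 2: hot page is full
            MiniPage *page = hot;
            do {
                if (page->child) page = page->child;
                else { page = new MiniPage(page); ++pagesCreated; }
            } while (page->full());
            hot = page;
            return page->add(obj);
        }
        hot = new MiniPage(nullptr);                    // case 3: no page yet
        ++pagesCreated;
        return hot->add(obj);
    }
};

// Adds n objects to a fresh pool and reports how many pages were created.
static int pagesAfterAdding(int n) {
    static int dummy = 0;
    MiniPool pool;
    for (int i = 0; i < n; ++i) pool.autoreleaseFast(&dummy);
    return pool.pagesCreated;
}
```

With 4 slots per page, the fifth insertion is the first one that takes the full-page path and creates a second page.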

hotPage

hotPage() appears repeatedly in the code above; its counterpart, coldPage(), returns the first page of the list.

Since the autorelease pool is a doubly linked list made up of AutoreleasePoolPage nodes, how is the list accessed? The obvious idea might be a global variable, but the runtime takes another approach:

    static inline AutoreleasePoolPage *hotPage()
    {
        AutoreleasePoolPage *result = (AutoreleasePoolPage *)
            tls_get_direct(key);
        // EMPTY_POOL_PLACEHOLDER indicates no page
        if ((id *)result == EMPTY_POOL_PLACEHOLDER) return nil;
        if (result) result->fastcheck();
        return result;
    }
    static inline void setHotPage(AutoreleasePoolPage *page) 
    {
        if (page) page->fastcheck();
        tls_set_direct(key, (void *)page);
    }

tls_get_direct(…) and tls_set_direct(…) store the page in Thread Local Storage (TLS), which avoids maintaining extra space to record the tail page. This also confirms the one-to-one correspondence between autorelease pools and threads.
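The per-thread behaviour can be illustrated with a C++ thread_local variable. This is only an analogy: the runtime uses a reserved TLS key through tls_get_direct/tls_set_direct, whereas t_hotPage, setToyHotPage, and toyHotPage below are hypothetical names, and an int * stands in for the page pointer.

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread slot standing in for the TLS entry behind hotPage().
static thread_local int *t_hotPage = nullptr;

static void setToyHotPage(int *page) { t_hotPage = page; }
static int *toyHotPage() { return t_hotPage; }

// Sets the slot on the calling thread and on a worker thread, then reports
// whether each thread only ever saw its own value.
static bool workerStartsEmptyAndCallerKeepsPage() {
    static int callerPage = 1;
    static int workerPage = 2;
    setToyHotPage(&callerPage);

    bool workerSawEmpty = false;
    std::thread worker([&workerSawEmpty] {
        // A new thread starts with its own empty slot.
        workerSawEmpty = (toyHotPage() == nullptr);
        setToyHotPage(&workerPage);
    });
    worker.join();

    // The worker's write did not touch the caller's slot.
    return workerSawEmpty && toyHotPage() == &callerPage;
}
```

Since each thread reads and writes only its own slot, no lock is needed to find the current page.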

One technique widely used in YYKit is to hand an object to an asynchronous thread for its last use, so that the object is (possibly?) released on that asynchronous thread, reducing pressure on the main thread. In effect, the compiler inserts autorelease code that adds the object to the asynchronous thread's autorelease pool, and if the asynchronous thread sends release to the object before the main thread's pool is popped, the object is released on the asynchronous thread. I therefore believe this optimization is not guaranteed to be effective. (This raises a question: when one object is managed by multiple autorelease pools, how are dangling pointers avoided once the object has been released?)

POOL_BOUNDARY

The push() method calls autoreleaseFast(POOL_BOUNDARY), passing POOL_BOUNDARY, an object that does not need to be managed. It is defined as follows:

# define POOL_BOUNDARY nil

The autoreleaseFast(obj) method returns a pointer to the obj pointer, of type id *. In other words, the return value is only concerned with the address where the obj pointer is stored, not with the address that obj holds. That storage address lies inside the memory of an AutoreleasePoolPage object.

Take a look at the upper-level call:

void *context = objc_autoreleasePoolPush()
...
objc_autoreleasePoolPop(context)

pop receives the address of that obj pointer. Its logic is to remove the objects in hotPage one by one and send each a release message; when a page has been emptied, it moves to the parent page and continues removing objects. Removal stops when it reaches the address of the obj pointer.

Therefore, adding a POOL_BOUNDARY in the push operation actually adds a boundary, and the pop operation uses that boundary to decide how far to go, much like pushing onto and popping off a stack.
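The boundary idea can be modelled with a single stack. This is a sketch under assumptions: FlatPool uses one std::vector instead of a chain of pages, the token is an index rather than a slot address (vector storage can move when it grows), and ToyObject::releaseCount stands in for sending release.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object that counts how many times it was "released".
struct ToyObject { int releaseCount = 0; };

static ToyObject *const kBoundary = nullptr;  // plays the role of POOL_BOUNDARY

struct FlatPool {
    std::vector<ToyObject *> slots;

    // push: store a boundary and return its position as the token.
    std::size_t push() {
        slots.push_back(kBoundary);
        return slots.size() - 1;
    }

    void autorelease(ToyObject *obj) { slots.push_back(obj); }

    // pop: release everything above the token, then drop the boundary itself.
    void pop(std::size_t token) {
        assert(token < slots.size());
        while (slots.size() > token) {
            ToyObject *obj = slots.back();
            slots.pop_back();
            if (obj != kBoundary) obj->releaseCount++;
        }
    }
};
```

Nested pools fall out naturally: popping an inner token releases only the objects added after it, while objects belonging to the outer pool stay until the outer token is popped.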

Magic check

The recurring check() method is as follows:

    void check(bool die = true)
    {
        if (!magic.check() || !pthread_equal(thread, pthread_self())) {
            busted(die);
        }
    }

    void fastcheck(bool die = true)
    {
        // Added note: #define CHECK_AUTORELEASEPOOL (DEBUG)
#if CHECK_AUTORELEASEPOOL
        check(die);
#else
        if (! magic.fastcheck()) busted(die);
#endif
    }

As you can see, both call magic's check methods. check(), which fastcheck() uses only in DEBUG builds, additionally verifies that the current thread matches the page's thread.

magic is of type magic_t, a structure that mainly consists of the array uint32_t m[4]. On construction it is filled with 0xA1A1A1A1 followed by the characters AUTORELEASE!, and check() verifies that these values have not changed. If they have, the page has been corrupted.
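A guard value of this kind is simple to imitate. The sketch below follows the description above and is not the runtime's magic_t: ToyMagic is a hypothetical name, and the assumption that the fast check compares only the first word is labeled in the code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static const std::uint32_t kToyMagicFirst = 0xA1A1A1A1;

// Hypothetical guard modeled on the description of magic_t.
struct ToyMagic {
    std::uint32_t m[4];

    ToyMagic() {
        m[0] = kToyMagicFirst;
        // "AUTORELEASE!" is 12 characters, which exactly fills m[1] to m[3].
        std::memcpy(&m[1], "AUTORELEASE!", 12);
    }

    // Full check: all 16 bytes must still hold the constructed pattern.
    bool check() const {
        ToyMagic expected;
        return std::memcmp(m, expected.m, sizeof(m)) == 0;
    }

    // Assumption for this sketch: the fast variant looks at the first word only.
    bool fastcheck() const { return m[0] == kToyMagicFirst; }
};
```

A stray write that lands on the page header changes the pattern, so the next check fails and the corruption is reported close to where it happened.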

Autorelease logic

When an object at the upper layer calls the autorelease method, the following method of AutoreleasePoolPage is called:

    static inline id autorelease(id obj)
    {
        assert(obj);
        assert(!obj->isTaggedPointer());
        id *dest __unused = autoreleaseFast(obj);
        assert(!dest || dest == EMPTY_POOL_PLACEHOLDER || *dest == obj);
        return obj;
    }

Evidently, this ends up calling the autoreleaseFast(…) method analyzed earlier to insert the object. It can also be inferred that when a thread has no RunLoop to push and pop the autorelease pool automatically, and no pool exists at the moment an object is autoreleased, a page is created automatically and stored in thread-local storage (see the analysis of autoreleaseNoPage(…) above).

Pop logic

The context parameter of objc_autoreleasePoolPop(context) is returned by objc_autoreleasePoolPush() and is actually the address of POOL_BOUNDARY in AutoreleasePoolPage. The pop() method is eventually called:

    static inline void pop(void *token)
    {
        AutoreleasePoolPage *page;
        id *stop;
        ...
        page = pageForPointer(token);
        stop = (id *)token;
        ...
        // Pop the objects in the pool until the stop boundary
        page->releaseUntil(stop);
        ...
        // Delete empty child pages; if the current page is more than half full, keep one empty child
        if (page->lessThanHalfFull()) {
            page->child->kill();
        } else if (page->child->child) {
            page->child->child->kill();
        }
    }

The logic of pop() should be easy to understand. The token argument is the boundary.

Find the page corresponding to the boundary

    static AutoreleasePoolPage *pageForPointer(const void *p)
    {
        return pageForPointer((uintptr_t)p);
    }

    static AutoreleasePoolPage *pageForPointer(uintptr_t p)
    {
        AutoreleasePoolPage *result;
        uintptr_t offset = p % SIZE;
        ...
        result = (AutoreleasePoolPage *)(p - offset);
        result->fastcheck();
        return result;
    }

const void *p is a pointer to a pointer, and (uintptr_t)p is the address at which the POOL_BOUNDARY pointer is stored in its page.

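Because every page is allocated at an address that is a multiple of SIZE, the page's start address satisfies address % SIZE == 0, so p % SIZE is the offset of p within its page. In the end, p - offset gives the start address of the page.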
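The arithmetic can be checked with plain integers. This is a sketch with hypothetical names (kToyPageBytes, toyPageStart, toyPageStartMasked) and an assumed page size of 4096; the masked variant is an equivalent formulation that holds only because the size is a power of two.

```cpp
#include <cassert>
#include <cstdint>

static const std::uintptr_t kToyPageBytes = 4096;  // assumed page size

// Same computation as pageForPointer: subtract the offset within the page.
static std::uintptr_t toyPageStart(std::uintptr_t p) {
    std::uintptr_t offset = p % kToyPageBytes;
    return p - offset;
}

// Equivalent form for power-of-two sizes: clear the low bits.
static std::uintptr_t toyPageStartMasked(std::uintptr_t p) {
    assert((kToyPageBytes & (kToyPageBytes - 1)) == 0);  // must be a power of two
    return p & ~(kToyPageBytes - 1);
}
```

For example, the address 0x12345 has offset 0x345 within its page, so both functions return 0x12000.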

Remove the managed objects and send release messages

    void releaseUntil(id *stop)
    {
        while (this->next != stop) {
            AutoreleasePoolPage *page = hotPage();
            // If the page is empty, move to its parent
            while (page->empty()) {
                page = page->parent;
                setHotPage(page);
            }
            // Take out the object and overwrite its slot with SCRIBBLE
            page->unprotect();
            id obj = *--page->next;
            memset((void*)page->next, SCRIBBLE, sizeof(*page->next));
            page->protect();
            // Call the object's release method
            if (obj != POOL_BOUNDARY) {
                objc_release(obj);
            }
        }
        // Set the current page as hotPage (at this point it is the page that holds the boundary being released to)
        setHotPage(this);
        ...
    }

Remove the child pages

    void kill() {
        AutoreleasePoolPage *page = this;
        while (page->child) page = page->child;
        AutoreleasePoolPage *deathptr;
        do {
            deathptr = page;
            page = page->parent;
            if (page) {
                page->unprotect();
                page->child = nil;
                page->protect();
            }
            delete deathptr;
        } while (deathptr != this);
    }

The logic is straightforward: find the tail page in the current page's child chain, then delete pages in reverse order, setting the child pointer of each deleted page's parent to nil. As noted above, the write protection in unprotect() and protect() is not enabled.
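The same tail-first deletion can be tried on a bare linked list. This is a sketch with a hypothetical ToyNode type; liveCount is added only so that the effect is observable, and the protect/unprotect calls are left out.

```cpp
#include <cassert>

// Hypothetical doubly linked node with a counter of nodes currently alive.
struct ToyNode {
    static int liveCount;
    ToyNode *parent;
    ToyNode *child;

    explicit ToyNode(ToyNode *p) : parent(p), child(nullptr) {
        if (parent) parent->child = this;
        ++liveCount;
    }
    ~ToyNode() { --liveCount; }

    // Deletes this node and every node after it, starting from the tail.
    void kill() {
        ToyNode *page = this;
        while (page->child) page = page->child;  // walk to the tail
        ToyNode *deathptr;
        do {
            deathptr = page;
            page = page->parent;
            if (page) page->child = nullptr;     // detach before deleting
            delete deathptr;
        } while (deathptr != this);
    }
};

int ToyNode::liveCount = 0;
```

After kill() returns, the node it was called on no longer exists, so the caller must not touch it again; only nodes before it in the list remain valid.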

Afterword

The above covers most of the autorelease pool source code. This part of the source involves no assembly and not much code, so it is relatively easy to read. Understanding memory management at a lower level helps you understand the higher-level features, track down memory problems, and write more stable code. Along the way you inevitably run into operating system and compiler topics, which also builds general skills.

Reading the source code is more important than memorizing conclusions, and it is always fun to come across some amazing code details.