Analyzing blocks with Clang

#include <stdio.h>
int main(){
    int a = 8;
    void(^block)(void) = ^{
        printf("chenxi - %d",a);
    };
    block();
    return 0;
}

From this code, clang generates C++ code like the following:

int main(){
    int a = 8;
    // (void (*)()) is a cast to a function pointer type
    // __main_block_impl_0 is a name; here it is a function (constructor) call
    // (void *) is a cast
    void (*block)(void) = ((void (*)())&__main_block_impl_0((void *)__main_block_func_0, &__main_block_desc_0_DATA, a));
    // (void (*)(__block_impl *)) is a cast
    // (__block_impl *) is a cast
    ((void (*)(__block_impl *))((__block_impl *)block)->FuncPtr)((__block_impl *)block);
    return 0;
}

After simplifying, we can get the following code:

int main(){
    int a = 8;
    // call to the __main_block_impl_0 function
    void (*block)(void) = __main_block_impl_0(__main_block_func_0, &__main_block_desc_0_DATA, a);
    block->FuncPtr(block);
    return 0;
}

We can see that block is assigned the result of __main_block_impl_0, so let's look at __main_block_impl_0:

struct __main_block_impl_0 {
    struct __block_impl impl;
    struct __main_block_desc_0* Desc;
    // captures the external a and generates the corresponding member variable
    int a;
    __main_block_impl_0(void *fp, struct __main_block_desc_0 *desc, int _a, int flags=0) : a(_a) {
        impl.isa = &_NSConcreteStackBlock;
        impl.Flags = flags;
        impl.FuncPtr = fp;
        Desc = desc;
    }
};

static void __main_block_func_0(struct __main_block_impl_0 *__cself) {
    int a = __cself->a; // bound by copy
    printf("chenxi - %d",a);
}

static struct __main_block_desc_0 {
    size_t reserved;
    size_t Block_size;
} __main_block_desc_0_DATA = { 0, sizeof(struct __main_block_impl_0)};

__main_block_impl_0 is a struct, and the value assigned to block comes from its constructor, which is also named __main_block_impl_0. From this we can draw some conclusions:

  • For the variable a passed in from outside, the block generates a corresponding member variable a and assigns the value to it.
  • At compile time the block is of type _NSConcreteStackBlock. However, in Basic Principle Analysis of Block (I) we said that a block that captures external variables becomes a heap block. We can guess that this step is done at runtime, but how does the runtime do it?
  • In the constructor, impl.FuncPtr = fp. Here fp is the first argument passed in main, __main_block_func_0, which is actually a function. Calling impl.FuncPtr executes __main_block_func_0.
  • In __main_block_func_0, __cself is the block itself. The temporary variable a in the function only has the same value as the block's member variable a; the value is copied.
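
The capture-by-value behavior described in the list above can be modeled in plain C without the blocks extension. This is only a sketch under stated assumptions: sketch_block, sketch_block_func, sketch_block_make, and capture_by_value_demo are hypothetical names that stand in for the generated __main_block_impl_0 machinery, and the last_seen field exists only so the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for __main_block_impl_0; names are illustrative. */
struct sketch_block {
    void (*func_ptr)(struct sketch_block *self); /* plays the role of impl.FuncPtr */
    int a;                                       /* captured copy of the outer a */
    int last_seen;                               /* records what the body observed */
};

/* Plays the role of __main_block_func_0: it reads only the captured copy. */
static void sketch_block_func(struct sketch_block *self) {
    int a = self->a; /* bound by copy */
    self->last_seen = a;
    printf("chenxi - %d\n", a);
}

/* Plays the role of the constructor: stores fp and copies the value of a. */
static struct sketch_block sketch_block_make(void (*fp)(struct sketch_block *), int a) {
    struct sketch_block b = { fp, a, 0 };
    return b;
}

/* Builds the block, changes the outer variable, then invokes the block. */
static int capture_by_value_demo(void) {
    int a = 8;
    struct sketch_block block = sketch_block_make(sketch_block_func, a);
    a = 100;                /* the outer variable changes after capture */
    (void)a;
    block.func_ptr(&block); /* the block still works with its own copy */
    return block.last_seen;
}
```

Calling capture_by_value_demo() returns 8 even though the outer a was set to 100 before the call, which matches the "bound by copy" comment in the generated code.
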

The underlying copy of a block

int main(){
    __block int a = 8;
    void(^block)(void) = ^{
        a++;
        printf("chenxi - %d",a);
    };
    block();
    return 0;
}

For this code, clang generates the following C++ code:

struct __Block_byref_a_0 {
    void *__isa;
    // __forwarding points to a (this struct)
    __Block_byref_a_0 *__forwarding;
    int __flags;
    int __size;
    int a;
};

struct __main_block_impl_0 {
    struct __block_impl impl;
    struct __main_block_desc_0* Desc;
    __Block_byref_a_0 *a;
    __main_block_impl_0(void *fp, struct __main_block_desc_0 *desc, __Block_byref_a_0 *_a, int flags=0) : a(_a->__forwarding) {
        impl.isa = &_NSConcreteStackBlock;
        impl.Flags = flags;
        impl.FuncPtr = fp;
        Desc = desc;
    }
};

// a here is the same a that was passed into the block
static void __main_block_func_0(struct __main_block_impl_0 *__cself) {
    __Block_byref_a_0 *a = __cself->a; // bound by ref
    (a->__forwarding->a)++;
    printf("chenxi - %d",(a->__forwarding->a));
}

static void __main_block_copy_0(struct __main_block_impl_0*dst, struct __main_block_impl_0*src) {
    _Block_object_assign((void*)&dst->a, (void*)src->a, 8/*BLOCK_FIELD_IS_BYREF*/);
}

static void __main_block_dispose_0(struct __main_block_impl_0*src) {
    _Block_object_dispose((void*)src->a, 8/*BLOCK_FIELD_IS_BYREF*/);
}

static struct __main_block_desc_0 {
    size_t reserved;
    size_t Block_size;
    void (*copy)(struct __main_block_impl_0*, struct __main_block_impl_0*);
    void (*dispose)(struct __main_block_impl_0*);
} __main_block_desc_0_DATA = { 0, sizeof(struct __main_block_impl_0), __main_block_copy_0, __main_block_dispose_0};

int main(){
    // int a = 8
    // This step initializes the __Block_byref_a_0 struct and takes the address of a
    __Block_byref_a_0 a = {(void*)0, (__Block_byref_a_0 *)&a, 0, sizeof(__Block_byref_a_0), 8};
    void (*block)(void) = ((void (*)())&__main_block_impl_0((void *)__main_block_func_0, &__main_block_desc_0_DATA, (__Block_byref_a_0 *)&a, 570425344));
    ((void (*)(__block_impl *))((__block_impl *)block)->FuncPtr)((__block_impl *)block);
    return 0;
}
  • When we change int a in main to __block int a and run clang again, we see that the external variable a is wrapped in a struct, __Block_byref_a_0. The address of a is taken and assigned to the member variable __forwarding, and in __main_block_func_0 the value of a is accessed through __forwarding, which points to the external variable a. This is why adding __block makes it possible to modify the value of an external variable. Without __block, changing the value produces a semantic error at compile time.
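
The role of __forwarding can be sketched in plain C as well. Again this is a simplified model, not clang's output: sketch_byref, sketch_ref_block, sketch_ref_block_func, and byref_demo are hypothetical names, and the isa, flags, and size fields are left out to keep the focus on the pointer indirection.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for __Block_byref_a_0. */
struct sketch_byref {
    struct sketch_byref *forwarding; /* initially points at the struct itself */
    int a;
};

/* Hypothetical stand-in for __main_block_impl_0 with a by-reference capture. */
struct sketch_ref_block {
    struct sketch_byref *a; /* pointer to the wrapper, not a copy of the int */
};

/* Plays the role of __main_block_func_0: every access goes through forwarding. */
static void sketch_ref_block_func(struct sketch_ref_block *self) {
    struct sketch_byref *a = self->a; /* bound by ref */
    (a->forwarding->a)++;
    printf("chenxi - %d\n", a->forwarding->a);
}

/* Wraps the variable, hands the wrapper to the block, and runs the block. */
static int byref_demo(void) {
    struct sketch_byref a = { &a, 8 };            /* mirrors { ..., &a, ..., 8 } */
    struct sketch_ref_block block = { a.forwarding };
    sketch_ref_block_func(&block);                /* increments through the wrapper */
    return a.forwarding->a;                       /* outer code sees the change */
}
```

Because both the block body and the outer code reach the integer through the same wrapper, the increment made inside the block is visible outside: byref_demo() returns 9.
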

Analyzing block copy and signature through assembly

Here we trace the flow of a block through assembly analysis.

First, we do not use any external variables in the block. After hitting the breakpoint we locate objc_retainBlock and then step through it.

Stepping further, we reach the _Block_copy function, where printing x0 shows a block of type __NSGlobalBlock__, because a block that captures nothing is a global block. Now let's modify the block to use an external variable and print it again.

- (void)viewDidLoad {
    [super viewDidLoad];
    
    NSObject *object = [[NSObject alloc] init];
    void (^block)(void) = ^{
        NSLog(@"block --- %@",object);
    };
    block();
}

After the block uses the external variable, the printed type becomes __NSStackBlock__. It is still a stack block because it has not been copied yet. We then set a breakpoint at the return of _Block_copy and print the result again.

The output is now __NSMallocBlock__, a heap block, which shows that _Block_copy copied the block from the stack to the heap. Let's look at the source code of _Block_copy:

void *_Block_copy(const void *arg) {
    struct Block_layout *aBlock;

    if (!arg) return NULL;

    aBlock = (struct Block_layout *)arg;
    if (aBlock->flags & BLOCK_NEEDS_FREE) {
        // heap block: only the reference count is incremented
        latching_incr_int(&aBlock->flags);
        return aBlock;
    }
    else if (aBlock->flags & BLOCK_IS_GLOBAL) {
        // global block: returned as is
        return aBlock;
    }
    else {
        // stack block (compile time): copy it to the heap
        size_t size = Block_size(aBlock);
        struct Block_layout *result = (struct Block_layout *)malloc(size);
        if (!result) return NULL;
        // copy
        memmove(result, aBlock, size); // bitcopy first
#if __has_feature(ptrauth_calls)
        // Resign the invoke pointer as it uses address authentication.
        result->invoke = aBlock->invoke;

#if __has_feature(ptrauth_signed_block_descriptors)
        if (aBlock->flags & BLOCK_SMALL_DESCRIPTOR) {
            uintptr_t oldDesc = ptrauth_blend_discriminator(
                    &aBlock->descriptor,
                    _Block_descriptor_ptrauth_discriminator);
            uintptr_t newDesc = ptrauth_blend_discriminator(
                    &result->descriptor,
                    _Block_descriptor_ptrauth_discriminator);

            result->descriptor =
                    ptrauth_auth_and_resign(aBlock->descriptor,
                                            ptrauth_key_asda, oldDesc,
                                            ptrauth_key_asda, newDesc);
        }
#endif
#endif
        result->flags &= ~(BLOCK_REFCOUNT_MASK|BLOCK_DEALLOCATING);    // XXX not needed
        result->flags |= BLOCK_NEEDS_FREE | 2;  // logical refcount 1
        _Block_call_copy_helper(result, aBlock);
        // the copy is now a heap block
        result->isa = _NSConcreteMallocBlock;
        return result;
    }
}
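
The three branches of _Block_copy can be illustrated with a reduced model in plain C. This is a sketch under assumptions, not the runtime code: sk_block, SK_NEEDS_FREE, SK_IS_GLOBAL, sk_block_copy, and sk_copy_demo are hypothetical names, the flag values are made up, and the reference count is kept in its own field instead of being packed into flags as the runtime does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flags; the real values live in the runtime headers. */
enum {
    SK_NEEDS_FREE = 1 << 0, /* block already lives on the heap */
    SK_IS_GLOBAL  = 1 << 1  /* block lives in global storage */
};

struct sk_block {
    int flags;
    int refcount; /* the runtime packs this into flags; kept separate here */
    int captured;
};

/* Simplified model of the three branches of _Block_copy. */
static struct sk_block *sk_block_copy(struct sk_block *b) {
    if (!b) return NULL;
    if (b->flags & SK_NEEDS_FREE) { /* heap block: just bump the count */
        b->refcount++;
        return b;
    }
    if (b->flags & SK_IS_GLOBAL) {  /* global block: nothing to do */
        return b;
    }
    /* stack block: allocate, bit-copy, then mark the copy as a heap block */
    struct sk_block *result = malloc(sizeof *result);
    if (!result) return NULL;
    memmove(result, b, sizeof *result);
    result->flags |= SK_NEEDS_FREE;
    result->refcount = 1;
    return result;
}

/* Returns 1 when all three branches behave as described. */
static int sk_copy_demo(void) {
    struct sk_block global_block = { SK_IS_GLOBAL, 0, 1 };
    struct sk_block stack_block = { 0, 0, 8 };

    struct sk_block *heap = sk_block_copy(&stack_block);
    if (!heap) return 0;
    int ok = heap != &stack_block && heap->captured == 8 && heap->refcount == 1;
    ok = ok && sk_block_copy(heap) == heap && heap->refcount == 2;
    ok = ok && sk_block_copy(&global_block) == &global_block;
    free(heap);
    return ok;
}
```

Only the stack case allocates memory. Copying a block that is already on the heap returns the same pointer with a higher count, which is why repeated copies of the same block are cheap.
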

Block signature

The printed signature is v8@?0. Here v is the void return type, 8 means the arguments take 8 bytes in total, and @?0 is the first argument, starting at offset 0. @ represents the id type, and the output shows that a block is encoded as @?. A signature is required for message forwarding.

The structure of Block_layout

struct Block_layout {
    // isa pointer
    void * __ptrauth_objc_isa_pointer isa;
    // flags
    volatile int32_t flags;
    int32_t reserved;
    // function to invoke
    BlockInvokeFunction invoke;
    struct Block_descriptor_1 *descriptor;
    // imported variables
};

struct Block_descriptor_1 {
    uintptr_t reserved;
    // block size
    uintptr_t size;
};

The source code shows the data structure of Block_layout, but it does not show where the signature, copy, and dispose data live. Searching the source, we find copy and dispose in Block_descriptor_2 and signature in Block_descriptor_3.

#define BLOCK_DESCRIPTOR_2 1
struct Block_descriptor_2 {
    // requires BLOCK_HAS_COPY_DISPOSE
    BlockCopyFunction copy;
    BlockDisposeFunction dispose;
};

#define BLOCK_DESCRIPTOR_3 1
struct Block_descriptor_3 {
    // requires BLOCK_HAS_SIGNATURE
    const char *signature;
    const char *layout;     // contents depend on BLOCK_HAS_EXTENDED_LAYOUT
};

Now let's look at the getters for Block_descriptor_2 and Block_descriptor_3.

static struct Block_descriptor_2 * _Block_descriptor_2(struct Block_layout *aBlock)
{
    uint8_t *desc = (uint8_t *)_Block_get_descriptor(aBlock);
    desc += sizeof(struct Block_descriptor_1);
    return (struct Block_descriptor_2 *)desc;
}

static struct Block_descriptor_3 * _Block_descriptor_3(struct Block_layout *aBlock)
{
    uint8_t *desc = (uint8_t *)_Block_get_descriptor(aBlock);
    desc += sizeof(struct Block_descriptor_1);
    if (aBlock->flags & BLOCK_HAS_COPY_DISPOSE) {
        desc += sizeof(struct Block_descriptor_2);
    }
    return (struct Block_descriptor_3 *)desc;
}

The code shows that Block_descriptor_2 is located by offsetting the descriptor pointer by the size of Block_descriptor_1, and Block_descriptor_3 is located by offsetting it further by the size of Block_descriptor_2. However, the code first checks whether Block_descriptor_2 exists. If it does not, Block_descriptor_3 is stored where Block_descriptor_2 would otherwise be. In other words, the signature, copy, and dispose data of a Block_layout are optional and depend on the block's flags.
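
The offset logic can be tried out in isolation. The following sketch is an illustration only: the sk_ structs mirror the descriptor parts but are hypothetical, the flag values are made up, and the code assumes that the parts are laid out back to back without padding, which holds on common ABIs because every field is pointer sized.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values, not the ones used by the runtime. */
enum { SK_HAS_COPY_DISPOSE = 1 << 0, SK_HAS_SIGNATURE = 1 << 1 };

struct sk_desc_1 { uintptr_t reserved; uintptr_t size; };
struct sk_desc_2 { void (*copy)(void *, const void *); void (*dispose)(const void *); };
struct sk_desc_3 { const char *signature; const char *layout; };

/* Two possible descriptor layouts: with and without the copy/dispose part. */
struct sk_desc_full  { struct sk_desc_1 d1; struct sk_desc_2 d2; struct sk_desc_3 d3; };
struct sk_desc_short { struct sk_desc_1 d1; struct sk_desc_3 d3; };

/* Same pointer arithmetic as _Block_descriptor_3. */
static const struct sk_desc_3 *sk_descriptor_3(const void *descriptor, int flags) {
    const uint8_t *desc = (const uint8_t *)descriptor;
    desc += sizeof(struct sk_desc_1);
    if (flags & SK_HAS_COPY_DISPOSE) {
        desc += sizeof(struct sk_desc_2); /* skip part 2 only when it exists */
    }
    return (const struct sk_desc_3 *)desc;
}

/* Returns 1 when part 3 is found at the right place in both layouts. */
static int sk_descriptor_demo(void) {
    static const struct sk_desc_full full = { { 0, 0 }, { 0, 0 }, { "v8@?0", 0 } };
    static const struct sk_desc_short brief = { { 0, 0 }, { "v8@?0", 0 } };

    const struct sk_desc_3 *a = sk_descriptor_3(&full, SK_HAS_COPY_DISPOSE | SK_HAS_SIGNATURE);
    const struct sk_desc_3 *b = sk_descriptor_3(&brief, SK_HAS_SIGNATURE);
    return a == &full.d3 && b == &brief.d3 &&
           strcmp(a->signature, "v8@?0") == 0 &&
           strcmp(b->signature, "v8@?0") == 0;
}
```

The design keeps small blocks small: a block without copy and dispose helpers does not pay for two unused function pointers, and the flags tell the reader of the descriptor which parts to expect.
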

The life cycle of variables captured by a block

#include <stdio.h>
int main(){
    int a = 8;
    void(^block)(void) = ^{
        printf("chenxi - %d",a);
    };
    block();
    return 0;
}

As we saw earlier, when a in this code is declared with __block (initialized to 18 in the output shown here), clang generates the following C++ code:

static void __main_block_copy_0(struct __main_block_impl_0*dst, struct __main_block_impl_0*src) {
    _Block_object_assign((void*)&dst->a, (void*)src->a, 8/*BLOCK_FIELD_IS_BYREF*/);
}

static void __main_block_dispose_0(struct __main_block_impl_0*src) {
    _Block_object_dispose((void*)src->a, 8/*BLOCK_FIELD_IS_BYREF*/);
}

static struct __main_block_desc_0 {
    size_t reserved;
    size_t Block_size;
    void (*copy)(struct __main_block_impl_0*, struct __main_block_impl_0*);
    void (*dispose)(struct __main_block_impl_0*);
} __main_block_desc_0_DATA = { 0, sizeof(struct __main_block_impl_0), __main_block_copy_0, __main_block_dispose_0};

int main(){
    // int a = 18;
    __Block_byref_a_0 a = {(void*)0, (__Block_byref_a_0 *)&a, 0, sizeof(__Block_byref_a_0), 18};
    void (*block)(void) = ((void (*)())&__main_block_impl_0((void *)__main_block_func_0, &__main_block_desc_0_DATA, (__Block_byref_a_0 *)&a, 570425344));
    ((void (*)(__block_impl *))((__block_impl *)block)->FuncPtr)((__block_impl *)block);
    return 0;
}

Here __main_block_desc_0 is the block's descriptor struct, and __main_block_desc_0_DATA initializes it. __main_block_copy_0 corresponds to copy in Block_descriptor_2, and __main_block_dispose_0 corresponds to dispose in Block_descriptor_2. When __main_block_copy_0 is called, it executes _Block_object_assign((void*)&dst->a, (void*)src->a, 8).

Searching the source code for _Block_object_assign shows that its last argument is a flag identifying the type of the captured variable. Here the value passed is 8, which is BLOCK_FIELD_IS_BYREF, because the captured variable is declared with __block. Let's take a look at the _Block_object_assign function.

_Block_object_assign source code analysis

void _Block_object_assign(void *destArg, const void *object, const int flags) {
    const void **dest = (const void **)destArg;
    switch (os_assumes(flags & BLOCK_ALL_COPY_DISPOSE_FLAGS)) {
      case BLOCK_FIELD_IS_OBJECT:
        _Block_retain_object(object);
        *dest = object;
        break;

      case BLOCK_FIELD_IS_BLOCK:
        *dest = _Block_copy(object);
        break;

      case BLOCK_FIELD_IS_BYREF | BLOCK_FIELD_IS_WEAK:
      case BLOCK_FIELD_IS_BYREF:
        *dest = _Block_byref_copy(object);
        break;

      case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_OBJECT:
      case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_BLOCK:
        *dest = object;
        break;

      case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_OBJECT | BLOCK_FIELD_IS_WEAK:
      case BLOCK_BYREF_CALLER | BLOCK_FIELD_IS_BLOCK  | BLOCK_FIELD_IS_WEAK:
        *dest = object;
        break;

      default:
        break;
    }
}
  • If the captured variable is of type BLOCK_FIELD_IS_OBJECT, that is, a pointer to an object without the __block modifier, the block retains the object and its reference count is incremented by 1:
    _Block_retain_object(object);
    *dest = object;
  • If it is of type BLOCK_FIELD_IS_BLOCK, that is, the captured variable is itself a block, a _Block_copy operation is performed:
    *dest = _Block_copy(object);
  • If it is of type BLOCK_FIELD_IS_BYREF, that is, the variable has the __block modifier, _Block_byref_copy is called:
    *dest = _Block_byref_copy(object);
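
To make the dispatch easier to follow, here is a small plain C sketch that maps a flags value to the action taken in the switch above. The SK_ constants are hypothetical mirrors of the runtime flags. Only the value 8 for the by-reference case is confirmed by the generated code in this article; the other numbers are assumptions based on my reading of the runtime headers, so check them against the source before relying on them. sk_assign_action is also a made-up helper.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirrors of the runtime flags; values other than 8 are assumed. */
enum {
    SK_FIELD_IS_OBJECT = 3,
    SK_FIELD_IS_BLOCK  = 7,
    SK_FIELD_IS_BYREF  = 8,   /* the value seen in the generated copy helper */
    SK_FIELD_IS_WEAK   = 16,
    SK_BYREF_CALLER    = 128,
    SK_ALL_FLAGS = SK_FIELD_IS_OBJECT | SK_FIELD_IS_BLOCK | SK_FIELD_IS_BYREF |
                   SK_FIELD_IS_WEAK | SK_BYREF_CALLER
};

/* Names the branch of the switch that a given flags value would take. */
static const char *sk_assign_action(int flags) {
    switch (flags & SK_ALL_FLAGS) {
    case SK_FIELD_IS_OBJECT:
        return "retain object";
    case SK_FIELD_IS_BLOCK:
        return "_Block_copy";
    case SK_FIELD_IS_BYREF | SK_FIELD_IS_WEAK:
    case SK_FIELD_IS_BYREF:
        return "_Block_byref_copy";
    case SK_BYREF_CALLER | SK_FIELD_IS_OBJECT:
    case SK_BYREF_CALLER | SK_FIELD_IS_BLOCK:
    case SK_BYREF_CALLER | SK_FIELD_IS_OBJECT | SK_FIELD_IS_WEAK:
    case SK_BYREF_CALLER | SK_FIELD_IS_BLOCK | SK_FIELD_IS_WEAK:
        return "plain assignment";
    default:
        return "ignored";
    }
}
```

With these values, sk_assign_action(8) reports the _Block_byref_copy branch, which is the path taken by the __main_block_copy_0 helper shown earlier.
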

_Block_byref_copy source code analysis

static struct Block_byref *_Block_byref_copy(const void *arg) {
    struct Block_byref *src = (struct Block_byref *)arg;

    if ((src->forwarding->flags & BLOCK_REFCOUNT_MASK) == 0) {
        // src points to stack
        struct Block_byref *copy = (struct Block_byref *)malloc(src->size);
        copy->isa = NULL;
        // byref value 4 is logical refcount of 2: one for caller, one for stack
        copy->flags = src->flags | BLOCK_BYREF_NEEDS_FREE | 4;
        copy->forwarding = copy; // patch heap copy to point to itself
        src->forwarding = copy;  // patch stack to point to heap copy
        copy->size = src->size;

        if (src->flags & BLOCK_BYREF_HAS_COPY_DISPOSE) {
            // Trust copy helper to copy everything of interest
            // If more than one field shows up in a byref block this is wrong XXX
            struct Block_byref_2 *src2 = (struct Block_byref_2 *)(src+1);
            struct Block_byref_2 *copy2 = (struct Block_byref_2 *)(copy+1);
            copy2->byref_keep = src2->byref_keep;
            copy2->byref_destroy = src2->byref_destroy;

            if (src->flags & BLOCK_BYREF_LAYOUT_EXTENDED) {
                struct Block_byref_3 *src3 = (struct Block_byref_3 *)(src2+1);
                struct Block_byref_3 *copy3 = (struct Block_byref_3*)(copy2+1);
                copy3->layout = src3->layout;
            }

            // captured external variable: memory handling, life cycle preserved
            (*src2->byref_keep)(copy, src);
        }
        else {
            // Bitwise copy.
            // This copy includes Block_byref_3, if any.
            memmove(copy+1, src+1, src->size - sizeof(*src));
        }
    }
    // already copied to heap
    else if ((src->forwarding->flags & BLOCK_BYREF_NEEDS_FREE) == BLOCK_BYREF_NEEDS_FREE) {
        latching_incr_int(&src->forwarding->flags);
    }

    return src->forwarding;
}
  • The external variable, already wrapped in a struct, is received as Block_byref *src

  • If src is still on the stack (the BLOCK_FIELD_IS_BYREF case), malloc is called and a Block_byref *copy is created

  • forwarding is set so that code inside and outside the block refers to the same object

    copy->forwarding = copy;
    src->forwarding = copy;
  • The keep and destroy functions of Block_byref are copied over, and the byref_keep function is called
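
The effect of patching both forwarding pointers can be checked with a reduced model. This is a sketch under assumptions rather than the runtime implementation: sk_byref, sk_byref_copy, and sk_forwarding_demo are hypothetical names, and the on_heap and refcount fields replace the bits that the runtime packs into flags.

```c
#include <assert.h>
#include <stdlib.h>

struct sk_byref {
    struct sk_byref *forwarding;
    int on_heap;  /* hypothetical stand-in for the NEEDS_FREE flag */
    int refcount; /* the runtime packs this into flags */
    int value;
};

/* Simplified model of _Block_byref_copy without keep/destroy helpers. */
static struct sk_byref *sk_byref_copy(struct sk_byref *src) {
    if (!src->forwarding->on_heap) {
        /* src is still on the stack: create the heap copy */
        struct sk_byref *copy = malloc(sizeof *copy);
        if (!copy) return NULL;
        *copy = *src;
        copy->on_heap = 1;
        copy->refcount = 2;      /* one for the caller, one for the stack */
        copy->forwarding = copy; /* heap copy points to itself */
        src->forwarding = copy;  /* stack struct points to the heap copy */
    } else {
        src->forwarding->refcount++; /* already copied to the heap */
    }
    return src->forwarding;
}

/* Writes through both the stack struct and the heap copy, then reads back. */
static int sk_forwarding_demo(void) {
    struct sk_byref a = { &a, 0, 0, 8 };
    struct sk_byref *heap = sk_byref_copy(&a);
    if (!heap) return -1;
    a.forwarding->value++;     /* code outside the block */
    heap->forwarding->value++; /* code inside the block */
    int result = heap->value;
    int same = (a.forwarding == heap) && (a.value == 8);
    free(heap);
    return same ? result : -1;
}
```

Both increments land on the heap copy, so sk_forwarding_demo() returns 10, while the stale value in the stack struct stays at 8. This is why every access in the generated code goes through __forwarding instead of reading the field directly.
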

The design of Block_byref is similar to that of the Block_layout descriptor: the flags field of the byref struct is used to determine whether Block_byref_2 exists, and Block_byref_2 and Block_byref_3 follow directly after the Block_byref header, as the (src+1) and (src2+1) casts in the code above show.

If an external object variable is modified with __block, two functions are generated by default in the Block_byref struct in the CPP file: the keep function and the destroy function corresponding to Block_byref_2.

Searching the CPP file for the implementations of these two functions shows that the keep function calls _Block_object_assign again, which performs the BLOCK_FIELD_IS_OBJECT handling on the object stored in the Block_byref struct.

We now know the three levels of copying involved in a block:

  • Copy of the block itself: the stack block is copied to the heap
  • Copy of the Block_byref struct that wraps the variable modified with __block
  • Handling of the object wrapped by Block_byref, by calling the _Block_object_assign function on it