To learn how Objective-C features are implemented, we can download objc4 and other sources from Apple's open source website. A previous article covered how to debug the objc4 source. This article, kept as personal notes, takes a brief look at some Objective-C implementation details from an assembly point of view. It covers the common basics: class metadata, properties, access to member variables, calling instance methods, calling class methods, and blocks.

Preface

Objective-C source files (.m) are compiled by Clang + LLVM, and Swift source files by the Swift compiler + LLVM. With the clang command, you can view the compilation result of a .m file.

xcrun --sdk iphoneos clang -arch arm64 -S XXXX.m

-arch <arm | arm64 | x86_64 | i386>: the target architecture of the generated code.

-S: stop after compilation and emit assembly instead of an object file. -o: the output file.

Once the assembly is generated, it can be understood by reading it alongside the corresponding source in objc4.

Class Metadata

  1. @interface
#import "THNormalClass.h"

@interface THNormalClass()

@end
	.section	__TEXT,__text,regular,pure_instructions
	.build_version ios, 15, 2	sdk_version 15, 2
	.section	__DATA,__objc_imageinfo,regular,no_dead_strip
L_OBJC_IMAGE_INFO:
	.long	0
	.long	64

.subsections_via_symbols

For an @interface alone, the compiler generates almost no assembly: only the image info section is emitted.

  2. @implementation

    #import "THNormalClass.h"
    
    @interface THNormalClass()
    
    @end
    
    @implementation THNormalClass
    
    @end
    	.section	__TEXT,__text,regular,pure_instructions
    	.build_version ios, 15, 2	sdk_version 15, 2
    	.section	__TEXT,__objc_classname,cstring_literals
    l_OBJC_CLASS_NAME_:                     ; @OBJC_CLASS_NAME_
    	.asciz	"THNormalClass"

    	.section	__DATA,__objc_const
    	.p2align	3                               ; @"_OBJC_METACLASS_RO_$_THNormalClass"
    __OBJC_METACLASS_RO_$_THNormalClass:
    	.long	1                               ; 0x1
    	.long	40                              ; 0x28
    	.long	40                              ; 0x28
    	.space	4
    	.quad	0
    	.quad	l_OBJC_CLASS_NAME_
    	.quad	0
    	.quad	0
    	.quad	0
    	.quad	0
    	.quad	0

    	.section	__DATA,__objc_data
    	.globl	_OBJC_METACLASS_$_THNormalClass ; @"OBJC_METACLASS_$_THNormalClass"
    	.p2align	3
    _OBJC_METACLASS_$_THNormalClass:
    	.quad	_OBJC_METACLASS_$_NSObject
    	.quad	_OBJC_METACLASS_$_NSObject
    	.quad	__objc_empty_cache
    	.quad	0
    	.quad	__OBJC_METACLASS_RO_$_THNormalClass

    	.section	__DATA,__objc_const
    	.p2align	3                               ; @"_OBJC_CLASS_RO_$_THNormalClass"
    __OBJC_CLASS_RO_$_THNormalClass:
    	.long	0                               ; 0x0
    	.long	8                               ; 0x8
    	.long	8                               ; 0x8
    	.space	4
    	.quad	0
    	.quad	l_OBJC_CLASS_NAME_
    	.quad	0
    	.quad	0
    	.quad	0
    	.quad	0
    	.quad	0

    	.section	__DATA,__objc_data
    	.globl	_OBJC_CLASS_$_THNormalClass     ; @"OBJC_CLASS_$_THNormalClass"
    	.p2align	3
    _OBJC_CLASS_$_THNormalClass:
    	.quad	_OBJC_METACLASS_$_THNormalClass
    	.quad	_OBJC_CLASS_$_NSObject
    	.quad	__objc_empty_cache
    	.quad	0
    	.quad	__OBJC_CLASS_RO_$_THNormalClass

    	.section	__DATA,__objc_classlist,regular,no_dead_strip
    	.p2align	3                               ; @"OBJC_LABEL_CLASS_$"
    l_OBJC_LABEL_CLASS_$:
    	.quad	_OBJC_CLASS_$_THNormalClass

    	.section	__DATA,__objc_imageinfo,regular,no_dead_strip
    L_OBJC_IMAGE_INFO:
    	.long	0
    	.long	64

    .subsections_via_symbols
    • __OBJC_CLASS_RO_$_THNormalClass and __OBJC_METACLASS_RO_$_THNormalClass correspond to struct class_ro_t in objc-runtime-new.h

      struct class_ro_t {
          uint32_t flags;
          uint32_t instanceStart;
          uint32_t instanceSize;                    // memory size used by an instance
          const uint8_t * ivarLayout;
          const char * name;                        // name of the class
          method_list_t * baseMethodList;           // read-only method list
          protocol_list_t * baseProtocols;
          const ivar_list_t * ivars;                // list of member variables
          const uint8_t * weakIvarLayout;
          property_list_t *baseProperties;

          method_list_t *baseMethods() const {
              return baseMethodList;
          }
      };
    • _OBJC_CLASS_$_THNormalClass and _OBJC_METACLASS_$_THNormalClass correspond to struct objc_class in objc-runtime-new.h

      struct objc_class : objc_object {
          // Class ISA;
          Class superclass;
          cache_t cache;          // formerly cache pointer and vtable
          class_data_bits_t bits; // used to get the class's specific information

          class_rw_t *data() {
              return bits.data();
              // class_data_bits_t::data() returns (class_rw_t *)(bits & FAST_DATA_MASK)
          }
          // ... everything else is methods
      };

Properties

@interface THNormalClass()

@property(nonatomic, copy) NSString *str1;

@property(nonatomic, copy) NSString *str2;

@end

@implementation THNormalClass

@end
	.section	__TEXT,__text,regular,pure_instructions
	.build_version ios, 15, 2	sdk_version 15, 2
	.p2align	2                               ; -- Begin function -[THNormalClass str1]
"-[THNormalClass str1]":                ; @"\01-[THNormalClass str1]"
...
"-[THNormalClass setStr1:]":            ; @"\01-[THNormalClass setStr1:]"
...
"-[THNormalClass str2]":                ; @"\01-[THNormalClass str2]"
...
"-[THNormalClass setStr2:]":            ; @"\01-[THNormalClass setStr2:]"
...
l_OBJC_CLASS_NAME_:                     ; @OBJC_CLASS_NAME_
...
__OBJC_METACLASS_RO_$_THNormalClass:
...
_OBJC_METACLASS_$_THNormalClass:
...
l_OBJC_METH_VAR_NAME_:                  ; @OBJC_METH_VAR_NAME_
	.asciz	"str1"

	.section	__TEXT,__objc_methtype,cstring_literals
l_OBJC_METH_VAR_TYPE_:                  ; @OBJC_METH_VAR_TYPE_
	.asciz	"@16@0:8"

	.section	__TEXT,__objc_methname,cstring_literals
l_OBJC_METH_VAR_NAME_.1:                ; @OBJC_METH_VAR_NAME_.1
	.asciz	"setStr1:"

...
__OBJC_$_INSTANCE_METHODS_THNormalClass:
	.long	24                              ; 0x18
	.long	4                               ; 0x4
	.quad	l_OBJC_METH_VAR_NAME_
	.quad	l_OBJC_METH_VAR_TYPE_
	.quad	"-[THNormalClass str1]"
	.quad	l_OBJC_METH_VAR_NAME_.1
	.quad	l_OBJC_METH_VAR_TYPE_.2
	.quad	"-[THNormalClass setStr1:]"
	.quad	l_OBJC_METH_VAR_NAME_.3
	.quad	l_OBJC_METH_VAR_TYPE_
	.quad	"-[THNormalClass str2]"
	.quad	l_OBJC_METH_VAR_NAME_.4
	.quad	l_OBJC_METH_VAR_TYPE_.2
	.quad	"-[THNormalClass setStr2:]"

	.private_extern	_OBJC_IVAR_$_THNormalClass._str1 ; @"OBJC_IVAR_$_THNormalClass._str1"
	.section	__DATA,__objc_ivar
	.globl	_OBJC_IVAR_$_THNormalClass._str1
	.p2align	2
_OBJC_IVAR_$_THNormalClass._str1:
	.long	8                               ; 0x8

	.section	__TEXT,__objc_methname,cstring_literals
l_OBJC_METH_VAR_NAME_.5:                ; @OBJC_METH_VAR_NAME_.5
	.asciz	"_str1"

	.section	__TEXT,__objc_methtype,cstring_literals
l_OBJC_METH_VAR_TYPE_.6:                ; @OBJC_METH_VAR_TYPE_.6
	.asciz	"@\"NSString\""

	.private_extern	_OBJC_IVAR_$_THNormalClass._str2 ; @"OBJC_IVAR_$_THNormalClass._str2"
	.section	__DATA,__objc_ivar
	.globl	_OBJC_IVAR_$_THNormalClass._str2
	.p2align	2
_OBJC_IVAR_$_THNormalClass._str2:
	.long	16                              ; 0x10

	.section	__TEXT,__objc_methname,cstring_literals
l_OBJC_METH_VAR_NAME_.7:                ; @OBJC_METH_VAR_NAME_.7
	.asciz	"_str2"

	.section	__DATA,__objc_const
	.p2align	3                               ; @"_OBJC_$_INSTANCE_VARIABLES_THNormalClass"
__OBJC_$_INSTANCE_VARIABLES_THNormalClass:
	.long	32                              ; 0x20
	.long	2                               ; 0x2
	.quad	_OBJC_IVAR_$_THNormalClass._str1
	.quad	l_OBJC_METH_VAR_NAME_.5
	.quad	l_OBJC_METH_VAR_TYPE_.6
	.long	3                               ; 0x3
	.long	8                               ; 0x8
	.quad	_OBJC_IVAR_$_THNormalClass._str2
	.quad	l_OBJC_METH_VAR_NAME_.7
	.quad	l_OBJC_METH_VAR_TYPE_.6
	.long	3                               ; 0x3
	.long	8                               ; 0x8

	.section	__TEXT,__objc_methname,cstring_literals
l_OBJC_PROP_NAME_ATTR_:                 ; @OBJC_PROP_NAME_ATTR_
	.asciz	"str1"

l_OBJC_PROP_NAME_ATTR_.8:               ; @OBJC_PROP_NAME_ATTR_.8
	.asciz	"T@\"NSString\",C,N,V_str1"

l_OBJC_PROP_NAME_ATTR_.9:               ; @OBJC_PROP_NAME_ATTR_.9
	.asciz	"str2"

l_OBJC_PROP_NAME_ATTR_.10:              ; @OBJC_PROP_NAME_ATTR_.10
	.asciz	"T@\"NSString\",C,N,V_str2"

	.section	__DATA,__objc_const
	.p2align	3                               ; @"_OBJC_$_PROP_LIST_THNormalClass"
__OBJC_$_PROP_LIST_THNormalClass:
	.long	16                              ; 0x10
	.long	2                               ; 0x2
	.quad	l_OBJC_PROP_NAME_ATTR_
	.quad	l_OBJC_PROP_NAME_ATTR_.8
	.quad	l_OBJC_PROP_NAME_ATTR_.9
	.quad	l_OBJC_PROP_NAME_ATTR_.10

	.p2align	3                               ; @"_OBJC_CLASS_RO_$_THNormalClass"
__OBJC_CLASS_RO_$_THNormalClass:
	.long	0                               ; 0x0
	.long	8                               ; 0x8
	.long	24                              ; 0x18
	.space	4
	.quad	0
	.quad	l_OBJC_CLASS_NAME_
	.quad	__OBJC_$_INSTANCE_METHODS_THNormalClass
	.quad	0
	.quad	__OBJC_$_INSTANCE_VARIABLES_THNormalClass
	.quad	0
	.quad	__OBJC_$_PROP_LIST_THNormalClass

	.section	__DATA,__objc_data
	.globl	_OBJC_CLASS_$_THNormalClass     ; @"OBJC_CLASS_$_THNormalClass"
	.p2align	3
_OBJC_CLASS_$_THNormalClass:
	.quad	_OBJC_METACLASS_$_THNormalClass
	.quad	_OBJC_CLASS_$_NSObject
	.quad	__objc_empty_cache
	.quad	0
	.quad	__OBJC_CLASS_RO_$_THNormalClass

	.section	__DATA,__objc_classlist,regular,no_dead_strip
	.p2align	3                               ; @"OBJC_LABEL_CLASS_$"
l_OBJC_LABEL_CLASS_$:
	.quad	_OBJC_CLASS_$_THNormalClass

	.section	__DATA,__objc_imageinfo,regular,no_dead_strip
L_OBJC_IMAGE_INFO:
	.long	0
	.long	64

.subsections_via_symbols

__OBJC_$_INSTANCE_VARIABLES_THNormalClass corresponds to struct ivar_list_t in objc-runtime-new.h: two .long (4-byte) header fields, here the entry size (32) and the entry count (2), followed by the entries themselves.

__OBJC_$_INSTANCE_VARIABLES_THNormalClass:
	.long	32                              ; 0x20
	.long	2                               ; 0x2
	.quad	_OBJC_IVAR_$_THNormalClass._str1
	.quad	l_OBJC_METH_VAR_NAME_.5
	.quad	l_OBJC_METH_VAR_TYPE_.6
	.long	3                               ; 0x3
	.long	8                               ; 0x8
	.quad	_OBJC_IVAR_$_THNormalClass._str2
	.quad	l_OBJC_METH_VAR_NAME_.7
	.quad	l_OBJC_METH_VAR_TYPE_.6
	.long	3                               ; 0x3
	.long	8                               ; 0x8

Each entry corresponds to struct ivar_t in objc-runtime-new.h. The last part of the full listing shows that ivars in struct class_ro_t points to the ivar_list_t described above.

How struct objc_class and class_ro_t fit together for properties:

  1. The compiler first automatically generates getters and setters for the property and places them in method_list_t.

    __OBJC_$_INSTANCE_METHODS_THNormalClass:
    	.long	24                              ; 0x18
    	.long	4                               ; 0x4
    	.quad	l_OBJC_METH_VAR_NAME_
    	.quad	l_OBJC_METH_VAR_TYPE_
    	.quad	"-[THNormalClass str1]"
    	.quad	l_OBJC_METH_VAR_NAME_.1
    	.quad	l_OBJC_METH_VAR_TYPE_.2
    	.quad	"-[THNormalClass setStr1:]"
    	.quad	l_OBJC_METH_VAR_NAME_.3
    	.quad	l_OBJC_METH_VAR_TYPE_
    	.quad	"-[THNormalClass str2]"
    	.quad	l_OBJC_METH_VAR_NAME_.4
    	.quad	l_OBJC_METH_VAR_TYPE_.2
    	.quad	"-[THNormalClass setStr2:]"
  2. Since a property is a wrapper around a member variable, the corresponding member variable is held in ivar_list_t.

    __OBJC_$_INSTANCE_VARIABLES_THNormalClass:
    	.long	32                              ; 0x20
    	.long	2                               ; 0x2
    	.quad	_OBJC_IVAR_$_THNormalClass._str1
    	.quad	l_OBJC_METH_VAR_NAME_.5
    	.quad	l_OBJC_METH_VAR_TYPE_.6
    	.long	3                               ; 0x3
    	.long	8                               ; 0x8
    	.quad	_OBJC_IVAR_$_THNormalClass._str2
    	.quad	l_OBJC_METH_VAR_NAME_.7
    	.quad	l_OBJC_METH_VAR_TYPE_.6
    	.long	3                               ; 0x3
    	.long	8                               ; 0x8
  3. Each Objective-C class has a property_list_t, which is referenced from struct class_ro_t.

    __OBJC_$_PROP_LIST_THNormalClass:
    	.long	16                              ; 0x10
    	.long	2                               ; 0x2
    	.quad	l_OBJC_PROP_NAME_ATTR_
    	.quad	l_OBJC_PROP_NAME_ATTR_.8
    	.quad	l_OBJC_PROP_NAME_ATTR_.9
    	.quad	l_OBJC_PROP_NAME_ATTR_.10
  4. Each property_list_t holds struct property_t entries, one per property.

    struct property_t {
        const char *name;
        const char *attributes;
    };
    For str1, the name and attributes strings are:
    l_OBJC_PROP_NAME_ATTR_:                 ; @OBJC_PROP_NAME_ATTR_
    	.asciz	"str1"
    
    l_OBJC_PROP_NAME_ATTR_.8:               ; @OBJC_PROP_NAME_ATTR_.8
    	.asciz	"T@\"NSString\",C,N,V_str1"

Access to member variables

@interface THNormalClass() {
    int a;
    int b;
}

@end

@implementation THNormalClass

- (int)calAdd {
    return a + b;
}

@end
FunctionalProgrammingLearn`-[THNormalClass calAdd]:
    0x100b9a4e4 <+0>:  sub    sp, sp, #0x10             ; =0x10 
    0x100b9a4e8 <+4>:  str    x0, [sp, #0x8]
    0x100b9a4ec <+8>:  str    x1, [sp]
    0x100b9a4f0 <+12>: ldr    x8, [sp, #0x8]
->  0x100b9a4f4 <+16>: ldr    w8, [x8, #0x8]
    0x100b9a4f8 <+20>: ldr    x9, [sp, #0x8]
    0x100b9a4fc <+24>: ldr    w9, [x9, #0xc]
    0x100b9a500 <+28>: add    w0, w8, w9
    0x100b9a504 <+32>: add    sp, sp, #0x10             ; =0x10 
    0x100b9a508 <+36>: ret  

x8 holds the THNormalClass * pointer (self). [x8, #0x8] adds an offset of 8 bytes to reach a in memory, and [x9, #0xc] reaches b at offset 12. ldr here is the load instruction that reads memory into a register.

Calling a method

@interface THNormalClass() {
    int a;
    int b;
}

@end

@implementation THNormalClass

- (int)calAdd {
    return [self calMethod];
}

- (int)calMethod {
    a = 1;
    b = 8;
    return a + b;
}

@end
"-[THNormalClass calAdd]":              ; @"\01-[THNormalClass calAdd]"
	.cfi_startproc
; %bb.0:
	sub	sp, sp, #32                     ; =32
	stp	x29, x30, [sp, #16]             ; 16-byte Folded Spill
	add	x29, sp, #16                    ; =16
	.cfi_def_cfa w29, 16
	.cfi_offset w30, -8
	.cfi_offset w29, -16
	str	x0, [sp, #8]
	str	x1, [sp]
	ldr	x0, [sp, #8]
	adrp	x8, _OBJC_SELECTOR_REFERENCES_@PAGE
	ldr	x1, [x8, _OBJC_SELECTOR_REFERENCES_@PAGEOFF]
	bl	_objc_msgSend
	ldp	x29, x30, [sp, #16]             ; 16-byte Folded Reload
	add	sp, sp, #32                     ; =32
	ret

adr: forms a PC-relative address within a small range and writes it to a register.

adrp: forms the address of a 4 KB page within a large range, relative to the page of the current PC.

bl: branch with link. It jumps to the target and saves the address of the instruction that follows the bl in the link register (x30).

x0 holds the receiver (self) and x1 holds the selector, loaded from _OBJC_SELECTOR_REFERENCES_.

Calling class methods

@implementation THNormalClass

- (void)calMethod {
    return [THNormalClass calClassMethod];
}

+ (void)calClassMethod {
     
}

@end
"-[THNormalClass calMethod]":           ; @"\01-[THNormalClass calMethod]"
	.cfi_startproc
; %bb.0:
	sub	sp, sp, #32                     ; =32
	stp	x29, x30, [sp, #16]             ; 16-byte Folded Spill
	add	x29, sp, #16                    ; =16
	.cfi_def_cfa w29, 16
	.cfi_offset w30, -8
	.cfi_offset w29, -16
	str	x0, [sp, #8]
	str	x1, [sp]
	adrp	x8, _OBJC_CLASSLIST_REFERENCES_$_@PAGE
	ldr	x0, [x8, _OBJC_CLASSLIST_REFERENCES_$_@PAGEOFF]
	adrp	x8, _OBJC_SELECTOR_REFERENCES_@PAGE
	ldr	x1, [x8, _OBJC_SELECTOR_REFERENCES_@PAGEOFF]
	bl	_objc_msgSend
	ldp	x29, x30, [sp, #16]             ; 16-byte Folded Reload
	add	sp, sp, #32                     ; =32
	ret

In the assembly above, you can see that a class method is called the same way as an instance method, through _objc_msgSend. The only difference is the extra pair adrp x8, _OBJC_CLASSLIST_REFERENCES_$_@PAGE and ldr x0, [x8, _OBJC_CLASSLIST_REFERENCES_$_@PAGEOFF], which loads the global class pointer into x0 as the receiver.

adrp x8, 6

Assume that the current PC register is 0x10410E324.

  1. First shift the value 6 left by 12 bits, giving 0x6000.
  2. Clear the lower 12 bits of the PC register: 0x10410E324 => 0x10410E000.
  3. Finally, add 0x10410E000 and 0x6000 and write the result to register x8, so x8 is 0x104114000.

Block

@implementation THNormalClass

void(^block)(void) = ^(void) {
    NSLog(@"this is a block");
};

- (void)calMethod {
    block();
}

@end

  1. You can see that this is an __NSGlobalBlock__.

  2. struct Block_layout {
        void * __ptrauth_objc_isa_pointer isa;
        volatile int32_t flags; // contains ref count
        int32_t reserved;
        BlockInvokeFunction invoke;
        struct Block_descriptor_1 *descriptor;
        // imported variables
    };

    invoke sits at bytes 17-24 of the block in memory (offset 16, after the 8-byte isa and the two 4-byte fields). po 0x00000001028AC090 shows invoke: 0x1028AA304. Disassembling that address with dis -s leads to the NSLog call written in the source, as shown below.

    (lldb) dis -s 0x00000001028aa304
    FunctionalProgrammingLearn`block_block_invoke:
        0x1028aa304 <+0>:  sub    sp, sp, #0x20             ; =0x20 
        0x1028aa308 <+4>:  stp    x29, x30, [sp, #0x10]
        0x1028aa30c <+8>:  add    x29, sp, #0x10            ; =0x10 
        0x1028aa310 <+12>: str    x0, [sp, #0x8]
        0x1028aa314 <+16>: str    x0, [sp]
        0x1028aa318 <+20>: adrp   x0, 2
        0x1028aa31c <+24>: add    x0, x0, #0xb0             ; =0xb0 
        0x1028aa320 <+28>: bl     0x1028aa668               ; symbol stub for: NSLog
    (lldb) 
  3. Signature: "v8@?0"

    Reference: Type Encodings

Afterword

Exploring some common Objective-C topics from an assembly perspective is straightforward when the output is compared against the objc4 and libclosure sources. I also need to organize my own ARM assembly study notes: I have not fully memorized or understood everything yet, so these notes are for later review.