Symbols

In iOS development, every function, method, and variable name in the code becomes a symbol, and the compiler records these symbols in symbol tables. Symbols fall into two kinds, internal and external:

  • 1. Functions, methods, and variables defined inside this Mach-O are internal symbols.
  • 2. External symbols are recorded in what is also known as the indirect symbol table. When we call a function or method defined outside this Mach-O, for example NSLog, the compiler does not know its address, so the NSLog symbol is placed in the indirect symbol table.

Internal symbols are further divided into global symbols and local symbols:

  • 1. Global symbols can be referenced from anywhere, such as the functions an SDK exposes to its clients.
  • 2. Local symbols are visible only within the file that defines them.
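The categories above can be sketched in terms of the n_type byte that each Mach-O symbol table entry carries. This is a simplified illustration, not a full parser: the mask values come from <mach-o/nlist.h>, while the mapping onto the article's terms, the classify helper, and the sample symbol names are assumptions made for the example. It ignores private-extern and common symbols.

```python
# Bit masks and type values from <mach-o/nlist.h>.
N_STAB = 0xE0  # any of these bits set: debugging (stab) entry
N_TYPE = 0x0E  # mask for the type bits
N_EXT = 0x01   # external bit
N_UNDF = 0x00  # undefined here, so it must come from another image
N_SECT = 0x0E  # defined in a section of this image


def classify(n_type: int) -> str:
    """Map an n_type byte onto the categories used in this article (simplified)."""
    if n_type & N_STAB:
        return "debug"
    kind = n_type & N_TYPE
    if kind == N_UNDF and n_type & N_EXT:
        return "external"  # e.g. NSLog, resolved by dyld
    if kind == N_SECT:
        return "global" if n_type & N_EXT else "local"
    return "other"


# Hypothetical sample entries: (symbol name, n_type byte).
samples = [
    ("_NSLog", 0x01),
    ("_SDKPublicFunction", 0x0F),
    ("_fileStaticHelper", 0x0E),
]
for name, n_type in samples:
    print(f"{name}: {classify(n_type)}")
```

In this simplified view, an undefined external entry is what ends up needing binding at run time, while section-defined entries differ only in whether the external bit is set.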

In the Mach-O file:

Symbols stores every symbol in our project, while Indirect Symbols stores the indirect symbol table.

Symbol binding

Let's create a new project that does nothing but call NSLog twice:

- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"First external function call!");
    NSLog(@"Second external function call!");
}

The stub

Static analysis of the symbol table Data in Mach-O

At the first NSLog call, we look at the assembly code.

  • 1. First, we use image list to find the load address of the image after ASLR, which is 0x00000001001f8000:
(lldb) image list
[ 0] 88C36EBA-D614-344F-AEA4-3D9F52E6C102 0x00000001001f8000 /Users/bel/Library/Developer/Xcode/DerivedData/symbolDemo-fgjddtxgssnvjpadkexjmgqpbbun/Build/Products/Debug-iphoneos/symbolDemo.app/symbolDemo

Subtracting the load address from the address that the NSLog call jumps to gives the offset of NSLog in the Mach-O: 0x1001FE50C - 0x00000001001F8000 = 0x650C. We then look at the Data value at that offset in the Mach-O file.
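The subtraction above can be reproduced as a small sketch. It assumes that the address printed by image list is the address of the Mach-O header in memory, so that the difference is the offset from the start of the image. The function name file_offset is made up for this example.

```python
def file_offset(runtime_address: int, load_address: int) -> int:
    """Offset of a runtime address from the start of its loaded image."""
    if runtime_address < load_address:
        raise ValueError("address lies below the image load address")
    return runtime_address - load_address


# Values taken from the LLDB session above.
load_address = 0x00000001001F8000
nslog_stub_address = 0x1001FE50C

print(hex(file_offset(nslog_stub_address, load_address)))
```

Because ASLR picks a new load address on every launch, the runtime addresses change between runs while the offset stays the same, which is why the offset is the value to look up in the Mach-O file.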

In Symbol Stubs, the Data value of the NSLog symbol is 1F2003D590D7025800021FD6. That is the result of statically analyzing the Mach-O file; next we verify it dynamically with LLDB.

Dynamic inspection of the symbol table data with LLDB

At this point, we hold Control and click Step Into to enter the function.

View the memory value

(lldb) x 0x104c6650c
0x104c6650c: 1f 20 03 d5 90 d7 02 58 00 02 1f d6 1f 20 03 d5  . .....X..... ..
0x104c6651c: 70 d7 02 58 00 02 1f d6 1f 20 03 d5 50 d7 02 58  p..X..... ..P..X

We can see that the first 12 bytes in memory, 1f 20 03 d5 90 d7 02 58 00 02 1f d6, are exactly the same as the value 1F2003D590D7025800021FD6 found in the Mach-O file.

This byte string, 1F2003D590D7025800021FD6, is the Data value of the NSLog symbol. It is a short piece of code, which we call a stub. A bl to that address is therefore a bl to the stub, and the stub in turn jumps to the address stored in the symbol table (its Data value).
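To see why these 12 bytes count as code, here is a rough sketch that decodes them as three little-endian ARM64 instructions. The encodings checked are the standard ones for nop, ldr (literal), and br. The helper name decode_stub and its return shape are assumptions made for this illustration.

```python
import struct

STUB_HEX = "1F2003D590D7025800021FD6"  # Data value of the NSLog stub
STUB_OFFSET = 0x650C                   # offset of the stub in the Mach-O


def decode_stub(stub: bytes, offset: int):
    """Decode a 12-byte stub of the form: nop; ldr xT, <literal>; br xN."""
    nop, ldr, br = struct.unpack("<3I", stub)
    if nop != 0xD503201F:
        raise ValueError("first instruction is not nop")
    if ldr & 0xFF000000 != 0x58000000:
        raise ValueError("second instruction is not a 64-bit ldr (literal)")
    if br & 0xFFFFFC1F != 0xD61F0000:
        raise ValueError("third instruction is not br")
    load_register = ldr & 0x1F
    imm19 = (ldr >> 5) & 0x7FFFF
    if imm19 & 0x40000:  # sign-extend the 19-bit field
        imm19 -= 0x80000
    # The literal is addressed relative to the ldr itself, at offset + 4.
    pointer_offset = (offset + 4) + imm19 * 4
    branch_register = (br >> 5) & 0x1F
    return load_register, branch_register, pointer_offset


load_reg, branch_reg, pointer_offset = decode_stub(bytes.fromhex(STUB_HEX), STUB_OFFSET)
print(f"nop; ldr x{load_reg}, [image + {pointer_offset:#x}]; br x{branch_reg}")
```

Under these assumptions the stub loads x16 from offset 0xc000 and branches to it, which agrees with the 0xc000 offset used later in the article to inspect the NSLog entry.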

The symbol binding function

The stub above then executes. When it reaches br x16, we read the value of x16:

(lldb) register read x16
     x16 = 0x0000000104c665c0

Computing the offset in the Mach-O gives 0x0000000104C665C0 - ASLR load address = 0x65C0. Looking at offset 0x65C0, we can see that the code there jumps to 0x65A8 and continues executing from that offset.

So what does the code at 0x65A8 do?

We go back to LLDB and keep clicking Step Into until we reach the assembly code at offset 0x65A8:

    0x104f3a5a8: adr    x17, #0x6f10              ; _dyld_private
    0x104f3a5ac: nop    
    0x104f3a5b0: stp    x16, x17, [sp, #-0x10]!
    0x104f3a5b4: nop    
    0x104f3a5b8: ldr    x16, #0x1a48              ; (void *)0x00000001ba1b435c: dyld_stub_binder
    0x104f3a5bc: br     x16
    0x104f3a5c0: ldr    w16, 0x104f3a5c8
    0x104f3a5c4: b      0x104f3a5a8

LLDB debugging shows that this assembly code calls the dyld_stub_binder function, which is the function that binds symbols.
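As a rough check of where the listing finds the dyld_stub_binder pointer, the arithmetic can be sketched as follows. It assumes that the #0x1a48 operand printed by LLDB is a displacement from the address of the ldr instruction itself, and that the first listed instruction sits at offset 0x65A8 as described above. The variable names are made up for the example.

```python
# Values read from the disassembly listing above.
listing_address = 0x104F3A5A8   # runtime address of the first instruction
listing_offset = 0x65A8         # its offset in the Mach-O
ldr_address = 0x104F3A5B8       # address of "ldr x16, #0x1a48"
ldr_displacement = 0x1A48       # assumed to be relative to ldr_address

load_address = listing_address - listing_offset
pointer_address = ldr_address + ldr_displacement
pointer_offset = pointer_address - load_address

print(f"load address:   {load_address:#x}")
print(f"pointer slot:   {pointer_address:#x}")
print(f"offset in file: {pointer_offset:#x}")
```

Under these assumptions the pointer that ldr loads sits at offset 0x8000 in the image, and br x16 then jumps to whatever address is stored there.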

⚠️⚠️⚠️ Summary:

From the above analysis, we know that NSLog is an external function, and that Symbol Stubs in the Text segment stores the stubs of external functions. The first time NSLog is called, bl jumps to the function's stub, which ends up calling dyld_stub_binder to bind the NSLog symbol.

Non-lazy-loaded symbol table and lazy-loaded symbol table

By looking at the Mach-O, we can see that the dyld_stub_binder symbol is stored in the non-lazy symbol pointers of the Data segment.

We can see that at compile time its Data value is 0 and its offset is 0x8000. Let's look at its value at run time:

(lldb) image list
[ 0] 88C36EBA-D614-344F-AEA4-3D9F52E6C102 0x00000001009d8000 /Users/bel/Library/Developer/Xcode/DerivedData/symbolDemo-fgjddtxgssnvjpadkexjmgqpbbun/Build/Products/Debug-iphoneos/symbolDemo.app/symbolDemo
(lldb) p/x 0x00000001009d8000 + 0x8000
(long) $0 = 0x00000001009e0000
(lldb) x 0x00000001009e0000
0x1009e0000: 5c 43 1b ba 01 00 00 00 d0 cf b8 04 02 00 00 00  \C..............
0x1009e0010: d0 07 00 00 00 00 00 00 e6 f3 9d 00 01 00 00 00  ................
(lldb) dis -s 0x01ba1b435c
libdyld.dylib`dyld_stub_binder:
    0x1ba1b435c <+0>:  stp    x29, x30, [sp, #-0x10]!
    0x1ba1b4360 <+4>:  mov    x29, sp
    0x1ba1b4364 <+8>:  sub    sp, sp, #0xf0             ; =0xf0
    0x1ba1b4368 <+12>: stp    x0, x1, [x29, #-0x10]
    0x1ba1b436c <+16>: stp    x2, x3, [x29, #-0x20]
    0x1ba1b4370 <+20>: stp    x4, x5, [x29, #-0x30]
    0x1ba1b4374 <+24>: stp    x6, x7, [x29, #-0x40]
    0x1ba1b4378 <+28>: stp    x8, x9, [x29, #-0x50]

At run time, the Data value of the dyld_stub_binder entry is no longer 0. For the non-lazy symbol table, the real address is written into each symbol's Data value when the app starts, so these symbols are bound at launch.
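The memory dump stores the pointer in little-endian byte order, so the bytes have to be reversed before they read as an address. A minimal sketch, using the first eight bytes of the dump above and the load address from the same session:

```python
# Slot address: image load address plus the offset seen in the Mach-O file.
load_address = 0x00000001009D8000
slot_address = load_address + 0x8000

# First eight bytes printed by "x 0x00000001009e0000".
raw = bytes.fromhex("5c 43 1b ba 01 00 00 00")
bound_pointer = int.from_bytes(raw, "little")

print(f"slot {slot_address:#x} holds {bound_pointer:#x}")
```

The decoded value is the address that dis -s is given in the session above, where LLDB labels it as dyld_stub_binder in libdyld.dylib.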

In the lazy symbol table, each entry belongs to a function that is reached through a stub, such as the NSLog symbol. When such a function is called, its stub executes and jumps to the address stored in the symbol table. On the first call, that Data value leads to the dyld_stub_binder function, which performs the symbol binding.

After symbol binding

On the second call to NSLog, we look at its Data value again.

Adding the offset of the NSLog entry to the ASLR load address, we can directly locate the Data value of the NSLog symbol table entry:

(lldb) p/x 0x0000000102238000 + 0xc000
(long) $0 = 0x0000000102244000
(lldb) x 0x0000000102244000
0x102244000: 10 e7 78 ba 01 00 00 00 00 38 78 ba 01 00 00 00  ..x......8x.....
0x102244010: 58 4c 46 be 01 00 00 00 d8 e5 23 02 01 00 00 00  XLF.......#.....
(lldb) dis -s 0x01ba78e710
Foundation`NSLog:
    0x1ba78e710 <+0>:  sub    sp, sp, #0x20             ; =0x20
    0x1ba78e714 <+4>:  stp    x29, x30, [sp, #0x10]
    0x1ba78e718 <+8>:  add    x29, sp, #0x10            ; =0x10
    0x1ba78e71c <+12>: adrp   x8, 267230
    0x1ba78e720 <+16>: ldr    x8, [x8, #0xe20]
    0x1ba78e724 <+20>: ldr    x8, [x8]
    0x1ba78e728 <+24>: str    x8, [sp, #0x8]
    0x1ba78e72c <+28>: add    x8, x29, #0x10            ; =0x10

Here we can see that the value of Data in the symbol table is the address value of NSLog in Foundation.

Conclusion

Lazy-loaded symbol tables and non-lazy-loaded symbol tables are stored in the Data segment and are readable and writable.

  • Non-lazy symbol table entries are bound as soon as the program starts, for example the dyld_stub_binder function.
  • Lazy symbol table entries are bound to the function address the first time the function is called. The short piece of code generated for such a symbol at compile time is what we call the function's stub. The stub jumps to the address stored in the symbol table. On the first call, that address leads to the dyld_stub_binder function, which binds the symbol. On the second call, the stub jumps to the bound function pointer.
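The lazy binding behaviour summarised above can be modelled with a small toy simulation. This is only a conceptual sketch in Python, not how dyld is implemented: the LazyPointerTable class, its method names, and the use of a list's append method as a stand-in for the real NSLog are all assumptions made for illustration.

```python
class LazyPointerTable:
    """Toy model of lazy symbol pointers and the stubs that jump through them."""

    def __init__(self, libraries):
        self.libraries = libraries  # name -> callable, stands in for other images
        self.slots = {}             # name -> callable currently stored in the slot
        self.bind_count = 0         # how many times the binder has run

    def add_lazy_symbol(self, name):
        # Before the first call, the slot routes the call to the binder.
        def helper(*args):
            return self._bind_and_call(name, *args)
        self.slots[name] = helper

    def _bind_and_call(self, name, *args):
        # Stand-in for dyld_stub_binder: resolve, patch the slot, then call.
        self.bind_count += 1
        target = self.libraries[name]
        self.slots[name] = target
        return target(*args)

    def call_stub(self, name, *args):
        # The stub itself: load the slot and jump to whatever it holds.
        return self.slots[name](*args)


log = []
table = LazyPointerTable({"NSLog": log.append})
table.add_lazy_symbol("NSLog")

table.call_stub("NSLog", "First external function call!")   # goes through the binder
table.call_stub("NSLog", "Second external function call!")  # jumps straight to the target
print(table.bind_count, log)
```

The stub code never changes between the two calls. Only the pointer it loads is rewritten, which is why the slot has to live in a writable segment.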