Symbols
In iOS development, every function name, variable name, and method name produces an entry in a symbol table after compilation. Symbols come in two kinds, internal and external:
- 1. Internal symbols: functions, methods, and variables defined inside this Mach-O.
- 2. External symbols, also known as the indirect symbol table: when we call a function defined outside this Mach-O, such as NSLog, the compiler does not know its address, so the symbol is placed in the indirect symbol table.
Internal symbols are further divided into global symbols and local symbols:
- 1. Global symbols can be called from anywhere, such as the functions an SDK exposes to its users.
- 2. Local symbols are visible only within the file that defines them.
In the Mach-O file, Symbols stores every symbol in our project, while Indirect Symbols stores the indirect (external) symbols.
Symbol binding
Let's create a new project that contains just two NSLog calls:
- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"First external function call!");
    NSLog(@"Second external function call!");
}
The stub
Static analysis of symbol table data in Mach-O
At the first NSLog call, we look at the assembly code.
First, we use image list to find the ASLR-shifted load address of the image, which is 0x00000001001f8000:
(lldb) image list
[ 0] 88C36EBA-D614-344F-AEA4-3D9F52E6C102 0x00000001001f8000 /Users/bel/Library/Developer/Xcode/DerivedData/symbolDemo-fgjddtxgssnvjpadkexjmgqpbbun/Build/Products/Debug-iphoneos/symbolDemo.app/symbolDemo
Subtracting this load address from the address that the NSLog call jumps to gives the offset inside the Mach-O: 0x1001FE50C - 0x00000001001F8000 = 0x650C. Looking up that offset in the Mach-O file, under Symbol Stubs, the Data value of the NSLog symbol is 1F2003D590D7025800021FD6.
That is the static analysis of the Mach-O file. Next we use LLDB to verify it dynamically.
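The offset arithmetic in this step is easy to get wrong by hand, so here is a minimal Python sketch of it. The two constants are the values from the LLDB session above; the helper name image_offset is made up for illustration and is not part of LLDB or any Apple tool.

```python
# Sketch: convert a runtime address into an offset relative to the image's load address.
# image_offset is a hypothetical helper name used only for this illustration.

def image_offset(runtime_address: int, load_address: int) -> int:
    """Return how far runtime_address lies past the image's load address."""
    if runtime_address < load_address:
        raise ValueError("address lies below the image load address")
    return runtime_address - load_address

LOAD_ADDRESS = 0x00000001001F8000   # load address reported by `image list`
NSLOG_STUB_ADDRESS = 0x1001FE50C    # address the NSLog call jumps to

offset = image_offset(NSLOG_STUB_ADDRESS, LOAD_ADDRESS)
print(hex(offset))  # 0x650c
```

Because the load address changes on every launch, the same subtraction has to be redone with the fresh image list value each time the app is run.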
Dynamic inspection of symbol table data with LLDB
At this point, we hold Control and click Step Into to enter the function, then view the memory at that address:
(lldb) x 0x104c6650c
0x104c6650c: 1f 20 03 d5 90 d7 02 58 00 02 1f d6 1f 20 03 d5 . .....X..... ..
0x104c6651c: 70 d7 02 58 00 02 1f d6 1f 20 03 d5 50 d7 02 58 p..X..... ..P..X
We can see that the first 12 bytes in memory, 1F2003D590D7025800021FD6, are exactly the same as the Data value we read from the Mach-O file.
This 1F2003D590D7025800021FD6 is a short piece of code that serves as the Data value of the NSLog symbol; we call it a stub. So bl -> address is really bl -> stub, and the stub jumps to the address stored as the Data value in the symbol table.
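To see why these 12 bytes count as code, the following Python sketch decodes them as three little-endian ARM64 instruction words. It is a deliberately tiny decoder that recognises only the three instruction forms found in this stub (nop, a 64-bit literal load, and br), and it assumes the stub sits at the image-relative offset 0x650C computed earlier. The function name decode is hypothetical.

```python
import struct

STUB_HEX = "1F2003D590D7025800021FD6"  # Data value of the NSLog stub
STUB_OFFSET = 0x650C                   # image-relative offset of the stub

def decode(word: int, pc: int) -> str:
    """Decode only the three instruction forms that appear in this stub."""
    if word == 0xD503201F:
        return "nop"
    if word & 0xFF000000 == 0x58000000:      # LDR Xt, <label> (64-bit literal load)
        rt = word & 0x1F
        imm19 = (word >> 5) & 0x7FFFF
        if imm19 & 0x40000:                  # sign-extend the 19-bit field
            imm19 -= 0x80000
        return f"ldr x{rt}, {hex(pc + imm19 * 4)}"
    if word & 0xFFFFFC1F == 0xD61F0000:      # BR Xn
        return f"br x{(word >> 5) & 0x1F}"
    return f".word {hex(word)}"

# Each ARM64 instruction is one 4-byte little-endian word.
words = struct.unpack("<3I", bytes.fromhex(STUB_HEX))
listing = [decode(w, STUB_OFFSET + 4 * i) for i, w in enumerate(words)]
for line in listing:
    print(line)
```

Under these assumptions the stub loads x16 from image-relative offset 0xc000 and then branches to it. That is the same 0xc000 offset the article later adds to the load address to find the NSLog entry after binding.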
The symbol binding function
The stub code above then runs. When it reaches br x16, we read the value of x16:
(lldb) register read x16
x16 = 0x0000000104c665c0
Computing the offset in the Mach-O gives 0x0000000104C665C0 - ASLR = 0x65C0 (the load address for this run is 0x0000000104C60000). At offset 0x65C0, we can see code that jumps to 0x65A8.
So what does the code at 0x65A8 do?
Back in LLDB, we keep clicking Step Into until we reach the assembly code at 0x65a8:
0x104f3a5a8: adr x17, #0x6f10 ; _dyld_private
0x104f3a5ac: nop
0x104f3a5b0: stp x16, x17, [sp, #-0x10]!
0x104f3a5b4: nop
0x104f3a5b8: ldr x16, #0x1a48 ; (void *)0x00000001ba1b435c: dyld_stub_binder
0x104f3a5bc: br x16
0x104f3a5c0: ldr w16, 0x104f3a5c8
0x104f3a5c4: b 0x104f3a5a8
LLDB debugging shows that this assembly code calls the dyld_stub_binder function, which is what binds symbols.
⚠️ ⚠️ ⚠️ Summary:
From the analysis above, we know that NSLog is an external function and that the Symbol Stubs table in the Text segment stores the stubs of external functions. The first time NSLog is called, bl jumps into the function's stub, which ends up calling dyld_stub_binder to bind the NSLog symbol.
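The control flow in this summary can be modelled in a few lines of Python. This is only a toy model of the idea: a dictionary stands in for the lazy symbol pointer table, a closure stands in for the dyld_stub_binder path, and every name in it (LIBRARY, lazy_pointers, make_binder, stub) is invented for the illustration and does not correspond to dyld's actual implementation.

```python
# Toy model of lazy symbol binding; all names here are hypothetical.

calls = []  # records which path each call takes

def real_nslog(message):
    calls.append("NSLog")
    return message

LIBRARY = {"NSLog": real_nslog}   # stands in for Foundation's exported symbols
lazy_pointers = {}                # stands in for the lazy symbol pointer table

def make_binder(name):
    """Return the initial table entry: bind on first use, then forward the call."""
    def binder(*args):
        calls.append("dyld_stub_binder")
        lazy_pointers[name] = LIBRARY[name]   # patch the slot with the real function
        return lazy_pointers[name](*args)
    return binder

def stub(name, *args):
    """The stub always jumps through whatever the table slot currently holds."""
    return lazy_pointers[name](*args)

lazy_pointers["NSLog"] = make_binder("NSLog")

stub("NSLog", "First external function call!")
stub("NSLog", "Second external function call!")
print(calls)  # ['dyld_stub_binder', 'NSLog', 'NSLog']
```

The stub itself never changes between the two calls. Only the table slot it reads from is rewritten, which is why the binder appears once in the recorded path and never again.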
Non-lazy-loaded symbol table and lazy-loaded symbol table
Looking at the Mach-O, we can see that the dyld_stub_binder symbol is stored in the non-lazy symbol pointers of the Data segment.
At compile time its Data value is 0, and its offset is 0x8000.
Let's look at its value at run time:
(lldb) image list
[ 0] 88C36EBA-D614-344F-AEA4-3D9F52E6C102 0x00000001009d8000 /Users/bel/Library/Developer/Xcode/DerivedData/symbolDemo-fgjddtxgssnvjpadkexjmgqpbbun/Build/Products/Debug-iphoneos/symbolDemo.app/symbolDemo
(lldb) p/x 0x00000001009d8000 + 0x8000
(long) $0 = 0x00000001009e0000
(lldb) x 0x00000001009e0000
0x1009e0000: 5c 43 1b ba 01 00 00 00 d0 cf b8 04 02 00 00 00 \C..............
0x1009e0010: d0 07 00 00 00 00 00 00 e6 f3 9d 00 01 00 00 00 ................
(lldb) dis -s 0x01ba1b435c
libdyld.dylib`dyld_stub_binder:
0x1ba1b435c <+0>: stp x29, x30, [sp, #-0x10]!
0x1ba1b4360 <+4>: mov x29, sp
0x1ba1b4364 <+8>: sub sp, sp, #0xf0 ; =0xf0
0x1ba1b4368 <+12>: stp x0, x1, [x29, #-0x10]
0x1ba1b436c <+16>: stp x2, x3, [x29, #-0x20]
0x1ba1b4370 <+20>: stp x4, x5, [x29, #-0x30]
0x1ba1b4374 <+24>: stp x6, x7, [x29, #-0x40]
0x1ba1b4378 <+28>: stp x8, x9, [x29, #-0x50]
At run time, the Data value of the dyld_stub_binder entry is no longer 0. For non-lazy symbols, the real address is written into the symbol's Data value when the app starts, so the symbol is bound at launch.
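The memory dump above shows raw bytes, and the pointer only becomes readable once they are interpreted as a little-endian 64-bit value. The short Python sketch below does that for the first 8 bytes of the dump and recomputes the slot address from the load address and the 0x8000 offset. The variable names are illustrative only.

```python
import struct

LOAD_ADDRESS = 0x00000001009D8000   # load address reported by `image list` for this run
SLOT_OFFSET = 0x8000                # offset of the dyld_stub_binder entry

slot = LOAD_ADDRESS + SLOT_OFFSET
print(hex(slot))     # 0x1009e0000

# First 8 bytes printed by `x 0x00000001009e0000`.
raw = bytes.fromhex("5c 43 1b ba 01 00 00 00")
(pointer,) = struct.unpack("<Q", raw)   # "<Q" = little-endian unsigned 64-bit
print(hex(pointer))  # 0x1ba1b435c
```

The decoded value is the address passed to dis -s in the session, which LLDB resolves to dyld_stub_binder in libdyld.dylib.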
In the lazy symbol table, the entries are the values that the function stubs jump through. The NSLog symbol lives in the lazy symbol table, so calling NSLog executes whatever address is stored there. On the first call, the Data value in the table leads to the dyld_stub_binder function, which performs the symbol binding.
After symbol binding
On the second call to NSLog, we look at its Data value.
Using the ASLR load address and the offset of the NSLog entry, we can locate the Data value of the NSLog symbol directly:
(lldb) p/x 0x0000000102238000 + 0xc000
(long) $0 = 0x0000000102244000
(lldb) x 0x0000000102244000
0x102244000: 10 e7 78 ba 01 00 00 00 00 38 78 ba 01 00 00 00 ..x......8x.....
0x102244010: 58 4c 46 be 01 00 00 00 d8 e5 23 02 01 00 00 00 XLF.......#.....
(lldb) dis -s 0x01ba78e710
Foundation`NSLog:
0x1ba78e710 <+0>: sub sp, sp, #0x20 ; =0x20
0x1ba78e714 <+4>: stp x29, x30, [sp, #0x10]
0x1ba78e718 <+8>: add x29, sp, #0x10 ; =0x10
0x1ba78e71c <+12>: adrp x8, 267230
0x1ba78e720 <+16>: ldr x8, [x8, #0xe20]
0x1ba78e724 <+20>: ldr x8, [x8]
0x1ba78e728 <+24>: str x8, [sp, #0x8]
0x1ba78e72c <+28>: add x8, x29, #0x10 ; =0x10
Here we can see that the value of Data in the symbol table is the address value of NSLog in Foundation.
Conclusion
Lazy and non-lazy symbol tables are both stored in the Data segment and are readable and writable.
- Non-lazy symbols, such as the dyld_stub_binder function, are bound as soon as the program starts.
- Lazy symbols are bound to the function address the first time the function is called. At compile time, each one gets a short piece of code that we call the function stub. The code in the stub jumps to the address stored in the symbol table. On the first call, that address leads to the dyld_stub_binder function, which binds the symbol; from the second call on, it is the bound function pointer.