Symbols

A symbol denotes an address: either the starting address of a piece of code (a function) or the starting address of a variable. In short, think of it as a name for an address. The linking process discussed in the previous section essentially brings different object files together, and what actually gets stitched together is the references to addresses between those object files. During linking, functions and variables are collectively called symbols: the function name or variable name is the symbol name, and the address recorded for it is the symbol value.

Classification

According to their characteristics, symbols can be divided into internal symbols and external symbols.

  • Internal symbols: functions, methods, and variable names defined inside the current Mach-O
    • For example, AppDelegate. Their addresses are known at compile time. They are further divided into:
      • Global symbols: symbols with global scope, visible to the entire project
      • Local symbols: symbols defined in the current file and visible only within that file
  • External symbols: functions, methods, and variable names defined outside the current Mach-O
    • For example, NSLog. They are further divided into:
      • Import symbols: from the point of view of the current Mach-O file, NSLog is imported, so it is an import symbol
      • Export symbols: from the point of view of Foundation, NSLog is exported, so it is an export symbol
    • An export symbol must be a global symbol, which is why global symbols cannot be stripped
    • Their addresses are not known at compile time; the memory address can only be determined during dynamic linking
    • They are all recorded in the relocation symbol table, also called the indirect symbol table
  • Special symbols: symbols that the linker generates when it produces the executable
    • __executable_start: the start address of the program (not the entry address)
    • _end or end: the end address of the program

🤔 Why does an external symbol need to be dynamically linked to determine its memory address?

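Internal (local) symbols can be located by an address offset within the current Mach-O. Imported external symbols are different: until the Mach-O has been loaded and the program's virtual memory space has been allocated, they have no memory address at all, so their addresses can only be filled in at dynamic link time.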
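The difference between the symbol kinds can be made visible from inside a program. The following is a minimal sketch, not part of the original hello.c: the names exported_counter, file_local_counter, sum_counters and print_symbol_addresses are hypothetical, and it assumes the C library is linked dynamically, so that the address of printf is supplied by the dynamic linker rather than fixed in the image.

```c
#include <stdint.h>
#include <stdio.h>

int exported_counter = 1;          /* internal symbol, global (hypothetical name) */
static int file_local_counter = 2; /* internal symbol, local to this file */

/* Internal global function: its offset inside this image is fixed at link time. */
int sum_counters(void) {
    return exported_counter + file_local_counter;
}

/*
 * Prints the run-time address of one symbol of each kind and returns how many
 * of them are non-null. printf is defined outside this file; assuming a
 * dynamically linked C library, its address is only known once the program
 * has been loaded.
 */
int print_symbol_addresses(void) {
    const char *labels[3] = {
        "internal global (variable)",
        "internal local (static)",
        "external import (printf)",
    };
    void *addresses[3];
    addresses[0] = (void *)&exported_counter;
    addresses[1] = (void *)&file_local_counter;
    addresses[2] = (void *)(uintptr_t)&printf;

    int resolved = 0;
    for (int i = 0; i < 3; i++) {
        printf("%-28s %p\n", labels[i], addresses[i]);
        if (addresses[i] != NULL) {
            resolved++;
        }
    }
    return resolved;
}
```

Calling print_symbol_addresses() from main prints three addresses. The two internal symbols usually sit close together inside the program's own image, while the address printed for printf typically belongs to a different image or to a stub.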

Global symbol

To illustrate global symbols, add the following code to the hello.c file above

#include <stdio.h>
#define MAX_AGE 120

int globalVal = 99;
int globalUndefineVal;

int main() {
    printf("%s\n", "hello world~");
    printf("%d\n", MAX_AGE); // simply output
    return 0;
}

Compile it and inspect the resulting a.out. You can see that:

  • Both _globalVal and _globalUndefineVal are flagged g, which marks them as global symbols
  • Both _globalVal and _globalUndefineVal are in the __DATA segment, but one is in the __data section and the other is in __common

Local symbol

Local symbols, also known as static symbols, differ from global symbols in visibility. To compare the two, add the following code to the hello.c file above

int globalUndefineVal;
static int staticVal = 88;
static int staticUndefineVal; // uninitialized static variable

int main() {
    printf("%d %d\n", staticVal, staticUndefineVal); // simple output
    ...
}

Note: if a static variable is never referenced, it is kept only as a debug symbol and will not show up in the output.
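The note about unreferenced static variables can be sketched as follows. This is an illustration under assumptions rather than the original hello.c: staticNeverRead, staticKept and read_static_val are hypothetical names, and __attribute__((used)) is a GCC/Clang extension that asks the compiler to emit a variable even when nothing refers to it. Whether an unreferenced static is actually dropped depends on the compiler and its settings.

```c
/* Referenced by read_static_val() below, so the compiler has to emit it. */
static int staticVal = 88;

/* Never referenced: a compiler such as clang may leave it out of the object
 * file entirely (expect an "unused variable" warning when warnings are on). */
static int staticNeverRead = 77;

/* Never referenced either, but the GCC/Clang "used" attribute asks the
 * compiler to keep it, so it should still appear as a local symbol. */
static int staticKept __attribute__((used)) = 66;

/* Hypothetical accessor that provides the reference to staticVal. */
int read_static_val(void) {
    return staticVal;
}
```

Listing the symbols of the resulting object file is one way to check which of the three variables survived.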

Import/export symbol

Import and export are relative: a symbol exported by one side is the import symbol of the other. Check the export symbols of a.out

objdump --macho --exports-trie a.out

As you can see, the export symbols are the global symbols seen earlier: global symbols are export symbols by default. You can, of course, hide a global symbol, as in the following example

int globalVal2 __attribute__((visibility("hidden"))) = 99;
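The visibility attribute works on functions as well as variables. The sketch below puts an exported and a hidden symbol side by side; SYM_EXPORT, SYM_HIDDEN, internal_scale and public_total are hypothetical names introduced only for this illustration, and the attribute itself is a GCC/Clang extension.

```c
/* Hypothetical helper macros wrapping the GCC/Clang visibility attribute. */
#define SYM_EXPORT __attribute__((visibility("default")))
#define SYM_HIDDEN __attribute__((visibility("hidden")))

SYM_EXPORT int globalVal = 99;  /* global symbol, stays in the export list */
SYM_HIDDEN int globalVal2 = 99; /* still usable inside the image, but not exported */

/* Hidden function: callable from any file linked into the same image. */
SYM_HIDDEN int internal_scale(int x) {
    return x * 2;
}

/* Exported function that relies on the hidden pieces internally. */
SYM_EXPORT int public_total(void) {
    return globalVal + internal_scale(globalVal2);
}
```

Hidden symbols remain visible to other object files linked into the same image, so hiding them only affects what other images can import. Compiling with -fvisibility=hidden flips the default, so that only symbols explicitly marked "default" are exported.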

Usually, the external symbols that a file uses are placed in the indirect symbol table. Check the indirect symbol table of a.out

objdump --macho --indirect-symbols a.out

Symbol table

During linking, the various symbols have to be managed. Every object file has a corresponding symbol table that records all the symbols used in that object file. During static linking, sections of the same kind are merged.

Symbol table classification

  • Symbol Table: the table of all symbols
    • Stores all symbol information, including the symbols in the dynamic symbol table
  • Dynamic Symbol Table: the symbol table used for dynamic linking, also called the Indirect Symbol Table
    • Stores information about the symbols used from external dynamic libraries, for dynamic linking
  • String Table: stores the symbol names
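How the three tables relate can be sketched with a toy model. The code below does not parse a real Mach-O file: ToySymbol and the toy_* names are hypothetical, loosely modelled on the n_strx and n_value fields of the nlist_64 structure, and the layout of the string table (with an empty string at offset 0) is an assumption of this sketch. The addresses are the ones shown for _main and _globalVal in the exports listing of this article.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical symbol entry. */
typedef struct {
    uint32_t name_offset; /* offset of the name inside the string table */
    uint64_t value;       /* symbol value (address); 0 while undefined */
    int is_undefined;     /* 1 for an imported symbol with no address yet */
} ToySymbol;

/* String table: NUL-separated names; offset 0 holds the empty string. */
static const char toy_strings[] = "\0_main\0_globalVal\0_printf";

/* Symbol table: every symbol, including the imported one. */
static const ToySymbol toy_symbols[] = {
    {1, 0x100003f30ULL, 0}, /* _main */
    {7, 0x100008010ULL, 0}, /* _globalVal */
    {18, 0, 1},             /* _printf, imported from another image */
};

/* Indirect symbol table: indexes into toy_symbols for symbols bound at load time. */
static const uint32_t toy_indirect[] = {2};

/* Resolves a symbol's name through the string table. */
const char *toy_symbol_name(const ToySymbol *sym) {
    assert(sym->name_offset < sizeof toy_strings);
    return toy_strings + sym->name_offset;
}

/* Linear search of the symbol table by name; returns NULL when not found. */
const ToySymbol *toy_find_symbol(const char *name) {
    size_t count = sizeof toy_symbols / sizeof toy_symbols[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(toy_symbol_name(&toy_symbols[i]), name) == 0) {
            return &toy_symbols[i];
        }
    }
    return NULL;
}

/* Follows an indirect-table slot to the symbol table, then to the name. */
const char *toy_indirect_name(size_t slot) {
    assert(slot < sizeof toy_indirect / sizeof toy_indirect[0]);
    return toy_symbol_name(&toy_symbols[toy_indirect[slot]]);
}
```

The point of the split is that names are stored once in the string table, symbol entries refer to them by offset, and the indirect table refers to symbol entries by index, which matches the index column printed by the indirect-symbols listing.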

Common commands

-----------------------------------------
# -a displays all symbol table entries; -p prints them unsorted

$ nm -pa hello.o
0000000000000000 T _main
                 U _printf
                 
$ nm -pa a
0000000100008008 d __dyld_private
0000000100000000 T __mh_execute_header
0000000100003f30 T _main
                 U _printf
                 U dyld_stub_binder
-----------------------------------------
# View the Mach-O header
$ otool -h a
a:
Mach header
      magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777223          3  0x00           2    16       1368 0x00200085
 
$ objdump --macho --private-header a
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
MH_MAGIC_64  X86_64        ALL  0x00     EXECUTE    16       1368   NOUNDEFS DYLDLINK TWOLEVEL PIE
-----------------------------------------
# View the contents of the __TEXT segment
$ objdump --macho -d a
a:
(__TEXT,__text) section
_main:
100003f30:	55	pushq	%rbp
100003f31:	48 89 e5	movq	%rsp, %rbp
100003f38:	c7 45 fc 00 00 00 00	movl	$0, -4(%rbp)
...
100003f75:	5d	popq	%rbp
100003f76:	c3	retq
-----------------------------------------
# View the symbols
$ objdump --macho --syms hello.o
hello.o:
SYMBOL TABLE:
0000000000000048 g     O __DATA,__data _globalVal
0000000000000000 g     F __TEXT,__text _main
0000000000000004         *COM*	0000000000000004 _globalUndefineVal
0000000000000000         *UND* _printf
-----------------------------------------
# View exported symbols
$ objdump --macho --exports-trie a
a:
Exports trie:
0x100000000  __mh_execute_header
0x100003F30  _main
0x100008010  _globalVal
0x100008014  _globalUndefineVal
-----------------------------------------
# View indirect symbols
$ objdump --macho --indirect-symbols a
a:
Indirect symbols for (__TEXT,__stubs) 1 entries
address            index name
0x0000000100003f78     5 _printf
Indirect symbols for (__DATA_CONST,__got) 1 entries
address            index name
0x0000000100004000     6 dyld_stub_binder
Indirect symbols for (__DATA,__la_symbol_ptr) 1 entries
address            index name
0x0000000100008000     5 _printf