1. Space and address allocation
- Space and address allocation: the linker scans all input object files, merges sections of the same kind, and collects all symbol definitions and references.
- Symbol resolution and relocation
The sample code
/* a.c */
extern int shared;
extern void swap(int* a, int* b);
int main()
{
int a = 100;
swap(&a, &shared);
}
/* b.c */
int shared = 1;
void swap(int* a, int* b)
{
*a ^= *b ^= *a ^= *b;
}
Section headers before and after linking
a.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000039  00000000  00000000  00000034  2**0
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data            00000000  00000000  00000000  0000006d  2**0
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  0000006d  2**0
                     ALLOC
  3 .comment         00000036  00000000  00000000  0000006d  2**0
                     CONTENTS, READONLY
  4 .note.GNU-stack  00000000  00000000  00000000  000000a3  2**0
                     CONTENTS, READONLY
  5 .eh_frame        00000044  00000000  00000000  000000a4  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

b.o:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000039  00000000  00000000  00000034  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data            00000004  00000000  00000000  00000070  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  2 .bss             00000000  00000000  00000000  00000074  2**0
                     ALLOC
  3 .comment         00000036  00000000  00000000  00000074  2**0
                     CONTENTS, READONLY
  4 .note.GNU-stack  00000000  00000000  00000000  000000aa  2**0
                     CONTENTS, READONLY
  5 .eh_frame        00000038  00000000  00000000  000000ac  2**2
                     CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

ab:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00000072  08048094  08048094  00000094  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .eh_frame        00000064  08048108  08048108  00000108  2**2
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data            00000004  0804916c  0804916c  0000016c  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  3 .comment         00000035  00000000  00000000  00000170  2**0
                     CONTENTS, READONLY
After linking, the VMA column holds each section's virtual address in the process address space. The executable also allocates space for the .bss section: static_uninit_var is placed at 0x08049170, immediately after the .data section.
SYMBOL TABLE:
08048094 l    d  .text      00000000 .text
08048108 l    d  .eh_frame  00000000 .eh_frame
0804916c l    d  .data      00000000 .data
08049170 l    d  .bss       00000000 .bss
00000000 l    d  .comment   00000000 .comment
00000000 l    df *ABS*      00000000 a.c
08049170 l     O .bss       00000004 static_uninit_var.1485
00000000 l    df *ABS*      00000000 b.c
080480cd g     F .text      00000039 swap
0804916c g     O .data      00000004 shared
08049170 g       .bss       00000000 __bss_start
08049174 g     O .bss       00000004 global_uninit_var
08048094 g     F .text      00000039 main
08049170 g       .data      00000000 _edata
08049178 g       .bss       00000000 _end
Symbol resolution and relocation
Symbol table '.symtab' contains 16 entries:
Num: Value Size Type Bind Vis Ndx Name
9: 080480cd 57 FUNC GLOBAL DEFAULT 1 swap
10: 0804916c 4 OBJECT GLOBAL DEFAULT 3 shared
This is the symbol table of the executable file. Once the sections have been merged, each output section has a virtual address, and the linker computes the virtual address of every symbol by adding the symbol's offset within its section in the object file to the address at which that section was placed.
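The address calculation described here can be sketched in C. This is only an illustration, not the linker's actual code: symbol_vaddr and the addr_of_* wrappers are hypothetical names, and the sketch assumes that a.o's .text is placed first with b.o's .text directly after it (no padding), and that main, swap and shared each sit at offset 0 of their input section. The dumps do not state those offsets, but they are consistent with the addresses shown.

```c
#include <stdint.h>

/* Hypothetical helper: final virtual address of a symbol =
   base of the merged output section
   + offset of the input section inside the output section
   + offset of the symbol inside its input section. */
uint32_t symbol_vaddr(uint32_t out_section_base,
                      uint32_t in_section_offset,
                      uint32_t symbol_offset)
{
    return out_section_base + in_section_offset + symbol_offset;
}

/* Values read from the section dumps of a.o, b.o and ab. */
#define TEXT_BASE   0x08048094u  /* VMA of .text in ab */
#define DATA_BASE   0x0804916cu  /* VMA of .data in ab */
#define A_TEXT_SIZE 0x39u        /* size of a.o's .text; assumed start of b.o's .text */

/* Assumption: each symbol is at offset 0 of its input section. */
uint32_t addr_of_main(void)   { return symbol_vaddr(TEXT_BASE, 0, 0); }
uint32_t addr_of_swap(void)   { return symbol_vaddr(TEXT_BASE, A_TEXT_SIZE, 0); }
uint32_t addr_of_shared(void) { return symbol_vaddr(DATA_BASE, 0, 0); }
```

Under these assumptions addr_of_swap() gives 0x080480cd and addr_of_shared() gives 0x0804916c, matching the values in the symbol table.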
Relocation table of the object file a.o
a.o: file format elf32-i386
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000001c R_386_32 shared
00000025 R_386_PC32 swap
The relocation table records, for each reference that must be fixed up, its offset within the section of the object file. For example, the reference to shared is at offset 0x0000001c in a.o's .text section. Since the virtual address of the .text section is 0x08048094, the location to patch is 0x08048094 + 0x1c = 0x080480b0.
Disassembly of section .text:

08048094 <main>:
 ...
 80480af:  68 6c 91 04 08   push   $0x804916c
 ...
The 4 bytes starting at 0x080480b0, which hold the reference to shared, have been patched with its final address 0x0804916c.
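As a rough sketch of how the two relocation types in a.o's table are evaluated, the i386 ABI defines R_386_32 as S + A and R_386_PC32 as S + A - P, where S is the symbol's final address, A is the addend stored in the field being patched, and P is the address of that field. The function names below are hypothetical. The addends are assumptions as well, since the dumps do not show them: 0 for shared and -4 for swap, which is what compilers typically emit for a push of an address and for a call.

```c
#include <stdint.h>

/* P: address of the field to patch = section base + relocation offset. */
uint32_t patch_address(uint32_t section_base, uint32_t reloc_offset)
{
    return section_base + reloc_offset;
}

/* R_386_32: absolute addressing, value written = S + A. */
uint32_t reloc_386_32(uint32_t S, int32_t A)
{
    return S + (uint32_t)A;
}

/* R_386_PC32: PC-relative addressing, value written = S + A - P.
   Unsigned arithmetic wraps modulo 2^32, which is the intended behavior. */
uint32_t reloc_386_pc32(uint32_t S, int32_t A, uint32_t P)
{
    return S + (uint32_t)A - P;
}
```

With the numbers from this example, patch_address(0x08048094, 0x1c) is 0x080480b0 and reloc_386_32(0x0804916c, 0) is 0x0804916c, the value visible in the disassembly. For swap, the field is at 0x08048094 + 0x25 = 0x080480b9, and reloc_386_pc32(0x080480cd, -4, 0x080480b9) is 0x10, the distance from the instruction following the call (0x080480bd) to swap.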
If a global symbol is referenced more than once in a compilation unit, each reference gets its own entry in the relocation table.
a.o: file format elf64-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000018 R_X86_64_PC32 shared-0x0000000000000004 # shared is referenced more than once, so it has several relocation entries
0000000000000049 R_X86_64_PC32 shared-0x0000000000000004
0000000000000051 R_X86_64_PLT32 swap-0x0000000000000004
000000000000006a R_X86_64_PLT32 __stack_chk_fail-0x0000000000000004
The COMMON block
An uninitialized global variable is a weak symbol in the object file. The compiler does not allocate space for it in the .bss section; it marks it as COMMON instead. Because it is a global variable, it may also be defined in other compilation units, possibly with a different type whose size is larger, so its size can only be determined at final link time. Only then is space allocated for it in the .bss section of the output file.
- If a weak symbol is larger than the strong symbol of the same name, the linker issues a warning.
- An attribute can be added (for example GCC's __attribute__((nocommon))) to force an uninitialized global variable not to be marked COMMON, which makes it equivalent to a strong symbol.
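The two points above can be illustrated with a small C sketch. The two variable declarations use real GCC/Clang syntax. The function resolve_symbol_size is a hypothetical, simplified model of the size decision (not the linker's real code), and it assumes that a strong definition fixes the size while several COMMON definitions resolve to the largest one. A strong_size of 0 stands for "no strong definition".

```c
#include <stddef.h>

/* Tentative definition: when the compiler runs in -fcommon mode,
   this is emitted as a COMMON symbol rather than placed in .bss. */
int global_uninit_var;

/* GCC/Clang attribute: allocate the variable in .bss directly,
   so it behaves like a strong definition. */
int pinned_uninit_var __attribute__((nocommon));

/* Hypothetical model of how the final size of a symbol is chosen.
   strong_size  : size of the strong definition, 0 if there is none
   common_sizes : sizes of the COMMON (weak) definitions
   warn         : set to 1 if a COMMON definition is larger than the strong one */
size_t resolve_symbol_size(size_t strong_size,
                           const size_t *common_sizes, size_t count,
                           int *warn)
{
    size_t largest = 0;
    for (size_t i = 0; i < count; i++) {
        if (common_sizes[i] > largest)
            largest = common_sizes[i];
    }

    *warn = 0;
    if (strong_size != 0) {
        /* The strong definition wins; a larger weak one only triggers a warning. */
        if (largest > strong_size)
            *warn = 1;
        return strong_size;
    }
    /* Only COMMON definitions: the largest size is used. */
    return largest;
}
```

For instance, two COMMON definitions of 4 and 8 bytes would resolve to 8 bytes in this model, while adding a 4-byte strong definition would give 4 bytes and set the warning flag.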