In my last article, I analyzed the principle of fishHook. In this article, I will analyze the source code of fishHook.
Structure rebinding
struct rebinding {
  const char *name;    // Name of the function to hook, as a C string
  void *replacement;   // Address of the new function
  void **replaced;     // Address of a pointer that receives the original function address
};
When using fishHook, we need to create an array of rebinding structures. Let's look at the fields of the rebinding structure:
- name: a C string. fishHook uses name to find the symbol of the corresponding system function in the Mach-O file.
- replacement: a pointer to the new function that takes the place of the original one.
- replaced: a pointer to a pointer. The real address of a system function is not known when the program is loaded; it is known only after dynamic linking has completed. fishHook therefore needs somewhere to store the real address once the hook is installed, so we define a function pointer variable and pass its address here. Without it, the real system function could no longer be called after it has been hooked.
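To make the role of the double pointer concrete, here is a minimal sketch that runs without fishHook. The names real_add, add_slot, orig_add, my_add and toy_rebind are hypothetical stand-ins invented for illustration: add_slot plays the part of a symbol pointer that dyld would normally fill in, and toy_rebind only mimics the final pointer swap, not fishHook's symbol lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct above, repeated so the sketch compiles alone. */
struct rebinding {
  const char *name;
  void *replacement;
  void **replaced;
};

/* Hypothetical stand-in for a system function. */
static int real_add(int a, int b) { return a + b; }

/* Hypothetical stand-in for a symbol pointer slot that dyld would fill in. */
static int (*add_slot)(int, int) = real_add;

/* Receives the original address once the toy hook is installed. */
static int (*orig_add)(int, int);

/* Replacement: changes the result, then forwards to the original. */
static int my_add(int a, int b) { return orig_add(a, b) + 100; }

/* Toy version of the final pointer swap; not fishHook's real lookup.
 * Like fishHook, it treats function pointers as void *, which works on
 * POSIX platforms but is not guaranteed by ISO C. */
static void toy_rebind(struct rebinding *r, void **slot) {
  if (r->replaced != NULL && *slot != r->replacement) {
    *(r->replaced) = *slot;  /* save the original address */
  }
  *slot = r->replacement;    /* redirect future calls */
}

static int demo_rebinding(void) {
  struct rebinding r = {"add", (void *)my_add, (void **)&orig_add};
  toy_rebind(&r, (void **)&add_slot);
  assert(orig_add == real_add);  /* the original is still reachable */
  return add_slot(1, 2);         /* now goes through my_add */
}
```

Calling demo_rebinding() routes the call through my_add, which can still reach real_add through orig_add. That is exactly what the replaced field makes possible.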
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel);
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
  // prepend_rebindings adds the whole rebindings array to the head of the
  // _rebindings_head list. fishhook uses a linked list to store the arguments
  // of every call to rebind_symbols; each call inserts a node at the head of
  // the list, which is _rebindings_head.
  int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
  if (retval < 0) {
    return retval;
  }
  // If _rebindings_head->next is NULL, this is the first call.
  if (!_rebindings_head->next) {
    // Register the callback. It is invoked immediately for every image dyld
    // has already loaded, and again for each image loaded later.
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
  } else {
    // Not the first call: hook the images that are already loaded.
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
      _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
    }
  }
  return retval;
}
The first thing rebind_symbols does is call prepend_rebindings, which saves the rebindings argument as a node in a linked list.
prepend_rebindings
struct rebindings_entry {
  struct rebinding *rebindings;
  size_t rebindings_nel;
  struct rebindings_entry *next;
};

static struct rebindings_entry *_rebindings_head;

static int prepend_rebindings(struct rebindings_entry **rebindings_head,
                              struct rebinding rebindings[],
                              size_t nel) {
  struct rebindings_entry *new_entry = (struct rebindings_entry *) malloc(sizeof(struct rebindings_entry));
  if (!new_entry) {
    return -1;
  }
  new_entry->rebindings = (struct rebinding *) malloc(sizeof(struct rebinding) * nel);
  if (!new_entry->rebindings) {
    free(new_entry);
    return -1;
  }
  memcpy(new_entry->rebindings, rebindings, sizeof(struct rebinding) * nel);
  new_entry->rebindings_nel = nel;
  new_entry->next = *rebindings_head;
  *rebindings_head = new_entry;
  return 0;
}
On the first call to rebind_symbols, _dyld_register_func_for_add_image is called to register the image-load callback.
On later calls, rebind_symbols iterates over the images that are already loaded and hooks each one with _rebind_symbols_for_image.
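The first-call test relies on the list having exactly one node right after the prepend. The sketch below reproduces that behaviour on any platform. The function prepend mirrors prepend_rebindings from the listing above, while is_first_call and demo_prepend are hypothetical helpers added only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct rebinding { const char *name; void *replacement; void **replaced; };

struct rebindings_entry {
  struct rebinding *rebindings;
  size_t rebindings_nel;
  struct rebindings_entry *next;
};

/* Copies the array into a new node and pushes the node at the head. */
static int prepend(struct rebindings_entry **head, struct rebinding r[], size_t nel) {
  struct rebindings_entry *e = malloc(sizeof *e);
  if (!e) return -1;
  e->rebindings = malloc(sizeof(struct rebinding) * nel);
  if (!e->rebindings) { free(e); return -1; }
  memcpy(e->rebindings, r, sizeof(struct rebinding) * nel);
  e->rebindings_nel = nel;
  e->next = *head;
  *head = e;
  return 0;
}

/* Hypothetical helper: the same test rebind_symbols performs after prepending. */
static int is_first_call(const struct rebindings_entry *head) {
  return head != NULL && head->next == NULL;
}

/* Returns 1 if the list behaves as described, 0 otherwise, -1 on allocation failure. */
static int demo_prepend(void) {
  struct rebindings_entry *head = NULL;
  struct rebinding a = {"open", NULL, NULL};
  struct rebinding b = {"close", NULL, NULL};

  if (prepend(&head, &a, 1) < 0) return -1;
  int first = is_first_call(head);   /* one node: first call */

  int ok = 0;
  if (prepend(&head, &b, 1) == 0) {
    assert(head != NULL && head->next != NULL);
    int second = is_first_call(head);  /* two nodes: not the first call */
    /* The newest rebinding sits at the head, the oldest at the tail. */
    int newest_is_close = strcmp(head->rebindings[0].name, "close") == 0;
    int oldest_is_open = strcmp(head->next->rebindings[0].name, "open") == 0;
    ok = first && !second && newest_is_close && oldest_is_open;
  } else {
    ok = -1;
  }

  while (head) {                      /* release the nodes */
    struct rebindings_entry *next = head->next;
    free(head->rebindings);
    free(head);
    head = next;
  }
  return ok;
}
```

Because new nodes go to the head, a later rebinding for the same name is found first when the list is walked.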
_rebind_symbols_for_image
static void _rebind_symbols_for_image(const struct mach_header *header,
intptr_t slide) {
rebind_symbols_for_image(_rebindings_head, header, slide);
}
This wrapper simply calls rebind_symbols_for_image, passing the list head _rebindings_head along with the header and slide.
rebind_symbols_for_image, the real hook function
// The callback ends up in this function. It takes three parameters:
// the rebindings list, the image header, and the ASLR slide.
static void rebind_symbols_for_image(struct rebindings_entry *rebindings,
                                     const struct mach_header *header,
                                     intptr_t slide) {
  /*
   * dladdr() determines whether the given address lies in one of the loaded
   * modules (executable or shared library) that make up the address space of
   * the process. An address is in a module's range if it lies between the
   * module's base address and its highest mapped virtual address, inclusive.
   * If such a module exists, its symbols are searched for the nearest one:
   * the symbol whose value is equal to, or closest below, the address.
   *
   * dladdr() returns 0 if the address is not in any loaded module, and the
   * Dl_info structure is left unmodified. Otherwise it returns non-zero and
   * fills in Dl_info. If no symbol describes the address, dli_sname and
   * dli_saddr are set to NULL.
   */
  // Look up the header with dladdr; give up if it is not a loaded image.
  Dl_info info;
  if (dladdr(header, &info) == 0) {
    return;
  }

  // A few variables used while walking the Mach-O load commands.
  segment_command_t *cur_seg_cmd;
  segment_command_t *linkedit_segment = NULL;
  struct symtab_command* symtab_cmd = NULL;
  struct dysymtab_command* dysymtab_cmd = NULL;

  // Skip the header; the load commands start right after it.
  uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
  for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
      if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
        linkedit_segment = cur_seg_cmd;
      }
    } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
      symtab_cmd = (struct symtab_command*)cur_seg_cmd;
    } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
      dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
    }
  }

  // Return if any of them is missing.
  if (!symtab_cmd || !dysymtab_cmd || !linkedit_segment ||
      !dysymtab_cmd->nindirectsyms) {
    return;
  }

  // Find base symbol/string table addresses.
  // Link-edit base = __LINKEDIT.vmaddr - __LINKEDIT.fileoff + slide
  uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
  // Symbol table address = base + symbol table offset
  nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
  // String table address = base + string table offset
  char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);

  // Get the indirect symbol table (array of uint32_t indices into the symbol table).
  uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);

  cur = (uintptr_t)header + sizeof(mach_header_t);
  for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
      // Only the __DATA and __DATA_CONST segments are of interest.
      if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
          strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
        continue;
      }
      for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
        section_t *sect = (section_t *)(cur + sizeof(segment_command_t)) + j;
        // Lazy symbol pointer section
        if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
          perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
        }
        // Non-lazy symbol pointer section
        if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
          perform_rebinding_with_section(rebindings, sect, slide, symtab, strtab, indirect_symtab);
        }
      }
    }
  }
}
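- First, dladdr is used to check whether the image header passed in belongs to a module loaded in the process's address space. If not, the function returns.
- Several variables are defined, and the load commands are traversed to find the __LINKEDIT segment, the symbol table command (LC_SYMTAB), and the dynamic symbol table command (LC_DYSYMTAB), which are stored in those variables.
- The link-edit base address is computed and each table's offset is added to it, giving the symbol table, string table, and indirect symbol table. The load commands are then traversed again to find the lazy and non-lazy symbol pointer sections, and perform_rebinding_with_section is called on each of them to rebind the symbols.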
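The base-address arithmetic is easier to follow with numbers. The sketch below uses made-up values for slide, vmaddr, fileoff and symoff. The struct toy_linkedit and the helper functions are hypothetical and stand in for the real load-command structures so that the example compiles on any platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the __LINKEDIT segment command. */
struct toy_linkedit {
  uint64_t vmaddr;   /* virtual address the segment was linked at */
  uint64_t fileoff;  /* offset of the segment inside the file */
};

/* Same formula as in rebind_symbols_for_image. */
static uint64_t linkedit_base(uint64_t slide, const struct toy_linkedit *seg) {
  return slide + seg->vmaddr - seg->fileoff;
}

/* Table offsets such as symoff are file offsets, so adding one to the base
 * converts it into an address in the running process. */
static uint64_t table_address(uint64_t base, uint32_t file_offset) {
  return base + file_offset;
}

static uint64_t demo_symtab_address(void) {
  struct toy_linkedit seg = {0x100008000ULL, 0x8000ULL};  /* made-up values */
  uint64_t slide = 0x4000ULL;                              /* made-up ASLR slide */
  uint32_t symoff = 0x8100;                                /* made-up symbol table offset */

  uint64_t base = linkedit_base(slide, &seg);
  assert(base == 0x100004000ULL);
  return table_address(base, symoff);
}
```

With these numbers the symbol table starts 0x100 bytes into __LINKEDIT in the file, and the segment is mapped at 0x100008000 + 0x4000 = 0x10000C000, so the table lands at 0x10000C100. Subtracting fileoff in the base is what lets file offsets be added directly.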
perform_rebinding_with_section
static void perform_rebinding_with_section(struct rebindings_entry *rebindings,
                                           section_t *section,
                                           intptr_t slide,
                                           nlist_t *symtab,
                                           char *strtab,
                                           uint32_t *indirect_symtab) {
  // The reserved1 field of the __nl_symbol_ptr and __la_symbol_ptr sections
  // is the index at which the section's entries start in the indirect symbol table.
  uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
  // slide + section->addr is the array of function pointers for the symbols,
  // i.e. the pointers stored in __nl_symbol_ptr and __la_symbol_ptr.
  void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
  // Iterate over every pointer in the section.
  for (uint i = 0; i < section->size / sizeof(void *); i++) {
    // Index of this entry in the symbol table
    uint32_t symtab_index = indirect_symbol_indices[i];
    if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
        symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
      continue;
    }
    // Use symtab_index as the subscript into the symbol table to get the string table offset.
    uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
    // Symbol name = string table start + offset
    char *symbol_name = strtab + strtab_offset;
    // True if the name has at least two characters (a leading '_' plus something).
    bool symbol_name_longer_than_1 = symbol_name[0] && symbol_name[1];
    // Walk the list of rebindings and hook on a match.
    struct rebindings_entry *cur = rebindings;
    while (cur) {
      for (uint j = 0; j < cur->rebindings_nel; j++) {
        // Compare from symbol_name[1] to skip the leading underscore.
        if (symbol_name_longer_than_1 &&
            strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
          // replaced must not be NULL, and the current binding must differ
          // from rebindings[j].replacement.
          if (cur->rebindings[j].replaced != NULL &&
              indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
            // Save the current function address through rebindings[j].replaced.
            *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
          }
          // Point the symbol at the replacement function.
          indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
          goto symbol_loop;
        }
      }
      cur = cur->next;
    }
  symbol_loop:;
  }
}
- Iterate over every symbol pointer in the section, using the table addresses passed in.
- Look up the name of each symbol, then iterate over the rebindings array. If a name matches, start the rebinding.
- First, check that the replaced pointer of the matching rebinding is not NULL and that the address currently bound to the symbol differs from the new function's address. If so, save the symbol's current address through replaced.
- Then overwrite the symbol's pointer with the rebinding's replacement. The hook is now complete.
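The chain of lookups (pointer slot, then indirect symbol table, then symbol table, then string table) can be tried out with toy arrays. This is a sketch under simplifying assumptions: toy_nlist, the table contents, and the functions real_impl, other_impl and hook_impl are all invented for illustration, and typed function pointers replace the void * slots of the real code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int (*fn_t)(void);

/* Toy symbol table entry: only the string table offset is kept. */
struct toy_nlist { uint32_t n_strx; };

static int real_impl(void) { return 1; }   /* pretend original "open" */
static int other_impl(void) { return 2; }  /* pretend original "close" */
static int hook_impl(void) { return 42; }  /* our replacement */

/* String table: names carry a leading underscore, as in Mach-O.
 * "_open" starts at offset 1, "_close" at offset 7. */
static const char strtab[] = "\0_open\0_close";
static const struct toy_nlist symtab[] = {{1}, {7}};

/* Indirect symbol table: pointer slot i -> index into symtab. */
static const uint32_t indirect[] = {1, 0};  /* slot 0 is _close, slot 1 is _open */

/* The pointer section itself (what __la_symbol_ptr would hold). */
static fn_t slots[] = {other_impl, real_impl};

static fn_t saved_original;

/* Returns the slot that was rebound, or -1 if the name was not found. */
static int toy_rebind(const char *name, fn_t replacement, fn_t *replaced) {
  for (size_t i = 0; i < sizeof(slots) / sizeof(slots[0]); i++) {
    const char *symbol_name = strtab + symtab[indirect[i]].n_strx;
    /* Skip the leading underscore before comparing, like fishHook does. */
    if (symbol_name[0] && symbol_name[1] && strcmp(&symbol_name[1], name) == 0) {
      if (replaced != NULL && slots[i] != replacement) {
        *replaced = slots[i];  /* keep the original reachable */
      }
      slots[i] = replacement;  /* the actual hook */
      return (int)i;
    }
  }
  return -1;
}

static int demo_lookup(void) {
  int slot = toy_rebind("open", hook_impl, &saved_original);
  assert(slot == 1);
  assert(saved_original == real_impl);
  return slots[slot]() + saved_original();  /* hooked call plus original call */
}
```

Only the slot whose name resolves to _open is rewritten; the _close slot keeps pointing at other_impl. The real function does the same thing, except that the tables come from the Mach-O image rather than from static arrays.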