Facebook’s fishhook dynamically rebinds external C functions at run time.

It is written in C.

This article analyzes a version of fishhook written in Swift.

It is based mainly on woshiccm/FishHook, a Swift implementation.

Its functionality differs considerably from Facebook’s fishhook,

but it is a good way to understand how Facebook’s fishhook works.

Principle:

C is a static language: the addresses of a program's own (internal) C functions are fixed at compile time.

To optimize Mach-O loading, iOS uses dynamic linking,

and fishhook relies on dynamic linking to rebind external C functions at run time.

Put simply, dynamic linking is done by the dynamic loader, dyld.

The address of an external C function is not fixed at compile time, so the address bound to its symbol can be changed.

External C functions live in the libraries of the shared cache.

At compile time, a call to an external function goes through a symbol whose address is not yet known.

During dynamic linking, the system binds the symbol to the function's actual address.

At run time, fishhook changes the address bound to the symbol, which is how it replaces an external C function.
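The idea can be illustrated with a toy model in plain Swift. This is only a sketch: there is no real Mach-O or dyld here, and `ToyImage`, `symbolSlot` and `callExternal` are hypothetical names invented for the illustration. The point is that a call to an external function goes through a pointer slot, so overwriting the slot redirects every later call.

```swift
// Toy model, not real Mach-O: one "symbol pointer" slot for one external function.
struct ToyImage {
    // The slot the dynamic linker would fill with the real implementation.
    var symbolSlot: (Int) -> Int

    // A call site never calls the implementation directly; it goes through the slot.
    func callExternal(_ x: Int) -> Int {
        return symbolSlot(x)
    }
}

// "Dynamic linking": the slot is bound to the real implementation (x + 1).
var image = ToyImage(symbolSlot: { $0 + 1 })
let before = image.callExternal(1)

// "Hooking": save the old pointer, then overwrite the slot with a replacement
// that can still call the original.
let savedOriginal = image.symbolSlot
image.symbolSlot = { x in savedOriginal(x) * 10 }
let after = image.callExternal(1)

print(before, after)
```

The replacement keeps a reference to the saved original, which mirrors why a hook records the old function address before overwriting it.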

MachOView, a great helper for learning fishhook

MachOView lets you inspect the layout of a Mach-O file on a Mac.

A simplified view of the internal layout of a Mach-O file:

A Mach-O file has two main kinds of segment:

the text segment (__TEXT), which holds the program's code,

the data segment (__DATA), which holds its data.

fishhook, of course, only modifies the data segment.

fishhook works with two tables in the data segment: the non-lazy symbol pointer table and the lazy symbol pointer table.

  • Non-lazy symbol pointers (__nl_symbol_ptr):

the system has already resolved these symbols to actual function addresses by the time the program starts running.

  • Lazy symbol pointers (__la_symbol_ptr):

the system looks up the actual function address the first time the function is called.
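The difference between the two tables can be sketched with another toy model. Everything here is hypothetical (`ToyLinker`, its `library` dictionary, the symbol names); it only imitates the behaviour described above, where a non-lazy slot is resolved at load time and a lazy slot starts out pointing at a stub that resolves the symbol on first use and then patches the slot.

```swift
// Toy model of the two pointer tables; no real Mach-O is involved.
final class ToyLinker {
    // Stand-in for the shared cache: symbol name -> real implementation.
    let library: [String: (Int) -> Int] = ["_double": { $0 * 2 }, "_negate": { -$0 }]
    var lookups = 0                        // how many symbol resolutions happened
    var slots: [String: (Int) -> Int] = [:]

    func resolve(_ name: String) -> (Int) -> Int {
        lookups += 1
        return library[name]!
    }

    // Non-lazy pointer: resolved immediately, before the program runs.
    func bindNonLazy(_ name: String) {
        slots[name] = resolve(name)
    }

    // Lazy pointer: the slot first holds a stub; the first call resolves
    // the symbol and overwrites the slot with the real implementation.
    func bindLazy(_ name: String) {
        slots[name] = { [unowned self] x in
            let real = self.resolve(name)
            self.slots[name] = real
            return real(x)
        }
    }

    func call(_ name: String, _ x: Int) -> Int {
        return slots[name]!(x)
    }
}

let linker = ToyLinker()
linker.bindNonLazy("_double")
linker.bindLazy("_negate")
let lookupsAfterLoad = linker.lookups      // only the non-lazy symbol so far

let first = linker.call("_negate", 3)      // stub runs, resolves, patches the slot
let second = linker.call("_negate", 4)     // goes straight to the real function
let lookupsAfterCalls = linker.lookups
```

In both cases the final state is the same: a slot in the data segment holding a function address, which is what a hook overwrites.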

Implementation:

A basic hook, written in Swift

Features:
  • Writing a hook in Swift means spelling out function calling conventions

Declaring the calling convention makes this a bit more cumbersome than the C fishhook

  • The Swift version uses pointers a lot

Because Swift is a safe language, pointer handling is a bit trickier than in the C fishhook

In the C fishhook, these conversions are easy
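Both points can be seen in a small sketch. It assumes nothing from the project: `CIntFunction` and `replacement` are made-up names. A function that is to be used as a hook has to be given the C calling convention with `@convention(c)`, and it has to be converted to a raw pointer (the shape stored in `rebinding.replacement`) and back again with `unsafeBitCast`.

```swift
// A C-convention function type: it carries no captured context,
// so a value of this type is just a code address.
typealias CIntFunction = @convention(c) (Int32) -> Int32

// Hypothetical replacement function, written as a non-capturing closure.
let replacement: CIntFunction = { value in value * 2 }

// Erase the type to get a raw pointer that a rebinding record could store.
let rawPointer = unsafeBitCast(replacement, to: UnsafeMutableRawPointer.self)

// Going the other way: turn a raw pointer (for example a saved original)
// back into something callable. The caller must know the right signature;
// the compiler cannot check it.
let restored = unsafeBitCast(rawPointer, to: CIntFunction.self)
let result = restored(21)
print(result)
```

In C the same round trip is an ordinary cast between a function pointer and `void *`, which is why the C version reads more simply.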

Code:

Every hook involves three things:

the old function (the one to be replaced),

the new function (the replacement we supply),

and a variable that records the address of the function being replaced.

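Storing the information

    import Foundation
    import MachO

    public struct Rebinding {
        let name: String
        // Pointer to the address of the new function
        let replacement: UnsafeMutableRawPointer
        // Pointer to the address of the original function
        var replaced: UnsafeMutableRawPointer?
    }

    // The rebinding currently being applied (used by the functions below)
    private var currentRebinding: Rebinding?

Manipulating the Mach-O binaries

Get the loaded images and hook each one

    public func _rebindSymbol(_ rebinding: Rebinding) {
        if currentRebinding == nil {
            currentRebinding = rebinding
            // First call: register a callback that dyld invokes for every image,
            // both those already loaded and those added later
            _dyld_register_func_for_add_image(rebindSymbolForImage)
        } else {
            currentRebinding = rebinding
            // Walk the images that are already loaded and hook each one
            for i in 0..<_dyld_image_count() {
                rebindSymbolForImage(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i))
            }
        }
    }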

Performing the hook

The work is mainly to locate, inside the Mach-O file, the lazy symbol pointer section, the non-lazy symbol pointer section, the symbol table, the dynamic symbol table and the string table.

Because of how Apple designed the format, the value you want cannot be read directly; you have to follow the chain of tables step by step to reach it.

The table lookup is tedious.
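The chain itself is short once written down: a pointer slot's position gives an entry in the indirect symbol table, that entry is an index into the symbol table, and the symbol table entry holds an offset into the string table, where the name is stored. The following sketch walks that chain over small made-up arrays. `ToyTables` and its contents are hypothetical; the real tables live in the __LINKEDIT segment and are reached through raw pointers.

```swift
// Toy versions of the three tables, with made-up contents.
struct ToyTables {
    let indirectSymbols: [UInt32]       // slot index -> symbol table index
    let symbolStringOffsets: [UInt32]   // symbol table index -> string table offset (n_strx)
    let strings: [UInt8]                // NUL-separated symbol names

    func symbolName(forSlot slot: Int) -> String {
        // Step 1: the indirect symbol table maps the slot to a symbol table index.
        let symbolIndex = Int(indirectSymbols[slot])
        // Step 2: the symbol table entry gives an offset into the string table.
        let stringOffset = Int(symbolStringOffsets[symbolIndex])
        // Step 3: the name is the NUL-terminated string starting at that offset.
        let nameBytes = strings[stringOffset...].prefix { $0 != 0 }
        return String(decoding: nameBytes, as: UTF8.self)
    }
}

let tables = ToyTables(
    indirectSymbols: [1, 0],
    symbolStringOffsets: [9, 1],
    strings: Array("\0_printf\0_open\0".utf8)
)

let slot0 = tables.symbolName(forSlot: 0)
let slot1 = tables.symbolName(forSlot: 1)
print(slot0, slot1)
```

C symbol names are stored with a leading underscore, which is why a comparison against a name such as `open` has to start at the second byte of the stored string.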

    func rebindSymbolForImage(_ mh: UnsafePointer<mach_header>?, _ slide: Int) {
        guard let mh = mh else { return }

        var curSegCmd: UnsafeMutablePointer<segment_command_64>!
        var linkeditSegment: UnsafeMutablePointer<segment_command_64>!
        var symtabCmd: UnsafeMutablePointer<symtab_command>!
        var dysymtabCmd: UnsafeMutablePointer<dysymtab_command>!

        // The load commands start right after the header:
        // first command = header address + size of mach_header_64
        var cur = UnsafeRawPointer(mh).advanced(by: MemoryLayout<mach_header_64>.stride)

        // Walk the load commands to locate __LINKEDIT, LC_SYMTAB and LC_DYSYMTAB
        for _ in 0..<mh.pointee.ncmds {
            curSegCmd = UnsafeMutableRawPointer(mutating: cur).assumingMemoryBound(to: segment_command_64.self)
            cur = UnsafeRawPointer(cur).advanced(by: Int(curSegCmd.pointee.cmdsize))

            if curSegCmd.pointee.cmd == LC_SEGMENT_64 {
                // String(cString:maxLength:) is not a standard-library initializer;
                // it is presumably a helper defined elsewhere in the project
                let segname = String(cString: &curSegCmd.pointee.segname, maxLength: 16)
                if segname == SEG_LINKEDIT {
                    linkeditSegment = curSegCmd
                }
            } else if curSegCmd.pointee.cmd == LC_SYMTAB {
                symtabCmd = UnsafeMutableRawPointer(mutating: curSegCmd).assumingMemoryBound(to: symtab_command.self)
            } else if curSegCmd.pointee.cmd == LC_DYSYMTAB {
                dysymtabCmd = UnsafeMutableRawPointer(mutating: curSegCmd).assumingMemoryBound(to: dysymtab_command.self)
            }
        }

        guard linkeditSegment != nil, symtabCmd != nil, dysymtabCmd != nil else { return }

        // Base address used for the link-edit data:
        // slide + __LINKEDIT.vmaddr - __LINKEDIT.fileoff
        let linkeditBase = slide + Int(linkeditSegment.pointee.vmaddr) - Int(linkeditSegment.pointee.fileoff)
        // Symbol table address = base address + symbol table offset
        let symtab = UnsafeMutablePointer<nlist_64>(bitPattern: linkeditBase + Int(symtabCmd.pointee.symoff))
        // String table address = base address + string table offset
        let strtab = UnsafeMutablePointer<UInt8>(bitPattern: linkeditBase + Int(symtabCmd.pointee.stroff))
        // Indirect symbol table address = base address + indirect symbol table offset
        let indirectSymtab = UnsafeMutablePointer<UInt32>(bitPattern: linkeditBase + Int(dysymtabCmd.pointee.indirectsymoff))

        guard let _symtab = symtab, let _strtab = strtab, let _indirectSymtab = indirectSymtab else { return }

        // Second pass over the load commands, this time looking for the data segment
        cur = UnsafeRawPointer(mh).advanced(by: MemoryLayout<mach_header_64>.stride)
        for _ in 0..<mh.pointee.ncmds {
            curSegCmd = UnsafeMutableRawPointer(mutating: cur).assumingMemoryBound(to: segment_command_64.self)
            cur = UnsafeRawPointer(cur).advanced(by: Int(curSegCmd.pointee.cmdsize))

            if curSegCmd.pointee.cmd == LC_SEGMENT_64 {
                let segname = String(cString: &curSegCmd.pointee.segname, maxLength: 16)
                if segname == SEG_DATA {
                    // Found the data segment: iterate over its section headers,
                    // which follow the segment command one after another.
                    // (The offset is scaled by the size of section_64; the
                    // unscaled "+ Int(j)" would only be right for j == 0.)
                    for j in 0..<curSegCmd.pointee.nsects {
                        let sectionPtr = UnsafeRawPointer(curSegCmd).advanced(
                            by: MemoryLayout<segment_command_64>.size + Int(j) * MemoryLayout<section_64>.size)
                        let section = UnsafeMutableRawPointer(mutating: sectionPtr).assumingMemoryBound(to: section_64.self)
                        // Only the lazy and non-lazy symbol pointer sections are of interest
                        if section.pointee.flags == S_LAZY_SYMBOL_POINTERS || section.pointee.flags == S_NON_LAZY_SYMBOL_POINTERS {
                            performRebindingWithSection(section, slide: slide, symtab: _symtab, strtab: _strtab, indirectSymtab: _indirectSymtab)
                        }
                    }
                }
            }
        }
    }
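The base address arithmetic is easier to follow with numbers. The values below are made up for illustration (they do not come from any real binary) and assume a 64-bit `Int`. The offsets stored in LC_SYMTAB and LC_DYSYMTAB are file offsets, so they are first made relative to the start of __LINKEDIT in the file and then applied to where __LINKEDIT actually sits in memory.

```swift
// Made-up values, for illustration only.
let slide = 0x4000                     // ASLR slide of the image
let linkeditVMAddr = 0x1_0000_8000     // __LINKEDIT vmaddr as recorded in the file
let linkeditFileOffset = 0x8000        // __LINKEDIT fileoff
let symoff = 0x9000                    // LC_SYMTAB symoff, a file offset

// Same formula as in the code above.
let linkeditBase = slide + linkeditVMAddr - linkeditFileOffset

// Adding a file offset to the base gives the in-memory address of the table.
let symtabAddress = linkeditBase + symoff

// Equivalent view: where __LINKEDIT is mapped, plus the table's offset
// inside __LINKEDIT.
let sameAddress = (slide + linkeditVMAddr) + (symoff - linkeditFileOffset)

print(String(linkeditBase, radix: 16), String(symtabAddress, radix: 16))
```

The string table and the indirect symbol table are located the same way, using `stroff` and `indirectsymoff` in place of `symoff`.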

Swap in the new function and save a pointer to the old one

    func performRebindingWithSection(_ section: UnsafeMutablePointer<section_64>,
                                     slide: Int,
                                     symtab: UnsafeMutablePointer<nlist_64>,
                                     strtab: UnsafeMutablePointer<UInt8>,
                                     indirectSymtab: UnsafeMutablePointer<UInt32>) {
        guard var rebinding = currentRebinding,
              let symbolBytes = rebinding.name.data(using: String.Encoding.utf8)?.map({ $0 }) else {
            return
        }

        // reserved1 of the __nl_symbol_ptr and __la_symbol_ptr sections is the
        // section's starting index in the indirect symbol table
        let indirectSymbolIndices = indirectSymtab.advanced(by: Int(section.pointee.reserved1))
        // slide + section->addr is the array of function pointers held by the
        // section, i.e. the pointer slots of __nl_symbol_ptr / __la_symbol_ptr
        let indirectSymbolBindings = UnsafeMutablePointer<UnsafeMutableRawPointer>(bitPattern: slide + Int(section.pointee.addr))

        guard let _indirectSymbolBindings = indirectSymbolBindings else { return }

        // Walk every symbol pointer in the section
        for i in 0..<Int(section.pointee.size) / MemoryLayout<UnsafeMutableRawPointer>.size {
            // Look up the slot in the indirect symbol table to get its index in the
            // symbol table (works for both the lazy and the non-lazy table)
            let symtabIndex = indirectSymbolIndices.advanced(by: i)
            if symtabIndex.pointee == INDIRECT_SYMBOL_ABS || symtabIndex.pointee == INDIRECT_SYMBOL_LOCAL {
                continue
            }
            // Use symtabIndex as a subscript into the symbol table
            let strtabOffset = symtab.advanced(by: Int(symtabIndex.pointee)).pointee.n_un.n_strx
            // Get the symbol name from the string table
            let symbolName = strtab.advanced(by: Int(strtabOffset))

            // Compare byte by byte, skipping the leading underscore of the C symbol
            var isEqual = true
            for k in 0..<symbolBytes.count where symbolBytes[k] != symbolName.advanced(by: k + 1).pointee {
                isEqual = false
                break
            }
            // Require an exact match: without this check, hooking "open" would
            // also match any symbol that merely starts with "_open"
            if isEqual && symbolName.advanced(by: symbolBytes.count + 1).pointee != 0 {
                isEqual = false
            }

            if isEqual {
                // Save the implementation of the old function
                rebinding.replaced = _indirectSymbolBindings.advanced(by: i).pointee
                // Install the new function
                _indirectSymbolBindings.advanced(by: i).initialize(to: rebinding.replacement)
                // rebinding is a local copy of a struct; write it back so the
                // saved pointer is not lost when this function returns
                currentRebinding = rebinding
            }
        }
    }

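Seeing the effect in lldb:

image list: get the base (load) address of the program

MachOView: read the offset of the symbol pointer

x/1gx: read the address stored at base address + offset, which is the address bound to the symbol

dis: disassemble at that address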

More information:

github repo