Setting the scene

In iOS development, any mention of "dark magic" or hooking makes many people think first of AOP, the Objective-C runtime, and method swizzling. Their basic usage is not actually difficult; the hard part is using them in the right place.

Everything has two sides: the more powerful a tool is, the more damage it can do when misused. The runtime mechanism Apple provides is useful, but when it is abused in a project (rather than used only to impress in interviews), it often backfires. For details, see "The cancer of iOS: MethodSwizzling".

Method swizzling usage was covered in an earlier article; see "Several postures for MethodSwizzling". The technique is mainly used when building tooling such as performance monitoring, crash protection and reporting, and anti-cracking protection. In reverse engineering, its usefulness is limited when the target app has corresponding security protections.

There is more to "dark magic" than the runtime. Today we look at another hooking technique commonly used in reverse engineering: fishhook.

The story behind Fishhook

(1) Implementation principle

fishhook is Facebook's open-source tool for dynamically modifying the symbol table of a Mach-O binary. Its power lies in being able to hook the system's C functions, even though C is a statically compiled language.

Objective-C method calls compile down to objc_msgSend(id, SEL), which is what makes it possible to swap method implementations (IMP). A C function, by contrast, has its address fixed as an offset at compile and link time. That offset is constant inside the compiled executable, while the base address the system assigns to the image (visible with the image list command in LLDB) changes each time the executable is loaded into memory. The address of a function at run time is therefore: offset + load address of the Mach-O file.

Since the addresses of C functions are fixed in this way, how does fishhook hook them? In fact, fishhook cannot hook internal (custom) C functions; it can only hook external functions, that is, functions that live in the shared cache libraries. fishhook takes advantage of Mach-O's dynamic binding mechanism: the code in Apple's shared cache is not compiled into our Mach-O file, and references to it are bound at dynamic link time. Apple uses position-independent code (PIC) to make calls into these C functions dynamic at the lowest level:
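
The relationship between the fixed offset and the changing load address can be written as a one-line calculation. The following is a minimal sketch, not code from fishhook; the function name runtime_address is hypothetical, and the sample values in the note below are the ones that appear later in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Run-time address of an item in the image: the load address chosen by the
 * system for this launch plus the offset fixed in the Mach-O file at link time. */
uint64_t runtime_address(uint64_t load_address, uint64_t file_offset) {
    /* The sum must not wrap around the 64-bit address space. */
    assert(file_offset <= UINT64_MAX - load_address);
    return load_address + file_offset;
}
```

With the load address 0x000000010b0f7000 and the offset 0x3028 used in the walkthrough below, runtime_address returns 0x10b0fa028. Only the first argument changes between launches.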

  • At compile time, an 8-byte pointer placeholder is created for each referenced system C function in the symbol table of the __DATA segment of the Mach-O file. During dynamic binding, this pointer is redirected to the function's implementation in the shared library.
  • At run time, a system C function is dynamically bound the first time it is called. From then on, the corresponding pointer in the symbol table of the Mach-O __DATA segment points to the external function (its actual memory address in the shared library).
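
The bind-on-first-call behavior described in the two points above can be imitated in plain C. This is only a toy model under stated assumptions: lazy_ptr stands in for one entry of the lazy symbol pointer table, lazy_stub stands in for the system's binder, and external_add stands in for a shared-library function. All of these names are hypothetical, and the real mechanism lives in dyld, not in application code.

```c
#include <assert.h>

typedef int (*add_fn)(int, int);

/* Stand-in for a function that lives in a shared library. */
int external_add(int a, int b) { return a + b; }

int lazy_stub(int a, int b);   /* forward declaration of the toy binder */

/* Toy lazy symbol pointer: it starts out pointing at the binder,
 * not at the real function. */
add_fn lazy_ptr = lazy_stub;
int bind_count = 0;

/* Toy binder: reached only on the first call. It patches the pointer
 * so later calls go straight to the real function, then forwards the call. */
int lazy_stub(int a, int b) {
    assert(lazy_ptr == lazy_stub);   /* binding happens exactly once */
    bind_count++;
    lazy_ptr = external_add;
    return lazy_ptr(a, b);
}

/* Call sites always go through the pointer, never directly to the function. */
int call_add(int a, int b) { return lazy_ptr(a, b); }
```

The first call_add goes through lazy_stub and leaves lazy_ptr pointing at external_add; every later call skips the binder. Because call sites read the pointer on every call, anything that rewrites the pointer changes where the call lands.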

fishhook relies on PIC to do two things:

  • Rebind the pointers for system functions (external functions) so that they point to internal, custom C functions.
  • Store the dynamically linked address of the system function in a pointer owned by our own code, so that the original function can still be called.

In this way the system function and our own function are exchanged, which achieves the goal of hooking system C functions (those in the shared library).
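
The exchange can be sketched with an ordinary table of function pointers. This is a toy model, not fishhook's implementation: toy_slot, toy_rebinding, toy_rebind, system_log and my_log are all hypothetical names. The three fields of toy_rebinding mirror the name, replacement and replaced fields of fishhook's struct rebinding, but here the "symbol table" is a plain array instead of a Mach-O section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*log_fn)(const char *);

/* Toy symbol-pointer table entry: a name plus the slot that call sites read. */
struct toy_slot { const char *name; log_fn *slot; };

/* Toy rebinding request: which name to patch, what to install,
 * and where to save the previous target. */
struct toy_rebinding { const char *name; log_fn replacement; log_fn *replaced; };

/* For each request, find the slot by name, save the old target, install the new one.
 * Returns the number of slots patched. */
int toy_rebind(struct toy_slot *table, size_t nslots,
               struct toy_rebinding *reqs, size_t nreqs) {
    int patched = 0;
    for (size_t r = 0; r < nreqs; r++) {
        for (size_t s = 0; s < nslots; s++) {
            if (strcmp(table[s].name, reqs[r].name) != 0) continue;
            if (reqs[r].replaced) *reqs[r].replaced = *table[s].slot;
            *table[s].slot = reqs[r].replacement;
            patched++;
        }
    }
    return patched;
}

/* Hypothetical "system" function and the slot that call sites use. */
int system_log(const char *msg) { return (int)strlen(msg); }
log_fn log_ptr = system_log;

/* Filled in by toy_rebind so the hook can still reach the original. */
log_fn original_log = NULL;

/* Hypothetical hook: does its own work, then calls the original. */
int my_log(const char *msg) {
    assert(original_log != NULL);
    return 1000 + original_log(msg);
}
```

After a request of { "system_log", my_log, &original_log } is applied to a table containing { "system_log", &log_ptr }, calls through log_ptr reach my_log, and my_log reaches system_log through original_log. Both directions of the exchange are pointer writes; no code is modified.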

(2) Walking through the process in assembly

To better understand how fishhook hooks system C functions, we take hooking NSLog as an example and analyze it step by step at the assembly level, covering the whole process by which fishhook hooks the system NSLog.

Note: For symbols in the non-lazy symbol table, dyld binds them when the dynamic library is linked at load time. For symbols in the lazy symbol table, such as NSLog, dyld binds them the first time the function is called at run time.

1. Verify system dynamic binding:

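Create a new empty project and write two lines, each a call to NSLog, with a breakpoint on each:

In Xcode, check Debug -> Debug Workflow -> Always Show Disassembly to view the assembly. Once the first breakpoint is hit, look up the address at which the Mach-O image was loaded into memory. Adding the offset of the NSLog pointer in the symbol table to that load address gives the pointer's location in memory: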

0x3028+0x000000010b0f7000

  1. Read the current value of the pointer. iOS CPUs are little-endian and the current device is 64-bit, so reading the 8 bytes in reverse order gives the pointer value: 0x010B0F89A0
  2. dis -s is the disassemble command. Disassembling at that value shows that the pointer currently leads to the code that performs dynamic binding of the system function.
  3. Looking at the called function in more detail shows: libdyld.dylib`dyld_stub_binder

What is going on here? This is where the pointer for NSLog in the lazy symbol table gets rebound when NSLog is called for the first time.

Next we continue past the first breakpoint, stop at the second NSLog, and check again which address the pointer in the symbol table (still located at 0x3028 + 0x000000010B0F7000) points to.
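
The byte reversal in step 1 can be checked with a small helper. This is a minimal sketch; read_le64 is a hypothetical name, and the byte sequence mentioned in the note below is an assumption reconstructed from the pointer value reported in step 1, since the raw memory dump is not reproduced in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes stored least-significant byte first,
 * which is how a pointer appears in a raw memory dump on a little-endian CPU. */
uint64_t read_le64(const uint8_t *bytes) {
    assert(bytes != NULL);
    uint64_t value = 0;
    /* Walk from the last byte (most significant) down to the first. */
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

Given the bytes A0 89 0F 0B 01 00 00 00 in memory order, read_le64 returns 0x010B0F89A0, matching the value read in step 1.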

Before the first NSLog call: 0x010b0f89a0
After the first NSLog call: 0x010b491276

2. Verify fishhook rebinding:

We drag the fishhook source files into the project and add a simple rebinding:

Note: after the source is modified and the Mach-O file is recompiled, the pointer's offset in the symbol table may change, and the load address also changes when the program is run again. Both must be looked up again to calculate the pointer's new memory address.

After running, we tap the screen to hit the breakpoint shown in the image above and check the pointer in the symbol table that originally pointed to the system NSLog:

(3) How does fishhook get from a pointer in the symbol table to the name string of the function it refers to?

The fishhook project provides this diagram:

  1. The order of the entries in Lazy Symbol Pointers matches their order in Dynamic Symbol Table -> Indirect Symbols (here it is the first entry).

    In the actual calculation, the "reserved1" field of the section header is used: it gives the section's starting offset within Indirect Symbols. The source code analysis covers this in detail.

  2. The Data value of that first entry in Dynamic Symbol Table -> Indirect Symbols (0x7A = 122) is its index in Symbol Table -> Symbols.
  3. In Symbol Table -> Symbols, the entry at index 122 has Data = 0x9B, which is an offset into the String Table.
  4. Data (0x9B) + String Table start address (0x4F04) = 0x4F9F, the location of the symbol's name string.
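
The four steps above amount to three array lookups in a row. The sketch below models them with toy arrays rather than a real Mach-O file; name_for_slot and toy_sym are hypothetical names, and first_indirect plays the role of the section header's reserved1 field. In a real binary the middle table holds nlist entries and the string offset is their n_strx field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy symbol table entry: only the string-table offset matters here. */
struct toy_sym { uint32_t strx; };

/* Return the name belonging to the i-th pointer of a symbol-pointer section. */
const char *name_for_slot(uint32_t i, uint32_t first_indirect,
                          const uint32_t *indirect, size_t n_indirect,
                          const struct toy_sym *symtab, size_t n_syms,
                          const char *strtab) {
    /* Step 1: the i-th pointer maps to the i-th indirect entry of its section. */
    assert((size_t)first_indirect + i < n_indirect);
    /* Step 2: the indirect entry is an index into the symbol table. */
    uint32_t sym_index = indirect[first_indirect + i];
    assert(sym_index < n_syms);
    /* Step 3: the symbol entry holds an offset into the string table. */
    uint32_t strx = symtab[sym_index].strx;
    /* Step 4: string table start + offset is where the name begins. */
    return strtab + strx;
}
```

For example, with a string table "\0_printf\0_NSLog", symbols { {1}, {9} }, indirect entries { 0, 1, 0 } and first_indirect = 1, slot 0 resolves to the string at offset 9 ("_NSLog") and slot 1 to the string at offset 1 ("_printf"). Comparing the resolved name with the requested one is how a rebinding tool can decide which pointer to patch.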

Conclusion

Today we covered the basic principle and the concrete process by which fishhook hooks external system functions, in the context of the PIC technique used with the iOS shared cache libraries. We verified both the iOS dynamic binding process and fishhook's rebinding mechanism step by step with disassembly commands. Finally, we walked through the steps fishhook takes to resolve a symbol starting from a pointer in the symbol table. I hope you found it useful. My knowledge is limited, so corrections are welcome.

Source code analysis of fishhook, its application scenarios, and notes on security protection will follow in later posts.