Preface

In the previous article, iOS Application Security 4: Code injection, stealing the WeChat login password, we learned how to hook an app's Objective-C (OC) methods: the OC runtime lets us replace the implementation of a method while the app is running. Have you ever wondered how to hook functions written in C? That is what fishhook is for.

Fishhook profile

fishhook is a tool provided by Facebook for dynamically rebinding symbols in linked Mach-O files. It relies on how Mach-O files are loaded: a C function is hooked by modifying the pointers in the lazy and non-lazy symbol pointer tables.

Without further ado, download it and add it to the project.


  1. fishhook defines one structure. Here it is with some explanatory comments added:
// rebinding: bind again
struct rebinding {
  const char *name;         // Name of the function to hook, as a C string
  void *replacement;        // Address of the function that replaces the original
  void **replaced;          // Address of a pointer that receives the original function's address
};
  2. It also declares two functions:
FISHHOOK_VISIBILITY
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel);

FISHHOOK_VISIBILITY
int rebind_symbols_image(void *header,
                         intptr_t slide,
                         struct rebinding rebindings[],
                         size_t rebindings_nel);

The second function is obviously an extension of the first, so let’s look at how the first function is used.

Fishhook use

For simplicity, instead of re-signing an app for testing, we hook the C function NSLog inside the ViewController class of a new project. There is very little code:

// Pointer that receives the address of the system NSLog function
static void (*sys_log)(NSString *format, ...);

// Custom print function that replaces NSLog
void cus_log(NSString *format, ...) {
    // Expand the variadic arguments first so that none of them is lost
    va_list args;
    va_start(args, format);
    NSString *message = [[NSString alloc] initWithFormat:format arguments:args];
    va_end(args);
    // Append a marker to the end of the printed string
    message = [message stringByAppendingString:@"--------> mark of cus_log"];
    // Call the original NSLog function
    sys_log(@"%@", message);
}

- (void)viewDidLoad {
    [super viewDidLoad];
    self.view.backgroundColor = [UIColor whiteColor];
    // Hook the system NSLog function as an example
    // Create a rebinding structure variable
    struct rebinding rebind;
    rebind.name = "NSLog";                  // A C string, so no @ prefix
    rebind.replacement = cus_log;           // A C function name is itself a function pointer
    // Address of the pointer in which rebind_symbols stores the original NSLog address
    rebind.replaced = (void **)&sys_log;

    // A rebinding array with a single element; it can hold more than one
    struct rebinding rbs[] = {rebind};
    rebind_symbols(rbs, 1);
}

// Test
- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    NSLog(@"Click the screen 😂😂😂");
}

I think the comments in the code above are clear enough, so I won't repeat them here. One thing to note: the replaced field of the rebinding structure is a void**, because rebind_symbols has to write the original function's address back into a variable that we own. If you are not familiar with this, review the difference between passing by value and passing by address in C. My article from a few days ago, Data Structures and Algorithms 2: a summary of linked list fundamentals, explains this in detail.
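To make the pass-by-value versus pass-by-address point concrete, here is a minimal sketch in plain C. The names fn_t, original_impl, save_by_value and save_by_address are hypothetical and are not part of fishhook; the sketch uses a typed function pointer instead of void** only to stay within standard C.

```c
#include <stddef.h>

/* Hypothetical function-pointer type standing in for "the original function". */
typedef int (*fn_t)(void);

int original_impl(void) { return 1; }

/* Pass by value: only the callee's local copy of the pointer changes. */
void save_by_value(fn_t out) {
    out = original_impl;
    (void)out; /* the caller never sees this assignment */
}

/* Pass by address: the callee writes through the pointer,
   so the caller's variable receives the original address. */
void save_by_address(fn_t *out) {
    *out = original_impl;
}
```

After save_by_value(p) the caller's p is unchanged, while after save_by_address(&p) it holds the address of original_impl. That is the same reason we hand &sys_log to fishhook rather than sys_log itself.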

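Run the project and tap the screen: the log line is printed with the marker --------> mark of cus_log appended, which shows that the hook works.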

Hook custom C functions?

We have used fishhook to replace the system C function NSLog. Can we also use it to replace a C function that the app's developers wrote themselves? To find out, here is the test code:

// A simple C function; assume it is implemented in someone else's app
void printHelloWorld(void) {
    NSLog(@"Hello World!");
}

// Pointer that receives the implementation address of printHelloWorld
static void (*app_printHelloWorld)(void);

// Function meant to replace printHelloWorld
void hook_printHelloWorld(void) {
    NSLog(@"If this line is printed before Hello World!, the hook succeeded.");
    // Call the original printHelloWorld function
    app_printHelloWorld();
}

- (void)viewDidLoad {
    [super viewDidLoad];
    self.view.backgroundColor = [UIColor whiteColor];

    struct rebinding rb;
    rb.name = "printHelloWorld";
    rb.replacement = hook_printHelloWorld;
    rb.replaced = (void **)&app_printHelloWorld;

    struct rebinding rbs[] = {rb};
    rebind_symbols(rbs, 1);
}

// Test
- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    printHelloWorld();
}

The hook failed: the line that signals a successful hook was not printed before Hello World!.

Analyze the cause of failure

P1. Why did it fail? Why can't this custom C function be hooked?

Thinking it over, two things stand out:

  1. OC is a dynamic language. "Dynamic" means that which implementation of an OC method actually gets executed is decided at runtime.

    Because OC is dynamic, we can hook OC methods with Method Swizzling and similar techniques.
  2. C is a static language. "Static" means that for a function written in C, the code to be executed when the function is called is already determined when the program is compiled. Because of this, we cannot hook functions written in C the way we hook OC methods.

So why can fishhook hook the NSLog function? Isn't NSLog a C function too?

First of all, NSLog really is a C function; that part is correct. The previous article, iOS App Security 5: What happens before the main function is called?, focused on two topics: Mach-O files and how dyld loads an application. They were not covered for nothing; both are needed here.

Shared cache

In the process by which dyld loads an application, there is a step called loading the shared cache. Do you remember what that step does?

NSLog is a C function in Foundation, a system framework. Foundation is a true dynamic library whose code is shared across multiple processes, and system libraries of this kind live in what we usually call the shared cache.

The difference between custom C functions and C functions in the system shared library

System C function

For a C function defined by the system, the implementation lives in a system shared library, so its address cannot be known while the program is being compiled. The call can only be linked dynamically, by function name, through a mechanism called symbol binding. That is a bit abstract, so here is a concrete scenario:

  1. Create a new project and write NSLog(@"hahaha"); in viewDidLoad of ViewController.m.
  2. Press Command+B to compile. When compilation reaches the statement NSLog(@"hahaha");, Xcode determines whether NSLog is a function in a system library.
  3. Since it is, the statement NSLog(@"hahaha"); is bound to the char * C string "NSLog" in the symbol table.
  4. Press Command+R to run the app. While dyld loads the application, the Foundation library in the iOS shared cache is loaded into memory.
  5. While the program runs, the first call to NSLog looks up the implementation address of NSLog in the Mach-O file of the Foundation library and binds that address to the string "NSLog" in the symbol table. Later calls to NSLog find the implementation address directly. (Because the lookup is deferred until first use, this is also known as lazy binding.)
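The lazy binding in step 5 can be mimicked in a few lines of plain C. This is only a toy model under my own assumptions: real_add stands in for a function inside a shared library, lazy_slot for one entry of the lazy symbol pointer table, and binder_stub for the code that resolves the address on first use. None of these names come from dyld or fishhook.

```c
typedef int (*add_fn)(int, int);

/* Stands in for a function whose code lives in a shared library. */
int real_add(int a, int b) { return a + b; }

int binder_stub(int a, int b);

/* One slot of a pretend lazy symbol pointer table.
   Before the first call it points at the binder, not at the real function. */
add_fn lazy_slot = binder_stub;

/* Counts how many times the address had to be resolved. */
int bind_count = 0;

/* The first call lands here: resolve the real address,
   patch the slot, then forward the call. */
int binder_stub(int a, int b) {
    bind_count++;
    lazy_slot = real_add;
    return lazy_slot(a, b);
}

/* Call sites never call real_add directly; they always go through the slot. */
int call_add(int a, int b) {
    return lazy_slot(a, b);
}
```

The first call_add goes through binder_stub and patches the slot; every later call jumps straight to real_add, so bind_count stays at 1.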

Custom C functions

By now you can probably guess how custom C functions work. For a custom C function, the implementation and the call are in the same Mach-O file (the app's own), so at compile and link time Xcode links the call statement directly to the implementation address. None of the steps described for system C functions take place.
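The difference between the two kinds of call can be sketched in plain C. The names below are hypothetical: call_direct models a call to a function in the same binary, and call_through_slot models a call that goes through a symbol pointer, as calls to system functions do.

```c
int print_hello(void)        { return 1; }  /* the "original" function */
int hooked_print_hello(void) { return 2; }  /* the would-be replacement */

/* A pointer slot, comparable to a symbol pointer for an imported function. */
int (*hello_slot)(void) = print_hello;

/* Direct call: the target is fixed when the program is built. */
int call_direct(void) {
    return print_hello();
}

/* Indirect call: the target is whatever the slot holds at call time. */
int call_through_slot(void) {
    return hello_slot();
}
```

Overwriting hello_slot changes what call_through_slot does, but call_direct keeps reaching print_hello. fishhook can only rewrite slots, which is why the custom function in the earlier test was not hooked.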

Fishhook principle

Now that we understand how system C functions work, we can already guess how fishhook hooks them. Again, take NSLog as the example:

As described above, when NSLog is called for the first time, the system finds the implementation address of NSLog in the Foundation framework and binds it to the string "NSLog" in the symbol table of the app's Mach-O file.

Fishhook takes advantage of this. Look back at the fishhook function rebind_symbols, which translates to rebinding symbols. What’s in the rebinding structure?

// rebinding: bind again
struct rebinding {
  const char *name;         // Name of the function to hook, as a C string
  void *replacement;        // Address of the function that replaces the original
  void **replaced;          // Address of a pointer that receives the original function's address
};

Doesn't it all fit together? We pass rebind_symbols a rebinding structure that contains "NSLog", the symbol of the NSLog function in the symbol table. rebind_symbols uses the symbol "NSLog" to find the address of the NSLog implementation in the Foundation framework, stores that address in the replaced variable, and so hands it back to us. It then replaces the function address bound to the symbol "NSLog" with the replacement address that we passed in. (Note: to make sure the value fishhook returns is correct, the function is called once inside rebind_symbols by default if necessary.)
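Here is a toy version of that rebinding step in plain C. It is a sketch of the idea only, not fishhook's actual implementation: toy_slot, toy_rebinding and toy_rebind are hypothetical names, and the "symbol table" is just an array of name and pointer pairs instead of real Mach-O data.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Toy counterpart of struct rebinding, with typed function pointers. */
struct toy_rebinding {
    const char *name;        /* symbol name to look for            */
    generic_fn replacement;  /* function to bind to the symbol     */
    generic_fn *replaced;    /* where to store the old target      */
};

/* Toy symbol pointer entry: a name and the address currently bound to it. */
struct toy_slot {
    const char *name;
    generic_fn target;
};

/* Find the slot by name, hand the old target back through `replaced`,
   then bind the slot to the replacement. Returns 0 on success, -1 if the
   name is not in the table. */
int toy_rebind(struct toy_slot *slots, size_t nslots,
               const struct toy_rebinding *rb) {
    for (size_t i = 0; i < nslots; i++) {
        if (strcmp(slots[i].name, rb->name) == 0) {
            if (rb->replaced != NULL) {
                *rb->replaced = slots[i].target;
            }
            slots[i].target = rb->replacement;
            return 0;
        }
    }
    return -1;
}

/* Demo functions: a pretend system log and a custom log that forwards to it. */
int sys_calls = 0;
int custom_calls = 0;
generic_fn saved_original = NULL;

void toy_sys_log(void) { sys_calls++; }

void toy_cus_log(void) {
    custom_calls++;
    if (saved_original != NULL) {
        saved_original();  /* call the original, as cus_log does with sys_log */
    }
}
```

Rebinding the "NSLog" entry with { "NSLog", toy_cus_log, &saved_original } leaves saved_original pointing at toy_sys_log, and any later call through the table runs toy_cus_log first and then the original.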

Empty talk?

That was a lot of theory. How do we know it is true? Let's verify it step by step.

  1. To make sure the original function address it returns is correct, fishhook calls the corresponding function once by default before rebinding. For a clearer test, though, we can call NSLog manually before calling rebind_symbols: add the statement NSLog(@"123456"); and set a breakpoint before that call.
  2. Run the code and stop at the breakpoint on line 20. Use the lldb command image list to view the load address of the main program's Mach-O image (randomized by ASLR).
  3. Open your app's Mach-O file with the MachOView tool.
  4. Compute the real memory address of the slot that the symbol NSLog is currently bound through: add the offset shown in MachOView to the image's load address.

    That is, the real address = 0x00000001005B0000 + 0xC000 = 0x00000001005BC000.
  5. Use the lldb command memory read to look at the memory at the address we calculated.
  6. The value stored for the symbol NSLog is a pointer, and it points to the address 0x00000001005b6904. iOS is little endian by default, so the low-order byte is stored at the lowest address and the bytes have to be read in reverse order.
  7. Disassemble that address with the lldb command dis -s. At this point it is still a meaningless address: it is the default address bound to the symbol NSLog.
  8. Continue to the next breakpoint so that NSLog(@"123456"); executes and execution stops just before the rebind_symbols call.
  9. Repeat step 5 and view the memory with memory read. The same memory location now stores a different address.
  10. Again, disassemble this address with dis -s (mind the little-endian byte order).
  11. Now the pointer bound to the symbol NSLog has changed from a meaningless address to the address of the NSLog function in the Foundation framework.
  12. It's not over yet. Continue one more step so that rebind_symbols executes, then check that memory once more with memory read.
  13. The address stored in the binding pointer has changed again. As before, disassemble it.
  14. The pointer bound to the symbol NSLog no longer holds the address of the NSLog function in Foundation; fishhook has changed it to the address of the cus_log function in fishhook-test.
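The arithmetic in steps 4 and 6 can be checked with a short C sketch. The function names are my own, and the byte sequence in the usage note is an assumption reconstructed from the pointer value quoted in step 6, not a copy of the actual lldb output.

```c
#include <stddef.h>
#include <stdint.h>

/* Step 4: runtime address of a slot = load address reported by
   `image list` + offset shown in MachOView. */
uint64_t runtime_address(uint64_t image_base, uint64_t file_offset) {
    return image_base + file_offset;
}

/* Step 6: assemble the 8 bytes shown by `memory read` into a pointer value.
   Little endian means the first byte in memory is the least significant. */
uint64_t decode_le64(const uint8_t bytes[8]) {
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}
```

runtime_address(0x00000001005B0000, 0xC000) gives 0x00000001005BC000, and decoding the bytes 04 69 5b 00 01 00 00 00 gives 0x00000001005b6904, the address disassembled in step 7.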

That's how it works. I wasn't making it up, was I?

How do I find the symbol NSLog from the C string “NSLog”?

From the above we know how the hook works: the symbol in the lazy symbol table of the Mach-O file leads to the memory address of the function, and that address is then replaced. The premise of all this is finding NSLog in the symbol table in the first place, so how is that done?

Look at fishhook's rebinding structure again. Besides the address of the hook function, we have to provide a C string with the same name as the system C function. So this C string must be what is used to find the symbol of the corresponding function.

  1. Open your app's Mach-O file with MachOView. It contains an item called String Table which, as the name suggests, is a table of strings. "NSLog" is a string, so the search for the function symbol naturally starts in the String Table.
  2. In the String Table view the strings are separated by . (the null byte that ends each C string is displayed as a dot), and a name that starts with _ is a function. Look closely at the bytes that correspond to "._NSLog": each byte is simply the ASCII code of one character.
  3. In other words, we can take the characters of the C string "NSLog", convert them to their ASCII codes, and search the String Table of the Mach-O file for them. That gives the offset of the string "NSLog" within the file.
  4. The offset in the figure above is slightly off: it should be the offset of "_NSLog", not of "NSLog", so the result should be 0x00010F5B instead of 0x00010F5C.
  5. Subtract the String Table's offset in the Mach-O file from the offset of "_NSLog" in the Mach-O file to get the offset of "_NSLog" relative to the String Table.

    Result = 0x00010F5B - 0x00010EC0 = 0x9B.
  6. Next, look at the Symbol Table. Every symbol there has a String Table Index, which is the value we worked out in the previous step.
  7. Find the entry whose String Table Index equals 0x9B. Its Value is _NSLog, so the calculation is correct.
  8. The position of "_NSLog" in the symbol table is 105, which is 0x69 in hexadecimal. Switch to the Dynamic Symbol Table and find the row with the value 0x69.
  9. The Symbol for the value 0x69 is exactly _NSLog, so this is right again. The key piece is the Indirect Address below it: 0x10000C000.
  10. In 0x10000C000, the 0x100000000 part is the size of the __PAGEZERO segment listed under Load Commands. That size is a virtual size only.
  11. So the real offset of the Indirect Address is 0xC000. Does 0xC000 look familiar? Now look at the lazy symbol table, Lazy Symbol Pointers.
  12. The symbol at offset 0xC000 is _NSLog. With that, the whole chain is finally connected.
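Steps 3 to 5 boil down to a string search and a subtraction. Below is a minimal C sketch with hypothetical helper names; the table it searches is a plain buffer of null-terminated strings, which I assume mirrors the layout of the String Table, rather than data read from a real Mach-O file.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of `name` inside a buffer of consecutive
   null-terminated strings, or -1 if it is not there.
   Assumes the buffer itself ends with a null byte. */
long strtab_offset(const char *table, size_t table_size, const char *name) {
    size_t pos = 0;
    while (pos < table_size) {
        const char *entry = table + pos;
        if (strcmp(entry, name) == 0) {
            return (long)pos;
        }
        pos += strlen(entry) + 1;  /* skip this entry and its terminator */
    }
    return -1;
}

/* Step 5: index relative to the String Table =
   offset of the string in the file - offset of the table in the file. */
unsigned long strtab_index(unsigned long string_file_offset,
                           unsigned long table_file_offset) {
    return string_file_offset - table_file_offset;
}
```

With the numbers from step 5, strtab_index(0x00010F5B, 0x00010EC0) is 0x9B, the String Table Index found in step 7. Note that the search has to use "_NSLog" with the leading underscore; looking for "NSLog" as a whole entry finds nothing.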

Conclusion

Through this article, we have learned the following points:

  • What is fishhook?
    • It is a tool used to hook system C functions.
  • How is fishhook used?
    • The code is very simple; see the NSLog example above.
  • Why can fishhook hook system C functions?
    • Because the implementation of a system C function lives in a shared library, its address cannot be obtained during compilation. It is bound to a symbol at runtime, and that binding can be changed.
  • How does fishhook work internally?
    • Starting from the function's symbol string in the Mach-O file, a series of lookups finally leads to the pointer that holds the function's implementation address, and that address is replaced with the address of our own function.

Side topic: simple protection

As mentioned before, hooking an OC method means exchanging method implementation addresses, which can be done in three ways. After the exchange, the app runs the swapped-in method 2 whenever it calls method 1, and when method 2 is then called from inside our code, the added code runs together with the original implementation of method 1. For details, refer back to iOS Application Security 4: Code injection, stealing the WeChat login password.

So how can we, as developers, defend against this kind of cracking? With what this article covers, you can use fishhook to replace every runtime C function used for method exchange with a C function that does nothing useful. When someone else then tries to exchange methods, they end up calling those useless functions. Ha ha 😄.

But is this completely safe? Not even close. Others can still use other methods to undermine or circumvent your protection.

Attack and defense always go hand in hand. There is no absolute security, only relative security. If a hacker is truly determined to break your app, no amount of protection will stop them. All we can do is make cracking it take so long that the hacker gives up.

Original article: https://juejin.cn/post/6844904128158629902