Preface:
This article builds on the introduction to the Mach-O file format in the previous article, so reading that one first is recommended.
Hooks are often used for code injection in reverse engineering, and they are just as common in security protection and monitoring.
For intermediate and advanced developers, knowing only how to swap IMPs through the runtime is clearly not enough.
So today we will discuss hooks and the principle behind fishhook.
Overview of hooks
Hook: in iOS reverse engineering, a hook is a technique for changing the execution flow of a program. Hooking lets someone else's program execute your own code, which is why it is used so often in reverse engineering. While learning it, we should focus on understanding how it works, so that we can effectively defend against malicious code.
Hooking has been put to countless creative uses, which we will not list here.
Several common hooks in iOS
1. Method Swizzle
Using the Objective-C runtime, the mapping between a SEL (method selector) and its IMP (method implementation) is changed dynamically, which alters the flow of an OC method call. It is mainly used for OC methods.
Commonly used functions:
- method_exchangeImplementations: exchanges the IMPs of two methods
- class_replaceMethod: replaces the implementation of a method
- method_getImplementation and method_setImplementation: get or set an IMP directly
The basic usage of these runtime functions and how they work are explained, with a demo, at the end of the article on debugging and code modification for re-signed applications, if you are interested.
2. fishhook
fishhook is a tool from Facebook for dynamically rebinding symbols in linked Mach-O binaries. It relies on the way Mach-O files are loaded: a C function is hooked by modifying the pointers in the lazy and non-lazy symbol pointer tables.
3. Cydia Substrate
Cydia Substrate, formerly known as Mobile Substrate, is mainly used to hook OC methods, C functions, and function addresses. It is not limited to iOS; it works on Android as well. Official address: www.cydiasubstrate.com/
It is used with Logos syntax, which I will discuss in more detail in a future article.
Basic use of fishhook
Download
Git address: fishhook
If needed, an annotated version is also available for download (link extraction code: F4F8).
Demo
#import "ViewController.h"
#import "fishhook.h"

@interface ViewController ()
@end

// ---------- Change NSLog ----------
// Function pointer that will receive the address of the original NSLog
static void (*sys_nslog)(NSString *format, ...);

// Define the new function
void myNslog(NSString *format, ...) {
    format = [format stringByAppendingString:@" Hooked!\n"];
    // Call the original (the variadic arguments are not forwarded in this demo)
    sys_nslog(format);
}

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    // Rebinding structure
    struct rebinding nslog;
    // Name of the function to hook, as a C string
    nslog.name = "NSLog";
    // Address of the new function
    nslog.replacement = myNslog;
    // Pointer to the variable that will store the original function address
    nslog.replaced = (void *)&sys_nslog;
    // Rebinding array
    struct rebinding rebs[1] = {nslog};
    // Parameter 1: the array of rebinding structures; parameter 2: the array length
    rebind_symbols(rebs, 1);
}

- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    NSLog(@"Click the screen!!");
}

@end
Click on the screen and the following is printed:
001--fishHookDemo[15776:645816] Click the screen!! Hooked!
The key function
The source code of rebind_symbols:
int rebind_symbols(struct rebinding rebindings[], size_t rebindings_nel) {
    // prepend_rebindings adds the whole rebindings array to the head of _rebindings_head.
    // fishhook uses a linked list to store the parameters passed in each call to rebind_symbols.
    // Each call inserts a node at the head of the list; the head of the list is _rebindings_head.
    int retval = prepend_rebindings(&_rebindings_head, rebindings, rebindings_nel);
    // If prepend_rebindings returned a value less than 0, return it as an error code
    if (retval < 0) {
        return retval;
    }
    // Check whether _rebindings_head->next is NULL, i.e. whether this is the first call
    if (!_rebindings_head->next) {
        // First call: register the callback with _dyld_register_func_for_add_image.
        // Images that dyld has already loaded trigger the callback immediately.
        // Images loaded later trigger the callback when dyld loads them.
        _dyld_register_func_for_add_image(_rebind_symbols_for_image);
    } else {
        // Not the first call: walk the images that are already loaded and hook each one
        uint32_t c = _dyld_image_count();
        for (uint32_t i = 0; i < c; i++) {
            _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
        }
    }
    return retval;
}
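The linked-list bookkeeping described in the comments above can be sketched in portable C. This is a simplified illustration rather than fishhook's actual source: the names `entry` and `prepend` are made up for this sketch, and only the layout of `struct rebinding` follows the fields used in the demo (`name`, `replacement`, `replaced`).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as the rebinding struct used in the demo. */
struct rebinding {
    const char *name;   /* symbol name, e.g. "NSLog" */
    void *replacement;  /* address of the new function */
    void **replaced;    /* where the original address gets stored */
};

/* One node per rebind call (hypothetical name for this sketch). */
struct entry {
    struct rebinding *rebindings;
    size_t nel;
    struct entry *next;
};

/* Copy the caller's array into a new node and push it onto the head of the list.
   Returns 0 on success and -1 if an allocation fails. */
static int prepend(struct entry **head, const struct rebinding items[], size_t nel) {
    assert(head != NULL && items != NULL);
    struct entry *node = malloc(sizeof *node);
    if (!node) {
        return -1;
    }
    node->rebindings = malloc(sizeof *node->rebindings * nel);
    if (!node->rebindings) {
        free(node);
        return -1;
    }
    memcpy(node->rebindings, items, sizeof *node->rebindings * nel);
    node->nel = nel;
    node->next = *head;  /* the old head becomes the second node */
    *head = node;
    return 0;
}
```

After the first call the new head has no successor, which is the condition rebind_symbols tests with `!_rebindings_head->next` to decide whether the dyld callback still needs to be registered.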
The basic use of fishhook is simple.
- Define a pointer in which fishhook can save the address of the original system function, put the name of the function to replace and the address of the custom function into a struct, and call rebind_symbols. That is all.
- To hook several functions at once, put multiple structs into one array.
Fishhook analysis
We will not repeat the basic principle of hooking OC methods: it comes down to replacing the IMP of a method, which is easy to understand given the dynamic runtime of the OC language.
But what about C?
We know that C functions are static: at compile time the compiler already knows the address of their implementation, which is why calling a C function that has only a declaration and no definition produces an error. So why can fishhook change calls to C functions? Do these functions have a dynamic side after all? Let us explore how it works.
Note:
fishhook can hook system functions, not all C functions. In other words, fishhook can only rebind system functions that go through the symbol table; it cannot hook C functions implemented in our own code.
We can write a C function ourselves and try it out.
#import "ViewController.h"
#import "fishhook.h"

@interface ViewController ()
@end

// The C function we try to hook
void func(const char *str) {
    NSLog(@"%s", str);
}

// Pointer meant to record the original function
static void (*funcOri)(const char *);

void newfunc(const char *str) {
    NSLog(@"Hooked!");
    funcOri(str);
}

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    // Try to hook the custom C function
    struct rebinding Cfunction;
    Cfunction.name = "func";
    Cfunction.replacement = newfunc;
    Cfunction.replaced = (void *)&funcOri;
    struct rebinding resbs[1] = {Cfunction};
    rebind_symbols(resbs, 1);
}

// Call the custom function on every tap (inferred from the printed output)
- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
    func("click on screen");
}

@end
Run it and the output is:
2019-11-12 14:54:19.001680+0800 fishhookDemo[35238:1563336] click on screen
2019-11-12 14:54:19.706819+0800 fishhookDemo[35238:1563336]
"Hooked!" never appears, so the custom C function was not hooked. Let us use the principle behind fishhook to explain why.
Fishhook principle
First of all:
C | OC |
---|---|
static | dynamic |
Determine the function address at compile time | The runtime determines the function address |
System C functions do have a dynamic part, usually referred to as the symbol table, which relies on a technique called Position Independent Code (PIC). This is exactly what fishhook takes advantage of.
[Figure: fishhook's original explanation]
Overview of the principle
System libraries such as UIKit and Foundation are loaded into memory by dyld, and to save space Apple keeps them in one place: the dyld shared cache (macOS has one too).
The implementation of a function like NSLog is therefore not, and cannot be, inside our own project's Mach-O. So how does a call to NSLog in our project find its real implementation address?
The process is as follows:
- When the project is compiled, space is reserved in the generated Mach-O executable. This space is essentially a symbol table, and it is stored in the __DATA segment, because the __DATA segment is readable and writable at run time.
- Compile time: every reference in the project to a system library function in the shared cache points to a symbol address. (For example, if the project calls NSLog, a symbol for NSLog is created in the Mach-O at compile time, and the call in the project points to that symbol.)
- Run time: when dyld loads the application into memory, it performs binding based on the libraries listed in the load commands. (For NSLog, dyld finds the real address of NSLog in Foundation and writes it into the NSLog entry of the symbol table in __DATA.)
This process is the PIC technique.
Now that we know how system functions are bound, look at the name of the fishhook function:
rebind_symbols: it rebinds symbols, exactly as the name says.
The principle is:
At run time, the symbol that the compiled code uses for a system library function is rebound to the function address specified by the user, and the real address of the original system function is stored in the pointer specified by the user.
So, looking back, why can a custom C function not be hooked?
The answer is simple:
- The actual address of a custom C function lies inside our own Mach-O, so there is no symbol and no binding process.
- That address is fixed at compile time, and there is no way to manipulate it.
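The difference between the two cases can be modelled in plain C. The sketch below is only a simulation under stated assumptions, not fishhook itself: `table`, `rebind`, `lib_double`, and `local_double` are hypothetical names, and an ordinary array of function pointers stands in for the symbol pointer table in __DATA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*fn_t)(int);

/* Stands in for a system function that is reached through a pointer slot. */
static int lib_double(int x) { return 2 * x; }

/* Stands in for a custom function inside our own binary. */
static int local_double(int x) { return 2 * x; }

/* Simulated symbol pointer table: one writable slot per imported symbol. */
struct slot {
    const char *name;
    fn_t target;
};
static struct slot table[] = { { "lib_double", lib_double } };

/* Calls to the imported function always go through its slot. */
static int call_import(int x) { return table[0].target(x); }

/* Calls to the local function go straight to it; no slot is involved. */
static int call_local(int x) { return local_double(x); }

/* Point the slot for `name` at `replacement` and save the old target in *replaced.
   Returns 0 if a slot was found, -1 otherwise. */
static int rebind(const char *name, fn_t replacement, fn_t *replaced) {
    assert(name != NULL && replacement != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) {
            if (replaced) {
                *replaced = table[i].target;
            }
            table[i].target = replacement;
            return 0;
        }
    }
    return -1; /* no slot with that name, so there is nothing to rebind */
}

static fn_t original;
static int hooked_double(int x) { return original(x) + 1; }
```

Rebinding "lib_double" changes what `call_import` reaches, while a rebinding request for "local_double" finds no slot and `call_local` keeps its compile-time target, which mirrors why the custom `func` in the demo was never hooked.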
View the symbol table in Mach-O
You can view it directly with MachOView.
Some readers will ask: that all sounds fine, but how do we verify it?
After all, crash-collection tools also rely heavily on symbol table restoration. So let us do a hands-on exercise that tests the theory and reinforces what we have learned.
Symbol tables: testing the theory in practice
As MachOView shows, there are two kinds of symbol tables:
- Lazy Symbol Pointers
- Non-Lazy Symbol Pointers
As the names say, one is bound lazily and the other is not.
Therefore, when using fishhook, it is best to call the original function first, to avoid problems caused by a symbol that has never been used and is therefore still unbound.
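Lazy binding can be pictured with a small C sketch. It is a simplified model, not dyld's real mechanism: `binder`, `lazy_slot`, `nonlazy_slot`, and `real_impl` are hypothetical names, and the binder stub here plays the role of the code that resolves a symbol on first use.

```c
#include <assert.h>

typedef int (*fn_t)(int);

/* Stands in for the real library function. */
static int real_impl(int x) { return x + 100; }

static int binder(int x); /* forward declaration of the binding stub */

static fn_t lazy_slot = binder;       /* lazy pointer: starts out at the stub */
static fn_t nonlazy_slot = real_impl; /* non-lazy pointer: resolved at load time */
static int bind_count = 0;

/* The first call lands here: resolve the symbol, patch the slot, forward the call. */
static int binder(int x) {
    bind_count++;
    lazy_slot = real_impl;
    return lazy_slot(x);
}

static int call_lazy(int x) {
    assert(lazy_slot != 0);
    return lazy_slot(x);
}

static int call_nonlazy(int x) { return nonlazy_slot(x); }
```

Before the first call the lazy slot does not yet hold the real address, which is the state the first breakpoint in the walkthrough below is meant to capture; after one call the slot holds `real_impl` and the stub is never entered again.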
So, shall we try it?
Without further ado, go back to the demo we just wrote that hooks NSLog, and add NSLog(@"123"); in viewDidLoad.
Let us begin
1 Prepare code and breakpoints
2 View in MachOView
Press CMD + R to run the project until it stops at the breakpoint, then find the Mach-O file and open it in MachOView.
3 Calculate the symbol address
- First, MachOView shows that the offset of this symbol from the start of the Mach-O file is 3028 (everyone's value differs, so use your own). So where is the Mach-O loaded? Go to the LLDB console of the project and enter the command: image list. The first entry is the actual memory address of our project's Mach-O.
- Open Calculator, press CMD + 3, select hexadecimal, paste the real memory address of the Mach-O with CMD + V, and add the symbol offset 3028 from MachOView. Copy the result with CMD + C.
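The same sum can be written as a few lines of C. In this sketch the load address 0x1042C5000 is an assumption, chosen because adding the offset 0x3028 to it gives the address used in the next step; the function name is hypothetical, and on your machine both numbers come from `image list` and MachOView.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a symbol pointer slot = load address of the image + offset from MachOView. */
static uint64_t symbol_slot_address(uint64_t image_base, uint64_t offset) {
    assert(offset <= UINT64_MAX - image_base); /* guard against wrap-around */
    return image_base + offset;
}
```

With the assumed values, `symbol_slot_address(0x1042C5000, 0x3028)` evaluates to 0x1042C8028, the value to pass to the LLDB memory read command.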
4 View memory and assembly code in LLDB
x 0x1042C8028 (your calculation result)
(memory read also works; x is short for memory read)
5 View the content of the first eight bytes
Note: iOS is little-endian, so read the bytes from right to left.
So the actual address in the figure above is 0x01042C69C0.
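Reading the bytes from right to left is little-endian decoding, which can be checked with a short C helper. The function name `read_le64` is made up for this sketch, and the byte sequence in the usage note is simply the article's address 0x01042C69C0 laid out in little-endian order, not a copy of the original memory dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine 8 bytes, lowest-order byte first, into one 64-bit value. */
static uint64_t read_le64(const uint8_t bytes[8]) {
    assert(bytes != NULL);
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}
```

For the bytes C0 69 2C 04 01 00 00 00, as they would appear from left to right in the memory dump, `read_le64` returns 0x1042C69C0.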
dis -s 0x01042C69C0 (your own address)
There is nothing meaningful there, which means that at this breakpoint the symbol has not been bound to anything yet.
Step over this breakpoint and continue to the second one.
(If you are unfamiliar with assembly, simply compare this with the result at the second breakpoint. The author may cover assembly in more detail in the future.)
Look at the symbol again:
x 0x1042C8028 (your calculation result)
You can see that the content has clearly changed.
dis -s 0x01042C69C0 (your own address)
This matches the principle described earlier. Fully verified!
Not so fast: this only verifies the PIC part of iOS. What about fishhook?
- Add a breakpoint in touchesBegan (it does not have to be touchesBegan; I just need a breakpoint in code that runs after fishhook's rebind_symbols).
- Continue past the current breakpoint (rebind_symbols) and click the screen to reach the next breakpoint.
- Look at the memory and assembly again.
The results are as follows:
After fishhook rebinds it, the symbol points to the address of our custom function. The hypothesis is fully verified.
Finally
fishhook is very important in reverse engineering, and many tools have it built in, so I hope you will understand and master how it works.
Thanks for reading, and see you next time.