[Wireless Platform] A Brief Introduction to the Principles of ELF PLT Hook

Preface

Android is a Linux-based operating system, so ELF is the natively supported executable file format on the Android platform. Besides executables, the ELF format is also used for shared libraries (the familiar .so files), object files (.o), core dump files, and so on.

GOT/PLT Hook is one mechanism for hooking functions in ELF files. It is mainly used to replace the external function calls made by a particular .so. Its main advantage is stability, so this approach is commonly used in production environments.

The name comes from two sections of the ELF file structure: the Global Offset Table (GOT) and the Procedure Linkage Table (PLT).

Use cases

In Android development, ELF Hook technology is useful in many scenarios. For example, hooking the file I/O APIs in APM monitoring makes it possible to monitor file reads and writes, and hooking the dex loading function gives a simple unpacking tool for packed (shelled) apps.

A brief look at the ELF file structure and its purpose

Common Tools

  • readelf views the contents of ELF headers and sections
  • objdump disassembles the contents of specified ELF sections
  • gdb is used for code debugging

On macOS, readelf and objdump can be replaced with greadelf and gobjdump.

ELF file format

Each ELF file consists of an ELF header and file data. These data include

  • A program header table describing zero or more memory segments
  • Section header table describing zero or more section blocks
  • Data referenced by entries in the program header table or the section header table

ELF files take part in both the linking and the execution of programs, so the format can be looked at from two perspectives: the linking view and the execution view.

  • When used for compilation and linking, the compiler and linker treat the ELF file as a collection of sections described by the section header table; the program header table is optional
  • When used for loading and execution, the loader treats the ELF file as a collection of segments described by the program header table. A segment may contain multiple sections; the section header table is optional
  • A shared library contains both
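The two views can be seen in the ELF header itself, which records how many program header entries (segments) and section header entries (sections) a file has. The following is a minimal sketch, assuming a 64-bit ELF file and the Elf64_Ehdr layout from <elf.h> on Linux; the helper name read_elf_counts is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: inspect an in-memory copy of a 64-bit ELF header.
 * Returns 0 on success and -1 if the buffer is not a 64-bit ELF header. */
static int read_elf_counts(const unsigned char *buf, size_t len,
                           unsigned *segments, unsigned *sections)
{
    Elf64_Ehdr hdr;

    assert(segments != NULL && sections != NULL);
    if (buf == NULL || len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);            /* avoid unaligned access */

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *segments = hdr.e_phnum;   /* execution view: program header entries */
    *sections = hdr.e_shnum;   /* linking view: section header entries */
    return 0;
}
```

A relocatable object file (.o) normally reports zero segments because it only needs the linking view, while an executable or shared library reports at least one.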

For the detailed ELF file structure, see the wiki and the definition of the ELF structures in LLVM's ELF.h.

Dynamic linking

Static linking wastes space and makes programs hard to update. To solve these problems, program modules are often kept in separate files rather than statically linked together. A dynamic library can be understood as a collection of object files. At runtime, the functions of a dynamic library are linked into the process in memory rather than copied into the executable file. The code of a dynamic library is loaded into physical memory only once; after that, it can be used by any program that needs it.

The specific process of dynamic linking

For details on the dynamic linking process, see Dynamic Linking.

GOT and PLT

.got (Global Offset Table)

The global offset table is located in the data segment and records the addresses of the external symbols referenced by the code. The symbols include variables, functions, and so on.

The global offset table only stores addresses. A call to an external function does not read the table directly; it goes through the function's PLT entry, which knows which slot of the global offset table to use.

.plt (Procedure Linkage Table)

The PLT is located in the code segment. Every external function that a module calls has a corresponding entry in that module's PLT. Each PLT entry is a small piece of executable code, so the PLT is in effect a table of code fragments. Each fragment serves one specific function and jumps to the address stored in that function's slot in the global offset table.

The -fPIC option

The -fPIC option is used at compile time to tell the compiler to generate position-independent code (PIC). For a shared library built without -fPIC, the code that references data objects must be relocated when the .so file is loaded. Relocation modifies the contents of the code segment, so every process that uses the .so file needs its own copy of the code segment in memory, and each copy differs depending on where the .so file's code and data segments are mapped in that process. With -fPIC, the generated code contains no absolute addresses and uses only relative addressing, so the loader can place it anywhere in memory and it still executes correctly. For this reason, dynamic libraries are normally built as position-independent code.

GOT/PLT working example

We first create a shared library and an executable that uses it, and then analyze in detail how the GOT and the PLT work.

  • Create a shared library libtesta.so that contains the function say_hello
  • Create an ELF executable main that links against the testa library and calls say_hello

Create the testa header file testa.h

#include <stdio.h>
void say_hello();

testa.c

#include "testa.h"
#include <stdio.h>

void say_hello() {
	printf("Hello,World! \n");
}

Run the following command to compile testa.c into the libtesta.so dynamic library

gcc ./testa.c -fPIC -shared -o libtesta.so

Create the main.c file

#include "testa.h"
#include <stdio.h>

// main only calls say_hello(), which lives in libtesta.so
int main(int argc, char *argv[]) {
	say_hello();
	return 0;
}

Run the following command to compile and generate the main executable

gcc main.c -L. -ltesta -o main

The default search paths for dynamic libraries on Linux are /lib and /usr/lib, so after the dynamic library is created, the loader must be told where to find it, for example through the LD_LIBRARY_PATH environment variable, before main can run.

Run objdump to view the assembly instructions of the main file

objdump -d -M intel -S main

...
Disassembly of section .plt:

00000000000005d0 <.plt>:
 5d0:	ff 35 ea 09 20 00    	push   QWORD PTR [rip+0x2009ea]        # 200fc0 <_GLOBAL_OFFSET_TABLE_+0x8>
 5d6:	ff 25 ec 09 20 00    	jmp    QWORD PTR [rip+0x2009ec]        # 200fc8 <_GLOBAL_OFFSET_TABLE_+0x10>
 5dc:	0f 1f 40 00          	nop    DWORD PTR [rax+0x0]

00000000000005e0 <say_hello@plt>:
 5e0:	ff 25 ea 09 20 00    	jmp    QWORD PTR [rip+0x2009ea]        # 200fd0 <say_hello>
 5e6:	68 00 00 00 00       	push   0x0
 5eb:	e9 e0 ff ff ff       	jmp    5d0 <.plt>

...
000000000000070a <main>:
 70a:	55                   	push   rbp
 70b:	48 89 e5             	mov    rbp,rsp
 70e:	48 83 ec 10          	sub    rsp,0x10
 712:	89 7d fc             	mov    DWORD PTR [rbp-0x4],edi
 715:	48 89 75 f0          	mov    QWORD PTR [rbp-0x10],rsi
 719:	b8 00 00 00 00       	mov    eax,0x0
 71e:	e8 bd fe ff ff       	call   5e0 <say_hello@plt>
 723:	b8 00 00 00 00       	mov    eax,0x0
 728:	c9                   	leave  
 729:	c3                   	ret    
 72a:	66 0f 1f 44 00 00    	nop    WORD PTR [rax+rax*1+0x0]


Note this line in the output above:

call 5e0 <say_hello@plt>

As you can see, the call to say_hello() in the source program has become the assembly instruction call 5e0. Address 5e0 is a code fragment in the PLT. Let's look at the code at address 5e0:

Disassembly of section .plt:

00000000000005d0 <.plt>:
 5d0:	ff 35 ea 09 20 00    	push   QWORD PTR [rip+0x2009ea]        # 200fc0 <_GLOBAL_OFFSET_TABLE_+0x8>
 5d6:	ff 25 ec 09 20 00    	jmp    QWORD PTR [rip+0x2009ec]        # 200fc8 <_GLOBAL_OFFSET_TABLE_+0x10>
 5dc:	0f 1f 40 00          	nop    DWORD PTR [rax+0x0]

00000000000005e0 <say_hello@plt>:
 5e0:	ff 25 ea 09 20 00    	jmp    QWORD PTR [rip+0x2009ea]        # 200fd0 <say_hello>
 5e6:	68 00 00 00 00       	push   0x0
 5eb:	e9 e0 ff ff ff       	jmp    5d0 <.plt>

The comment that objdump generates for jmp QWORD PTR [rip+0x2009ea] shows that the target address, 200fd0, is a slot in the GOT, so a second jump is performed here. The reason was explained earlier: although the code of a shared library is shared in physical memory, its data is mapped separately into each process by the dynamic linker. Therefore every instruction that refers to an address outside the module looks it up in the global offset table to find where the function is located in the virtual memory of the currently running program.

When an external function is called for the first time, the global offset table does not yet hold the address at which the function was actually loaded. Linux uses lazy binding, so the address is resolved on the first call by the dynamic linker's _dl_runtime_resolve. This works because the initial value in the function's GOT slot points back to the second instruction of the function's own PLT entry:

push   0x0
jmp    5d0 <.plt>

push 0x0 pushes the index of the relocation entry for say_hello, and the jump then goes to PLT[0], the first entry in the PLT. PLT[0] hands control to the dynamic linker's _dl_runtime_resolve through an address stored in the global offset table. After _dl_runtime_resolve has resolved the address of the library function, it writes the actual address back into the global offset table, so from the second call onward the function no longer needs to be resolved by the dynamic linker.
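The lazy binding flow can be modelled in plain C with a function pointer standing in for one GOT slot. This is only a toy model under stated assumptions: got_slot, lazy_stub, call_via_plt, and real_square are hypothetical names, and the real dynamic linker works on relocation entries rather than on C function pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

static int resolve_count;               /* how many times the resolver ran */

/* The "real" library function that the caller ultimately wants. */
static int real_square(int x) { return x * x; }

static int lazy_stub(int x);

/* Toy GOT slot: initially points at the stub, not at the real function. */
static func_t got_slot = lazy_stub;

/* Plays the role of the rest of the PLT entry plus the resolver: find the
 * real address, write it back into the slot, then forward the call. */
static int lazy_stub(int x)
{
    resolve_count++;
    got_slot = real_square;             /* write-back step */
    return got_slot(x);
}

/* Plays the role of the first PLT instruction: an indirect jump
 * through the slot. */
static int call_via_plt(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}
```

The first call to call_via_plt goes through lazy_stub and patches the slot; every later call reaches real_square directly, which is the behaviour the write-back into the GOT is meant to achieve.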

The following diagram, taken from the web, shows the flow of a shared library function call in PLT + GOT mode

PLT Hook implementation mechanism

The following is a brief introduction to the implementation mechanism of PLT Hook (based on the approach of an open source library).

Suppose there are two shared libraries, libfoo.so and libbar.so, and a program that depends on them.

foo_func and bar_func are functions located in libfoo.so and libbar.so, respectively. From the program's point of view, these two functions come from external libraries, so at execution time their runtime addresses have to be obtained through the PLT.

If we need to hook foo_func, the simplest mechanism is to replace the address of foo_func in the PLT with the address of our replacement function (assume it is named hook_foo_func).

If the replacement function is not inside the program (for example, it lives in a newly created .so), simply replace the address of foo_func in the PLT with the address at which hook_foo_func in libbar.so was loaded into memory.

If hook_foo_func is inside the program, it cannot call the original foo_func() by name, because the foo_func entry in the PLT now holds the address of hook_foo_func; calling foo_func from hook_foo_func would recurse forever. The solution is to obtain the address of the original function, store it in a function pointer, and call the original through that pointer inside hook_foo_func. In a Unix environment, dlsym is used to obtain the addresses of dynamic library symbols, including functions.
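The recursion problem and its fix can be illustrated with a toy model in which a function pointer stands in for the table entry that callers jump through. The names foo_slot, orig_foo_func, install_hook, and program_calls_foo are hypothetical, and this sketch takes the original address from the slot itself rather than from dlsym.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*foo_fn)(int);

/* Stand-in for foo_func in libfoo.so. */
static int foo_func(int x) { return x + 1; }

/* Toy slot standing in for the table entry that callers jump through. */
static foo_fn foo_slot = foo_func;

/* Pointer to the original function, saved when the hook is installed. */
static foo_fn orig_foo_func;

/* The hook calls the original through the saved pointer. Calling through
 * foo_slot here would re-enter the hook and recurse forever. */
static int hook_foo_func(int x)
{
    assert(orig_foo_func != NULL);
    return orig_foo_func(x) * 10;
}

/* Hypothetical installer: save the old slot value, then overwrite it. */
static void install_hook(void)
{
    orig_foo_func = foo_slot;
    foo_slot = hook_foo_func;
}

/* What the program does when it "calls foo_func". */
static int program_calls_foo(int x) { return foo_slot(x); }
```

After install_hook runs, every call that goes through the slot reaches hook_foo_func, while the hook still reaches the original implementation through orig_foo_func.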

Some implementation details

1. How do we find the address that corresponds to a function name in a dynamic library?

First, open the dynamic library with dlopen and get its handle

void *hndl = dlopen(filename, RTLD_LAZY | RTLD_NOLOAD);

dlsym is then used to obtain the address of each function symbol in the shared library

char *addr = dlsym(hndl, symbols[i]);
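As a self-contained sketch of these two calls, the example below looks up a function by name and calls it through the returned pointer. It makes some assumptions that differ from the snippet above: it passes NULL to dlopen to get a handle for the main program instead of a named library, and it looks up the C library's atoi purely for illustration. The helper names lookup_atoi and call_atoi_by_name are hypothetical. On older glibc versions the program may need to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef int (*atoi_fn)(const char *);

/* Look up the C library's atoi by name. Returns NULL on failure. */
static atoi_fn lookup_atoi(void)
{
    atoi_fn fn = NULL;
    /* dlopen(NULL, ...) returns a handle for the main program; lookups
     * through it also search the libraries loaded at program startup. */
    void *handle = dlopen(NULL, RTLD_LAZY);

    if (handle == NULL)
        return NULL;
    /* POSIX-recommended way to store dlsym's void * in a function pointer. */
    *(void **)(&fn) = dlsym(handle, "atoi");
    dlclose(handle);
    return fn;
}

/* Call atoi through the looked-up pointer; returns -1 if the lookup failed. */
static int call_atoi_by_name(const char *text)
{
    atoi_fn fn;

    assert(text != NULL);
    fn = lookup_atoi();
    return fn != NULL ? fn(text) : -1;
}
```

A pointer obtained this way is what a hook function would keep in order to call the original implementation.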

The offsets recorded in an ELF file are relative: the actual address at which something is loaded into memory is the base address of the shared library plus the relative address, so you also need the base address of the corresponding shared library. dl_iterate_phdr is used to iterate through the currently loaded dynamic libraries and find the one that contains the address returned by dlsym, which gives the base address. The first argument passed to dl_iterate_phdr is a callback function; its definition can be viewed with man dl_iterate_phdr

struct dl_iterate_data data = {0};
data.addr = address;
// Walk through the loaded shared libraries to find the one that contains address
dl_iterate_phdr(dl_iterate_cb, &data);

The function dl_iterate_cb is implemented as

static int dl_iterate_cb(struct dl_phdr_info *info, size_t size, void *cb_data)
{
    struct dl_iterate_data *data = (struct dl_iterate_data*)cb_data;
    Elf_Half idx = 0;

    for (idx = 0; idx < info->dlpi_phnum; ++idx) {
        const Elf_Phdr *phdr = &info->dlpi_phdr[idx];
        char* base = (char*)info->dlpi_addr + phdr->p_vaddr;
        if (base <= data->addr && data->addr < base + phdr->p_memsz) {
            break;
        }
    }
    if (idx == info->dlpi_phnum) {
        return 0;
    }
    for (idx = 0; idx < info->dlpi_phnum; ++idx) {
        const Elf_Phdr *phdr = &info->dlpi_phdr[idx];
        if (phdr->p_type == PT_DYNAMIC) { // Save dynamic link library information
            data->lmap.l_addr = info->dlpi_addr;
            data->lmap.l_ld = (Elf_Dyn*)(info->dlpi_addr + phdr->p_vaddr);
            return 1;
        }
    }
    return 0;
}
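The callback above depends on types that are not shown here (struct dl_iterate_data and the Elf_* aliases). As a self-contained sketch of the same idea, the following example finds the load base of whichever loaded object contains a given address. It assumes glibc's <link.h> with _GNU_SOURCE defined; struct base_query, find_base_cb, and find_load_base are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical search state: the address to look for and the result. */
struct base_query {
    uintptr_t addr;    /* address to locate */
    uintptr_t base;    /* load base of the object that contains addr */
    int found;
};

/* Called once per loaded object; a non-zero return stops the iteration. */
static int find_base_cb(struct dl_phdr_info *info, size_t size, void *cb_data)
{
    struct base_query *q = cb_data;
    (void)size;

    for (size_t i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        /* Runtime address of a segment = load base + p_vaddr. */
        uintptr_t start = (uintptr_t)info->dlpi_addr + ph->p_vaddr;
        if (q->addr >= start && q->addr < start + ph->p_memsz) {
            q->base = (uintptr_t)info->dlpi_addr;
            q->found = 1;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 and stores the load base if a loaded object contains addr. */
static int find_load_base(const void *addr, uintptr_t *base_out)
{
    struct base_query q = { (uintptr_t)addr, 0, 0 };

    assert(base_out != NULL);
    dl_iterate_phdr(find_base_cb, &q);
    if (q.found)
        *base_out = q.base;
    return q.found;
}
```

Unlike the callback above, this sketch only considers PT_LOAD segments, since those are the ones that are actually mapped into memory.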

2. Replace the function address

The base address of the dynamic library was saved while traversing the loaded libraries. Now obtain the position of the original function's entry in the PLT and replace the address stored there with the target address


static int check_rel(const plthook_t *plthook, const Elf_Plt_Rel *plt, Elf_Xword r_type, const char **name_out, void ***addr_out)
{
    if (ELF_R_TYPE(plt->r_info) == r_type) {
        size_t idx = ELF_R_SYM(plt->r_info);
        idx = plthook->dynsym[idx].st_name;
        if (idx + 1 > plthook->dynstr_size) {
            set_errmsg("too big section header string table index: %" SIZE_T_FMT, idx);
            return PLTHOOK_INVALID_FILE_FORMAT;
        }
        *name_out = plthook->dynstr + idx;
        // return the address of the source function PLT
        *addr_out = (void**)(plthook->plt_addr_base + plt->r_offset);
        return 0;
    }
    return -1;
}

// Iterate over the functions in the dynamic library PLT, returning the function name and address
int plthook_enum(plthook_t *plthook, unsigned int *pos, const char **name_out, void ***addr_out)
{
    while (*pos < plthook->rela_plt_cnt) {
        const Elf_Plt_Rel *plt = plthook->rela_plt + *pos;
        int rv = check_rel(plthook, plt, R_JUMP_SLOT, name_out, addr_out);
        (*pos)++;
        if (rv >= 0) {
            return rv;
        }
    }
#ifdef R_GLOBAL_DATA
    while (*pos < plthook->rela_plt_cnt + plthook->rela_dyn_cnt) {
        const Elf_Plt_Rel *plt = plthook->rela_dyn + (*pos - plthook->rela_plt_cnt);
        int rv = check_rel(plthook, plt, R_GLOBAL_DATA, name_out, addr_out);
        (*pos)++;
        if (rv >= 0) {
            return rv;
        }
    }
#endif
    *name_out = NULL;
    *addr_out = NULL;
    return EOF;
}

int plthook_replace(plthook_t *plthook, const char *funcname, void *funcaddr, void **oldfunc)
{
    size_t funcnamelen = strlen(funcname);
    unsigned int pos = 0;
    const char *name;
    void **addr;
    int rv;

    if (plthook == NULL) {
        set_errmsg("invalid argument: The first argument is null.");
        return PLTHOOK_INVALID_ARGUMENT;
    }
    while ((rv = plthook_enum(plthook, &pos, &name, &addr)) == 0) {
        if (strncmp(name, funcname, funcnamelen) == 0) { // If a function with the same name is found
            if (name[funcnamelen] == '\0' || name[funcnamelen] == '@') {
                int prot = get_memory_permission(addr);
                if (prot == 0) {
                    return PLTHOOK_INTERNAL_ERROR;
                }
                if (!(prot & PROT_WRITE)) {
                    if (mprotect(ALIGN_ADDR(addr), page_size, PROT_READ | PROT_WRITE) != 0) {
                        set_errmsg("Could not change the process memory permission at %p: %s",
                                   ALIGN_ADDR(addr), strerror(errno));
                        return PLTHOOK_INTERNAL_ERROR;
                    }
                }
                if (oldfunc) {
                    *oldfunc = *addr;
                }
                // Replace the address value in the PLT
                *addr = funcaddr;
                if (!(prot & PROT_WRITE)) {
                    mprotect(ALIGN_ADDR(addr), page_size, prot);
                }
                return 0;
            }
        }
    }
    if (rv == EOF) {
        set_errmsg("no such function: %s", funcname);
        rv = PLTHOOK_FUNCTION_NOT_FOUND;
    }
    return rv;
}
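The permission handling in plthook_replace can be isolated into a small sketch: round the slot address down to its page, make the page writable if necessary, overwrite the slot, then restore the original protection. The helpers page_start and patch_slot are hypothetical, and the caller is assumed to already know the slot's current protection, which plthook_replace obtains from get_memory_permission.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page (mprotect requires this). */
static void *page_start(const void *addr, size_t page_size)
{
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* Hypothetical helper: overwrite one pointer-sized slot that may live in
 * read-only memory. `prot` is the slot's current protection.
 * Returns 0 on success and -1 if mprotect fails. */
static int patch_slot(void **slot, void *new_value, int prot, void **old_value)
{
    size_t page_size;
    void *page;

    assert(slot != NULL);
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    page = page_start(slot, page_size);

    if (!(prot & PROT_WRITE) &&
        mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    if (old_value != NULL)
        *old_value = *slot;             /* keep the original for the hook */
    *slot = new_value;
    if (!(prot & PROT_WRITE))
        mprotect(page, page_size, prot);    /* restore the old protection */
    return 0;
}
```

Saving the old slot value before overwriting it is what lets a hook function call the original implementation later.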

Other implementation mechanisms

Besides the PLT-based mechanism, native hook schemes also include Inline Hook and Trap Hook. Trap Hook relies on the system's SIGTRAP signal mechanism, performs poorly, and is rarely used in production. Inline Hook replaces instructions directly and is difficult to implement. For these reasons, GOT/PLT hooking is the approach most often used in production environments.

In the Android environment, the most commonly used open source library is iQIYI's xHook. Facebook's open source library Profilo also contains a PLT Hook implementation, which can be used on its own.

Note also that on Android the PLT is not bound lazily: all relocations are completed when a shared library is first loaded, so the concrete implementation differs from the lazy binding flow described earlier.