1. Overview

Obtaining the addresses of shared-library functions and global variables through dlopen and dlsym is a common programming technique, especially in hook frameworks. However, both dlsym and some common frameworks (such as Nougat_dlfunctions) search only the **.dynsym** section, not the **.symtab** section. Symbols that appear only in .symtab therefore cannot be resolved, and adding a .symtab search is the problem this article sets out to solve.

This article describes how to add a .symtab search on top of the Nougat_dlfunctions framework. If you are not familiar with FastHook, see "FastHook: An efficient, stable, simple and easy to use Android Hook framework".

2. Enhanced DLFunctions implementation

An ELF file is essentially a set of tables organized into sections. Each section stores a different kind of information, and sections refer to each other by index. Starting from a single header, the ELF file header, every section can be reached by following these references. The following pieces of information are needed:

  1. ELF header: records where the section headers are. A section header is the metadata that describes a section; from it you can obtain the section's type, offset, and size.
  2. .shstrtab section: stores the names of all sections. The ELF header lets you iterate over the sections, but the section type alone does not identify a specific section (different sections can share the same type), so section names are used to tell them apart.
  3. .dynsym section: the dynamic symbol table. It stores the imported and exported symbols needed for dynamic linking and excludes symbols internal to the module. Nougat_dlfunctions queries only this section and therefore misses many symbols.
  4. .dynstr section: stores the names of the symbols in .dynsym.
  5. .symtab section: the full symbol table, with information about the functions and global variables defined and referenced in the program.
  6. .strtab section: stores the names of the symbols in .symtab.
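  7. .comment section: information about the program; used to locate symbol addresses. In the code below, the value actually used for this is bias, taken from the first SHT_PROGBITS section seen after .dynsym and .dynstr, without checking its name.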

2.1 enhanced_dlopen

void *enhanced_dlopen(const char *libpath, int flags) {
    FILE *maps;
    char buff[256];
    struct ctx *ctx = 0;
    off_t load_addr, size;
    int k, fd = -1, found = 0;
    void *shoff;
    char *shstr = 0;
    Elf_Ehdr *elf = (Elf_Ehdr *) MAP_FAILED;

#define fatal(fmt, args...) do { log_err(fmt,##args); goto err_exit; } while(0)

    /* Find the executable mapping of the library in our own address space */
    maps = fopen("/proc/self/maps", "r");
    if (!maps) fatal("failed to open maps");

    while (!found && fgets(buff, sizeof(buff), maps))
        if (strstr(buff, "r-xp") && strstr(buff, libpath)) found = 1;

    fclose(maps);

    if (!found) fatal("%s not found in my userspace", libpath);

    if (sscanf(buff, "%lx", &load_addr) != 1) fatal("failed to read load address for %s", libpath);

    log_info("%s loaded in Android at 0x%08lx", libpath, load_addr);

    /* Now, mmap the same library once again */

    fd = open(libpath, O_RDONLY);
    if (fd < 0) fatal("failed to open %s", libpath);

    size = lseek(fd, 0, SEEK_END);
    if (size <= 0) fatal("lseek() failed for %s", libpath);

    elf = (Elf_Ehdr *) mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    fd = -1;

    if (elf == MAP_FAILED) fatal("mmap() failed for %s", libpath);

    ctx = (struct ctx *) calloc(1, sizeof(struct ctx));
    if (!ctx) fatal("no memory for %s", libpath);

    ctx->load_addr = (void *) load_addr;
    shoff = ((void *) elf) + elf->e_shoff;

    /* Copy .shstrtab so that sections can be identified by name */
    Elf_Shdr *shstrtab = (Elf_Shdr *)(shoff + elf->e_shstrndx * elf->e_shentsize);
    shstr = malloc(shstrtab->sh_size);
    if (!shstr) fatal("%s: no memory for .shstrtab", libpath);
    memcpy(shstr, ((void *) elf) + shstrtab->sh_offset, shstrtab->sh_size);

    for (k = 0; k < elf->e_shnum; k++, shoff += elf->e_shentsize) {
        Elf_Shdr *sh = (Elf_Shdr *) shoff;
        log_dbg("%s: k=%d shdr=%p type=%d", __func__, k, sh, sh->sh_type);
        switch (sh->sh_type) {
            case SHT_DYNSYM:
                if (ctx->dynsym) fatal("%s: duplicate DYNSYM sections", libpath); /* .dynsym */
                ctx->dynsym = malloc(sh->sh_size);
                if (!ctx->dynsym) fatal("%s: no memory for .dynsym", libpath);
                memcpy(ctx->dynsym, ((void *) elf) + sh->sh_offset, sh->sh_size);
                ctx->dynsym_num = (sh->sh_size / sizeof(Elf_Sym));
                break;
            case SHT_SYMTAB:
                if (ctx->symtab) fatal("%s: duplicate SYMTAB sections", libpath); /* .symtab */
                ctx->symtab = malloc(sh->sh_size);
                if (!ctx->symtab) fatal("%s: no memory for .symtab", libpath);
                memcpy(ctx->symtab, ((void *) elf) + sh->sh_offset, sh->sh_size);
                ctx->symtab_num = (sh->sh_size / sizeof(Elf_Sym));
                break;
            case SHT_STRTAB:
                if (!strcmp(shstr + sh->sh_name, ".dynstr")) {
                    if (ctx->dynstr) break;    /* .dynstr is guaranteed to be the first STRTAB */
                    ctx->dynstr = malloc(sh->sh_size);
                    if (!ctx->dynstr) fatal("%s: no memory for .dynstr", libpath);
                    memcpy(ctx->dynstr, ((void *) elf) + sh->sh_offset, sh->sh_size);
                } else if (!strcmp(shstr + sh->sh_name, ".strtab")) {
                    if (ctx->strtab) break;
                    ctx->strtab = malloc(sh->sh_size);
                    if (!ctx->strtab) fatal("%s: no memory for .strtab", libpath);
                    memcpy(ctx->strtab, ((void *) elf) + sh->sh_offset, sh->sh_size);
                }
                break;
            case SHT_PROGBITS:
                if (!ctx->dynstr || !ctx->dynsym || ctx->bias) break;
                /* won't even bother checking against the section name */
                ctx->bias = (off_t) sh->sh_addr - (off_t) sh->sh_offset;
                break;
        }
    }

    munmap(elf, size);
    elf = (Elf_Ehdr *) MAP_FAILED;  /* already unmapped; keeps err_exit from unmapping twice */

    if (!ctx->dynstr || !ctx->dynsym) fatal("dynamic sections not found in %s", libpath);

#undef fatal

    log_dbg("%s: ok, dynsym = %p, dynstr = %p symtab = %p strtab = %p", libpath,
            ctx->dynsym, ctx->dynstr, ctx->symtab, ctx->strtab);

    free(shstr);
    return ctx;

err_exit:
    /* Cleanup path required by the fatal() macro; missing from the listing as published */
    if (fd >= 0) close(fd);
    if (elf != MAP_FAILED) munmap(elf, size);
    free(shstr);
    if (ctx) {
        free(ctx->dynsym);
        free(ctx->dynstr);
        free(ctx->symtab);
        free(ctx->strtab);
        free(ctx);
    }
    return 0;
}
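
The listing relies on a struct ctx that the article does not show. The following is a minimal sketch of what it might look like, with field names inferred from how enhanced_dlopen and enhanced_dlsym use them. The exact layout is an assumption, and enhanced_dlclose is a hypothetical helper name, not something taken from the original project.

```c
#include <stdlib.h>
#include <sys/types.h>

/* Assumed layout: the field names follow their use in enhanced_dlopen;
 * the real definition is not shown in the article. */
struct ctx {
    void *load_addr;   /* start of the library's r-xp mapping */
    void *dynsym;      /* heap copy of .dynsym */
    void *dynstr;      /* heap copy of .dynstr */
    void *symtab;      /* heap copy of .symtab, NULL if the library has none */
    void *strtab;      /* heap copy of .strtab, NULL if the library has none */
    int dynsym_num;    /* number of entries in dynsym */
    int symtab_num;    /* number of entries in symtab */
    off_t bias;        /* sh_addr - sh_offset of the first PROGBITS section */
};

/* Hypothetical counterpart to enhanced_dlopen: releases every copied table.
 * Accepts NULL and partially filled contexts, since free(NULL) is a no-op. */
int enhanced_dlclose(void *handle) {
    struct ctx *ctx = (struct ctx *) handle;
    if (!ctx) return 0;
    free(ctx->dynsym);
    free(ctx->dynstr);
    free(ctx->symtab);
    free(ctx->strtab);
    free(ctx);
    return 0;
}
```

Because the context owns private copies of the tables, the file mapping can be released as soon as enhanced_dlopen finishes, at the cost of extra heap memory per opened library.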

1. Obtain the load address of the .so and store it in load_addr.

    maps = fopen("/proc/self/maps", "r");

    while (!found && fgets(buff, sizeof(buff), maps))
        if (strstr(buff, "r-xp") && strstr(buff, libpath)) found = 1;

    fclose(maps);

    if (sscanf(buff, "%lx", &load_addr) != 1) fatal("failed to read load address for %s", libpath);
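
To make the matching rule concrete, here is a small sketch that applies the same test to a single line of /proc/self/maps. The function name match_maps_line is hypothetical; the logic mirrors the two strstr checks and the sscanf call above.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 and stores the mapping's start address in *start when `line`
 * describes an executable (r-xp) mapping whose path contains `libname`;
 * returns 0 otherwise. Hypothetical helper, same rule as the code above. */
int match_maps_line(const char *line, const char *libname, unsigned long *start) {
    if (!strstr(line, "r-xp") || !strstr(line, libname)) return 0;
    /* A maps line begins with "start-end" in hexadecimal; read only the start. */
    return sscanf(line, "%lx", start) == 1;
}
```

Like the original, this treats the start of the r-xp mapping as the library's load base. If the executable mapping has a non-zero file offset (the third column of the maps line), that offset would have to be taken into account as well.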

2. Map the .so file again to read its ELF file header.

    fd = open(libpath, O_RDONLY);

    size = lseek(fd, 0, SEEK_END);

    elf = (Elf_Ehdr *) mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
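
The original maps the file and trusts that it is a valid ELF image. As a hedged sketch, the mapping step could be wrapped with a check of the ELF magic bytes before any header field is read. The names map_readonly and has_elf_magic are hypothetical, and fstat is used here in place of lseek to obtain the size.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps `path` read-only. Returns NULL on failure; on success stores the
 * file size in *size so the caller can munmap the region later. */
void *map_readonly(const char *path, size_t *size) {
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) { close(fd); return NULL; }
    p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;
    *size = (size_t) st.st_size;
    return p;
}

/* Checks the four-byte ELF magic ("\177ELF") at the start of an image. */
int has_elf_magic(const void *image, size_t size) {
    return size >= (size_t) SELFMAG && memcmp(image, ELFMAG, SELFMAG) == 0;
}
```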

3. Obtain the .shstrtab section.

    Elf_Shdr *shstrtab = (Elf_Shdr *)(shoff + elf->e_shstrndx * elf->e_shentsize);

    char * shstr = malloc(shstrtab->sh_size);

    memcpy(shstr, ((void *) elf) + shstrtab->sh_offset, shstrtab->sh_size);
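
The reason .shstrtab is needed becomes clearer with a small example: each section header stores its name as an offset (sh_name) into .shstrtab, so two sections of the same type can only be told apart by looking the name up. The sketch below assumes 64-bit ELF structures for simplicity, and find_section is a hypothetical helper.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the first section header whose name equals `name`, or NULL.
 * `shstr` points at the contents of .shstrtab; sh_name is an offset into it. */
const Elf64_Shdr *find_section(const Elf64_Shdr *shdrs, size_t count,
                               const char *shstr, const char *name) {
    size_t i;
    for (i = 0; i < count; i++)
        if (strcmp(shstr + shdrs[i].sh_name, name) == 0) return &shdrs[i];
    return NULL;
}
```

With a string table of "\0.dynstr\0.strtab", a header with sh_name 1 is .dynstr and a header with sh_name 9 is .strtab, even though both have type SHT_STRTAB.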

4. Obtain the .dynsym, .dynstr, .symtab, and .strtab sections, and compute bias from the first PROGBITS section that follows .dynsym and .dynstr.

    for (k = 0; k < elf->e_shnum; k++, shoff += elf->e_shentsize) {
        Elf_Shdr *sh = (Elf_Shdr *) shoff;
        switch (sh->sh_type) {
            case SHT_DYNSYM:
                ctx->dynsym = malloc(sh->sh_size);
                memcpy(ctx->dynsym, ((void *) elf) + sh->sh_offset, sh->sh_size);
                ctx->dynsym_num = (sh->sh_size / sizeof(Elf_Sym));
                break;
            case SHT_SYMTAB:
                ctx->symtab = malloc(sh->sh_size);
                memcpy(ctx->symtab, ((void *) elf) + sh->sh_offset, sh->sh_size);
                ctx->symtab_num = (sh->sh_size / sizeof(Elf_Sym));
                break;
            case SHT_STRTAB:
                if (!strcmp(shstr + sh->sh_name, ".dynstr")) {
                    ctx->dynstr = malloc(sh->sh_size);
                    memcpy(ctx->dynstr, ((void *) elf) + sh->sh_offset, sh->sh_size);
                } else if (!strcmp(shstr + sh->sh_name, ".strtab")) {
                    ctx->strtab = malloc(sh->sh_size);
                    memcpy(ctx->strtab, ((void *) elf) + sh->sh_offset, sh->sh_size);
                }
                break;
            case SHT_PROGBITS:
                if (!ctx->dynstr || !ctx->dynsym || ctx->bias) break;
                ctx->bias = (off_t) sh->sh_addr - (off_t) sh->sh_offset;
                break;
        }
    }
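
Every case in the loop repeats the same malloc and memcpy pair and trusts sh_offset and sh_size. One possible way to factor this out, shown as a sketch under the assumption of 64-bit ELF structures, is a helper that also rejects sections that fall outside the mapped file. The name copy_section is hypothetical.

```c
#include <elf.h>
#include <stdlib.h>
#include <string.h>

/* Copies the bytes of section `sh` out of a mapped file image.
 * Returns NULL if the section is empty, lies outside the image,
 * or the allocation fails. The caller owns the returned buffer. */
void *copy_section(const void *image, size_t image_size, const Elf64_Shdr *sh) {
    void *copy;
    if (sh->sh_size == 0) return NULL;
    /* Written so that offset + size cannot overflow */
    if (sh->sh_offset > image_size || sh->sh_size > image_size - sh->sh_offset) return NULL;
    copy = malloc((size_t) sh->sh_size);
    if (copy) memcpy(copy, (const char *) image + sh->sh_offset, (size_t) sh->sh_size);
    return copy;
}
```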

2.2 enhanced_dlsym

void *enhanced_dlsym(void *handle, const char *name) {
    int k;
    struct ctx *ctx = (struct ctx *) handle;
    Elf_Sym *dynsym = (Elf_Sym *) ctx->dynsym;
    Elf_Sym *symtab = (Elf_Sym *) ctx->symtab;
    char *dynstr = (char *) ctx->dynstr;
    char *strtab = (char *) ctx->strtab;

    /* First pass: the dynamic symbol table */
    for (k = 0; k < ctx->dynsym_num; k++, dynsym++) {
        if (strcmp(dynstr + dynsym->st_name, name) == 0) {
            void *ret = ctx->load_addr + dynsym->st_value - ctx->bias;
            log_info("%s found at %p", name, ret);
            return ret;
        }
    }

    /* Second pass: the full symbol table, if the library has one */
    if (symtab && strtab) {
        for (k = 0; k < ctx->symtab_num; k++, symtab++) {
            if (strcmp(strtab + symtab->st_name, name) == 0) {
                void *ret = ctx->load_addr + symtab->st_value - ctx->bias;
                log_info("%s found at %p", name, ret);
                return ret;
            }
        }
    }
    return 0;
}

1. Query the .dynsym section. Iterate over all symbols in .dynsym, looking for one whose name matches the given string. The actual address is computed from load_addr, bias, and the symbol's offset (st_value).

    for (k = 0; k < ctx->dynsym_num; k++, dynsym++) {
        if (strcmp(dynstr + dynsym->st_name, name) == 0) {
            void *ret = ctx->load_addr + dynsym->st_value - ctx->bias;
            log_info("%s found at %p", name, ret);
            return ret;
        }
    }
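
The address arithmetic can be checked with concrete numbers. The sketch below restates the formula with plain integers; the helper names section_bias and symbol_address are hypothetical, and the values in the note are made up for illustration.

```c
#include <stdint.h>

/* bias as computed in enhanced_dlopen: the difference between a section's
 * virtual address and its offset in the file. */
uintptr_t section_bias(uintptr_t sh_addr, uintptr_t sh_offset) {
    return sh_addr - sh_offset;
}

/* Runtime address of a symbol: load address plus the symbol's st_value,
 * corrected by bias. */
uintptr_t symbol_address(uintptr_t load_addr, uintptr_t st_value, uintptr_t bias) {
    return load_addr + st_value - bias;
}
```

For example, with a section at virtual address 0x2000 and file offset 0x1000, bias is 0x1000. A symbol with st_value 0x3450 in a library loaded at 0x70a1b000 then resolves to 0x70a1b000 + 0x3450 - 0x1000 = 0x70a1d450.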

2. If the target symbol is not found in .dynsym and a .symtab section exists, iterate over all symbols in .symtab, looking for one whose name matches the given string. The address is again computed from load_addr, bias, and the symbol's st_value.

    for (k = 0; k < ctx->symtab_num; k++, symtab++) {
        if (strcmp(strtab + symtab->st_name, name) == 0) {
            void *ret = ctx->load_addr + symtab->st_value - ctx->bias;
            log_info("%s found at %p", name, ret);
            return ret;
        }
    }
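
The two passes share the same search, so they can be expressed as one helper called twice. The sketch below assumes 64-bit ELF structures and uses hypothetical function names. It also skips entries whose st_shndx is SHN_UNDEF; that check is an addition not present in the original, included because undefined (imported) symbols carry no usable address in this library.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Linear search of one symbol table; `strs` is its companion string table.
 * Skipping SHN_UNDEF entries is an addition to the original logic. */
const Elf64_Sym *find_symbol(const Elf64_Sym *syms, size_t count,
                             const char *strs, const char *name) {
    size_t i;
    if (!syms || !strs) return NULL;
    for (i = 0; i < count; i++) {
        if (syms[i].st_shndx == SHN_UNDEF) continue;  /* imported: defined elsewhere */
        if (strcmp(strs + syms[i].st_name, name) == 0) return &syms[i];
    }
    return NULL;
}

/* Search .dynsym first, then fall back to .symtab when it is available. */
const Elf64_Sym *find_symbol_two_stage(const Elf64_Sym *dynsym, size_t dynsym_num,
                                       const char *dynstr,
                                       const Elf64_Sym *symtab, size_t symtab_num,
                                       const char *strtab, const char *name) {
    const Elf64_Sym *s = find_symbol(dynsym, dynsym_num, dynstr, name);
    return s ? s : find_symbol(symtab, symtab_num, strtab, name);
}
```

Searching .dynsym first keeps the behavior of the original framework for exported symbols, and the .symtab pass only runs for names that would otherwise fail.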

3. Closing notes

Nougat_dlfunctions supports only the ARM platform, and I do not work on x86, so Enhanced DLFunctions supports only ARM32 and ARM64 for now. Porting it to x86 and other platforms should not be difficult; the main difference lies in some of the structures in elf.h.
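
As an illustration of what the elf.h differences involve, the sketch below selects the 32-bit or 64-bit ELF structures from the pointer width of the build and checks at run time that a file's class matches. The Elf_* aliases follow the names used in the listings above, but how the project actually defines them is not shown in this article, so treat this as one possible arrangement. NATIVE_ELFCLASS and elf_class_is_native are hypothetical names.

```c
#include <elf.h>
#include <stdint.h>

/* Pick the ELF structure family matching the pointer width of this build,
 * so the same source compiles for 32-bit and 64-bit targets. */
#if UINTPTR_MAX > 0xffffffffUL
typedef Elf64_Ehdr Elf_Ehdr;
typedef Elf64_Shdr Elf_Shdr;
typedef Elf64_Sym  Elf_Sym;
typedef Elf64_Addr Elf_Addr;
#define NATIVE_ELFCLASS ELFCLASS64
#else
typedef Elf32_Ehdr Elf_Ehdr;
typedef Elf32_Shdr Elf_Shdr;
typedef Elf32_Sym  Elf_Sym;
typedef Elf32_Addr Elf_Addr;
#define NATIVE_ELFCLASS ELFCLASS32
#endif

/* e_ident[EI_CLASS] says whether the file uses 32-bit or 64-bit structures;
 * parsing a file of the other class with Elf_* types would misread it. */
int elf_class_is_native(const unsigned char *e_ident) {
    return e_ident[EI_CLASS] == NATIVE_ELFCLASS;
}
```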

4. References

  1. FastHook: An efficient, stable, simple and easy to use Android Hook framework
  2. FastHook: Smart use of dynamic proxies for non-invasive AOP
  3. FastHook: Superior stability over YAHFA
  4. How to use FastHook to hook WeChat without root