ByteDance Mobile Technology, Xie Junyi

Apple’s iOS 14.2 update caused trouble worldwide: libffi crashes on iOS 14.2. Many of our company’s apps were badly affected, because many of their basic libraries use libffi.

The crash is a code-signing error caused by vm_remap. We solved it by using static trampolines, so that libffi no longer needs vm_remap. This article introduces the principle behind the implementation.

What is libffi

Compilers for high-level languages generate code that follows certain conventions. These conventions are necessary, in part, for separate compilation to work. One of them is the “calling convention”: a set of assumptions made by the compiler about where a function’s arguments will be found on entry to the function. The calling convention also specifies where the function’s return value is found.

Some programs do not know at compile time what arguments will be passed to a function. For example, an interpreter may be told only at run time the number and types of the arguments to use when calling a given function. libffi can be used in such programs to provide a bridge from the interpreter to compiled code.

The libffi library provides a portable, high-level programming interface for a variety of calling conventions. This allows the programmer to call any function specified by the call interface description at run time.
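
To make the problem concrete, here is a minimal sketch in plain C, without libffi, of the kind of bridge an interpreter would otherwise have to write by hand. The names `generic_fn`, `call_with_longs`, `sample_add` and `sample_negate` are hypothetical, and the sketch only handles functions that take up to two `long` arguments. Every further signature would need another hand-written case, which is the work a run-time call interface description takes over.

```c
#include <assert.h>
#include <stddef.h>

/* Type-erased function pointer, as an interpreter might store it. */
typedef void (*generic_fn)(void);

/* Hypothetical bridge: the argument count is known only at run time.
   Returns 0 on success, -1 if the signature is not supported. */
static int call_with_longs(generic_fn fn, size_t argc, const long *argv, long *result)
{
    assert(fn != NULL && result != NULL);
    switch (argc) {
    case 0:
        *result = ((long (*)(void))fn)();
        return 0;
    case 1:
        *result = ((long (*)(long))fn)(argv[0]);
        return 0;
    case 2:
        *result = ((long (*)(long, long))fn)(argv[0], argv[1]);
        return 0;
    default:
        return -1; /* no hand-written case for this signature */
    }
}

/* Two compiled functions the "interpreter" wants to call. */
static long sample_add(long a, long b) { return a + b; }
static long sample_negate(long a) { return -a; }
```

Calling `call_with_longs((generic_fn)sample_add, 2, args, &r)` casts the pointer back to its real prototype before the call, so the compiler can apply the calling convention for that one signature.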

The use of ffi

Take any library that uses libffi and look at how it calls the interface:

ffi_type *returnType = st_ffiTypeWithType(self.signature.returnType);
NSAssert(returnType, @"can't find a ffi_type of %@", self.signature.returnType);

NSUInteger argumentCount = self->_argsCount;
_args = malloc(sizeof(ffi_type *) * argumentCount) ;

for (int i = 0; i < argumentCount; i++) {
  ffi_type* current_ffi_type = st_ffiTypeWithType(self.signature.argumentTypes[i]);
  NSAssert(current_ffi_type, @"can't find a ffi_type of %@", self.signature.argumentTypes[i]);
  _args[i] = current_ffi_type;
}

// Allocate a closure; xxx_func_ptr receives the address of the trampoline entry
_closure = ffi_closure_alloc(sizeof(ffi_closure), (void **)&xxx_func_ptr);

// Build the cif: the type information of the arguments and the return value, which libffi combines with the calling convention to process them
if (ffi_prep_cif(&_cif, FFI_DEFAULT_ABI, (unsigned int)argumentCount, returnType, _args) == FFI_OK) {
  // Write the closure into the trampoline's data page
  if (ffi_prep_closure_loc(_closure, &_cif, _st_ffi_function, (__bridge void *)(self), xxx_func_ptr) != FFI_OK) {
    NSAssert(NO, @"generate IMP failed");
  }
} else {
  NSAssert(NO, @"");
}

From this code we can get a rough picture of how libffi operates:

  1. Hand out a function pointer to the outside world (it points to a trampoline entry).
  2. Create a closure and store in it the argument and return-value information for the call.
  3. Write the closure into the trampoline data entry that corresponds to the trampoline entry.

Then, when we call the trampoline entry through that function pointer:

  1. The closure data written to the trampoline’s data entry is looked up.
  2. Using the argument and return-value information in the closure, together with the calling convention, libffi operates on the registers and the stack to pass the arguments of the function call and to obtain the return value.
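
The two steps above can be sketched in plain C. This is a toy model, not libffi’s code: `mini_cif`, `mini_closure`, `mini_invoke2` and `mini_sum_handler` are hypothetical names, the "cif" records only an argument count, and the real register and stack handling is replaced by ordinary C parameters. Only the handler’s shape `(cif, ret, args, user_data)` is taken from the `fun` parameter of `ffi_prep_closure_loc`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature call interface: only the argument count. */
typedef struct { unsigned nargs; } mini_cif;

/* Same shape as the handler passed to ffi_prep_closure_loc. */
typedef void (*mini_handler)(mini_cif *cif, void *ret, void **args, void *user_data);

/* What the trampoline's data entry leads to. */
typedef struct {
    mini_cif    *cif;
    mini_handler fun;
    void        *user_data;
} mini_closure;

/* Plays the role of the common entry code: it receives the closure plus
   the raw arguments, packs the arguments into an array, calls the handler
   and returns whatever the handler wrote into ret. */
static long mini_invoke2(mini_closure *closure, long a, long b)
{
    assert(closure != NULL && closure->cif->nargs == 2);
    long ret = 0;
    void *args[2] = { &a, &b };
    closure->fun(closure->cif, &ret, args, closure->user_data);
    return ret;
}

/* Example handler: sums the arguments and adds a bias kept in user_data. */
static void mini_sum_handler(mini_cif *cif, void *ret, void **args, void *user_data)
{
    long total = *(long *)user_data;
    for (unsigned i = 0; i < cif->nargs; i++)
        total += *(long *)args[i];
    *(long *)ret = total;
}
```

The handler never sees the original C prototype; everything it needs comes from the closure, which is why finding the right closure for a given trampoline entry is the crucial step.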

How does libffi find the closure data in a trampoline’s data entry?

Let’s start with how libffi allocates trampolines:

static ffi_trampoline_table *
ffi_remap_trampoline_table_alloc (void)
{
  ...
  /* Allocate two pages -- a config page and a placeholder page */
  config_page = 0x0;
  kt = vm_allocate (mach_task_self (), &config_page, PAGE_MAX_SIZE * 2,
                    VM_FLAGS_ANYWHERE);
  if (kt != KERN_SUCCESS)
    return NULL;

  /* Remap the trampoline table on top of the placeholder page */
  trampoline_page = config_page + PAGE_MAX_SIZE;
  trampoline_page_template = (vm_address_t)&ffi_closure_remap_trampoline_table_page;
#ifdef __arm__
  /* ffi_closure_remap_trampoline_table_page can be thumb-biased on some ARM archs */
  trampoline_page_template &= ~1UL;
#endif
  kt = vm_remap (mach_task_self (), &trampoline_page, PAGE_MAX_SIZE, 0x0,
                 VM_FLAGS_OVERWRITE, mach_task_self (), trampoline_page_template,
                 FALSE, &cur_prot, &max_prot, VM_INHERIT_SHARE);
  if (kt != KERN_SUCCESS)
    {
      vm_deallocate (mach_task_self (), config_page, PAGE_MAX_SIZE * 2);
      return NULL;
    }

  /* We have valid trampoline and config pages */
  table = calloc (1, sizeof (ffi_trampoline_table));
  table->free_count = FFI_REMAP_TRAMPOLINE_COUNT / 2;
  table->config_page = config_page;
  table->trampoline_page = trampoline_page;
  ...
  return table;
}

First, when libffi creates a trampoline table, it allocates two consecutive pages.

The trampoline page is then remapped onto ffi_closure_remap_trampoline_table_page, which was written in the code beforehand.

The resulting structure (a figure in the original article) is a config page of closure data entries, followed directly by the trampoline page of trampoline entries.

When ffi_prep_closure_loc(_closure, &_cif, _st_ffi_function, (__bridge void *)(self), entry1) writes the closure data, it writes to closure1, the data entry that corresponds to entry1.


ffi_status
ffi_prep_closure_loc (ffi_closure *closure,
                      ffi_cif *cif,
                      void (*fun)(ffi_cif *, void *, void **, void *),
                      void *user_data,
                      void *codeloc)
{
  ...
  if (cif->flags & AARCH64_FLAG_ARG_V)
    start = ffi_closure_SYSV_V; /* libffi's entry code that handles the closure */
  else
    start = ffi_closure_SYSV;

  /* The config entry sits exactly one page below the trampoline entry */
  void **config = (void **)((uint8_t *)codeloc - PAGE_MAX_SIZE);
  config[0] = closure;
  config[1] = start;
  ...
}

How is that correspondence established? closure1 and entry1 have the same offset from the start of their respective pages. That shared offset is what links each trampoline entry to its trampoline closure.
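
The shared-offset rule can be checked with a small model in plain C. This is a sketch under stated assumptions, not libffi’s code: an ordinary static array stands in for the two pages obtained from vm_allocate, the `model_*` names are hypothetical, and the constants 16384 and 16 are assumed stand-ins for PAGE_MAX_SIZE and FFI_TRAMPOLINE_SIZE. Only the expression `codeloc - PAGE_MAX_SIZE` mirrors ffi_prep_closure_loc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE  16384u  /* assumption: stands in for PAGE_MAX_SIZE */
#define MODEL_ENTRY_SIZE 16u     /* assumption: stands in for FFI_TRAMPOLINE_SIZE */

/* Two consecutive "pages": the config page first, the trampoline page
   right after it. An array of pointers keeps every slot pointer-aligned. */
static void *model_region[2 * MODEL_PAGE_SIZE / sizeof(void *)];

static uint8_t *model_config_page(void)
{
    return (uint8_t *)model_region;
}

static uint8_t *model_trampoline_page(void)
{
    return (uint8_t *)model_region + MODEL_PAGE_SIZE;
}

/* Address of the i-th trampoline entry (what a caller gets as codeloc). */
static void *model_entry(size_t i)
{
    assert(i < MODEL_PAGE_SIZE / MODEL_ENTRY_SIZE);
    return model_trampoline_page() + i * MODEL_ENTRY_SIZE;
}

/* Same arithmetic as in ffi_prep_closure_loc: one page below codeloc. */
static void **model_config_for(void *codeloc)
{
    return (void **)((uint8_t *)codeloc - MODEL_PAGE_SIZE);
}

/* Store the closure and its entry code in the matching config slot. */
static void model_prep(void *codeloc, void *closure, void *start)
{
    void **config = model_config_for(codeloc);
    config[0] = closure;
    config[1] = start;
}
```

Because the config slot is derived from the entry address alone, no lookup table is needed: each entry and its slot sit at the same offset inside their own page.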

Now that we know this relationship, let’s look at the code to see how the closure is found at run time.

These four instructions are the implementation of a trampoline entry, that is, the xxx_func_ptr returned by libffi:

adr x16, -PAGE_MAX_SIZE
ldp x17, x16, [x16]
br x16
nop

With .rept we create PAGE_MAX_SIZE / FFI_TRAMPOLINE_SIZE trampolines, which together occupy exactly one page:

# dynamic remap page
.align PAGE_MAX_SHIFT
CNAME(ffi_closure_remap_trampoline_table_page):
.rept PAGE_MAX_SIZE / FFI_TRAMPOLINE_SIZE
  // This is our trampoline entry, i.e. the function pointer handed out by libffi
  adr x16, -PAGE_MAX_SIZE                         // Subtract PAGE_MAX_SIZE from the PC to get the trampoline data entry
  ldp x17, x16, [x16]                             // Load the closure into x17 and start into x16
  br x16                                          // Jump to start
  nop        /* each entry in the trampoline config page is 2*sizeof(void*) so the trampoline itself cannot be smaller than 16 bytes */
.endr

The corresponding trampoline data entry is found by subtracting PAGE_MAX_SIZE from the PC address.
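
As a rough illustration, the four instructions can be walked through in plain C, with variables standing in for the registers. This is a model under assumptions, not runnable trampoline code: the `sim_*` names are hypothetical, 16384 and 16 are assumed values for PAGE_MAX_SIZE and FFI_TRAMPOLINE_SIZE, and `start` is simplified to a function that receives the closure as a parameter instead of reading it from x17.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SIZE       16384u  /* assumption: PAGE_MAX_SIZE */
#define SIM_TRAMPOLINE_SIZE 16u     /* assumption: 4 instructions * 4 bytes */
#define SIM_TRAMPOLINES_PER_PAGE (SIM_PAGE_SIZE / SIM_TRAMPOLINE_SIZE)

typedef long (*sim_start_fn)(void *closure);

/* One config entry: the pair that `ldp x17, x16, [x16]` loads. */
typedef struct { void *closure; sim_start_fn start; } sim_config_entry;

/* Config page immediately followed by the trampoline page. */
typedef struct {
    sim_config_entry config[SIM_TRAMPOLINES_PER_PAGE];
    uint8_t          code[SIM_PAGE_SIZE];
} sim_table;

static sim_table sim_pages;

static void sim_install(size_t index, void *closure, sim_start_fn start)
{
    assert(index < SIM_TRAMPOLINES_PER_PAGE);
    sim_pages.config[index].closure = closure;
    sim_pages.config[index].start = start;
}

/* Walk through the four instructions of the trampoline at `index`. */
static long sim_call_trampoline(size_t index)
{
    /* The model only holds where two pointers fill one 16-byte slot. */
    assert(sizeof(sim_config_entry) == SIM_TRAMPOLINE_SIZE);
    assert(offsetof(sim_table, code) == SIM_PAGE_SIZE);
    assert(index < SIM_TRAMPOLINES_PER_PAGE);

    uintptr_t pc  = (uintptr_t)sim_pages.code + index * SIM_TRAMPOLINE_SIZE;
    uintptr_t x16 = pc - SIM_PAGE_SIZE;               /* adr x16, -PAGE_MAX_SIZE */
    sim_config_entry *entry = (sim_config_entry *)x16;
    void *x17 = entry->closure;                       /* ldp x17, x16, [x16] */
    sim_start_fn target = entry->start;
    return target(x17);                               /* br x16 */
}

/* Example entry code: treats the closure as a pointer to a long. */
static long sim_read_long(void *closure) { return *(long *)closure; }
```

With these assumed sizes one page holds 1024 trampolines, and the config page holds exactly 1024 two-pointer entries, which is why a trampoline cannot be smaller than 16 bytes.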

Implementation of the static trampoline

With static trampolines, the code segment and the data segment are in different memory regions, so we cannot place the two pages next to each other the way the vm_remap approach does. The trampoline data entry can therefore no longer be found by simply subtracting PAGE_MAX_SIZE; slightly more complicated processing is needed.

adrp is used to locate the pages of _ffi_static_trampoline_data_page1 and _ffi_static_trampoline_page1. Subtracting the start address of _ffi_static_trampoline_page1 from the PC gives the offset, which is then used to find the trampoline data entry.

# statically allocated page
#ifdef __MACH__
#include <mach/machine/vm_param.h>

.align 14
.data
.global _ffi_static_trampoline_data_page1
_ffi_static_trampoline_data_page1:
    .space PAGE_MAX_SIZE*5
.align PAGE_MAX_SHIFT
.text
CNAME(_ffi_static_trampoline_page1):

_ffi_local_forwarding_bridge:
adrp x17, ffi_closure_static_trampoline_table_page_start@PAGE; // text page
sub  x16, x16, x17; // offset
adrp x17, _ffi_static_trampoline_data_page1@PAGE; // data page
add x16, x16, x17; // data address
ldp x17, x16, [x16]; // x17 closure x16 start
br x16
nop
nop
.align PAGE_MAX_SHIFT
CNAME(ffi_closure_static_trampoline_table_page):

// This label is used with adrp @PAGE to compute the offset of a trampoline from the start of the trampoline page
// Reserve 5 entries for debugging.
// The static trampoline table mirrors the remap one
ffi_closure_static_trampoline_table_page_start:
adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop

adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop

adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop

adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop

adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop

// 5 * 4
.rept (PAGE_MAX_SIZE*5-5*4) / FFI_TRAMPOLINE_SIZE
adr x16, #0
b _ffi_local_forwarding_bridge
nop
nop
.endr

.globl CNAME(ffi_closure_static_trampoline_table_page)
FFI_HIDDEN(CNAME(ffi_closure_static_trampoline_table_page))
#ifdef __ELF__
        .type        CNAME(ffi_closure_static_trampoline_table_page), #function
        .size        CNAME(ffi_closure_static_trampoline_table_page), . - CNAME(ffi_closure_static_trampoline_table_page)
#endif
#endif
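
The address arithmetic of the forwarding bridge can be modeled in plain C. This is only a sketch under assumptions: two unrelated static arrays stand in for the table in .text and the page in .data, the `st_*` names are hypothetical, the table is shrunk to 8 entries, and 16 is an assumed value for FFI_TRAMPOLINE_SIZE. Each statement in `st_bridge_lookup` corresponds to one instruction of `_ffi_local_forwarding_bridge`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ST_ENTRY_SIZE  16u  /* assumption: FFI_TRAMPOLINE_SIZE */
#define ST_ENTRY_COUNT 8u   /* tiny table, for illustration only */

typedef long (*st_start_fn)(void *closure);
typedef struct { void *closure; st_start_fn start; } st_data_entry;

/* Two unrelated regions: they are not adjacent, unlike the remap layout. */
static uint8_t       st_text_table[ST_ENTRY_SIZE * ST_ENTRY_COUNT];
static st_data_entry st_data_table[ST_ENTRY_COUNT];

/* Address a trampoline puts in x16 with `adr x16, #0`. */
static uintptr_t st_trampoline_pc(size_t index)
{
    assert(index < ST_ENTRY_COUNT);
    return (uintptr_t)st_text_table + index * ST_ENTRY_SIZE;
}

/* The bridge's address arithmetic, one statement per instruction. */
static st_data_entry *st_bridge_lookup(uintptr_t x16)
{
    /* The model only holds where two pointers fill one 16-byte slot. */
    assert(sizeof(st_data_entry) == ST_ENTRY_SIZE);
    uintptr_t x17 = (uintptr_t)st_text_table;  /* adrp x17, <text table start>@PAGE */
    x16 = x16 - x17;                           /* sub: offset into the table */
    x17 = (uintptr_t)st_data_table;            /* adrp x17, <data page>@PAGE */
    x16 = x16 + x17;                           /* add: address of the data entry */
    return (st_data_entry *)x16;
}

/* Full path: trampoline, bridge, then jump to start with the closure. */
static long st_call(size_t index)
{
    st_data_entry *entry = st_bridge_lookup(st_trampoline_pc(index));
    return entry->start(entry->closure);       /* ldp x17, x16, [x16]; br x16 */
}

/* Example entry code: doubles the long that the closure points to. */
static long st_double(void *closure) { return 2 * *(long *)closure; }
```

The offset is still what ties an entry to its data, but it is now applied to a separately placed data region instead of to the page directly below the code, so no page has to be remapped at run time.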

About the ByteDance Mobile Platform team

Client Infrastructure, ByteDance’s mobile platform team, is an industry leader in large-scale front-end infrastructure technology. It is responsible for building the front-end infrastructure for all of ByteDance and for improving the performance, stability and engineering efficiency of the company’s entire product line. The products it supports include, but are not limited to, TikTok, Toutiao, Xigua Video and Huoshan. The team does in-depth work on mobile, Web, desktop and other platforms.

We are hiring now! Client, front-end, server and test development positions are open to both experienced and campus candidates, based in Beijing, Shanghai, Guangzhou, Shenzhen and Hangzhou. Let’s change the world with technology. If you are interested, please contact chenxuwei.cxw@bytedance.com with the subject line: Resume - Name - Job objective - Phone number.