Understanding the iOS Thread Call Stack

Experienced developers dig into assembly, registers, and the kernel to track down bugs. Are you still just reading logs? This article walks through assembly, the function call procedure, and register usage, then goes down to the kernel interfaces to explain the most important part of the KSCrash crash collection framework: the thread call stack. The principle is actually quite simple, so let's go.

Introduction

Usage scenarios

We often inspect the call stack when debugging with breakpoints, as shown in the figure below:

You can also track down bugs from crash log reports, or use a crash collection tool to gather the call stacks of every thread, as shown below:

Another important use is lag detection in application performance optimization. It needs the main thread's call stack to find which function calls are taking time, so that they can be optimized.

Open source frameworks

Open source frameworks that collect stack information include PLCrashReporter and KSCrash.
SDKs such as Umeng and Bugly not only capture crashes and collect stack information, but also integrate analysis, statistics, and other services, which makes them quite complete.
Lag detection also needs thread stack information. One example is Matrix, WeChat's open source performance monitoring tool, which covers crashes, lag, and out-of-memory events. It currently contains two plug-ins, WCCrashBlockMonitorPlugin and WCMemoryStatPlugin. Its lag detection is built on the KSCrash framework and decides whether the application is stuck by checking the running state of the RunLoop. It can also extract time-consuming stacks, returning the main thread stack that has recently taken the most time.

The most basic job of these frameworks is to get a thread's call stack and then resolve the symbol information for each entry, such as the function name, the function address, and the library (image) the function belongs to.

What is a call stack

Concept

The call stack, also known as the execution stack, control stack, runtime stack, or machine stack, is a data structure that stores information about the active subroutines of a running program. It mainly holds return addresses, local variables, parameters, and saved execution context. Its main purpose is to track the point to which each active subroutine should return control when it finishes executing.

Take a look at the following figure:

The stack is divided into stack frames, and each stack frame corresponds to one function call. For example, the blue part is the stack frame of the DrawSquare function, which calls DrawLine while it runs; the stack frame of DrawLine is the green part. A stack frame mainly contains three parts: the function parameters, the return address, and the local variables. When DrawLine is called, the parameters are pushed onto the stack first, then the return address (the address at which the caller continues once the function finishes), and finally the function's local variables.

On most platforms, including iOS and macOS, the stack grows downward, from high addresses to low addresses. The stack pointer points to the top of the stack, and the frame pointer points to the slot in the current frame where the previous frame's frame pointer was saved. Following this chain of frame pointers lets you walk back through the entire call stack.

x86_64

x86_64 is used as the example here. The ARM registers have different names, but the mechanism is similar.

The processor registers of the x86-64 architecture are shown in the following figure:

As the figure shows, the frame pointer points to the saved frame pointer of the previous stack frame, and the return address is stored right next to it. The return address is the address of the caller's next instruction after the call.

Let's try it out

#include <stdio.h>

int add(int a, int b){
    return a + b;
}

int main(void){
    int a = 10;
    int b = 20;
    int c = add(a, b);
    printf("add ret:%d \n", c);

    return 0;
}

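Compile it to assembly with clang's -S option. The command below targets the iphoneos SDK; the analysis that follows uses x86-64 assembly, which you get by compiling for the Mac with -arch x86_64 instead:

xcrun -sdk iphoneos clang -S -arch arm64 xxx.c -o xxx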

Before jumping to add, the callq instruction in main automatically pushes the address of main's next instruction onto the stack. When add returns, that address is popped and execution continues there. On entering add, the function prologue pushes the caller's frame pointer (%rbp) onto the stack and then copies the stack pointer (%rsp) into %rbp, so the frame pointer now points to the bottom of add's stack frame. As a result, the slot at FP holds the previous frame's frame pointer, and the slot at FP + 1 (one pointer-sized word above it) holds the return address, that is, the caller's next PC.

Let's verify this:

In the figure above, 0x100000F68 is the address of the instruction that main continues with after add returns. The first instruction of add is push %rbp, which saves the caller's frame pointer. So FP holds the previous frame's frame pointer, and FP + 1 holds the return address in the calling function.

The whole chain of function calls can therefore be recovered from the call stack, as shown in the figure below:

The walk looks like this:

while(FP) {
  PC = *(FP + 1);
  FP = *FP;
}
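The loop above can be tried out without touching real registers. The following is a minimal sketch, assuming the frame layout described in this section (saved frame pointer at FP, return address at FP + 1). The names frame_entry, walk_frames, and demo_walk are hypothetical, and the frames are built by hand rather than read from a live thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout assumed from the text: the word at FP holds the caller's frame
 * pointer, and the next word (FP + 1) holds the return address. */
typedef struct frame_entry {
    const struct frame_entry *previous; /* *FP       */
    uintptr_t return_address;           /* *(FP + 1) */
} frame_entry;

/* Follows the frame-pointer chain starting at fp and stores up to
 * max_entries return addresses in out. Returns how many were stored. */
size_t walk_frames(const frame_entry *fp, uintptr_t *out, size_t max_entries)
{
    assert(out != NULL);
    size_t count = 0;
    while (fp != NULL && count < max_entries) {
        out[count++] = fp->return_address; /* PC = *(FP + 1) */
        fp = fp->previous;                 /* FP = *FP       */
    }
    return count;
}

/* Builds a hand-made chain add -> main -> start (made-up addresses)
 * and walks it from the innermost frame outward. */
size_t demo_walk(uintptr_t *out, size_t max_entries)
{
    frame_entry start_frame = { NULL, 0x1000 };
    frame_entry main_frame = { &start_frame, 0x2000 };
    frame_entry add_frame = { &main_frame, 0x0F68 };
    return walk_frames(&add_frame, out, max_entries);
}
```

A walker for a real thread would also need to check that each frame pointer is readable before dereferencing it, since a corrupted stack can contain arbitrary values.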

Walking the call stack this way yields the whole chain of function calls: each step produces an instruction address (PC) inside one of the calling functions. From that address you can find the function's start address, and from there the function name and the image it belongs to. If a dSYM symbol file is available, you can also find the source file and line number.

For the innermost frame, the current instruction address comes from the RIP register.

Note: these are instruction addresses inside a function, close to but not equal to the function's start address.

In the code region of an image, symbols are laid out one after another (like the functions in the demo), and all of them are recorded in the symbol table. So we can traverse the symbol table and look for the symbol whose address is less than or equal to the instruction address and closest to it. The instruction must lie inside that function, which gives us the function's symbol.
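The nearest-symbol rule can be sketched in a few lines. This is an illustration only, assuming a simplified symbol table of (address, name) pairs; simple_symbol, nearest_symbol, demo_symbols, and demo_lookup are hypothetical names, and the addresses are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol table entry: start address and name. */
typedef struct {
    uintptr_t address;
    const char *name;
} simple_symbol;

/* Returns the symbol whose start address is the largest one that is still
 * <= target, or NULL if every symbol starts after target. */
const simple_symbol *nearest_symbol(const simple_symbol *symbols, size_t count,
                                    uintptr_t target)
{
    assert(symbols != NULL || count == 0);
    const simple_symbol *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (symbols[i].address <= target &&
            (best == NULL || symbols[i].address > best->address)) {
            best = &symbols[i];
        }
    }
    return best;
}

/* Example table; the addresses are made up for illustration. */
static const simple_symbol demo_symbols[] = {
    { 0x0F20, "add" },
    { 0x0F40, "main" },
    { 0x0F90, "helper" },
};

/* Looks up the name of the function containing address, or NULL. */
const char *demo_lookup(uintptr_t address)
{
    const simple_symbol *s = nearest_symbol(demo_symbols, 3, address);
    return s != NULL ? s->name : NULL;
}
```

With this table, an instruction address of 0x0F68 resolves to main, because 0x0F40 is the closest start address that does not exceed it.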

How to get a thread's call stack

Call stack of the current thread

The familiar way is [NSThread callStackSymbols], which returns the call stack of the current thread, as shown below:

This approach has limits. If symbols have been removed with strip, it cannot return complete information such as function names. It is also awkward for lag detection, which is usually driven by the RunLoop: work submitted with dispatch_async or performSelector is itself queued on the RunLoop, so while the main thread is stuck you cannot get its call stack in real time. Reaching the main thread through other channels, such as signals or Mach ports, requires extra code running on the target thread. Getting the call stacks of all threads this way is tedious and not general, and it cannot handle a thread that has crashed. We need a general way to get the call stack of any thread.
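[NSThread callStackSymbols] is an Objective-C API. As a C-level illustration of the same idea, the sketch below uses backtrace() and backtrace_symbols() from <execinfo.h>, which are available on macOS, iOS, and glibc. It is not taken from the original article, the function name print_current_call_stack is hypothetical, and it shares the limitation discussed above: it only sees the calling thread.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Prints the call stack of the calling thread and returns the number of
 * frames captured (at most 64). */
int print_current_call_stack(void)
{
    void *addresses[64];
    int count = backtrace(addresses, 64);
    assert(count >= 0 && count <= 64);

    /* backtrace_symbols returns one malloc'ed block holding all strings. */
    char **symbols = backtrace_symbols(addresses, count);
    if (symbols == NULL) {
        return count;
    }
    for (int i = 0; i < count; i++) {
        printf("%s\n", symbols[i]);
    }
    free(symbols);
    return count;
}
```

The strings printed here depend on the symbols left in the binary, so a stripped build shows addresses with little or no naming information.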

Call stacks of all threads

How do we get all the threads of the current process?

Fortunately, the Mach kernel provides a user-mode interface, as follows:

// task_threads stores all threads of target_task in the act_list array, which holds act_listCnt threads. Use mach_task_self() to get the target_task of the current process.
kern_return_t task_threads
(
	task_inspect_t target_task,
	thread_act_array_t *act_list,
	mach_msg_type_number_t *act_listCnt
);

As the name suggests, it gets all threads from a task. Why a task? Here is a brief look at how tasks and threads in the XNU kernel of iOS and macOS relate to the familiar processes and threads:

In the kernel, processes and threads are built on Mach tasks and threads. A task is a container that manages resources, such as files and I/O device handles, on behalf of its threads. Every process and thread has a corresponding Mach task and thread underneath, so all the threads of a process can be obtained through its task.
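As a minimal sketch of enumerating the threads of the current task with task_threads: the function name count_task_threads is hypothetical, and the non-Apple branch is only a stub so the file compiles elsewhere. The cleanup calls reflect the usual Mach convention that the caller releases both the thread ports and the returned array.

```c
#include <assert.h>
#include <stdio.h>

#ifdef __APPLE__
#include <mach/mach.h>

/* Returns the number of threads in the current task, or -1 on failure. */
int count_task_threads(void)
{
    thread_act_array_t threads = NULL;
    mach_msg_type_number_t count = 0;

    kern_return_t kr = task_threads(mach_task_self(), &threads, &count);
    if (kr != KERN_SUCCESS) {
        return -1;
    }
    assert(threads != NULL);

    for (mach_msg_type_number_t i = 0; i < count; i++) {
        printf("thread %u: port 0x%x\n", i, threads[i]);
        /* Each entry is a port right that the caller must release. */
        mach_port_deallocate(mach_task_self(), threads[i]);
    }
    /* The array itself was allocated in our address space. */
    vm_deallocate(mach_task_self(), (vm_address_t)threads,
                  count * sizeof(thread_act_t));
    return (int)count;
}
#else
/* Mach is only available on Apple platforms; report failure elsewhere. */
int count_task_threads(void)
{
    return -1;
}
#endif
```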

Once we have all the threads, how do we get the call stack of each one?

For this too, the Mach kernel exposes a user-mode interface:

// Get thread state information
kern_return_t thread_get_state
(
	thread_act_t target_act,				// Target thread, obtained from task_threads
	thread_state_flavor_t flavor,			// Thread state type, for example [ARM/x86]_THREAD_STATE64
	thread_state_t old_state,				// Receives the thread state, including the registers needed to walk the call stack
	mach_msg_type_number_t *old_stateCnt	// Size of the thread state, counted in natural_t units
);

The thread state structure passed as old_state holds the registers needed to walk the call stack. On x86-64 it is defined as follows:

_STRUCT_X86_THREAD_STATE64
{
	__uint64_t	__rax;
	__uint64_t	__rbx;
	__uint64_t	__rcx;
	__uint64_t	__rdx;
	__uint64_t	__rdi;
	__uint64_t	__rsi;
	__uint64_t	__rbp;	// frame pointer
	__uint64_t	__rsp;	// stack pointer
	__uint64_t	__r8;
	__uint64_t	__r9;
	__uint64_t	__r10;
	__uint64_t	__r11;
	__uint64_t	__r12;
	__uint64_t	__r13;
	__uint64_t	__r14;
	__uint64_t	__r15;
	__uint64_t	__rip;	// The current thread instruction address
	__uint64_t	__rflags;
	__uint64_t	__cs;
	__uint64_t	__fs;
	__uint64_t	__gs;
};

This interface gives us the key registers mentioned above: the RBP frame pointer, the RSP stack pointer, and RIP, the thread's current instruction address. Starting from RBP we can walk the whole call stack and collect the address of every calling function.

With these two interfaces we can get the call stacks of all threads as lists of function addresses, and from each address we can then look up the function's symbol name.

Symbolicating function addresses

How do we go from a function address to information about the function, such as the image name and address and the symbol name and address, as in the call stacks shown above?

The steps are as follows:

Get function address

Use task_threads to get all threads and thread_get_state to get each thread's registers, then walk the frame pointers to collect all the function addresses on that thread's call stack.

Locate the image

dyld provides interfaces for inspecting loaded images: _dyld_image_count returns the number of images, _dyld_get_image_name returns an image's name, and _dyld_get_image_header returns the address of its header. From the header address you can read the image's Mach-O structure, as shown in the following figure:

In short: the mach_header_64 at the start of a Mach-O image records the number of load commands. Among the load commands, each LC_SEGMENT_64 describes one segment, including its name, virtual memory address (vmaddr), and size (vmsize). After subtracting the image's ASLR slide from the address, we traverse the load commands and compare the address against the range of each segment. If it falls inside a segment, it belongs to that image. The code is as follows:

static uint32_t imageIndexContainingAddress(const uintptr_t address)
{
    const uint32_t imageCount = _dyld_image_count();
    const struct mach_header* header = 0;
    
    for(uint32_t iImg = 0; iImg < imageCount; iImg++)
    {
        header = _dyld_get_image_header(iImg);
        if(header != NULL)
        {
            // Look for a segment command with this address within its range.
            uintptr_t addressWSlide = address - (uintptr_t)_dyld_get_image_vmaddr_slide(iImg);
            uintptr_t cmdPtr = firstCmdAfterHeader(header);
            if(cmdPtr == 0)
            {
                continue;
            }
            for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++)
            {
                const struct load_command* loadCmd = (struct load_command*)cmdPtr;
                if(loadCmd->cmd == LC_SEGMENT)
                {
                    const struct segment_command* segCmd = (struct segment_command*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize)
                    {
                        return iImg;
                    }
                }
                else if(loadCmd->cmd == LC_SEGMENT_64)
                {
                    const struct segment_command_64* segCmd = (struct segment_command_64*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize)
                    {
                        return iImg;
                    }
                }
                cmdPtr += loadCmd->cmdsize;
            }
        }
    }
    return UINT_MAX;
}

Find the symbol table

The LC_SYMTAB load command gives the location of the symbol table and the string table (their offsets, the number of symbols, and the string table size). With these we can reach every entry in the symbol table and the corresponding function names in the string table. The code is as follows:

// Get the Mach-O header of the image found in the previous step
const struct mach_header* header = _dyld_get_image_header(index);
const uintptr_t slide = (uintptr_t)_dyld_get_image_vmaddr_slide(index);
const struct symtab_command* symtabCmd = NULL;
const struct segment_command_64* linkeditSegment = NULL;
// Walk the load commands to find LC_SYMTAB and the __LINKEDIT segment
uintptr_t cmdPtr = firstCmdAfterHeader(header);
for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++)
{
    const struct load_command* loadCmd = (struct load_command*)cmdPtr;
    if(loadCmd->cmd == LC_SYMTAB)
    {
        symtabCmd = (const struct symtab_command*)cmdPtr;
    }
    else if(loadCmd->cmd == LC_SEGMENT_64)
    {
        const struct segment_command_64* segmentCmd = (struct segment_command_64*)cmdPtr;
        if(strcmp(segmentCmd->segname, SEG_LINKEDIT) == 0)
        {
            linkeditSegment = segmentCmd;
        }
    }
    cmdPtr += loadCmd->cmdsize; // Advance to the next load command
}
// Base address = slide + __LINKEDIT virtual address - __LINKEDIT file offset
uintptr_t linkeditBase = slide + linkeditSegment->vmaddr - linkeditSegment->fileoff;
// Symbol table address = base address + symbol table offset
// (nlist_t is assumed to be a typedef for struct nlist_64 on 64-bit targets)
const nlist_t *symbolTable = (nlist_t *)(linkeditBase + symtabCmd->symoff);
// String table address = base address + string table offset
const char *stringTable = (char *)(linkeditBase + symtabCmd->stroff);
// The number of symbols
uint32_t symNum = symtabCmd->nsyms;
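The base-address arithmetic is easier to follow with numbers. The sketch below is only a worked example: linkedit_base and table_address are hypothetical helper names, and every value used with them is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Base that file offsets inside __LINKEDIT are added to once the image is
 * loaded: slide + __LINKEDIT vmaddr - __LINKEDIT fileoff. */
uint64_t linkedit_base(uint64_t slide, uint64_t vmaddr, uint64_t fileoff)
{
    return slide + vmaddr - fileoff;
}

/* In-memory address of a table (symbol or string table) given its file
 * offset as recorded in LC_SYMTAB (symoff or stroff). */
uint64_t table_address(uint64_t base, uint32_t table_file_offset)
{
    return base + table_file_offset;
}
```

For example, with a slide of 0x4000, a __LINKEDIT vmaddr of 0x100008000, and a fileoff of 0x8000, the base is 0x100004000. A symbol table at file offset 0x8100 is then found in memory at 0x10000C100, which is 0x100 bytes into the loaded __LINKEDIT segment, just as it is 0x100 bytes into the segment in the file.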

Locate the symbol

Traverse the symbol table and find the symbol whose address is closest to, without exceeding, the function address. That entry is the function's symbol, and its string table index (n_strx) is the offset of the symbol name in the string table. The code is as follows:

const uintptr_t imageVMAddrSlide = (uintptr_t)_dyld_get_image_vmaddr_slide(idx);
const uintptr_t addressWithSlide = address - imageVMAddrSlide; // address is an instruction address from the call stack
const nlist_t* bestMatch = NULL;
uintptr_t bestDistance = ULONG_MAX;
// Iterate over the symbols to find the best match
for(uint32_t iSym = 0; iSym < symtabCmd->nsyms; iSym++)
{
    // If n_value is 0, the symbol refers to an external object.
    if(symbolTable[iSym].n_value != 0)
    {
        uintptr_t symbolBase = symbolTable[iSym].n_value; // Start address of the symbol (function)
        uintptr_t currentDistance = addressWithSlide - symbolBase;
        if((addressWithSlide >= symbolBase) &&
        (currentDistance <= bestDistance))
        {
            bestMatch = symbolTable + iSym; // Best matching symbol so far
            bestDistance = currentDistance; // Distance from the symbol start to the stack address
        }
    }
}
if(bestMatch != NULL)
{
    info->dli_saddr = (void*)(bestMatch->n_value + imageVMAddrSlide);
    if(bestMatch->n_desc == 16)
    {
        // This image has been stripped. The name is meaningless, and
        // almost certainly resolves to "_mh_execute_header"
        info->dli_sname = NULL;
    }
    else
    {
        // Get the symbol name from the string table
        info->dli_sname = (char*)((intptr_t)stringTable + (intptr_t)bestMatch->n_un.n_strx);
        if(*info->dli_sname == '_')
        {
            info->dli_sname++;
        }
    }
}

Thinking and exploration

How do you get a thread's name?
How do you use the dSYM debug symbol file to find the source file and line number for a function symbol?