Concept
The call stack, also known as the execution stack, control stack, runtime stack, or machine stack, is a data structure that stores information about the subroutines currently running in a program. It mainly holds return addresses, local variables, parameters, and saved context, and it tracks the point to which each active subroutine should return control when it finishes executing.
A thread's call stack, as shown in the figure, is divided into stack frames, and each stack frame corresponds to one function call. For example, the blue part is the stack frame of the DrawSquare function, which calls the DrawLine function while running; the green part is the stack frame of DrawLine. A stack frame consists of three parts: function parameters, the return address, and local variables. When DrawLine is called, its parameters are pushed first, then the return address (together with the frame pointer of the previous stack frame, so that the caller's frame can be restored once the function finishes), and finally the function's local variables.
On most operating systems, including iOS, the stack grows downward, from high addresses to low addresses. A Stack Pointer (SP) points to the top of the stack, and a Frame Pointer (FP) records where the previous frame begins. By following the frame pointers from frame to frame, you can walk back through the entire stack.
ARM call stack
First, the registers that the ARM architecture uses for the call stack. The 64-bit ARM64 instruction set has its own register set; the registers of the 32-bit ARMv7 instruction set are as follows:
- R15, PC (Program Counter): holds the memory address of the next instruction to be executed.
- R14, LR (Link Register): holds the return address, that is, the address in the calling function at which execution resumes when the current function returns.
- R13, SP (Stack Pointer): holds the pointer to the top of the stack.
- R12, IP (Intra-Procedure-call scratch register): can be thought of simply as a temporary copy of SP.
- R7, FP (Frame Pointer): holds the pointer to the previous frame on the stack.
- R9: reserved by the operating system.
- R4-R6, R8, R10-R11: no special role; ordinary general-purpose registers.
- R0-R3: used to pass arguments to a function and to hold its return value.
A typical stack frame is shown below:
The func1 stack frame is the frame of the current (called) function. The bottom of the stack is at a high address and the stack grows downward. In the figure, FP is the frame base address, pointing to the start of the function's stack frame, and SP is the function's stack pointer, pointing to the top of the stack. On ARM the frame holds, in push order, the current function's PC, the return address LR, the stack pointer SP, the frame base FP, the passed arguments and pointers, local variables, and temporaries. If the function is about to call another function, the temporaries area must hold that function's arguments before the jump.
The assembly code corresponding to the call stack shown above is as follows.
1. Line 8514 saves the current SP in IP (IP is simply a general-purpose register, usually R12, used to hold data temporarily across procedure calls).
2. Line 8518 pushes the four registers onto the stack, from right to left.
3. Line 851c subtracts 4 from the saved IP to get the FP of the function being called; FP now points to the saved PC slot on the stack.
4. Line 8520 subtracts 8 from SP to reserve 8 bytes of stack space for local variables.
00008514 <func1>:
8514: e1a0c00d mov ip, sp
8518: e92dd800 push {fp, ip, lr, pc}
851c: e24cb004 sub fp, ip, #4
8520: e24dd008 sub sp, sp, #8
8524: e3a03000 mov r3, #0
8528: e50b3010 str r3, [fp, #-16]
852c: e30805dc movw r0, #34268 ; 0x85dc
8530: e3400000 movt r0, #0
8534: ebffff9d bl 83b0 <puts@plt>
8538: e51b3010 ldr r3, [fp, #-16]
853c: e12fff33 blx r3
8540: e3a03000 mov r3, #0
8544: e1a00003 mov r0, r3
8548: e24bd00c sub sp, fp, #12
854c: e89da800 ldm sp, {fp, sp, pc}
We can trace the function calls using the FP and SP registers. As the figure shows, the stack frame of func1 stores the stack information of main (the SP and FP shown in green). From these two values we know the start address of main's stack frame (the saved FP) and the top of its stack (the saved SP). Once you have main's stack frame, you can read the saved LR from it (LR is at an offset of 4 bytes from FP) and learn who called main. Repeating this step yields the complete chain of function calls (there is usually no need to continue once you reach main or the thread entry function). In fact, SP is not needed to walk back; FP alone is enough.
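The walk described above can be sketched on a simulated stack. This is only an illustration of the frame layout built by `push {fp, ip, lr, pc}` followed by `sub fp, ip, #4`, where FP points at the saved PC, LR sits 4 bytes below it, and the caller's FP sits 12 bytes below it. The helper name `walk_apcs_frames` and the use of array indices in place of byte addresses are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated 32-bit stack: one array element per 4-byte word.
 * Simplification: saved fp values are word indices into the array
 * rather than byte addresses, and index 0 marks the outermost frame. */

/* Hypothetical helper: collects the saved lr of every frame, innermost first. */
size_t walk_apcs_frames(const uint32_t *stack, size_t words, size_t fp,
                        uint32_t *out_lr, size_t max_frames)
{
    assert(stack != NULL && out_lr != NULL);
    size_t n = 0;
    while (fp != 0 && n < max_frames) {
        if (fp < 3 || fp >= words) {
            break;                      /* frame record outside the simulated stack */
        }
        out_lr[n++] = stack[fp - 1];    /* [fp, #-4]:  saved lr (return address) */
        fp = stack[fp - 3];             /* [fp, #-12]: caller's saved fp */
    }
    return n;
}
```

Note that the loop never reads the saved SP slot, which matches the observation that FP alone is enough to walk back.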
Example code is as follows:
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    int a = 10;
    int b = 20;
    int c = add(a, b);
    printf("add ret:%d\n", c);
    return 0;
}
Use clang with -arch to select the target architecture, or use xcrun to select the SDK. The commands are as follows:
// -arch selects the architecture to compile for, such as armv7, armv7s, or arm64
// -isysroot specifies the SDK root path used to find header files
clang -S -arch arm64 -o hello hello.c -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.4.sdk
// You can also use xcrun; xcrun -sdk iphoneos compiles with the latest SDK
xcrun -sdk iphoneos clang -S -arch arm64 -o hello hello.c
Thread stack
Each thread has its own stack, which records that thread's chain of calls. The SP and FP registers described above are enough to recover the stack contents, but how do we obtain them for a specific thread?
NSThread provides [NSThread callStackSymbols] to obtain the call stack of the current thread. The call stack can also be obtained through the backtrace/backtrace_symbols interfaces, but these too only cover the current thread, not other threads. Fortunately, the Mach kernel provides thread_get_state to get a thread's context and task_threads to get all threads of a task, defined as follows:
kern_return_t thread_get_state
(
    thread_act_t target_act,
    thread_state_flavor_t flavor,
    thread_state_t old_state,
    mach_msg_type_number_t *old_stateCnt
);

#if defined(__x86_64__)
    _STRUCT_MCONTEXT ctx;
    mach_msg_type_number_t count = x86_THREAD_STATE64_COUNT;
    thread_get_state(thread, x86_THREAD_STATE64, (thread_state_t)&ctx.__ss, &count);
    uint64_t pc = ctx.__ss.__rip;
    uint64_t sp = ctx.__ss.__rsp;
    uint64_t fp = ctx.__ss.__rbp;
#elif defined(__arm64__)
    _STRUCT_MCONTEXT ctx;
    mach_msg_type_number_t count = ARM_THREAD_STATE64_COUNT;
    thread_get_state(thread, ARM_THREAD_STATE64, (thread_state_t)&ctx.__ss, &count);
    uint64_t pc = ctx.__ss.__pc;
    uint64_t sp = ctx.__ss.__sp;
    uint64_t fp = ctx.__ss.__fp;
#endif

// task_threads stores all threads of the task target_task in the array act_list,
// which contains act_listCnt threads. Use mach_task_self() to get the
// target_task of the current process.
kern_return_t task_threads
(
    task_inspect_t target_task,
    thread_act_array_t *act_list,
    mach_msg_type_number_t *act_listCnt
);
All threads of the task can be read from the act_list array. Once you have a thread, thread_get_state retrieves its state and fills in a structure of type _STRUCT_MCONTEXT. Two of the arguments to thread_get_state, the state flavor and the state count, differ between CPU architectures, so the code has to handle each CPU separately. The _STRUCT_MCONTEXT structure stores the Stack Pointer of the thread and the Frame Pointer of its topmost stack frame, from which the call stack of the whole thread can be walked.
mach_task_self() returns the Mach task of the current process. The thread here is the lowest-level Mach kernel thread. POSIX threads (pthread) correspond one-to-one with kernel threads and are an abstraction over them, and NSThread is an object-oriented wrapper around pthread.
During a function call, an exception may leave a stack frame corrupted, so a frame pointer taken from a thread may point into memory that the process is not allowed to access. If you take the frame pointer from thread_get_state and dereference it directly while walking the whole call stack, a bad pointer access can crash the program. The call stack can be read safely with the vm_read_overwrite function, which asks the kernel to copy the memory and reports an error instead of faulting when the address cannot be read. It is used as follows:
typedef struct StackFrameEntry {
    const struct StackFrameEntry *const previous; // previous stack frame
    uintptr_t return_address;                     // return address, inside the calling function
} StackFrameEntry;

// mach_task_self(): the task object
// src: the FP (stack frame pointer) to read from
// numBytes: sizeof(StackFrameEntry)
// dst: pointer to the StackFrameEntry that receives the copy
// bytesCopied: number of bytes actually copied
vm_size_t bytesCopied = 0;
kern_return_t kr = vm_read_overwrite(mach_task_self(), (vm_address_t)src, (vm_size_t)numBytes, (vm_address_t)dst, &bytesCopied);
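To make the safe walk concrete, here is a minimal portable sketch of the loop that would sit around such a guarded read. It assumes a reader callback in place of vm_read_overwrite: `read_memory_fn`, `ReadableRange`, `range_reader`, and `walk_frames` are hypothetical names introduced for the example, and `range_reader` only imitates the kernel's permission check by refusing addresses outside a known range. The struct here drops the `const` qualifier on `previous` so that a local copy can be filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct StackFrameEntry {
    const struct StackFrameEntry *previous; /* previous stack frame */
    uintptr_t return_address;               /* return address into the caller */
} StackFrameEntry;

/* Hypothetical stand-in for vm_read_overwrite: returns 0 on success,
 * non-zero when the source range must not be read. */
typedef int (*read_memory_fn)(void *ctx, uintptr_t src, void *dst, size_t num_bytes);

typedef struct {
    uintptr_t lo;   /* first readable address */
    uintptr_t hi;   /* one past the last readable address */
} ReadableRange;

/* Demo reader: copies only when [src, src + num_bytes) lies inside the range. */
int range_reader(void *ctx, uintptr_t src, void *dst, size_t num_bytes)
{
    const ReadableRange *r = ctx;
    if (src < r->lo || src > r->hi || num_bytes > r->hi - src) {
        return -1;
    }
    memcpy(dst, (const void *)src, num_bytes);
    return 0;
}

/* Follows the FP chain, copying each frame through the reader before using it.
 * Stops at a null FP, a failed read, a zero return address, or max_frames. */
size_t walk_frames(uintptr_t fp, read_memory_fn read_memory, void *ctx,
                   uintptr_t *out, size_t max_frames)
{
    assert(read_memory != NULL && out != NULL);
    size_t n = 0;
    StackFrameEntry frame;
    while (fp != 0 && n < max_frames) {
        if (read_memory(ctx, fp, &frame, sizeof frame) != 0) {
            break;                  /* unreadable frame: stop instead of crashing */
        }
        if (frame.return_address == 0) {
            break;
        }
        out[n++] = frame.return_address;
        fp = (uintptr_t)frame.previous;
    }
    return n;
}
```

The point of the design is that the walker never dereferences a frame pointer itself; every frame is copied into a local StackFrameEntry first, so a corrupted `previous` pointer ends the walk rather than the process.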
Get thread name
Each kernel thread is uniquely identified by an ID of type thread_t, and each pthread is uniquely identified by a pthread_t. Converting from thread_t to pthread_t is relatively easy. An NSThread does not store a pthread_t, but it does expose a thread name, and the pthread interface provides pthread_getname_np to get the name of a thread; the two names are the same. Here np means non-portable: the function is not part of POSIX and is not cross-platform. However, the name of the main thread cannot be obtained through pthread_getname_np, so the main thread's thread_t has to be recorded in the +load method and used to identify it.
// Convert thread_t to pthread_t
pthread_t pthread = pthread_from_mach_thread_np((thread_t)thread);

// Record the thread_t of the main thread
static mach_port_t main_thread_id;
+ (void)load {
    main_thread_id = mach_thread_self();
}
Function symbolization
After obtaining the call-stack addresses of all threads, the next step is to symbolize the function addresses, converting them into readable information for troubleshooting and locating problems.
Locate the Image
An application loads multiple image files (as shown in the figure), and each image is mapped to its own address range, so the image that a call-stack address belongs to can be determined from the address. The information needed about the images includes the number of images, each image's name, its Mach-O header, and its offset (slide). These can be obtained through the interfaces provided by dyld, as follows:
uint64_t count = _dyld_image_count();                             // number of images
const struct mach_header *header = _dyld_get_image_header(index); // image Mach-O header
const char *name = _dyld_get_image_name(index);                   // image name
uint64_t slide = _dyld_get_image_vmaddr_slide(index);             // ASLR offset (slide)
Traverse each image's Mach-O header and load commands to get the address ranges of its segments, and check whether the address falls within the current image. The logic is as follows:
static uint32_t imageIndexContainingAddress(const uintptr_t address) {
    const uint32_t imageCount = _dyld_image_count();
    const struct mach_header* header = 0;

    for(uint32_t iImg = 0; iImg < imageCount; iImg++) {
        header = _dyld_get_image_header(iImg);
        if(header != NULL) {
            // Look for a segment command with this address within its range.
            uintptr_t addressWSlide = address - (uintptr_t)_dyld_get_image_vmaddr_slide(iImg);
            // firstCmdAfterHeader (not shown) returns the address of the
            // first load command following the header, or 0 on failure.
            uintptr_t cmdPtr = firstCmdAfterHeader(header);
            if(cmdPtr == 0) {
                continue;
            }
            for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++) {
                const struct load_command* loadCmd = (struct load_command*)cmdPtr;
                if(loadCmd->cmd == LC_SEGMENT) {
                    const struct segment_command* segCmd = (struct segment_command*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize) {
                        return iImg;
                    }
                } else if(loadCmd->cmd == LC_SEGMENT_64) {
                    const struct segment_command_64* segCmd = (struct segment_command_64*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize) {
                        return iImg;
                    }
                }
                cmdPtr += loadCmd->cmdsize;
            }
        }
    }
    return UINT_MAX;
}
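The range check is easier to see with made-up numbers. The following sketch models only the arithmetic of the lookup, removing the slide from the runtime address and comparing the result with each segment's [vmaddr, vmaddr + vmsize) range. It does not call dyld; `MockSegment`, `MockImage`, and `image_index_containing_address` are hypothetical stand-ins for the real Mach-O structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t vmaddr;   /* segment start address before the slide is applied */
    uint64_t vmsize;   /* segment size in bytes */
} MockSegment;

typedef struct {
    const char *name;
    uint64_t slide;                /* ASLR slide of this image */
    const MockSegment *segments;
    size_t segment_count;
} MockImage;

/* Returns the index of the image containing `address`, or SIZE_MAX if none does. */
size_t image_index_containing_address(const MockImage *images, size_t image_count,
                                      uint64_t address)
{
    assert(images != NULL || image_count == 0);
    for (size_t i = 0; i < image_count; i++) {
        /* Undo the slide so the address is comparable with vmaddr. */
        uint64_t unslid = address - images[i].slide;
        for (size_t s = 0; s < images[i].segment_count; s++) {
            const MockSegment *seg = &images[i].segments[s];
            if (unslid >= seg->vmaddr && unslid - seg->vmaddr < seg->vmsize) {
                return i;
            }
        }
    }
    return SIZE_MAX;
}
```

Writing the upper bound as `unslid - seg->vmaddr < seg->vmsize` avoids an overflow in `vmaddr + vmsize` for segments near the top of the address space; that is a choice made for this sketch, not something the original code does.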
Find the symbol table
The symbol table is stored in the __LINKEDIT segment (an LC_SEGMENT load command) of the Mach-O file and involves both the symbol table and the string table. The location of the symbol table in the Mach-O file is given by the symoff field of the LC_SYMTAB load command, the location of the string table holding the symbol names is given by stroff, and nsyms gives the total number of symbol entries. In other words, LC_SYMTAB is used to find the symbol data stored in __LINKEDIT.
A symbol table is a contiguous list where each item is a struct nlist, as follows:
struct nlist {
    union {
        uint32_t n_strx;  // offset of the symbol name in the string table
    } n_un;
    uint8_t n_type;
    uint8_t n_sect;
    int16_t n_desc;
    uint32_t n_value;     // memory address of the symbol, similar to a function pointer
};
The n_strx field of a symbol table entry gives the offset of the symbol name in the string table; the symbol name is the function name. The n_value field gives the address of the symbol, similar to a function pointer; it is the address before the ASLR slide is applied. This establishes the correspondence between symbol names and memory addresses. The code for obtaining the symbol table and string table is as follows:
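// Mach-O header
const struct mach_header *header = _dyld_get_image_header(index);
const struct symtab_command *symtabCmd = NULL;
const struct segment_command_64 *linkeditSegment = NULL;
// cmdPtr starts at the first load command after the header
uintptr_t cmdPtr = firstCmdAfterHeader(header);

// Traverse the load commands
for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++) {
    const struct load_command *loadCmd = (struct load_command *)cmdPtr;
    if(loadCmd->cmd == LC_SYMTAB) {
        symtabCmd = (const struct symtab_command *)cmdPtr;
    } else if(loadCmd->cmd == LC_SEGMENT_64) {
        const struct segment_command_64 *segmentCmd = (struct segment_command_64 *)cmdPtr;
        if(strcmp(segmentCmd->segname, SEG_LINKEDIT) == 0) {
            linkeditSegment = segmentCmd;
        }
    }
    cmdPtr += loadCmd->cmdsize;
}

// Base address
uintptr_t linkeditBase = (uintptr_t)slide + linkeditSegment->vmaddr - linkeditSegment->fileoff;
// Symbol table (nlist_t: assumed typedef for struct nlist or struct nlist_64)
const nlist_t *symbolTable = (nlist_t *)(linkeditBase + symtabCmd->symoff);
// String table
char *stringTab = (char *)(linkeditBase + symtabCmd->stroff);
// Number of symbols
uint32_t symNum = symtabCmd->nsyms;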
Locate the symbol
The lookup above gives the real memory address and name of each symbol, whereas the addresses obtained from the call stack are addresses of instructions executing inside functions. Such an address lies only a short distance past the start of its function, so we can traverse the symbol table and pick the symbol whose address is closest to the call-stack address without exceeding it. That best match is the symbol for the current call-stack entry. The code is as follows:
const uintptr_t imageVMAddrSlide = (uintptr_t)_dyld_get_image_vmaddr_slide(idx);
const uintptr_t addressWithSlide = address - imageVMAddrSlide;
const nlist_t *bestMatch = NULL;
uintptr_t bestDistance = ULONG_MAX;

// Traverse the symbol table
for(uint32_t iSym = 0; iSym < symtabCmd->nsyms; iSym++) {
    // If n_value is 0, the symbol refers to an external object.
    if(symbolTable[iSym].n_value != 0) {
        uintptr_t symbolBase = symbolTable[iSym].n_value;
        uintptr_t currentDistance = addressWithSlide - symbolBase;
        if((addressWithSlide >= symbolBase) && (currentDistance <= bestDistance)) {
            bestMatch = symbolTable + iSym;   // best matching symbol
            bestDistance = currentDistance;
        }
    }
}

if(bestMatch != NULL) {
    info->dli_saddr = (void*)(bestMatch->n_value + imageVMAddrSlide);
    if(bestMatch->n_desc == 16) {
        // This image has been stripped. The name is meaningless, and
        // almost certainly resolves to "_mh_execute_header"
        info->dli_sname = NULL;
    } else {
        info->dli_sname = (char*)((intptr_t)stringTable + (intptr_t)bestMatch->n_un.n_strx);
        if(*info->dli_sname == '_') {
            info->dli_sname++;
        }
    }
}
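The same nearest-preceding-symbol search can be tried out on a small invented table. The sketch below is a simplified model under stated assumptions: `MockSymbol` keeps only the two nlist fields the search needs, `symbolicate` is a hypothetical helper, and the address passed in is assumed to have had the slide subtracted already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t n_strx;   /* offset of the name in the string table */
    uint64_t n_value;  /* symbol address before the slide; 0 for external symbols */
} MockSymbol;

/* Returns the name of the closest symbol at or below `unslid_address`
 * (leading '_' skipped), or NULL when no symbol precedes the address. */
const char *symbolicate(const MockSymbol *symbols, size_t count,
                        const char *string_table, uint64_t unslid_address,
                        uint64_t *out_symbol_address)
{
    assert(symbols != NULL && string_table != NULL);
    const MockSymbol *best = NULL;
    uint64_t best_distance = UINT64_MAX;
    for (size_t i = 0; i < count; i++) {
        if (symbols[i].n_value == 0 || unslid_address < symbols[i].n_value) {
            continue;   /* external symbol, or symbol starts after the address */
        }
        uint64_t distance = unslid_address - symbols[i].n_value;
        if (distance <= best_distance) {
            best = &symbols[i];
            best_distance = distance;
        }
    }
    if (best == NULL) {
        return NULL;
    }
    if (out_symbol_address != NULL) {
        *out_symbol_address = best->n_value;
    }
    const char *name = string_table + best->n_strx;
    return (*name == '_') ? name + 1 : name;
}
```

A linear scan like this costs one pass over the table per address; when many addresses have to be resolved, sorting the symbols by n_value once and using a binary search would be a natural alternative.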
iOS Development: Exploring the iOS thread call stack and symbolization