Concept
The call stack, also known as the execution stack, control stack, runtime stack, or machine stack, is a data structure that stores information about the active subroutines of a running program. It mainly holds return addresses, local variables, parameters, and saved execution context. Its primary job is to track the point to which each active subroutine should return control when it finishes executing.
The figure above shows the call stack of a thread. It is divided into stack frames, each of which corresponds to one function call. For example, the blue part is the stack frame of the DrawSquare function, which calls DrawLine while it runs; the green part is the stack frame of DrawLine. A stack frame mainly contains three parts: function parameters, the return address, and local variables. When DrawLine is called, for instance, the function parameters are pushed onto the stack first, then the return address (the address at which the caller resumes once the function finishes), and finally the function's local variables.
On most operating systems (including iOS), the stack grows downward, from high addresses to low addresses. The Stack Pointer (SP) points to the top of the stack, and the Frame Pointer (FP) links the current frame to the previous one. By following the frame pointers recursively, you can recover the entire call stack.
The ARM call stack
First, the registers that the ARM architecture uses for the call stack. For the 64-bit ARM64 instruction set they are as follows:
The registers of the 32-bit ARMv7 instruction set are as follows:
- r15, PC (the Program Counter): the instruction register, which holds the memory address of the next instruction to be executed.
- r14, LR (the Link Register): holds the return address, that is, the address in the caller at which execution resumes when the current function returns.
- r13, SP (the Stack Pointer): holds the pointer to the top of the stack.
- r12, IP (the Intra-Procedure-call scratch register): can be thought of simply as a temporary SP.
- r7, FP (the Frame Pointer): the stack frame pointer, which holds a pointer to the previous stack frame.
- r9: reserved for the operating system.
- r4-r6, r8, r10-r11: ordinary general-purpose registers with no special role.
- r0-r3: used to pass arguments to a function and to hold its return value.
A typical stack frame is shown below:
The func1 stack frame is the frame of the current (called) function. The bottom of the stack is at a high address and the stack grows downward. In the figure, FP is the frame base address, which points to the start of the function's stack frame, and SP is the stack pointer, which points to the top of the stack. On entry to a function, ARM code pushes the current PC, the return address LR, the stack pointer SP, and the frame base FP, followed by the passed parameters, local variables, and temporary variables. If the function is about to call another function, the temporary variable area must hold that function's arguments before the jump.
The assembly code corresponding to the call stack shown above is as follows.
- Line 8514 saves the current sp in ip (ip is a general-purpose register used to hold data temporarily during calls between functions, usually r12);
- Line 8518 pushes the four registers onto the stack from right to left;
- Line 851c subtracts 4 from the saved ip to obtain the fp of the currently called function, which points to the position of pc on the stack;
- Line 8520 subtracts 8 from sp, opening up 8 bytes of stack space for local variables.
00008514 <func1>:
8514: e1a0c00d mov ip, sp
8518: e92dd800 push {fp, ip, lr, pc}
851c: e24cb004 sub fp, ip, #4
8520: e24dd008 sub sp, sp, #8
8524: e3a03000 mov r3, #0
8528: e50b3010 str r3, [fp, #-16]
852c: e30805dc movw r0, #34268 ; 0x85dc
8530: e3400000 movt r0, #0
8534: ebffff9d bl 83b0 <puts@plt>
8538: e51b3010 ldr r3, [fp, #-16]
853c: e12fff33 blx r3
8540: e3a03000 mov r3, #0
8544: e1a00003 mov r0, r3
8548: e24bd00c sub sp, fp, #12
854c: e89da800 ldm sp, {fp, sp, pc}
We can trace the chain of function calls from the FP and SP registers. As the picture above shows, the stack frame of func1 stores the stack information of main (the SP and FP in green). From these two values we know the start address of main's stack frame (the saved FP) and the top of its stack (the saved SP). Once we have main's stack frame, we can easily read the saved LR from it (offsetting FP by 4 bytes gives the location of LR), which tells us who called main. Repeating this step yields the complete chain of function calls (usually there is no need to go back past main or the thread entry function). In fact, walking back does not require the SP at the top of the stack at all; the FP alone is enough.
Example code is as follows:
#include <stdio.h>
int add(int a, int b){
    return a + b;
}
int main(void){
    int a = 10;
    int b = 20;
    int c = add(a, b);
    printf("add ret:%d \n", c);
    return 0;
}
Use xcrun to select the SDK and clang's -arch option to select the target architecture. -S asks clang to emit assembly, -arch names the architecture to compile for (armv7, armv7s, or arm64), and -isysroot specifies the SDK root path used to find header files. Either of the following commands compiles the example:
$ clang -S -arch arm64 -o hello hello.c -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS12.4.sdk
$ xcrun -sdk iphoneos clang -S -arch arm64 -o hello hello.c
Thread stack
Each thread has its own stack that records the thread's chain of calls. The SP and FP registers described above are enough to reconstruct that stack, so the question becomes how to obtain them for a specific thread.
NSThread provides [NSThread callStackSymbols] to obtain the call stack of the current thread. The backtrace/backtrace_symbols interface can also be used, but both approaches only return the call stack of the current thread, not that of other threads. Fortunately, the Mach kernel provides thread_get_state to read a thread's context and task_threads to list all threads of a task, as shown below:
kern_return_t thread_get_state
(
thread_act_t target_act,
thread_state_flavor_t flavor,
thread_state_t old_state,
mach_msg_type_number_t *old_stateCnt
);
#if defined(__x86_64__)
_STRUCT_MCONTEXT ctx;
mach_msg_type_number_t count = x86_THREAD_STATE64_COUNT;
thread_get_state(thread, x86_THREAD_STATE64, (thread_state_t)&ctx.__ss, &count);
uint64_t pc = ctx.__ss.__rip;
uint64_t sp = ctx.__ss.__rsp;
uint64_t fp = ctx.__ss.__rbp;
#elif defined(__arm64__)
_STRUCT_MCONTEXT ctx;
mach_msg_type_number_t count = ARM_THREAD_STATE64_COUNT;
thread_get_state(thread, ARM_THREAD_STATE64, (thread_state_t)&ctx.__ss, &count);
uint64_t pc = ctx.__ss.__pc;
uint64_t sp = ctx.__ss.__sp;
uint64_t fp = ctx.__ss.__fp;
#endif
// task_threads stores every thread of target_task in the act_list array and the number of threads in act_listCnt. Pass mach_task_self() as target_task to query the current process.
kern_return_t task_threads
(
task_inspect_t target_task,
thread_act_array_t *act_list,
mach_msg_type_number_t *act_listCnt
);
All threads of the task can be read from the act_list array. For each thread, thread_get_state then fills in a structure of type _STRUCT_MCONTEXT with the thread's register state. Two arguments of this call, the state flavor and the state count, differ between CPU architectures, so the code must account for the architecture it runs on. The _STRUCT_MCONTEXT structure holds the thread's current Stack Pointer and the Frame Pointer of the topmost stack frame, from which the call stack of the entire thread can be recovered.
mach_task_self() returns the task port of the current process.
The thread here is the lowest-level Mach kernel thread. A POSIX thread (pthread) corresponds one-to-one to a kernel thread and is an abstraction over it, and NSThread is an object-oriented wrapper around pthread.
An exception during a function call may corrupt a stack frame, so that the frame address recorded for the current thread lies in address space that may not be accessed. If the frames obtained through thread_get_state are dereferenced directly while walking the whole call stack, an invalid pointer access can crash the program. The call stack can be read safely with vm_read_overwrite, which asks the kernel to copy the specified memory and returns an error instead of faulting when the memory is not accessible. The relevant definitions are as follows:
typedef struct StackFrameEntry{
    const struct StackFrameEntry * const previous; // Address of the previous stack frame
    const uintptr_t return_address;                // Return address of the function
} StackFrameEntry;
// mach_task_self(): task port of the current process
// src: pointer to the stack frame (the FP value)
// numBytes: sizeof(StackFrameEntry)
// dst: pointer to the StackFrameEntry that receives the copy
// bytesCopied: receives the number of bytes copied
vm_size_t bytesCopied = 0;
kern_return_t kr = vm_read_overwrite(mach_task_self(), (vm_address_t)src, (vm_size_t)numBytes, (vm_address_t)dst, &bytesCopied);
Get thread name
Each kernel thread is uniquely identified by an ID of type thread_t, while a pthread is identified by a value of type pthread_t. Converting from thread_t to pthread_t is straightforward. NSThread does not store a pthread_t, but it can return the thread's name, and the pthread interface provides pthread_getname_np to get the same name (np stands for non-portable, meaning the function is not part of POSIX). However, the name of the main thread cannot be obtained through pthread_getname_np, so the main thread's thread_t needs to be recorded in the +load method.
// Convert a thread_t to a pthread_t
pthread_t pthread = pthread_from_mach_thread_np((thread_t)thread);
// Get the thread_t of the main thread
static mach_port_t main_thread_id;
+ (void)load {
main_thread_id = mach_thread_self();
}
Function symbolization
Once the call stack addresses of all threads have been obtained, the function addresses need to be symbolized, that is, converted into readable names that help with troubleshooting and locating problems.
Locate the Image
An application consists of multiple image files (as shown in the figure above), and each image is mapped to a unique address range, so the image that a call stack address belongs to can be determined from the address itself. The information needed about the images includes the number of images and each image's name, Mach-O header, and slide (load offset). It can be obtained through the interfaces provided by dyld, as follows:
uint32_t count = _dyld_image_count();                             // number of images
const struct mach_header *header = _dyld_get_image_header(index); // image Mach-O header
const char *name = _dyld_get_image_name(index);                   // image name
intptr_t slide = _dyld_get_image_vmaddr_slide(index);             // ASLR slide of the image
Traverse the Mach-O header of each image and its load commands to obtain the address range of every segment, and check whether the address falls within the current image (for background on Mach-O, see the material on the Mach-O file format). The code logic is as follows:
static uint32_t imageIndexContainingAddress(const uintptr_t address)
{
    const uint32_t imageCount = _dyld_image_count();
    const struct mach_header* header = 0;
    for(uint32_t iImg = 0; iImg < imageCount; iImg++)
    {
        header = _dyld_get_image_header(iImg);
        if(header != NULL)
        {
            // Look for a segment command with this address within its range.
            uintptr_t addressWSlide = address - (uintptr_t)_dyld_get_image_vmaddr_slide(iImg);
            uintptr_t cmdPtr = firstCmdAfterHeader(header);
            if(cmdPtr == 0)
            {
                continue;
            }
            for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++)
            {
                const struct load_command* loadCmd = (struct load_command*)cmdPtr;
                if(loadCmd->cmd == LC_SEGMENT)
                {
                    const struct segment_command* segCmd = (struct segment_command*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize)
                    {
                        return iImg;
                    }
                }
                else if(loadCmd->cmd == LC_SEGMENT_64)
                {
                    const struct segment_command_64* segCmd = (struct segment_command_64*)cmdPtr;
                    if(addressWSlide >= segCmd->vmaddr &&
                       addressWSlide < segCmd->vmaddr + segCmd->vmsize)
                    {
                        return iImg;
                    }
                }
                cmdPtr += loadCmd->cmdsize;
            }
        }
    }
    return UINT_MAX;
}
Find symbol
The symbol table is stored in the __LINKEDIT segment of the Mach-O file (an LC_SEGMENT load command) and involves two tables: the symbol table and the string table. The LC_SYMTAB load command gives the location of the symbol table in the Mach-O file through its symoff field and the location of the string table that holds the symbol names through stroff; nsyms is the number of symbol entries. In other words, LC_SYMTAB is used to find the symbol data stored in __LINKEDIT.
A symbol table is a contiguous list where each item is a struct nlist, as follows:
struct nlist {
    union {
        uint32_t n_strx; // The offset of the symbol name in the string table
    } n_un;
    uint8_t n_type;
    uint8_t n_sect;
    int16_t n_desc;
    uint32_t n_value;    // The memory address of a symbol, similar to a function pointer
};
The n_strx field of a symbol table entry gives the offset of the symbol name in the string table, and the symbol name is the function name. The n_value field gives the address of the symbol in memory, that is, the function pointer. Together they establish the correspondence between symbol names and memory addresses. The code for obtaining the symbol table and string table is as follows:
// Get the Mach-O header and the slide of the image
const struct mach_header* header = _dyld_get_image_header(index);
const uintptr_t slide = (uintptr_t)_dyld_get_image_vmaddr_slide(index);
const struct symtab_command* symtabCmd = NULL;
const struct segment_command_64* linkeditSegment = NULL;
// Use the header to traverse the load commands and find __LINKEDIT and LC_SYMTAB
uintptr_t cmdPtr = firstCmdAfterHeader(header);
for(uint32_t iCmd = 0; iCmd < header->ncmds; iCmd++)
{
    const struct load_command* loadCmd = (struct load_command*)cmdPtr;
    if(loadCmd->cmd == LC_SYMTAB)
    {
        symtabCmd = (const struct symtab_command*)cmdPtr;
    }
    else if(loadCmd->cmd == LC_SEGMENT_64)
    {
        const struct segment_command_64* segmentCmd = (struct segment_command_64*)cmdPtr;
        if(strcmp(segmentCmd->segname, SEG_LINKEDIT) == 0)
        {
            linkeditSegment = segmentCmd;
        }
    }
    cmdPtr += loadCmd->cmdsize;
}
// Base address = slide + __LINKEDIT virtual address - __LINKEDIT file offset
uintptr_t linkeditBase = slide + linkeditSegment->vmaddr - linkeditSegment->fileoff;
// Address of symbol table = base address + symbol table offset
const nlist_t *symbolTable = (nlist_t *)(linkeditBase + symtabCmd->symoff);
// Address of the string table = base address + string table offset
char *stringTable = (char *)(linkeditBase + symtabCmd->stroff);
// Number of symbols
uint32_t symNum = symtabCmd->nsyms;
Locate the symbol
The steps above yield the real memory address and name of every symbol. An address taken from the call stack, however, is the address of an instruction inside a function rather than the function's start address, although the two are never far apart. We can therefore traverse the symbol table and look for the symbol whose address is closest to, and not greater than, the call stack address. That best match is the symbol of the current call stack entry. The code is as follows:
const uintptr_t imageVMAddrSlide = (uintptr_t)_dyld_get_image_vmaddr_slide(idx);
const uintptr_t addressWithSlide = address - imageVMAddrSlide; // address is the memory address from the call stack
const nlist_t* bestMatch = NULL;
uintptr_t bestDistance = ULONG_MAX;
// Look for the best matching symbol
for(uint32_t iSym = 0; iSym < symtabCmd->nsyms; iSym++)
{
    // If n_value is 0, the symbol refers to an external object.
    if(symbolTable[iSym].n_value != 0)
    {
        uintptr_t symbolBase = symbolTable[iSym].n_value; // Memory address of the symbol (function pointer)
        uintptr_t currentDistance = addressWithSlide - symbolBase;
        if((addressWithSlide >= symbolBase) &&
           (currentDistance <= bestDistance))
        {
            bestMatch = symbolTable + iSym;  // Best matching symbol so far
            bestDistance = currentDistance;  // Distance between the call stack address and this symbol
        }
    }
}
if(bestMatch != NULL)
{
    info->dli_saddr = (void*)(bestMatch->n_value + imageVMAddrSlide);
    if(bestMatch->n_desc == 16)
    {
        // This image has been stripped. The name is meaningless, and
        // almost certainly resolves to "_mh_execute_header"
        info->dli_sname = NULL;
    }
    else
    {
        // Get the symbol name
        info->dli_sname = (char*)((intptr_t)stringTable + (intptr_t)bestMatch->n_un.n_strx);
        if(*info->dli_sname == '_')
        {
            info->dli_sname++;
        }
    }
}