Introduction to exceptions

The state of the processor and the operating system kernel is encoded in various bits and signals, and a change in that state is called an event. Whenever the processor detects that an event has occurred, it responds with an abrupt change in control flow called an exception.

Every type of exception in the system is assigned a unique, non-negative integer called its exception number. Some of these numbers are assigned by the designers of the processor, and the rest by the designers of the operating system kernel.

When the processor detects an event, it makes an indirect procedure call through a jump table called an exception table to an operating system subroutine specifically designed to handle such events, called an exception handler.

At system startup, the operating system allocates and initializes the exception table, whose starting address is held in a special CPU register called the exception table base register. The exception number is an index into the table: the processor combines the table's starting address with the exception number to find the address of the matching exception handler, and then runs that handler.
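The lookup described above can be sketched in plain C. This is only a user-space model: the handler type, the three sample handlers, the table size, and the `dispatch_exception` name are all hypothetical, and a real processor performs this step in hardware rather than through a C function.

```c
#include <stddef.h>

/* Hypothetical handler type: takes no arguments and returns a status code. */
typedef int (*exception_handler_t)(void);

enum { EXC_TABLE_SIZE = 4 };  /* illustrative size, not that of a real CPU */

static int handle_divide_error(void) { return 100; }
static int handle_page_fault(void)   { return 101; }
static int handle_system_call(void)  { return 102; }

/* Stand-in for the table that the operating system fills in at boot. */
static exception_handler_t exception_table[EXC_TABLE_SIZE] = {
    handle_divide_error, handle_page_fault, handle_system_call, NULL
};

/* Stand-in for the exception table base register. */
static exception_handler_t *exception_table_base = exception_table;

/* Index the table with exception number k and call the entry indirectly.
 * Returns -1 when k is out of range or no handler is installed. */
int dispatch_exception(unsigned int k)
{
    if (k >= EXC_TABLE_SIZE || exception_table_base[k] == NULL)
        return -1;
    return exception_table_base[k]();
}
```

Calling `dispatch_exception(1)` in this model runs `handle_page_fault` through the table, which mirrors the indirect procedure call through the exception table.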

Categories of exceptions

Exceptions fall into four categories: interrupts, traps, faults, and aborts (terminations).

Interrupt: Occurs asynchronously, as the result of a signal from an I/O device outside the processor rather than the execution of any particular instruction. Exception handlers for hardware interrupts are often called interrupt handlers. Example: plugging in or removing a USB flash drive.

Trap: An intentional exception that occurs as the result of executing an instruction. As with interrupt handlers, trap handlers return control to the next instruction. The most important use of traps is to provide a procedure-like interface, called a system call, between user programs and the kernel. User programs often need to request services from the kernel, such as reading a file (read), creating a process (fork), loading a new program (execve), or terminating the current process (exit). Each of these requests is made by executing a trap instruction, which transfers control to the kernel.
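As a small illustration of the system call interface, the sketch below requests the process ID through the generic `syscall()` entry point instead of the usual `getpid()` wrapper. The function name `pid_via_raw_syscall` is hypothetical. `syscall()` and `SYS_getpid` exist on Linux and macOS, although Apple marks `syscall()` as deprecated, so treat this as a demonstration rather than a recommended practice.

```c
#define _GNU_SOURCE           /* makes glibc declare syscall(); harmless elsewhere */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for the process ID by system call number.
 * The C library executes the trap instruction on our behalf. */
pid_t pid_via_raw_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

The value returned should match what `getpid()` reports, since both paths end in the same kernel service.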

Fault: Caused by an error condition that a fault handler may be able to correct. When a fault occurs, the processor transfers control to the fault handler. If the handler can correct the error condition, it returns control to the faulting instruction, which is executed again. Otherwise, the handler returns to an abort routine in the kernel, which terminates the application that caused the fault.
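The "correct the condition, then re-execute the faulting instruction" behavior can be imitated from user space. In the sketch below, a signal handler plays the role of the fault handler: a store to an inaccessible page faults, the handler makes the page writable, and the store is retried when the handler returns. All names are hypothetical. Calling `mprotect()` from a signal handler is not guaranteed by POSIX to be async-signal-safe, so this is a demonstration under that assumption, not production code. Both SIGSEGV and SIGBUS are handled because Linux and macOS report this kind of fault with different signals.

```c
#define _GNU_SOURCE           /* exposes MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count = 0;
static uintptr_t page_size = 0;

/* Plays the fault handler: fix the condition (make the page writable),
 * then return so that the faulting store is executed again. */
static void fix_fault(int signum, siginfo_t *info, void *context)
{
    (void)signum;
    (void)context;
    fault_count++;
    uintptr_t page_start = (uintptr_t)info->si_addr & ~(page_size - 1);
    if (mprotect((void *)page_start, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);             /* condition cannot be fixed: give up */
}

/* Writes to a protected page and returns how many faults were taken,
 * or -1 if the setup failed or the write did not take effect. */
int count_faults_on_protected_write(void)
{
    struct sigaction action, old_segv, old_bus;
    long size = sysconf(_SC_PAGESIZE);
    if (size <= 0)
        return -1;
    page_size = (uintptr_t)size;

    void *mem = mmap(NULL, page_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    volatile char *page = mem;

    memset(&action, 0, sizeof action);
    action.sa_sigaction = fix_fault;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, &old_segv);
    sigaction(SIGBUS, &action, &old_bus);

    fault_count = 0;
    page[0] = 42;             /* faults; retried after fix_fault returns */

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    int result = (page[0] == 42) ? (int)fault_count : -1;
    munmap(mem, page_size);
    return result;
}
```

The store appears only once in the source, yet it runs twice: the first attempt faults and the second succeeds after the handler has repaired the page protection.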

Unix does not attempt to recover from a general protection fault, such as a program writing to a read-only text segment. Instead, it reports the fault to the process as a SIGSEGV signal and terminates the program. A divide error (division by zero) is treated the same way, except that it is reported as SIGFPE.

The classic example of a fault is the page fault exception.

Abort (termination): The result of an unrecoverable fatal error, typically a hardware error. An abort handler never returns control to the application. Example: a parity error detected by the machine's hardware.

Unix signals

A signal is a higher-level form of software exception: a small message that notifies a process that an event of some type has occurred in the system. Signals give the kernel a way to notify user processes of such events, and they also allow one process to interrupt another.

Both Linux and macOS are Unix-like systems. The following table lists 30 types of signals supported by Linux. Most of them are also available on macOS, although some of the signal numbers differ there:

Number | Name | Default action | Corresponding event
--- | --- | --- | ---
1 | SIGHUP | Terminate | Terminal line hangup
2 | SIGINT | Terminate | Interrupt from keyboard
3 | SIGQUIT | Terminate | Quit from keyboard
4 | SIGILL | Terminate | Illegal instruction
5 | SIGTRAP | Terminate and dump core | Trace trap
6 | SIGABRT | Terminate and dump core | Abort signal from the abort function
7 | SIGBUS | Terminate | Bus error
8 | SIGFPE | Terminate and dump core | Floating-point exception
9 | SIGKILL | Terminate | Kill the process
10 | SIGUSR1 | Terminate | User-defined signal 1
11 | SIGSEGV | Terminate | Invalid memory reference (segmentation fault)
12 | SIGUSR2 | Terminate | User-defined signal 2
13 | SIGPIPE | Terminate | Write to a pipe with no reader
14 | SIGALRM | Terminate | Timer signal from the alarm function
15 | SIGTERM | Terminate | Software termination signal
16 | SIGSTKFLT | Terminate | Stack fault on coprocessor
17 | SIGCHLD | Ignore | A child process has stopped or terminated
18 | SIGCONT | Ignore | Continue the process if it is stopped
19 | SIGSTOP | Stop until the next SIGCONT | Stop signal not from terminal
20 | SIGTSTP | Stop until the next SIGCONT | Stop signal from terminal
21 | SIGTTIN | Stop until the next SIGCONT | Background process read from terminal
22 | SIGTTOU | Stop until the next SIGCONT | Background process wrote to terminal
23 | SIGURG | Ignore | Urgent condition on socket
24 | SIGXCPU | Terminate | CPU time limit exceeded
25 | SIGXFSZ | Terminate | File size limit exceeded
26 | SIGVTALRM | Terminate | Virtual timer expired
27 | SIGPROF | Terminate | Profiling timer expired
28 | SIGWINCH | Ignore | Window size changed
29 | SIGIO | Terminate | I/O now possible on a descriptor
30 | SIGPWR | Terminate | Power failure
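To make the notification mechanism concrete, here is a minimal sketch in which a process sends a signal to itself and a handler records which signal arrived. The names `record_signal` and `send_signal_to_self` are hypothetical. The sketch uses `raise()` so that the handler is guaranteed to have run before the call returns; one process would signal another with `kill(pid, signum)` instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t last_signal = 0;

/* Handler: only records the signal number, which is async-signal-safe. */
static void record_signal(int signum)
{
    last_signal = signum;
}

/* Installs a temporary handler for signum, sends the signal to the
 * calling process, and returns the number the handler saw (-1 on error). */
int send_signal_to_self(int signum)
{
    struct sigaction action, previous;

    memset(&action, 0, sizeof action);
    action.sa_handler = record_signal;
    sigemptyset(&action.sa_mask);
    if (sigaction(signum, &action, &previous) != 0)
        return -1;

    last_signal = 0;
    raise(signum);                      /* handler runs before raise() returns */

    sigaction(signum, &previous, NULL); /* put the old disposition back */
    return (int)last_signal;
}
```

Without the handler, SIGUSR1 would follow its default action from the table above and terminate the process.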

System kernel

Mac OS X, iOS, and iPadOS are all built on Darwin. Darwin includes the open-source XNU hybrid kernel, which combines Mach and BSD: Mach forms the core of XNU, and the BSD layer on top of it provides the standardized (POSIX) API. The following figure shows the OS X kernel architecture.

Mach: A microkernel. A microkernel handles only the most essential tasks and leaves everything else, including file management, device drivers, and other services, to user-mode programs. These services run in separate address spaces, so they must use IPC to exchange messages. Mach's main responsibilities are thread and process management, virtual memory management, interprocess communication and messaging, and task scheduling. A monolithic kernel, by contrast, places all services in a single address space, where they can call one another directly.

BSD: A Unix derivative. Its main responsibilities are the Unix process model, the POSIX thread model and related primitives, file system access, device access, the network protocol stack, and Unix users and groups.

Sources of exceptions

Exceptions in iOS come mainly from four sources: hardware exceptions, software exceptions, Mach exceptions, and Unix signals. The relationship between them is shown below:

Mach exceptions

A Mach exception is a kernel-level exception. When the CPU raises a trap, the kernel's trap handler converts the hardware exception into a Mach exception and then offers it, in turn, to the exception handlers registered for the thread, the task, and the host. If none of them handles it, the task is terminated.
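The thread, task, host ordering can be modeled in a few lines of ordinary C. This is a simplified sketch, not kernel code: the `exc_handler_fn` type, the `exc_ports` structure, and the sample handlers are hypothetical stand-ins for Mach exception ports, and a handler returning false stands for "no successful reply".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler type: returns true if it handled the exception. */
typedef bool (*exc_handler_fn)(int exception);

enum delivery_result {
    HANDLED_BY_THREAD,
    HANDLED_BY_TASK,
    HANDLED_BY_HOST,
    TASK_TERMINATED
};

/* One optional handler per level, standing in for exception ports. */
struct exc_ports {
    exc_handler_fn thread;
    exc_handler_fn task;
    exc_handler_fn host;
};

/* Sample handlers for experimentation. */
bool always_handles(int exception) { (void)exception; return true; }
bool never_handles(int exception)  { (void)exception; return false; }

/* Offer the exception to each level in order; stop at the first success. */
enum delivery_result deliver_exception(const struct exc_ports *ports,
                                       int exception)
{
    if (ports->thread != NULL && ports->thread(exception))
        return HANDLED_BY_THREAD;
    if (ports->task != NULL && ports->task(exception))
        return HANDLED_BY_TASK;
    if (ports->host != NULL && ports->host(exception))
        return HANDLED_BY_HOST;
    return TASK_TERMINATED;   /* nobody handled it */
}
```

With `{ never_handles, always_handles, NULL }` the model reports `HANDLED_BY_TASK`, and with no willing handler at any level it reports `TASK_TERMINATED`.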

The kernel functions involved in Mach exception passing are shown below:

Based on the above, the corresponding functions can be found in Apple's open-source code. They are listed briefly below (for the full details, see here):

struct ppc_saved_state *trap(int trapno,
			     struct ppc_saved_state *ssp,
			     unsigned int dsisr,
			     unsigned int dar) {
	// ...
	doexception(exception, code, subcode);
	// ...
}

void doexception(
	    int exc,
	    int code,
	    int sub) {
	exception_data_type_t   codes[EXCEPTION_CODE_MAX];

	codes[0] = code;	
	codes[1] = sub;
	exception(exc, codes, 2);
}

// The current thread caught an exception.
// We make an up-call to the thread's exception server.
void exception(
	exception_type_t	exception,
	exception_data_t	code,
	mach_msg_type_number_t  codeCnt)
{
	thread_act_t		thr_act;
	task_t			task;
	host_priv_t		host_priv;
	struct exception_action *excp;
	mutex_t			*mutex;

	assert(exception != EXC_RPC_ALERT);

	if (exception == KERN_SUCCESS)
		panic("exception");

	/* Try to raise the exception at the activation (thread) level. */
	thr_act = current_act();
	mutex = mutex_addr(thr_act->lock);
	excp = &thr_act->exc_actions[exception];
	exception_deliver(exception, code, codeCnt, excp, mutex);

	/* Maybe the task level will handle it. */
	task = current_task();
	mutex = mutex_addr(task->lock);
	excp = &task->exc_actions[exception];
	exception_deliver(exception, code, codeCnt, excp, mutex);

	/* How about at the host level? */
	host_priv = host_priv_self();
	mutex = mutex_addr(host_priv->lock);
	excp = &host_priv->exc_actions[exception];
	exception_deliver(exception, code, codeCnt, excp, mutex);

	/* Nobody handled it, terminate the task. */
	// ...
	(void) task_terminate(task);
	thread_exception_return();
	/*NOTREACHED*/
}

// Make an upcall to the exception server provided.
void exception_deliver(
	exception_type_t	exception,
	exception_data_t	code,
	mach_msg_type_number_t  codeCnt,
	struct exception_action *excp,
	mutex_t			*mutex)
{
	// ...

	int behavior = excp->behavior;

	switch (behavior) {
	case EXCEPTION_STATE: {
		// EXCEPTION_STATE: send a 'catch_exception_raise_state' message
		// including the thread state.
		// ...
		kr = exception_raise_state(exc_port, exception,
					   code, codeCnt,
					   &flavor,
					   state, state_cnt,
					   state, &state_cnt);
		// ...
		return;
	}

	case EXCEPTION_DEFAULT:
		// EXCEPTION_DEFAULT: send a 'catch_exception_raise' message
		// including the thread identity.
		// ...
		kr = exception_raise(exc_port,
				retrieve_act_self_fast(a_self),
				retrieve_task_self_fast(a_self->task),
				exception,
				code, codeCnt);
		// ...
		return;

	case EXCEPTION_STATE_IDENTITY: {
		// EXCEPTION_STATE_IDENTITY: send a 'catch_exception_raise_state_identity'
		// message including the thread identity and state.
		// ...
		kr = exception_raise_state_identity(exc_port,
				retrieve_act_self_fast(a_self),
				retrieve_task_self_fast(a_self->task),
				exception,
				code, codeCnt,
				&flavor,
				state, state_cnt,
				state, &state_cnt);
		// ...
		return;
	}

	default:
		panic ("bad exception behavior!");
	}
}

Apple's documentation on how to catch Mach exceptions is sparse, and there is no dedicated API for it. The Mach kernel API reference is available here.

Signals

BSD is a Unix-like system derived from the Unix operating system. Its process model is built on Mach tasks, and it provides the POSIX application programming interface (see the Wikipedia article on XNU for details). For this reason, the Unix signal mechanism also works on Apple operating systems. The signal definitions can be found in <sys/signal.h>.

Mach exception -> Signal

On Apple's operating systems, Mach exceptions and Unix signals coexist. Mach acts as the core of the operating system and communicates with the BSD layer through an IPC mechanism. Accordingly, exceptions raised at the Mach level are sent to BSD as IPC exception messages, and BSD converts those messages into Unix signals for user space. The process is as follows:

  1. When the kernel boots, it runs bsdinit_task(), which in turn calls ux_handler_init().

void bsdinit_task(void)
{
	proc_t p = current_proc();
	struct uthread *ut;
	thread_t thread;

	process_name("init", p);

	ux_handler_init();

	thread = current_thread();
	(void) host_set_exception_ports(host_priv_self(),
					EXC_MASK_ALL & ~(EXC_MASK_RPC_ALERT),//pilotfish (shark) needs this port
					(mach_port_t) ux_exception_port,
					EXCEPTION_DEFAULT| MACH_EXCEPTION_CODES,
					0);

	ut = (uthread_t)get_bsdthread_info(thread);

	bsd_init_task = get_threadtask(thread);
	init_task_failure_data[0] = 0;

#if CONFIG_MACF
	mac_cred_label_associate_user(p->p_ucred);
	mac_task_label_update_cred (p->p_ucred, (struct task *) p->task);
#endif
	load_init_program(p);
	lock_trace = 1;
}

  2. ux_handler_init() creates a kernel thread that runs ux_handler().

void ux_handler_init(void)
{
	thread_t thread = THREAD_NULL;

	ux_exception_port = MACH_PORT_NULL;
	(void) kernel_thread_start((thread_continue_t)ux_handler, NULL, &thread);
	thread_deallocate(thread);
	proc_list_lock();
	if (ux_exception_port == MACH_PORT_NULL)  {
		(void)msleep(&ux_exception_port, proc_list_mlock, 0, "ux_handler_wait", 0);
	}
	proc_list_unlock();
}

  3. ux_handler() allocates a port set on which to receive Mach kernel messages, and then waits on it for Mach exception messages.

static void ux_handler(void)
{
    task_t		self = current_task();
    mach_port_name_t	exc_port_name;
    mach_port_name_t	exc_set_name;

    /*
     *	Allocate a port set that we will receive on.
     */
    if (mach_port_allocate(get_task_ipcspace(ux_handler_self), MACH_PORT_RIGHT_PORT_SET,  &exc_set_name) != MACH_MSG_SUCCESS)
	    panic("ux_handler: port_set_allocate failed");

    /*
     *	Allocate an exception port and use object_copyin to
     *	translate it to the global name.  Put it into the set.
     */
    if (mach_port_allocate(get_task_ipcspace(ux_handler_self), MACH_PORT_RIGHT_RECEIVE, &exc_port_name) != MACH_MSG_SUCCESS)
	panic("ux_handler: port_allocate failed");
    if (mach_port_move_member(get_task_ipcspace(ux_handler_self),
    			exc_port_name,  exc_set_name) != MACH_MSG_SUCCESS)
	panic("ux_handler: port_set_add failed");

    if (ipc_object_copyin(get_task_ipcspace(self), exc_port_name,
			MACH_MSG_TYPE_MAKE_SEND, 
			(void *) &ux_exception_port) != MACH_MSG_SUCCESS)
		panic("ux_handler: object_copyin(ux_exception_port) failed");

    proc_list_lock();
    thread_wakeup(&ux_exception_port);
    proc_list_unlock();

    /* Message handling loop. */

    for (;;) {
	struct rep_msg {
		mach_msg_header_t Head;
		NDR_record_t NDR;
		kern_return_t RetCode;
	} rep_msg;
	struct exc_msg {
		mach_msg_header_t Head;
		/* start of the kernel processed data */
		mach_msg_body_t msgh_body;
		mach_msg_port_descriptor_t thread;
		mach_msg_port_descriptor_t task;
		/* end of the kernel processed data */
		NDR_record_t NDR;
		exception_type_t exception;
		mach_msg_type_number_t codeCnt;
		mach_exception_data_t code;
		/* some times RCV_TO_LARGE probs */
		char pad[512];
	} exc_msg;
	mach_port_name_t	reply_port;
	kern_return_t	 result;

	exc_msg.Head.msgh_local_port = CAST_MACH_NAME_TO_PORT(exc_set_name);
	exc_msg.Head.msgh_size = sizeof (exc_msg);
#if 0
	result = mach_msg_receive(&exc_msg.Head);
#else
	result = mach_msg_receive(&exc_msg.Head, MACH_RCV_MSG,
			     sizeof (exc_msg), exc_set_name,
			     MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL,
			     0);
#endif
	if (result == MACH_MSG_SUCCESS) {
	    reply_port = CAST_MACH_PORT_TO_NAME(exc_msg.Head.msgh_remote_port);
            /// On receiving a message, call mach_exc_server()
	    if (mach_exc_server(&exc_msg.Head, &rep_msg.Head)) {
                /// The message was handled, so send the reply
		result = mach_msg_send(&rep_msg.Head, MACH_SEND_MSG,
			sizeof (rep_msg),MACH_MSG_TIMEOUT_NONE,MACH_PORT_NULL);
		if (reply_port != 0 && result != MACH_MSG_SUCCESS)
			mach_port_deallocate(get_task_ipcspace(ux_handler_self), reply_port);
	    }

	}
	else if (result == MACH_RCV_TOO_LARGE)
		/* ignore oversized messages */;
	else
		panic("exception_handler");
    }
}

  4. On receiving a Mach kernel message, ux_handler() calls mach_exc_server(), which dispatches to catch_mach_exception_raise(), catch_mach_exception_raise_state(), or catch_mach_exception_raise_state_identity(). Of these, catch_mach_exception_raise() performs the conversion of Mach exception messages to Unix signals. Unlike the other functions, the implementation of mach_exc_server() is not given directly in the source; see here.

  5. catch_mach_exception_raise() converts the Mach exception into a Unix signal, which is ultimately delivered to the corresponding thread.

kern_return_t catch_mach_exception_raise(
        __unused mach_port_t exception_port,
        mach_port_t thread,
        mach_port_t task,
        exception_type_t exception,
        mach_exception_data_t code,
        __unused mach_msg_type_number_t codeCnt
)
{
	// ...

	/* Convert exception to unix signal and code. */
	ux_exception(exception, code[0], code[1], &ux_signal, &ucode);

	// Declared in the elided code: struct uthread *ut; struct proc *p;
	ut = get_bsdthread_info(th_act);
	p = proc_findthread(th_act);
	// ...

	/* Send signal. */
	if (ux_signal != 0) {
		ut->uu_exception = exception;
		//ut->uu_code = code[0]; // filled in by threadsignal
		ut->uu_subcode = code[1];
		threadsignal(th_act, ux_signal, code[0]);
	}
	if (p != NULL)
		proc_rele(p);
	thread_deallocate(th_act);
	// ...
}

static void ux_exception(
		int			exception,
		mach_exception_code_t 	code,
		mach_exception_subcode_t subcode,
		int			*ux_signal,
		mach_exception_code_t 	*ux_code)
{
    /* Try machine-dependent translation first. */
    if (machine_exception(exception, code, subcode, ux_signal, ux_code))
	return;
	
    switch(exception) {

	case EXC_BAD_ACCESS:
		if (code == KERN_INVALID_ADDRESS)
			*ux_signal = SIGSEGV;
		else
			*ux_signal = SIGBUS;
		break;

	case EXC_BAD_INSTRUCTION:
	    *ux_signal = SIGILL;
	    break;

	case EXC_ARITHMETIC:
	    *ux_signal = SIGFPE;
	    break;

	case EXC_EMULATION:
	    *ux_signal = SIGEMT;
	    break;

	case EXC_SOFTWARE:
	    switch (code) {

	    case EXC_UNIX_BAD_SYSCALL:
		*ux_signal = SIGSYS;
		break;
	    case EXC_UNIX_BAD_PIPE:
		*ux_signal = SIGPIPE;
		break;
	    case EXC_UNIX_ABORT:
		*ux_signal = SIGABRT;
		break;
	    case EXC_SOFT_SIGNAL:
		*ux_signal = SIGKILL;
		break;
	    }
	    break;

	case EXC_BREAKPOINT:
	    *ux_signal = SIGTRAP;
	    break;
    }
}

The ux_exception() function shows how Mach exceptions map to Unix signals. The Mach exception types used on iOS are defined in <mach/exception_types.h>.

Hardware exceptions

Hardware exceptions comprise interrupts, traps, faults, and aborts (terminations). The process by which a hardware exception is triggered is as follows:

Software exception

An application-level exception in iOS is an NSException. If an NSException is not caught by a try-catch block, the program ends up calling abort(), which sends the SIGABRT signal to the application.

void abort(void) {
	// ...
	/*
	 * abort() should call pthread_kill to deliver a signal to the aborting
	 * thread. This helps gdb focus on the thread calling abort().
	 */
	if (__is_threaded) {
	    // ...
	    (void)pthread_kill(pthread_self(), SIGABRT);
	} else {
	    // ...
	    (void)kill(getpid(), SIGABRT);
	}
	// ...
}
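The link between abort() and SIGABRT can be observed from outside the aborting process. The following sketch, with the hypothetical function name `signal_raised_by_abort`, forks a child that calls abort() and lets the parent read the terminating signal from the wait status. It assumes a POSIX system where fork() is permitted, which holds on macOS and Linux but not inside an iOS app sandbox.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of the signal that killed a child process which
 * called abort(), or -1 if the child could not be created or observed. */
int signal_raised_by_abort(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        abort();                 /* child: terminated by SIGABRT */

    int status = 0;
    if (waitpid(child, &status, 0) != child)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Comparing the result with `SIGABRT` confirms that the child was terminated by that signal rather than by a normal exit.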

Exception handling

As the analysis above shows, hardware and software exceptions are both eventually converted to Unix signals, so handling signals covers most crashes. In addition, the system provides NSException, which carries more detailed information about a crash. With this in mind, only simple examples of capturing signals and NSException are given below.

Signal capture

When capturing signals, beware of overwriting existing handlers. Each signal has only one handler function, so when we install our own handler to collect crash information, we may replace a handler that another library has already installed, leaving that library unable to collect its own crash information.
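One common way to avoid the overwriting problem is to save the previous `struct sigaction` and forward the signal to it from the new handler. The plain C sketch below shows that idea in isolation, using SIGUSR1 so that it is safe to run. The names `earlier_handler`, `chaining_handler`, and `chained_delivery_count` are hypothetical, and `earlier_handler` merely stands in for a handler installed by some other library.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t earlier_calls = 0;
static volatile sig_atomic_t our_calls = 0;
static struct sigaction saved_action;   /* what was installed before us */

/* Stands in for a handler that another library installed first. */
static void earlier_handler(int signum)
{
    (void)signum;
    earlier_calls++;
}

/* Our handler: do our own work, then forward to the saved handler. */
static void chaining_handler(int signum, siginfo_t *info, void *context)
{
    our_calls++;
    if (saved_action.sa_flags & SA_SIGINFO) {
        if (saved_action.sa_sigaction != NULL)
            saved_action.sa_sigaction(signum, info, context);
    } else if (saved_action.sa_handler != SIG_DFL &&
               saved_action.sa_handler != SIG_IGN) {
        saved_action.sa_handler(signum);
    }
}

/* Installs both handlers, raises SIGUSR1 once, and returns how many
 * handler invocations were observed in total. */
int chained_delivery_count(void)
{
    struct sigaction first, second, original;

    memset(&first, 0, sizeof first);
    first.sa_handler = earlier_handler;
    sigemptyset(&first.sa_mask);
    sigaction(SIGUSR1, &first, &original);

    memset(&second, 0, sizeof second);
    second.sa_sigaction = chaining_handler;
    second.sa_flags = SA_SIGINFO;
    sigemptyset(&second.sa_mask);
    sigaction(SIGUSR1, &second, &saved_action);  /* keep the old action */

    earlier_calls = 0;
    our_calls = 0;
    raise(SIGUSR1);

    sigaction(SIGUSR1, &original, NULL);         /* clean up */
    return (int)(earlier_calls + our_calls);
}
```

A single raise reaches both handlers here, so the earlier collector still sees the signal even though our handler replaced it.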

The core code is as follows:

// Header files
#import <Foundation/Foundation.h>
#import <sys/signal.h>
#import <execinfo.h>

/// 1. Storage for the previously installed handlers
static struct sigaction *previous_signalHandlers = NULL;
/// 2. The signals we want to handle
static int signals[] = {SIGABRT,SIGBUS,SIGFPE,SIGILL,SIGPIPE,SIGSEGV,SIGSYS,SIGTRAP};
/// Forward declaration of our handler (defined in step 4)
void _handleSignal(int sigNum, siginfo_t *info, void *context);

/// 3. Register the handler
+ (BOOL)registerSignalHandler {
    // Initialize our sigaction
    struct sigaction action = { 0 };
    // Allocate the array that stores the old sigactions
    int count = sizeof(signals) / sizeof(int);
    if (previous_signalHandlers == NULL) {
        previous_signalHandlers = malloc(sizeof(struct sigaction) * count);
    }
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    // Bind our handler
    action.sa_sigaction = &_handleSignal;
    for (int i = 0; i < count; i++) {
        int signal = signals[i];
        // Install the new sigaction and save the old one
        int result = sigaction(signal, &action, &previous_signalHandlers[i]);
        if (result != 0) {
            // Installation failed: restore the handlers replaced so far and return NO
            NSLog(@"signal:%d,error:%s", signal, strerror(errno));
            for (int j = i - 1; j >= 0; j--) {
                sigaction(signals[j], &previous_signalHandlers[j], NULL);
            }
            return NO;
        }
    }
    return YES;
}

/// 4. Our handler
void _handleSignal(int sigNum, siginfo_t *info, void *context) {
    // TODO: our own processing. NSLog is not async-signal-safe; it is used here only for demonstration.
    NSLog(@"❌ caught crash signal: %d, call stack: %@", sigNum, [CrashSignals callStackSymbols]);
    // Find the index of sigNum in the signals array
    int index = -1, count = sizeof(signals) / sizeof(int);
    for (int i = 0; i < count; i++) {
        if (signals[i] == sigNum) {
            index = i;
            break;
        }
    }
    if (index == -1) return;
    // Fetch the old sigaction
    struct sigaction previous_action = previous_signalHandlers[index];
    if (previous_action.sa_handler == SIG_IGN) {
        // The old action ignored this signal (SIG_IGN); SIG_DFL would mean the default action
        return;
    }
    // Restore the old sigaction for this signal
    sigaction(sigNum, &previous_action, NULL);
    // Re-raise the signal so that it is handled by the previous action
    raise(sigNum);
}

/// 5. Function call stack
+ (NSArray *)callStackSymbols {
    // int backtrace(void **buffer, int size)
    // Stores the return addresses of the current stack frames in buffer;
    // each entry in the array is of type void *.
    void *backtrace_buffer[128];
    // backtrace() returns at most 128 frames here; a deeper stack is truncated
    int numberOfReturnAddresses = backtrace(backtrace_buffer, 128);
    // char **backtrace_symbols(void *const *buffer, int size)
    // Translates the addresses into an array of strings that describe them symbolically;
    // size is the number of addresses in buffer.
    char **symbols = backtrace_symbols(backtrace_buffer, numberOfReturnAddresses);
    if (symbols == NULL) return @[];
    // Collect the symbol information for each return address
    NSMutableArray *tempArray = [[NSMutableArray alloc] initWithCapacity:numberOfReturnAddresses];
    for (int i = 0; i < numberOfReturnAddresses; i++) {
        char *cstr_item = symbols[i];
        NSString *objc_str = [NSString stringWithUTF8String:cstr_item];
        [tempArray addObject:objc_str];
    }
    // backtrace_symbols() allocates the array with malloc, so the caller must free it
    free(symbols);
    return [tempArray copy];
}

NSException capture

The system provides an API for handling uncaught NSExceptions in iOS, so it is enough to follow that API. As with signals, however, a new registration replaces the previous one, so take care not to break other crash collectors in the project.

The core code is as follows:

/// 1. Static variable that holds the previously registered handler
static NSUncaughtExceptionHandler *previous_uncaughtExceptionHandler;
/// Forward declaration of our handler (defined below)
void _handleException(NSException *exception);

/// 2. Register our handler for application-level exceptions
+ (void)registerExceptionHandler {
    previous_uncaughtExceptionHandler = NSGetUncaughtExceptionHandler();
    NSSetUncaughtExceptionHandler(&_handleException);
}

/// 3. Our exception handler
void _handleException(NSException *exception) {
    // TODO: our own processing
    NSLog(@"✅ caught exception, call stack: %@", exception.callStackSymbols);

    // Pass the exception on to the previous handler
    if (previous_uncaughtExceptionHandler != NULL) {
        previous_uncaughtExceptionHandler(exception);
    }
    // Kill the process so that the SIGABRT that would follow is not also caught by the signal handler.
    // SIGKILL cannot be caught or ignored.
    kill(getpid(), SIGKILL);
}

Debug verification

Under Xcode's debugger, signals and NSExceptions are intercepted by the debugger and never reach our handlers. The author therefore ran the program in the simulator, stopped it to detach from the Xcode debugging environment, reopened the app in the simulator, and launched the Console app on the Mac. Tapping a button to trigger a crash and checking the simulator's log in the console then shows whether the capture worked. The crash was triggered with kill(getpid(), SIGBUS);.
