1. As an app developer, have you had the following experiences:

  • To make sure the app works correctly, you and the testers do a great deal of testing before submitting to the App Store or distributing through an enterprise channel. After several rounds of testing and fixes, the app runs stably on the test devices, everyone is confident about the release, and you move on to the next iteration. But once the release reaches a certain number of installs, users start reporting crashes!

  • At this point, collecting crash information from the live app is essential for analyzing and fixing those crashes. If your app has no strict data-privacy requirements, you can integrate a common third-party SDK such as Bugly, Tingyun, or Umeng. If your company has strict data-privacy requirements, you will need to build your own crash log collection system.

2. Analysis of crash causes

Despite extensive testing during development, the following problems are hard to avoid across different devices and system versions:

  • Array out of bounds, inserting nil
  • unrecognized selector
  • NSString crash
  • NSNotification crash
  • KVO crash
  • Wild pointer
  • Threading issues
  • EXC_BAD_ACCESS
  • ...

iOS application crashes are generally caused by Mach exceptions and Objective-C exceptions (NSException). The crash capture flow is as follows:

3. NSException exceptions

Common NSException crashes include:

  • unrecognized selector crash
  • KVO crash
  • NSNotification Crash (Zombie object)
  • NSTimer crash
  • Container crash (array out of bounds, insert nil, etc)
  • NSString crash (string operation crash)
  • Bad Access Crash (wild pointer)
  • UI not on Main Thread Crash
  • ...
3.1. Catch exceptions

For application-level exceptions, first back up the existing handler with the system-provided NSGetUncaughtExceptionHandler, then register a custom handler with NSSetUncaughtExceptionHandler to catch the exceptions:

//Backing up original exception handler
g_previousUncaughtExceptionHandler = NSGetUncaughtExceptionHandler();

//Setting new handler
NSSetUncaughtExceptionHandler(&handleUncaughtException);
3.2. Handling exceptions
static void handleUncaughtException(NSException* exception) {
    handleException(exception, false);
}
3.3. Compatibility handling

Normally, an app should avoid registering an exception handler through NSSetUncaughtExceptionHandler more than once. However, common third-party frameworks such as Bugly, Tingyun, and Umeng each ship their own crash handling, so multiple NSSetUncaughtExceptionHandler registrations may be unavoidable. To stay compatible: before installing your own handler with NSSetUncaughtExceptionHandler, fetch the existing handler with NSGetUncaughtExceptionHandler and back it up; after your handler has finished processing the exception, it must call the backed-up handler so that the handler chain is not broken.

// After running our own exception handling, the backed-up handler must be called
if (g_previousUncaughtExceptionHandler != NULL) {
    //Calling original exception handler
    g_previousUncaughtExceptionHandler(exception);
}

4. Mach exceptions

4.1. What is Mach?

As shown in the figure below, Mach is the lowest layer of the system: it is the microkernel core of XNU and provides some of the operating system's most critical functions. Mach manages processor resources such as CPU time and memory, handles scheduling, enforces memory protection, and implements a message-centric infrastructure for untyped local and remote interprocess communication.

Mach exceptions are the lowest-level, kernel-level exceptions, and they can be handled within a process or across processes. Exception information is passed between processes through Mach IPC (inter-process communication) ports. Exception handlers can be registered for a specific thread, a specific task, or even the entire host by any process that holds the required rights on the target's Mach port, and the kernel consults these handlers in order (thread first, then task, then host). Every thread, task, and host has an array of exception ports for receiving exceptions. Part of the Mach API is exposed to user space, so developers can call it directly to set exception ports on a thread, task, or host and catch Mach exceptions. The relevant APIs are:

  • task_get_exception_ports: gets the task's current exception ports
  • mach_port_allocate: creates a new port (with a receive right) to serve as the task's exception port
  • mach_port_insert_right: inserts a send right for that port, which task_set_exception_ports requires
  • task_set_exception_ports: sets the new exception port for the task
  • mach_msg(): receives exception messages
4.2. Mach exception handling

Create a new exception handler thread to listen for Mach exceptions and handle exception messages:

4.2.1 Port Structure
static struct
{
    exception_mask_t masks[EXC_TYPES_COUNT];
    exception_handler_t ports[EXC_TYPES_COUNT];
    exception_behavior_t behaviors[EXC_TYPES_COUNT];
    thread_state_flavor_t flavors[EXC_TYPES_COUNT];
    mach_msg_type_number_t count;
} g_previousExceptionPorts;
4.2.2 Backing up the current exception ports
//Backing up original exception ports
kr = task_get_exception_ports(thisTask,
                              mask,
                              g_previousExceptionPorts.masks,
                              &g_previousExceptionPorts.count,
                              g_previousExceptionPorts.ports,
                              g_previousExceptionPorts.behaviors,
                              g_previousExceptionPorts.flavors);
4.2.3 Creating a new exception port and installing it as the task's exception port
//Allocating new port with receive rights
kr = mach_port_allocate(thisTask,
                        MACH_PORT_RIGHT_RECEIVE,
                        &g_exceptionPort);
...
//Adding send rights to port
kr = mach_port_insert_right(thisTask,
                            g_exceptionPort,
                            g_exceptionPort,
                            MACH_MSG_TYPE_MAKE_SEND);
...
//Installing port as exception handler
kr = task_set_exception_ports(thisTask,
                              mask,
                              g_exceptionPort,
                              (int)(EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES),
                              THREAD_STATE_NONE);
4.2.4 Creating a Listener Thread to start listening
//Creating primary exception thread
error = pthread_create(&g_primaryPThread,
                       &attr,
                       &handleExceptions,
                       kThreadPrimary);
...
pthread_attr_destroy(&attr);
g_primaryMachThread = pthread_mach_thread_np(g_primaryPThread);
ksmc_addReservedThread(g_primaryMachThread);
4.2.5 Receiving and Handling Exception Messages
/** Our exception handler thread routine.
* Wait for an exception message, uninstall our exception port, record the
* exception information, and write a report.
*/
static void* handleExceptions(void* const userData)
{
    ......
}
4.2.6 Restoring the original Port
// Reinstall old exception ports.
for(mach_msg_type_number_t i = 0; i < g_previousExceptionPorts.count; i++)
{
    //Restoring port index i
    kr = task_set_exception_ports(thisTask,
                                  g_previousExceptionPorts.masks[i],
                                  g_previousExceptionPorts.ports[i],
                                  g_previousExceptionPorts.behaviors[i],
                                  g_previousExceptionPorts.flavors[i]);
}

5. Unix signals

A signal is a software-level simulation of an interrupt and a mechanism for asynchronous communication: a process cannot know in advance when a signal will arrive. The BSD layer builds its UNIX signal handling on top of Mach exceptions: ux_exception converts a Mach exception into the corresponding UNIX signal at the host level, and threadsignal delivers it to the faulting thread. When dealing with both Mach exceptions and Unix signals, catch Mach exceptions first, because Mach exception handling runs before Unix signal handling; if handling the Mach exception makes the program exit, the Unix signal never reaches the thread.

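5.1 Common signal types
SIGABRT, // abort() was called, for example after an uncaught NSException
SIGFPE,  // erroneous arithmetic operation
SIGILL,  // illegal instruction
SIGPIPE, // write to a pipe or socket with no reader
SIGBUS,  // bus error (invalid memory access)
SIGSEGV, // invalid memory reference; corresponds to Mach EXC_BAD_ACCESS
SIGSYS,  // bad system call
SIGTRAP, // trace/breakpoint trap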
5.2. Catching signals
5.2.1 sigaction function
struct sigaction {
    void (*sa_handler)(int);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_sigaction)(int, siginfo_t *, void *);
};
int sigaction(int sig, const struct sigaction *act, struct sigaction *oact);

Parameter Description:

  • sig: the signal to operate on
  • act: the new action (handler) to install for the signal
  • oact: receives the previous action for the signal
  • Return value: 0 indicates success, -1 indicates that an error occurred
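
The parameters described above can be exercised with a minimal sketch. It assumes a plain POSIX environment and uses SIGUSR1 so that it is safe to try out; the names onSignal, installHandler, restoreHandler, g_lastSignal, and g_oldAction are hypothetical and do not come from any framework mentioned in this article.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction/siginfo_t under strict C modes */
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names used only for this sketch. */
static volatile sig_atomic_t g_lastSignal = 0;
static struct sigaction g_oldAction;

/* Three-argument handler form, selected by the SA_SIGINFO flag. */
static void onSignal(int sigNum, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    g_lastSignal = sigNum;
}

/* Installs onSignal for sig and keeps the previous action in g_oldAction
 * (the oact parameter). Returns 0 on success and -1 on error, like sigaction. */
static int installHandler(int sig)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO;
    action.sa_sigaction = onSignal;
    return sigaction(sig, &action, &g_oldAction);
}

/* Puts the previously saved action back. */
static int restoreHandler(int sig)
{
    return sigaction(sig, &g_oldAction, NULL);
}
```

Calling installHandler(SIGUSR1) followed by raise(SIGUSR1) should leave SIGUSR1 in g_lastSignal; restoreHandler(SIGUSR1) then reinstates whatever action was in place before.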
5.2.2 Set a new handler for the signal (back up the original handler)
action.sa_sigaction = &handleSignal;
for(int i = 0; i < fatalSignalsCount; i++)
{
    //Assigning handler for signal fatalSignals[i]
    if(sigaction(fatalSignals[i], &action, &g_previousSignalHandlers[i]) != 0)
    {
        char sigNameBuff[30];
        const char* sigName = kssignal_signalName(fatalSignals[i]);
        if(sigName == NULL)
        {
            snprintf(sigNameBuff, sizeof(sigNameBuff), "%d", fatalSignals[i]);
            sigName = sigNameBuff;
        }
        // Try to reverse the damage
        for(i--; i >= 0; i--)
        {
            sigaction(fatalSignals[i], &g_previousSignalHandlers[i], NULL);
        }
        goto failed;
    }
}
5.2.3 Handling abnormal signals
static void handleSignal(int sigNum, siginfo_t* signalInfo, void* userContext)
{
    //Trapped signal sigNum
    if(g_isEnabled)
    {
        thread_act_array_t threads = NULL;
        mach_msg_type_number_t numThreads = 0;
        ksmc_suspendEnvironment(&threads, &numThreads);
        ...
        kscm_handleException(crashContext);
        ksmc_resumeEnvironment(threads, numThreads);
    }
    //Re-raising signal for regular handlers to catch
    raise(sigNum);
}
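
The re-raise step can be illustrated in isolation. The following is a sketch under stated assumptions, not the framework's actual implementation: it assumes that the saved action is reinstalled before the signal is raised again, and it uses SIGUSR1 plus a stand-in previousHandler in place of a handler registered by another SDK. All names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction/siginfo_t under strict C modes */
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names used only for this sketch. */
static volatile sig_atomic_t g_ourHandlerRan = 0;
static volatile sig_atomic_t g_previousHandlerRan = 0;
static struct sigaction g_previousAction;

/* Stands in for a handler that another SDK installed earlier. */
static void previousHandler(int sigNum)
{
    (void)sigNum;
    g_previousHandlerRan = 1;
}

static void crashHandler(int sigNum, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    g_ourHandlerRan = 1; /* a real handler would record the crash here */

    /* Hand the signal back: reinstall the saved action, then re-raise.
     * The signal is blocked while this handler runs, so the re-raised
     * signal is delivered to the restored handler once we return. */
    sigaction(sigNum, &g_previousAction, NULL);
    raise(sigNum);
}

static int installPreviousHandler(int sig)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    sigemptyset(&action.sa_mask);
    action.sa_handler = previousHandler;
    return sigaction(sig, &action, NULL);
}

/* Installs crashHandler and backs up whatever was installed before it. */
static int installCrashHandler(int sig)
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO;
    action.sa_sigaction = crashHandler;
    return sigaction(sig, &action, &g_previousAction);
}
```

After installPreviousHandler(SIGUSR1) and installCrashHandler(SIGUSR1), a single raise(SIGUSR1) should run both handlers in turn, which is the property that keeps several crash SDKs working side by side.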

Q: Why is EXC_CRASH not caught at the Mach exception stage, while its corresponding SIGABRT is caught as a Unix signal instead?

A: The open source framework PLCrashReporter explains:

/* We still need to use signal handlers to catch SIGABRT in-process. The kernel sends an EXC_CRASH mach exception
* to denote SIGABRT termination. In that case, catching the Mach exception in-process leads to process deadlock
* in an uninterruptable wait. Thus, we fall back on BSD signal handlers for SIGABRT, and do not register for
* EXC_CRASH. */

6. Symbol table parsing

The steps above give us the crash log when the application crashes. If the application was built for production, however, the stack frames contain only function addresses, so we need to parse them against the symbol table to restore the symbols in the stack.

6.1 The symbol table
6.1.1 What is a symbol table?

A symbol table maps memory addresses to function names, file names, and line numbers. Each entry has the following form: <start address> <end address> <function> [<file name:line number>]
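
To make the entry format concrete, here is a small sketch of an address lookup over such a table. It assumes the entries are sorted by start address, do not overlap, and treat the end address as exclusive; none of this is stated in the article. The SymbolEntry type, lookupSymbol, and the sample data are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One symbol table entry: <start address> <end address> <function>. */
typedef struct {
    uint64_t start;   /* first address of the function */
    uint64_t end;     /* one past the last address (assumed exclusive) */
    const char *name; /* function name */
} SymbolEntry;

/* Made-up sample data, sorted by start address and non-overlapping. */
static const SymbolEntry kSampleTable[] = {
    {0x1000, 0x1040, "main"},
    {0x1040, 0x10c0, "-[ViewController viewDidLoad]"},
    {0x10c0, 0x1200, "-[ViewController crashTest:]"},
};
static const size_t kSampleCount = sizeof(kSampleTable) / sizeof(kSampleTable[0]);

/* Binary search for the entry whose [start, end) range contains addr.
 * Returns the function name, or NULL when no entry covers the address. */
static const char *lookupSymbol(const SymbolEntry *table, size_t count, uint64_t addr)
{
    size_t lo = 0;
    size_t hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < table[mid].start) {
            hi = mid;
        } else if (addr >= table[mid].end) {
            lo = mid + 1;
        } else {
            return table[mid].name;
        }
    }
    return NULL;
}
```

With the sample data, lookupSymbol(kSampleTable, kSampleCount, 0x10c4) resolves to "-[ViewController crashTest:]", while an address outside every range yields NULL.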

6.1.2 Why Do I Configure a Symbol Table?

To locate the code where an app crash occurred quickly and accurately, Bugly uses the symbol table to parse and restore the program stack at the time of the crash.

Here’s an example:

6.1.3 Obtaining application symbol Table files
  • By default, an Xcode project has the following settings, so that a production build generates symbol table files:

  • Open Xcode → Window → Organizer → select the archive to publish → right-click and choose Show in Finder to locate the *.xcarchive file → right-click and choose Show Package Contents:

  • dSYMs → *.app.dSYM → right-click and choose Show Package Contents → Contents → Resources → DWARF → symbol table file:

  • Get the UUID of the symbol table file: dwarfdump --uuid <symbol table file>:

  • View the symbol table contents: dwarfdump --arch arm64 --debug-pubnames Demo.app.dSYM:

Output (excerpt):

Demo.app.dSYM/Contents/Resources/DWARF/Demo(arm64): file format Mach-O arm64

.debug_pubnames contents:
length = 0x00000b57 version = 0x0002 unit_offset = 0x000a29a6 unit_size = 0x00001cfc
Offset Name
0x00000275 "-[EMGroupInfoViewController initWithGroupId:]"
0x000002c0 "-[EMGroupInfoViewController viewDidLoad]"
0x000002f7 "-[EMGroupInfoViewController reloadInfo]"
0x0000032e "-[EMGroupInfoViewController dealloc]"
0x00000365 "-[EMGroupInfoViewController _setupSubviews]"
0x0000039c "-[EMGroupInfoViewController numberOfSectionsInTableView:]"
0x000003e3 "-[EMGroupInfoViewController tableView:numberOfRowsInSection:]"
0x0000044e "-[EMGroupInfoViewController tableView:cellForRowAtIndexPath:]"
0x000004d9 "-[EMGroupInfoViewController groupOwnerDidUpdate:newOwner:oldOwner:]"
0x00000540 "-[EMGroupInfoViewController tableView:heightForHeaderInSection:]"
0x00000593 "-[EMGroupInfoViewController tableView:heightForFooterInSection:]"
0x000005e6 "-[EMGroupInfoViewController tableView:didSelectRowAtIndexPath:]"
0x00000717 "-[EMGroupInfoViewController multiDevicesGroupEventDidReceive:groupId:ext:]"
0x00000782 "-[EMGroupInfoViewController _resetGroup:]"
0x0000080b "-[EMGroupInfoViewController _fetchGroupWithId:isShowHUD:]"
0x00000876 "__57-[EMGroupInfoViewController _fetchGroupWithId:isShowHUD:]_block_invoke"
0x000008d6 "__copy_helper_block_e8_32w"
0x000008ff "__destroy_helper_block_e8_32w"
0x0000091e "-[EMGroupInfoViewController tableViewDidTriggerHeaderRefresh]"
0x00000956 "-[EMGroupInfoViewController handleGroupInfoUpdated:]"
0x000009b0 "-[EMGroupInfoViewController groupAnnouncementAction]"
0x000009f9 "__52-[EMGroupInfoViewController groupAnnouncementAction]_block_invoke"
0x00000a9b "__52-[EMGroupInfoViewController groupAnnouncementAction]_block_invoke_2"
0x00000aff "__52-[EMGroupInfoViewController groupAnnouncementAction]_block_invoke_3"
6.2 Symbol resolution with atos

atos resolves a runtime memory address to the corresponding file name, line number, and function name.

6.2.1 Common Crash logs
Incident Identifier: B7DEEC95-FCCA-42A6-ABA3-77AD8B743F6C
CrashReporter Key: 66c23a1c1d6377ff1cc81121d3ac82623457b377
Hardware Model: iPhone10,2
Process: Demo [35768]
Path: /private/var/containers/Bundle/Application/140CCFD5-86C9-4CE8-A6EC-08232859CF79/Demo.app/Demo
Identifier: com.test.demo
Version: 2 (1.4.0)
Code Type: ARM-64 (Native)
Role: Foreground
Parent Process: launchd [1]

Date/Time: 2020-11-23 16:23:23.444 +0800
Launch Time: 2020-11-23 16:22:23.4110 +0800
OS Version: iPhone OS 14.2 (18B92)
Release Type: User
Baseband Version: 6.02.01
Report Version: 104

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Triggered by Thread: 0

Thread 0 name: Dispatch queue: com.apple.main-thread
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x00000001c386d84c 0x1c3846000 + 161868
1 libsystem_pthread.dylib 0x00000001dedc99e8 0x1dedbf000 + 43496
2 libsystem_c.dylib 0x00000001a13d98f4 0x1a1366000 + 473332
3 libc++abi.dylib 0x00000001ac1dacc8 0x1ac1c8000 + 77000
4 libc++abi.dylib 0x00000001ac1ccca0 0x1ac1c8000 + 19616
5 libobjc.A.dylib 0x00000001ac0dee04 0x1ac0d8000 + 28164
6 Demo 0x0000000100c8fd0c 0x1006b8000 + 6126860
7 Demo 0x00000001014bbb44 0x1006b8000 + 14695236
8 libc++abi.dylib 0x00000001ac1da154 0x1ac1c8000 + 74068
9 libc++abi.dylib 0x00000001ac1dce68 0x1ac1c8000 + 85608
10 libobjc.A.dylib 0x00000001ac0ded04 0x1ac0d8000 + 27908
11 CoreFoundation 0x00000001986a6c7c 0x198613000 + 605308
12 GraphicsServices 0x00000001ae9c9598 0x1ae9c6000 + 13720
13 UIKitCore 0x000000019af90638 0x19a464000 + 11716152
14 UIKitCore 0x000000019af95bb8 0x19a464000 + 11738040
15 Demo 0x0000000100704aac 0x1006b8000 + 314028
16 libdyld.dylib 0x0000000198385588 0x198384000 + 5512

The key stack frames are not symbolicated:

6 Demo 0x0000000100c8fd0c 0x1006b8000 + 6126860
7 Demo 0x00000001014bbb44 0x1006b8000 + 14695236

Custom log formatting flow:

6.2.2 Parsing Crash Logs

Log information (excerpt):

UUID of the log (the log can be parsed correctly only when this UUID matches the UUID of the symbol table file):
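
The UUID check can be sketched as a comparison that ignores hyphens and letter case. This assumes the two sources may print the same UUID in different styles (for example, upper case with hyphens in one and lower case without hyphens in the other), which should be verified against your own tool output. The function name uuidMatches is hypothetical, and the UUIDs in the usage note are made-up sample values.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Returns true when two UUID strings are equal after ignoring hyphens
 * and letter case. Hypothetical helper for this sketch only. */
static bool uuidMatches(const char *a, const char *b)
{
    for (;;) {
        /* Skip separators on both sides. */
        while (*a == '-') a++;
        while (*b == '-') b++;

        /* Both strings must end at the same point. */
        if (*a == '\0' || *b == '\0') {
            return *a == *b;
        }
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        a++;
        b++;
    }
}
```

For example, uuidMatches("1A2B3C4D-5E6F-3A7B-8C9D-0E1F2A3B4C5D", "1a2b3c4d5e6f3a7b8c9d0e1f2a3b4c5d") is true, while changing any hex digit on either side makes it false.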

Crash stack (excerpt):

"object_addr": 4331880448 → 0x102334000
"instruction_addr": 4331909492 → 0x10233B174

Run the following command to resolve the address: atos -arch arm64 -o MobApmDemo -l 0x102334000 0x10233B174
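
The decimal-to-hex conversion and the relationship between the two addresses can be checked with a short sketch. It assumes that object_addr is the image load address (the value passed to -l) and instruction_addr is the absolute address of the frame; the helper names imageOffset and formatAtosArgs are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Offset of an instruction inside its image: the absolute instruction
 * address minus the image load address. */
static uint64_t imageOffset(uint64_t loadAddr, uint64_t instructionAddr)
{
    assert(instructionAddr >= loadAddr);
    return instructionAddr - loadAddr;
}

/* Formats the "-l <load address> <instruction address>" part of an atos
 * command line, printing both values in hexadecimal. Returns the number
 * of characters that snprintf would have written. */
static int formatAtosArgs(char *buf, size_t size,
                          uint64_t loadAddr, uint64_t instructionAddr)
{
    return snprintf(buf, size, "-l 0x%" PRIx64 " 0x%" PRIx64,
                    loadAddr, instructionAddr);
}
```

With the values from the log excerpt, formatAtosArgs produces "-l 0x102334000 0x10233b174", and imageOffset gives 0x7174. The same arithmetic explains the unsymbolicated frames shown earlier: 0x1006b8000 + 6126860 equals 0x100c8fd0c.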

6.2.3 Limitations of atos

atos is a command-line tool that ships with Xcode. It runs only on macOS and cannot be used on Linux or Windows. atos also resolves symbols one address at a time, which is inefficient.

atosl is an implementation of atos for Linux developed by Facebook, but in practice atosl does not support symbol resolution for the arm64 architecture.

6.3 Symbol resolution with symbolicatecrash

symbolicatecrash is an analysis tool bundled with Xcode for symbolicating stack information in batches. It wraps atos and can parse an entire crash file in one pass. It also resolves system symbols, which makes it convenient to use.

6.3.1 Locating the symbolicatecrash tool

Use the find command to locate the symbolicatecrash tool:

find /Applications/Xcode.app -name symbolicatecrash -type f
// /Applications/Xcode.app/Contents/SharedFrameworks/DVTFoundation.framework/Versions/A/Resources/symbolicatecrash
6.3.2 Parsing a .crash file

Run the following command:

export DEVELOPER_DIR="/Applications/Xcode.app/Contents/Developer"
./symbolicatecrash xxx.crash Demo.app.dSYM/Contents/Resources/DWARF/Demo -o out.crash

The resulting out.crash file resolves all stack symbols:

Incident Identifier: 2B157C99-2399-405A-8BB0-A0FD6253FAB6
CrashReporter Key: b3e9ef79355f1c42b61d81edfd1f97c43cab232c
Hardware Model: iPhone10,2
Process: Demo [41201]
Path: /private/var/containers/Bundle/Application/EC97E67A-F9C4-49EC-A3A6-F083FF742048/Demo.app/Demo
Version: 126 (1.1.3)
Code Type: ARM-64
Parent Process: ? [1]

Date/Time: 2020-11-05 11:56:34.687 +0800
OS Version: iOS 14.1 (18A8395)
Report Version: 104

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x00000000 at 0x0000000000000000
Crashed Thread: 0

Application Specific Information:
*** Terminating app due to uncaught exception 'NSUnknownKeyException', reason: '[<NSObject 0x2836b8600> setValue:forUndefinedKey:]: this class is not key value coding-compliant for the key key.'

Thread 0 Crashed:
0 CoreFoundation 0x000000019eca9114 __exceptionPreprocess + 216
1 libobjc.A.dylib 0x00000001b2545cb4 objc_exception_throw + 56
2 CoreFoundation 0x000000019ebba4e0 -[NSObject+ 177376 (NSKindOfAdditions) isNSSet__] + 0
3 Foundation 0x000000019fe54728 -[NSObject+ 182056 (NSKeyValueCoding) setValue:forKey:] + 312
4 Demo 0x000000010475623c -[ViewController crashTest:] + 25148 (ViewController.m:104)
5 UIKitCore 0x00000001a14f291c -[UIApplication sendAction:to:from:forEvent:] + 96
6 Demo 0x0000000104908afc __nbsEventHookSendAction_block_invoke + 1232
7 UIKitCore 0x00000001a0e8a5bc -[UIControl sendAction:to:forEvent:] + 240
8 UIKitCore 0x00000001a0e8a900 -[UIControl _sendActionsForEvents:withEvent:] + 352
9 UIKitCore 0x00000001a0e89238 -[UIControl touchesEnded:withEvent:] + 532
10 UIKitCore 0x00000001a152d7f0 -[UIWindow _sendTouchesForEvent:] + 1244
11 UIKitCore 0x00000001a152f118 -[UIWindow sendEvent:] + 3824
12 UIKitCore 0x00000001a150a4fc -[UIApplication sendEvent:] + 744
13 UIKitCore 0x00000001a158c76c __dispatchPreprocessedEventFromEventQueue + 1032
14 UIKitCore 0x00000001a1590f0c __processEventQueue + 6440
15 UIKitCore 0x00000001a15881cc __eventFetcherSourceCallback + 156
16 CoreFoundation 0x000000019ec29240 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
17 CoreFoundation 0x000000019ec29140 __CFRunLoopDoSource0 + 204
18 CoreFoundation 0x000000019ec28488 __CFRunLoopDoSources0 + 256
19 CoreFoundation 0x000000019ec22a40 __CFRunLoopRun + 776
20 CoreFoundation 0x000000019ec22200 CFRunLoopRunSpecific + 572
21 GraphicsServices 0x00000001b4d9f598 GSEventRunModal + 160
22 UIKitCore 0x00000001a14ebbcc -[UIApplication _run] + 1052
23 UIKitCore 0x00000001a14f11a0 UIApplicationMain + 164
24 Demo 0x000000010475756c main + 30060 (main.m:18)
25 libdyld.dylib 0x000000019e901588 start + 4