Preface

Hi Coder, I’m CoderStar!

Until now, my knowledge of symbolization had been fragmentary and unsystematic. Recently I saw many people publish articles about iOS symbolization, so I wrote this article to sort out what I know about the topic. I am still at a beginner's stage with symbolization, and much of the material here comes from the references listed at the end. Thanks to their authors for sharing.

In plain terms, symbolization means converting machine-level symbols into something human-readable. In this article it specifically means converting the hexadecimal addresses in iOS or macOS exception information into code locations that developers can read, so that the source of the exception can be located and fixed quickly.

The degree of symbolization is generally divided into three types:

  • Not symbolized
  • Partially symbolized
  • Fully symbolized

In general, symbolization requires the following three parts:

  • Crash logs
  • dSYM file
  • Symbolization tools

Crash logs

Crash logs can be obtained from several sources, including the following:

  • Through Settings -> Privacy -> Analytics & Improvements -> Analytics Data, then export. This area holds exception information for the whole phone, including reports generated by the Jetsam mechanism, in .ips format. Note that this location may not contain the exception logs of every app (at least I did not find them during my test);
  • Direct export from a test device: Xcode -> Window -> Devices and Simulators -> View Device Logs (select the target device in the left sidebar). The exported file is in .crash format. In fact, the logs read this way come from the first source above;
  • Through Xcode -> Organizer -> Crashes, which provides crash logs in .xccrashpoint format. Open the package contents and the file inside is still a .crash log file;
  • Capture and report exceptions in code, using third-party tools such as Bugly and Umeng, or self-developed tools.

In fact, the results of the above methods fall roughly into two kinds:

  • Crash Log: a complete Crash Log file;
  • Exception information: Only critical error information, such as the stack, is reported.

Either way, the most important piece of information for us is the error stack. When exceptions have to be caught in code, the collection approaches include the following, and common crash collection frameworks use them in combination.

  • Mach exception: mach_port_allocate -> mach_port_insert_right -> task_set_exception_ports -> loop waiting for messages
  • Unix signal: signal
  • Application-level exception (NSException): NSSetUncaughtExceptionHandler

In fact, a Unix signal is itself converted from the underlying Mach exception before being delivered to the upper layer.

Note that NSSetUncaughtExceptionHandler can only catch Objective-C exceptions; Swift runtime errors are not captured by it. Usually, an exception that is caught as an NSException can also be caught as a corresponding signal. This is the general case, and special cases cannot be ruled out.

The following is a simple exception capture example. Real-world capture is much more complex, including obtaining the slide address, passing exceptions on to other handlers, handling Mach exceptions, and so on.

For detailed Crash collection code, see the following open source exception catching tools:

  • KSCrash (source code recommended for reading)
  • plcrashreporter
  • CrashKit

NSSetUncaughtExceptionHandler

NSSetUncaughtExceptionHandler(CrashHandler.exceptionHandler)

private static let exceptionHandler: @convention(c) (NSException) -> Void = { exception in
  /// Exception call stack
  let arr = exception.callStackSymbols
  /// Exception reason
  let reason = exception.reason
  /// Exception name
  let name = exception.name.rawValue
}

Exception type:

  • NSInvalidArgumentException: invalid argument, for example adding a nil object to an NSDictionary
  • NSRangeException: index out of bounds
  • NSGenericException: for example, modifying a collection while enumerating it
  • NSInternalInconsistencyException: internal inconsistency, for example using an NSDictionary as an NSMutableDictionary
  • NSFileHandleOperationException: exceptions raised while handling files, most commonly because of insufficient storage space, for example when an app frequently saves documents or handles large cached data
  • NSMallocException: insufficient memory
  • KVO crash: removing an observer repeatedly
  • unrecognized selector sent to instance

In response to the exceptions above, NetEase proposed the Baymax Health System in 2017, an automatic crash-repair system for running iOS apps. There is also a corresponding open source framework, JJException, which acts as a safety cushion.

signal

// In Swift, most runtime errors arrive as SIGTRAP; an uncaught NSException in OC typically ends with SIGABRT.
signal(SIGTRAP, CrashHandler.signalHandler)
signal(SIGABRT, CrashHandler.signalHandler)
signal(SIGSEGV, CrashHandler.signalHandler)
signal(SIGBUS, CrashHandler.signalHandler)
signal(SIGILL, CrashHandler.signalHandler)

private static let signalHandler: @convention(c) (Int32) -> Void = { signal in
  /// Exception call stack
  for symbol in Thread.callStackSymbols {
    // Record or report each stack symbol here
  }
  exit(signal)
}

SDK exception passing

In general, we don't want to integrate multiple crash log collection services into one app, but sometimes there is no choice. In those cases we really want each SDK to be a good citizen: it should not swallow the crash without passing it on.

If several parties register exception handlers through NSSetUncaughtExceptionHandler at the same time, the cooperative approach is this: before registering, use NSGetUncaughtExceptionHandler to fetch and back up the handler that others registered earlier; then, once your own handler has finished, deliberately call the backed-up handler so that the exception is passed on. If you simply overwrite the handler and do not pass the exception on, the crash logs written by the previously registered collection service will lose information such as Last Exception Backtrace, because that service never receives the NSException. (P.S. The system Crash Reporter on iOS is not affected.)

For the detailed code, see KSCrashMonitor_NSException, which keeps a g_previousUncaughtExceptionHandler variable.

Signals are handled similarly: KSCrashMonitor_Signal keeps a g_previousSignalHandlers variable.

dSYM file

DWARF

DWARF (Debugging With Attributed Record Formats) is the standard format for storing and processing debugging information in file formats such as ELF and Mach-O. The data inside is highly compressed, and readable information can be extracted with commands such as dwarfdump and otool. When a DWARF file is opened with MachOView, the outer container is still Mach-O. The debug_info and debug_line sections store the main debugging information.

ELF and Mach-O are the file formats used by Linux and macOS, respectively, to store binaries, executables, object code, and shared libraries.

dSYM

On the iOS platform, a dSYM file is an object file that carries debugging information. It stores file names, method names, line numbers, and so on, which map one-to-one to the hexadecimal function addresses in the executable. By analyzing a crash file against it, you can pinpoint exactly where the crash occurred.

Build Settings -> Debug Information Format lets you set the format of the debugging information:

  • DWARF
  • DWARF with dSYM File

In general, we use DWARF in the Debug configuration to facilitate debugging, and the second option in the Release configuration. With the second option, the symbol table is removed from the binary and stored in the dSYM file, which you can find inside the .xcarchive that Xcode produces (open its package contents). In addition, if bitcode is enabled, Apple compiles and optimizes the app a second time, so the final dSYM has to be downloaded manually from App Store Connect.

dSYM files are essential to the symbolization process, so it is important to back up the dSYM files after every release.

There are other ways to analyze a crash without a dSYM file, but none is as quick and easy as using the dSYM file.

Symbolization process

Get the key lines for the app from the crash log

Whether we have the complete crash file or only the stack information, what we ultimately need is the key crash line, which looks like this:

// Stack line without the load address
3 iOSTest 0x000000010029E694 iOSTest + 26260
// Stack line with the load address and offset
3 iOSTest 0x000000010029E694 0x0000000100298000 + 26260

Of course, for a complete crash log file, we can easily symbolize the entire file with the symbolicatecrash tool. That is not covered in this section; see the tools section below for details.

Get the offset, runtime stack address, and runtime app load address

Let's take the stack line that shows the offset as an example:

  • 3: index of this frame in the stack
  • iOSTest: binary image name
  • 0x000000010029E694: runtime stack address, hexadecimal
  • 0x0000000100298000: runtime load address of the app image, that is, where the app starts in memory (hexadecimal)
  • +26260: offset from the load address (the symbol address), in decimal
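
The fields listed above can be pulled out of a stack line mechanically. The following is a minimal sketch, assuming the exact column layout of the sample line (index, image name, stack address, load address, "+", offset); parse_frame is a hypothetical helper name and not part of any Apple tool.

```shell
# parse_frame: split one stack line into named fields.
# Assumes the layout "<index> <image> <stack address> <load address> + <offset>".
parse_frame() {
  printf '%s\n' "$1" | awk '{
    print "index=" $1
    print "image=" $2
    print "stack_address=" $3
    print "load_address=" $4
    print "offset=" $6
  }'
}

parse_frame "3 iOSTest 0x000000010029E694 0x0000000100298000 + 26260"
```

Image names that contain spaces would break this whitespace-based split, so a real collector would need a stricter parser.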

The relationship between the above three addresses is symbol_address = stack_address – load_address
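
As a quick check of this relationship, the subtraction can be done with plain shell arithmetic. This is only a sketch using the sample addresses from this article; symbol_offset is a hypothetical helper name.

```shell
# symbol_offset: stack address minus load address, printed in decimal.
# Both arguments may be given in hexadecimal with a 0x prefix.
symbol_offset() {
  printf '%d\n' "$(( $1 - $2 ))"
}

symbol_offset 0x000000010029E694 0x0000000100298000   # prints 26260
```

The result matches the +26260 shown at the end of the sample stack line.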

iOS uses Address Space Layout Randomization (ASLR) for security when loading Mach-O files. As a result, the load address of a Mach-O binary is different each time it is loaded into memory, but the calculation rule stays the same, as given by the relationship above.

Obtain the dSYM start address

otool -l iOSTest.app.dSYM/Contents/Resources/DWARF/iOSTest | grep __TEXT -C 5

In the command output, the vmaddr of the __TEXT segment gives the start address of the dSYM code segment, here 0x0000000100000000. This value is usually the same.

Calculate the address in the dSYM symbol table that corresponds to the crash address

  • dSYM start address: 0x0000000100000000, hexadecimal
  • Function offset: 26260, decimal

The stack address (0x000000010029E694) corresponds to 0x0000000100000000 + 26260 = 0x100006694 in dSYM
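
The whole conversion can be sketched in one step: remove the runtime load address from the stack address, then add the dSYM start address. This is a minimal sketch with the sample values from this article; dsym_address is a hypothetical helper name.

```shell
# dsym_address: map a runtime stack address to the matching address in the dSYM.
# Arguments: <stack address> <runtime load address> <dSYM __TEXT start address>
dsym_address() {
  printf '0x%x\n' "$(( $1 - $2 + $3 ))"
}

dsym_address 0x000000010029E694 0x0000000100298000 0x0000000100000000   # prints 0x100006694
```

Because only the difference between the stack address and the load address matters, the result is the same no matter where ASLR placed the app on a given launch.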

Get the specific function/line number/file

Using dwarfdump

dwarfdump --arch arm64 --lookup 0x100006694 iOSTest.app.dSYM
# or point dwarfdump at the iOSTest DWARF binary instead of the .dSYM bundle

The output shows the file in which the crash occurred, but the line info is all 0. The main reason is that the function at the crash site has been inlined, and dwarfdump does not handle multilevel inlining well. Therefore, atos, described next, is what is used in most cases.

Using atos

With atos there is no need to calculate the crash address in the dSYM symbol table by hand:

# 0x0000000100298000 is the load address
# 0x000000010029e694 is the runtime stack address
atos -arch arm64 -o iOSTest.app.dSYM/Contents/Resources/DWARF/iOSTest -l 0x0000000100298000 0x000000010029e694 -i

The -i flag at the end of the command displays information about inlined functions.
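
When many stack lines need symbolizing, the atos invocation can be generated from each line instead of typed by hand. The sketch below only prints the command and does not run it. It assumes the stack line layout shown earlier, and atos_command is a hypothetical helper name; the flags are the ones used in the command above.

```shell
# atos_command: print (do not run) the atos invocation for one stack line.
# Arguments: <arch> <path to the DWARF binary inside the dSYM> <stack line>
atos_command() {
  arch=$1
  dwarf=$2
  printf '%s\n' "$3" | awk -v arch="$arch" -v dwarf="$dwarf" '{
    # $3 is the runtime stack address, $4 is the load address
    printf "atos -arch %s -o %s -l %s %s -i\n", arch, dwarf, $4, $3
  }'
}

atos_command arm64 iOSTest.app.dSYM/Contents/Resources/DWARF/iOSTest \
  "3 iOSTest 0x000000010029E694 0x0000000100298000 + 26260"
```

Paths that contain spaces would need quoting in the printed command, which this sketch does not handle.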

Using dSYMTools

We can also use the open source dSYMTools, which uses atos internally and offers a graphical interface for easier use.

Assemble and format

According to the above flow, we can basically map the stack information to the corresponding file, function, line number, etc., forming the common form:

3 iOSTest 0x000000010029e694 DemoListViewController.click(menuModel:) 0x0000000100298000 + 26260 (DemoListViewController.swift:62)

Symbolization methods

The symbolization process above gives an overview of how symbolization works and of the different tools involved, such as dwarfdump and atos.

Let's look at the ways a stack can be symbolized:

  • symbolicatecrash: symbolizes an entire crash file; used less online and more offline, or through Xcode's built-in Organizer -> Crashes;
  • The atos tool on macOS: symbolizes a single stack line;
  • Alternatives to atos on Linux, such as atosl and llvm-atosl;
  • Extract the address-to-symbol mapping from the dSYM file and restore symbols from it.

The atos approach works in ordinary cases, but once the volume grows its speed can no longer keep up. At present, most mainstream online APM services use the fourth approach, for example Bugly and ByteDance's APM. It is not expanded on here and is introduced separately in a later section.

Symbolization-related tools

According to the symbolization process above, we used the following tools.

dwarfdump

A crash log carries a UUID (32 hexadecimal characters) in its Binary Images section. The app binary has a UUID, and so does the dSYM. The log can be parsed correctly only when all three match; otherwise parsing fails. Therefore, when parsing fails, first use dwarfdump to check whether the three UUIDs are the same.
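
The three UUIDs are often printed in different styles (with or without hyphens, upper or lower case), so it helps to normalize them before comparing. The following is a minimal sketch; normalize_uuid and uuids_match are hypothetical helper names, and the UUID values are made up for illustration.

```shell
# normalize_uuid: drop hyphens and lowercase the hex digits.
normalize_uuid() {
  printf '%s\n' "$1" | tr -d '-' | tr 'A-F' 'a-f'
}

# uuids_match: succeed only if all three UUIDs are equal after normalization.
# Arguments: <crash log UUID> <app binary UUID> <dSYM UUID>
uuids_match() {
  a=$(normalize_uuid "$1")
  b=$(normalize_uuid "$2")
  c=$(normalize_uuid "$3")
  [ "$a" = "$b" ] && [ "$b" = "$c" ]
}

# Made-up values: one unhyphenated lowercase UUID, two hyphenated uppercase ones.
if uuids_match 0123456789abcdef0123456789abcdef \
  01234567-89AB-CDEF-0123-456789ABCDEF \
  01234567-89AB-CDEF-0123-456789ABCDEF; then
  echo "UUIDs match"
else
  echo "UUID mismatch"
fi
```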

# Show usage help
dwarfdump -h

# Check the UUID of the xx.app binary
dwarfdump --uuid xx.app/xx

# Check the UUID of the xx.app.dSYM file
dwarfdump --uuid xx.app.dSYM

# Export the debug_info section to debug_info.txt
dwarfdump --debug-info xx.app.dSYM > debug_info.txt

# Export the debug_line section to debug_line.txt
dwarfdump --debug-line xx.app.dSYM > debug_line.txt

# Look up the file, function, and line information for an address
dwarfdump --arch arm64 --lookup 0x100006694 iOSTest.app.dSYM

If there are many dSYM files on the machine, run the following command to locate the dSYM that corresponds to a given UUID:

# Replace UUID with the actual UUID, converted to the hyphenated format (add the '-' separators)
mdfind "com_apple_xcode_dsym_uuids == UUID"
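
The format conversion mentioned in the comment can be scripted. This sketch assumes the input is the 32-character UUID as it appears in a crash log and that the query expects uppercase hex in 8-4-4-4-12 groups, the same style that dwarfdump --uuid prints; format_uuid is a hypothetical helper name and the UUID is made up.

```shell
# format_uuid: turn a 32-character hex UUID into uppercase 8-4-4-4-12 form.
format_uuid() {
  printf '%s\n' "$1" | tr 'a-f' 'A-F' |
    sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

format_uuid 0123456789abcdef0123456789abcdef   # prints 01234567-89AB-CDEF-0123-456789ABCDEF
```

The formatted value can then be substituted for UUID in the mdfind query above.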

symbolicatecrash

symbolicatecrash is provided by Xcode. It is located at /Applications/Xcode.app/Contents/SharedFrameworks/DVTFoundation.framework/Versions/A/Resources/symbolicatecrash and is a Perl script that wraps the step-by-step parsing described above (you can copy the script out and call it directly).

Usage:

# Run this first; otherwise the symbolicatecrash command below fails with
# Error: "DEVELOPER_DIR" is not defined at ./symbolicatecrash line 69.
export DEVELOPER_DIR="/Applications/Xcode.app/Contents/Developer"

# Before running, copy the crash log, the dSYM, and symbolicatecrash into the same directory
./symbolicatecrash log.crash -d xxx.app.dSYM > symbol.log

Advantages: it symbolizes a whole crash log very conveniently. Disadvantages:

  • It takes a long time.
  • It is coarse-grained and cannot symbolize one particular line.

In fact, Xcode builds on this method to symbolize crash files visually: put the crash log, the dSYM file, and the executable in the same directory, then drag the crash log into the Device Logs window. To symbolize, right-click the log and choose Symbolicate Log or Re-Symbolicate Log.

atos

The atos command converts a hexadecimal address into a function name and line number recognizable in the source code. Advantages: it is fast, can symbolize a specific line, and makes it easy for upper layers to cache results.

# Load address: the address at which the executable image starts in memory at runtime
# Address to symbolicate: the address of the calling function (the runtime stack address)
atos -arch <Binary Architecture> -o <Path to dSYM file>/Contents/Resources/DWARF/<binary image name> -l <load address> <address to symbolicate>

atos -arch arm64 -o iOSTest.app.dSYM/Contents/Resources/DWARF/iOSTest -l 0x0000000100298000 0x000000010029e694 -i

Symbolizing system logs

To symbolize the method names of your own app, you need the dSYM file generated at compile time. To convert system library symbols into complete method names, you also need the symbol files of the iOS system libraries.

System library symbol files are not universal: they must match the system version and CPU architecture of the device on which the crash occurred. So, to symbolize every symbol, we need to collect symbol files for as many system versions as possible.

The following is the top information of the Crash Log I exported from Xcode, from which we can get the information about the device that caused the Crash.

iPhone OS 15.0 (Build 19A346)
Architecture: arm64e
...

Put the symbol files of the matching version into the ~/Library/Developer/Xcode/iOS DeviceSupport directory on macOS, and Xcode's own symbolicatecrash tool can then symbolize the system frames. The tool automatically searches the local machine for the system library symbol files that match the UUIDs in the crash log.

Several methods of obtaining system symbol files

  1. When you first connect a device to Xcode for on-device debugging, Xcode displays Processing Symbol Files. At that point it is copying the device's symbol files to the /Users/XXX/Library/Developer/Xcode/iOS DeviceSupport directory on the Mac.

  2. Extract system library symbols from iOS firmware. This is discussed in the articles listed in the references section. One address used in the process is theiphonewiki, a firmware download site that also maintains the firmware decryption keys (Firmware_Keys).

  3. Use open source projects that have already collected them, such as iOS-System-Symbols…

Online symbolization

In fact, online symbolization is the last of the symbolization methods mentioned above. Its core is to extract the mapping between addresses and symbols with tools, which requires understanding the structure of DWARF files and finding where that mapping is stored. The key content is in debug_line and debug_info.
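
To make the idea concrete, here is a toy sketch of the lookup step only. It assumes the address-to-symbol table has already been extracted from the dSYM into lines of the form "start address, symbol name"; the start addresses in the sample table and the lookup_symbol helper name are hypothetical. A real service would use indexed storage and would also resolve line numbers and inlined functions.

```shell
# lookup_symbol: read "<start address> <symbol name>" lines from stdin and print
# the symbol with the greatest start address that is <= the target address.
lookup_symbol() {
  target=$(( $1 ))
  best_addr=-1
  best_name=""
  while read -r addr name; do
    [ -n "$addr" ] || continue
    value=$(( $addr ))
    if [ "$value" -le "$target" ] && [ "$value" -gt "$best_addr" ]; then
      best_addr=$value
      best_name=$name
    fi
  done
  [ -n "$best_name" ] && printf '%s + %d\n' "$best_name" "$(( target - best_addr ))"
}

# Hypothetical table entries; 0x100006694 is the dSYM address computed earlier.
printf '%s\n' \
  "0x100006000 main" \
  "0x100006600 DemoListViewController.click(menuModel:)" \
  "0x100006800 AppDelegate.application" |
  lookup_symbol 0x100006694
```

Once the table is extracted, each lookup needs no dSYM access at all, which is why this approach scales better than invoking atos for every frame.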

For more details, see “The Way to iOS Symbolic Parsing and Refactoring” and “iOS Symbolization: Basics and Advances.”

We can use a few tools for our own use when parsing DWARF.

  • gimli: a Rust library for reading and writing the DWARF debugging format
  • debug/dwarf: Go's standard library package debug/dwarf, which can parse DWARF files and resolve addresses into symbols.

Of course, instead of reading the files ourselves, we can also use existing tools, such as Bugly's buglySymboliOS.jar.

Finally

Try harder!

Let’s be CoderStar!

References & suggested resources

  • Do you really understand symbolization?
  • The Way to iOS Symbolic Parsing and Refactoring
  • iOS Symbolization: Basics and Advances
  • iOS crash log online symbolization in practice
  • A ramble on iOS crash collection frameworks
  • iOS crash analysis: symbolizing system libraries
  • Extracting system library symbols from iOS firmware

It is very important to have a technical circle and a group of like-minded people. Come and follow my technical official account, where I only share solid technical content.

WeChat official account: CoderStar