Android Native exception catching library

An Android native exception catching library based on Google Breakpad. When an exception occurs in the native layer, the Java layer can obtain the relevant exception information.

Project home page

Current situation

  • When a native exception occurs, the Android system writes the exception information to Logcat, but the Java layer is not notified. It therefore cannot collect the exception information or report it to the business's exception monitoring system.
  • A business can quickly build an exception monitoring system at the Java layer (global exception capture is simple to implement there), or may already have one, but such a system does not cover exceptions in the native layer.
  • An Android app can also integrate Breakpad, which exports minidump files that are small and information-rich, but two problems remain:
    • 1. The same problem as the first point above: the Java layer is not aware of the native exception.
    • 2. Useful information can be extracted from a minidump file only after a series of tedious steps:
      • At startup, check whether Breakpad has exported a minidump file. If it has, a native exception occurred.
      • Go to the customer site, or pull the minidump file remotely.
      • Compile the minidump_stackwalk tool for your computer's operating system.
      • Use minidump_stackwalk to decode the minidump file and obtain, for example, the value of the program counter register at the time of the crash (hereinafter the PC value).
      • Find the addr2line tool that matches the ABI of the crashing .so library, and use the PC value from the previous step to locate the line of code where the exception occurred.

The whole process is complicated and tedious, and it yields no Java-layer stack for the crashing thread, which makes it hard for Java developers to quickly find the code that called into the native layer.
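As a rough illustration of the first manual step (checking at startup whether a minidump was left behind), a minimal Kotlin sketch might look like the following. The helper name `findLeftoverMinidumps` is hypothetical and not part of Breakpad or this library, and the `.dmp` extension is an assumption based on the file name shown in the sample log in this document.

```kotlin
import java.io.File
import java.io.FileFilter

// Hypothetical helper (not part of Breakpad or this library).
// Returns the minidump files found in dumpDir, newest first.
// Returns an empty list when the directory is missing or contains no dumps.
fun findLeftoverMinidumps(dumpDir: File): List<File> {
    // Assumption: minidumps use the ".dmp" extension.
    val dumps = dumpDir.listFiles(FileFilter { f ->
        f.isFile && f.extension.equals("dmp", ignoreCase = true)
    }) ?: return emptyList()
    return dumps.sortedByDescending { it.lastModified() }
}
```

A non-empty result only says that a native exception happened at some point before this launch; the remaining steps are still needed to learn where it happened.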

Design intent

  1. Let the Java layer detect native exceptions:
    • Instead of checking whether Breakpad has exported a minidump file, Java developers can receive native exception information in Java code and react to it.
  2. Provide more usable information to make problem analysis more efficient:
    • The callback provides the native exception information, the native and Java call stacks, and the minidump file path. This information can be reported directly through the business's existing exception monitoring system.

    • Analysis is split into two phases. Most problems are expected to be solved in the first phase, without analyzing the minidump file, so analysis is more efficient overall:

      • Phase 1: With the Java call stack and the native call stack, the cause of most exceptions can be located and analyzed quickly.
      • Phase 2: The callback also provides the path where the minidump file is stored, so the business can pull the file when needed. (This requires the business to have its own log-pulling capability, and the file must then be processed with the manual steps described earlier, which is time-consuming and laborious.)
  3. Minimal changes:
    • Integrators should not need to change much existing code to adopt the library. For example, the native crash callback can simply use the existing Java-layer exception monitoring system to report native exception information.
  4. Single responsibility:
    • The library only captures native crashes. It does not collect system memory, CPU usage, system logs, or other information.
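The "minimal changes" goal can be sketched as a small adapter that forwards what the callback delivers to a monitoring system that already exists. Everything named here is hypothetical: `NativeCrashSnapshot` is a stand-in for the data the callback provides (the real `CrashInfo` fields are not shown in this document), and `CrashReporter` stands in for the business's own reporting interface.

```kotlin
// Hypothetical stand-in for the data delivered by the crash callback.
// The field names are assumptions, not the library's actual CrashInfo API.
data class NativeCrashSnapshot(
    val summary: String,       // e.g. signal, crash address, crashing .so
    val nativeStack: String,   // native call stack of the crashing thread
    val javaStack: String,     // Java call stack of the crashing thread
    val minidumpPath: String?  // null when minidump export is disabled
)

// Hypothetical interface of an existing Java-layer monitoring system.
fun interface CrashReporter {
    fun report(title: String, body: String)
}

// Phase 1: report both stacks right away.
// Phase 2: include only the minidump path; pulling the file is left to the business.
fun reportNativeCrash(snapshot: NativeCrashSnapshot, reporter: CrashReporter) {
    val body = buildString {
        appendLine(snapshot.summary)
        appendLine("--- native stack ---")
        appendLine(snapshot.nativeStack)
        appendLine("--- java stack ---")
        appendLine(snapshot.javaStack)
        snapshot.minidumpPath?.let { appendLine("minidump: $it") }
    }
    reporter.report("native crash", body)
}
```

Keeping the adapter this thin reflects the single-responsibility goal: it only reshapes crash data and leaves collection of memory, CPU, or log information to other components.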

The whole process

Features

  • Optionally keeps Breakpad's minidump export.
  • When a native exception occurs, the exception information, the native call stack, and the Java call stack are delivered to the developer through a callback. Printed to the console, this information looks like the following:
2022-02-14 11:33:08.598 30228-30253/com.babyte.banativecrash E/crash: /data/user/0/com.babyte.banativecrash/cache/f1474006-60ca-40f4-c9d8e89a-47e90c2e.dmp
2022-02-14 11:33:08.599 30228-30253/com.babyte.banativecrash E/crash: Operating system: Android 28 Linux 4.4.146 #37 SMP PREEMPT Wed Jan 20 18:26:59 CST 2021
CPU: aarch64 (8 core)
Crash reason: signal 11(SIGSEGV) Invalid address
Crash address: 0000000000000000
Crash pc: 0000000000000650
Crash so: /data/app/com.babyte.banativecrash-ptLzOQ_6UYz-W3Vgyact8A==/lib/arm64/libnative-lib.so(arm64)
Crash method: _Z5Crashv
2022-02-14 11:33:08.602 30228-30253/com.babyte.banativecrash E/crash: Thread[name:DefaultDispatch] (NOTE: linux thread name length limit is 15 characters)
    #00 pc 0000000000000650 /data/app/com.babyte.banativecrash-ptLzOQ_6UYz-W3Vgyact8A==/lib/arm64/libnative-lib.so (Crash()+20)
    #01 pc 0000000000000670 /data/app/com.babyte.banativecrash-ptLzOQ_6UYz-W3Vgyact8A==/lib/arm64/libnative-lib.so (Java_com_babyte_banativecrash_MainActivity_nativeCrash+20)
    #02 pc 0000000000565de0 /system/lib64/libart.so (offset 0xc1000) (art_quick_generic_jni_trampoline+144)
    #03 pc 000000000055cd88 /system/lib64/libart.so (offset 0xc1000) (art_quick_invoke_stub+584)
    #04 pc 00000000000cf740 /system/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+200)
    #05 pc 00000000002823b8 /system/lib64/libart.so (offset 0xc1000)
    ...
2022-02-14 11:33:08.603 30228-30253/com.babyte.banativecrash E/crash: Thread[DefaultDispatcher-worker-1,5,main]
        at com.babyte.banativecrash.MainActivity.nativeCrash(Native Method)
        at com.babyte.banativecrash.MainActivity$onCreate$2$1.invokeSuspend(MainActivity.kt:39)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
Thread[DefaultDispatcher-worker-2,5,main]
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.parkFor$(Thread.java:2137)
        at sun.misc.Unsafe.park(Unsafe.java:358)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:353)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.park(CoroutineScheduler.kt:795)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.tryPark(CoroutineScheduler.kt:740)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:711)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
...
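Each native frame in the output above carries a frame index, a PC value, the library path, and sometimes a symbol. If those frames need to be processed programmatically (for example, to collect PC values for later symbolization), a parser along these lines is one option. This is a sketch only: `NativeFrame` and `parseNativeFrame` are hypothetical names, and the regular expression assumes the frame layout seen in the sample log rather than any documented format.

```kotlin
// Hypothetical type for illustration; not part of the library.
data class NativeFrame(
    val index: Int,
    val pc: String,
    val library: String,
    val symbol: String?  // null when the frame has no symbol, as in frame #05 above
)

// Assumed layout: "#NN pc <hex> <library path> [(offset 0x..)] [(symbol+offset)]"
private val frameRegex = Regex(
    """\s*#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)(?:\s+\(offset 0x[0-9a-fA-F]+\))?(?:\s+\((.*)\))?\s*"""
)

// Returns the parsed frame, or null when the line is not a native frame line.
fun parseNativeFrame(line: String): NativeFrame? {
    val match = frameRegex.matchEntire(line) ?: return null
    return NativeFrame(
        index = match.groupValues[1].toInt(),
        pc = match.groupValues[2],
        library = match.groupValues[3],
        symbol = match.groups[4]?.value
    )
}
```

Java stack lines (those beginning with `at`) do not match the pattern and yield `null`, so the same function can be used to filter a mixed log.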

Example: locating a specific line of code in the .so library

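You can use the addr2line tool in the NDK to locate a specific line of code from the PC value and the .so library that still contains symbol information.

Use the addr2line build that matches the ABI of the .so library. For example, an arm64 library requires the aarch64 addr2line: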

$ ./ndk/android-ndk-r16b/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-addr2line -Cfe ~/arm64-v8a/libnative-lib.so 0000000000000650

The following output is displayed:

Crash()
/Users/ba/AndroidStudioProjects/NativeCrash2Java/app/.cxx/cmake/debug/arm64-v8a/../../../../src/main/cpp/native-lib.cpp:6
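Picking the right addr2line binary for an ABI is easy to get wrong by hand, so the lookup can be sketched in Kotlin as below. The function names are hypothetical. The tool prefixes are those of the GNU binutils shipped in older NDKs such as r16b; newer NDK releases ship an LLVM-based addr2line instead, which this sketch does not cover.

```kotlin
// Maps an Android ABI to the GNU binutils tool prefix used by older NDKs (e.g. r16b).
fun addr2linePrefix(abi: String): String = when (abi) {
    "arm64-v8a" -> "aarch64-linux-android"
    "armeabi-v7a", "armeabi" -> "arm-linux-androideabi"
    "x86" -> "i686-linux-android"
    "x86_64" -> "x86_64-linux-android"
    else -> throw IllegalArgumentException("Unsupported ABI: $abi")
}

// Builds the command line. toolchainBinDir is the prebuilt bin directory of the
// toolchain for this ABI; soPath must point to the .so with symbol information.
// -C demangles C++ names, -f prints the function name, -e selects the library.
fun addr2lineCommand(toolchainBinDir: String, abi: String, soPath: String, pc: String): List<String> =
    listOf("$toolchainBinDir/${addr2linePrefix(abi)}-addr2line", "-C", "-f", "-e", soPath, pc)

// Runs the command and returns its combined output (function name, then file:line).
fun runAddr2line(command: List<String>): String =
    ProcessBuilder(command).redirectErrorStream(true).start()
        .inputStream.bufferedReader().use { it.readText() }
```

Building the command as a list rather than a single string avoids shell quoting problems when the library path contains spaces.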

Integration

In the build.gradle of the root project:

allprojects {
    repositories {
        mavenCentral() // Add this line
    }
}

In the module's build.gradle:

dependencies {
    // Add this line; replace releaseVersionCode with the latest release version
    implementation 'io.github.BAByte:native-crash:releaseVersionCode'
}

Initialization

Two modes are available:

// Mode 1: when a native exception occurs, deliver the exception information
// through the callback and export a minidump to the specified directory.
BaByteBreakpad.initBreakpad(this.cacheDir.absolutePath) { info: CrashInfo ->
    // Print the formatted information to the console
    BaByteBreakpad.formatPrint(TAG, info)
}

// Mode 2: when a native exception occurs, only deliver the exception
// information through the callback.
BaByteBreakpad.initBreakpad { info: CrashInfo ->
    // Print the formatted information to the console
    BaByteBreakpad.formatPrint(TAG, info)
}

Sample project

Click to see: Sample project

Acknowledgements

  • Thanks to Google's Breakpad library for its source code.
  • Thanks to the Tencent Bugly team for the idea of calling back from native code to the Java layer when an exception occurs.
  • Thanks to iQiyi's xCrash library, whose source code provided the dlopen approach.