series

  • Google Breakpad

Introduction

Breakpad is an open-source system for crash-reporting that includes both client and server components.

The official website introduces it as follows:

Breakpad is a set of client and server components which implement a crash-reporting system.

Building

Breakpad's source dependencies are managed with Google's own depot_tools. However, downloading the source with depot_tools often hangs, so many tutorials show how to download the source directly with Git instead. There are plenty of such articles, so I won't link to them here.

Breakpad's build scripts are context-sensitive: they build different artifacts depending on the system you are currently using. Breakpad supports macOS, Linux, and Windows.

I work on macOS (Mojave 10.14.4), but the build failed there with errors about missing dependent libraries. In the end I used Docker to run a Linux (Ubuntu 18.04) image, and the build went through unexpectedly smoothly.

The Docker and docker-compose configuration is as follows:

Dockerfile

    FROM ubuntu:18.04

    ENV DEBIAN_FRONTEND=noninteractive

    RUN apt update && \
        apt install --no-install-recommends git-all curl wget build-essential --assume-yes

    # depot_tools
    RUN git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git /opt/depot_tools
    ENV PATH ${PATH}:/opt/depot_tools

    COPY src/ /opt/breakpad/

    # configure breakpad
    RUN cd /opt/breakpad && ./configure && make && make install

The COPY instruction is used to bring the Breakpad source code into the image because the source needs to be modified locally to make debugging easier.

docker-compose.yml

    version: "3.7"
    services:
      yybreakpad:
        build: ./
        command: tail -F anything
        volumes:
          - ./temp:/opt/temp

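The following command keeps the Docker container from exiting on its own. With -F, tail keeps retrying even when the file named anything does not exist, so the process never ends: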

 command: tail -F anything

The temp directory is mounted into the container, so build outputs and other artifacts can be placed there instead of being copied back and forth between the container and the host:

 volumes:
     - ./temp:/opt/temp

Here is the directory layout used for the build:

    ├── Dockerfile
    ├── docker-compose.yml
    ├── src
    └── temp
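
With these files in place, the container can be driven from the host. The following is only a minimal sketch and not part of the original setup: the helper names bp_up and bp_exec are hypothetical, and it assumes the docker-compose CLI is installed and that the service is named yybreakpad as in docker-compose.yml above.

```shell
#!/bin/sh
# Hypothetical convenience wrappers around docker-compose (names are made up
# for illustration). Run them from the directory containing docker-compose.yml.

# Build the image from the Dockerfile and start the container in the background.
bp_up() {
    docker-compose up -d --build
}

# Run a command inside the running yybreakpad service.
# -T disables pseudo-TTY allocation so the output can be redirected cleanly.
bp_exec() {
    docker-compose exec -T yybreakpad "$@"
}
```

For example, after bp_up, something like `bp_exec ls /opt/temp` should list whatever the host has placed in ./temp, since that directory is mounted into the container.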

Architecture

Here’s a look at Breakpad’s official architecture:

Breakpad consists of three main parts:

  • Client: when a crash occurs, the client generates a minidump file by default.
  • Symbol Dumper: a tool that generates Breakpad-specific symbol tables. It must be run on the original library that still contains debug information.
  • Processor: a tool that reads the minidump file generated by the Client, matches it against the corresponding symbol table generated by the Symbol Dumper, and produces a human-readable C/C++ stack trace.

Minidump format

A minidump file can be thought of as a simplified core dump. The official reasons for using it are:

  • Core dumps are very large and hard to transfer off the client device.
  • The core dump format is poorly documented; for example, the Linux Standard Base does not describe how registers are stored in PT_NOTE segments.
  • It is harder to persuade a Windows machine to produce a core dump than to persuade other systems to produce a minidump.
  • A single dump file format can be used on all platforms.

Processing flow (Linux as an example)

  1. dump_syms

    When a library is written in C/C++, the build typically produces an ELF file that still contains debugging information (provided debug info is enabled, for example with -g). From such a file, the symbol table can be generated with the following command:

     dump_syms [elf_file] > [elf_file.sym]

    The output name elf_file.sym is not mandatory at this point; any file name and suffix can be used.

    ELF files with debug information are much larger, so the files shipped to the client are usually stripped, which removes the unneeded information and makes them smaller.

    The generated symbol table must be stored in a specific directory layout so that it can be matched correctly. The outer directory must be named after the ELF file, including its suffix; inside it is a directory named after the symbol table ID; and inside that is the symbol table itself.

    Such as:

     └── symbols
         └── libtest.so
             └── D6CAF1C3E374EFD057659926ABA14AD00
                 └── libtest.so.sym

    The symbol ID can be obtained by reading the first line of the symbol table file:

     $ head -n1 libtest.so.sym
     MODULE Linux arm D6CAF1C3E374EFD057659926ABA14AD00 libtest.so

    Here D6CAF1C3E374EFD057659926ABA14AD00 is the symbol table ID.

  2. minidump_writer

    The Breakpad Client component registers handlers for SIGSEGV, SIGABRT, and other signals in advance. When a crash occurs on the client, a minidump file is generated that contains thread information, loaded library information, stack contents, and so on.

  3. symupload (optional)

    Breakpad supports uploading the generated minidump files to a specified server. This step is optional; you can also upload the files by your own means.

  4. minidump_stackwalk

    Once the minidump file has been obtained, minidump_stackwalk can parse it, together with the corresponding symbol table, into a human-readable stack trace.

     minidump_stackwalk [minidump_file] ./symbols > [stacktrace_file]

    ./symbols specifies the symbol table directory, laid out as described in step 1. [stacktrace_file] is the file in which the resulting stack trace is stored.
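
The directory layout from step 1 is easy to get wrong by hand, so it can help to script it. The following is a minimal sketch, not a Breakpad tool: the function name install_sym is hypothetical, and it assumes only what was shown above, namely that the first line of the symbol file has the form `MODULE <os> <arch> <ID> <module name>`.

```shell
#!/bin/sh
# Hypothetical helper: copy a symbol file produced by dump_syms into
#   <root>/<module name>/<symbol ID>/<module name>.sym
# which is the layout described in step 1.

install_sym() {
    sym_file=$1    # symbol file produced by dump_syms (any name)
    root=$2        # symbol root directory, e.g. ./symbols

    # Split the first line: MODULE <os> <arch> <ID> <module name>
    read -r tag os arch id name < "$sym_file" || true
    if [ "$tag" != "MODULE" ] || [ -z "$id" ] || [ -z "$name" ]; then
        echo "not a Breakpad symbol file: $sym_file" >&2
        return 1
    fi

    # Create the directories, copy the file, and print the final path.
    mkdir -p "$root/$name/$id" &&
    cp "$sym_file" "$root/$name/$id/$name.sym" &&
    echo "$root/$name/$id/$name.sym"
}
```

A typical call would be `install_sym libtest.so.sym ./symbols`, after which ./symbols can be passed to minidump_stackwalk as in step 4.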

Symbol table matching (Linux as an example)

The minidump file matches the symbol table according to the symbol ID.

By default, the symbol ID is generated from the Build ID in the ELF file; if no Build ID exists, it is derived from a hash of the first page of the text section. The text section is where an ELF file stores its code, and you can learn more about the ELF file format from its wiki entry. The corresponding code snippet is as follows:

    // https://chromium.googlesource.com/breakpad/breakpad/+/refs/heads/main/src/common/linux/file_id.cc
    // static
    bool FileID::ElfFileIdentifierFromMappedFile(const void* base,
                                                 wasteful_vector<uint8_t>& identifier) {
      // Look for a build id note first.
      if (FindElfBuildIDNote(base, identifier))
        return true;

      // Fall back on hashing the first page of the text section,
      // that is, use the hash value of the text section.
      return HashElfTextSection(base, identifier);
    }

So when minidump_stackwalk fails to convert addresses into the corresponding symbols, the first thing to check is whether the symbol ID matches.
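
One way to perform that check is sketched below. This is an illustration under stated assumptions rather than a Breakpad feature: the function name missing_symbols is hypothetical, and it assumes the machine-readable output of `minidump_stackwalk -m`, in which module lines are expected to look like `Module|<file>|<version>|<debug file>|<debug ID>|...`. If your version prints a different format, the field positions will need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: read "Module|..." lines on stdin and report every module
# whose symbol file is not present under the given symbol root.
# Assumed line format: Module|file|version|debug_file|debug_id|base|max|main

missing_symbols() {
    root=$1    # symbol root directory, e.g. ./symbols
    while IFS='|' read -r kind file version debug_file debug_id rest || [ -n "$kind" ]; do
        # Skip everything that is not a complete Module line.
        [ "$kind" = "Module" ] || continue
        [ -n "$debug_file" ] && [ -n "$debug_id" ] || continue

        # Expected location: <root>/<debug file>/<debug ID>/<debug file>.sym
        path="$root/$debug_file/$debug_id/$debug_file.sym"
        [ -f "$path" ] || echo "missing: $path"
        kind=
    done
}
```

It could be used as `minidump_stackwalk -m [minidump_file] ./symbols 2>/dev/null | missing_symbols ./symbols`. Any path it prints shows the exact ID directory that the minidump expects, which can then be compared with the ID on the first line of the .sym file.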

Summary

Breakpad's overall architecture is very clear, with each module having its own responsibility. The Client captures crashes on the client device and generates a minidump report. The server uses the symbol tables generated in advance and parses the minidump into a human-readable stack trace with minidump_stackwalk.

In this part, we looked at the overall design of Breakpad from a macro perspective, including the build and the processing flow, which will help us analyze the Breakpad source code in the next part.