Binder: the communication mechanism between Android processes

Android is based on Linux, so it is worth asking why Android built its own IPC mechanism, Binder, instead of using the ones Linux already provides. This article takes a brief look at the ways Linux processes communicate, aiming to understand the idea behind each one and its advantages and disadvantages, and from there the rationale for using Binder in Android. It does not go into depth; for a deeper analysis, see the reference list at the end of this article.

| Linux IPC method | Secure | Efficiency | Direction | Model |
| --- | --- | --- | --- | --- |
| Pipe | Yes | Not high | One-way | 1-to-1 |
| Shared memory | No | Highest | Two-way | Many-to-many |
| Socket | Yes | Very low | Two-way | Many-to-many |
| File | No | Low | Two-way | Many-to-many |

IPC mechanisms in Linux

1. Pipe communication

Features of pipe files

  • A pipe has two ends, each opened by a different process: one for reading (r) and one for writing (w). Both ends must be open for the pipe to work, and by default opening one end of a pipe file blocks until the other end is opened. The reading side takes data out of the pipe and the writing side puts data into it.
  • When the read end is closed, a writer that keeps writing receives a signal (SIGPIPE) that by default terminates the program. When the write end is closed, the reader is no longer blocked: its reads return end-of-file.
  • The size of a pipe file is always 0. Memory is allocated for the pipe when it is opened, and the data is discarded after the pipe is closed. By default on Linux, PIPE_SIZE (the maximum amount of data a pipe can hold) is 64K and PIPE_BUF (the pipe buffer) is 4K. If several processes write to the same pipe at once and a write is larger than PIPE_BUF, their data may be interleaved, so such writes are not safe.
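The open/close behaviour listed above can be observed from Python. This is a minimal sketch, assuming a Linux host and CPython's default signal setup (which ignores SIGPIPE, so the failed write shows up as an exception instead of killing the program); the function names are made up for illustration.

```python
import os
import select


def read_after_writer_closes() -> bytes:
    """Closing the write end turns a would-be blocking read into end-of-file."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"hello")
    os.close(write_fd)                # no writer is left
    first = os.read(read_fd, 1024)    # data already in the pipe is still delivered
    second = os.read(read_fd, 1024)   # empty result: end-of-file, not a block
    os.close(read_fd)
    return first + b"|" + second


def write_after_reader_closes() -> bool:
    """Return True if writing to a pipe without a reader is refused."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)                 # no reader is left
    try:
        os.write(write_fd, b"lost")
    except BrokenPipeError:
        # Assumption: SIGPIPE is ignored (CPython's default), so the kernel
        # reports EPIPE and Python raises instead of terminating the process.
        return True
    finally:
        os.close(write_fd)
    return False


def atomic_write_limit() -> int:
    """PIPE_BUF: writes of at most this many bytes are not interleaved."""
    return select.PIPE_BUF
```

Calling `atomic_write_limit()` on a typical Linux machine should report the 4K mentioned above, although the exact value is platform dependent.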

There are two pointers in the memory space of the pipe file: head and tail. The head pointer advances as data is read and the tail pointer advances as data is written. When the head catches up with the tail, the pipe is empty and reads block; when the space is full, writes block.
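The head/tail bookkeeping can be modelled with a small ring buffer. This is only an illustrative toy in Python, not the kernel's implementation: the class name `PipeBuffer` is hypothetical, and where a real pipe would block the caller, the sketch simply transfers fewer bytes.

```python
class PipeBuffer:
    """Toy fixed-size pipe buffer with a read pointer and a write pointer."""

    def __init__(self, capacity: int) -> None:
        self.data = bytearray(capacity)
        self.capacity = capacity
        self.head = 0   # position of the next byte to read
        self.tail = 0   # position of the next byte to write
        self.used = 0   # number of unread bytes in the buffer

    def write(self, chunk: bytes) -> int:
        """Store as much as fits; a real pipe would block for the rest."""
        count = min(len(chunk), self.capacity - self.used)
        for byte in chunk[:count]:
            self.data[self.tail] = byte
            self.tail = (self.tail + 1) % self.capacity   # wrap around
        self.used += count
        return count

    def read(self, size: int) -> bytes:
        """Remove up to `size` bytes; an empty real pipe would block instead."""
        count = min(size, self.used)
        out = bytearray()
        for _ in range(count):
            out.append(self.data[self.head])
            self.head = (self.head + 1) % self.capacity   # wrap around
        self.used -= count
        return bytes(out)
```

With a capacity of 4, writing six bytes stores only the first four, and space becomes available again only after a read has advanced the head pointer.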

Limitations of pipes

  • Each end works in one mode only: the read end cannot write and the write end cannot read
  • Data is consumed when it is read: once read, it no longer exists in the pipe and cannot be read again
  • Because a pipe is half duplex, data can only flow in one direction
  • An anonymous pipe can only connect processes that have a common ancestor
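To make the one-way, common-ancestor nature concrete, here is a rough Python sketch (assuming a Unix host where `os.fork` is available; `child_to_parent` is an illustrative name). The pipe is created before the fork, which is the only reason both processes can reach it, and each side keeps just one end.

```python
import os


def child_to_parent(message: bytes) -> bytes:
    """Send `message` from a forked child to its parent through a pipe."""
    read_fd, write_fd = os.pipe()      # created before fork, so both inherit it
    pid = os.fork()
    if pid == 0:
        # Child: writer only.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)                # exit without running the parent's cleanup
    # Parent: reader only.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                  # end-of-file: the child's write end is gone
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                 # reap the child
    return b"".join(chunks)
```

For a reply in the opposite direction, a second pipe would be needed.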

Summary: a pipe is a 1-to-1 communication channel and is relatively safe. The downside is that data can only be transmitted in one direction.

2. Shared memory

Shared memory allows two or more processes to share a region of memory. When the data in the shared region changes, every process that shares it sees the change. Because the data is written directly into memory and does not have to be copied between the client and the server, this is the fastest IPC.

Advantages of shared memory

  • Because the processes map their own addresses to the same physical memory, each of them can both write data to it and read data from it, so communication is two-way.
  • Both clients and servers read and write data directly in memory, without copying it. Therefore, this is the fastest method.
  • The life cycle follows the kernel, not the processes: the region is not destroyed when the server or client disconnects, and it still exists even after every process that accessed it has ended.
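A small Python sketch of these properties, using the standard `multiprocessing.shared_memory` module. For brevity both attachments live in one process here; in real use the second `SharedMemory(name=...)` call would run in another process. The helper name `share_bytes` is invented for the example, and the payload is assumed to be non-empty.

```python
from multiprocessing import shared_memory


def share_bytes(payload: bytes) -> bytes:
    """Write through one attachment and read the result through another."""
    size = len(payload)
    owner = shared_memory.SharedMemory(create=True, size=size)
    try:
        # Attach by name, the way an unrelated process would.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            owner.buf[:size] = payload        # written straight into the region
            return bytes(peer.buf[:size])     # visible through the other mapping
        finally:
            peer.close()
    finally:
        owner.close()
        # The region outlives its users until someone removes it explicitly.
        owner.unlink()
```

The explicit `unlink()` at the end reflects the last point in the list: closing every attachment does not by itself destroy the region.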

Disadvantages of shared memory

Shared memory does not implement synchronization: several processes can write at the same time, which corrupts the data.
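Since the shared region itself offers no protection, the processes have to bring their own lock. The sketch below is one possible way to do it in Python, assuming a Unix host where the `fork` start method is available; the names `_increment` and `count_with_lock` are illustrative. Several workers increment a counter stored in shared memory, and the lock keeps each read-modify-write step from overlapping with another.

```python
import multiprocessing as mp
import struct
from multiprocessing import shared_memory


def _increment(name: str, lock, times: int) -> None:
    """Worker: add 1 to the shared 64-bit counter `times` times."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        for _ in range(times):
            with lock:   # without the lock, concurrent updates can be lost
                (value,) = struct.unpack_from("q", shm.buf, 0)
                struct.pack_into("q", shm.buf, 0, value + 1)
    finally:
        shm.close()


def count_with_lock(workers: int = 4, times: int = 1000) -> int:
    """Run the workers and return the final counter value."""
    ctx = mp.get_context("fork")   # assumption: fork is supported on this host
    lock = ctx.Lock()
    shm = shared_memory.SharedMemory(create=True, size=8)
    try:
        struct.pack_into("q", shm.buf, 0, 0)
        procs = [
            ctx.Process(target=_increment, args=(shm.name, lock, times))
            for _ in range(workers)
        ]
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        (total,) = struct.unpack_from("q", shm.buf, 0)
        return total
    finally:
        shm.close()
        shm.unlink()
```

With the lock in place the result is always `workers * times`; removing the `with lock:` line makes the total unpredictable.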

Summary

Shared memory is a many-to-many IPC and the fastest one to access. However, it provides no synchronization mechanism, so it is not safe.

3. Socket communication

The Linux philosophy is “everything is a file”, so communication can follow the open, read/write, close pattern. A socket is one implementation of this pattern: it is a special kind of file.

Sockets can carry communication not only between different hosts on a network but also between different processes on the same host, and the communication they establish is two-way. The communication process is shown in the following figure. As the figure shows, the client and server first establish a connection and then read and write data. Each transfer copies the data from one process into the kernel and then from the kernel into the other process, so it is slow.
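The connect-then-read/write flow between two local endpoints looks roughly like this in Python. It is a sketch under a few assumptions: a Unix host with `AF_UNIX` support, a short non-empty message that fits in one `recv`, and both endpoints in one process for brevity (normally they would be separate processes). The socket file name `demo.sock` is made up.

```python
import os
import socket
import tempfile


def local_socket_roundtrip(message: bytes) -> bytes:
    """Connect a client to a local server socket and exchange one message."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")     # hypothetical socket file
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)               # establish the connection first
                conn, _ = server.accept()
                with conn:
                    client.sendall(message)        # copy into the kernel's socket buffer
                    request = conn.recv(4096)      # copy out into the server's buffer
                    conn.sendall(request.upper())  # same connection, other direction
                    return client.recv(4096)
```

The two comments marked "copy" are the two copies that make sockets slower than shared memory.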

Summary

A socket is a two-way, many-to-many IPC mechanism. The data is copied twice, so the efficiency is low, but it is relatively safe.

The advantages and disadvantages of several IPC methods in Linux:

| Linux IPC method | Secure | Efficiency | Direction | Model |
| --- | --- | --- | --- | --- |
| Pipe | Yes | Not high | One-way | 1-to-1 |
| Shared memory | No | Highest | Two-way | Many-to-many |
| Socket | Yes | Very low | Two-way | Many-to-many |
| File | No | Low | Two-way | Many-to-many |

As a result, Android decided to implement its own IPC mechanism: Binder. Binder requires only one copy and is secure enough.

Android IPC: Binder

Before introducing Binder, consider the following concepts:

1. Kernel space and user space

  • Kernel space (kernel process): the memory area occupied by the operating system. There is only one.
  • User space (user process): the memory area where user processes reside. There are many, one per process.

Put simply, the bytecode of the apps we develop is stored in user space, and their method calls and memory allocations all happen in user space. The Android system’s kernel code lives in kernel space.

Q: Why is it divided this way?

A: Separating kernel space from user space provides isolation. Each app (user space) cannot affect other apps and cannot crash the system.

Running adb shell ps gives the output shown in the following figure. A PPID of 0 on the phone marks the processes started by the kernel itself.
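The PPID column that ps prints comes from the kernel's /proc filesystem, so the same check can be sketched in Python. This assumes a Linux (or Android) system with /proc mounted and readable; the function names are illustrative.

```python
import os


def parent_pid(pid: int) -> int:
    """Return the PPID of `pid` as recorded in /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat", "rb") as stat_file:
        stat = stat_file.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so split after the last ")": state comes next, then PPID.
    fields = stat[stat.rfind(b")") + 2:].split()
    return int(fields[1])


def pids_with_parent_zero() -> list[int]:
    """List the processes whose parent PID is 0."""
    result = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                if parent_pid(int(entry)) == 0:
                    result.append(int(entry))
            except OSError:   # the process exited while we were scanning
                pass
    return sorted(result)
```

On a phone this would be run through adb shell, provided a Python interpreter is available there, which is not the case by default.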

2. Physical address and virtual address

2.1 Virtual Memory

The programs we write actually deal with virtual memory: the addresses of the variables in our programs are virtual memory addresses. When the CPU wants to access such an address, the memory management unit (MMU) translates the virtual address into a physical address, and the CPU then fetches the data from that physical address.

2.2 MMU: Memory management unit

The MMU is hardware, not software. It translates virtual addresses into actual physical memory addresses, and it can also give specific memory blocks different read and write attributes, which is how memory protection is achieved.

To sum up, the MMU provides the following functions:

  • Virtual memory. With virtual memory, you can run applications that are larger than the actual physical memory. To support this, an operating system typically sets up a swap area (usually on a hard disk) and frees physical memory for other programs by moving inactive data and instructions into the swap area.
  • Memory protection. This lets you mark specific blocks of memory as readable, writable, or executable. For example, immutable data or code can be made read-only to prevent malicious modification.

For this article it is enough to know that the core job of the MMU is converting virtual addresses into physical addresses. Because programs exhibit locality, the CPU executes only a small amount of code during any short period of time, so data is loaded from the disk buffer into memory 4K at a time.

2.2.1 Pages

We know that only a small amount of a program’s code and data needs to be in physical memory at any time, so how much is loaded each time?

It is obviously impractical to manage the virtual space one storage unit at a time, so Linux divides virtual space into equal-sized partitions, which it calls pages. To make swapping in and out convenient, physical memory is also divided into blocks of the same size. Because a block of physical memory is a container for a virtual page, it is called a page frame. Pages and page frames are the foundation of Linux virtual memory technology.

To manage memory efficiently and conveniently, the CPU takes one page at a time: a contiguous 4K region of storage, also known as a block.

Since virtual memory is divided into pages and physical memory into page frames, an address is divided into two parts, as in the figure above: the page frame number (page number) and the offset. The page frame number (page number) identifies the page frame (page), and the offset identifies the storage unit within the page frame (page).
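With 4K pages the split is simple bit arithmetic: the low 12 bits are the offset and the remaining bits are the page number. A short Python sketch (the function names are illustrative, and the 4K page size follows the text; real systems may use other sizes):

```python
PAGE_SIZE = 4096                            # 4K pages, as in the text
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # 12 bits address one byte in a page


def split_address(address: int) -> tuple[int, int]:
    """Split an address into (page number, offset within the page)."""
    return address >> OFFSET_BITS, address & (PAGE_SIZE - 1)


def join_address(page_number: int, offset: int) -> int:
    """Rebuild the address from its two parts."""
    return (page_number << OFFSET_BITS) | offset
```

For example, address 0x12345 falls in page 0x12 at offset 0x345.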

2.2.2 Page table

The page table was introduced to map pages in virtual memory to page frames in physical memory. A page table is a data structure that stores the correspondence between page numbers in virtual memory and page frame numbers in physical memory.

Through the page table, we can find the physical address that corresponds to a virtual address, and thus obtain the data stored in physical memory. A map is a similar data structure. For memory mapping, it is enough to understand this design idea.
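Taking the "map" comparison literally, a page table can be sketched as a Python dict from page numbers to page frame numbers. This is a deliberately simplified, single-level model (real page tables are multi-level and live in hardware-defined formats); `PageFault` and `translate` are names chosen for the example.

```python
PAGE_SIZE = 4096   # 4K pages, as in the text


class PageFault(Exception):
    """Raised when a virtual page has no page frame assigned."""


def translate(page_table: dict[int, int], virtual_address: int) -> int:
    """Translate a virtual address into a physical one via the page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        # A real system would now load the page from disk and retry.
        raise PageFault(f"page {page_number} is not in physical memory")
    frame_number = page_table[page_number]
    return frame_number * PAGE_SIZE + offset   # the offset is carried over unchanged
```

With the table `{0: 7, 1: 3}`, virtual address 4101 (page 1, offset 5) translates to physical address 12293 (frame 3, offset 5).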

Finally, let’s take a quick look at the single copy that Binder, Android’s cross-process communication mechanism, is known for. At its core, it relies on memory mapping.

This is the classic Binder IPC model diagram. Normally, an interaction between two processes needs two copies: the sender’s data is copied into kernel space, and the kernel then copies it to the receiver. By using mmap (memory mapping), Binder removes the copy on the receiving side (the server). It only needs to copy the data once, from the sending process (the client) into kernel space.
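The effect of mmap can be imitated in user space with Python's `mmap` module. To be clear, this is an analogy and not Binder's actual API or driver code: a temporary file stands in for the kernel buffer, two shared mappings of it stand in for the kernel side and the receiver side, and the names are invented. It assumes a Unix host and a non-empty payload.

```python
import mmap
import tempfile


def one_copy_transfer(payload: bytes) -> bytes:
    """Copy `payload` once into a buffer that the receiver has mapped."""
    size = len(payload)
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)                             # reserve the buffer
        kernel_view = mmap.mmap(backing.fileno(), size)    # stand-in: kernel buffer
        receiver_view = mmap.mmap(backing.fileno(), size)  # stand-in: receiver mapping
        try:
            kernel_view[:] = payload    # the single copy, out of the sender
            # No second copy is needed for delivery: the receiver's mapping
            # already shows the same pages. Slicing into bytes here is only
            # done so the function can return the data.
            return receiver_view[:]
        finally:
            kernel_view.close()
            receiver_view.close()
```

The point of the sketch is the missing step: nothing ever copies from the "kernel" mapping to the "receiver" mapping, because both refer to the same memory.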

Conclusion

Starting from several Linux IPC mechanisms, this article briefly analyzed the advantages and disadvantages of each and showed how they lead to the design of Binder, Android’s cross-process communication model. It is a Linux primer for Binder communication, intended to give readers a general idea of the mechanisms for cross-process communication and a better understanding of why Binder was designed for Android. In the next article, we will examine Binder’s architecture.

| IPC method | Secure | Efficiency | Direction | Model |
| --- | --- | --- | --- | --- |
| Pipe | Yes | Not high | One-way | 1-to-1 |
| Shared memory | No | Highest | Two-way | Many-to-many |
| Socket | Yes | Very low | Two-way | Many-to-many |
| File | No | Low | Two-way | Many-to-many |
| Binder | Yes | High | Two-way | Many-to-many |

References

Linux pipe - CSDN blog

Linux - Interprocess communication programmer - CSDN blog

Interprocess Communication in Linux - Shared memory - CS_WU - Cnblogs.com

Linux Socket local process communication - GXYandSXP blog - CSDN blog

Three deep and simple processor three memory management and memory management unit (MMU) - CSDN blog

Linux kernel learning: Virtual memory details (MMU, page table structure) - Zhihu (zhihu.com)