The Reflection series is my attempt at a new way of learning. For the origin of the series and its table of contents, please refer to here.

Overview

ANR is a well-worn topic for Android developers. From an interviewer’s perspective, answering “Don’t do time-consuming operations on the main thread” seems to settle the question.

But what exactly is the ANR mechanism, what principle lies behind it, and why was it designed this way? These questions come up again and again, and to answer them you have to turn to Android’s own input system.

What is Android’s input system? In short, any interaction with an Android device, which we call an input event, has to be managed and distributed through the input system. The link in this chain that is closest and most familiar to us is the View event distribution process.

Seen this way, the input system is a very large and complex topic. The closer you get to the low-level details, the easier it is to miss the forest for the trees; repeat that a few times and you end up lost in the code, with the learning effort wasted.

A better approach is to combine theory with practice and control the granularity of the analysis: understand the design of the input system systematically from a macro perspective, then extend that understanding to the principles of the ANR phenomenon and the ways to resolve it in day-to-day development. That is the original intention behind this article.

This article is quite long, and the mind map is as follows:

Explore from the top down

To understand the Android system, you first need a clear picture of application processes and system processes. The former are the applications that developers build on top of the Android platform; the latter are the core processes created by the Android system itself.

Setting application processes aside for now, let’s look at the system process, which initializes the input system and manages its scheduling.

When Android boots, it starts the Zygote process, and the SystemServer process is then forked from Zygote. As one of the system processes, SystemServer provides a series of system services, and the InputManagerService discussed next is one of them.

During SystemServer initialization, InputManagerService (IMS) and WindowManagerService (WMS) are created, and creating WMS depends on an injected IMS object:

// SystemServer.java
private void startOtherServices() {
    // ...
    InputManagerService inputManager = new InputManagerService(context);
    // inputManager is passed in as a construction parameter of WindowManagerService
    WindowManagerService wm = WindowManagerService.main(context, inputManager, ...);
}

WMS is very important in the input system: it manages the communication between IMS, Window, and ActivityManager. We will come back to it later; let’s look at IMS first.

As the name implies, IMS is responsible for initializing the input module at the Java level and, through JNI calls, for creating and setting up the lower-level input subsystem at the native level.

In the process of invoking JNI, IMS creates an instance of NativeInputManager, which in turn creates EventHub and InputManager in the initialization process:

NativeInputManager::NativeInputManager(jobject contextObj, jobject serviceObj,
        const sp<Looper>& looper) : mLooper(looper), mInteractive(true) {
    // ...
    // Create an EventHub object
    sp<EventHub> eventHub = new EventHub();
    // Create an InputManager object
    mInputManager = new InputManager(eventHub, this, this);
}

We have now reached the native level. Note that the native layer as a whole is very important: downward, it obtains input from the Linux device nodes; upward, it communicates with the Java layer that sits closer to the user. In this layer, EventHub and InputManager are the two central roles.

What are the responsibilities of these two roles? EventHub is the core class of the underlying input subsystem; it reads events from the physical input devices and hands them to InputManager. InputManager encapsulates InputReader and InputDispatcher, which read events from EventHub and distribute them:

InputManager::InputManager(...) {
    mDispatcher = new InputDispatcher(dispatcherPolicy);
    mReader = new InputReader(eventHub, readerPolicy, mDispatcher);
    initialize();
}

In a nutshell, EventHub establishes communication between Linux and input devices, and InputReader and InputDispatcher in InputManager are responsible for reading and distributing input events, both of which are really important in an input system.

Here is a simple summary of this, using the picture on the Internet:

EventHub and epoll mechanism

Most app developers probably don’t need to spend much time digging into the details of EventHub’s implementation. A brief understanding of its responsibilities seems like enough.

However, while reading EventHub’s implementation, the author found that its use of the epoll mechanism is a classic case study, so taking the time to understand a little more kills two birds with one stone.

As mentioned above, EventHub establishes communication between Linux and input devices. That description is not precise, though. What problem is EventHub designed to solve, and how is it implemented?

1. Multi-input devices and input subsystems

As we know, an Android device can be connected to several input devices at the same time, such as a screen, a keyboard, and a mouse. User input on any of them generates an interrupt, which the interrupt handling of the Linux kernel and the device driver converts into an event that is finally handed to a user-space application for processing.

The Linux kernel provides an abstraction layer that unifies the different data interfaces of different devices. As long as a low-level input device driver is implemented against this abstract interface, applications can access every input device through one unified interface. This is the input subsystem of the Linux kernel.

So how are the events delivered by the input subsystem handled? This brings us to EventHub, the hub for low-level event processing, which uses the epoll mechanism to continuously receive input events and pass them up to InputReader.

2. What is epoll

A classic question: the Looper behind Handler polls in an infinite loop, so why doesn’t the program crash or trigger an ANR because of it?

Readers probably know the answer: Handler uses the epoll mechanism to block on and wake up the message queue. Here is a classic explanation of the epoll mechanism, recommended for readers who are not familiar with its design philosophy:

What is the principle of epoll or kqueue?

Drawing on the explanation above, here is a brief summary of the epoll mechanism:

epoll can be understood as “event poll”. Unlike busy polling and undifferentiated polling, when there are multiple input streams epoll tells us only which I/O event occurred on which stream, so every operation we then perform on those streams is meaningful.

epoll is a natural fit for EventHub, because multiple physical input devices correspond to multiple input streams. During EventHub initialization, mEpollFd and mINotifyFd are created: mINotifyFd watches the device node directory for device files being added or deleted, and mEpollFd is the epoll instance used to wait for readable events, that is, device input.
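
To make the "tell me which stream is ready" idea concrete, here is a minimal sketch in Java. It assumes java.nio's Selector as a stand-in for epoll (EventHub itself calls epoll directly from C++), and two Pipe objects play the role of two input devices. The class name, method names, and device names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Android.
final class EpollStyleSketch {

    // Registers two pipes ("devices") with one Selector, writes to only one
    // of them, and returns the names the Selector reports as readable.
    static List<String> readyDevices() {
        try (Selector selector = Selector.open()) {
            Pipe screen = Pipe.open();
            Pipe keyboard = Pipe.open();
            try {
                watch(selector, screen, "screen");
                watch(selector, keyboard, "keyboard");

                // Only the "keyboard" produces input.
                keyboard.sink().write(ByteBuffer.wrap(new byte[] {42}));

                // Waits (for at most one second) until a registered channel is ready.
                selector.select(1000);

                List<String> ready = new ArrayList<>();
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isReadable()) {
                        ready.add((String) key.attachment());
                    }
                }
                return ready;
            } finally {
                closeBothEnds(screen);
                closeBothEnds(keyboard);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Puts the read end in non-blocking mode and registers it for read events.
    private static void watch(Selector selector, Pipe pipe, String name) throws IOException {
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ, name);
    }

    private static void closeBothEnds(Pipe pipe) throws IOException {
        pipe.sink().close();
        pipe.source().close();
    }
}
```

Calling `EpollStyleSketch.readyDevices()` should report only the keyboard: the idle screen pipe never shows up, so the caller never wastes a read on a stream that has nothing to say.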

Read and distribute events

This section provides a systematic introduction to InputReader and InputDispatcher.

1. InputReader: Read event

What is InputReader? After receiving events from EventHub, InputReader processes and encapsulates them, adds them to InputDispatcher’s queue, and finally wakes up InputDispatcher for the next step, event distribution.

At first glance, InputReader looks like an ordinary part of the native layer of the input system, but the most unassuming pieces often play a crucial role in the whole process.

First of all, EventHub events include not only ordinary input events but also events about the devices themselves, such as a device being added, removed, or scanned. These extra events are not passed on to InputDispatcher directly; they are handled inside InputReader.

At some point epoll_wait() returns and the event is stored, perhaps because the user pressed a key, a device was plugged in, or a device property was changed.

After that, InputReader reads the input events in one pass. Because different devices need different event-processing logic, InputReader holds a series of Mappers and tries to match each event against them. If no Mapper matches, the event is ignored. Otherwise, the event is encapsulated into a new NotifyArgs object and placed in the queue, which also wakes up InputDispatcher for distribution.
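
The matching step can be sketched roughly as follows. This is a simplified Java model rather than the real C++ InputReader: `RawEvent`, `Mapper`, the device type strings, and the string results standing in for NotifyArgs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// All names below are hypothetical; the real InputReader and its mappers are C++.
final class MapperSketch {

    // A raw event as it might come out of EventHub: a device kind and a code.
    static final class RawEvent {
        final String deviceType;
        final int code;

        RawEvent(String deviceType, int code) {
            this.deviceType = deviceType;
            this.code = code;
        }
    }

    // A mapper understands one kind of device and turns its raw events into
    // a description ready for the dispatcher (standing in for NotifyArgs).
    interface Mapper {
        Optional<String> map(RawEvent event);
    }

    static final Mapper KEYBOARD = e ->
            e.deviceType.equals("keyboard") ? Optional.of("key:" + e.code) : Optional.empty();

    static final Mapper TOUCH = e ->
            e.deviceType.equals("touch") ? Optional.of("motion:" + e.code) : Optional.empty();

    // Offers each raw event to every mapper; events that no mapper accepts are dropped.
    static List<String> process(List<RawEvent> rawEvents, List<Mapper> mappers) {
        List<String> queued = new ArrayList<>();
        for (RawEvent event : rawEvents) {
            for (Mapper mapper : mappers) {
                mapper.map(event).ifPresent(queued::add);
            }
        }
        return queued;
    }

    // Example input: a key press, an event from an unknown device, and a touch.
    static List<String> example() {
        List<RawEvent> raw = Arrays.asList(
                new RawEvent("keyboard", 24),
                new RawEvent("unknown", 1),
                new RawEvent("touch", 7));
        return process(raw, Arrays.asList(KEYBOARD, TOUCH));
    }
}
```

In `example()`, the event from the unknown device matches no mapper and is silently ignored, while the other two are queued in arrival order.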

Notably, before waking up InputDispatcher for distribution, InputReader performs a very special interception step in its own thread.

2. Input event interception and conversion

Readers know that in application development, some special input events cannot be intercepted in the ordinary way, for example the volume keys, the Power key, the call key, and some special key combinations. These are commonly called system keys.

Android is open to developers, but within limits. Most key presses can be intercepted by applications; system keys cannot. This restriction is often the last line of defense for the security of the user’s device.

So, before InputReader wakes up InputDispatcher for event distribution, it performs two rounds of interception in its own thread.

The first round handles system-key-level input events; on phones this is done in PhoneWindowManager. For example, when the user presses the Power key, the Android device wakes up or goes to sleep, that is, the screen turns on or off.

This is why, on technical forums, the usual solution for intercepting system keys is to modify the source code of PhoneWindowManager.

The input event then enters the second round. If the user has enabled certain features under Settings -> Accessibility, AccessibilityManagerService may transform the input into a new event as needed. Taking gesture recognition as an example, a two-finger pinch gesture eventually becomes a ZoomEvent.

Note that interception here does not actually consume the event. Instead, the event is marked in a special way (policyFlags) and then handled in InputDispatcher.
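
The "mark now, decide later" idea can be illustrated with a small bit-flag sketch. It is only a model of the description above: the flag names, the key code constants, and both method names are hypothetical and do not mirror the native implementation.

```java
// Hypothetical flags and key codes for illustration only.
final class PolicyFlagSketch {
    static final int FLAG_PASS_TO_USER = 1;        // event may be delivered to the app window
    static final int FLAG_SYSTEM_HANDLED = 1 << 1; // event was claimed by the system policy

    static final int KEY_POWER = 26; // illustrative key code for a system key
    static final int KEY_A = 29;     // illustrative key code for an ordinary key

    // Reader side: never drops the event, only decides which flags to attach.
    static int interceptBeforeQueueing(int keyCode) {
        if (keyCode == KEY_POWER) {
            return FLAG_SYSTEM_HANDLED;
        }
        return FLAG_PASS_TO_USER;
    }

    // Dispatcher side: reads the flags set earlier and decides whether to deliver.
    static boolean shouldDeliverToApp(int policyFlags) {
        return (policyFlags & FLAG_PASS_TO_USER) != 0;
    }
}
```

With this split, the reader side stays cheap and never discards anything; the decision to withhold a system key from the application is made later, on the dispatcher side, from the flags alone.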

At this point, InputReader completes a full round of processing of the input event, after which InputReader enters a new round of waiting.

3. InputDispatcher: Dispatches events

When the InputDispatcher waiting in its Looper is woken up by wake(), it begins a new round of event distribution.

More precisely, the wake() call that wakes InputDispatcher is actually executed in the InputManagerService thread, so the overall thread-switching sequence is InputReaderThread -> InputManagerServiceThread -> InputDispatcherThread.

The InputDispatcher thread is responsible for distributing the input events it receives to the target application window. In doing so, InputDispatcher first intercepts the system-key events that were marked for interception in the previous section; intercepted events are not passed any further.

InputDispatcher then reaches one of the key steps of this article: it calls findFocusedWindowTargetLocked() to obtain the currently focused window and, at the same time, checks whether an ANR has occurred in the target application.

If the target window is in a normal state, that is, no ANR has occurred, InputDispatcher enters the actual dispatch step. It encapsulates the event once more and, via SocketPair, wakes up the Looper thread of the process that owns the target window, which is the main thread of our application process. The main thread then reads the corresponding key value and processes it.

On the surface the whole distribution process looks clean and easy to follow, but InputDispatcher’s logic is in fact very complex. After all, how simple can an event distribution process that spans three threads be?

In addition, InputDispatcher is responsible for ANR handling, which adds yet another level of complexity. That part is analyzed in more detail in the ANR chapter below, so it is not covered here.

Next, let’s take a look at how the application process establishes communication links with the system process during the distribution of input events.

4. Establish communication through Socket

This section on establishing cross-process communication was originally planned as a full chapter, but for the input system as a whole it is important without being essential. In the end the author describes it only briefly; interested readers can follow the reference links at the end of the article for details.

We know that InputReader and InputDispatcher run in the system_server process, while the user interacts with the application process. Cross-process communication is therefore involved, so how does an application process establish communication with the system process?

Let’s go back to the initialization of WindowManagerService (WMS) and InputManagerService (IMS). Once IMS and the other system services have been initialized, applications can start.

If an application has Activities (only Activities can accept user input), it registers its windows with WMS.

Android uses sockets rather than Binder for this. In WMS, openInputChannelPair creates the file descriptors of two sockets, which represent the two ends of a two-way channel. Data written to one end can be read from the other; if one end tries to read while the other has written nothing, the read blocks and waits.

Finally, InputDispatcher creates a Connection object for the target application, which represents the link to the remote application window. Likewise, ViewRootImpl in the application process creates a WindowInputEventReceiver to receive the events sent by InputDispatcher.

We now have a preliminary picture of this cross-process communication. Binder is the most widely used cross-process communication mechanism in Android, but is it used everywhere? The answer is no: at least in the input system, sockets play a significant role alongside Binder.

This raises a new question: why choose sockets over Binder? The author found a good explanation:

Sockets allow asynchronous notification and need only two threads to take part (one at each end of the pipe). Assuming the system has N applications, the number of threads involved in input processing is N+1 (the 1 being the InputDispatcher thread). With Binder, to receive asynchronously each application would need two threads, one Binder thread and one background processing thread (input cannot be processed in the Binder thread, because that takes too long and would block the calling thread on the sender side). The sender would likewise need two threads, one for sending and one for receiving the application’s completion notifications, so N applications would require 2(N+1) threads. Sockets are much more efficient by comparison.

The application process can now receive the input events processed and distributed by InputDispatcher. This brings us to the application-level event distribution process we know best. For what happens from there on, see the author’s two other articles below; it is not covered here.

  • The design and implementation of Android event distribution mechanism
  • The design and implementation of Android event interception mechanism

IV. Design and implementation of ANR mechanism

Now that we have a preliminary overall understanding of the input system, this article goes on to explore the ANR mechanism.

Generally speaking, ANRs come from four sources: Service, Broadcast, Provider, and Input. The first three can be grouped as component ANRs, and Input ANR forms a category of its own.

The reason for this grouping is as follows. ANR problems in application components are usually relatively easy to solve: if the ANR is easy to reproduce, the developer usually only needs to check whether the component’s code does time-consuming work on the main thread. The latter kind of ANR is caused by a timeout in the distribution of input events, including key presses and screen touches. From the previous chapter, readers know that the InputDispatcher in the system process is responsible for handling ANR in the input system, and its overall logic is more complicated than the former’s.

With that brief introduction in place, note that the statement “a component ANR usually happens because the main thread did time-consuming work” is actually a generalization. More precisely, the essence of the problem is that the component’s task scheduling timed out, and when device resources are tight the causes of an ANR are more varied.

The internal mechanism of Input ANR is different from that of Service, Broadcast, and Provider.

1. Overview of the principle of the first type

Gityuan describes the ANR of the Service, Broadcast, and Provider components in this article:

ANR is a mechanism that monitors whether Android applications respond in time. It can be compared to detonating a bomb, and the whole process consists of three parts:

  • Planting the time bomb: the central control system (the system_server process) starts a countdown. If the target (the application process) has not finished all its work within the specified time, the central control system blows up (kills) the target.
  • Defusing the bomb: the target finishes all its work within the specified time and reports completion to the central control system in time, asking it to remove the time bomb, and so survives.
  • Detonating the bomb: the central control system immediately seals off the scene, takes a snapshot, and collects evidence of the target’s slow execution (traces) to help the follow-up investigation (debugging and analysis), and finally blows up the target.

Comparing the component ANR mechanism to a time bomb is apt. Take Service as an example. For the Android system, starting a Service is in essence asynchronous communication between processes.

Therefore, Android designed a do-or-die mechanism. When a Service is about to be started, the central control system system_server plants a time bomb. If the Service finishes starting in time, the bomb is removed. Otherwise the bomb is detonated in the ActivityManager thread of system_server. This is how the component ANR mechanism works.
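
The plant, defuse, and detonate steps can be sketched as a small Java model. It is an illustration of the metaphor only, not the system_server implementation: the class and method names are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the "time bomb"; the real logic lives in system_server.
final class TimeBombSketch {
    // Component name -> deadline in milliseconds.
    private final Map<String, Long> deadlines = new LinkedHashMap<>();

    // Plant: remember the deadline for work that was just requested.
    void plant(String component, long nowMs, long timeoutMs) {
        deadlines.put(component, nowMs + timeoutMs);
    }

    // Defuse: the app reported completion; returns true if a bomb was removed.
    boolean defuse(String component) {
        return deadlines.remove(component) != null;
    }

    // Detonate: returns every component whose deadline has passed and removes its bomb.
    List<String> detonateExpired(long nowMs) {
        List<String> anrs = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs >= entry.getValue()) {
                anrs.add(entry.getKey());
                it.remove();
            }
        }
        return anrs;
    }
}
```

For example, if two services are planted at time 0 with an assumed 20 second timeout and only one of them calls `defuse` in time, `detonateExpired(20000)` returns just the other one. The key property is that the bomb goes off when the deadline passes, whether or not anything else happens in the meantime.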

Let’s take a quick look at how the ANR mechanism works in the input system flow.

2. Overview of the principle of the second type

ANRs of the Input type are more common and more complex in everyday development. A typical case is feedback from users or testers that tapping a UI element on the screen makes the app “freeze”.

In a few cases the developer can locate the problem quickly. More often, though, the problem is random and hard to reproduce, and its causes are varied: system resources on a low-end device may already be very tight, several threads may each hold a resource the others need and deadlock, or something more complex may be going on. Dealing with this kind of problem therefore requires some understanding of the ANR mechanism in the input system.

Unlike component ANR, the Input timeout mechanism does not explode the moment time is up. Instead, whether it should explode is checked while a subsequent reported event is being processed, so it is more like minesweeping.

What does minesweeping mean here? In the input system, even if an event takes longer than expected to process, no ANR is raised as long as the user generates no further input events.

Only when a new input event arrives and the window that is still handling the previous event (that is, the app itself) cannot free itself to accept it does InputDispatcher decide dynamically, based on the timeout period, whether to show ANR information for that window.

This is also why the ANR dialog does not pop up when the user taps the screen the first time, even if processing that event times out, but appears once the user instinctively taps the screen again.
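
The difference from the time bomb can be shown with a deliberately simplified Java model of the description above. It is not the actual InputDispatcher logic: the class, the enum, and the 5 second timeout are assumptions for illustration, and only one window is modelled.

```java
// Hypothetical model of the "minesweeping" check for a single window.
final class MinesweeperSketch {
    static final long TIMEOUT_MS = 5_000; // assumed timeout, for illustration only

    enum Result { DISPATCHED, WAITING, ANR }

    private boolean busy = false;    // true while the app has not finished the last event
    private long lastDispatchMs = 0; // when the in-flight event was handed to the app

    // Called when a new input event arrives at time nowMs.
    Result onNewEvent(long nowMs) {
        if (!busy) {
            // Nothing in flight: dispatch and restart the clock.
            busy = true;
            lastDispatchMs = nowMs;
            return Result.DISPATCHED;
        }
        // Something is in flight: only now do we check whether it has taken too long.
        return (nowMs - lastDispatchMs > TIMEOUT_MS) ? Result.ANR : Result.WAITING;
    }

    // Called when the app reports that it finished handling the in-flight event.
    void onEventFinished() {
        busy = false;
    }
}
```

Note that nothing in this model runs on a timer. If the first event is never finished and no second event arrives, `onNewEvent` is simply never called again and no ANR is reported; a second event after 1 second only waits, while one after 6 seconds reports the ANR.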

Component ANR and Input ANR do differ in principle. In addition, the former’s ANR information is processed in the ActivityManager thread, while the latter’s is processed in the InputDispatcher thread. Here is a brief picture of the overall flow of the latter:

Now that we have a brief understanding of the ANR mechanism for Input types, we’ll explore its implementation in more in-depth detail below.

3. Asynchronous mechanism of event distribution

Let’s turn our attention again to the implementation details of InputDispatcher.

When events are distributed at the native level of the system_server process, should the communication down to the application process be synchronous or asynchronous?

The answer is asynchronous. Two-way communication between the two sides is established via SocketPair, and event distribution from InputDispatcher in system_server is actually one-to-many. If it were synchronous, a timeout in a single application would leave the InputDispatcher thread stuck, unable to reach the next round of event distribution, let alone run the minesweeping mechanism.

Therefore, unlike event distribution inside the application process, which we can generally regard as synchronous on the main thread, the internal implementation of the input system as a whole is more complex, because it involves asynchronous communication between the system process and multiple application processes.

Since event distribution involves an asynchronous callback mechanism, InputDispatcher has to maintain and manage the events itself, and the question becomes which data structure is suitable for holding these input events.

4. Three queues

In the InputDispatcher source code, the overall event distribution process uses three event queues:

  • mInBoundQueue: records the input events sent by InputReader.
  • outBoundQueue: records the input events that are about to be distributed to the target application window.
  • waitQueue: records the input events that have been dispatched to the target application but have not yet been processed by it.

Below, the author briefly walks through the roles of the three queues using two rounds of event distribution as an example.

4.1 First round of event distribution

First, the InputReader thread listens for the underlying input event report via EventHub, places it in the mInBoundQueue, and wakes up the InputDispatcher thread.

InputDispatcher then starts the first round of event distribution. No event is in progress at this point, so InputDispatcher takes the event at the head of mInBoundQueue, resets the ANR timer, and checks whether the window is ready. The window is ready, so the event is moved to outBoundQueue. Because the application’s end of the channel is properly connected and two-way socket communication has been established, the event is then taken out of outBoundQueue and put into waitQueue, and the application process receives the new event and distributes it.

If the application process distributes the event normally, it notifies system_server of completion through the socket, and the corresponding event is finally removed from waitQueue.
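
The journey of one event through the three queues can be sketched as follows. This is a single-window Java model under simplifying assumptions, not the real dispatcher: the class and method names are hypothetical, events are plain strings, and "sending over the socket" is modelled by moving the event into the wait queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, single-window model of the three queues.
final class ThreeQueueSketch {
    final Deque<String> inboundQueue = new ArrayDeque<>();  // events sent by InputReader
    final Deque<String> outboundQueue = new ArrayDeque<>(); // events about to go to the window
    final Deque<String> waitQueue = new ArrayDeque<>();     // events sent, awaiting "finished"

    // InputReader side: enqueue a new event.
    void enqueue(String event) {
        inboundQueue.addLast(event);
    }

    // One dispatch pass: moves the head event inbound -> outbound -> wait,
    // but only if the window is not still working on an earlier event.
    boolean dispatchOnce() {
        if (!waitQueue.isEmpty() || inboundQueue.isEmpty()) {
            return false;
        }
        outboundQueue.addLast(inboundQueue.pollFirst());
        // Sending to the app is modelled by moving the event to waitQueue.
        waitQueue.addLast(outboundQueue.pollFirst());
        return true;
    }

    // App side: reporting completion removes the event from waitQueue.
    void finish(String event) {
        waitQueue.remove(event);
    }
}
```

If two events are enqueued, the first `dispatchOnce()` moves the first event into `waitQueue`, and a second call returns false until `finish` is called for it. That blocked state is exactly where the timeout check for Input ANR comes into play.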

4.2 Second round of event distribution

If the first round of event distribution has not yet received a callback notification, what happens when the second round of event distribution arrives?

When the second event arrives at InputDispatcher, InputDispatcher finds that an event is still being processed, so it does not take a new event from mInBoundQueue. Instead it directly checks whether the window is ready, and if it is not, it enters the ANR detection state.

The following conditions lead to ANR detection:

1. The target application is not empty but the target window is empty, for example because an exception occurred while the application was starting.
2. The target Activity is in the paused state, that is, it is no longer the focused application.
3. The target window is still processing the previous event.

Note that “the target window is still processing the previous event” does not always result in an ANR. If the timeout has not been reached, this round of event distribution is simply aborted; only if it has been reached is an ANR determined to have occurred.

This is why Input ANR is described as minesweeping. Minesweeping here means that, while the input system is handling a time-consuming event, every subsequent input event checks whether the previous event has timed out (entering the minesweeping state), that is, whether the time elapsed since the last input event was distributed exceeds the timeout. If no previous input event is still in progress, the ANR timeout is reset and nothing explodes.

At this point, the input system has detected an ANR and reports the ANR information to the upper layer.

Summary

This article aims to give a systematic overview of the Android input system. Readers should not treat it as their only learning material; rather, use it to get a preliminary picture of the body of knowledge, then dig deeper in individual directions as needed. For those who already know the skeleton, the finer details are just flesh and blood waiting to be filled in.

From choosing the topic to publication, this article took almost a month and a half. During that time the author went over the material for this article dozens of times and benefited greatly, and also came to appreciate how hard it is to write with the goal of making deep material accessible. Summarizing a large, complex system in concise and fluent language demands that every sentence be extremely accurate, which was a challenge for the author. After finishing, however, the author has a much deeper understanding of the whole body of knowledge.

And that’s what the Reflection series is all about. I hope you enjoy it.

Reference & extended reading

As mentioned above, the input system and ANR are both very large topics. Besides a broad body of knowledge, they also require hands-on practice and summarizing for yourself. Some relevant references are listed below so that readers can extend their reading selectively according to their needs:

3. Understand the trigger principle of Android ANR @gityuan

Gityuan’s ANR blog series is truly pioneering. In the first article especially, the time bomb and minesweeping metaphors are apt and easy to understand, and that style of writing shows the author’s deep grasp of the whole body of knowledge. The next two articles analyze each type of ANR at the source-code level and are well worth savoring.

4. Android-Android Event Input System @ Dust everywhere

The author once wanted to write an illustrated Android series but gave it up for various reasons, only to find that pioneers had already tried it years ago, with very high-quality content. An article that took this much effort to put together will not be buried, and this one is destined to become a classic among classics.

5. Android Input series @stan_z

A very good author whom the author of this article started following recently. The articles go deep, and the Input series analyzes the whole input system at the source-code level in detail. Well worth bookmarking.

6. Android signal processing: signal definition, behavior, and source @rambo2188

If readers are interested in how Android handles signals, this article is a must-read.

7. Android development master class @ Zhang Shaowen

A classic hands-on course. Every lesson goes extremely deep, and its value is hard to overstate. Because interests may be involved, and the author gets no money from Teacher Zhang for this recommendation, it is not linked here and is placed last (laughs).

