What exactly do we mean by fluency? Different people define fluency differently and have different thresholds for perceiving jank, so before starting this series it is worth clarifying what it covers, to avoid misunderstandings and so that you can read these articles with concrete questions in mind. Here are some basic notes:

  1. For phone users, jank covers many scenarios: dropped frames when scrolling a list, a long white screen at application startup, the screen lighting up slowly after pressing the power button, the interface not responding and then crashing, no response when tapping an icon, choppy window animations, content not following the finger while sliding, jank on the home screen right after a reboot, and so on. These scenarios differ somewhat from what we developers think of as jank, because developers classify these problems more finely. This is a cognitive gap between developers and users, and it matters most when handling problem reports from users (or testers).
  2. For developers, the scenarios above fall into three broad categories: fluency (dropped frames when scrolling a list, choppy window animations, jank on the home screen after a reboot), response speed (a long white screen at application startup, the screen lighting up slowly after pressing the power button, content not following the finger while sliding), and stability (the interface not responding and then crashing, no response when tapping an icon). They are classified this way because each category has its own analysis methods and steps, so it is important to identify quickly which category a problem belongs to.
  3. Technically, users perceive fluency, response speed, and stability (ANR) problems alike as jank because all three share the same principle: a Message on the main thread takes too long to execute, and the three categories simply correspond to different timeout thresholds. To understand these problems you therefore need to understand some basic operating mechanisms of the system, which this article introduces.
  4. This series focuses on fluency problems; response speed and stability will be covered in separate articles. Once you understand the fluency material, analyzing response speed and stability problems takes far less effort.
  5. This fluency series is mainly about analysis with the Systrace (Perfetto) tool. Systrace is the starting point because many factors affect fluency, some in the App itself and some in the system, and Systrace (Perfetto) shows how a problem unfolds from the perspective of the whole device, which makes it convenient for a first localization of the problem.

The Systrace fluency series currently consists of the following three articles:

  1. Systrace Fluency in Practice 1: Understanding How Jank Happens
  2. Systrace Fluency in Practice 2: Case Study: MIUI Home Screen Scroll Jank Analysis
  3. Systrace Fluency in Practice 3: Some Questions About Jank Analysis

If you are not familiar with the basic use of the Systrace (Perfetto) tool, you should first complete the Systrace basics series

Understanding how jank happens

Jank and its impact

As stated at the beginning, this article focuses on fluency. Fluency itself is an abstract notion, so when we evaluate the fluency of a scene we usually use FPS. For example, 60 FPS means the picture is updated 60 times per second, and 120 FPS means 120 updates per second. If, during a continuous animation at 120 FPS, the picture is updated only 110 times in a second, frames have been dropped: this is jank, and the corresponding drop in FPS from 120 to 110 can be monitored precisely.
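The 120 FPS example can be written out as a small calculation. This is only a sketch: `FrameDropStats` and its methods are hypothetical names made up for illustration and are not part of any Android API.

```java
// Hypothetical helper for illustration; not an Android class.
final class FrameDropStats {
    /** Frames missed during one second of continuous animation. */
    static int droppedFrames(int targetFps, int actualFps) {
        return Math.max(0, targetFps - actualFps);
    }

    /** Share of the expected frames that were missed, in percent. */
    static double dropRatePercent(int targetFps, int actualFps) {
        return 100.0 * droppedFrames(targetFps, actualFps) / targetFps;
    }

    public static void main(String[] args) {
        // The example from the text: a 120 FPS scene that only reaches 110 FPS.
        System.out.println("dropped frames: " + droppedFrames(120, 110));
        System.out.printf("drop rate: %.2f%%%n", dropRatePercent(120, 110));
    }
}
```

For the numbers in the text, 10 of the 120 expected frames are missing, which is a little over 8% of the frames in that second.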

Frames can be dropped for many reasons, which may lie in the App itself, the system, the hardware layer, or the device as a whole. For details, see the following four articles:

  1. Overview of Android jank and dropped-frame causes – methodology
  2. Overview of Android jank and dropped-frame causes – system
  3. Overview of Android jank and dropped-frame causes – application
  4. Overview of Android jank and dropped-frame causes – low memory

Of all the problems users run into on their phones, jank is the one they notice most easily:

  1. Occasional jank, such as a stutter while scrolling Weibo or in the return-to-home-screen animation, degrades the user experience.
  2. Jank across the whole device makes the phone unusable.
  3. In today's era of high frame rates, if a user who is used to 120 FPS is suddenly switched to 60 FPS in a scene where the change is easy to perceive, the user will clearly notice it and feel that there is jank.

Therefore both applications and the system should try to avoid jank, and any jank that is found should be fixed as a priority.

Jank definition

The overall flow of rendering one application frame

To understand how jank arises, we first need to know how one frame on the application main thread works.

From the point of view of execution order

The flow starts when Choreographer receives Vsync and ends when SurfaceFlinger/HWC finishes compositing the frame.

From Systrace's point of view

The flow above is more intuitive when viewed in Systrace (Perfetto).

The specific process becomes clear when the two figures above are read together with the code. A timeout at any step of this overall flow can lead to jank, so a jank problem has to be analyzed at several levels, such as the application main thread, the render thread, the SystemServer process, the SurfaceFlinger process, and the Linux (kernel) area.

Jank definition

My definition of jank is: in output that should run at a stable frame rate, one or more frames are not drawn. The corresponding English terms are Smooth vs. Jank.

For example, in the figure below, during normal drawing (usually an animation or a list scroll) the App main thread fails to draw one frame, so we consider that this frame may cause jank (only "may": because of Triple Buffer, a frame might not actually be dropped).

Jank can be defined from three perspectives:

  1. In terms of what is seen: while the App is continuously playing an animation or a list is following the finger (continuity is the key), if the App's picture stays unchanged for two or more consecutive frames, we consider that jank has occurred.
  2. From SurfaceFlinger's perspective: while the App is continuously playing an animation or a list is following the finger (continuity is the key), if a Vsync arrives and the App has no Buffer ready to be composited, SurfaceFlinger skips the compositing logic in that Vsync cycle (or composites only other layers) and the display shows the App's previous frame. We consider this jank.
  3. From the App's perspective: if the render thread does not queueBuffer into the App's BufferQueue in SurfaceFlinger within a Vsync cycle, we consider that jank has occurred.

The main thread is not mentioned here because a long-running main thread usually causes jank indirectly: it delays the render thread and increases the risk that the render thread misses its deadline. Even so, most jank caused by applications comes from the main thread taking too long.
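The SurfaceFlinger-side definition can be turned into a small counting sketch. It assumes we already know, for each Vsync during a continuous animation, whether the App had a new Buffer ready; `JankCounter` and the boolean array are hypothetical and only model the definition, they do not read a real trace.

```java
// Hypothetical model of the definition above; it does not parse Systrace data.
final class JankCounter {
    /** Number of Vsync cycles in which no new App Buffer was available. */
    static int countMissedVsyncs(boolean[] bufferReadyAtVsync) {
        int missed = 0;
        for (boolean ready : bufferReadyAtVsync) {
            if (!ready) {
                missed++;
            }
        }
        return missed;
    }

    /** Longest run of consecutive Vsync cycles that repeated the previous frame. */
    static int longestStall(boolean[] bufferReadyAtVsync) {
        int longest = 0;
        int current = 0;
        for (boolean ready : bufferReadyAtVsync) {
            current = ready ? 0 : current + 1;
            longest = Math.max(longest, current);
        }
        return longest;
    }

    public static void main(String[] args) {
        // true = the App queued a Buffer in time for this Vsync.
        boolean[] vsyncs = {true, true, false, true, false, false, true};
        System.out.println("missed vsyncs: " + countMissedVsyncs(vsyncs));
        System.out.println("longest stall: " + longestStall(vsyncs));
    }
}
```

Each missed Vsync means the previous picture stays on screen for one more refresh, which is what the first definition describes from the user's side.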

We also need to distinguish logical jank. Logical jank means that the rendering of a frame completes normally and the corresponding Buffer is handed to SurfaceFlinger for composition, but the content of this App Buffer is the same as that of the previous App Buffer (or so nearly the same that the naked eye cannot tell the difference). To the user, two consecutive frames show the same content. Generally we treat this as jank too, although the specific situation has to be examined. Logical jank is mainly caused by the application's own code logic.

Introduction to the system operating mechanism

Because jank has many possible causes, analyzing it requires some understanding of how the Android system operates. The following is a brief introduction to the mechanisms you need to understand in order to analyze jank:

  1. Operation principle of App main thread
  2. Message, Handler, MessageQueue, and Looper mechanisms
  3. Screen refresh mechanism and Vsync
  4. Choreographer mechanism
  5. Buffer process and TripleBuffer
  6. The Input process

System mechanism – running principle of App main thread

After the App process is forked, the main method of ActivityThread is called to initialize the main thread:

frameworks/base/core/java/android/app/ActivityThread.java
public static void main(String[] args) {
    ...
    // Create the Looper, Handler, and MessageQueue
    Looper.prepareMainLooper();
    ...
    ActivityThread thread = new ActivityThread();
    thread.attach(false, startSeq);

    if (sMainThreadHandler == null) {
        sMainThreadHandler = thread.getHandler();
    }
    ...
    // Start the loop and get ready to receive messages
    Looper.loop();
}

// Prepare the Looper for the main thread
frameworks/base/core/java/android/os/Looper.java
public static void prepareMainLooper() {
    prepare(false);
    synchronized (Looper.class) {
        if (sMainLooper != null) {
            throw new IllegalStateException("The main Looper has already been prepared.");
        }
        sMainLooper = myLooper();
    }
}

// The prepare method creates a Looper object
frameworks/base/core/java/android/os/Looper.java
private static void prepare(boolean quitAllowed) {
    if (sThreadLocal.get() != null) {
        throw new RuntimeException("Only one Looper may be created per thread");
    }
    sThreadLocal.set(new Looper(quitAllowed));
}

// When the Looper object is created, a MessageQueue is created with it
frameworks/base/core/java/android/os/Looper.java
private Looper(boolean quitAllowed) {
    mQueue = new MessageQueue(quitAllowed);
    mThread = Thread.currentThread();
}

After initialization, the main thread has a complete Looper, MessageQueue, and Handler, and the Handler of ActivityThread can start processing Messages. The lifecycle functions of the Application, Activity, ContentProvider, Service, Broadcast, and other components are all processed on the main thread, in order, in the form of Messages. This is how the App main thread is initialized and how it runs. Some of the Messages it processes are shown below:

frameworks/base/core/java/android/app/ActivityThread.java
class H extends Handler {
    public static final int BIND_APPLICATION        = 110;
    public static final int EXIT_APPLICATION        = 111;
    public static final int RECEIVER                = 113;
    public static final int CREATE_SERVICE          = 114;
    public static final int SERVICE_ARGS            = 115;
    public static final int STOP_SERVICE            = 116;

    public void handleMessage(Message msg) {
        switch (msg.what) {
            case BIND_APPLICATION:
                Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "bindApplication");
                AppBindData data = (AppBindData)msg.obj;
                handleBindApplication(data);
                Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
                break;
        }
    }
}

System mechanism – Message mechanism

Once the main thread has been initialized, it blocks and waits for Messages. When a Message arrives, the main thread wakes up and processes it; when there are no more Messages to process, the main thread goes back to sleep and keeps waiting.

As the figure below shows, the Android Message mechanism has four core components: Handler, Looper, MessageQueue, and Message.

There are many detailed analyses of the Message mechanism code on the web, so here is only a brief overview of what each of the four core components does:

  1. Handler: a Handler is mainly used to process Messages. An application can create a Handler on any thread, as long as a Looper is specified at creation time; if none is specified, the Looper of the current thread is used by default.
  2. Looper: a Looper can be regarded as a loop driver. Once its loop method is started, it continuously takes Messages from the MessageQueue and dispatches each one to the corresponding Handler for processing. Because Looper lets an application install its own Printer, which is called before and after each Message is processed, many APM tools use this as an entry point for performance monitoring; see Tencent Matrix and BlockCanary for examples.
  3. MessageQueue: as shown in the figure above, a MessageQueue is the manager of the Messages waiting in the queue. When the queue is empty, MessageQueue blocks and waits, using the Linux nativePoll mechanism, until a Message is enqueued.
  4. Message: a Message is the object that carries what is being sent. Its most commonly used fields include what, arg, and callback.

ActivityThread uses the Message mechanism described above to process the App lifecycle and the component lifecycles.
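The way APM tools hook into the loop can be sketched in plain Java. This is a simplified model and not the Android implementation: `MiniLooper`, `Msg`, and the injectable clock are hypothetical names, and the real hook on Android is the Printer installed through `Looper.setMessageLogging`. The sketch only shows the idea of timing each dispatched Message and flagging those that exceed a threshold.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.LongSupplier;

// Simplified model for illustration only; these are not the Android classes.
final class MiniLooper {
    /** Carries an id and the work to run, loosely like Message.what and Message.callback. */
    static final class Msg {
        final int what;
        final Runnable callback;

        Msg(int what, Runnable callback) {
            this.what = what;
            this.callback = callback;
        }
    }

    private final Queue<Msg> queue = new ArrayDeque<>();
    private final List<Integer> slowMessages = new ArrayList<>();
    private final long slowThresholdNanos;
    private final LongSupplier clock; // injectable so the timing can be tested

    MiniLooper(long slowThresholdNanos, LongSupplier clock) {
        this.slowThresholdNanos = slowThresholdNanos;
        this.clock = clock;
    }

    void post(int what, Runnable callback) {
        queue.add(new Msg(what, callback));
    }

    /** Dispatches until the queue is empty. A real Looper would block and wait instead of returning. */
    void loopUntilEmpty() {
        Msg msg;
        while ((msg = queue.poll()) != null) {
            long start = clock.getAsLong();            // hook before dispatch
            msg.callback.run();                        // the "Handler" work
            long elapsed = clock.getAsLong() - start;  // hook after dispatch
            if (elapsed > slowThresholdNanos) {
                slowMessages.add(msg.what);
            }
        }
    }

    List<Integer> slowMessages() {
        return slowMessages;
    }

    public static void main(String[] args) {
        // 16 ms threshold, roughly one frame at 60 Hz.
        MiniLooper looper = new MiniLooper(16_000_000L, System::nanoTime);
        looper.post(1, () -> { });
        looper.post(2, () -> {
            try {
                Thread.sleep(30);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        looper.loopUntilEmpty();
        System.out.println("slow messages: " + looper.slowMessages());
    }
}
```

The clock is passed in rather than hard-coded so that the slow-message check can be exercised with a fake clock instead of real sleeps.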

System mechanism – screen refresh mechanism and Vsync

First, we need to know what the screen refresh rate is. Simply put, the screen refresh rate is a hardware concept: the frequency at which the screen hardware refreshes the picture. For example, a 60 Hz refresh rate means the screen refreshes the displayed content 60 times per second, and 90 Hz means 90 times per second.

FPS, by contrast, is a software concept determined by the software system. FPS stands for Frames Per Second, the number of frames produced per second. For example, 60 FPS means 60 frames are produced per second, and 90 FPS means 90 frames per second.

VSync is short for Vertical Synchronization. The basic idea is to synchronize the FPS with the screen's refresh rate, in order to avoid a phenomenon called tearing.

  1. In a 60 FPS system, 60 frames must be produced for display every second, which means each frame has 16.67 ms (1/60 s) to be drawn if no frame is to be dropped.
  2. In a 90 FPS system, 90 frames must be produced for display every second, which means each frame has 11.11 ms (1/90 s) to be drawn if no frame is to be dropped.
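The per-frame budgets in the list follow directly from the refresh rate. As a minimal sketch (`FrameBudget` is a hypothetical helper name, not an Android API):

```java
// Hypothetical helper for illustration; not an Android class.
final class FrameBudget {
    /** Time available to produce one frame, in milliseconds. */
    static double frameBudgetMs(int framesPerSecond) {
        return 1000.0 / framesPerSecond;
    }

    public static void main(String[] args) {
        for (int fps : new int[] {60, 90, 120}) {
            System.out.printf("%d FPS -> %.2f ms per frame%n", fps, frameBudgetMs(fps));
        }
    }
}
```

The same formula shows why higher frame rates are harder to sustain: at 120 FPS the budget shrinks to about 8.33 ms per frame.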

Generally speaking, the screen refresh rate is determined by the screen hardware, while the FPS is controlled through Vsync. In practice the screen refresh rate and the FPS usually correspond one to one. For details, see the following two articles:

  1. New Smooth Android experience, 90Hz Ramble
  2. Android Systrace basics – Vsync interpretation

System mechanism – Choreographer

The previous section said that Vsync controls FPS. In fact, Vsync does this through Choreographer, which controls how often the application refreshes.

Choreographer was introduced mainly to work with Vsync and give upper-level App rendering a stable moment at which to process Messages: the arrival of Vsync. By adjusting the period of the Vsync signal, the system controls when each frame's drawing work happens. The Vsync period is set to 16.6 ms (60 FPS) because most phone screens have a 60 Hz refresh rate, that is, one refresh every 16.6 ms, and the system sets the Vsync period to match the screen refresh frequency. Every 16.6 ms the Vsync signal wakes Choreographer up to do the App's drawing work. If the App can finish rendering within every Vsync period, its FPS is 60 and it feels very smooth to the user. This is the main purpose of introducing Choreographer.

Choreographer links the upper and lower parts of the Android rendering pipeline:

  1. Upward: it receives the App's various update messages and callbacks and processes them together when Vsync arrives. This includes processing Input (mainly input events), Animation (animation-related work), and Traversal (measure, layout, draw, and similar operations), detecting dropped frames, recording callback durations, and so on.
  2. Downward: it requests and receives Vsync signals. It receives Vsync event callbacks through FrameDisplayEventReceiver.onVsync and requests Vsync through FrameDisplayEventReceiver.scheduleVsync.

The figure below shows how, when the Vsync signal arrives, Choreographer uses the Message mechanism to start the drawing work for one frame.

For a detailed walkthrough of the process, see the article about Android's Choreographer-based rendering mechanism.
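The two roles described in this section can be modeled in a few lines. This is a simplified sketch, not the Android class: `MiniChoreographer` and `CallbackType` are hypothetical, only the three callback kinds named in the text are modeled, and the skipped-frame count is simply the delay since Vsync divided by the frame interval.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; this is not android.view.Choreographer.
final class MiniChoreographer {
    /** Callback kinds, declared in the order in which they run within a frame. */
    enum CallbackType { INPUT, ANIMATION, TRAVERSAL }

    private final Map<CallbackType, List<Runnable>> callbacks = new EnumMap<>(CallbackType.class);
    private final long frameIntervalNanos;

    MiniChoreographer(long frameIntervalNanos) {
        this.frameIntervalNanos = frameIntervalNanos;
        for (CallbackType type : CallbackType.values()) {
            callbacks.put(type, new ArrayList<>());
        }
    }

    /** Holds a callback until the next frame is processed. */
    void postCallback(CallbackType type, Runnable action) {
        callbacks.get(type).add(action);
    }

    /**
     * Runs all pending callbacks for one frame, grouped by type.
     * Returns how many whole frame intervals passed between Vsync and now.
     */
    long doFrame(long vsyncTimeNanos, long nowNanos) {
        long jitterNanos = Math.max(0L, nowNanos - vsyncTimeNanos);
        long skippedFrames = jitterNanos / frameIntervalNanos;
        for (CallbackType type : CallbackType.values()) {
            // Copy first so callbacks posted while running wait for the next frame.
            List<Runnable> pending = new ArrayList<>(callbacks.get(type));
            callbacks.get(type).clear();
            for (Runnable action : pending) {
                action.run();
            }
        }
        return skippedFrames;
    }

    public static void main(String[] args) {
        MiniChoreographer choreographer = new MiniChoreographer(16_666_667L);
        choreographer.postCallback(CallbackType.TRAVERSAL, () -> System.out.println("traversal"));
        choreographer.postCallback(CallbackType.INPUT, () -> System.out.println("input"));
        choreographer.postCallback(CallbackType.ANIMATION, () -> System.out.println("animation"));
        long skipped = choreographer.doFrame(0L, 40_000_000L);
        System.out.println("skipped frames: " + skipped);
    }
}
```

Regardless of the order in which the callbacks were posted, they run grouped by type, and a frame that starts 40 ms after its Vsync at a 16.67 ms interval reports two skipped frames.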

System mechanism – Buffer flow and TripleBuffer

BufferQueue is a data structure built on a producer-consumer model. Generally, the consumer creates the BufferQueue, and the producer is usually not in the same process as the BufferQueue.

In Android App rendering, the App is the producer and SurfaceFlinger is the consumer, so the flow can be described as follows:

  1. When the App needs a Buffer, it calls dequeueBuffer() to request a free Buffer from the BufferQueue, specifying the Buffer's width, height, pixel format, and usage flags.
  2. The App renders into the Buffer using the CPU or the GPU. When rendering is done, it calls queueBuffer() to return the Buffer to the App's BufferQueue. (With GPU rendering there is an additional GPU processing stage, so the Buffer is not usable immediately; it has to wait for the GPU to finish rendering.)
  3. When SurfaceFlinger receives the Vsync signal and prepares to composite, it calls acquireBuffer() to obtain a Buffer from the App's BufferQueue and composites it.
  4. After compositing, SurfaceFlinger calls releaseBuffer() to return the Buffer to the App's BufferQueue.
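The four steps above amount to a small state machine per Buffer. The sketch below models it in plain Java under simplifying assumptions: `MiniBufferQueue` is a hypothetical class, the four method names mirror the calls in the list, there is no GPU fence or cross-process communication, and `dequeueBuffer` returns -1 instead of waiting when no Buffer is free.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Simplified single-process model for illustration; not the real BufferQueue.
final class MiniBufferQueue {
    enum State { FREE, DEQUEUED, QUEUED, ACQUIRED }

    private final State[] slots;
    private final Deque<Integer> queued = new ArrayDeque<>(); // FIFO of rendered Buffers

    MiniBufferQueue(int bufferCount) {
        slots = new State[bufferCount];
        Arrays.fill(slots, State.FREE);
    }

    /** Producer (App): take a free Buffer to render into, or -1 if none is free. */
    int dequeueBuffer() {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == State.FREE) {
                slots[i] = State.DEQUEUED;
                return i;
            }
        }
        return -1;
    }

    /** Producer (App): hand a rendered Buffer back to the queue. */
    void queueBuffer(int slot) {
        requireState(slot, State.DEQUEUED);
        slots[slot] = State.QUEUED;
        queued.addLast(slot);
    }

    /** Consumer (SurfaceFlinger): take the oldest rendered Buffer, or -1 if none is queued. */
    int acquireBuffer() {
        Integer slot = queued.pollFirst();
        if (slot == null) {
            return -1;
        }
        slots[slot] = State.ACQUIRED;
        return slot;
    }

    /** Consumer (SurfaceFlinger): return a composited Buffer so that it becomes free again. */
    void releaseBuffer(int slot) {
        requireState(slot, State.ACQUIRED);
        slots[slot] = State.FREE;
    }

    State stateOf(int slot) {
        return slots[slot];
    }

    private void requireState(int slot, State expected) {
        if (slots[slot] != expected) {
            throw new IllegalStateException("slot " + slot + " is " + slots[slot] + ", expected " + expected);
        }
    }

    public static void main(String[] args) {
        MiniBufferQueue queue = new MiniBufferQueue(2);
        int first = queue.dequeueBuffer();
        queue.queueBuffer(first);
        int shown = queue.acquireBuffer();
        System.out.println("slot " + shown + " is " + queue.stateOf(shown));
        queue.releaseBuffer(shown);
        System.out.println("slot " + shown + " is " + queue.stateOf(shown));
    }
}
```

With only two Buffers, a producer that still holds one Buffer while the consumer holds the other gets -1 from `dequeueBuffer`, which is the situation a third Buffer is meant to relieve.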

With the Buffer flow understood, note that on most current systems each application has three Buffers in rotation, which reduces the chance of running out of Buffers when a Buffer spends too long in one stage.

The following figure compares double buffering and triple buffering

The benefits of triple buffering are as follows:

  1. Fewer dropped frames: as the double-buffer and triple-buffer comparison above shows, in this scenario (consecutive main thread timeouts) rotating three Buffers reduces the number of dropped frames (from two frames to one). A timeout on the App main thread therefore does not necessarily drop a frame. Because of Triple Buffer, a frame dropped on the App side (mainly because of the GPU) is not necessarily a dropped frame on the SurfaceFlinger side, which is something to keep in mind when reading Systrace.
  2. Shorter waits for the main thread and the render thread: with double buffering, the App main thread sometimes has to wait for SurfaceFlinger (the consumer) to release a Buffer before it can get one to produce into. This is a problem because on most phones today SurfaceFlinger and the App receive the Vsync signal at the same time, so if the App main thread has to wait for SurfaceFlinger to release a Buffer, its own execution is delayed.
  3. Less pressure from GPU and SurfaceFlinger bottlenecks: with double buffering, a Buffer produced by the App must be rendered by the GPU in time before SurfaceFlinger can composite it. Once the GPU times out, SurfaceFlinger can easily fail to composite in time, and a frame is dropped. With three Buffers in rotation, the Buffers produced by the App can enter the BufferQueue and be handed to the GPU for rendering as early as possible (because no waiting is needed, even if two Buffers accumulate here and are composited only in a later frame, the rendering is done early). In addition, when SurfaceFlinger itself is heavily loaded, three Buffers in rotation effectively reduce the time spent waiting in dequeueBuffer.

The downside is that the extra Buffer takes up more memory.

For the detailed process, see the article Android Systrace basics – Triple Buffer interpretation.
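To get a feel for the memory cost, here is a rough estimate. It is only a sketch under stated assumptions: a hypothetical 1080 x 2400 full-screen window, 4 bytes per pixel (as in an RGBA_8888 format), and no row padding or compression, so real allocations will differ.

```java
// Back-of-the-envelope estimate; real graphics buffers may add padding or use compression.
final class BufferMemory {
    /** Total bytes used by `bufferCount` buffers of the given size. */
    static long totalBytes(int width, int height, int bytesPerPixel, int bufferCount) {
        return (long) width * height * bytesPerPixel * bufferCount;
    }

    public static void main(String[] args) {
        // Assumed example values: 1080 x 2400 pixels, 4 bytes per pixel.
        long doubleBuffer = totalBytes(1080, 2400, 4, 2);
        long tripleBuffer = totalBytes(1080, 2400, 4, 3);
        System.out.println("double buffering: " + doubleBuffer + " bytes");
        System.out.println("triple buffering: " + tripleBuffer + " bytes");
        System.out.println("extra for the third buffer: " + (tripleBuffer - doubleBuffer) + " bytes");
    }
}
```

Under these assumptions each Buffer is about 10 MB, so the third Buffer costs roughly another 10 MB per full-screen window.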

System mechanism – Input flow

The Android system is event-driven, and input is one of the most common kinds of event. User operations such as taps, slides, and long presses are all driven by input events. The core of input handling is InputReader and InputDispatcher, two native threads running in SystemServer that are responsible for reading and dispatching Input events. When analyzing the Input event flow in Systrace, these two threads are the first place to look.

  1. InputReader reads Input events from EventHub and hands them to InputDispatcher for dispatching.
  2. InputDispatcher packages and dispatches the events it receives from InputReader.
  3. OutboundQueue holds the events that are about to be dispatched to the corresponding AppConnection.
  4. WaitQueue holds the events that have been dispatched to an AppConnection but that the App is still processing and has not yet reported as handled.
  5. PendingInputEventQueue holds the Input events that the App needs to process; at this point the events have reached the application process.
  6. deliverInputEvent indicates that the App UI thread has been woken up by an Input event.
  7. InputResponse marks the Input event regions; here you can see the processing stages of one Input_Down event, several Input_Move events, and one Input_Up event.
  8. The App responds to the Input events (here, up to the point where the finger lets go).

The Systrace corresponding to the above process is as follows

For the detailed process, see the Android Systrace basics – Input article.
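The hand-off between OutboundQueue and WaitQueue described in this section can be sketched as two FIFO queues. This is a hypothetical model, not InputDispatcher code: `MiniInputConnection` and its method names are made up for illustration, and events are plain strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of one App connection; not the InputDispatcher implementation.
final class MiniInputConnection {
    private final Deque<String> outboundQueue = new ArrayDeque<>(); // waiting to be sent to the App
    private final Deque<String> waitQueue = new ArrayDeque<>();     // sent, not yet reported as handled

    /** The dispatcher queues an event for this connection. */
    void enqueue(String event) {
        outboundQueue.addLast(event);
    }

    /** Sends the oldest outbound event to the App and starts waiting for it; null if none. */
    String publishNext() {
        String event = outboundQueue.pollFirst();
        if (event != null) {
            waitQueue.addLast(event);
        }
        return event;
    }

    /** The App reports that it finished the oldest in-flight event; null if none is in flight. */
    String finishOldest() {
        return waitQueue.pollFirst();
    }

    int outboundSize() {
        return outboundQueue.size();
    }

    int waitSize() {
        return waitQueue.size();
    }

    public static void main(String[] args) {
        MiniInputConnection connection = new MiniInputConnection();
        connection.enqueue("Input_Down");
        connection.enqueue("Input_Move");
        connection.enqueue("Input_Up");
        connection.publishNext();
        connection.publishNext();
        connection.finishOldest();
        System.out.println("outbound: " + connection.outboundSize() + ", waiting: " + connection.waitSize());
    }
}
```

In this model, a WaitQueue that keeps growing means the App is not finishing events as fast as they are sent, which is the kind of pattern to look for when the App stops responding to input.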

Series articles

  1. Systrace Fluency in Practice 1: Understanding How Jank Happens
  2. Systrace Fluency in Practice 2: Case Study: MIUI Home Screen Scroll Jank Analysis
  3. Systrace Fluency in Practice 3: Some Questions About Jank Analysis

Attachments

The attachments have been uploaded to GitHub and can be downloaded from: github.com/Gracker/Sys…

  1. Xiaomi_launcher.zip: Systrace file of the home screen scroll jank; this is the main file analyzed in the case study
  2. Xiaomi_launcher_scroll_all_the_time.zip: Systrace file captured while scrolling the home screen continuously
  3. Oppo_launcher_scroll.zip: comparison file

About me && blog

  1. About me: I really hope to communicate with you and make progress together.
  2. Blog content navigation
  3. Record of excellent blog posts – must-reads for Android performance optimization

One person can go faster; a group can go farther.