This article is the eleventh in the Systrace series. It introduces Triple Buffer as seen in Systrace, briefly explains how to tell from Systrace whether jank (dropped frames) has occurred and how to do a preliminary localization and analysis, and describes the impact of Triple Buffer on performance.
The purpose of this series is to use Systrace as a tool for looking at the overall operation of the Android system from a different angle, and for learning the Framework from a different angle. Perhaps you have read many articles about the Framework but cannot remember the code or the flow it runs through. Seeing it graphically in Systrace may help you understand it more deeply.
Series of articles
- Systrace profile
- Systrace Basics – Systrace prep
- Systrace Basics – Why 60 FPS?
- Systrace Basics – SystemServer interpretation
- Systrace Basics – SurfaceFlinger interpretation
- Systrace Basics – Input interpretation
- Systrace Basics – Vsync interpretation
- Systrace Basics – Vsync-App: Choreographer based Rendering mechanics in detail
- Systrace Basics – MainThread and RenderThread interpretation
- Systrace Basics – Binder and lock competition interpretation
- Systrace Basics – Triple Buffer interpretation
- Systrace Basics – CPU Info interpretation
How do I define frame drop?
It is commonly said that an application drops a frame whenever its main thread spends more than 16.6 ms on a frame in Systrace. That is not necessarily true, and the reason is the Triple Buffer discussed in this article. In Systrace, we judge dropped frames by looking at the App and SurfaceFlinger together.
Judging dropped frames from the App side
If you have not looked at Systrace much, you might conclude that the application in the Trace below dropped a frame, because its main thread spent more than 16.6 ms drawing. In theory it may have, but not necessarily, because of BufferQueue and Triple Buffer. The BufferQueue may or may not still hold a Buffer from the previous frame, or from the frame before that, which SurfaceFlinger can use for composition.
Therefore, we cannot determine whether frames were dropped from the App side of Systrace alone. We also need to look at the SurfaceFlinger side.
Judging dropped frames from the SurfaceFlinger side
On the SurfaceFlinger side we can see SurfaceFlinger's main thread doing composition, together with the state of the application's Buffers in its BufferQueue. The image above is an example of a dropped frame: the App did not finish rendering in time, and the BufferQueue held no Buffer from earlier frames, so SurfaceFlinger did not composite the App's Layer in this frame, and the user saw a dropped frame.
For the first picture, where we said you cannot tell from the App side whether a frame was dropped, the corresponding SurfaceFlinger Trace is as follows. Thanks to Triple Buffer, SF still holds a Buffer queued earlier by the App, so although one frame on the App side exceeded 16.6 ms, SF still had a Buffer available for composition and no frame was dropped.
Logical frame drops
The frame drops above are seen from the rendering side, and they are easy to spot in Systrace. There is another kind, called a logical frame drop.
A logical frame drop is when the picture does not update along a uniform or physically plausible curve but jumps, because of a problem in the application's own code logic. Such drops cannot be seen in Systrace, but users clearly feel them.
A simple example is a list fling. After the finger is lifted, it would be perfect if the distance moved in each frame followed a smooth curve that approaches zero. But if one frame advances 20 relative to the previous frame, the next advances 10, and the one after advances 30, that is a jumping update. In Systrace every frame is rendered on time and SurfaceFlinger composites on time, yet the user feels stutter. This is only an example, though: Android has already optimized for this situation. If you are interested, look at the class in android/view/animation/AnimationUtils.java, focusing on how the following three methods are used
public static void lockAnimationClock(long vsyncMillis)
public static void unlockAnimationClock()
public static long currentAnimationTimeMillis()
Android's own animations generally do not have this problem, but app developers sometimes write code that computes animation properties from the current time rather than the Vsync time. In that case, when a frame is dropped or delayed, the animation changes unevenly. If you are interested, it is worth thinking this through yourself.
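The difference between sampling the Vsync time and sampling the current time can be sketched as follows. This is a minimal illustration, not framework code: the class name, the 300 ms duration, the 300 px distance, and both timestamp arrays are hypothetical values chosen for the example. It computes how far a linear animation moves per frame when every frame uses its Vsync timestamp, and when every frame reads the clock at whatever moment its callback actually ran.

```java
import java.util.ArrayList;
import java.util.List;

class AnimationClockSketch {
    static final long DURATION_MS = 300; // hypothetical animation length
    static final int DISTANCE_PX = 300;  // hypothetical travel distance

    /** Linear position for a given elapsed time, clamped to the animation range. */
    static int positionAt(long elapsedMs) {
        long clamped = Math.max(0, Math.min(elapsedMs, DURATION_MS));
        return (int) (DISTANCE_PX * clamped / DURATION_MS);
    }

    /** Distance moved between consecutive frames for the given sample times. */
    static List<Integer> steps(long[] sampleTimesMs) {
        List<Integer> result = new ArrayList<>();
        int previous = positionAt(sampleTimesMs[0]);
        for (int i = 1; i < sampleTimesMs.length; i++) {
            int current = positionAt(sampleTimesMs[i]);
            result.add(current - previous);
            previous = current;
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed data: Vsync timestamps are evenly spaced, while the moment
        // each callback actually runs is late by a varying amount.
        long[] vsyncTimes = {0, 16, 32, 48, 64};
        long[] wallClockTimes = {0, 18, 45, 49, 64};
        System.out.println("vsync-based steps: " + steps(vsyncTimes));     // [16, 16, 16, 16]
        System.out.println("wall-clock steps:  " + steps(wallClockTimes)); // [18, 27, 4, 15]
    }
}
```

Both runs display one frame per Vsync, so Systrace would look healthy in either case. Only the wall-clock variant moves by uneven amounts, which is what the user perceives as stutter.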
In addition, Android frame drops have many causes. See the following three articles:
- Android lag frame loss causes overview – methodology
- Android lag frame loss causes overview – System part
- Android lag frame loss causes overview – application chapter
BufferQueue and Triple Buffer
BufferQueue
Let's start with the BufferQueue. A BufferQueue is a data structure built on the producer-consumer model. Generally the Consumer creates the BufferQueue, and the Producer is usually not in the same process as the BufferQueue
The logic is as follows
- When the Producer needs a Buffer, it requests a free Buffer from the BufferQueue by calling dequeueBuffer(), specifying the Buffer's width, height, pixel format, and usage flags
- The Producer fills the Buffer and returns it to the queue by calling queueBuffer()
- The Consumer obtains the Buffer by calling acquireBuffer() and consumes its contents
- When finished, the Consumer returns the Buffer to the queue by calling releaseBuffer()
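The four calls above can be modeled as a small state machine over buffer slots. The sketch below is a toy, single-process model, not the real BufferQueue: the class name and the slot-state names are chosen for this illustration, and it returns -1 where the real implementation would block or report an error. Fences, buffer allocation, and cross-process transport are all left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ToyBufferQueue {
    enum State { FREE, DEQUEUED, QUEUED, ACQUIRED }

    private final State[] slots;
    private final Deque<Integer> queued = new ArrayDeque<>(); // filled slots, oldest first

    ToyBufferQueue(int bufferCount) {
        slots = new State[bufferCount];
        Arrays.fill(slots, State.FREE);
    }

    /** Producer: take a free slot, or -1 if every slot is in use. */
    int dequeueBuffer() {
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] == State.FREE) {
                slots[i] = State.DEQUEUED;
                return i;
            }
        }
        return -1;
    }

    /** Producer: hand a filled slot back to the queue. */
    void queueBuffer(int slot) {
        require(slot, State.DEQUEUED);
        slots[slot] = State.QUEUED;
        queued.addLast(slot);
    }

    /** Consumer: take the oldest filled slot, or -1 if nothing is queued. */
    int acquireBuffer() {
        Integer slot = queued.pollFirst();
        if (slot == null) {
            return -1;
        }
        slots[slot] = State.ACQUIRED;
        return slot;
    }

    /** Consumer: give a consumed slot back so the producer can reuse it. */
    void releaseBuffer(int slot) {
        require(slot, State.ACQUIRED);
        slots[slot] = State.FREE;
    }

    State stateOf(int slot) {
        return slots[slot];
    }

    private void require(int slot, State expected) {
        if (slots[slot] != expected) {
            throw new IllegalStateException("slot " + slot + " is " + slots[slot]);
        }
    }
}
```

With two slots, once one Buffer is queued and the other dequeued, a further dequeueBuffer() finds nothing free until the consumer acquires and releases a slot. That shortage is what the comparison of buffer counts later in the article is about.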
Android uses Vsync to control when Buffers move through the BufferQueue. If you are not familiar with Vsync, the following two articles give an overview
- Systrace basics – Vsync interpretation
- Details on Android based rendering mechanism Choreographer
The process above is fairly abstract, so here is a concrete example to help you understand the diagram above and to recognize the BufferQueue in Systrace.
In the rendering flow of an Android App, the App is the Producer and SurfaceFlinger is the Consumer. The process above therefore becomes
- When the App needs a Buffer, it requests a free Buffer from the BufferQueue by calling dequeueBuffer(), specifying the Buffer's width, height, pixel format, and usage flags
- The App renders into the Buffer using either the CPU or the GPU. Once rendering is done, it calls queueBuffer() to return the Buffer to the App's BufferQueue (with GPU rendering, there is an additional GPU processing step in between)
- After receiving the Vsync signal, SurfaceFlinger prepares for composition. It calls acquireBuffer() to obtain a Buffer from the App's BufferQueue and composites it
- After composition, SurfaceFlinger returns the Buffer to the App's BufferQueue by calling releaseBuffer()
Now that you understand what a BufferQueue does, let’s talk about buffers in a BufferQueue
As the figure above shows, the producer and consumer of a BufferQueue obtain and return Buffers through dequeueBuffer, queueBuffer, acquireBuffer, and releaseBuffer. So how many Buffers does a BufferQueue need in order to run? The following sections look at single Buffer, Double Buffer, and Triple Buffer in turn. (Note that the analysis considers only the Buffers. On the App side, the interaction between RenderThread and MainThread, including blocking that depends on when unBlockUiThread executes, is not covered here.)
Single Buffer
In the case of a single Buffer, since only one Buffer is available, this Buffer needs to be used for both the composite display and the application for rendering
In the ideal case, a single Buffer can do the job (assuming a vsync-offset is present)
- App receives Vsync signal, obtains Buffer and starts rendering
- After the interval of vsync-offset, the SurfaceFlinger receives the Vsync signal and starts to synthesize
- The screen refreshes and we see the composited image
But that is only the ideal case. If App rendering or SurfaceFlinger composition is not complete before the screen refreshes, the screen reads an incomplete Buffer, which the user perceives as tearing
Of course, single Buffer is no longer used. The above is only an illustration
Double Buffer
Double Buffer means the BufferQueue has two Buffers available in rotation. While the consumer is consuming one Buffer, the producer can take the other one for production
Let’s take a look at the ideal Double Buffer workflow
However, Double Buffer can also have performance problems. For example, if the App takes longer than a Vsync period for two consecutive frames (more precisely, misses the moment SurfaceFlinger composites), frames are dropped
Triple Buffer
Triple Buffer adds one more BackBuffer, so three Buffers rotate through the BufferQueue. While SurfaceFlinger is using the FrontBuffer, the App has two free Buffers available for production. Even if the GPU overruns and still holds one BackBuffer, the CPU can take the other BackBuffer and start producing the next frame
The figure below shows how, after Triple Buffer is introduced, the frame drops that Double Buffer suffered because of a shortage of Buffers are resolved
Here the two figures are put side by side for comparison (with Double Buffer two frames were dropped, with Triple Buffer only one).
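The effect of the buffer count can be reproduced with a small simulation. This is a toy model under stated assumptions, not SurfaceFlinger's real scheduling: time advances in whole Vsync periods, the consumer latches at most one queued Buffer per Vsync, the producer may start one new frame per Vsync if a Buffer is free (standing in for the CPU starting the next frame while the GPU finishes the previous one), and frames finish in order. The class name and the workload array are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BufferCountSketch {
    /**
     * Counts Vsync ticks, between the first and last displayed frame, on which
     * the consumer finds nothing to latch. durations[i] is the number of Vsync
     * periods frame i needs from dequeue to queue.
     */
    static int droppedFrames(int bufferCount, int[] durations) {
        Deque<Integer> inFlight = new ArrayDeque<>(); // finish ticks of frames in production
        int free = bufferCount;
        int queued = 0;           // filled Buffers waiting for the consumer
        boolean hasFront = false; // true once something is on screen
        int started = 0, displayed = 0, dropped = 0, lastFinish = 0;

        for (int tick = 0; displayed < durations.length; tick++) {
            // 1. Frames whose production finished by this tick enter the queue.
            while (!inFlight.isEmpty() && inFlight.peekFirst() <= tick) {
                inFlight.pollFirst();
                queued++;
            }
            // 2. Consumer: latch the oldest queued Buffer and free the old front Buffer.
            if (queued > 0) {
                queued--;
                displayed++;
                if (hasFront) {
                    free++;
                }
                hasFront = true;
            } else if (hasFront) {
                dropped++; // the screen repeats the previous frame
            }
            // 3. Producer: start the next frame only if a Buffer is free.
            if (started < durations.length && free > 0) {
                free--;
                lastFinish = Math.max(tick + durations[started], lastFinish);
                inFlight.addLast(lastFinish);
                started++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        int[] durations = {1, 2, 1, 2, 1, 1}; // assumed workload: two slow frames
        System.out.println("double: " + droppedFrames(2, durations)); // double: 2
        System.out.println("triple: " + droppedFrames(3, durations)); // triple: 1
    }
}
```

In this model the first slow frame costs a drop with either buffer count. With three Buffers the producer keeps working during that stall, so a spare Buffer is already queued when the second slow frame arrives. The price is that a queued frame may wait one extra Vsync before it is shown.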
The role of Triple Buffer
Mitigating dropped frames
As the comparison of Double Buffer and Triple Buffer in the previous section shows, rotating three Buffers helps reduce the number of dropped frames in this case (two frames dropped -> only one frame dropped).
This is why, as in the definition of a dropped frame in the first section, an overrun on the App's main thread does not necessarily lead to a dropped frame. Because of Triple Buffer, some overruns on the App side (mainly those caused by the GPU) do not end up as dropped frames in SurfaceFlinger. Keep this in mind when reading Systrace
Reducing main thread and render thread wait time
With only two Buffers in rotation, the App's main thread sometimes has to wait for SurfaceFlinger (the consumer) to release a Buffer before it can obtain one for production. This causes a problem: on most phones today, SurfaceFlinger and the App receive the Vsync signal at the same time, so if the App's main thread has to wait for SurfaceFlinger to release a Buffer, its own work starts late, as in the figure below. There, Buffer B cannot be obtained when the Vsync signal arrives because it is still in use. Only after SurfaceFlinger has consumed Buffer A and released Buffer B can the App obtain Buffer B for production. The resulting delay leaves the main thread less time to do its work.
Let's look at what this situation looks like in Systrace
With three Buffers in rotation, this almost never happens: the render thread can obtain a free Buffer as soon as it calls dequeueBuffer (cases where dequeueBuffer is itself slow are outside the scope of this discussion).
Reducing GPU and SurfaceFlinger bottlenecks
This one is easier to understand. With Double Buffer, a Buffer produced by the App must be rendered by the GPU in time before SurfaceFlinger can composite it. Once the GPU overruns, SurfaceFlinger can easily fail to composite in time, and frames are dropped
With three Buffers in rotation, Buffers produced by the App can enter the BufferQueue as early as possible for the GPU to render. Because the App does not have to wait for a Buffer, the work starts earlier, even if two Buffers accumulate and one of them is only composited in the next frame. In addition, if SurfaceFlinger itself is under heavy load, rotating three Buffers effectively reduces the time spent waiting in dequeueBuffer
For example, the following two figures show a Double Buffer frame-drop scenario between SurfaceFlinger and the App. Because SurfaceFlinger itself was slow (in this particular scenario), the App's dequeueBuffer calls were not answered in time, and severe frame drops occurred. After switching to Triple Buffer, the problem largely disappeared
Debug Triple Buffer
dumpsys SurfaceFlinger
dumpsys SurfaceFlinger prints much of SurfaceFlinger's current state, such as performance metrics, Buffer states, and layer information. These may get an article of their own later. Below are excerpts showing each App's Buffer usage under Double Buffer and Triple Buffer. Different Apps use the third Buffer to different degrees under different loads, while under Double Buffer only two Buffers are in rotation
Turning off Triple Buffer
The property setting differs between Android versions (this was a logic bug on Google's side that was fixed in Android 10)
Android version <= Android P
// Control code
property_get("ro.sf.disable_triple_buffer", value, "1");
mLayerTripleBufferingDisabled = atoi(value);
ALOGI_IF(mLayerTripleBufferingDisabled, "Disabling Triple Buffering");
Modify the corresponding property value, then restart the Framework
// Execute the following statements in order (Root permission required)
adb root
adb shell setprop ro.sf.disable_triple_buffer 0
adb shell stop && adb shell start
Android version > Android P
// Control code
property_get("ro.sf.disable_triple_buffer", value, "0");
mLayerTripleBufferingDisabled = atoi(value);
ALOGI_IF(mLayerTripleBufferingDisabled, "Disabling Triple Buffering");
Modify the corresponding property value, then restart the Framework
// Execute the following statements in order (Root permission required)
adb root
adb shell setprop ro.sf.disable_triple_buffer 1
adb shell stop && adb shell start
Reference
- source.android.google.cn/devices/gra…
Attachments
The attachments for this article have been uploaded. Download and unzip them, then open the Systrace files with the Chrome browser