This is the fifth article in the Android Systrace series. It introduces SurfaceFlinger in the Android system and several of its important threads, covering Vsync signal interpretation, how an application's Buffer gets displayed, how to judge jank, and so on. Because Vsync has already been covered in two articles, Systrace Basics – Vsync Interpretation and Android Based On Choreographer rendering, I will not go into detail here.
The purpose of this series is to use Systrace as a tool for looking at the overall operation of the Android system from another perspective, and for learning the Framework from that perspective. Maybe you have read many articles about the Framework but cannot remember the code or the flow it runs through. Looking at it graphically in Systrace may help you understand it more deeply.
Series of articles
- Systrace profile
- Systrace Basics – Systrace prep
- Systrace Basics – Why 60 FPS?
- Systrace Basics – SystemServer interpretation
- Systrace Basics – SurfaceFlinger interpretation
- Systrace Basics – Input interpretation
- Systrace Basics – Vsync interpretation
- Systrace Basics – Vsync-App: Choreographer based Rendering mechanics in detail
- Systrace Basics – MainThread and RenderThread interpretation
- Systrace Basics – Binder and lock competition interpretation
- Systrace Basics – Triple Buffer interpretation
- Systrace Basics – CPU Info interpretation
Main text
Here is the official definition of the SurfaceFlinger process:
- Most apps display three layers at a time on the screen: a status bar at the top of the screen, a navigation bar at the bottom or side, and the app interface. Some apps have more or fewer layers (for example, default home screen apps have a separate wallpaper layer, while full-screen games may hide the status bar). Each layer can be updated individually. The status bar and navigation bar are rendered by the system process, while the application layer is rendered by the application. There is no coordination between the two.
- The device display refreshes at a certain rate, typically 60 frames per second on phones and tablets. If the display content is updated mid-refresh, tearing occurs, so content should only be updated between cycles. The system receives a signal from the display when it is safe to update the content. For historical reasons, this signal is called the VSYNC signal.
- The refresh rate may vary over time; for example, some mobile devices range from 58 to 62 FPS depending on current conditions. For an HDMI-connected TV, the refresh rate could theoretically drop to 24 Hz or 48 Hz to match a video. Because the screen can be updated only once per refresh cycle, submitting buffers for display at 200 FPS is a waste of resources, since most of those frames would never be seen. Instead of acting every time an app submits a buffer, SurfaceFlinger wakes up when the display is ready to receive a new buffer.
- When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers looking for new buffers. If it finds a new buffer, it acquires it; otherwise, it continues to use the previously acquired buffer. SurfaceFlinger must always have something to display, so it holds on to one buffer. If no buffers have ever been submitted on a layer, the layer is ignored.
- After SurfaceFlinger has collected all buffers for the visible layers, it asks the Hardware Composer how composition should be performed.
Quoted from SurfaceFlinger and Hardware Composer
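The per-VSYNC behavior described in the quote (take a new buffer if one exists, otherwise reuse the previous one, and skip layers that never submitted anything) can be sketched roughly as follows. This is a minimal model, not SurfaceFlinger's actual code: the `Layer` struct, the `onVsync` function, and the integer buffer ids are all assumptions made for illustration.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical model of one layer as seen by the compositor.
struct Layer {
    std::optional<int> latched;  // buffer id currently used for composition
    std::deque<int> queued;      // buffer ids submitted but not yet picked up
};

// Called once per VSYNC. Returns the buffer ids that take part in this frame.
std::vector<int> onVsync(std::vector<Layer>& layers) {
    std::vector<int> toCompose;
    for (Layer& layer : layers) {
        if (!layer.queued.empty()) {
            // A new buffer is available: take the oldest one.
            layer.latched = layer.queued.front();
            layer.queued.pop_front();
        }
        // No new buffer: keep showing the previously latched one.
        // A layer that never received a buffer is simply ignored.
        if (layer.latched) {
            toCompose.push_back(*layer.latched);
        }
    }
    assert(toCompose.size() <= layers.size());
    return toCompose;
}
```

With three layers where the first has two queued buffers, the second only has an old latched buffer, and the third has nothing, successive calls to `onVsync` pick up the first layer's buffers one per VSYNC, keep reusing the second layer's buffer, and never include the third layer.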
SurfaceFlinger accepts buffers of data from multiple sources, composites them, and sends them to the display.
So in Systrace, we focus on the parts that correspond to the diagram above:
- App part
- BufferQueue part
- SurfaceFlinger part
- HWComposer part
Each of these four parts has a corresponding area in Systrace, and in time they occur in the order 1, 2, 3, 4. Let's look at the whole rendering process through these four parts of Systrace.
App part
The App part is explained in detail in Systrace Basics – MainThread and RenderThread interpretation; refer to that article if anything here is unclear. The main flow is as follows:
From SurfaceFlinger's perspective, the App is mainly responsible for producing the Surfaces that SurfaceFlinger needs for composition.
The interaction between the App and SurfaceFlinger centers on three points:
- Vsync signal receiving and processing
- RenderThread's dequeueBuffer
- RenderThread's queueBuffer
Vsync signal receiving and processing
For details on this part, see Choreographer on Android. The first point of interaction between the App and SurfaceFlinger is requesting and receiving the Vsync signal. At the first marker in the figure above, the Vsync-App signal that arrives is SurfaceFlinger's vsync-app signal. When the application receives this signal, it starts preparing to render a frame.
RenderThread's dequeueBuffer
To dequeue means to take a Buffer out of a queue; the queue here is SurfaceFlinger's BufferQueue. Before the application starts rendering, it makes a Binder call to obtain a Buffer from SurfaceFlinger's BufferQueue. The flow is as follows:
The App side in Systrace is shown below:
The SurfaceFlinger side in Systrace is shown below:
RenderThread's queueBuffer
queueBuffer puts the Buffer back into the BufferQueue. Once the App has finished with the Buffer (written its draw calls), it returns the Buffer to the BufferQueue through eglSwapBuffersWithDamageKHR -> queueBuffer. The flow is as follows:
The App side in Systrace is shown below:
The SurfaceFlinger side in Systrace is shown below:
Through the above three parts, you should have a more intuitive understanding of the process in the figure below
BufferQueue part
A BufferQueue is created for each process that has a display interface. The consumer creates and owns the BufferQueue data structure, and can live in a different process from its producer. The BufferQueue workflow is as follows:
The figure above shows four operations: dequeue, queue, acquire, and release. In this case the App is the "producer", responsible for filling the display Buffer, and SurfaceFlinger is the "consumer", which composites the display Buffers of the various processes.
- Dequeue (producer-initiated): When the producer needs a buffer, it requests an available buffer from the BufferQueue by calling dequeueBuffer(), specifying the buffer's width, height, pixel format, and usage flags.
- Queue (producer-initiated): The producer fills the buffer and returns it to the queue by calling queueBuffer().
- Acquire (consumer-initiated): The consumer obtains the buffer by calling acquireBuffer() and uses its contents.
- Release (consumer-initiated): When the consumer is finished, it returns the buffer to the queue by calling releaseBuffer().
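The four operations above amount to a small state machine per buffer slot. The sketch below is a toy, single-threaded model of that cycle; `ToyBufferQueue`, `SlotState`, and the fixed count of three slots are hypothetical names and choices for illustration, and the real BufferQueue interface takes more parameters (size, format, usage, fences) and works across processes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>

// Slot states, one per step of the producer/consumer cycle.
enum class SlotState { Free, Dequeued, Queued, Acquired };

// Toy model only; not the real BufferQueue API.
class ToyBufferQueue {
public:
    // Producer: take a free slot to draw into; empty if none is free.
    std::optional<std::size_t> dequeueBuffer() {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i] == SlotState::Free) {
                slots_[i] = SlotState::Dequeued;
                return i;
            }
        }
        return std::nullopt;
    }

    // Producer: hand a filled slot back so the consumer can pick it up.
    void queueBuffer(std::size_t slot) {
        assert(slots_[slot] == SlotState::Dequeued);
        slots_[slot] = SlotState::Queued;
        fifo_.push_back(slot);
    }

    // Consumer: take the oldest queued slot; empty if nothing is queued.
    std::optional<std::size_t> acquireBuffer() {
        if (fifo_.empty()) return std::nullopt;
        const std::size_t slot = fifo_.front();
        fifo_.pop_front();
        slots_[slot] = SlotState::Acquired;
        return slot;
    }

    // Consumer: finished with the slot; the producer may dequeue it again.
    void releaseBuffer(std::size_t slot) {
        assert(slots_[slot] == SlotState::Acquired);
        slots_[slot] = SlotState::Free;
    }

    // Number of filled buffers waiting for the consumer.
    std::size_t queuedCount() const { return fifo_.size(); }

private:
    std::array<SlotState, 3> slots_{};  // three slots, all Free initially
    std::deque<std::size_t> fifo_;      // queued slots, oldest first
};
```

In this model, a producer that has dequeued all three slots gets nothing from `dequeueBuffer()` until the consumer acquires and releases one. A consumer that calls `acquireBuffer()` while `queuedCount()` is 0 gets nothing either, which is the "no Buffer available" case on the SurfaceFlinger side.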
SurfaceFlinger part
The working process
As described earlier, SurfaceFlinger's main job is composition:
❝
When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers looking for new buffers. If it finds a new buffer, it acquires it; otherwise, it continues to use the previously acquired buffer. SurfaceFlinger must always have something to display, so it holds on to one buffer. If no buffers have ever been submitted on a layer, the layer is ignored. After SurfaceFlinger has collected all buffers for the visible layers, it asks the Hardware Composer how composition should be performed.
❞
In Systrace, you can see that its main thread starts working mainly after it receives the Vsync signal.
The corresponding code is as follows; it mainly handles two messages:
- MessageQueue::INVALIDATE: mainly executes handleMessageTransaction and handleMessageInvalidate
- MessageQueue::REFRESH: mainly executes handleMessageRefresh
frameworks/native/services/surfaceflinger/SurfaceFlinger.cpp
void SurfaceFlinger::onMessageReceived(int32_t what) NO_THREAD_SAFETY_ANALYSIS {
    ATRACE_CALL();
    switch (what) {
        case MessageQueue::INVALIDATE: {
            // ...
            bool refreshNeeded = handleMessageTransaction();
            refreshNeeded |= handleMessageInvalidate();
            // ...
            break;
        }
        case MessageQueue::REFRESH: {
            handleMessageRefresh();
            break;
        }
    }
}

// handleMessageInvalidate is implemented as follows
bool SurfaceFlinger::handleMessageInvalidate() {
    ATRACE_CALL();
    bool refreshNeeded = handlePageFlip();

    if (mVisibleRegionsDirty) {
        computeLayerBounds();
        if (mTracingEnabled) {
            mTracing.notify("visibleRegionsDirty");
        }
    }

    for (auto& layer : mLayersPendingRefresh) {
        Region visibleReg;
        visibleReg.set(layer->getScreenBounds());
        invalidateLayerStack(layer, visibleReg);
    }
    mLayersPendingRefresh.clear();
    return refreshNeeded;
}

// handleMessageRefresh is implemented as follows; most of SurfaceFlinger's work is initiated in handleMessageRefresh
void SurfaceFlinger::handleMessageRefresh() {
    ATRACE_CALL();

    mRefreshPending = false;

    const bool repaintEverything = mRepaintEverything.exchange(false);
    preComposition();
    rebuildLayerStacks();
    calculateWorkingSet();
    for (const auto& [token, display] : mDisplays) {
        beginFrame(display);
        prepareFrame(display);
        doDebugFlashRegions(display, repaintEverything);
        doComposition(display, repaintEverything);
    }

    logLayerStats();

    postFrame();
    postComposition();

    mHadClientComposition = false;
    mHadDeviceComposition = false;
    for (const auto& [token, displayDevice] : mDisplays) {
        auto display = displayDevice->getCompositionDisplay();
        const auto displayId = display->getId();
        mHadClientComposition =
                mHadClientComposition || getHwComposer().hasClientComposition(displayId);
        mHadDeviceComposition =
                mHadDeviceComposition || getHwComposer().hasDeviceComposition(displayId);
    }

    mVsyncModulator.onRefreshed(mHadClientComposition);

    mLayersWithQueuedFrames.clear();
}
handleMessageRefresh mainly does the following, in order:
- Preparation
  - preComposition();
  - rebuildLayerStacks();
  - calculateWorkingSet();
- Composition
  - beginFrame(display);
  - prepareFrame(display);
  - doDebugFlashRegions(display, repaintEverything);
  - doComposition(display, repaintEverything);
- Wrap-up
  - logLayerStats();
  - postFrame();
  - postComposition();
Since the display system has a great many details, I won't explain each step here. If your work is in this area, you need to be familiar with every step. If you only want a general picture, there is no need to go that deep; knowing SurfaceFlinger's main working logic is enough.
Frame drop
When judging through Systrace whether an application drops frames, we usually look directly at the SurfaceFlinger section. The main steps are:
- Check whether SurfaceFlinger's main thread performs a composition at every Vsync-SF.
- If there is no composition, find out why:
  - Did SurfaceFlinger check and find no Buffer available?
  - Was SurfaceFlinger's main thread occupied by other work (screenshots, HWC, etc.)?
- If there is a composition, check whether the corresponding App's number of available Buffers is normal. If the App has 0 available Buffers, find out why the App did not queueBuffer in time, because the composition may have been triggered by another process that did have a Buffer available.
How to read this part in Systrace is explained in detail in the frame drop detection section of Systrace Basics – Triple Buffer interpretation; see that article.
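The checks in the list above form a small decision tree, which can be written down as follows. This is only a sketch of the reasoning, not a tool that parses traces: the `VsyncSfInterval` struct, its fields, and the `Verdict` names are hypothetical, and the values would be filled in by hand while reading the trace. It also assumes the App was expected to produce a frame in this interval (for example during a scroll); an idle App that queues nothing is not dropping frames.

```cpp
#include <cassert>

// Hypothetical summary of one Vsync-SF interval, read off the trace by hand.
struct VsyncSfInterval {
    bool sfComposited;     // SurfaceFlinger main thread ran a composition
    bool sfBusyWithOther;  // main thread was occupied by other work instead
    int appQueuedBuffers;  // Buffers the App had queued when Vsync-SF arrived
};

enum class Verdict {
    FrameShown,          // composition ran and the App had a Buffer ready
    OtherLayerOnly,      // composition ran, but triggered by another process
    NoBufferAvailable,   // no composition: nothing was queued
    SurfaceFlingerBusy,  // no composition: main thread blocked by other work
};

Verdict classify(const VsyncSfInterval& v) {
    assert(v.appQueuedBuffers >= 0);
    if (!v.sfComposited) {
        // No composition: was SurfaceFlinger blocked, or was there no Buffer?
        return v.sfBusyWithOther ? Verdict::SurfaceFlingerBusy
                                 : Verdict::NoBufferAvailable;
    }
    // Composition happened: only counts for this App if it had a Buffer queued.
    return v.appQueuedBuffers == 0 ? Verdict::OtherLayerOnly
                                   : Verdict::FrameShown;
}
```

Every verdict other than `FrameShown` means the App's content on screen did not change in that interval, and each one points at a different place to keep looking: the App's own queueBuffer timing for the first two, SurfaceFlinger's main thread for the last.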
HWComposer part
For HWComposer, we can read the official introduction directly:
- The Hardware Composer HAL (HWC) determines the most efficient way to composite buffers with the available hardware. As a HAL, its implementation is device-specific and is usually done by the display hardware original equipment manufacturer (OEM).
- The value of this approach is easy to see when you consider overlay planes, which composite multiple buffers in the display hardware rather than the GPU. For example, consider a typical Android phone in portrait orientation, with the status bar at the top, the navigation bar at the bottom, and app content everywhere else. The contents of each layer are in separate buffers. You can handle composition using either of the following methods (the latter can be significantly more efficient):
  - Render the app content into a scratch buffer, then render the status bar over it, then render the navigation bar on top of that, and finally pass the scratch buffer to the display hardware.
  - Pass all three buffers to the display hardware and tell it to read data from different buffers for different parts of the screen.
- Display processor capabilities vary significantly. The number of overlays, whether layers can be rotated or blended, and restrictions on positioning and overlap are difficult to express through an API. To accommodate these options, the HWC performs the following calculations (hardware vendors can customize the decision code, so it is possible to get the best performance out of every device):
  - SurfaceFlinger provides the HWC with a full list of layers and asks, "How do you want to handle these?"
  - The HWC responds by marking each layer as overlay or GLES composition.
  - SurfaceFlinger takes care of any GLES composition, passes the output buffer to the HWC, and lets the HWC handle the rest.
- When nothing on the screen is changing, overlay planes may be less efficient than GL composition. This is especially true when the overlay contents have transparent pixels and overlapping layers are blended together. In such cases, the HWC can choose to request GLES composition for some or all layers and retain the composited buffer. If SurfaceFlinger comes back asking to composite the same set of buffers, the HWC can continue to show the previously composited scratch buffer. This can improve the battery life of an idle device.
- Devices running Android 4.4 or later typically support four overlay planes. Attempting to composite more layers than there are overlays causes the system to use GLES composition for some of them, so the number of layers an application uses can have a measurable impact on power consumption and performance.
Quoted from SurfaceFlinger and Hardware Composer
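To make the overlay-versus-GLES trade-off in the quote concrete, here is a toy version of the decision. Real HWC implementations are vendor-specific and weigh rotation, blending, and positioning limits, none of which is modeled here. The policy below is an assumption for illustration only: if every layer fits on an overlay plane, use overlays for all of them; otherwise keep one plane for the GLES output buffer and send the remaining layers through GLES. `chooseComposition` and `countGles` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Composition { Overlay, Gles };

// Toy policy, not a real HWC: decides per layer how it gets composited.
std::vector<Composition> chooseComposition(std::size_t layerCount,
                                           std::size_t overlayPlanes) {
    std::size_t overlayCount = layerCount;
    if (layerCount > overlayPlanes) {
        // Not enough planes: reserve one for the buffer GLES renders into.
        overlayCount = overlayPlanes > 0 ? overlayPlanes - 1 : 0;
    }
    std::vector<Composition> plan(layerCount, Composition::Gles);
    for (std::size_t i = 0; i < overlayCount; ++i) {
        plan[i] = Composition::Overlay;
    }
    assert(plan.size() == layerCount);
    return plan;
}

// Counts how many layers fall back to GLES composition in a plan.
std::size_t countGles(const std::vector<Composition>& plan) {
    std::size_t n = 0;
    for (Composition c : plan) {
        if (c == Composition::Gles) ++n;
    }
    return n;
}
```

Under this policy, three layers on a device with four planes need no GLES work at all, while six layers on the same device push three of them through GLES. That is the sense in which an application's layer count affects power and performance.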
Let's continue with the SurfaceFlinger main thread, which corresponds to step 3 above. The figure below shows the communication between SurfaceFlinger and HWC.
This also corresponds to the latter part of the topmost figure.
There are many details here as well, so I won't go into them. HWC is worth mentioning because it is not only an important link in the rendering pipeline; its performance also affects the performance of the whole device. The article Overview of Jank and Frame Drop Causes in Android – System lists jank problems caused by HWC (insufficient performance, slow interrupt signals, and so on).
To learn more about HWC, see the article Android P graphics display system (a) hardware synthesis HWC2; the rest of that author's Android P graphics display system series is also worth reading.
References
- Android P graphics display system (a) hardware synthesis HWC2
- Android P graphics display system
- The definition of SurfaceFlinger
- SurfaceFlinger