This is the seventh article in the Systrace series and covers the Vsync mechanism in Android. It looks, from a Systrace perspective, at how the Android system displays each frame based on Vsync. Vsync is a key mechanism when reading Systrace. We cannot see or touch it when using a phone, but Systrace shows the Android system rendering and composing every frame in an orderly manner, paced by the Vsync signal, which is why we get a stable frame rate.
The purpose of this series is to use Systrace as a tool for looking at the overall operation of the Android system from a different angle, and for learning the Framework along the way. Maybe you have read many articles about the Framework but cannot remember the code or the flow it runs through. Seeing it graphically in Systrace may help you understand it more deeply.
Series of articles
- Systrace profile
- Systrace Basics – Systrace prep
- Systrace Basics – Why 60 FPS?
- Systrace Basics – SystemServer interpretation
- Systrace Basics – SurfaceFlinger interpretation
- Systrace Basics – Input interpretation
- Systrace Basics – Vsync interpretation
- Systrace Basics – Vsync-App: Choreographer based Rendering mechanics in detail
- Systrace Basics – MainThread and RenderThread interpretation
- Systrace Basics – Binder and lock competition interpretation
- Systrace Basics – Triple Buffer interpretation
- Systrace Basics – CPU Info interpretation
Main text
Vsync signals can be generated by hardware or simulated by software, but these days they are mostly generated by hardware. HWC is responsible for generating hardware Vsync: it generates Vsync events and sends them to SurfaceFlinger via callbacks. DispSync then generates the VSYNC_APP and VSYNC_SF signals, which are used by Choreographer and SurfaceFlinger respectively.
The article on the Choreographer-based rendering mechanism in this series explains it this way: Choreographer was introduced to work with Vsync and give upper-layer App rendering a stable point in time for Message processing. That point is the arrival of Vsync, and the system controls the timing of each frame's drawing operations by adjusting the Vsync period. At present, most phones have a 60 Hz refresh rate, that is, one refresh every 16.6 ms. To match the screen refresh rate, the system also sets the Vsync period to 16.6 ms. In each period, the Vsync signal wakes up Choreographer to do the App's drawing work, and this is the main role of Choreographer.
The rendering layer (App) interacts with Vsync through Choreographer, while the composition layer interacts with Vsync through SurfaceFlinger. When Vsync arrives, SurfaceFlinger composes all the Surfaces that are ready.
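As a quick check of the 16.6 ms figure, the Vsync period follows directly from the refresh rate. The snippet below is only a minimal sketch of that arithmetic; the function name `vsync_period_ns` is made up for this illustration, and the 90 Hz and 120 Hz inputs are just extra sample values.

```python
def vsync_period_ns(refresh_rate_hz: int) -> int:
    """Return the Vsync period in nanoseconds for a given refresh rate."""
    # One second is 1_000_000_000 ns; integer division truncates the remainder.
    return 1_000_000_000 // refresh_rate_hz


for hz in (60, 90, 120):
    period = vsync_period_ns(hz)
    print(f"{hz} Hz -> {period} ns ({period / 1e6:.2f} ms)")
```

For 60 Hz this gives 16666666 ns, the same value that dumpsys SurfaceFlinger reports as the refresh period.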
The following figure shows VSYNC_APP and VSYNC_SF in the SurfaceFlinger process in Systrace
Android graphics data flow
First, we need a rough picture of how graphics data flows in Android. Based on the following figure and Android's graphics data flow, the path from App drawing to screen display can be roughly divided into the following stages:
- Stage 1: When the App receives Vsync-App, it runs measure, layout and draw on the main thread (building the DisplayList, which contains the commands and data needed for OpenGL rendering). This corresponds to the “doFrame” operation on the main thread in Systrace
- Stage 2: The CPU uploads (shares or copies) the data to the GPU. On ARM devices, memory is generally shared between the GPU and the CPU. This corresponds to the “Flush Drawing Commands” operation on the render thread in Systrace
- Stage 3: The GPU is told to render. On real devices the CPU generally does not block waiting for GPU rendering to complete; it returns after the notification and carries on with other tasks. The Fence mechanism is used to synchronize the GPU and the CPU
- Stage 4: swapBuffers is called and SurfaceFlinger is notified to compose layers. This corresponds to the “eglSwapBuffersWithDamageKHR” operation on the render thread in Systrace
- Stage 5: SurfaceFlinger starts composing layers. If the previously submitted GPU rendering task has not finished, it waits for the GPU rendering to complete and then composes (Fence mechanism). Composition still relies on the GPU, but as a separate, later task. This corresponds to the onMessageReceived operation on the SurfaceFlinger main thread in Systrace (including handleTransaction, handleMessageInvalidate and handleMessageRefresh). SurfaceFlinger delegates part of the composition work to Hardware Composer to reduce the load on OpenGL and the GPU: only layers that Hardware Composer cannot handle, or that are designated for OpenGL, are composed by the GPU; all other layers are composed by Hardware Composer
- Stage 6: The final composed data is placed into the screen's Frame Buffer and becomes visible at the next fixed screen refresh
The figure below is also an official one. Read from left to right together with the stages above, it shows how one frame of data flows between processes
Image data flow in Systrace
Now that we know how graphics data flows in Android, we can map this fairly abstract data flow diagram onto Systrace
The figure above mainly contains the SurfaceFlinger, App and HWC processes. The labels in the figure are used below to walk through the data flow
- The first Vsync signal arrives, and SurfaceFlinger and the App receive the Vsync signal at the same time
- SurfaceFlinger receives the Vsync-SF signal and starts composing the Buffer that the App rendered for the previous frame
- The App receives the Vsync-App signal and starts rendering the Buffer of this frame (corresponding to stages 1, 2, 3 and 4 above)
- The second Vsync signal arrives, and SurfaceFlinger and the App again receive it at the same time. SurfaceFlinger takes the Buffer that the App rendered in the previous step and starts composing it (corresponding to stage 5 above). The App receives the Vsync-App signal and starts rendering the Buffer of a new frame (corresponding to stages 1, 2, 3 and 4 above)
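The steps above can be laid out as a small timeline. This is a toy model, not a trace parser: it assumes a 60 Hz display, an Offset of 0, and that every frame finishes rendering within one Vsync period. The function name `offset_zero_timeline` is hypothetical.

```python
VSYNC_PERIOD_NS = 16_666_666  # 60 Hz display


def offset_zero_timeline(num_vsyncs):
    """Return one row per Vsync: (index, time in ns, App action, SurfaceFlinger action)."""
    rows = []
    for n in range(num_vsyncs):
        app_action = f"render frame {n}"
        # SurfaceFlinger composes the Buffer the App rendered one Vsync earlier.
        sf_action = f"compose frame {n - 1}" if n > 0 else "compose the frame rendered earlier"
        rows.append((n, n * VSYNC_PERIOD_NS, app_action, sf_action))
    return rows


for index, time_ns, app_action, sf_action in offset_zero_timeline(3):
    print(f"Vsync {index} @ {time_ns / 1e6:6.2f} ms | App: {app_action} | SF: {sf_action}")
```

Each row pairs the frame the App is rendering with the frame SurfaceFlinger is composing at the same Vsync, which is the one-period lag visible in the Systrace figure.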
Vsync Offset
As mentioned at the beginning, Vsync signals can be generated by hardware or simulated by software, but now they are mostly generated by hardware. HWC is responsible for generating hardware Vsync: it generates Vsync events and sends them to SurfaceFlinger via callbacks. DispSync then generates the VSYNC_APP and VSYNC_SF signals used by Choreographer and SurfaceFlinger.
VSYNC_APP and VSYNC_SF each have an offset relative to HW_Vsync_0, namely phase-app and phase-sf, as shown in the following figure
Vsync Offset here means the offset between VSYNC_APP and VSYNC_SF, that is, the value of phase-sf minus phase-app in the figure above. This Offset can be configured by the manufacturer. If Offset is not 0, the App and the SurfaceFlinger main thread do not receive the Vsync signal at the same time but Offset apart (usually between 0 and 16.6 ms).
At present, most manufacturers do not configure this Offset, so the App and SurfaceFlinger receive the Vsync signal at the same time.
The corresponding values can be viewed with adb shell dumpsys SurfaceFlinger
Offset is 0 (sf phase - app phase = 0)
Sync configuration: [using: EGL_ANDROID_native_fence_sync EGL_KHR_wait_sync]
DispSync configuration:
app phase 1000000 ns, sf phase 1000000 ns
early app phase 1000000 ns, early sf phase 1000000 ns
early app gl phase 1000000 ns, early sf gl phase 1000000 ns
present offset 0 ns refresh 16666666 ns
Offset is not 0 (sf phase - app phase = 4 ms)
Sync configuration: [using: EGL_ANDROID_native_fence_sync EGL_KHR_wait_sync]
VSYNC configuration:
app phase: 2000000 ns SF phase: 6000000 ns
early app phase: 2000000 ns early SF phase: 6000000 ns
GL early app phase: 2000000 ns GL early SF phase: 6000000 ns
present offset: 0 ns VSYNC period: 16666666 ns
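The Offset can also be computed from such a dump instead of read by eye. The sketch below is one possible way to do it, assuming the text has already been captured from dumpsys SurfaceFlinger and that the phase lines look like the two samples in this article. The regular expression and the function name `vsync_offset_ns` are my own, not part of any Android tool.

```python
import re

# Matches "app phase 1000000 ns" and "SF phase: 6000000 ns", but skips the
# "early ..." variants thanks to the negative lookbehind.
PHASE_RE = re.compile(r"(?<!early )\b(app|sf) phase:?\s+(\d+) ns", re.IGNORECASE)


def vsync_offset_ns(dumpsys_text):
    """Return sf phase minus app phase, in nanoseconds."""
    phases = {name.lower(): int(value) for name, value in PHASE_RE.findall(dumpsys_text)}
    return phases["sf"] - phases["app"]


OFFSET_ZERO = """\
DispSync configuration:
app phase 1000000 ns, sf phase 1000000 ns
early app phase 1000000 ns, early sf phase 1000000 ns
early app gl phase 1000000 ns, early sf gl phase 1000000 ns
present offset 0 ns refresh 16666666 ns
"""

OFFSET_4MS = """\
VSYNC configuration:
app phase: 2000000 ns SF phase: 6000000 ns
early app phase: 2000000 ns early SF phase: 6000000 ns
GL early app phase: 2000000 ns GL early SF phase: 6000000 ns
present offset: 0 ns VSYNC period: 16666666 ns
"""

print(vsync_offset_ns(OFFSET_ZERO))
print(vsync_offset_ns(OFFSET_4MS))
```

The first sample yields an Offset of 0 ns and the second 4000000 ns (4 ms), matching the two cases described above. Other Android versions may print these lines differently, so the pattern may need adjusting.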
Below we look at how Offset shows up in Systrace
Offset is 0
First, when Offset is 0, the App and SurfaceFlinger receive the Vsync signal at the same time, and the corresponding Systrace figure is as follows:
The figure already carries its own annotations, so I will not go into detail here. The point to note is that a Buffer rendered by the App is composed by SurfaceFlinger only when the next Vsync-SF arrives, about 16.6 ms later. At this point you might ask: if SurfaceFlinger were triggered to compose as soon as the App finishes rendering its Buffer and swaps it into the BufferQueue, wouldn't that save some time (0-16.6 ms)?
The answer is yes, and this is why the Offset mechanism was introduced. With an Offset, the App receives the Vsync signal first and renders a frame; after the Offset has elapsed, SurfaceFlinger receives its Vsync signal and starts composing. If the App's Buffer is ready by then, this composition includes the App's frame, and the user sees it earlier.
Offset is not 0
The following figure shows a case where Offset is 4 ms. SurfaceFlinger receives its Vsync signal 4 ms after the App receives Vsync
Advantages and disadvantages of Offset
One of the difficulties with Offset is deciding how long it should be, which is also one reason why many manufacturers do not configure it by default. Its advantages and disadvantages are not fixed; they depend heavily on the performance of the device and on the usage scenario
- If Offset is too short, SurfaceFlinger may receive Vsync-SF before the App, having received Vsync-App, has finished rendering. If the App's BufferQueue holds no previously accumulated Buffer, this composition will not include the App's content, which has to wait for the next Vsync-SF. The delay then becomes Vsync period + Offset instead of the Offset we expected
- If Offset is too long, it brings little benefit: the rendered Buffer still waits most of a Vsync period before it is composed
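The trade-off in the two points above can be written as a small model. This is a simplified sketch, assuming the BufferQueue is empty, that SurfaceFlinger picks up the frame at the first Vsync-SF after rendering completes, and that fences and other overheads are ignored. The function name `composition_delay_ns` is hypothetical.

```python
def composition_delay_ns(render_ns, offset_ns, period_ns=16_666_666):
    """Time from Vsync-App until the Vsync-SF that composes the frame."""
    if render_ns <= offset_ns:
        # Rendering finished before Vsync-SF arrived: the delay is just Offset.
        return offset_ns
    # Otherwise wait for a later Vsync-SF, one or more full periods after the first.
    extra_ns = render_ns - offset_ns
    periods = -(-extra_ns // period_ns)  # ceiling division on integers
    return offset_ns + periods * period_ns


# Offset 4 ms, rendering takes 3 ms: composed after 4 ms.
print(composition_delay_ns(3_000_000, 4_000_000))
# Offset 4 ms, rendering takes 6 ms: composed after Vsync period + Offset.
print(composition_delay_ns(6_000_000, 4_000_000))
# Offset 0, rendering takes 10 ms: composed one full period later.
print(composition_delay_ns(10_000_000, 0))
```

In this model, a frame that misses the first Vsync-SF pays a full extra period, which is why an Offset shorter than the typical render time can end up worse than no Offset at all.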
HW_Vsync
Note that the hardware does not generate a Vsync for every Vsync request. HW_VSYNC is requested from HWC only when the gap between this request and the last one exceeds 500 ms
Taking scrolling on the home screen as an example, you can check the HW_VSYNC status in the trace of the SurfaceFlinger process
When an App requests Vsync, there are two cases: HW_VSYNC is not used, or HW_VSYNC is used
Without HW_VSYNC
With HW_VSYNC
HW_VSYNC is mainly used to calibrate predictions against the most recent hardware Vsync samples, with a minimum of 3 and a maximum of 32. If the error measured against the received Present Fence stays within the threshold, hardware Vsync is turned off. Otherwise, hardware Vsync keeps being received and the SW_VSYNC value keeps being recalculated until the error drops below the threshold. For more details, see the article "SW-VSYNC generation and delivery". Its conclusion is excerpted here
❝
SurfaceFlinger implements HWC2::ComposerCallback. When HW-vsync arrives, SurfaceFlinger receives a callback and passes it to DispSync. DispSync records the timestamps of these HW-vsync events. Once enough HW-vsync events have been received (currently 6 or more), the SW-vsync period, mPeriod, is calculated. DispSyncThread uses the calculated mPeriod to simulate the periodicity of HW-vsync and to notify the listeners interested in VSYNC. These listeners include SurfaceFlinger and every app that needs to render frames. They register with EventThread in the abstract form of a Connection. DispSyncThread and EventThread are connected through DispSyncSource as a middleman. After receiving SW-vsync, EventThread notifies all interested Connections; SurfaceFlinger then starts composing and the app starts drawing frames. Once enough HW-vsync events have been received and the error is within the allowed range, HW-vsync is turned off via EventControlThread.
❞
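To make the idea in the excerpt concrete, here is a heavily simplified sketch of estimating a period from HW-vsync timestamps and then predicting software Vsync times from it. It is not the actual DispSync algorithm, which also tracks phase and error; the function names and the plain averaging are assumptions made for illustration. Only the minimum of 6 samples is taken from the excerpt.

```python
MIN_SAMPLES = 6  # the excerpt says 6 or more HW-vsync samples are needed


def estimate_period_ns(timestamps_ns):
    """Average interval between consecutive HW-vsync timestamps, or None if too few."""
    if len(timestamps_ns) < MIN_SAMPLES:
        return None
    intervals = [b - a for a, b in zip(timestamps_ns, timestamps_ns[1:])]
    return sum(intervals) // len(intervals)


def next_sw_vsync_ns(now_ns, reference_ns, period_ns, phase_ns):
    """First tick strictly after now_ns on the grid reference + phase + k * period."""
    base_ns = reference_ns + phase_ns
    ticks = (now_ns - base_ns) // period_ns + 1
    return base_ns + ticks * period_ns


samples = [i * 16_666_666 for i in range(6)]
period = estimate_period_ns(samples)
print(period)
# Next app-side tick after t = 20 ms with a 2 ms phase.
print(next_sw_vsync_ns(20_000_000, samples[0], period, 2_000_000))
```

Calling `next_sw_vsync_ns` with different phase values gives separate tick grids for the App and for SurfaceFlinger from the same estimated period, which is the role the phase offsets play in the dumpsys output.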
References
- VSYNC
- juejin.cn/post/684490…
- gityuan.com/2017/02/05/…
- SW-VSYNC generation and delivery
- echuang54.blogspot.com/2015/01/dis…