Author: Xianyu (Idle Fish) Technology – Cloud cong

1 Overall approach

As the business iterates rapidly, scrolling through long lists in the app has gradually become less smooth, which hurts the browsing experience. Xianyu was a pioneer of Flutter in China, and the app is a hybrid project of Flutter and native code. Here we share our optimizations for both Android native pages and Flutter pages.

This article is divided into three parts:

  • Fluency indicators and detection tool construction
  • Native Android long list optimization
  • Flutter long list optimization

The overall approach to fluency optimization is shown in the following diagram:

2 Fluency indicators and detection tool construction

2.1 Current situation and difficulties

Current state of detection tools: taking Android as an example, existing fluency tools fall into two groups:

  • Intrusive
    • Integrate an SDK that calculates fluency by registering a frame callback (see the Android Choreographer class)
    • Profile mode
  • Non-intrusive
    • Execute system commands, such as adb shell dumpsys gfxinfo ${packageName}
    • The Tencent GT app, whose low-level implementation calls service call SurfaceFlinger 1013; later versions of Android no longer support it

The fluency indicators currently in use are:

  • FPS (Frames Per Second)
  • SF (SkippedFrame)

The number of frames Choreographer skips within 1 second (see Mobile App Performance Evaluation and Optimization)

  • SM (Smooth)

The number of times the app actually executes Choreographer doFrame() within 1 second. On a 60 Hz display, SM = 60 - SF (see Mobile App Performance Evaluation and Optimization)

  • Frame time data

Frame-time statistics with several key percentiles, obtained with the adb command:

Total frames rendered: 2245
Janky frames: 31 (1.38%)
50th percentile: 5ms
90th percentile: 10ms
95th percentile: 14ms
99th percentile: 18ms

However, in the complex scenarios of a real app, these tools and indicator definitions still have problems:

  1. Multi-platform problem

The technologies used in apps today include native, H5, mini programs, RN, Weex, Flutter and others. No existing non-intrusive fluency tool supports multiple platforms, multiple device models and multiple indicators at the same time, while intrusive tools cannot measure competitors' apps.

  2. Consistency between the chosen indicators and user experience

We want a small number of indicators that accurately express how smooth the app feels to the user. Average FPS (and likewise SM and SF) is not enough. For example, the same 30 FPS can mean 30 frames of 33.3 ms each within 1 s, or 29 frames of 16.6 ms plus a single frame of 516.9 ms, and the two feel very different to the user.

  3. Fluency data is influenced by many factors

The scroll state, whether idle (stopped), drag (finger dragging) or fling (inertial scrolling), strongly affects the fluency data.

2.2 Defining the fluency indicators

Wikipedia defines animation as a technique in which a series of still images (frames), shown at a certain rate (for example 16 frames per second), is perceived by the naked eye as motion because of persistence of vision.

Similarly, for list scrolling, the app computes a series of still pictures at a certain frequency (one every 16.6 ms at 60 Hz) with different offsets, so that the naked eye sees a scrolling animation.

When we say a list does not scroll smoothly, either the frequency is too low for persistence of vision to work, or there is a jump in time (how long a picture stays on screen) or in space (the content of the picture) that the user perceives as unnatural. Based on this, we can define the indicators as follows:

  • Time perspective
    • Average FPS: the average frame rate over one measurement. It reflects the average time each picture stays on screen.
    • Big-lag count per second: the average number of pictures per second that stay on screen for 3 frames or more. It reflects jumps in picture dwell time.
  • Space perspective
    • Offset: even when no frames are dropped, if the content of one frame jumps, or the screen shows garbled or green output, the user perceives the scrolling as not smooth. While the app scrolls, the screen content is determined by the offset, and an offset jump depends on both the stall duration and the implementation of the scroll interpolator. Existing interpolators are basically implemented on a D/T (distance/time) curve, so the average FPS and the big-lag count per second largely reflect picture jumps as well. Considering also how hard it is to detect the offset non-intrusively, the offset jump is not included as an indicator.

To sum up, we define the fluency indicators as the average FPS and the big-lag count per second.
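The two indicators can be computed from a list of picture dwell times. The following is a minimal sketch rather than the tool's actual code: the function names are hypothetical, a 60 Hz display is assumed, and "3 frames or more" is interpreted as a dwell time that rounds to at least three 16.6 ms intervals.

```dart
/// Frame interval at 60 Hz, in milliseconds (assumed refresh rate).
const double frameIntervalMs = 1000 / 60;

/// Average FPS over one measurement: pictures shown per elapsed second.
double averageFps(List<double> dwellTimesMs) {
  final totalMs = dwellTimesMs.fold<double>(0.0, (sum, d) => sum + d);
  return dwellTimesMs.length / (totalMs / 1000);
}

/// Big lags per second: pictures that stayed on screen for 3 or more
/// frame intervals, divided by the elapsed time in seconds.
double bigLagsPerSecond(List<double> dwellTimesMs) {
  final totalMs = dwellTimesMs.fold<double>(0.0, (sum, d) => sum + d);
  final bigLags =
      dwellTimesMs.where((d) => (d / frameIntervalMs).round() >= 3).length;
  return bigLags / (totalMs / 1000);
}

void main() {
  // The two one-second traces used as the 30 FPS example in section 2.1.
  final steady = List<double>.filled(30, 33.3);
  final spiky = [...List<double>.filled(29, 16.6), 516.9];
  for (final trace in [steady, spiky]) {
    print('avg FPS ${averageFps(trace).toStringAsFixed(1)}, '
        'big lags/s ${bigLagsPerSecond(trace).toStringAsFixed(1)}');
  }
}
```

Both traces average about 30 FPS, but only the second one registers a big lag, which is why the two indicators are reported together.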

2.3 Fluency detection tool implementation

We start from a screen recording of the app and calculate the fluency indicators from it. Given the recorded frames while the app scrolls, we check every 16.6 ms whether the picture has changed. If consecutive samples show the same picture, a stall has occurred, and the number of consecutive unchanged frames gives the duration of the stall.

To obtain the screen recording of the target app, the detection tool app registers a screen recording service with the system. It then keeps reading the recorded picture in its own frame callback and compares the picture's hash with that of the previously read picture.

  • The detection tool app runs in a separate process from the target app, so stalls in the target app do not affect the frame callback of the detection tool app
  • To make sure that reading each recorded picture and computing its hash finishes within 16.6 ms, the width and height compression ratios of the recording are tuned separately for high-end and low-end devices.
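The comparison step can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and each list entry stands for the hash of one picture sampled every 16.6 ms, however the real tool computes it.

```dart
/// Given one picture hash per 16.6 ms sample, returns how many consecutive
/// samples each distinct picture stayed on screen.
List<int> pictureDwellFrames(List<int> frameHashes) {
  final runs = <int>[];
  if (frameHashes.isEmpty) return runs;
  var run = 1;
  for (var i = 1; i < frameHashes.length; i++) {
    if (frameHashes[i] == frameHashes[i - 1]) {
      run++; // picture unchanged: the current picture stays one more frame
    } else {
      runs.add(run);
      run = 1;
    }
  }
  runs.add(run);
  return runs;
}

void main() {
  // Hypothetical hash sequence: the second picture is held for 3 samples.
  final hashes = [11, 22, 22, 22, 33, 44, 44];
  final runs = pictureDwellFrames(hashes);
  final bigLags = runs.where((frames) => frames >= 3).length;
  print('dwell frames: $runs, big lags: $bigLags');
}
```

A run of length 1 is a picture that changed on schedule; a run of 3 or more counts as one big lag.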

To keep differences in manual scrolling from affecting the fluency values, we drive both the detection tool app and the target app with an automated script. The script operates the phone through adb commands:

  # Tap
  adb shell input tap $x $y
  # Swipe
  adb shell input swipe $x1 $y1 $x2 $y2 $duration

2.4 Detection tool demonstration

The fluency detection tool app is displayed as a floating window on top of the target app, as shown below:

Fluency detection tool interface

2.5 Summary and outlook

For the fluency indicators, we defined the average FPS and the big-lag count per second, which reflect user experience better. For the detection tool, we implemented a non-intrusive tool with the following features:

  • Non-intrusive
  • Supports measuring third-party apps
  • Supports multiple platforms: native, Flutter, H5, mini programs
  • Multi-dimensional data: average FPS, average big-lag count per second, histogram of the frame distribution, mean square error of the frame distribution
  • Automated operation, which avoids differences between manual runs

The fluency detection tool also has some drawbacks:

  • Lists that contain video cards
    • When scrolling stops while a video in the list is playing, the tool cannot tell that scrolling has stopped because the picture keeps changing. In addition, the video's FPS is around 30, which drags the fluency data down
    • How to avoid it: make sure the list never stops scrolling during a measurement
  • The real FPS is calculated incorrectly on low-end devices (Vivo Y67)
    • To keep the hash calculation for large pictures within 16 ms on low-end devices such as the Vivo Y67, the screen recording is compressed heavily (width compression 100, height compression 10). Subtle picture changes in areas that are mostly blank or filled with large color blocks can therefore go undetected, so the calculated FPS comes out too low.
    • How to avoid it: on low-end devices, avoid measuring scenes with large blank areas or large color blocks

3 Native Android long list optimization

Long list optimization on Android is very mature. Tools include TraceView, BlockCanary, DDMS and Android Profiler. Common optimization methods are also plentiful: flattening the layout hierarchy, reducing overdraw, avoiding frequent measure and layout passes, speeding up time-consuming methods on the UI thread, removing redundant resource loading and so on. They are not described here.

In addition, Xianyu optimized its home page in the following two ways.

3.1 Asynchronously building a view cache pool

Tool measurements and timing logs show that building list item views (RecyclerView.Adapter.onCreateViewHolder) is very time-consuming during the initial scroll and when load-more is triggered.

Looking at how the home page is first displayed and first scrolled, there is room for optimization: the time spent on other UI operations and on waiting for user input can be put to use.

Following the principle of AsyncLayoutInflater, we build a view cache pool asynchronously. The optimized home page list flow is as follows:

When the view cache pool finishes building varies by device model. It may finish before the cards of the first screen are built, while they are being built, or before the user's first scroll. Construction may also stop right at the start because an error is thrown.

Note: AsyncLayoutInflater cannot be used directly. When an asynchronous inflate fails, AsyncLayoutInflater falls back to inflating on the UI thread. To prevent this fallback from building the cache pool on the UI thread and making the page stall even more, the fallback logic has to be removed: if asynchronous inflation fails, construction of the cache pool simply stops.

3.2 ViewDataUnbinder: quickly extracting UI operations

The card data-binding phase (RecyclerView.Adapter.onBindViewHolder) is very time-consuming on low-end devices. The reason is that UI operations and non-UI operations are mixed together in the data-binding methods. Because UI logic must run on the UI thread, all of the logic ends up running on the UI thread.

One option is to define a view data layer that separates UI from non-UI operations. In practice, however, this requires extensive changes to the business code, is error-prone, and makes A/B test logic hard to implement. Is there a way to extract the UI operations with as few code changes as possible?

Our approach: a ViewData class is generated automatically at compile time from the view class and replaces the view class instance. The ViewData class has the same key method signatures as the view class. As the methods execute, it records the view operations, then switches to the UI thread to perform them.

The concrete steps are as follows:

  1. Annotate the view class

Annotate the view class with ViewDataAnno, and annotate its UI operation methods with UIMethodAnno.

Annotation example

  2. Generated ViewData class

  3. Business code modification
    • Change the type of the view variable to ViewData
    • Move the original view data-binding logic to a background thread
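The record-and-replay idea behind the generated ViewData class can be illustrated with a small hand-written sketch. It is written in Dart only to stay consistent with the Flutter code in this article; the real component is generated for Android view classes. TitleView, TitleViewData and applyTo are hypothetical names, and the thread switch is reduced to a later call.

```dart
/// Hypothetical view whose methods must run on the UI thread.
class TitleView {
  String text = '';
  bool visible = false;
  void setText(String value) => text = value;
  void setVisible(bool value) => visible = value;
}

/// Stand-in for a generated ViewData class: same method signatures as
/// [TitleView], but every call is only recorded, not executed.
class TitleViewData {
  final List<void Function(TitleView)> _recorded = [];

  void setText(String value) => _recorded.add((v) => v.setText(value));
  void setVisible(bool value) => _recorded.add((v) => v.setVisible(value));

  /// Replays the recorded operations in their original order. On Android
  /// this is the step that would be posted to the UI thread.
  void applyTo(TitleView view) {
    for (final operation in _recorded) {
      operation(view);
    }
    _recorded.clear();
  }
}

/// Business binding code: unchanged except for the parameter type.
void bindTitle(TitleViewData view, Map<String, String> item) {
  final title = (item['title'] ?? '').trim(); // non-UI work
  view.setText(title); // recorded, not executed yet
  view.setVisible(title.isNotEmpty);
}

void main() {
  final data = TitleViewData();
  bindTitle(data, {'title': '  Used bike  '}); // could run off the UI thread
  final view = TitleView();
  data.applyTo(view); // replay on the UI thread
  print('${view.text} ${view.visible}');
}
```

The point of the pattern is that the binding code keeps its shape: only the variable type changes, while the expensive non-UI work leaves the UI thread.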

3.3 Optimization Results

Xianyu home page: fluency improvement after the on-screen speed of content was restored (a change that had lowered fluency)

4 Optimization of complex Flutter long lists

Flutter has always been known for high performance that rivals native, which was an important reason why Xianyu chose Flutter in the first place. On real Flutter pages such as the product details page and the search results page, however, scrolling through long lists was not as smooth as we wanted.

4.1 Tool use and common optimizations

Before optimizing Flutter performance, you need to understand how Flutter renders, for example the three trees (Widget, Element and RenderObject) and how a widget ends up on screen. See "Detailed analysis of the Flutter rendering engine" and "Flutter performance optimization: how to ensure the high performance and smoothness of Flutter for complex business?".

For performance problems, we recommend the official Flutter performance analysis tools, with the app running in profile mode. See "Introduction to Flutter performance analysis tools".

Profile mode runs only on real devices, not on emulators or simulators. It is basically the same as release mode, except that service extensions and tracing are enabled, along with the minimum support that tracing needs (such as the ability to connect Observatory to the process). The command flutter run --profile runs in this mode; for the engine, run sky/tools/gn --android --runtime-mode=profile or sky/tools/gn --ios --runtime-mode=profile. Because a simulator does not represent real-world performance, profile mode cannot run on it. (Quoted from: Flutter performance optimization: how to ensure the high performance and smoothness of Flutter for complex business)

4.1.1 Checking widget rebuilds

In Android Studio, open View → Tool Windows → Flutter Performance to check how often widgets are rebuilt. It shows that FDButtonBar is rebuilt frequently even though its visible content does not change. Tracing the code shows that reducer.dart updates scrollPercent in the state on every scroll event, which triggers the rebuild, yet on the details page scrollPercent is not used when building the widget.

Xianyu pages use fish-redux, where a method in reducer.dart that returns a different state object signals that the widget needs to be rebuilt:

// reducer.dart
// Scroll event listener
  static BottomBarState onScroll(BottomBarState state, Action action) {
    ...
    return state.clone()..scrollPercent = scrollPercent;
  }

4.1.2 Using Fish-Redux Performance Logs

Fish-redux is a Redux framework for Flutter developed by Xianyu and widely used in the Xianyu app. To print its performance logs in profile or release mode, you need to modify the source code yourself.

Checking the adb log while the Xianyu details page scrolls reveals a large number of scroll broadcast notifications, some of which take more than 1 ms to process.

11-15 15:03:43.684 27076 27271 I flutter: CommonBuyDetailPage performance: ItemBodyAction.onScrollBroadcast 261
11-15 15:03:43.701 27076 27271 I flutter: CommonBuyDetailPage performance: ItemBodyAction.onScrollBroadcast 1933
11-15 15:03:43.716 27076 27271 I flutter: CommonBuyDetailPage performance: ItemBodyAction.onScrollBroadcast 371

Timing logs in profile mode

Some views on the details page are linked to scrolling: for example, the title bar fades in and out, and the "Ask the seller" button is shown or hidden, all driven by scroll events. Looking at the business logic, once the page has scrolled beyond 600, no view other than "Ask the seller" changes in response to scroll events. "Ask the seller" itself is hidden permanently once scrolling passes a larger threshold; before that threshold is reached, it only needs to know the scroll direction. Given this background, after scrolling beyond 600, no scroll events are sent if "Ask the seller" is no longer displayed; otherwise events are sent only within a distance of 30 from the start of each scroll.

In reducer.dart, returning a new state object means the widget will be rebuilt. Review every method in your reducer.dart files for widget rebuilds that are unnecessary.
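A reducer that avoids unnecessary rebuilds can be sketched as follows. This is a simplified stand-alone illustration, not fish-redux code: the state class, its field and the reducer signature are hypothetical, and the 600 threshold is borrowed from the description above purely as an example value.

```dart
/// Hypothetical state for a bottom bar; only [collapsed] is read in build().
class BottomBarState {
  bool collapsed = false;

  BottomBarState clone() => BottomBarState()..collapsed = collapsed;
}

/// Returns the same instance when nothing the view depends on has changed,
/// so a framework that rebuilds on a new state object has nothing to do.
BottomBarState onScroll(BottomBarState state, double scrollOffset) {
  final collapsed = scrollOffset > 600;
  if (collapsed == state.collapsed) {
    return state; // no view data changed: no new object, no rebuild
  }
  return state.clone()..collapsed = collapsed;
}

void main() {
  final initial = BottomBarState();
  final afterSmallScroll = onScroll(initial, 120);
  final afterLargeScroll = onScroll(initial, 900);
  print(identical(initial, afterSmallScroll));
  print(identical(initial, afterLargeScroll));
}
```

Only the second call crosses the threshold, so only it produces a new state object.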

4.1.3 Optimizing ClipRect and ClipRRect

Looking at the render thread cost in Timeline, you can see multiple ClipRectLayer and ClipRRectLayer entries.

After turning on the debug flags debugDisableClipLayers and debugDisablePhysicalShapeLayers and re-examining the view, you can see that some ClipRectLayers are generated because image content extends beyond the view bounds, while some ClipRRectLayers come from the ClipRRect used for rounded corners in the card widget and in the image control based on external textures (even when the radius is 0).

With the cause understood, we added new parameters to the Xianyu image control that support rounded corners and width/height cropping of the image content, so that the bitmap produced by the native layer already has the required corners and aspect ratio. We also fixed the case where ClipRRect was applied with a radius of 0. The optimized Timeline is shown below:

4.1.4 Other Optimization Suggestions

There are many excellent articles on Flutter performance optimization, so this article does not repeat the usual investigation and optimization methods. Here is a brief summary:

  • Widget build optimization
    1. Call setState as low in the view tree as possible
    2. How a Model is obtained from a Provider affects the refresh scope. Use Selector or Consumer to obtain the ancestor Model so that the refresh scope stays minimal
    3. For long lists, avoid the ListView() constructor and use the ListView.builder constructor instead
    4. In a reducer, create a new state object only when the view data in the state really changes
  • Main isolate optimization
    5. Reduce or defer non-view logic in widget build, for example defer exposure tracking until scrolling stops and trigger it in aggregate
    6. If the height of list items is known, set itemExtent to avoid repeated height calculation while scrolling
    7. Mark widgets and ordinary objects that never change with const
    8. When using AnimatedBuilder, avoid building inside the builder function a subtree that does not depend on the animation, because every animation tick rebuilds it. Instead, build that subtree once and pass it to AnimatedBuilder as child
    9. Avoid clipping during animations. If possible, pre-clip the image before the animation starts
  • Render thread optimization
    10. Isolate frequently updated controls (such as animations) with RepaintBoundary, which creates a separate layer and reduces the repainted area
    11. Reduce the use of saveLayer (ShaderMask, ColorFilter, Text Overflow) and clipPath to improve render thread performance
    12. Avoid the Opacity widget, especially in animations. Use AnimatedOpacity or FadeInImage instead
    13. Avoid long text with line breaks
  • Use debug flags to troubleshoot problems (see "Introduction to Flutter performance analysis tools")
  • Take advantage of framework logs, such as fish-redux performance logs

4.2 List Element reuse optimization

A Flutter list control has a visible area and cache areas. When the list scrolls down, an element is created as it enters the bottom cache area, moves into the visible area, then into the top cache area, and is finally destroyed. Scrolling up works the same way in reverse. When scrolling back and forth without keepAlive, elements that were created before have to be created again. In our business the item widgets in a list are similar in structure, so we can improve performance by reusing elements by type.

See the list control source in sliver_list.dart. In RenderSliverList.performLayout(), elements are cached in _childElements, indexed by index. If the item widgets differ greatly in structure, reusing an element does not help, because Element.updateChild ends up calling inflateWidget internally anyway. We therefore reuse elements by type, with the widget key identifying the type:

${widget.key} → List<Element>

When a widget is created, record the mapping index → ${widget.key}. At the point where an element would normally be destroyed and removed, cache it instead in the List<Element> stored under its ${widget.key} (note that its renderObject must be removed from the parent node). While the list scrolls, look up a cached element through these mappings first and reuse it (remember to update the index in element.renderObject.parentData).
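The cache structure itself can be sketched independently of Flutter. The class below is a hypothetical, generic illustration of the `${widget.key} → List<Element>` mapping; it does not touch the framework's Element or RenderObject classes, so detaching the render object and updating parentData are left out.

```dart
/// Pool of reusable objects grouped by a type key.
class ReusePool<K, E> {
  final Map<K, List<E>> _cache = {};

  /// Called at the point where the element would otherwise be destroyed.
  void release(K typeKey, E element) {
    _cache.putIfAbsent(typeKey, () => <E>[]).add(element);
  }

  /// Returns a cached element of the given type, or null if none is
  /// available and a new one has to be created.
  E? acquire(K typeKey) {
    final bucket = _cache[typeKey];
    if (bucket == null || bucket.isEmpty) return null;
    return bucket.removeLast();
  }
}

void main() {
  // Strings stand in for elements; the keys stand in for widget keys.
  final pool = ReusePool<String, String>();
  pool.release('goods_card', 'element#1');
  print(pool.acquire('goods_card')); // reuses the released element
  print(pool.acquire('goods_card')); // pool is empty again
  print(pool.acquire('banner_card')); // no element of this type was cached
}
```

Grouping by type key is what makes reuse worthwhile: an element is only handed to a widget with the same structure, so updating it is cheaper than inflating a new one.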

4.3 Rendering complex widgets across frames

Even after trying all of the optimizations above, the details page and search page of Xianyu still fell far short of expectations. The reason is that the "Guess You Like" card and the search result card are complex in themselves, and the DX technology we introduced makes the widget trees even larger. As a result, even high-end devices cannot build and render a card within a single frame. From a business perspective, however, both the rich content of the cards and the dynamic capabilities of DX are necessary. How can a large widget perform well while still meeting the business needs?

Where the business only needs a Text, DX uses a DXTextWidget

Timeline of the "Guess You Like" card on a Redmi K30 Pro (Snapdragon 865 CPU)

Timeline of the search results card, annotated with performLayout, updateChild and widget build

When the known common optimizations are not enough, we go back to the starting point of GUI performance optimization. Fluency optimization can take roughly three directions:

  1. Multithreading

This is common in native Android development. In the Dart world, however, memory is isolated between threads (isolates). In addition, because Flutter's rendering pipeline involves three trees and we cannot operate on RenderObject directly, multithreaded schemes are hard to implement in Flutter (apart from conventional cases such as doing IO and then displaying the updated data).

  2. Optimize each task, squeezing the CPU work so that all tasks finish within one frame (16.6 ms)

This is the main line of optimization in Flutter, and all of the methods above follow it.

  3. Respond quickly to the user, so that the app feels fast enough and never blocks interaction. That is, if some tasks cannot finish within one frame, stop executing them so that list scrolling goes first, and run the remaining tasks in the time slices of the following frames

This follows the React Fiber framework. Based on time slicing, the reconciliation phase turns the task tree into a task chain (parent node → child node → sibling node → parent node), so that the chain can be interrupted and rendering can be committed early. A task chain is thus broken down and digested over the time slices of several frames.

With directions 1 and 2 ruled out, only direction 3 is left. The Timeline of the "Guess You Like" card shows that the frame in which the card widget is created takes more than 16.6 ms, while the following frames take far less than 16.6 ms, which suggests that direction 3 is the right one. That leaves two key questions:

  1. Can the build task of a large widget be split into several smaller build tasks and spread roughly evenly over multiple time slices?
  2. Does rendering a large widget across several frames hurt the experience?

Task time graph on Timeline

Splitting a Flutter widget and rendering it across frames

Following the time slicing approach, we split a large widget into a blank skeleton and two card widgets, and then split each card widget into a card skeleton and FXImage widgets. The parts of the widget skeleton that are not displayed immediately are temporarily replaced with placeholder widgets. This gives a high-priority queue of large tasks and a low-priority queue of small tasks. A task from the high-priority queue runs first and takes up a whole frame, while tasks from the low-priority queue run afterwards, at most 12 per frame. The deferred build tasks are then marked in Flutter step by step so that they run in the following time slices.
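The two-queue scheduling rule described above can be sketched in plain Dart. This is a simplified model under stated assumptions, not the component's implementation: FrameTaskScheduler and its members are hypothetical names, frames are simulated by calling runFrame in a loop, and in a real app something would have to call it once per frame.

```dart
import 'dart:collection';

typedef BuildTask = void Function();

/// Runs at most one large task per frame; when no large task is pending,
/// runs up to [maxSmallTasksPerFrame] small tasks in that frame.
class FrameTaskScheduler {
  FrameTaskScheduler({this.maxSmallTasksPerFrame = 12});

  final int maxSmallTasksPerFrame;
  final Queue<BuildTask> _largeTasks = Queue<BuildTask>();
  final Queue<BuildTask> _smallTasks = Queue<BuildTask>();

  void addLarge(BuildTask task) => _largeTasks.add(task);
  void addSmall(BuildTask task) => _smallTasks.add(task);

  bool get hasPendingTasks =>
      _largeTasks.isNotEmpty || _smallTasks.isNotEmpty;

  /// Executes the work for one frame and returns how many tasks ran.
  int runFrame() {
    if (_largeTasks.isNotEmpty) {
      _largeTasks.removeFirst()(); // a large task owns the whole frame
      return 1;
    }
    var executed = 0;
    while (_smallTasks.isNotEmpty && executed < maxSmallTasksPerFrame) {
      _smallTasks.removeFirst()();
      executed++;
    }
    return executed;
  }
}

void main() {
  final scheduler = FrameTaskScheduler();
  // Hypothetical workload: 2 card skeletons and 14 image widgets.
  for (var i = 0; i < 2; i++) {
    scheduler.addLarge(() {});
  }
  for (var i = 0; i < 14; i++) {
    scheduler.addSmall(() {});
  }
  var frame = 0;
  while (scheduler.hasPendingTasks) {
    frame++;
    print('frame $frame ran ${scheduler.runFrame()} task(s)');
  }
}
```

Capping the number of small tasks per frame keeps each frame's extra work bounded, at the cost of placeholders staying on screen a little longer.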

As a result, the build of one huge widget is spread from 1 frame over 4 frames, which reduces the stall.

Timeline of the card (Redmi K30 Pro, Snapdragon 865 CPU)

As for the experience: as described earlier for the list control structure, there is an invisible cache area, so most of the rendering completes in that invisible area. On high-end devices, or at normal scrolling speed, users do not notice it. On low-end devices a fast fling can clearly reveal blank cards, but overall this still feels better than a severe stall.

4.4 Optimization data

With the optimizations above, the average FPS of the Xianyu details page and search page increased by 3, the big-lag count on low-end phones was halved, and on mid-range and high-end phones the FPS rose to 57 or above with a big-lag count close to 0.

The online high-availability FPS data for the details page is as follows:

FPS curve of online low-end devices. Green is the optimized version. The further right the distribution, the better the fluency

FPS curve of online high-end devices. Green is the optimized version

The online high-availability FPS data for the search page is as follows:

FPS curve of online low-end devices. Green is the optimized version

FPS curve of online high-end devices. Green is the optimized version

4.5 Scroll interpolator optimization

After the optimizations above, both the data from our offline fluency detection tool and the online FPS curves improved greatly, and the indicators came close to those of native pages. On mid-range and high-end devices, FPS reached 57 or above and the big-lag count per second was close to 0. On a native page with an FPS of 57 or above, stalls are almost never felt while scrolling. On the Flutter pages, however, stalls could still be felt.

Recall how our fluency detection tool works: it compares every frame, it is non-intrusive, and it uses the same automated script for every run, so we believe the offline data (average FPS and big-lag count per second) is accurate. The performance data of the two was close and reliable, yet the pages felt different. We concluded that the fluency indicators (average FPS and big-lag count per second) do not fully reflect how scrolling feels.

Looking back at how the indicators were defined in 2.2, we had not measured offset jumps in the space dimension (jumps in picture content). We therefore compared how the offset changes in the Android native RecyclerView and the Flutter SliverList when a small stall occurs.

Offset/time curves of the Android native RecyclerView and the Flutter SliverList during the fling phase

As the curves show, with an FPS of 57 the Android RecyclerView feels better than the Flutter list control because its offset does not jump when a stall occurs, whereas in Flutter the offset change doubles and the picture jumps.

The Flutter scrolling algorithm calculates the distance from a D/T curve. When a stall occurs, the input timeOffset doubles, so the calculated offset change nearly doubles as well.

D/T curve of Flutter's ClampingScrollSimulation

To eliminate the offset jump during small stalls, we implemented custom physics and simulation classes. When a small time jump occurs, the scroll distance is computed from a V/T (velocity/time) curve instead, and the distance is accumulated frame by frame. This removes the curve jump caused by the doubled time offset:

distance = velocity(time) * 16.6ms + distance

Note: devices with refresh rates above 60 Hz (for example 90 Hz or 120 Hz) must be handled as well, and the distance may need to be calculated several times within one frame.
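The difference between the two calculations can be seen in a small numeric sketch. The exponential friction model and its constants are assumptions chosen for illustration only; this is not the curve used by ClampingScrollSimulation or by the custom physics classes.

```dart
import 'dart:math' as math;

/// Frame interval at 60 Hz, in milliseconds.
const double frameMs = 1000 / 60;

/// Hypothetical V/T curve: velocity (px/ms) decaying exponentially.
double velocity(double tMs, {double v0 = 3.0, double k = 0.002}) =>
    v0 * math.exp(-k * tMs);

/// Matching D/T curve: the integral of [velocity] from 0 to [tMs].
double distanceAt(double tMs, {double v0 = 3.0, double k = 0.002}) =>
    v0 / k * (1 - math.exp(-k * tMs));

/// Offset change per frame when the offset is read from the D/T curve.
List<double> deltasFromDistanceCurve(List<double> frameTimesMs) {
  final deltas = <double>[];
  var previous = 0.0;
  for (final t in frameTimesMs) {
    final d = distanceAt(t);
    deltas.add(d - previous);
    previous = d;
  }
  return deltas;
}

/// Offset change per frame when the distance is accumulated:
/// distance = velocity(time) * 16.6ms + distance.
List<double> deltasFromVelocityCurve(List<double> frameTimesMs) =>
    [for (final t in frameTimesMs) velocity(t) * frameMs];

void main() {
  // The third frame arrives one interval late (a small stall).
  final frameTimes = [frameMs, 2 * frameMs, 4 * frameMs, 5 * frameMs];
  String fmt(List<double> xs) =>
      xs.map((x) => x.toStringAsFixed(1)).join(', ');
  print('D/T deltas: ${fmt(deltasFromDistanceCurve(frameTimes))}');
  print('V/T deltas: ${fmt(deltasFromVelocityCurve(frameTimes))}');
}
```

With the D/T curve the late frame moves the content by almost twice the usual amount, while the accumulated V/T distance keeps shrinking smoothly from frame to frame.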

Based on the V/T curve, we provide the following scroll interpolators:

  • SmoothClampingScrollPhysics

A non-bouncing interpolator whose offset does not jump after a stall. Its end-of-scroll effect is the same as ClampingScrollSimulation

  • SmoothBouncingScrollPhysics

A bouncing interpolator whose offset does not jump after a stall

5 Summary and outlook

After the optimizations above, on native Android both the fluency of the Xianyu home page and the speed at which content appears on screen improved significantly. On Flutter, the average FPS of the details page and search page increased by 3, and the big-lag count on low-end phones was halved. On mid-range and high-end devices the FPS rose to 57 or above and the big-lag count approached zero. The feel of scrolling under the same small stalls also improved.

Fluency optimization is something every GUI system keeps striving for, and there are many excellent tool introductions as well as official and unofficial optimization articles. During this work we learned from many of them and used them to find and fix a number of problems. This article tries not to repeat that material, and we recommend that readers consult the relevant optimization articles and official documentation. Where those methods were not enough to reach our goal, we made some optimizations of our own, which we share here in the hope that they help and inspire readers:

  • Based on user experience, we defined the fluency indicators: average FPS and big-lag count per second
  • Around these indicators, we built a fluency detection tool that is non-intrusive, cross-platform and automated
  • [Android] The ViewDataUnbinder component, which quickly extracts UI operations from complex business logic
  • [Flutter] Modifying the Flutter source code to support list element reuse
  • [Flutter] A component that renders large widgets across frames
  • [Flutter] Optimization of the scroll interpolator algorithm

We will continue to think about the following:

  • How can the fluency detection tool be turned into an internal product that non-R&D staff can use?
  • How can existing experience, tools and components be used to quickly optimize other business pages?
  • How can invalid rebuilds and similar problems be found and prevented during development?
  • How can deterioration in page fluency be detected on the CI platform?
  • How can large business widgets be split across frames automatically and sensibly, in a non-intrusive way?