Author: Sun Ran (Boiled shrimp)

For mini programs, a blank screen or loading page is unavoidable while the container loads and the front end renders asynchronously. It may last only an instant, or it may take several seconds before the first screen appears. A long blank screen hurts the user experience. According to Google, 53% of users leave a page if it takes longer than 3 s to load.

To speed up the display of the mini program home page, Alipay and Mobile Taobao use an HTML-based snapshot technology. The main idea is to cache the home page HTML and, on the next launch, render it first with the cached data so that the first screen appears earlier. It suits mini programs rendered in a traditional WebView. This HTML-based snapshot greatly reduces the blank screen time at startup, but the first screen is still not fast enough, and users can still see a blank screen. In addition, the snapshot shows a page that cannot be clicked; interaction is possible only once the JS part is ready.

To pursue the best possible experience, we propose a new mini program snapshot technology. The goal is not only to eliminate the blank screen completely, but also to respond to user interaction.

The core idea

Unlike the existing HTML-based snapshot technology, we propose a native, image-level snapshot technology, which consists of the following three steps:

  • Step 1: At an appropriate time after the mini program starts, save its home page as a picture, which we call a snapshot
  • Step 2: The next time the mini program is launched, display the snapshot saved last time before the mini program starts
  • Step 3: At an appropriate time after the mini program starts, hide the snapshot and show the real home page, then save the current view as the next snapshot (same as step 1)
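The three steps above can be sketched roughly as follows. This is only an illustration: the class name, the `store` map (standing in for encrypted local storage) and the `capture` callback (standing in for the native call that renders the current view to an image) are all hypothetical and do not come from the actual implementation.

```javascript
// Hypothetical sketch of the three-step snapshot lifecycle.
class SnapshotLifecycle {
  constructor(store, capture) {
    this.store = store;     // Map-like: pageKey -> saved image (assumed)
    this.capture = capture; // () => image of the real first screen (assumed)
    this.visibleSnapshot = null;
  }

  // Step 2: before the mini program starts, show the last saved snapshot.
  onLaunch(pageKey) {
    this.visibleSnapshot = this.store.get(pageKey) ?? null;
    return this.visibleSnapshot; // null means: fall back to normal loading
  }

  // Step 3 followed by step 1: hide the snapshot, reveal the real page,
  // then save the current view for the next launch.
  onFirstScreenReady(pageKey) {
    this.visibleSnapshot = null;
    this.store.set(pageKey, this.capture());
  }
}

const lifecycle = new SnapshotLifecycle(new Map(), () => "image-of-home-page");
console.log(lifecycle.onLaunch("home")); // first launch: nothing saved yet
lifecycle.onFirstScreenReady("home");
console.log(lifecycle.onLaunch("home")); // next launch: the saved snapshot
```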

The effect

The new DING schedule page in DingTalk now uses the snapshot technology. The before and after comparison is as follows:

[Figure: before | after]

With the snapshot technology, the first screen of the page opens almost instantly, the blank screen disappears completely, and the first-screen rendering time drops from about 1700 ms to less than 300 ms.

Next, I’ll go through a few key considerations of the snapshot technology.

Scenarios and Timing

Ideally, a snapshot overlaps exactly with the first screen, so that hiding the snapshot causes no visual change. How well the snapshot technology works is determined directly by when the snapshot is generated and the scenario in which it is used.

Which pages are suitable for snapshots?

Not every mini program is suited to using snapshots to improve the first-screen experience. Used incorrectly, snapshots can even detract from the experience. For best results, snapshots are generally suitable when the first screen meets the following conditions:

  • The first screen is fixed. If it is not, it is hard to find a snapshot time that ensures the snapshot overlaps with the next first screen
  • The first screen does not contain user privacy data. Users’ private data should not be captured in a snapshot

When is a snapshot generated?

If the snapshot is taken too early, it may show a blank screen or an incomplete home page.

If the snapshot is taken too late, the user may already have interacted with the first screen, for example by scrolling or clicking. The resulting snapshot may then fail to overlap with the next first screen.

Therefore, the best snapshot timing depends on the first-screen scenario. Generally, we consider the following:

  • Take the snapshot in the onReady lifecycle callback of the mini program's first-screen Page. At this point the page has probably not finished rendering, so consider taking the snapshot after an appropriate delay
  • If the first screen pulls its data remotely, take the snapshot after that data has been obtained
  • Do not take a snapshot once the user has scrolled or clicked
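The timing rules above can be combined into a single check. The sketch below is only one way to express them. Every field name (`pageReady`, `needsRemoteData`, `renderDelayMs` and so on) is an assumption made for illustration.

```javascript
// Hypothetical helper: decides whether now is a good moment to take a snapshot.
// All field names on `state` are assumed for this sketch.
function shouldTakeSnapshot(state) {
  if (state.userInteracted) return false; // user has scrolled or clicked
  if (!state.pageReady) return false;     // onReady has not fired yet
  if (state.needsRemoteData && !state.remoteDataLoaded) return false;
  // Wait an extra delay after onReady so that rendering can settle.
  return state.msSinceReady >= state.renderDelayMs;
}

const state = {
  pageReady: true,
  needsRemoteData: true,
  remoteDataLoaded: true,
  userInteracted: false,
  msSinceReady: 350,
  renderDelayMs: 300,
};
console.log(shouldTakeSnapshot(state));
```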

When to hide a snapshot?

We generally hide the currently displayed snapshot at the moment a new snapshot is about to be created. The new snapshot is generated immediately after the old one is hidden, so that the snapshot and the real page connect seamlessly.

You also need to consider the case where the mini program fails to start. If the first screen has not started successfully when the snapshot's display time reaches an upper limit, the snapshot is hidden directly. This avoids the awkward situation where the user can see the first screen but it does not respond.

We also made a small visual optimization when hiding the snapshot. If the snapshot is hidden abruptly and it differs even slightly from the real page, the user may see a flicker.

So when we hide the snapshot, we play a 200 ms fade-out animation to soften the flicker caused by differences between the snapshot and the real page. Sometimes the snapshot is taken slightly before asynchronous events on the home page complete, such as network data or images finishing loading, so the snapshot lacks elements or shows less accurate data than the real page. The fade animation effectively reduces the visual jarring caused by these discrepancies. The following demo compares the two cases:

[Figure: hiding directly | fading out]
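The hide logic described in this section (hide when the real page is ready or when a display time limit is reached, then fade out over 200 ms) could be modelled as below. The linear fade curve, the function names and the 5000 ms limit in the usage lines are assumptions. Only the 200 ms duration comes from the text.

```javascript
const FADE_MS = 200; // fade-out duration mentioned in the article

// Start hiding when the real first screen is ready, or when the snapshot has
// been displayed for too long (the mini program may have failed to start).
function shouldHide(displayedMs, firstScreenReady, maxDisplayMs) {
  return firstScreenReady || displayedMs >= maxDisplayMs;
}

// Opacity of the snapshot layer `elapsedMs` after hiding begins.
// A linear curve is assumed here; a real animation might use easing.
function snapshotOpacity(elapsedMs) {
  const progress = Math.min(Math.max(elapsedMs / FADE_MS, 0), 1);
  return 1 - progress;
}

console.log(shouldHide(1200, false, 5000)); // still waiting for the real page
console.log(shouldHide(5000, false, 5000)); // limit reached, hide anyway
console.log(snapshotOpacity(100));          // halfway through the fade
```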

Interactivity

Since the snapshot and the real first screen are basically identical, the user perceives that the first screen has already been displayed and expects it to be interactive. Showing a lifeless snapshot is therefore not enough. Making it interactive is an important part of our snapshot capability.

Our snapshot responds to user clicks by temporarily storing the click event when the user clicks the snapshot, then dispatching that event to the real page when the snapshot is hidden.

If the user clicks several times during this period, we respond only to the last click event.

The user may feel that the response to this click is slow, but cannot tell whether they clicked a snapshot or the real home page.
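A minimal sketch of this buffering behaviour is shown below. The class and method names are hypothetical, and the click is reduced to a pair of coordinates for simplicity.

```javascript
// Hypothetical buffer for clicks received while the snapshot is displayed.
class PendingClick {
  constructor() {
    this.last = null;
  }

  // Called for every click on the snapshot. Each call overwrites the
  // previous one, so only the last click survives.
  record(x, y) {
    this.last = { x, y };
  }

  // Called when the snapshot is hidden: forward the stored click (if any)
  // to the real page through `dispatch`, then clear the buffer.
  flush(dispatch) {
    if (this.last === null) return false;
    const click = this.last;
    this.last = null;
    dispatch(click);
    return true;
  }
}

const pending = new PendingClick();
pending.record(10, 20);
pending.record(150, 300); // replaces the first click
pending.flush((click) => console.log("dispatch to real page:", click));
```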

For scenarios where the mini program starts slowly, you can also consider showing a loading indicator after the user clicks.

To further improve the interactivity of the snapshot layer, we can even let developers define click areas and simple actions for it, so that clicks on the snapshot layer get an immediate response. The DingTalk workbench fits this scenario well: the applications in the workbench change infrequently and occupy well-defined blocks.

You can configure different click areas and corresponding actions (e.g. jump to another page or app), such as:

[{
  area: {
    left: 100, 
    top: 100, 
    width: 100, 
    height: 100
  },
  action: {
    type: 'openLink', 
    params: { url: 'http://xxx' }
  }
}, ...]

In this way, when users click an area configured for the snapshot, they can jump directly without waiting for the mini program to finish starting.
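Given a configuration in the shape shown above, the snapshot layer needs to map a click position to an action. The sketch below shows one possible hit test. The function name `findAction` is hypothetical, and treating the right and bottom edges as exclusive is an assumption.

```javascript
// Click areas in the same shape as the configuration above.
const clickAreas = [
  {
    area: { left: 100, top: 100, width: 100, height: 100 },
    action: { type: "openLink", params: { url: "http://xxx" } },
  },
];

// Return the action of the first area containing the point (x, y),
// or null if the click falls outside every configured area.
function findAction(areas, x, y) {
  for (const { area, action } of areas) {
    const insideX = x >= area.left && x < area.left + area.width;
    const insideY = y >= area.top && y < area.top + area.height;
    if (insideX && insideY) return action;
  }
  return null;
}

console.log(findAction(clickAreas, 150, 150)); // inside the configured area
console.log(findAction(clickAreas, 20, 20));   // outside: wait for the real page
```

A click that matches no area can still fall back to being stored and dispatched to the real page once the snapshot is hidden.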

Storage and Security

Snapshots are sensitive data. They may be stored only locally on the client and must never be uploaded. Snapshots must therefore be managed very carefully, otherwise they may cause public relations problems.

For snapshot storage, we consider the following:

Encrypted storage

Snapshot data must be stored encrypted, using the encryption provided by the Group’s Wireless Bodyguard security component.

Privacy protection

Snapshots must not contain users’ private data. That is, a snapshot should contain only UI elements or meaningless default data.

[Figure: a snapshot without user privacy data | a snapshot that includes user privacy data]

So how do you get a home page snapshot without user privacy data? Consider taking the snapshot before the front end retrieves data from the network or cache. Such snapshots are bound to be incomplete and sacrifice some of the experience, which is why we do not recommend snapshots for first screens that contain user data.

Snapshot cleanup

Snapshots are stored on the client, so there must be an upper limit on storage. Once a certain amount of snapshot data has accumulated, some of the older snapshots need to be discarded. In addition, existing snapshot data should be cleaned up when the mini program version is updated, and when the user logs out or switches accounts.
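These cleanup rules could be sketched as a small in-memory index. The real storage is encrypted and native, so the class below only illustrates the bookkeeping. The oldest-first eviction policy, the class name and the field names are assumptions.

```javascript
// Hypothetical snapshot index with a total size limit.
class SnapshotStore {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.entries = new Map(); // Map keeps insertion order: oldest first
    this.totalBytes = 0;
  }

  remove(key) {
    const entry = this.entries.get(key);
    if (entry) {
      this.totalBytes -= entry.bytes;
      this.entries.delete(key);
    }
  }

  // Save a snapshot record, then evict the oldest records until the
  // total size is back under the limit (the new record is never evicted).
  put(key, bytes, appVersion, userId) {
    this.remove(key);
    this.entries.set(key, { bytes, appVersion, userId });
    this.totalBytes += bytes;
    for (const oldKey of [...this.entries.keys()]) {
      if (this.totalBytes <= this.maxBytes) break;
      if (oldKey !== key) this.remove(oldKey);
    }
  }

  // Drop snapshots made by another mini program version or another user.
  // After logout, call this with the new (or no) user id.
  purgeStale(appVersion, userId) {
    for (const [key, entry] of [...this.entries]) {
      if (entry.appVersion !== appVersion || entry.userId !== userId) {
        this.remove(key);
      }
    }
  }
}

const store = new SnapshotStore(300 * 1024);
store.put("home", 110 * 1024, "1.0.0", "user-a");
store.purgeStale("1.0.1", "user-a"); // version updated: old snapshot removed
console.log(store.entries.size);
```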

Accuracy

After snapshots go online, you need to know what experience they actually give users. The best case is that the user does not notice the snapshot at all, i.e. the snapshot is exactly the same as the real page. If the snapshot differs too much from the real page, the experience degrades sharply, and that is what we need to detect.

Here we focus on the snapshot accuracy metric: the degree to which the snapshot and the real page resemble (overlap) each other. The higher the accuracy, the more natural the transition from snapshot to real page and the better the experience. A low-accuracy snapshot does not improve the experience and may even confuse users.

How do we determine snapshot accuracy?

When a snapshot is generated, the system compares it with the previous snapshot to obtain a quantitative indicator of snapshot accuracy. The question then becomes how to judge the similarity of two images.

The first idea that comes to mind is to compare the two snapshots pixel by pixel and calculate the proportion of differing pixels. The lower the proportion, the more accurate the snapshot. In practice, however, this method reflects neither the real similarity nor what the user perceives. For example, if the two snapshots are slightly offset from each other, the computed similarity may be very low. Two snapshots with very small color differences can also score very poorly. Moreover, a snapshot can contain millions of pixels, and our tests found that a single pixel-by-pixel comparison can take several seconds.
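The offset problem is easy to reproduce with a toy example. The sketch below treats an image as a flat array of grayscale values, which is a simplification made for illustration, and compares a striped pattern with the same pattern shifted by one pixel.

```javascript
// Proportion of positions where two equally sized images differ.
function pixelDiffRatio(a, b) {
  let different = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) different++;
  }
  return different / a.length;
}

// A one-dimensional striped "image" and the same stripes shifted by one pixel.
const stripes = [0, 255, 0, 255, 0, 255, 0, 255];
const shifted = [255, 0, 255, 0, 255, 0, 255, 0];

// Every pixel differs, although both look like the same striped pattern.
console.log(pixelDiffRatio(stripes, shifted)); // 1
console.log(pixelDiffRatio(stripes, stripes)); // 0
```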

We now use the “perceptual hashing algorithm” used in Google’s image search to quantify snapshot accuracy. The algorithm compresses each image into “fingerprint” information, then compares the fingerprints of different images to calculate a “difference index”. The higher the difference index, the lower the similarity. This algorithm reflects how similar two snapshots are, and it is far more efficient than pixel-by-pixel comparison: online statistics show that the whole algorithm takes less than 3 ms.

We experimented with a number of scenarios and measured the difference index for each. The difference index is very low for scenes where only a few characters change, and higher for scenes with a significant visual gap. This quantified value reflects how snapshots affect what users actually perceive.

Scene | Difference index | Visual effect
Small number of character changes | 1 | (image)
Overall shift | 6 | (image)
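The article does not say which perceptual hash variant is used, so the sketch below uses average hash (aHash), one of the simplest variants, purely as an illustration. It also assumes that the difference index is the Hamming distance between two fingerprints, and that the input is a grayscale image given as an array of rows at least 8 pixels wide and high. All function names are hypothetical.

```javascript
// Shrink a grayscale image (array of rows) to size x size cells by
// averaging the pixels that fall into each cell.
function shrink(gray, size) {
  const h = gray.length;
  const w = gray[0].length;
  const cells = [];
  for (let r = 0; r < size; r++) {
    const y0 = Math.floor((r * h) / size);
    const y1 = Math.floor(((r + 1) * h) / size);
    for (let c = 0; c < size; c++) {
      const x0 = Math.floor((c * w) / size);
      const x1 = Math.floor(((c + 1) * w) / size);
      let sum = 0;
      for (let y = y0; y < y1; y++) {
        for (let x = x0; x < x1; x++) sum += gray[y][x];
      }
      cells.push(sum / ((y1 - y0) * (x1 - x0)));
    }
  }
  return cells;
}

// Fingerprint: one bit per cell, set when the cell is brighter than average.
function averageHash(gray, size = 8) {
  const cells = shrink(gray, size);
  const mean = cells.reduce((total, v) => total + v, 0) / cells.length;
  return cells.map((v) => (v > mean ? 1 : 0));
}

// Difference index: number of fingerprint bits that differ (Hamming distance).
function differenceIndex(hashA, hashB) {
  let distance = 0;
  for (let i = 0; i < hashA.length; i++) {
    if (hashA[i] !== hashB[i]) distance++;
  }
  return distance;
}

// 16 x 16 test image: dark left half, bright right half.
const base = Array.from({ length: 16 }, () =>
  Array.from({ length: 16 }, (_, x) => (x < 8 ? 0 : 255))
);
const tweaked = base.map((row) => row.slice());
tweaked[0][0] = 10; // a tiny local change
const mirrored = base.map((row) => row.slice().reverse()); // a large change

console.log(differenceIndex(averageHash(base), averageHash(tweaked)));  // 0
console.log(differenceIndex(averageHash(base), averageHash(mirrored))); // 64
```

Because each fingerprint has only 64 bits here, comparing two snapshots costs a fixed, small amount of work once the images are shrunk, which is consistent with the large speed-up over pixel-by-pixel comparison described above.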

How do we reconstruct the scene of an inaccurate snapshot?

Once we can detect snapshot accuracy, we also need to know, for snapshots with poor accuracy, how the snapshot differs from the real page, so that we can improve the snapshot timing.

To do this, we reconstruct what the real page looked like by capturing the front-end DOM tree at the time of the snapshot. Specifically, when generating the snapshot we obtain the desensitized DOM tree of the mini program's current HTML page, combine it with the CSS files of the mini program framework, and then use a browser directly to restore the interface as it was when the snapshot was taken.
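The article does not describe how the DOM tree is desensitized. As one possible illustration, the sketch below keeps the structure and class names of a simplified tree (plain objects rather than real DOM nodes) and masks every non-space character of the text, so the layout can be restored without exposing content. The node shape and the masking rule are assumptions.

```javascript
// Hypothetical desensitization of a simplified DOM-like tree.
// Structure and class names are kept (needed to re-apply the CSS);
// text is replaced by "*" characters of the same length.
function desensitize(node) {
  return {
    tag: node.tag,
    className: node.className ?? "",
    text: (node.text ?? "").replace(/\S/g, "*"),
    children: (node.children ?? []).map(desensitize),
  };
}

const page = {
  tag: "div",
  className: "card",
  children: [{ tag: "span", className: "name", text: "Alice 138" }],
};

console.log(JSON.stringify(desensitize(page)));
```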

Other capabilities

Partial snapshot

A major limitation of snapshots is that they cannot adapt to a first screen that changes. In such scenarios every snapshot may fail to coincide with the real home page, which degrades the user experience instead. So we considered providing the ability to snapshot only the parts of the home page that stay basically the same every time, leaving out the parts that change, so that part of the first screen can still appear instantly on every launch.

For example, the top half of the contacts home page in DingTalk is relatively fixed, while the feed in the bottom half may show different information each time it is opened. In this scenario we do not need to snapshot the whole first screen every time. We can specify a height to snapshot, so that part of the home page appears instantly.

Multi-screen snapshot

When the home page is scrollable, we can even consider a snapshot longer than one screen and make it scrollable when it is displayed at the next launch. This solution needs to address two issues:

  1. Snapshot size. Online statistics show that the average snapshot file for one screen is about 100 KB. A snapshot longer than one screen may be several hundred KB. You need to set an upper limit on the length or size of a snapshot during generation to prevent exceptions such as OOM on low-end devices.
  2. Snapshot scrolling. If the user scrolls the snapshot while it is displayed, record the current scroll offset when hiding it. The real home page can then be scrolled to the same position so that the snapshot overlaps with the real page.
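Both points can be illustrated with simple arithmetic. In the sketch below, the 100 KB per screen figure comes from the statistics above, while the 300 KB budget, the screen height and the function names are assumed values chosen only for the example.

```javascript
// Upper limit for the snapshot height, derived from a size budget and the
// average size of a one-screen snapshot.
function maxSnapshotHeight(screenHeightPx, bytesPerScreen, budgetBytes) {
  const screens = budgetBytes / bytesPerScreen;
  return Math.floor(screenHeightPx * screens);
}

// Scroll offset to apply to the real page when the snapshot is hidden,
// clamped to the range the real page can actually scroll.
function restoreScrollOffset(snapshotOffsetPx, pageHeightPx, viewportHeightPx) {
  const maxOffset = Math.max(pageHeightPx - viewportHeightPx, 0);
  return Math.min(Math.max(snapshotOffsetPx, 0), maxOffset);
}

// With about 100 KB per screen and an assumed 300 KB budget,
// the snapshot may cover at most three screens.
console.log(maxSnapshotHeight(800, 100 * 1024, 300 * 1024)); // 2400
console.log(restoreScrollOffset(500, 2000, 800));            // 500
```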

Performance

For the performance of snapshots, we conducted laboratory tests and online statistics.

In lab tests, we constructed an extreme scenario with a very large snapshot (5.2 MB) and compared it with a normal snapshot on a low-end device:

Metric | Common scenario | Extreme scenario
Snapshot size | 262 KB | 5.2 MB
Memory footprint | 1840 KB | 3245 KB
Visual experience when loading | Appears directly | Very short delay

Loading a snapshot does not affect normal page switching, but a large snapshot may take a moment to load.

Online data shows that pages with snapshots load in about 280 ms, and the average snapshot size is about 110 KB.

Snapshot generation and accuracy detection both run on asynchronous threads, at a point when user interaction has not yet begun. No snapshot is taken after the user scrolls or interacts, so the impact on performance is small.

Here is another interesting statistic: on average, a user clicks a snapshot 0.6 times, and the first click happens about 1500 ms after the snapshot is shown. In other words, more than half of users make their first interaction about 1.5 seconds after the snapshot is shown. This is enough to show how important it is to make snapshots interactive.

Outlook

Although snapshot technology originated as a solution to mini program startup performance, its practical application scenarios can extend much further.

In theory, any scene that renders asynchronously can use snapshot technology to solve the blank screen or loading problem and become both instantly visible and interactive, as long as it is a view that can be displayed in the client. This includes mini programs rendered by WebView or Weex, ordinary H5 web pages, and even some native scenes that require loading. Because snapshots are a purely native technology, the implementation does not depend on how the real page is rendered. It does, however, require careful attention to snapshot timing and application scenarios in order to deliver a good experience.

Conclusion

We propose a new snapshot technology for mini programs that makes the home page open instantly and respond to interaction. It can completely eliminate the loading page or blank screen while a mini program opens, so that opening a mini program feels like a native experience and user clicks are handled.

It is a purely native technology that does not rely on the mini program container or front-end rendering. A snapshot can be taken as long as there is a view, and displayed immediately as long as snapshot data exists. It can even be extended to scenes outside mini programs.

Its limitations lie mainly in the style of the first screen and the choice of snapshot timing. A first screen that changes often or contains user privacy data is not suitable for snapshots, and generating high-quality snapshots places strict requirements on timing. As for guaranteeing snapshot accuracy, the method of comparing snapshot similarity still has a lot of room for optimization and needs further polishing.