1. The origin
It all started with a cozy weekend afternoon…
That day, my phone suddenly lit up with a burst of WeChat messages. They came from users in a group for an activity being promoted on campus that weekend. Given our methodical development progress and smooth communication with the product team, the activity should have been well received.
The reality is harsh:
“Your mini program is painfully slow to open!”
“This loading takes way too long!”
“Scroll-loading stutters, and errors pop up all the time…”
What I saw was the most direct indictment.
Watching the users’ screen recordings, these problems really do occur, so we needed to optimize the performance of the mini program’s main flows. The three complaints boil down to three points:
- The mini program starts slowly
- Requests are slow
- Interactions are slow
2. Locating the problems
2.1. Slow startup
After receiving the feedback, our first reaction was that the user’s network was simply too slow. Running the mini program again ourselves, everything felt fine, and the grayscale release had generally been smooth, so the knee-jerk response would have been to record our own screen and send it back as a reply.
But with real users affected we couldn’t be so hasty, so we checked the startup statistics across different networks in the mini program’s management dashboard:
| Network | Startup time |
|---|---|
| Overall | 3.6 s |
| WiFi | 3.5 s |
| 4G | 3.9 s |
| 2G/3G | 4.1 s |
The statistics show an overall startup time of about 3.6 s. The network does affect startup time, but not by much; even on 2G/3G the figure is not much slower than the overall average. Clearly, things were not that simple.
So we sliced the aggregate data along another dimension:

| Device class | Startup time | JS injection | First render |
|---|---|---|---|
| Overall | 3.6 s | 0.29 s | 0.16 s |
| High-end | 2.9 s | 0.19 s | 0.06 s |
| Mid-range | 4.8 s | 0.42 s | 0.19 s |
| Low-end | 7.9 s | 0.72 s | 0.43 s |
Now the issue becomes clear: device performance has a huge impact on mini program startup. Low-end devices are two to three times slower than high-end ones, and the render layer even shows a 5-6x gap. The users who reported the problem were indeed on mid-range devices. We cannot control which phones our users carry, so is there still room to optimize?
To solve this, we need to understand the mini program startup process. According to the official documentation, startup can be divided into the following steps:
The figure above describes a complete cold start, from the user tapping the mini program to the page requesting its data. The initialization phase (information preparation and environment preparation) takes a long time, but that work is done by the WeChat client and developers cannot intervene. So we can only focus on the subsequent steps: downloading the package, injecting the code, and the first render.
According to the introduction of the official document, the optimization methods for this part are as follows:
- Reduce the package size
- Reduce code complexity
- Reduce synchronous API calls
- Reduce page structure complexity
- Reduce the number of custom components
The last four items have no hard technical limits, so we check complexity and expensive API calls during code review; complexity can also be analyzed with tools such as CodeCC. Reducing the number of custom components is a harder call, since it trades off against code readability and reusability, so it is not the focus of this round of optimization.
That leaves the code package size, which we can track through our CI records:
As you can see, the main package reached 1949.71 KB, approaching the 2 MB limit. Dependency analysis showed that, apart from some unused modules and components, a large share of the content was static resources. Meanwhile, the official documentation contains this sentence:
Mini program code packages are compressed with the ZSTD algorithm when downloaded. Resource files take up a lot of package size and are usually hard to compress further, so their impact on download time is far greater than that of code files.
So the most direct way to shrink the code package is to remove non-essential resources:
- Optimize static resources and upload non-essential static resource files to a CDN
- Run dependency analysis on the mini program’s components and filter out the unused ones
We also noticed that some subpackages are very small, yet because they are ordinary subpackages, opening their pages still requires downloading the main package first, which wastes download time. These are typically web-view pages: they mostly just handle parameters and depend only weakly on the main package. That gives us one more optimization point:
- Move pages with weak main-package dependencies into independent subpackages to cut package download time as much as possible
2.2. Slow requests
From the logs, this user’s home page data request took 3-4 s to return. Normally, slow requests have two causes:
- The server responds slowly due to the sudden increase in concurrency
- The user’s network is slow, so sending and receiving requests take longer
Log statistics showed that during this user’s visits, request volume was in line with usual levels, and the aggregate request latency showed no big fluctuations:
So a backend problem could basically be ruled out. But although the aggregate latency sits around 500 ms, how do we guarantee the experience when a user’s network is poor?
The answer, of course, is to fetch ahead of time. On cold start, we can use the mini program’s official data prefetch capability so that the startup time itself covers our API request time, letting the page render directly as soon as the mini program finishes starting.
On warm start, slow requests mainly show up in user interactions and page switches. We analyze interactions in the next section; here we look at page switching. From our statistics, a page switch takes roughly 400 ms, of which about 50-100 ms is usable:
That page-switch window can be used to load page data in advance, shortening the perceived request time. In addition, after the first request, page data can be cached under suitable policies so that the second visit to a page opens almost instantly.
In summary, we have several means of optimizing slow requests, and the theoretical gains should be significant:
- Enable data prefetching on cold start
- Pull data in advance during page route switches
- Cache the data
2.3. Slow interactions
First, the user feedback said that while the first screen loads smoothly, subsequent scroll-loading and some button taps are very slow and error-prone. This took a long time to pin down: by rights, if requests were slow because of the user’s network, all requests should be slow, yet what the user saw was a normal first screen followed by sluggish loading and interactions.
Log queries showed that every request this user reported had failed with a request timeout. Why were the timeouts concentrated in interactive loading? After a while, we noticed the errors clustered on slides or taps performed immediately after the first screen loaded; tapping again after a short wait produced no error.
After discovering this phenomenon, we thought of a restriction in the official document on the use of the Internet:
wx.request, wx.uploadFile, and wx.downloadFile have a maximum concurrency limit of 10
In our wrapper around wx.request, the request-timeout timer starts as soon as wx.request is called, so once concurrency exceeds the limit, queued requests easily time out. And right when we request data from the first business API, a batch of reporting requests fires at the same time: PV, component exposure, monitoring, and so on. Using Whistle’s resDelay to delay our responses by 5000 ms, we reproduced exactly what the user reported.
After the problem is found, the direction to be optimized is clear:
- Ensure that service requests related to user experience are sent properly
Were there other causes of slow interactions? Digging further into performance bottlenecks, I found that our course detail page holds a lot of content, five to six screens tall, while users only care whether the first screen appears quickly. Our original handling was rather crude: after fetching the detail data, we processed all of it and called this.setData once to update the page. So to speed up the first screen, we need to:
- The page is rendered step by step
2.4. Summary of optimization points
To summarize the points and directions to be optimized:
- Slow startup is addressed mainly by shrinking the code package:
- Optimize static resources and upload non-essential static resource files to the CDN
- Perform dependency analysis on components of small programs to filter out unused components
- Move relatively independent pages into independent subpackages to reduce main package download time
- Slow requests are addressed mainly by preloading and caching:
- Enable data prefetching on cold start
- Pull data in advance during page route switches
- Cache the data
- Slow interactions are addressed at request dispatch and page rendering:
- Ensure that service requests related to user experience are sent properly
- The page is rendered step by step
3. Optimization 🔧
3.1. Startup optimization
3.1.1. Independent subpackages
The user feedback mainly came from the campus promotion activity, whose pages are H5 pages embedded in a web-view, and the startup process of a web-view page differs from that of a native mini program page:
In fact, the web-view pages only need the ability to pass the login state through, and the main package barely depends on them. Moreover, the larger performance issues on these pages need to be optimized on the H5 side, so we moved them into an independent subpackage right away.
The result was a solid win: since the main package no longer needs to be downloaded at startup, startup performance improved by 30%.
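In app.json, an independent subpackage is declared with the `independent` flag. A minimal sketch (the root and page names here are illustrative, not our real paths):

```json
{
  "pages": ["pages/index/index"],
  "subpackages": [
    {
      "root": "activity",
      "pages": ["webview/webview"],
      "independent": true
    }
  ]
}
```

Pages inside an independent subpackage can launch without first downloading the main package, which is exactly what these weakly coupled web-view pages need.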
3.1.2. Moving static resources to a CDN
Our mini program consists mainly of native pages plus Kbone pages. Kbone is the official solution, built with webpack, and there are plenty of existing schemes for packaging its static resources separately. Our native pages are built with gulp, whose original job was to compile the TypeScript sources to JS and run the CSS through PostCSS to produce WXSS. Because WXSS does not support referencing relative paths, the images and fonts referenced in WXSS were converted to Base64, and the remaining files, such as JSON and WXML, were copied straight into the build output.
This is a crude approach: converting every local image referenced by background-image into Base64 via PostCSS can leave many images occupying twice their original size in the project.
So we first need to match the static resources in the source code and build them separately. To avoid name collisions, we hash the resources using the gulp plugin gulp-rev, which hashes assets based on their content.
After the images are uploaded to the CDN, the reference paths in CSS, JS, JSON, and WXML are rewritten to CDN addresses. The replacement logic looks roughly as follows.
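The core of that rewrite can be sketched as a pure function. Assumptions: gulp-rev has emitted a rev-manifest.json mapping original asset names to hashed ones, and `CDN_HOST` plus the directory names are illustrative placeholders, not our real configuration:

```javascript
// CDN_HOST is a placeholder; the manifest comes from gulp-rev's
// rev-manifest.json (original name -> content-hashed name).
const CDN_HOST = 'https://cdn.example.com/miniprogram';

function replaceWithCdn(source, manifest) {
  // Match local static-resource references in CSS/JS/JSON/WXML sources,
  // e.g. ../images/logo.png or /fonts/icon.ttf
  return source.replace(
    /(?:\.\.?\/|\/)?(images|fonts)\/([\w-]+\.(?:png|jpe?g|gif|svg|ttf|woff2?))/g,
    (match, dir, file) => {
      const key = `${dir}/${file}`;
      const hashed = manifest[key] || key; // fall back to the original name
      return `${CDN_HOST}/${hashed}`;
    }
  );
}
```

In the gulp pipeline, each text file is run through a function like this before being written to the build output.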
3.1.3. Filtering unused components
As the business iterates, some components inevitably get abandoned, but that is hard to notice. Using imweb-miniprogram-cli, the mini program scaffolding developed by our team, we analyze the components each page actually uses; unused components are filtered out and never packaged into the final build. The general idea is as follows:
Starting from app.json, gather all the pages and subpackages the mini program declares, collect the custom components used by the app, pages, and subpackages, and recursively check the components used by those custom components in turn. When unused components are detected, the tool also prints a friendly prompt:
As you can see, we found several unused components in our project.
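The traversal described above can be sketched roughly as follows. This is a simplification of what imweb-miniprogram-cli does: the file system is modeled as an in-memory map, and paths in `usingComponents` are treated as project-root-relative (the real tool also resolves relative and plugin paths):

```javascript
// `files` maps a page/component .json path to its parsed config object.
function collectUsedComponents(files, appConfig) {
  const used = new Set();

  const visit = (jsonPath) => {
    const config = files[jsonPath];
    if (!config || !config.usingComponents) return;
    for (const compPath of Object.values(config.usingComponents)) {
      if (used.has(compPath)) continue; // avoid revisiting / cycles
      used.add(compPath);
      visit(`${compPath}.json`); // recurse into the component's own config
    }
  };

  // Start from every page declared in app.json (subpackage pages are
  // walked the same way).
  for (const page of appConfig.pages) visit(`${page}.json`);
  return used;
}

function findUnusedComponents(files, appConfig, allComponents) {
  const used = collectUsedComponents(files, appConfig);
  return allComponents.filter((c) => !used.has(c));
}
```

Anything returned by `findUnusedComponents` can safely be excluded from the build.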
3.2. Request optimization
3.2.1. Data prefetch
Data prefetch has to be enabled in the management console, where the data source can be either the developer’s own server or cloud development. Our own server would impose some limits: pointing directly at a CGI address can only pull one piece of data, which is inflexible, while building a dedicated prefetch service would be a lot of work. So we chose the cloud development route. The general flow:
When the mini program starts, the WeChat client invokes the configured cloud function. Inside the cloud function we call the business backend’s services via CL5 to pull the required data, and once pulled, the client caches it locally. Business code calls wx.getBackgroundFetchData to read the prefetched data: if the cache covers what the page needs, it renders directly; otherwise it falls back to pulling from the business API.
Inside the cloud function we can read the mini program’s launch path and query parameters, so we can decide from those two values which backend service to call. A single cloud function can therefore prefetch the right data no matter which page the mini program is launched from.
```javascript
const preFetchMap = {
  'pages/index/index': fetchIndex,
  'pages/course/course': fetchCourse,
};

// Cloud function entry point
exports.main = async (event) => {
  const { path, query = '' } = event;
  const fetchFn = preFetchMap[path];
  if (fetchFn) {
    const res = await fetchFn(query);
    return res;
  }
  return {
    error: {
      event,
      retcode: -1002,
      msg: `No prefetch logic configured for page ${path}`,
    },
  };
};
```
One caveat: the mini program itself is heavily optimized at startup, so the prefetched data may not have returned by the time the mini program is up. We therefore made a further optimization: during the business fetch we also listen via wx.onBackgroundFetchData for the prefetch result and render directly as soon as it arrives. Render the first screen with prefetched data whenever possible.
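A sketch of the business-side consumption logic. The wx.getBackgroundFetchData and wx.onBackgroundFetchData calls follow the official background-fetch API, but the wrapper shape and the fallback `requestIndexData` helper are hypothetical, not our exact code:

```javascript
// Resolve with prefetched data if it is cached or arrives within a short
// window; otherwise reject so the caller can fall back.
function getPrefetchedData(timeout = 500) {
  return new Promise((resolve, reject) => {
    wx.getBackgroundFetchData({
      fetchType: 'pre',
      success: (res) => resolve(JSON.parse(res.fetchedData)),
      fail: () => {
        // Not cached yet: the prefetch may still be in flight, so listen
        // for its arrival instead of giving up immediately.
        wx.onBackgroundFetchData((res) => resolve(JSON.parse(res.fetchedData)));
        setTimeout(() => reject(new Error('prefetch not available')), timeout);
      },
    });
  });
}

// Prefer prefetched data for the first screen; fall back to a normal
// business request (requestIndexData is a hypothetical helper).
function loadFirstScreen(page) {
  return getPrefetchedData()
    .then((data) => page.setData({ indexData: data }))
    .catch(() => requestIndexData().then((data) => page.setData({ indexData: data })));
}
```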
3.2.2. Route-time prefetch & data cache
As mentioned above, route-time prefetching uses the gap while the mini program switches pages to start pulling data, reducing the perceived request time. The overall approach: our wrapped navigation logic kicks off the data fetch for the target page and mounts the resulting promise on the app instance; once the page switch completes, the page preferentially reads its data from that promise.
Data caching saves stable data locally via wx.setStorage after a successful fetch. On the second visit to a page, the locally cached data renders first, and the freshly fetched data then updates the view.
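Put together, the route-time prefetch and cache logic might look like the sketch below. The navigation wrapper, `requestCourseDetail`, and the page paths and cache keys are all illustrative names:

```javascript
const prefetchers = {
  'pages/course/course': (query) => requestCourseDetail(query),
};

function navigateWithPrefetch(url) {
  const [path, queryString = ''] = url.split('?');
  const prefetcher = prefetchers[path];
  if (prefetcher) {
    // Start the request during the ~400 ms route transition and park the
    // promise on the app instance for the target page to pick up.
    getApp().globalData.prefetchPromise = prefetcher(queryString);
  }
  wx.navigateTo({ url });
}

// In the target page's onLoad: render from the local cache first, then
// consume the in-flight prefetch (or fall back to a fresh request), and
// finally refresh the cache for the next visit.
function loadCourseData(page, cacheKey, query) {
  const cached = wx.getStorageSync(cacheKey);
  if (cached) page.setData({ course: cached }); // instant render from cache
  const pending = getApp().globalData.prefetchPromise || requestCourseDetail(query);
  return pending.then((data) => {
    page.setData({ course: data });
    wx.setStorage({ key: cacheKey, data });
  });
}
```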
3.3. Interaction optimization
3.3.1. Safeguarding business requests
The core idea of safeguarding business requests is to give them priority. We encapsulated a queued-request module: by intercepting wx.request, requests are dispatched in priority order according to their configuration, leaving enough channels free for high-priority business requests.
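A minimal sketch of the queuing idea. The real module intercepts wx.request; here the transport is an injected `send` function so the scheduling logic stands alone, and the priority scheme and concurrency cap are illustrative (staying below the platform limit of 10):

```javascript
class RequestQueue {
  constructor(send, maxConcurrent = 8) {
    this.send = send; // e.g. a promisified wx.request
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.pending = []; // kept sorted so the highest priority goes first
  }

  // priority: larger number = more important (business requests high,
  // PV / exposure / monitoring reports low)
  request(options, priority = 0) {
    return new Promise((resolve, reject) => {
      this.pending.push({ options, priority, resolve, reject });
      this.pending.sort((a, b) => b.priority - a.priority);
      this.flush();
    });
  }

  flush() {
    while (this.running < this.maxConcurrent && this.pending.length > 0) {
      const task = this.pending.shift();
      this.running += 1;
      Promise.resolve(this.send(task.options))
        .then(task.resolve, task.reject)
        .finally(() => {
          this.running -= 1;
          this.flush(); // a slot freed up: dequeue the next request
        });
    }
  }
}
```

Because the timeout timer only starts when a request is actually dispatched, queued low-priority reports no longer cause business requests to time out.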
3.3.2. Step-by-step rendering
The solution here is easy to guess: process the data the first screen needs first, update the view with setData, and then handle the rest of the data. But according to the official documentation:
setData sends data from the logical layer to the view layer (asynchronously) and changes the corresponding values of this.data (synchronously).
Mini program code also follows the JS event loop. If we simply call setData after processing and chain the next step with a Promise, everything stays in the same loop turn and step-by-step rendering is not achieved; nesting the steps in setTimeout callbacks works, but the code becomes unreadable and inelegant. Our solution is to wrap setTimeout in a promise-style helper so the rendering steps can be chained just as with Promises:
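A sketch of that helper and how a page might use it; `buildFirstScreen`, `buildCatalog`, and `buildRest` are hypothetical data-processing steps, not our real functions:

```javascript
// Each await yields one event-loop turn, so the view layer can paint
// the previous setData before the next chunk of data is processed.
const nextStep = () => new Promise((resolve) => setTimeout(resolve, 0));

async function renderInSteps(page, rawData) {
  page.setData(buildFirstScreen(rawData)); // paint the first screen ASAP
  await nextStep();
  page.setData(buildCatalog(rawData)); // the complex course catalog
  await nextStep();
  page.setData(buildRest(rawData)); // everything below the fold
}
```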
4. Results
After this series of optimizations, the effect is quite noticeable.
4.1. Package size
In terms of package size:
- The total package decreased by 27% from 9132.94KB to 6736.42KB;
- The main package decreased from 1949.71KB to 985.96KB, a decrease of 49.5%.
From the startup time data, download time and JS injection time both dropped noticeably:
Looking at the distribution of startup times, the share of users whose startup finishes within 3 s rose significantly, from 56.26% to 64.25%.
4.2. Request time
Cold start data prefetch, route-time prefetch, and data caching all worked very well:
Home page request speed decreased from 400ms to 50ms, optimized by 87.5%;
The class detail page request speed decreased from 800ms to 90ms on average, optimizing by 88.75%;
The data cache makes the page open almost instantly on the second visit:
After queued requests were enabled, the effect on request ordering was obvious: grayscale users’ business requests got 50-100 ms faster on average, roughly a 15% improvement.
Comparing the 80th, 50th, and 20th percentiles of request latency, we also found that the longer the original request time, the more obvious the optimization; in other words, it helps most under weak networks.
4.3. Rendering
With step-by-step rendering, the page starts rendering as soon as the basic first-screen data is processed. Since our catalog structure is relatively complex and slow to process, the second step handles only the catalog. The actual rendering effect:
The first screen now renders 100-150 ms earlier than before.
5. Summary
We mined performance across mini program startup, requests, interaction, and rendering, going about as far as possible without raising the base library version requirements.
Take our core pages, the home page and the course detail page:
- Home page: the developer-controllable portion went from roughly 1300 download + 300 injection + 170 first render + 430 request = 2200 ms to 750 + 245 + 170 + 50 = 1215 ms, a 45% improvement
- Course detail page: the developer-controllable portion went from roughly 1300 download + 300 injection + 170 first render + 790 request = 2560 ms to 750 + 245 + 170 + 100 = 1265 ms, a 50.5% improvement
- First entry into the detail page via a page switch went from 400 route + 800 request + 450 processing = 1650 ms to 400 + 720 + 300 = 1420 ms, a 14% improvement
- On the second visit to the detail page, the loading and rendering process is barely visible
Are there more optimizations? The platform also offers some more advanced capabilities with higher base library version requirements, such as:
- On-demand component injection and injection-on-use can further reduce code package download time, but when we released this feature it broke loading of the custom components on our home page, so we have shelved it for now.
- The initial render cache, supported since base library 2.11.1, can start rendering the view earlier without waiting for the logical layer to initialize.
- Asynchronous subpackage loading, still experimental, can further reduce code package download time and JS injection time by loading modules asynchronously.
Using these capabilities allows even finer-grained optimization, and we will keep exploring. If you have a better approach, you are welcome to discuss it.