Scenario recap
At about 19:05 on December 28, 2019, I was suddenly pulled into a WeChat group. Someone had shared a chat record: executives at the company (a well-known domestic Internet finance company) had found that the promotion page in the company’s app opened slowly, taking 20+ seconds. The heads of the product and marketing departments were questioned, and that is when the problem landed on our development team.
After receiving the report, the team leader and the operations staff opened the same promotion page on their own phones, and it loaded within 1.5 seconds.
Hypotheses
First, a brief introduction to the project structure and background.
This is a configuration project; let’s call it the “active configuration project”. It is divided into a configuration platform and an H5 display platform. On the configuration platform, the images shown on an H5 page, button positions, carousels, and so on can be configured, and the result is generated as a structured JSON file and uploaded to the CDN. The H5 display platform fetches the JSON file from the CDN via Ajax, parses it, and renders the corresponding page. The page also has to work inside the company’s three platform apps, so it includes many internal JS files that interact with the app, plus the statistics JS specific to each platform. Because the three platforms are largely independent of one another, some Native capabilities are not unified at the technical level. This makes some JS functionality redundant, but the files still have to be included.
The “active configuration project” generates two main kinds of output: first, the configuration JSON file described above, and second, the images used in the project. Both are stored on the CDN.
As you can guess from the scenario above, the page’s business logic is probably not the problem. We made the following guesses, in priority order from top to bottom:
- Static resources load slowly on certain networks, including the HTML, the JS, the configuration JSON file, and the image resources.
- The front-end page code loads JS slowly (blocking), so the page never reaches its business logic.
Directions of investigation
First check: all the CDNs used by the company were tested and no obvious abnormality was found. In the CDN back-to-origin test, some CDN nodes returned abnormal 504 responses; after investigation, these did not appear to be caused by the CDN itself.
Second check: server usage. The server hosting the activity pages runs multiple services plus one public IIS service, so response times may be delayed at certain moments by contention for server resources.
Third check: the H5 display platform loads its JS through script tags in the head or body of the HTML, and Chrome DevTools showed about 20 JS files being loaded in total.
Locating the problem
We could not reproduce the problem the executives saw, but since it did occur, we still wanted to do our best to address it and optimize the page.
When async and defer are not set on the script tags, JS is loaded and executed in order from top to bottom. With all 20 JS files placed in the head, a single slow file is enough to keep the business logic after it from running. This has to be addressed.
In addition, when many JS files sit in the head, each download needs a connection to the server. Chrome allows only about 6 concurrent connections per host over HTTP/1.1, so all remaining resources have to wait. Each new connection also requires a TCP handshake and an SSL handshake, which are costly at the network level.
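To make the queuing effect concrete, here is a rough sketch under simplifying assumptions: every script takes the same time to download, and the browser's per-host connection limit is passed in as a parameter. The function name and the numbers used with it are hypothetical, not measurements from this project.

```typescript
// Rough model (an assumption, not a measurement): scripts download in "waves",
// each wave as wide as the browser's per-host connection limit.
function estimateLoadMs(
  scriptCount: number,
  maxParallel: number,
  msPerScript: number
): number {
  if (scriptCount <= 0) {
    return 0;
  }
  // Scripts beyond the limit wait for a free connection, adding another wave.
  const waves = Math.ceil(scriptCount / maxParallel);
  return waves * msPerScript;
}
```

With an assumed limit of 6 and 300 ms per script, `estimateLoadMs(20, 6, 300)` gives 1200 while `estimateLoadMs(10, 6, 300)` gives 600. Real pages are messier, but the model shows why cutting the request count shortens the queue.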
Finding 1: the proposed strategy is to establish fewer connections, that is, to reduce the number of JS network requests.
Whether all the JS needs to be in the head depends on whether the business logic relies on it. So we sorted through all the JS: files the business strongly depends on are loaded at the front, and weakly related files are referenced at the tail. There are two kinds of weak dependency here. The first is things like statistics, where the business page still appears even if the JS fails to load. The second is JS that is only used in page click events and the like. With the second kind, there is still a risk that the JS has not loaded successfully and the business logic in the click event cannot complete.
Finding 2: separate the strongly dependent JS from the weakly dependent JS, and load them at the head and at the tail respectively.
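One way to handle the second kind of weak dependency is to hold click actions until the tail script has reported in. The sketch below is an illustration under assumptions, not the project's actual code: `DeferredScriptGate` and its method names are hypothetical, and `markLoaded` and `markFailed` are meant to be called from the script tag's load and error handlers.

```typescript
type ScriptState = "pending" | "loaded" | "failed";

interface QueuedTask {
  task: () => void;
  onUnavailable: () => void;
}

// Hypothetical helper: queues click actions until a tail script is ready.
class DeferredScriptGate {
  private state: ScriptState = "pending";
  private queue: QueuedTask[] = [];

  // Call when the script has loaded: run everything that was waiting.
  markLoaded(): void {
    this.state = "loaded";
    const waiting = this.queue;
    this.queue = [];
    waiting.forEach((entry) => entry.task());
  }

  // Call when the script failed to load: tell waiting callers so they can
  // show a fallback instead of silently doing nothing.
  markFailed(): void {
    this.state = "failed";
    const waiting = this.queue;
    this.queue = [];
    waiting.forEach((entry) => entry.onUnavailable());
  }

  // Run now if the script is ready, report failure if it never loaded,
  // otherwise wait in the queue.
  run(task: () => void, onUnavailable: () => void): void {
    if (this.state === "loaded") {
      task();
    } else if (this.state === "failed") {
      onUnavailable();
    } else {
      this.queue.push({ task: task, onUnavailable: onUnavailable });
    }
  }
}
```

A click handler would then call `gate.run(doShare, showRetryHint)` rather than calling the late-loaded function directly, where `doShare` and `showRetryHint` stand for whatever the page needs.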
File size is a big factor in transfer time, so processing the files themselves can have a big impact.
Finding 3: compress the uncompressed JS files, and enable gzip compression on the server to reduce the number of bytes sent over the network.
Based on these three findings, the overall optimization plan is as follows. Optimize the CDN back-to-origin configuration and migrate the activity pages to a separate server. On the front end, put the strongly dependent JS in the head and the weakly dependent JS at the tail, merging and compressing each group. Enable gzip compression on the server.
Fixing the problem
With the direction of optimization settled, what remains is how to implement it.
- Clarify page rendering dependencies: place mandatory dependencies in the head and weak or non-dependencies at the tail.
- We did not use any build tools at the start of the project, and some JS was not compressed. The optimized approach uses gulp to merge and compress the JS that must load first, and packages the weak or non-dependent JS that loads later into a separate bundle.
- Make page rendering the first priority and leave the rest of the SDK initialization to the end.
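The head/tail split in the list above can be sketched as a small page template. This is a minimal illustration, not the project's real build output: `renderPage` and `PageBundles` are hypothetical names, and the bundle paths are whatever the build step produces.

```typescript
// Hypothetical shape: bundles the page must have before rendering (head)
// and bundles that can wait until the content is on screen (tail).
interface PageBundles {
  head: string[];
  tail: string[];
}

function renderPage(bundles: PageBundles, bodyHtml: string): string {
  const toTag = (src: string): string => `<script src="${src}"></script>`;
  const lines: string[] = ["<!DOCTYPE html>", "<html>", "<head>"]
    // Strong dependencies stay in the head so business code can rely on them.
    .concat(bundles.head.map(toTag))
    .concat(["</head>", "<body>", bodyHtml])
    // Weak dependencies go after the content, so they never delay first render.
    .concat(bundles.tail.map(toTag))
    .concat(["</body>", "</html>"]);
  return lines.join("\n");
}
```

Calling `renderPage({ head: ["vendor.js"], tail: ["app.js"] }, "<div id=\"app\"></div>")` yields a page with two script requests, one before the content and one after it.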
Test Performance Report
Before the change
This is the network transfer status shown in Chrome DevTools before the change. The figure shows 20 JS files loaded on the page. Besides the number of files, the Waterfall column shows a lot of yellow and purple, which mark time spent setting up network connections: yellow is the time to establish the connection, and purple is the time for the SSL handshake. The main cause is that Chrome allows only about 6 concurrent connections per host over HTTP/1.1, so the remaining resources have to queue; connections may also be dropped and fail to be reused, which costs a lot of time.
After the change
After the change, the number of JS resources dropped from 20 to 10. Pre-dependencies are packaged as vendor.js, and the app’s weak or non-dependencies are packaged as app.js. Fewer JS requests mean fewer HTTP connections and SSL handshakes to establish, and less IO on the server.
In Chrome DevTools, the yellow and purple have disappeared from the Waterfall column, which indicates that the network side is now performing well.
Conclusion
A summary is unavoidable.
Front-end performance directly affects the user experience; after all, a front-end engineer should understand the product from the user’s perspective. Leaving aside implementation at the code level, the most important factor in front-end optimization is the network. The network side is complex, and how a resource is reached depends on your network architecture.
The simplest case: Client initiates a request -> (establishes a connection) -> Resource server returns the resource -> Client executes and renders.
However, to improve service availability, the network architecture often uses proxy servers.
With a proxy server:
Client initiates a request -> (establishes a connection) -> Proxy server checks whether its copy of the resource is valid -> (valid) -> Resource is returned -> Client executes and renders
Client initiates a request -> (establishes a connection) -> Proxy server checks whether its copy of the resource is valid -> (invalid) -> Proxy accesses the resource server -> Resource is returned -> Proxy server caches it (and continues responding to the client) -> Client executes and renders
If some resources are stored on a CDN, the request goes to the CDN, which may have to go back to the origin:
Client initiates a request -> (finds the nearest CDN node and establishes a connection) -> CDN checks whether it has the resource -> (it does) -> Resource is returned -> Client executes and renders
Client initiates a request -> (finds the nearest CDN node and establishes a connection) -> CDN checks whether it has the resource -> (it does not) -> Back-to-origin request -> Resource is returned -> Client executes and renders
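The two CDN paths can be traced with a toy model. This is a sketch under assumptions: the CDN node and the origin are plain in-memory objects, `fetchThroughCdn` is a hypothetical name, and storing the resource at the node after a miss is assumed behavior modeled on the proxy case above.

```typescript
interface FetchResult {
  body: string | null;
  trace: string[];
}

function hasKey(store: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(store, key);
}

// Toy model: returns the resource plus the steps taken to reach it.
function fetchThroughCdn(
  cdnCache: Record<string, string>,
  origin: Record<string, string>,
  url: string
): FetchResult {
  const trace = ["client request", "connect to nearest CDN node"];
  if (hasKey(cdnCache, url)) {
    trace.push("CDN hit");
    return { body: cdnCache[url], trace: trace };
  }
  // Miss: the CDN node has to go back to the origin, which adds latency.
  trace.push("CDN miss", "back to origin");
  if (!hasKey(origin, url)) {
    trace.push("origin has no such resource");
    return { body: null, trace: trace };
  }
  // Assumed behavior: keep a copy so the next request is a hit.
  cdnCache[url] = origin[url];
  trace.push("stored at CDN node");
  return { body: origin[url], trace: trace };
}
```

Comparing the `trace` of a first and a second request for the same URL shows where the extra back-to-origin hop appears, which is the step worth checking when only some users see slow loads.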
When a page’s resources are deployed in different ways, you need to identify the points most likely to cause problems. That of course depends on what the symptoms look like when the problem occurs. A network investigation is basically a matter of tracing the path taken to reach each resource.
Back on the front end, check at the code level for anything that blocks execution or takes a long time. The JS loading problem described above is a familiar one in front-end work. In HTML, if a script tag has neither defer nor async, its code cannot run until the previous script has finished executing, and the page stops rendering while it waits. So placement matters: keep only what rendering truly needs early in the page and move the rest to the end.
For complex services, it is good practice to split the code into multiple modules to ease future maintenance and incremental requirements. But this inevitably increases the number of resource files, and with too many resources the browser’s concurrency limit becomes a bottleneck and loading takes longer. Modularization is recommended during development, but resources should still be merged and compressed before release. This is not an absolute rule and depends mainly on the number of resources; keeping JS resource files to 10 or fewer is recommended.
In addition, the data source can be cached. Browsers now have good support for localStorage, so key data can be stored there and the cached data displayed before the asynchronous Ajax request succeeds. Of course, where data accuracy matters a great deal, explicitly showing a loading indicator is also a good choice.
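A minimal sketch of this cache-first pattern follows, assuming the data is a JSON string and the request is callback-based. `showWithCache`, `KeyValueStore`, and the parameter names are hypothetical; in a browser, `localStorage` offers the same `getItem` and `setItem` methods and could be passed as the store.

```typescript
// The subset of the storage API this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function showWithCache(
  store: KeyValueStore,
  key: string,
  requestJson: (onSuccess: (json: string) => void) => void,
  render: (json: string, source: "cache" | "network") => void
): void {
  // Show the last known data right away, if there is any.
  const cached = store.getItem(key);
  if (cached !== null) {
    render(cached, "cache");
  }
  // Then ask the network and re-render only when the data has changed.
  requestJson((fresh) => {
    if (fresh === cached) {
      return;
    }
    store.setItem(key, fresh);
    render(fresh, "network");
  });
}
```

The trade-off is that a returning user may briefly see stale data, which is why a loading indicator remains the safer choice for data that must be accurate.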
End. 2020-01-01. In the blink of an eye, it is 2020.