Penguin Tutoring's H5 pages accumulated performance problems over years of iteration, causing pages to load and render slowly. To improve the user experience, we recently ran a dedicated optimization effort targeting page load and rendering speed; this article is a summary of that work. The analysis process is described in some detail, in the hope that it helps readers with little experience in performance analysis.
Project background
The H5 project is the core of Penguin Tutoring's web presence. It has been iterated on for more than four years and includes the course details page, teacher details page, registration page, payment page, and pages built with a page builder, used in the Penguin Tutoring app and in H5 contexts (WeChat, QQ, browsers). Over that time it accumulated performance problems that made pages load and render slowly. To improve the user experience, we recently launched the "H5 performance optimization" project, a dedicated effort targeting page load and rendering speed. The following is a summary of the optimization, covering these parts:
- Performance indicators and data collection
- Performance analysis method and environment preparation
- Performance optimization practices
- Performance optimization effect display
I. Performance indicators and data collection
The performance indicators adopted by Penguin Tutor H5 include:
1. Page load time: how quickly the page loads and renders its elements.
- First Contentful Paint (FCP): measures the time from when the page starts loading to when the first piece of content is rendered on screen.
- Largest Contentful Paint (LCP): measures the time from when the page starts loading to when the largest text block or image is rendered.
- DOMContentLoaded event: the time at which DOM parsing completes.
- onLoad event: the time at which the page's resources have finished loading.
2. Responsiveness after load: how quickly the page, once loaded and with its JS executed, responds to user interaction.
- First Input Delay (FID): measures the time between the user's first interaction with the site (e.g. clicking a link, a button, or a JS custom control) and the browser actually responding.
3. Visual stability: whether page elements shift in unexpected ways and interfere with user interaction.
- Cumulative Layout Shift (CLS): measures the cumulative score of unexpected layout shifts occurring between when the page starts loading and when its lifecycle state changes to hidden.
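For readers implementing metric collection themselves, CLS can be computed from the browser's Layout Instability entries (in a page you would obtain them via `new PerformanceObserver(cb).observe({ type: 'layout-shift', buffered: true })`). A minimal sketch of the accumulation rule:

```javascript
// CLS accumulation sketch: sum layout-shift entry values, skipping shifts
// caused by recent user input (those do not count toward CLS).
function cumulativeLayoutShift(entries) {
  return entries
    .filter((entry) => !entry.hadRecentInput)
    .reduce((sum, entry) => sum + entry.value, 0);
}
```

(Real implementations also group shifts into session windows; the simple sum above matches the original CLS definition.)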
The project uses IMLOG to report data, the ELK stack to monitor production data, and Grafana dashboards to observe the live metrics.
From the distribution of these indicators, pages with abnormal data can be spotted in time and acted on.
II. Performance analysis method and environment preparation
Current page situation:
As you can see, the progress bar keeps loading even after the page is displayed. Loading takes more than ten seconds, which noticeably hurts the user experience.
The Google developer documentation on browser architecture explains:
When the navigation is committed, the renderer process starts loading resources and rendering the page. Once the renderer process "finishes" rendering, it notifies the browser process via IPC (note that this happens only after the onload event has fired in all frames on the page and the corresponding handlers have finished executing), and the UI thread stops the spinner in the navigation bar.
As we can see, how long the progress bar spins is closely tied to the onload time, so the onload time should be reduced to end the progress bar as quickly as possible.
Given this situation, we use Chrome DevTools as the basic performance analysis tool to observe page performance:
- Network: observe the timing and order of network resource loading
- Performance: observe page rendering performance and JS execution
- Lighthouse: score the site as a whole and surface optimization items
The following takes the Penguin Tutoring course details page as a case study to find potential optimization items.
(Note: use a Chrome incognito window with extensions disabled, so other add-ons do not interfere with the page.)
1. Network analysis
Network analysis usually requires disabling the cache and enabling network throttling (4G/3G) to simulate loading on a weak mobile network, since a Wi-Fi connection can mask performance differences.
You can see that DOMContentLoaded fires at 6.03s, but onLoad not until 20.92s.
In the DOMContentLoaded phase, the longest request chain ends at vendor.js; the file is 170KB and takes 4.32s.
Now let's look at the interval between DOMContentLoaded and onLoad.
You can see that the onLoad event is blocked by a large number of media resources (see this article on the factors affecting the onLoad event).
The conclusion is that onLoad fires only when the browser considers all resources fully loaded (both resources discovered during HTML parsing and dynamically loaded ones).
Here, image, video, and iframe resources are still loading, blocking the onLoad event.
Network summary
- DOM parsing is affected by JS loading and execution; compressing and splitting JS (with HTTP/2) can reduce the DOMContentLoaded time
- Image, video, and iframe resources block the onLoad event, so their loading should be optimized to trigger onLoad as soon as possible
2. Performance analysis
Note that mobile phones have less processing power than PCs, so it is common to set CPU throttling to 4x or 6x slowdown to simulate them.
Look at a few core pieces of data:
- Web Vitals (FP/FCP/LCP/Layout Shift): core page metrics, plus Timings
You can see that the LCP, DCL, and onLoad events fire late, and Layout Shift appears multiple times.
To trigger LCP as early as possible, reduce the render time of the page's large content blocks; observe the Frames and Screenshots panels to see how the page's elements get rendered.
You can find the specific offset content in the Summary panel by clicking Layout Shift in the Experience row.
- Main: the number and duration of Long Tasks
You can see that the page has a large number of Long Tasks that need optimizing, with the parse-and-execute time of Course.js (the page's own code) as high as 800ms.
Recording Long Tasks in the development environment lets you see the specific files and their execution times in the Main timeline.
Performance summary
- The page's LCP is triggered late and multiple layout shifts occur, hurting the user experience; content should be rendered as early as possible and layout shifts reduced
- The page has many Long Tasks; JS should be split and loaded sensibly to reduce them, especially the ones affecting the DCL and onLoad events
3. Lighthouse
Use Chrome DevTools' built-in Lighthouse to score the page.
The score is low. The Metrics section shows the core metrics: TTI, SI, TBT, and LCP need improvement, while FCP and CLS are good. You can also see how the score is calculated.
In addition, Lighthouse provides optimization suggestions. The Opportunities and Diagnostics sections give specific guidance, such as on image sizes and removing unused JS, which can be used to optimize the project.
Lighthouse's rating reflects the project's overall load, and the issues it reviews overlap with the Network and Performance findings, so they can also be read as optimization suggestions for those areas.
Lighthouse summary
- From the score, TTI, SI, TBT, and LCP need improvement; refer to the Lighthouse documentation for how to optimize them.
- Opportunities and Diagnostics give specific optimization recommendations that can be referenced for improvement.
4. Environment preparation
The above is a preliminary analysis of the live page. To actually optimize and observe the results, we need to simulate the production environment, so that optimization effects can be assessed realistically in a test environment:
- Proxies: whistle, Charles, Fiddler, etc.
- Local and test environment simulation: nginx, NOhost, STKE, etc.
- Data reporting: IMLOG, TAM, RUM, etc.
- Front-end bundle analysis: webpack-bundle-analyzer, rollup-plugin-visualizer, etc.
When analyzing a problem, work with local code, verify the optimization locally against a simulated production environment, and finally deploy to the test environment for verification; this improves development efficiency.
III. Performance optimization practices
PART 1: Optimize load time
Classify the resources loaded by the page in the Network panel.
The first category is JS resources that affect DOM parsing; they can be divided into critical and non-critical JS depending on whether they participate in the first render.
Non-critical JS can be deferred and loaded asynchronously; critical JS can be split and optimized.
1. Key JS packaging optimization
There are 8 JS files, with a total volume of 460.8KB; the largest file is 170KB.
1.1 Configuring splitChunks correctly
vendor.js, 170KB (gzipped), is a common chunk that every page loads. Its packaging rule is minChunks: 3: any module referenced more than 3 times is put into this chunk.
Analyze the specific composition of vendor.js (above).
For example, string-strip-html.umd.js is 34.7KB, 20% of vendor.js's volume, yet only a single page uses this package; because that page references it several times, it triggers the minChunks rule and lands in vendor.js.
Analyzing the other modules of vendor.js the same way, iosselect.js, howler.js, weixin-js-sdk, and others have only 3 or 4 page/component dependents, yet they also end up in vendor.js.
From this analysis we conclude that we cannot simply rely on the minChunks rule to split page dependencies; common dependencies should be extracted case by case.
The modified vendor chunk extracts the dependencies genuinely shared across pages and components (imutils/imlog/qqapi) according to the business:
```js
vendor: {
  test({ resource }) {
    return /[\\/]node_modules[\\/](@tencent\/imutils|imlog\/)|qqapi/.test(resource);
  },
  name: 'vendor',
  priority: 50,
  minChunks: 1,
  reuseExistingChunk: true,
},
```
For the remaining unspecified common dependencies, add a new common.js chunk and raise the threshold to 20 or more (the project currently has 76 pages), so the chunk contains only dependencies shared by most pages, increasing cache utilization.
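The cacheGroups described above can be sketched as follows; the names and thresholds are illustrative assumptions, not the project's exact config:

```javascript
// Sketch of the splitChunks cacheGroups described above. Explicitly listed
// cross-page deps go to vendor.js (see the vendor group earlier);
// everything else widely shared goes to common.js.
const splitChunksConfig = {
  chunks: 'all',
  cacheGroups: {
    common: {
      test: /[\\/]node_modules[\\/]/,
      name: 'common',
      priority: 40,
      minChunks: 20, // only modules used by ~20+ of the 76 pages
      reuseExistingChunk: true,
    },
  },
};

module.exports = splitChunksConfig;
```

This object would sit under `optimization.splitChunks` in the webpack config.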
The combined size of the two files is 72KB, about 100KB (60%) smaller than before the optimization.
1.2 Loading common components on demand
Course.js, 101KB (gzipped), is the page's business code.
Looking at the figure above, it is mostly business code, except for a huge Icon component that takes up 25KB, a quarter of the page file's volume, even though the code uses only 8 icons.
Analyzing the code: the SVGs are loaded with a folder-wide require, so webpack packs the whole folder's contents together, leaving redundant icons in the page's Icon component.
How to solve this problem to achieve on-demand loading?
Content loaded on demand must be an independent module, so we changed the previous single-entry Icon component (which injected SVG dynamically via dangerouslySetInnerHTML) into single-file components imported directly where icons are used.
In actual development, however, this is a little cumbersome; ideally a unified import path specifies which Icon to load. Borrowing from babel-plugin-import, we configure Babel's dependency load path to rewrite how Icon is imported, achieving on-demand loading of icons.
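A sketch of such a Babel config, using babel-plugin-import's real options (libraryName, libraryDirectory, camel2DashComponentName); the paths are hypothetical and should be adjusted to the project's actual layout:

```javascript
// babel.config.js sketch: rewrite a unified Icon import into per-file
// imports so only the icons actually used are bundled.
const babelConfig = {
  plugins: [
    ['import', {
      libraryName: 'components/Icon', // hypothetical unified entry point
      libraryDirectory: 'icons',      // folder holding single-file icons
      camel2DashComponentName: false, // keep IconPlay as-is in the path
    }],
  ],
};

module.exports = babelConfig;
```

With this in place, something like `import { IconPlay } from 'components/Icon'` would compile to a direct import of `components/Icon/icons/IconPlay`.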
Recompiling after the on-demand change and checking the bundle, the page's Icons component stat size dropped from 74KB to 20KB, a 70% reduction.
1.3 Code Splitting of Business Components
Looking at the page, the "Syllabus", "Course details", and "Purchase instructions" modules are not part of the page's first-screen render.
We can split these parts out and lazy-load them to reduce the size and execution time of the business JS.
Libraries such as react-loadable and @loadable/component, or React.lazy, can be used.
The code after splitting:
Code splitting can delay the rendering of the split components, so whether to use it in a project should be decided by weighing user experience against performance. Splitting also allows some resources to be loaded later, optimizing load time.
1.4 Tree shaking optimization
Tree shaking is already used in the project, but be careful with the sideEffects field's usage scenarios to avoid the packaged output behaving differently from development.
After the above optimization steps, the overall bundle:
6 JS files, total volume 308KB, maximum file size 109KB
Key JS optimization data comparison:

| | Total volume | Maximum file size |
| --- | --- | --- |
| Before optimization | 460.8KB | 170KB |
| After optimization | 308KB | 109KB |
| Effect | Total volume reduced by ~33% | Largest file reduced by ~36% |
2. Non-critical JS lazy loading
The page contains some JS used only for reporting, such as Sentry and the beacon SDK; on a weak network, these resources can become a factor affecting DOM parsing.
To reduce their impact, non-critical JS can be loaded after the page has finished loading; Sentry, for example, officially provides a lazy-loading loader.
The project also had other non-critical JS, such as the captcha component: to make the next page load as fast as possible from cache, the previous page loaded it in advance to warm the cache.
But if the next page is never visited, that load is wasted, and this kind of advance-caching scheme can hurt the current page's performance.
Instead, we can use Resource Hints to prefetch the resource.
We check whether the browser supports prefetch: if it does, we create a prefetch link; if not, we fall back to the old logic of loading directly. This preserves the current page's performance while still preparing the next page's resources in advance.
```js
const isPrefetchSupported = () => {
  const link = document.createElement('link');
  const { relList } = link;
  if (!relList || !relList.supports) {
    return false;
  }
  return relList.supports('prefetch');
};

const prefetch = (url, type) => {
  if (isPrefetchSupported()) {
    const link = document.createElement('link');
    link.rel = 'prefetch';
    link.as = type;
    link.href = url;
    document.head.appendChild(link);
  } else if (type === 'script') {
    // Fall back to the old logic and load the script directly
  }
};
```
Optimization effect: non-critical JS no longer affects page loading.
3. Optimize media resource loading
3.1 Load timing optimization
You can see that onLoad is blocked by a large number of image and video resources even though the corresponding images and videos are not yet displayed on the page; this content should be lazy-loaded.
The main approach is to control the image lazy-load logic (for example, loading after onLoad); various lazyload libraries can do this. The H5 project uses getBoundingClientRect position detection to load an image once it enters the page's visible area.
Note, however, that lazy loading must not block the normal display of the business; take necessary measures such as timeout handling and retries.
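A minimal sketch of that position check as a pure helper; the preload offset here is an assumption, not the project's actual value:

```javascript
// Decide whether an element should start loading, given its
// getBoundingClientRect() result and the viewport height.
// preloadOffset starts the load slightly before the element scrolls into view.
function shouldLoadImage(rect, viewportHeight, preloadOffset = 100) {
  return rect.top < viewportHeight + preloadOffset && rect.bottom > -preloadOffset;
}
```

In the browser this would be called from a throttled scroll handler, e.g. `shouldLoadImage(img.getBoundingClientRect(), window.innerHeight)`.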
3.2 Size optimization
Each detail image is 1715px wide, more than 4x the device width of an iPhone 6s (375px); such large images slow down page loading and rendering on a weak network.
Use the CDN image host's resizing to render different sizes for different devices, adjust the image format, and render different levels of sharpness depending on network conditions.
On a weak network (simulated mobile 3G), the fastest and slowest load times for different sizes of the same image differ by nearly 6x, giving users a completely different experience.
CDN and business implementation: use the img tag's srcset/sizes attributes and the picture tag for responsive images (see the documentation).
A dynamic URL-splicing approach builds the request: based on the device width and network conditions, the image's width multiplier is decided and adjusted (e.g. iPhone 1x, iPad 2x, weak network 0.5x).
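A sketch of that URL-splicing logic; the query syntax and the multipliers are illustrative assumptions, so substitute your CDN's actual resize parameters:

```javascript
// Build an image URL sized for the device and network.
// effectiveType comes from navigator.connection.effectiveType in browsers.
function buildImageUrl(src, cssWidth, dpr, effectiveType) {
  const weakNetwork = effectiveType === 'slow-2g' || effectiveType === '2g' || effectiveType === '3g';
  const scale = weakNetwork ? 0.5 : Math.min(dpr, 2); // 0.5x on weak networks, cap at 2x otherwise
  const width = Math.round(cssWidth * scale);
  return `${src}?imageView2/2/w/${width}`; // assumed CDN resize syntax
}
```

Capping at 2x is a common size/clarity trade-off; whether 0.5x is acceptable on weak networks needs the visual check mentioned below.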
Optimization effect: mobile image volume is cut to less than half (about 2.2x smaller) on a normal network and about 13x smaller on a weak network.
Note that in practice, designers need to be involved to assess whether image sharpness still meets visual standards, to avoid a reverse optimization!
3.3 Optimization of Other Types of Resources
iframe
Loading an iframe can seriously affect page load: loaded before onLoad, it blocks the onLoad event. But there is another problem.
As shown in the figure below, when the page triggers an iframe load after onLoad has already fired, the progress bar starts spinning again until the iframe finishes loading.
So schedule the iframe after onLoad and use setTimeout to trigger its load asynchronously, avoiding the loading effects caused by the iframe.
Data reporting
If the project reports data via image (beacon GIF) requests, the impact on page performance may be imperceptible on a normal network.
But in some cases, for example when one of those image requests takes a long time, the page's onLoad event is blocked and the loading time is prolonged.
The following solutions can address the impact of reporting on performance:
- Delayed, merged reporting
- Using the Beacon API
- Reporting via POST
The H5 project adopts the delayed, merged reporting scheme; choose according to actual business needs.
Optimization effect: all reporting is processed after onLoad, avoiding any impact on performance.
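A minimal sketch of the delayed, merged reporting idea; the class and parameter names are assumptions, and the real IMLOG API may differ:

```javascript
// Queue report events and send them in batches: flush when the queue is
// full or after a short delay, whichever comes first.
class BatchReporter {
  constructor({ maxBatch = 10, flushDelayMs = 2000, send }) {
    this.queue = [];
    this.maxBatch = maxBatch;
    this.flushDelayMs = flushDelayMs;
    this.send = send; // e.g. navigator.sendBeacon or a POST request
    this.timer = null;
  }

  report(event) {
    this.queue.push(event);
    if (this.queue.length >= this.maxBatch) {
      this.flush(); // full batch: send immediately
    } else if (!this.timer) {
      this.timer = setTimeout(() => this.flush(), this.flushDelayMs);
    }
  }

  flush() {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.queue.length > 0) {
      this.send(this.queue.splice(0)); // send and empty the queue
    }
  }
}
```

To keep reporting off the critical path, the first flush can also be deferred until after the onLoad event, as the project does.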
Font optimization
The project may contain many fonts specified by visual design. A large font file also affects page loading and rendering; Fontmin can compress font resources and generate a minimal font file.
Before optimization: 20kB => after optimization: 14kB
PART 2: Page rendering optimization
1. Optimizing the TTFB of the server-rendered page
The server-side rendering (直出) service is deployed on STKE; monitoring shows the average server render takes 300+ms.
The TTFB fluctuates between 100 and 200ms, which delays rendering of the server-rendered page.
From log instrumentation, the Nginx access logs, and gateway monitoring, we get the following timings (as shown in the figure):
- The STKE server-rendering program takes about 20ms
- The gateway hop NGW -> STKE takes about 60ms
- The reverse-proxy hop Nginx -> NGW takes about 60ms
Logging in to the NGW machine and pinging the STKE machine gives the following data:
the average latency is 32ms; a TCP three-way handshake plus returned data (data sent with the final ACK) is 2 RTTs, about 64ms, consistent with the logged timings.
The NGW machine is in Tianjin while the STKE machine is in Nanjing, so we can preliminarily conclude that the delay is network latency caused by the physical distance between data centers, as shown below.
Switching NGW to a Nanjing machine and pinging the STKE Nanjing machine gives the following data:
within the same region, ping latency is only 0.x ms, as shown below:
Based on the above analysis, the root cause of the long TTFB is that the NGW gateway was not deployed in the same region as Nginx and STKE, introducing network latency.
The solution is to deploy the gateway and the server-rendering service in the same region, which involved:
- Expanding NGW capacity into the same region
- Enabling nearby-region access in Polaris (北极星)
Before optimization
The optimized
The optimization effect is shown in the figure above:

| | Average gateway time (7 days) |
| --- | --- |
| Before optimization | 153ms |
| After optimization | 31ms, an 80% (~120ms) reduction |
2. Optimize page rendering time
Recording page rendering in the Performance panel with a simulated weak network (Slow 3G), the screenshots show that:
- DOM begins to parse, but the page is not yet rendered
- The page is rendered only after the CSS file is downloaded
CSS does not block page parsing, but it does block page rendering: if the CSS file is large or the network weak, page rendering is delayed and the user experience suffers.
Use Chrome DevTools' Coverage tool (under More Tools) to record CSS usage during page rendering.
It turns out only about 15% of the CSS is used for the first screen. We can inline the critical first-screen CSS so rendering is not blocked by CSS, and then load the full CSS afterwards.
Critical CSS extraction can be done with critters.
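A sketch of wiring critters into the build, assuming a webpack setup; see the critters README for the full option list:

```javascript
// webpack.config.js fragment: critters inlines the critical CSS used by
// the rendered HTML and defers loading of the rest of the stylesheet.
const Critters = require('critters-webpack-plugin');

module.exports = {
  plugins: [
    new Critters({
      preload: 'swap', // how the remaining (non-critical) CSS is loaded
    }),
  ],
};
```

This works best together with server rendering, since critters needs the HTML at build/render time to know which rules the first screen uses.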
Optimized effect:
While the CSS resources are still downloading, the page can already render; compared with before, rendering starts earlier by roughly the load time of one or two CSS files.
3. Optimize page layout jitter
Watch the elements of the page change
Before optimization (left): missing icons, missing backgrounds, jitter from font-size changes, and jitter from unexpected page elements.
After optimization (right): the content stays fixed, and page elements appear without jarring shifts.
Main optimization contents:
- Fix the positions of server-rendered page elements, laying out the page according to the server-rendered data
- Inline small page images as Base64 so they display as soon as the page is parsed
- Reduce the impact of dynamic content on layout by taking it out of the document flow or reserving its width and height
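The Base64 inlining in the second point can be automated at build time; a webpack 5 asset-modules sketch, where the 4KB threshold is an assumption:

```javascript
// Inline images below the size threshold as Base64 data URIs so they
// render as soon as the HTML is parsed; larger images stay separate files.
const assetRule = {
  test: /\.(png|jpg|gif)$/,
  type: 'asset', // webpack 5: inline or emit a file depending on size
  parser: {
    dataUrlCondition: { maxSize: 4 * 1024 }, // inline if under 4KB
  },
};

module.exports = { module: { rules: [assetRule] } };
```

Keep the threshold small: inlined images cannot be cached separately and grow the HTML/JS payload.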
IV. Performance optimization effect display
The optimization effect is quantified by the following indicators
FCP (First Contentful Paint): marks the point when the browser renders the first content from the DOM
LCP (Largest Contentful Paint): marks the point when the visible area of the page is nearly fully rendered
Progress bar loading time: the time the onLoad event fires; once it does, the navigation progress bar stops
Chrome emulator, 4G, no cache comparison (before optimization on the left, after on the right):
| | Largest first-screen content paint time | Progress bar (onLoad) time |
| --- | --- | --- |
| Before optimization | 1067ms | 6.18s |
| After optimization | 31ms, an 80% (~120ms) reduction | 1.19s, an 81% reduction |
Lighthouse run score comparison
Before optimization
After optimization
| | Performance score |
| --- | --- |
| Before optimization | 40 to 50 on average |
| After optimization | 75 to 85 on average, a 47% increase |
Srobot performance check data over one week
Srobot is the team's performance monitoring tool: a TRobot command creates a one-click page health check that regularly and automatically detects page performance and exceptions.
Before optimization
After optimization
| | Average progress bar (onLoad) time (4G) |
| --- | --- |
| Before optimization | 4632ms |
| After optimization | 2581ms, a 45% improvement |
V. Optimization summary and future planning
- The above optimizations mainly target first-load time and rendering; there is still plenty of room to optimize repeat loads, e.g. using a PWA, skeleton screens for non-server-rendered pages, moving CSR to SSR, etc.
- Comparing with competing products, our CDN downloads take a long time; we plan to migrate the CDN to the cloud soon and expect performance to improve afterwards.
- Project iteration never stops, and we need to think about how, as an engineering matter, to keep page performance from regressing.
- The above covers the analysis and optimization of the course details page. Although the whole project was optimized, there is no silver bullet for performance: each page should be optimized according to its specific needs, and developers should pay attention proactively.
Thank you for reading patiently. Feedback and corrections of any errors and omissions are welcome; let's learn together!