Summary: Performance optimization is a systematic, holistic effort that permeates every detail of the project development process, and it is also a battlefield that reveals technical depth. This article takes the complex Quick BI system as its background and introduces performance optimization ideas and techniques, as well as the systematic thinking behind them, in detail.

Performance has always been a topic of technical concern, especially in medium and large complex projects. It is much like the performance of a car: comfort and practicality must be guaranteed while pursuing top speed. Every link in car manufacturing, from the integration of parts to engine tuning, ultimately affects the user's perception and the commercial outcome. The following figure shows the impact of performance on revenue.

Performance optimization is a systematic, holistic effort that permeates every detail of the project development process, and it is also a battlefield that reveals technical depth. In what follows, I will take Quick BI's complex system as the background and dig into the overall ideas and techniques of performance optimization, as well as the systematic thinking behind them.

How do I locate performance problems?

Generally speaking, we are sensitive to animation frame rates (a budget of 16ms per frame), but when a performance problem occurs, our actual perception usually comes down to one word: "slow". That word alone does not help us solve anything, so we need to analyze the whole pipeline behind it.

The figure above shows a typical browser processing flow. Combining it with our scenario, I have abstracted it into the following steps:

It can be seen that the main time-consuming work falls into two stages:

Stage 1: Download Code

Stage 2: Script Execution & Fetch Data

To dig into these two stages, we generally use the following tools for analysis:

Network

The first tool we usually reach for is Chrome's Network panel, which helps us identify bottlenecks at first glance:

In the example shown in the figure, you can take in the entire page at a glance: finish time, total resource size, number of requests, the timing and start point of each request, resource priority, and so on. It is clear from this example that the page load is very large, approaching 30 MB.

Code Coverage

In complex front-end projects, the build artifacts usually contain redundant or even entirely unused code. This wastefully loaded code can be analyzed in real time with the Coverage tool:

As the example above shows, the entire page totals 28.3 MB, of which 19.5 MB is unused (never executed), and less than 0.7% of the engine-style.css file is actually used.

Bundle visualization

We already know that front-end resource utilization is very low, but what exactly is the offending code that has been pulled in? Here we use webpack-bundle-analyzer to analyze the entire build artifact (the stats file can be produced via webpack --profile --json=stats.json):

In the example above, combined with our business context, we can see several problems with the build artifact:

First, the initial bundle is too large (common.js)

Second, several packages are bundled more than once (Moment.js, among others)

Third, the third-party dependencies are too bulky

Module dependencies

With the bundle visualization we already have a rough idea of what to optimize. But in a real system, hundreds of modules are generally organized by referencing one another, and the packaging tool builds them together by following the dependency tree (for example, into a single common.js file), so directly removing a module's code may not be easy. We therefore need to untangle things to some extent: use tools to clarify the module dependency relationships in the system, and then optimize the dependencies themselves or the way they are loaded:

In the figure above, we use Webpack's official analyse tool (alternatives include webpack-xray and madge); simply upload stats.json to get a dependency graph.

Performance

For the "execution & fetch" stage, Chrome provides a very powerful analysis tool: the Performance panel:

In the example above, we can see at least a few problems: the main flow is serialized, there are long tasks, and there are high-frequency tasks.

How to optimize performance?

With the analysis tools above, we have basically covered the two major stages of "resource download" and "execution & fetch", and ideas about the underlying problems and their solutions have gradually taken shape during the analysis. Below, I will combine our scenarios to present some proven optimization ideas and their effects.

On-demand loading of large packages

Recall that front-end bundlers (such as Webpack) usually start from the entry and walk the entire dependency tree (following direct dependencies), generating multiple JS and CSS bundles or chunks based on this tree. Once a module appears in the dependency tree, it is loaded whenever the page loads the entry.

So our idea is to break this direct dependency and use an asynchronous import for modules at the end of the tree, as follows:

Change the synchronous import { Marker } from '@antv/l7' to an asynchronous import, so that the Marker dependency forms a chunk at build time, and that chunk is loaded only when this code is executed (i.e., on demand), reducing the size of the first-screen bundle.
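In code, the change looks roughly like this (a sketch; the addMarker wrapper is hypothetical):

```ts
// Before: a synchronous import pulls all of @antv/l7 into the first-screen bundle
// import { Marker } from '@antv/l7';

// After: a dynamic import tells the bundler to split this dependency into
// its own chunk, downloaded only when this code path actually runs
async function addMarker(options: Record<string, unknown>) {
  const { Marker } = await import('@antv/l7');
  return new Marker(options as any); // cast hedges over l7's option types
}
```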

However, there is a problem with the scheme above: the build treats the whole of @antv/l7 as the chunk, rather than only the code Marker needs, so TreeShaking fails for this rather large chunk. We can solve this with a build-time shard file:

As shown above, we first create a shard file for Marker so that it remains tree-shakeable, and then apply the asynchronous import on top of it.
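A minimal sketch of the shard file, assuming a hypothetical file name of marker-shard.ts:

```ts
// marker-shard.ts: re-export only what we need, so the async chunk
// carries just Marker's code and the rest of @antv/l7 can be tree-shaken
export { Marker } from '@antv/l7';

// usage elsewhere: asynchronously import the shard instead of the library
// import('./marker-shard').then(({ Marker }) => new Marker({ /* ... */ }));
```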

Below is a comparison of the flow before and after this optimization:

In this step, by splitting the bundle on demand and loading it asynchronously, we save resource download time and part of the execution time.

Resource preloading

In the analysis stage we already found the problem of "serialization of the main flow". JS execution is single-threaded, but the browser itself runs multiple threads (asynchronous network fetches, and so on). So our next idea is to use those threads to run data fetching and resource downloading in parallel.

As things stand, fetch logic is generally coupled with business logic or data-processing logic, so decoupling it (from the UI, business modules, etc.) is essential: the pure fetch request (plus a small amount of processing logic) is stripped out and moved to a higher-priority stage where the request is initiated. So where exactly do we put it? As we know, the browser processes resources by priority, normally in the following order:

  1. HTML/CSS/FONT
  2. Preload/SCRIPT/XHR
  3. Image/Audio/Video
  4. Prefetch

To download resources and initiate the fetch in parallel, we need to promote the fetch to the first priority (initiating the request immediately after the HTML is parsed, instead of waiting for the script resources to load and then issuing the request during execution). Our flow then becomes the following:

Note that since JS execution is serial, the logic that initiates the fetch must run before the main-flow logic and must not be deferred to the next tick (for example via setTimeout(() => doFetch())); otherwise, the main flow will occupy the CPU and the request will not be sent in time.
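A minimal sketch of the idea (the endpoint and the global variable name are assumptions): this snippet is inlined as the very first script in the HTML, so the request leaves as soon as the HTML is parsed and travels in parallel with the bundle download.

```ts
// Inlined at the top of the HTML, before any bundle <script> tag.
// Fire the data request immediately and stash the promise globally;
// the app later awaits window.__initialDataPromise instead of issuing
// the same request only after the bundle has loaded and executed.
(window as any).__initialDataPromise = fetch('/api/dashboard/data', {
  credentials: 'include',
}).then((res) => res.json());
```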

Active task scheduling

The browser has its own priority strategy for resources, but it knows nothing about our business: which resources we want loaded and executed first, and which can wait. So let us step outside the browser's policy: what if we broke the entire business-level process of resource loading and execution into small tasks, controlled entirely by ourselves? With bundle granularity, load timing, and execution timing all in our hands, could we maximize CPU time and network utilization?

The answer is yes. For simple projects the browser's own scheduling policy is generally sufficient, but for large, complex projects pursuing more extreme optimization, it becomes necessary to introduce a "custom task scheduling" scheme.

Take Quick BI as an example: our early goal was to get the main content of the first screen in front of users faster, so CPU and network allocation across resource loading, code execution, and fetching should follow our business priorities. For instance, the "card drop-down menu" should start loading only after the first-screen main content is displayed or when the CPU is idle (that is, at a lower priority), and should even be promoted to load and display immediately when the user hovers over the card. As follows:

Here we encapsulate a task scheduler whose purpose is to declare a piece of logic that executes only after its dependencies (promises) complete. Our flow chart changes as follows:

The yellow blocks represent modules whose priority has been demoted, which helps reduce the overall first-screen time.
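To make the idea concrete, here is a minimal sketch of such a scheduler, assuming a promise-based signal like firstScreenRendered exists at the application level (the names are hypothetical, not Quick BI's actual API):

```ts
type Task<T> = () => T | Promise<T>;

// Run `task` only after all of its declared dependencies (promises) have
// settled, and prefer idle CPU time so the main flow is never starved.
function scheduleTask<T>(task: Task<T>, deps: Promise<unknown>[] = []): Promise<T> {
  return Promise.all(deps).then(
    () =>
      new Promise<T>((resolve) => {
        const run = () => resolve(task());
        if ('requestIdleCallback' in window) {
          (window as any).requestIdleCallback(run); // idle-time execution
        } else {
          setTimeout(run, 0); // fallback: next macrotask
        }
      }),
  );
}

// Usage: load the card drop-down menu only after the first screen is ready.
declare const firstScreenRendered: Promise<void>; // assumed app-level signal
scheduleTask(() => import('./CardDropdownMenu'), [firstScreenRendered]);
```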

TreeShaking

All the methods above are essentially priority-based. In an era when front-end engineering is increasingly complex (a large project can contain hundreds of thousands of lines of code), a smarter optimization scheme emerged to reduce bundle size. The idea is simple: the tool analyzes dependencies and drops code that is never referenced from the final artifact.

It sounds cool and actually works well, but there are several situations in which TreeShaking often fails:

Side effects

Side effects usually refer to code that affects global state (the window object, etc.) or the environment.

As shown in the example, the function b appears to be unused, but the file contains console.log(b(1)); bundlers like Webpack dare not remove code like this, so it is bundled in as usual.
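The code in that example looks roughly like this:

```ts
export function a(x: number) {
  return x + 1;
}

export function b(x: number) {
  return x * 2;
}

// A module-level side effect: it runs the moment the module is evaluated,
// so the bundler must keep b even if no other module ever imports it.
console.log(b(1));
```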

The solution

Explicitly declare in package.json (or the webpack configuration) which code has side effects, for example "sideEffects": ["**/*.css"]; code not marked as having side effects can then be removed.

IIFE-style code

IIFE stands for Immediately Invoked Function Expression.

This type of code causes TreeShaking to fail.
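For example (a sketch):

```ts
// The IIFE executes at module-evaluation time, so the bundler must keep
// it (and everything it reaches) even if `styles` is never imported.
export const styles = (() => {
  const sheet = new Map<string, string>();
  sheet.set('color', 'red'); // runs immediately on load
  return sheet;
})();
```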

The solution

Three principles:

  • [Avoid] immediate function calls
  • [Avoid] immediate new operations
  • [Avoid] code that immediately mutates global state

Lazy loading

We mentioned in the "load on demand" section that splitting code out via asynchronous import can cause TreeShaking to fail. Here is another case:

As shown, index.ts synchronously imports sharedStr from bar.ts, while somewhere else the same file is imported asynchronously via import('./bar'). This causes two problems at once:

  1. TreeShaking fails (unusedStr is bundled in)
  2. Asynchronous lazy loading fails (bar.ts is bundled together with index.ts)

When the codebase reaches a certain size, this problem crops up easily.
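A sketch of the situation described above:

```ts
// bar.ts
export const sharedStr = 'shared';
export const unusedStr = 'unused';

// index.ts: the synchronous import forces bar.ts into the entry chunk
import { sharedStr } from './bar';
console.log(sharedStr);

// elsewhere: asynchronously importing the SAME file now gains nothing:
// no separate chunk is emitted, and because the dynamic import consumes
// the whole module namespace, unusedStr cannot be tree-shaken either
import('./bar').then((mod) => console.log(mod.sharedStr));
```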

The solution

  • [Avoid] importing the same file both synchronously and asynchronously

On-demand policy (Lazy)

Now that resource bundles can be loaded on demand, can a component be rendered on demand? Can an object be instantiated on demand? Can a data cache be generated on demand?

LazyComponent

As shown, piearc.private.ts contains a complex React component. makeLazyComponent wraps its default export into a lazy component, PieArc, which is loaded and executed only when the code actually runs to this point. You can even declare dependencies through the second argument (deps), waiting until those promises settle before loading and executing.
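A minimal sketch of what makeLazyComponent might look like (the helper name comes from the article, but this implementation is an assumption), built on React.lazy and Suspense:

```tsx
import React, { Suspense } from 'react';

function makeLazyComponent<P extends object>(
  loader: () => Promise<{ default: React.ComponentType<P> }>,
  deps: Promise<unknown>[] = [],
) {
  // Wait for the declared dependencies, then fetch the component's chunk.
  const Lazy = React.lazy(() => Promise.all(deps).then(loader));
  return (props: P) => (
    <Suspense fallback={null}>
      <Lazy {...(props as any)} />
    </Suspense>
  );
}

// The chunk for piearc.private.ts is downloaded on first render only.
const PieArc = makeLazyComponent(() => import('./piearc.private'));
```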

LazyCache

Lazy caching is for scenarios where the transformed result of some data in a data flow (or any other subscribable source) may be needed anywhere, but should be computed only at the moment it is used.
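A minimal sketch of the idea (names and API are assumptions): updates merely mark the cache dirty, and the transform runs at most once per change, at the moment of first use.

```ts
function makeLazyCache<S, T>(
  subscribe: (onChange: (next: S) => void) => void,
  transform: (source: S) => T,
): () => T {
  let source!: S;
  let cached!: T;
  let dirty = true;
  // Data-flow updates do no work here; they only invalidate the cache.
  subscribe((next) => {
    source = next;
    dirty = true;
  });
  // Reading recomputes lazily, only when the cached value is stale.
  return () => {
    if (dirty) {
      cached = transform(source);
      dirty = false;
    }
    return cached;
  };
}
```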

LazyObject

A lazy object is instantiated only when it is used (when its properties or methods are accessed, modified, deleted, and so on).

As shown, globalRecorder is not instantiated when it is imported, but only when globalRecorder.record() is called.
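A minimal sketch of how such a lazy object can be built with a Proxy (the implementation is an assumption; only the globalRecorder.record() behavior comes from the article):

```ts
function lazyObject<T extends object>(create: () => T): T {
  let instance: T | undefined;
  const ensure = () => (instance ??= create());
  // The Proxy defers construction until the first property interaction.
  return new Proxy({} as T, {
    get: (_, prop) => (ensure() as any)[prop],
    set: (_, prop, value) => (((ensure() as any)[prop] = value), true),
    deleteProperty: (_, prop) => delete (ensure() as any)[prop],
  });
}

class Recorder {
  record(event: string) {
    /* ... */
  }
}

// Importing this module does not construct Recorder; the first call to
// globalRecorder.record() does.
export const globalRecorder = lazyObject(() => new Recorder());
```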

Data flow: Throttled rendering

To simplify state management, medium and large projects usually adopt a data-flow scheme, as follows:

The data in the store is usually atomic, with very fine granularity: the state contains fields A, B, C, and so on, each updated by its own action. If N = 20 such actions arrive within 16ms (one frame), there will be 20 View updates in that single frame:

This obviously causes severe performance problems, so we need to buffer actions over a short window and throttle: after the 20 actions have all updated the state, perform only one View update, as follows:

In Quick BI, this solution takes the form of a Redux middleware, and it works well in scenarios with complex and frequent data updates.
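As an illustration, a sketch of such a middleware (this is an assumption, not Quick BI's actual implementation; it relies on react-redux's batch to collapse subscriber notifications):

```ts
import { AnyAction, Middleware } from 'redux';
import { batch } from 'react-redux';

// Buffer every action for one animation frame, then flush the batch in
// a single pass so the View re-renders once instead of once per action.
const throttledRendering: Middleware = () => (next) => {
  let buffer: AnyAction[] = [];
  let scheduled = false;
  return (action: AnyAction) => {
    buffer.push(action);
    if (!scheduled) {
      scheduled = true;
      requestAnimationFrame(() => {
        const pending = buffer;
        buffer = [];
        scheduled = false;
        // batch() lets React notify subscribers once for the whole flush
        batch(() => pending.forEach((a) => next(a)));
      });
    }
  };
};
```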

Thinking

As a Chinese saying goes, "the gentleman anticipates trouble and guards against it." Looking back, more than 80% of these performance problems can be avoided at the architecture-design and coding stages, and the remaining 20% can be traded off through "space <=> time" strategies and similar means. So the best performance optimization is to focus on the quality of every piece of code: have we accounted for the build-artifact size that this module dependency brings? Have we considered how often this logic might execute? Can space or CPU usage stay under control as the data grows? And so on. There is no silver bullet for performance optimization; as engineers, we need to internalize the underlying principles and make attention to performance part of our instinctive thinking.


This article is original content from Alibaba Cloud and may not be reproduced without permission.