
1. What Is Tree Shaking

Tree Shaking is a Dead Code Elimination technique based on the ES Module specification. At build time it statically analyzes the imports and exports between modules, determines which exported values of ESM modules are not used by any other module, and removes them in order to shrink the bundled output.

Tree Shaking was first implemented by Rich Harris in Rollup. Webpack has supported it since version 2.0, and it is now a widely used performance optimization technique.

1.1 Enabling Tree Shaking in Webpack

In Webpack, three conditions must be met to enable Tree Shaking:

  • Write module code using ESM syntax
  • Set optimization.usedExports to true to enable export marking
  • Enable code minification in one of the following ways:
    • Set mode = production
    • Set optimization.minimize = true
    • Provide an optimization.minimizer array

For example:

// webpack.config.js
module.exports = {
  entry: "./src/index",
  mode: "production",
  devtool: false,
  optimization: {
    usedExports: true,
  },
};

1.2 Theoretical Basis

In older JavaScript module schemes such as CommonJS, AMD, and CMD, import and export behavior is highly dynamic and hard to predict, for example:

if (process.env.NODE_ENV === 'development') {
  require('./bar');
  exports.foo = 'foo';
}

ESM rules out this behavior at the specification level. It requires that all import and export statements appear only at the top level of a module, and that the module name in an import or export statement be a string constant. This means that the following code is illegal under ESM:

if (process.env.NODE_ENV === 'development') {
  import bar from 'bar';
  export const foo = 'foo';
}

Therefore, the dependencies between ESM modules are fully deterministic and independent of runtime state. The compiler only needs to statically analyze the ESM modules to infer, from the code text alone, which exported values are not used by other modules. This is a necessary condition for implementing Tree Shaking.

1.3 Example

For the following code:

// index.js
import {bar} from './bar';
console.log(bar);

// bar.js
export const bar = 'bar';
export const foo = 'foo';

In the example, the bar.js module exports bar and foo, but only bar is used by another module. After Tree Shaking, the foo variable is removed as dead code.
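The static analysis described above can be illustrated with a small toy script. This is only a sketch under simplifying assumptions: the function names (collectExports, collectImports, findUnusedExports) are made up for this illustration, and the regular expressions understand only the two statement forms used in the example, whereas a real bundler parses the full syntax tree.

```javascript
// Toy static analysis: find exports that no other module imports.
// Only `export const x` and `import { a, b } from '...'` are recognized.
const sources = {
  "./index": "import {bar} from './bar';\nconsole.log(bar);",
  "./bar": "export const bar = 'bar';\nexport const foo = 'foo';",
};

// Names declared with `export const`.
function collectExports(code) {
  return [...code.matchAll(/export\s+const\s+(\w+)/g)].map((m) => m[1]);
}

// Named imports as { from, name } pairs.
function collectImports(code) {
  const result = [];
  const pattern = /import\s*\{([^}]*)\}\s*from\s*['"]([^'"]+)['"]/g;
  for (const m of code.matchAll(pattern)) {
    for (const name of m[1].split(",")) {
      const trimmed = name.trim();
      if (trimmed) result.push({ from: m[2], name: trimmed });
    }
  }
  return result;
}

// An export is unused when no module imports it by name.
function findUnusedExports(modules) {
  const used = new Set();
  for (const code of Object.values(modules)) {
    for (const { from, name } of collectImports(code)) {
      used.add(`${from}#${name}`);
    }
  }
  const unused = [];
  for (const [id, code] of Object.entries(modules)) {
    for (const name of collectExports(code)) {
      if (!used.has(`${id}#${name}`)) unused.push(`${id}#${name}`);
    }
  }
  return unused;
}

const unusedExports = findUnusedExports(sources);
console.log(unusedExports); // [ './bar#foo' ]
```

No code is executed to reach this result; the source text alone is enough, which is exactly what the ESM restrictions make possible.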

2. Implementation Principles

In Webpack, Tree Shaking is implemented in two parts: first, unused module exports are marked; then Terser removes the unused export statements. The marking process can be roughly divided into three steps:

  • In the Make phase, module export variables are collected and recorded in the module dependency graph, ModuleGraph
  • In the Seal phase, the ModuleGraph is traversed to mark whether each module export variable is used
  • When the output is generated, if a variable is not used by any other module, the corresponding export statement is omitted

The marking function is enabled only when optimization.usedExports = true is configured.

In other words, the effect of marking is to remove export statements that are not used by other modules, for example:

In the example, the bar.js module (second from left) exports two variables: bar and foo. foo is not used by other modules, so after marking, the export statement corresponding to foo is removed from the build output (first from right). By contrast, if marking is not enabled (optimization.usedExports = false), the export statement is retained whether or not the variable is used, as shown in the output code second from right.

Note that the definition const foo = 'foo' itself remains intact at this point, because marking only affects the module's export statements; it is the Terser plugin that actually does the "shaking". In the example above, once foo has been marked it becomes Dead Code, that is, code that can never be executed. Terser's DCE feature can then remove the definition statement, which completes the Tree Shaking effect.

Next, I'll walk through the source code of the marking process and explain in detail how Tree Shaking is implemented in Webpack 5. If you're not interested in the source code, you can skip to the next chapter.

2.1 Collecting Module Exports

First, Webpack needs to figure out what the export values are for each module. This happens in the make phase.

For more information on the Make phase, see the earlier article "[10,000-word summary] Thoroughly understand the core principles of Webpack in one article".

  1. Convert all ESM export statements of the module into Dependency objects and record them in the module object's dependencies collection. Conversion rules:
  • Named exports are converted to HarmonyExportSpecifierDependency objects
  • default exports are converted to HarmonyExportExpressionDependency objects

For example, for the following module:

export const bar = 'bar';
export const foo = 'foo';

export default 'foo-bar'

The corresponding dependencies value is:

  2. After all modules are compiled, the compilation.hooks.finishModules hook is triggered and the FlagDependencyExportsPlugin callback starts executing
  3. FlagDependencyExportsPlugin reads the module information stored in the ModuleGraph, starting from the entry, and traverses all module objects
  4. It traverses each module object's dependencies array, finds all dependencies of the HarmonyExportXXXDependency types, converts them to ExportInfo objects, and records them in the ModuleGraph

After FlagDependencyExportsPlugin has run, all ESM-style export statements are recorded in the ModuleGraph, and later steps can read a module's exported values directly from it.

References:

  1. [10,000-word summary] Thoroughly understand the core principles of Webpack in one article
  2. Dependency Graph

2.2 Marking Module Exports

After module export information is collected, Webpack needs to mark which exported values are used by other modules in the exported list of each module and which are not. This process occurs in the Seal phase. The main flow is as follows:

  1. The compilation.hooks.optimizeDependencies hook is triggered and the FlagDependencyUsagePlugin logic starts executing
  2. In FlagDependencyUsagePlugin, starting from the entry, traverse all module objects stored in the ModuleGraph step by step
  3. Traverse the exportInfo array corresponding to each module object
  4. For each exportInfo object, call the compilation.getDependencyReferencedExports method to determine whether its corresponding dependency object is used by other modules
  5. For an exported value used by any module, call the exportInfo.setUsedConditionally method to mark it as used
  6. Internally, exportInfo.setUsedConditionally modifies the exportInfo._usedInRuntime property to record how the export is used
  7. End

This is an extremely simplified version; the real code contains a lot of branching logic and complex set operations in between. The key point is that marking module exports is concentrated in FlagDependencyUsagePlugin, and the results are ultimately recorded in the exportInfo._usedInRuntime dictionary of the corresponding module export.
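To make the collect-then-mark flow more concrete, here is a toy model of the two phases. It is a sketch only: moduleGraph, exportsInfo, collectExports, and markUsage are hypothetical names invented for this illustration and do not correspond to Webpack's real classes or method signatures, and the `used` flag stands in for the much richer _usedInRuntime record.

```javascript
// Toy module graph for index.js and bar.js (hypothetical data shape).
const moduleGraph = new Map([
  ["./index", {
    dependencies: [{ type: "import", request: "./bar", names: ["bar"] }],
  }],
  ["./bar", {
    dependencies: [
      { type: "export", name: "bar" },
      { type: "export", name: "foo" },
    ],
  }],
]);

// Phase 1 (Make): turn export dependencies into per-module export records.
function collectExports(graph) {
  for (const mod of graph.values()) {
    mod.exportsInfo = new Map();
    for (const dep of mod.dependencies) {
      if (dep.type === "export") mod.exportsInfo.set(dep.name, { used: false });
    }
  }
}

// Phase 2 (Seal): walk from the entry and flag every export that is imported.
function markUsage(graph, entry) {
  const visited = new Set();
  const queue = [entry];
  while (queue.length > 0) {
    const id = queue.shift();
    if (visited.has(id)) continue;
    visited.add(id);
    for (const dep of graph.get(id).dependencies) {
      if (dep.type !== "import") continue;
      const target = graph.get(dep.request);
      for (const name of dep.names) {
        const info = target.exportsInfo.get(name);
        if (info) info.used = true;
      }
      queue.push(dep.request);
    }
  }
}

collectExports(moduleGraph);
markUsage(moduleGraph, "./index");

const barExports = Object.fromEntries(
  [...moduleGraph.get("./bar").exportsInfo].map(([name, info]) => [name, info.used])
);
console.log(barExports); // { bar: true, foo: false }
```

The code generation step only needs to read these flags; it does not have to re-analyze the source.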

2.3 Generating Code

After the collection and marking steps, Webpack has clearly recorded in the ModuleGraph which values each module exports and whether each exported value is used by other modules. Next, Webpack generates different code depending on how the exported values are used, for example:

Take the bar.js file: bar is used by the index.js module, so the generated __webpack_require__.d call contains "bar": ()=>(/* binding */ bar), whereas foo keeps only its definition statement and no corresponding export is generated in the chunk.
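The shape of that generated code can be approximated in a self-contained script. This is a hedged sketch, not actual Webpack output: the __webpack_require__.d below is a simplified stand-in for the runtime helper (the real one performs additional own-property checks), and barModule and barModuleExports are names chosen for this illustration.

```javascript
// Simplified stand-in: define each listed export as an enumerable getter.
const __webpack_require__ = {
  d(exports, definition) {
    for (const key of Object.keys(definition)) {
      Object.defineProperty(exports, key, {
        enumerable: true,
        get: definition[key],
      });
    }
  },
};

// Approximate shape of the bar.js module function with usedExports enabled.
function barModule(__webpack_exports__) {
  __webpack_require__.d(__webpack_exports__, {
    "bar": () => (/* binding */ bar),
  });
  /* unused harmony export foo */
  const bar = "bar";
  const foo = "foo"; // still defined, but not reachable from outside
}

const barModuleExports = {};
barModule(barModuleExports);

console.log(Object.keys(barModuleExports)); // [ 'bar' ]
console.log(barModuleExports.bar); // bar
```

Because foo never appears on the exports object, nothing outside the module can reference it, which is what later allows a minifier to drop its definition.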

For details about the contents of Webpack's output and the meaning of the __webpack_require__.d method, refer to "Webpack Principles Series 6: Thoroughly understand the Webpack runtime".

This generation logic is implemented by the HarmonyExportXXXDependency classes corresponding to the export statements. The general process is:

  1. In the packaging phase, call the HarmonyExportXXXDependency.Template.apply method to generate code
  2. In the apply method, read the exportsInfo information stored in the ModuleGraph to determine which exported values are used and which are not
  3. Create corresponding HarmonyExportInitFragment objects for the used and unused exported values and save them to the initFragments array
  4. Traverse the initFragments array to generate the final result

Basically, the logic in this step is to generate export statements from the exportsInfo objects collected earlier, according to whether each exported value is used.

2.4 Deleting Dead Code

After the previous steps, unused values from the module export list are not defined on the __webpack_exports__ object, which leaves them as unreachable Dead Code, such as foo in the example above:

After that, DCE tools such as Terser and UglifyJS "shake" off this dead code, completing the Tree Shaking operation.

2.5 Summary

To sum up, the implementation of Tree Shaking in Webpack is divided into the following steps:

  • In FlagDependencyExportsPlugin, module export values are collected from the module's dependencies list and recorded in the exportsInfo of the ModuleGraph
  • In FlagDependencyUsagePlugin, the usage of each module's exported values is collected and recorded in the exportInfo._usedInRuntime collection
  • In the HarmonyExportXXXDependency.Template.apply method, different export statements are generated based on the usage of each exported value
  • A DCE tool removes the Dead Code to achieve the complete Tree Shaking effect

Understanding the implementation above requires a fair amount of background knowledge, so readers are advised to read the following articles alongside it:

  1. [10,000-word summary] Thoroughly understand the core principles of Webpack in one article
  2. Dependency Graph
  3. Webpack Principles Series 6: Thoroughly understand the Webpack runtime

3. Best Practices

Although Webpack has natively supported Tree Shaking since 2.x, the dynamic nature of JS and the complexity of module systems mean that, even in the latest 5.0 release, many problems caused by code side effects remain unsolved. As a result, the optimization is often less thorough than Tree Shaking promises, so developers need to deliberately structure their code, or use some supplementary techniques, to help Webpack detect dead code more accurately.

3.1 Avoid meaningless assignments

When working with Webpack, you need to consciously avoid unnecessary assignments. Look at this example code:

In the example, the index.js module imports foo from the bar.js module and assigns it to f, but never uses foo or f afterwards. In this case, the foo exported by bar.js is not actually used and should be removed. However, Webpack's Tree Shaking does not take effect here, and the foo export remains in the output:

The immediate reason is that Webpack's Tree Shaking logic stays at the level of static code analysis and checks only superficially:

  • Whether a module export variable is referenced by other modules
  • Whether the variable appears in the body code of the referencing module

It performs no deeper semantic analysis of whether the exported value is actually put to effective use.

A further reason is that JavaScript assignments are not pure and may have unintended side effects depending on the situation, such as:

import { bar, foo } from "./bar";

let count = 0;

const mock = {}

Object.defineProperty(mock, 'f', {
    set(v) {
        mock._f = v;
        count += 1;
    }
})

mock.f = foo;

console.log(count);

In the example, the Object.defineProperty call on the mock object gives the mock.f = foo assignment a side effect on the count variable. In scenarios like this, even sophisticated dynamic semantic analysis would struggle to shake off every useless code branch while still guaranteeing that side effects are preserved correctly.

Therefore, when using Webpack, developers need to consciously avoid these meaningless repetitive assignments.

3.2 Use #pure to annotate pure function calls

Like assignment statements, function calls in JavaScript can have side effects, so by default Webpack does not apply Tree Shaking to function calls. However, developers can add a /*#__PURE__*/ comment before a call statement to explicitly tell Webpack that the function call has no side effects on its context, for example:

In the example, foo('be retained') has no /*#__PURE__*/ annotation, so the call is kept. By contrast, foo('be removed') is annotated as pure, so it is removed by Tree Shaking.
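A minimal sketch of the two calls described above follows. The foo helper and the calls array are assumptions made for this illustration; the array exists only so the behavior can be observed when the file is run directly.

```javascript
// Hypothetical helper that records every invocation.
const calls = [];
function foo(message) {
  calls.push(message);
  return message;
}

// No annotation: the tool must assume the call has side effects and keep it.
foo("be retained");

// Annotated and its result unused: the author promises the call is pure,
// so a minifier such as Terser is allowed to drop the whole statement.
/*#__PURE__*/ foo("be removed");

console.log(calls);
```

Run directly in Node, without any bundler, both calls execute. The annotation is only a promise from the developer that the tool does not verify, so it should be applied only to calls that really are free of side effects; the helper here deliberately is not, which shows what would be lost if the promise were wrong.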

3.3 Prevent Babel from transpiling module import/export statements

Babel is a very popular JavaScript transpiler that converts newer JS code into more compatible older-version code, allowing front-end developers to use the latest language features while still supporting older browsers.

However, some Babel features can break Tree Shaking. For example, Babel can transpile import/export style ESM statements into CommonJS style module statements, which prevents Webpack from statically analyzing the imports and exports of the transpiled module, for example:

The example uses babel-loader to process *.js files and sets the Babel option modules = 'commonjs' to convert modules from ESM to CommonJS. In the resulting transpiled code (top right), the unused export foo is not marked correctly. For comparison, the second figure on the right shows the result of bundling with modules = false, where the foo variable is correctly marked as Dead Code.

Therefore, when using babel-loader in Webpack, it is recommended to set the modules option of babel-preset-env to false, which disables the conversion of module import and export statements.
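A configuration along these lines might look like the sketch below. It assumes the scoped package name @babel/preset-env (the name the preset is published under for Babel 7 and later) and a typical babel-loader rule; the test and exclude patterns are illustrative choices rather than requirements.

```javascript
// webpack.config.js (sketch)
const config = {
  module: {
    rules: [
      {
        test: /\.js$/,
        exclude: /node_modules/,
        use: {
          loader: "babel-loader",
          options: {
            // modules: false leaves import/export statements untouched,
            // so Webpack can still analyze them statically.
            presets: [["@babel/preset-env", { modules: false }]],
          },
        },
      },
    ],
  },
};

module.exports = config;
```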

3.4 Optimize the granularity of exported values

Tree Shaking works at the level of ESM export statements, so for export scenarios like the following:

export default {
    bar: 'bar',
    foo: 'foo'
}

Even if only one property of the default export is actually used, the entire default object is retained intact. Therefore, in real-world development, exported values should be kept as granular and atomic as possible. The optimized version of the example above:

const bar = 'bar'
const foo = 'foo'

export {
    bar,
    foo
}

3.5 Using packages that support Tree Shaking

If possible, use npm packages that support Tree Shaking, for example:

  • Use lodash-es instead of lodash, or use babel-plugin-lodash to achieve a similar effect

However, not every npm package leaves room for Tree Shaking. Frameworks such as React and Vue 2 are already well optimized, and business code generally needs the complete functionality the package provides, so Tree Shaking matters less for them.