Notes on webpack asynchronous loading and code splitting, learned recently on business projects

The outline is as follows:

  • Relevant concepts
  • Webpack code-splitting configuration
  • How webpack implements asynchronous loading of split chunks

Relevant concepts

  • Module, chunk, and bundle concepts

First, some terminology, with a picture found online to explain it:

1. Module. Each file in our source directory is treated as a module by webpack (file types that webpack does not support natively are handled by loaders). Modules make up chunks.

2. Chunk. A product of the webpack build process. By default, x webpack entries output x bundles.

3. Bundle. The final output of webpack, which can run directly in the browser. As the figure shows, a chunk outputs multiple bundles when CSS (or images, font files, etc.) is extracted into separate files, but by default a chunk outputs only one bundle.

  • hash, chunkhash, and contenthash

I’m not going to do a demo here, there are a lot of demos online.

hash. All bundles produced by one webpack build share the same hash value.

chunkhash. A hash computed from the content of each chunk. All bundles produced from the same chunk have the same hash value, so if one of those bundles changes, the hashes of all outputs of that chunk change too.

contenthash. Computed from the content of the output file itself.

Tip: in hot update mode, chunkhash and contenthash cannot be used. Webpack reports the error: Cannot use [chunkhash] or [contenthash] for chunk in '[name].[chunkhash].js' (use [hash] instead). Therefore, with hot update you can only use hash or no hash at all. In production we usually use contenthash or chunkhash.
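As a rough illustration of the tip above, the filename pattern can be switched by build mode. This is a minimal sketch that assumes webpack 4 with a function-style config, where argv.mode carries the value of the --mode flag. The makeConfig name is only illustrative.

```javascript
// webpack.config.js (sketch): pick the hash placeholder by mode.
// Assumption: the build is started with `--mode production` or `--mode development`.
function makeConfig(env, argv) {
  const isProd = argv.mode === "production";
  return {
    output: {
      // contenthash/chunkhash are rejected under hot update, so development uses no hash.
      filename: isProd ? "[name].[contenthash].js" : "[name].js",
      chunkFilename: isProd ? "[name].[contenthash].js" : "[name].js"
    }
  };
}

module.exports = makeConfig;
```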

Having said that, what are the benefits of asynchronous loading and code splitting? In short:

1. Better use of the browser cache. Suppose a large project does not use code splitting, so each build generates a single JS file of, say, 2MB. In a routine release we might change just one line of code, but because the content changed, the hash of the built JS changes as well, and the browser has to download the whole 2MB file again. With code splitting, the code is separated into several chunks, and changing one line only affects the hash of the chunk that contains it (strictly speaking, several hashes may change if the manifest is not extracted), while the other hashes stay the same. The unchanged parts of the code can then still be served from the cache.

2. Faster loading. Suppose entering a page requires loading one 2MB JS file. After splitting, entering the page may instead load four 500KB JS files. Browsers typically allow about 6 concurrent requests to the same domain (webpack's maxAsyncRequests defaults to 5), so the four 500KB files load in parallel, which is roughly equivalent to loading a single 500KB resource. Loading speed improves accordingly.

3. Traffic savings through lazy loading. Code that is only needed in some places can be loaded asynchronously at the moment it is actually used.
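To make the caching argument in point 1 concrete, here is a small Node.js sketch. It does not use webpack. It only mimics content hashing with the built-in crypto module, and the chunk names and sources are made up.

```javascript
const crypto = require("crypto");

// Hypothetical helper: a short content hash, similar in spirit to [contenthash].
function contentHash(source) {
  return crypto.createHash("md5").update(source).digest("hex").slice(0, 8);
}

// Made-up chunk contents for two releases; only `route1` changes.
const releaseA = {
  vendors: "/* library code */",
  route1: "console.log('route1 v1');",
  route2: "console.log('route2');"
};
const releaseB = { ...releaseA, route1: "console.log('route1 v2');" };

// Chunks whose hash differs between the releases, i.e. files the browser must download again.
const changedChunks = Object.keys(releaseA).filter(
  (name) => contentHash(releaseA[name]) !== contentHash(releaseB[name])
);

console.log(changedChunks); // only "route1" has to be fetched again
```

With a single unsplit bundle, the same one-line change would invalidate the whole file.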

Webpack code-splitting configuration

Before going further, note one point about splitChunks: it counts references per chunk, not per module. No matter how many times a file is referenced inside the same chunk, it counts as one reference. Only a file referenced by more than one chunk counts as referenced multiple times.
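The counting rule can be sketched in a few lines. This is only an illustration of the rule, not how webpack computes it internally. The chunkImports table and the countChunkReferences function are hypothetical.

```javascript
// Hypothetical input: for each chunk, the files imported by its modules (duplicates allowed).
const chunkImports = {
  a: ["./common1.js", "./common1.js", "./common2.js"],
  b: ["./common1.js"]
};

// Count how many distinct chunks reference each file.
function countChunkReferences(imports) {
  const counts = {};
  for (const files of Object.values(imports)) {
    // A Set collapses repeated references inside one chunk into a single reference.
    for (const file of new Set(files)) {
      counts[file] = (counts[file] || 0) + 1;
    }
  }
  return counts;
}

const referenceCounts = countChunkReferences(chunkImports);
console.log(referenceCounts);
```

Here ./common1.js is counted twice (chunks a and b) even though chunk a references it two times, and ./common2.js is counted once, so only ./common1.js would satisfy a minChunks value of 2.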

The default code-splitting configuration for webpack is as follows

module.exports = {
  optimization: {
    splitChunks: {
      // chunks: which types of chunks take part in splitting. The default is 'async'
      // (asynchronously loaded chunks). It can also be 'initial' (synchronously
      // loaded chunks) or 'all' (equivalent to 'initial' + 'async').
      chunks: "async",
      // minSize: minimum size of a new chunk produced by code splitting.
      // By default a new chunk is only generated if it is larger than 30KB.
      minSize: 30000,
      // maxSize: webpack tries to split chunks larger than maxSize into smaller
      // chunks. Each resulting chunk must still be larger than minSize.
      maxSize: 0,
      // minChunks: a module is split out once it is shared by at least this many chunks.
      minChunks: 1,
      // maxAsyncRequests: maximum number of parallel requests for an async chunk.
      // The chunks extracted from an async chunk and loaded together with it never
      // exceed this number. If it is 1, nothing is extracted from async chunks.
      maxAsyncRequests: 5,
      // maxInitialRequests: maximum number of parallel requests for an initial chunk.
      // It matters with multiple entries: it limits how many chunks of shared code
      // the entries can load at the same time.
      maxInitialRequests: 3,
      // automaticNameDelimiter: when a small chunk is split off, its name is the
      // names of its source chunks joined with this delimiter.
      automaticNameDelimiter: "~",
      // name: true means chunk names are generated automatically
      // (from the cacheGroups key and the entry names).
      name: true,
      // cacheGroups: grouping rules. Each group inherits the settings above.
      // priority: if a module matches several groups, the group with the higher priority wins.
      // test: when provided, selects which modules go into the group.
      cacheGroups: {
        vendors: {
          test: /[\\/]node_modules[\\/]/,
          priority: -10
        },
        default: {
          minChunks: 2,
          priority: -20,
          // Reuse a chunk that has already been generated
          reuseExistingChunk: true
        }
      }
    }
  }
};

Another important setting is output.jsonpFunction (webpackJsonp by default). This is the global variable used when loading chunks asynchronously. When several webpack builds run on the same page, it is best to set it to a reasonably unique value to prevent naming collisions.

In general, there is no perfect subcontracting configuration, only the configuration that best fits the requirements of the current project scenario. Most of the time, the default configuration is sufficient.

To keep hashes stable, it is generally recommended to:

1. Use webpack.HashedModuleIdsPlugin. The plugin generates a four-character hash as the module ID, based on the relative path of the module. By default, webpack uses incrementing numeric IDs for modules. When a module is inserted (or removed), the IDs of all subsequent modules shift, so module IDs change and the hashes of the built files change with them. This plugin solves that problem.

2. Chunk IDs are also auto-incremented and can run into the same problem as module IDs. Setting optimization.namedChunks to true makes webpack use the chunk names instead (the default is true in development mode and false in production mode).

The effects after 1 and 2 are as follows.

mini-css-extract-plugin
contenthash

The following takes a Tencent Cloud console page as an example of the asynchronous loading effect after the code is split by route. On the first visit, the page requests one main entry JS file and then loads the code for the route being visited (route 1). The number of JS files loaded asynchronously here is 5, which matches the default maxAsyncRequests setting mentioned above, and the waterfall shows that they are requested concurrently. When another route is visited (route 2), only that route's JS is loaded (plus any vendor JS that has not been loaded yet). If only route 1's own business code changes, the hashes of the vendor files and of the other routes stay the same, so those files make good use of the browser cache.

How webpack implements asynchronous loading of split chunks

As we know, browser JS traditionally does not support import or the asynchronous import('xxx').then(...) out of the box. So how does webpack make them work in the browser? Let's analyze the code that webpack produces to understand the principle behind it.

The experimental code structure is as follows

First, the files:

// webpack.js
const webpack = require("webpack");
const path = require("path");
const CleanWebpackPlugin = require("clean-webpack-plugin").CleanWebpackPlugin;
const HtmlWebpackPlugin = require("html-webpack-plugin");

module.exports = {
  entry: { a: "./src/a.js", b: "./src/b.js" },
  output: {
    filename: "[name].[chunkhash].js",
    chunkFilename: "[name].[chunkhash].js",
    path: __dirname + "/dist",
    jsonpFunction: "___jsonp"
  },
  optimization: {
    splitChunks: { minSize: 0 }
    // namedChunks: true
  },
  plugins: [
    new CleanWebpackPlugin(),
    new HtmlWebpackPlugin()
    // new webpack.HashedModuleIdsPlugin()
  ],
  devServer: {
    contentBase: path.join(__dirname, "dist"),
    compress: true,
    port: 8000
  }
};

// src/a.js
import { common1 } from "./common1";
import { common2 } from "./common2";
common1();
common2();
import(/* webpackChunkName: "asyncCommon2" */ "./asyncCommon2.js").then(
  ({ asyncCommon2 }) => {
    asyncCommon2();
    console.log("done");
  }
);

// src/b.js
import { common1 } from "./common1";
common1();
import(/* webpackChunkName: "asyncCommon2" */ "./asyncCommon2.js").then(
  ({ asyncCommon2 }) => {
    asyncCommon2();
    console.log("done");
  }
);

// src/asyncCommon1.js
export function asyncCommon1() {
  console.log("asyncCommon1");
}

// src/asyncCommon2.js
export function asyncCommon2() {
  console.log("asyncCommon2");
}

// src/common1.js
export function common1() {
  console.log("common11");
}
import(/* webpackChunkName: "asyncCommon1" */ "./asyncCommon1").then(
  ({ asyncCommon1 }) => {
    asyncCommon1();
  }
);

// src/common2.js
export function common2() {
  console.log("common2");
}

// entry file a.js
(function(modules) {
  // ...
  function webpackJsonpCallback(data) {
    // ...
  }

  // Cache of loaded modules. Modules loaded synchronously or asynchronously all go into this cache
  var installedModules = {};
  // Record of chunk states
  // 0: the chunk has finished loading
  // undefined: the chunk is not loaded yet
  // null: the chunk is preloaded/prefetched
  // Promise: the chunk is loading
  var installedChunks = {
    a: 0
  };

  // Returns the URL of an asynchronously loaded JS file for a chunkId
  function jsonpScriptSrc(chunkId) {
    // ...
  }

  // Synchronous import
  function __webpack_require__(moduleId) {
    // ...
  }

  // Used to load asynchronous imports
  __webpack_require__.e = function requireEnsure(chunkId) {
    // ...
  };

  // Load and execute the entry JS
  return __webpack_require__((__webpack_require__.s = "./src/a.js"));
})({
  "./src/a.js": function(module, __webpack_exports__, __webpack_require__) {
    eval(...); // contents of ./src/a.js
  },
  "./src/common1.js": ...,
  "./src/common2.js": ...
});

As you can see, the entry file built by webpack is an immediately invoked function whose argument is an object holding the modules that the entry imports synchronously. Each key is the module path, and each value is a function that runs the corresponding module code through eval. Several variables and functions in this entry function are important.

  • webpackJsonpCallback function. The callback that runs when an asynchronous chunk has finished loading.
  • installedModules variable. Caches loaded modules. Modules loaded synchronously or asynchronously all go into this cache. The key is the module ID, and the value is an object {i: module ID, l: boolean indicating whether the module has been loaded, exports: the exported value of the module}.
  • installedChunks variable. Caches the loading state of chunks. The states are: 0 means the chunk has finished loading, undefined means the chunk has not been loaded yet, null means the chunk is preloaded/prefetched, and a Promise means the chunk is loading.
  • jsonpScriptSrc function. Returns the JS URL of an asynchronous chunk. If output.publicPath is set (usually a CDN domain), it is stored in __webpack_require__.p and prepended to the final URL.
  • __webpack_require__ function. Called for a synchronous import.
  • __webpack_require__.e function. Called for an asynchronous import.

After the build, each module becomes a function of the following form. The parameter module holds the state of the current module (whether it has finished loading, its exported value, its ID, and so on, as described below), __webpack_exports__ is the exports object of the current module, and __webpack_require__ is the __webpack_require__ function of the entry chunk, used to import other modules.

function(module, __webpack_exports__, __webpack_require__) {
  "use strict";
  eval("module code..."); // (1)
}

The code inside eval looks like this, using a.js as an example.

// (1)
// After formatting as JS
__webpack_require__.r(__webpack_exports__);
var _common1__WEBPACK_IMPORTED_MODULE_0__ = __webpack_require__(
  "./src/common1.js"
);
var _common2__WEBPACK_IMPORTED_MODULE_1__ = __webpack_require__(
  "./src/common2.js"
);
// _common1__WEBPACK_IMPORTED_MODULE_0__ is the exported object
// Execute the exported common1 method
// Source JS:
// import { common1 } from "./common1";
// common1();
Object(_common1__WEBPACK_IMPORTED_MODULE_0__["common1"])();
Object(_common2__WEBPACK_IMPORTED_MODULE_1__["common2"])();
__webpack_require__
  .e("asyncCommon2")
  .then(__webpack_require__.bind(null, "./src/asyncCommon2.js"))
  .then(({ asyncCommon2 }) => {
    asyncCommon2();
    console.log("done");
  });

From this we know:

  • A synchronous import is eventually converted to a call to the __webpack_require__ function
  • An asynchronous import is eventually converted to a call to the __webpack_require__.e method

The whole execution flow is as follows.

The entry file first loads the entry JS with __webpack_require__((__webpack_require__.s = "./src/a.js")) (as shown above, the installedChunks variable starts out as {a: 0}) and executes the code of a.js through eval.

__webpack_require__ is arguably the function seen most often in webpack's build output, so what does it do?

function __webpack_require__(moduleId) {
  // If the module has already been imported, return its cached exports
  if (installedModules[moduleId]) {
    return installedModules[moduleId].exports;
  }
  // If it has not been loaded before, register it in installedModules
  var module = (installedModules[moduleId] = {
    i: moduleId,
    l: false,
    exports: {}
  });
  // Execute the module function
  modules[moduleId].call(
    module.exports,
    module,
    module.exports,
    __webpack_require__
  );

  // Mark the module as loaded
  module.l = true;

  // Return the exported value of the module
  return module.exports;
}

This should be straightforward: the function receives a moduleId, which corresponds to a key of the object passed to the immediately invoked function. If the module has been loaded before, its exported value is returned directly. If not, the module is executed, cached under its moduleId in installedModules, and its exported value is returned. So in webpack's output, a module that is imported multiple times is executed only once. Another point is that inside webpack-built modules, import and require behave the same by default: both are eventually converted to __webpack_require__.

Back to a classic question: what happens when circular references occur in a webpack environment? Say a.js has import x from './b.js' and b.js has import x from './a.js'. The __webpack_require__ analysis above makes this easy to see. Before a module is executed, webpack already registers it in installedModules. So when a.js runs and imports b.js, and b.js in turn imports a.js, b.js gets whatever a.js has exported up to that point in its execution (a.js is already registered in installedModules, so it is not executed again).

Once the synchronous imports have been loaded, the entry chunk carries on executing a.js.

Next, back to the part of the a.js module code, executed inside eval, that loads JS asynchronously.

// a.js module
__webpack_require__
  .e("asyncCommon2")
  // (1) By now the asynchronous module has been injected into the `modules` argument of the
  // immediately invoked function, so it is fetched with `__webpack_require__` just like a synchronous `import`
  .then(__webpack_require__.bind(null, "./src/asyncCommon2.js"))
  .then(({ asyncCommon2 }) => {
    // (2) Get the module and run the related logic
    asyncCommon2();
    console.log("done");
  });

All __webpack_require__.e does is load the asynchronous chunk file for the given chunkId and return a promise. The file is loaded JSONP-style through a script tag. Even if the function is called multiple times for the same chunk, the JS is requested only once. Once the chunk is loaded, its modules have been injected into the modules argument of the immediately invoked function (the webpackJsonpCallback function does this injection), so from then on they are obtained by calling __webpack_require__, exactly as for a synchronous import. The promise callback calls __webpack_require__.bind(null, "./src/asyncCommon2.js") (1) to retrieve the module and then runs the related logic (2).

// __webpack_require__.e: the function called for asynchronous imports
// Recap of the chunk states mentioned above
// 0: the chunk has finished loading
// undefined: the chunk is not loaded yet
// null: the chunk is preloaded/prefetched
// Promise: the chunk is loading
var installedChunks = {
  a: 0
};

__webpack_require__.e = function requireEnsure(chunkId) {
  // ... only the core code is kept
  var promises = [];
  var installedChunkData = installedChunks[chunkId];
  if (installedChunkData !== 0) {
    // The chunk has not finished loading
    if (installedChunkData) {
      // The chunk is loading
      // Keep waiting on the pending promise, so the chunk is only loaded once
      promises.push(installedChunkData[2]);
    } else {
      // The chunk is not loaded yet
      // Load the corresponding JS with a script tag
      var promise = new Promise(function(resolve, reject) {
        installedChunkData = installedChunks[chunkId] = [resolve, reject];
      });
      promises.push((installedChunkData[2] = promise));

      // Start chunk loading
      var script = document.createElement("script");
      var onScriptComplete;

      script.src = jsonpScriptSrc(chunkId);
      document.head.appendChild(script);
      // ...
    }
  }
  // The promise is resolved in webpackJsonpCallback
  return Promise.all(promises);
};
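The load-once behaviour can be checked with a simulation that runs in Node.js. This is a sketch, not the real runtime: fakeLoadScript stands in for appending a script tag, and chunkArrived stands in for the part of webpackJsonpCallback that resolves the promise. Both names are hypothetical.

```javascript
const installedChunks = {};
let scriptRequests = 0;

// Hypothetical stand-in for creating a <script> tag: it only counts requests.
function fakeLoadScript(chunkId) {
  scriptRequests += 1;
}

function ensureChunk(chunkId) {
  const promises = [];
  let chunkData = installedChunks[chunkId];
  if (chunkData !== 0) {
    if (chunkData) {
      // Already loading: reuse the pending promise instead of requesting again.
      promises.push(chunkData[2]);
    } else {
      // Not requested yet: store [resolve, reject, promise] and start loading.
      const promise = new Promise((resolve, reject) => {
        chunkData = installedChunks[chunkId] = [resolve, reject];
      });
      promises.push((chunkData[2] = promise));
      fakeLoadScript(chunkId);
    }
  }
  return Promise.all(promises);
}

// Hypothetical stand-in for the chunk script executing after download.
function chunkArrived(chunkId) {
  const chunkData = installedChunks[chunkId];
  installedChunks[chunkId] = 0; // mark as loaded
  if (chunkData) chunkData[0](); // resolve the pending promise
}

const first = ensureChunk("asyncCommon2");  // triggers the only request
const second = ensureChunk("asyncCommon2"); // reuses the pending promise
chunkArrived("asyncCommon2");
const third = ensureChunk("asyncCommon2");  // already loaded, nothing to wait for

Promise.all([first, second, third]).then(() => {
  console.log("script requests:", scriptRequests); // script requests: 1
});
```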

Now look at the code structure of the asynchronously loaded asyncCommon1 chunk. All it does is push an array onto the global jsonpFunction array (note that this push is not the native Array push: the entry chunk has replaced it with the webpackJsonpCallback function). The pushed array consists of the chunk names and the module object of that chunk.

// asyncCommon1 chunk
(window["jsonpFunction"] = window["jsonpFunction"] || []).push([
  ["asyncCommon1"],
  {
    "./src/asyncCommon1.js": function(module, __webpack_exports__, __webpack_require__) {
      eval("module code...");
    }
  }
]);

webpackJsonpCallback runs when the asynchronous chunk requested through the script tag comes back (naturally, since the returned code executes the push call in the asynchronous chunk). This is easy to see from the asynchronous chunk code above and the webpackJsonpCallback code below. webpackJsonpCallback does a few things:

1. Sets the state of the asynchronous chunk to 0, indicating that the chunk has been loaded: installedChunks[chunkId] = 0;

2. Resolves the chunk-loading promise created in __webpack_require__.e.

3. Mounts the modules of the asynchronous chunk onto the modules argument of the entry chunk's immediately invoked function, where __webpack_require__ can find them. This process was mentioned in the analysis of the a.js module above.

function webpackJsonpCallback(data) {
  var chunkIds = data[0];
  var moreModules = data[1];
  var moduleId,
    chunkId,
    i = 0,
    resolves = [];
  for (; i < chunkIds.length; i++) {
    chunkId = chunkIds[i];
    if (
      Object.prototype.hasOwnProperty.call(installedChunks, chunkId) &&
      installedChunks[chunkId]
    ) {
      resolves.push(installedChunks[chunkId][0]);
    }
    // Mark the current chunk as loaded
    installedChunks[chunkId] = 0;
  }
  for (moduleId in moreModules) {
    if (Object.prototype.hasOwnProperty.call(moreModules, moduleId)) {
      // Mount the modules of the asynchronous chunk onto the `modules` argument
      // of the entry chunk's immediately invoked function
      modules[moduleId] = moreModules[moduleId];
    }
  }
  // Call the previous jsonpFunction push.
  // It is normally the native Array push, but this call is essential when names
  // collide: it keeps some modules from never being loaded
  if (parentJsonpFunction) parentJsonpFunction(data);

  while (resolves.length) {
    // Resolve the chunk-loading promises created in __webpack_require__.e
    resolves.shift()();
  }
}
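How push ends up pointing at the callback, and where parentJsonpFunction comes from, can be sketched as follows. This is a simplified simulation, assuming the setup order used by webpack 4's generated runtime. fakeWindow and jsonpCallback are illustrative names, and the chunk contents are made up.

```javascript
const fakeWindow = {};      // stands in for the browser's window
const modules = {};         // stands in for the entry function's modules argument
const installedChunks = {};

let parentJsonpFunction;    // assigned at the end of the setup below

function jsonpCallback(data) {
  const [chunkIds, moreModules] = data;
  for (const chunkId of chunkIds) installedChunks[chunkId] = 0;
  Object.assign(modules, moreModules);
  // Forward to the previous push so the data is also stored in the shared array.
  if (parentJsonpFunction) parentJsonpFunction(data);
}

// 1. A chunk script that happens to run before the entry: a plain Array push.
(fakeWindow["jsonpFunction"] = fakeWindow["jsonpFunction"] || []).push([
  ["early"],
  { "./early.js": function () {} }
]);

// 2. Entry setup: replace push, then replay whatever was queued so far.
const jsonpArray = (fakeWindow["jsonpFunction"] = fakeWindow["jsonpFunction"] || []);
const oldPush = jsonpArray.push.bind(jsonpArray);
jsonpArray.push = jsonpCallback;
jsonpArray.slice().forEach((data) => jsonpCallback(data));
parentJsonpFunction = oldPush;

// 3. A chunk script that runs later: push is now jsonpCallback.
(fakeWindow["jsonpFunction"] = fakeWindow["jsonpFunction"] || []).push([
  ["late"],
  { "./late.js": function () {} }
]);

console.log(Object.keys(modules), jsonpArray.length);
```

Both chunks end up in modules, and the shared array still records both pushes. If a second runtime on the page used the same global name, the push it saved as its parent would be the first runtime's callback, which is why a chunk pushed to one runtime also reaches the other.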

Summary:

1. After the webpack build, the module files in each chunk are combined into an object of the following shape

{
  [moduleName:string] : function(module, __webpack_exports__, __webpack_require__){
    eval('Module file source code')
  }
}

2. When several webpack builds run on the same page, try to keep their output.jsonpFunction names from colliding. A collision usually does not break anything. It simply mounts some module code asynchronously loaded by the other webpack builds onto the modules argument of this immediately invoked function. (This may increase memory use somewhat?)

3. Each entry chunk is an immediately invoked function of the following form

(function(modules){
  //....
})({
  [moduleName:string] : function(module, __webpack_exports__, __webpack_require__){
    eval('Module file source code')
  }
})

4. Behind asynchronous loading is simply the use of script tags to load code

5. Asynchronous loading is not that mysterious, and it pays off most when the project is large enough

(My knowledge is limited. If there are mistakes, corrections are welcome.)