Preface

I have always used scaffolding tools for Vue project development. Webpack was a black box to me, and whenever a problem came up I had no idea where to start. So, with Webpack 5.0 recently released, I took the opportunity to learn it thoroughly, and this article summarizes what I learned. The overall outline is shown below. This article is the advanced part. For the basic part, see "Webpack5.0 learning summary – basics".

Take a peek at webpack principles

How to develop a loader

A loader is essentially a function: it receives the content of each matched source file, processes it, and returns the result. When a rule uses several loaders, they run from bottom to top (last to first), and each loader receives the output of the one that ran before it, much like a chain call. So when developing a loader, the main things to care about are its input and its output. Here is a step-by-step example of how to develop a loader:

  1. Introduce your own loader in the Webpack configuration file and use it in a rule.
  2. Write a custom loader.
  3. Compare the bundle file (main.js) before and after using loader to verify the loader effect.

Let's first clarify what the loader is meant to do. To keep the focus on the writing process, this example implements something simple: deleting comments from JS files.
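
The bottom-to-top order described above amounts to function composition. Here is a minimal sketch of that idea, assuming two made-up loaders (`upperCaseLoader`, `addBannerLoader`) and a hypothetical `runChain` helper. Webpack's real loader runner does considerably more (asynchronous loaders, source maps and so on), so treat this only as an illustration of the ordering.

```javascript
// Two toy loaders: each takes source text and returns transformed text.
const upperCaseLoader = (source) => source.toUpperCase();
const addBannerLoader = (source) => "/* banner */\n" + source;

// Hypothetical helper: applies loaders the way a rule lists them,
// i.e. the LAST entry runs first and feeds its output to the one before it.
function runChain(loaders, source) {
    return loaders.reduceRight((content, loader) => loader(content), source);
}

// Listed top to bottom as in a rule: addBannerLoader, then upperCaseLoader.
const output = runChain([addBannerLoader, upperCaseLoader], "let a = 1;");
console.log(output);
```

Because `upperCaseLoader` is listed last, it runs first, so the banner added afterwards keeps its lower-case text.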

1. Import the loader in the configuration file

The loader is referenced in webpack.config.js. resolveLoader configures the paths in which Webpack searches for loaders. If resolveLoader is not configured, the loader field in the rule must contain the full path to the loader file.

// webpack.config.js

const path = require("path");
module.exports = {
    // Mode "none" enables no default configuration, so Webpack's own processing does not interfere with the loader's effect.
    mode: "none",
    /* Rules for resolving loaders */
    resolveLoader: {
        // The default directory is node_modules, which is where third-party loaders (such as babel-loader) are installed
        // Add the loaders folder to the search path. Directories are searched from front to back
        modules: ["node_modules", path.resolve(__dirname, "loaders")],
    },
    module: {
        rules: [{
            test: /\.js$/,
            // myLoader is found in the loaders folder because resolveLoader is configured
            loader: "myLoader",
            options: {
                oneLine: true, // Whether to remove single-line comments
                multiline: true, // Whether to remove multi-line comments
            },
        }],
    },
};

2. Write a custom loader

// myLoader.js

module.exports = function (source) {
    // Starting with Webpack 5, loader-utils is no longer needed to read options:
    // this.getOptions() returns the options configured in webpack.config.js
    let options = this.getOptions();
    let result = source;
    // By default, both single-line and multi-line comments are removed
    const defaultOption = {
        oneLine: true,
        multiline: true,
    }
    options = Object.assign({}, defaultOption, options);
    if (options.oneLine) {
        // Remove single-line comments
        result = result.replace(/\/\/.*/g, "")
    }
    if (options.multiline) {
        // Remove multi-line comments ([\s\S] also matches line breaks)
        result = result.replace(/\/\*[\s\S]*?\*\//g, "")
    }
    // A loader must return its output, otherwise the Webpack build fails
    return result
}
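
One way to try such a loader without running a build is to call it with a stand-in `this`. The sketch below assumes the comment-stripping logic is copied into a local function (`stripComments` is a name chosen here) and that `getOptions` is the only part of the loader context it touches. The real loader context provides much more.

```javascript
// Same logic as the loader above, inlined so the snippet runs on its own.
function stripComments(source) {
    const options = Object.assign({ oneLine: true, multiline: true }, this.getOptions());
    let result = source;
    if (options.oneLine) {
        result = result.replace(/\/\/.*/g, "");
    }
    if (options.multiline) {
        result = result.replace(/\/\*[\s\S]*?\*\//g, "");
    }
    return result;
}

// Stand-in for the loader context: only getOptions is provided.
const fakeContext = { getOptions: () => ({ multiline: false }) };

const input = "/* keep me */\nconst x = 100; // drop me\n";
const output = stripComments.call(fakeContext, input);
console.log(JSON.stringify(output));
```

With `multiline: false` the block comment survives and only the inline comment is removed. Note that these regular expressions also match `//` inside string literals (a URL, for example), so they are suitable for a demonstration rather than for production code.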

3. Verify the loader effect by comparing the bundle output.

To keep the comparison clear and concise, the contents of the source code index.js are very simple.

  • The source code
// index.js

/* Add multi-line comments for testing */
const x = 100;
let y = x; // Inline single-line test
// Single-line comment test
console.log(y);

  • Output file without the loader: the comments in the source code remain.
// main.js

/******/ (function() { // webpackBootstrap
var __webpack_exports__ = {};
/* Add multi-line comments for testing */
const x = 100;
let y = x; // Inline single-line test
// Single-line comment test
console.log(y);

/******/ })()
;
  • Output file with the loader: the comments in the source code have been removed, so the loader takes effect.
// main.js

/******/ (function() { // webpackBootstrap
var __webpack_exports__ = {};

const x = 100;
let y = x; 

console.log(y);

/******/ })()
;

The above is the basic process of writing a loader. There are a few additional notes:

  • Validating options: the third-party library schema-utils can be used to validate the options.
  • Synchronous and asynchronous: loaders can be synchronous or asynchronous, and some scenarios require an asynchronous loader, written as follows:
module.exports = function (source) {
    // Generate an asynchronous callback function.
    const callback = this.async();
    setTimeout(() => {
        // The first argument to the callback is an error message, the second argument is the output, and the third argument is source-map
        callback(null, source);
    }, 1000);
};
  • When developing a loader, try to keep its responsibilities simple. That is, a loader does only one task. This makes the Loader easier to maintain and reusable in more scenarios.
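
The `this.async()` contract from the notes above can also be tried in isolation. This is a sketch with a stand-in context: `delayedLoader` and `runAsyncLoader` are names made up here, and the fake `async()` only mirrors the (error, content) part of the callback described above.

```javascript
// A loader that finishes asynchronously, in the style shown above.
function delayedLoader(source) {
    const callback = this.async();
    setTimeout(() => callback(null, source.trim()), 10);
}

// Hypothetical driver: provides a minimal this.async() and wraps the
// callback in a Promise so the result can be consumed with then/await.
function runAsyncLoader(loader, source) {
    return new Promise((resolve, reject) => {
        const context = {
            async: () => (err, content) => (err ? reject(err) : resolve(content)),
        };
        loader.call(context, source);
    });
}

const pending = runAsyncLoader(delayedLoader, "  const x = 1;  ");
pending.then((content) => console.log(JSON.stringify(content)));
```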

How to develop a plug-in

Webpack's packaging process is like a product assembly line that proceeds one step at a time. Plug-ins are additional features inserted at various stages of this line, and Webpack uses them to extend its capabilities. Before getting to the examples, let's take a quick look at how plug-ins are inserted at the different stages of packaging. Webpack uses the Tapable utility class for this, and both the Compiler and Compilation classes extend it.

Tapable overview

Tapable works much like the publish-subscribe pattern: different plug-ins can subscribe to the same event, and when Webpack reaches that event it is dispatched to every registered plug-in. Tapable provides many types of hooks, both synchronous and asynchronous, and they are registered in different ways. Synchronous hooks are registered with tap; asynchronous hooks can additionally be registered with tapAsync or tapPromise, the difference being that the former signals completion through a callback and the latter through a Promise. Hooks also come in several subtypes. A Bail hook, for example, stops calling the remaining registered callbacks as soon as one of them returns a value other than undefined. The following file-reading example shows Tapable hooks in use:

const { SyncHook, AsyncSeriesHook } = require("tapable");
const fs = require("fs");

// Container that holds the hooks
const hooks = {
    // Synchronous hook. The array names the arguments passed to the registered callbacks.
    beforeRead: new SyncHook(["param"]),
    afterRead: new AsyncSeriesHook(["param"]) // Asynchronous hook, callbacks run in series
}
// Subscribe to beforeRead
hooks.beforeRead.tap("name", (param) => {
    console.log(param, "beforeRead callback triggered");
})
// Subscribe to afterRead
hooks.afterRead.tapAsync("name", (param, callback) => {
    console.log(param, "afterRead callback triggered");
    setTimeout(() => {
        // Signal that this callback has finished
        callback()
    }, 1000);
})

// Call beforeRead before reading the file; the registered callbacks run synchronously in registration order
hooks.beforeRead.call("Start reading")
fs.readFile("package.json", (err, data) => {
    if (err) {
        throw new Error(err)
    }
    // Call the afterRead hook after the file has been read
    hooks.afterRead.callAsync(data, () => {
        // Called once all registered callbacks have finished
        console.log("afterRead end~");
    })
})

At both stages of reading the file, the corresponding hook is called, and every callback registered on it is notified. Once all of them have finished, execution continues with the next step.
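
To make the subscribe-and-call mechanics concrete without installing anything, here is a simplified stand-in for a synchronous hook and a bail hook. The classes `MiniSyncHook` and `MiniBailHook` are illustrative names invented for this sketch and are not Tapable's implementation. They only mimic the shape of `tap` and `call`.

```javascript
// Simplified stand-in for a sync hook: callbacks run in registration order.
class MiniSyncHook {
    constructor() {
        this.taps = [];
    }
    tap(name, fn) {
        this.taps.push({ name, fn });
    }
    call(...args) {
        for (const { fn } of this.taps) fn(...args);
    }
}

// Bail variant: stops as soon as a callback returns something other than undefined.
class MiniBailHook extends MiniSyncHook {
    call(...args) {
        for (const { fn } of this.taps) {
            const result = fn(...args);
            if (result !== undefined) return result;
        }
        return undefined;
    }
}

const log = [];
const hook = new MiniBailHook();
hook.tap("first", (file) => { log.push("first:" + file); });
hook.tap("second", (file) => { log.push("second:" + file); return "stop"; });
hook.tap("third", (file) => { log.push("third:" + file); });

const bailed = hook.call("package.json");
console.log(log, bailed);
```

The third callback never runs, because the second one returned a value and the bail hook stopped there.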

Custom plug-in writing

A plug-in is essentially a constructor (or class) whose prototype must have an apply method. After Webpack initializes the Compiler object, it calls the apply method of each plug-in instance and passes in the Compiler object. The plug-in can then register callbacks on the Compiler hooks it is interested in, and Webpack triggers them when the build reaches the corresponding stage. Here are two simple plug-in examples to demonstrate this process.
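
Before the real examples, the apply handshake itself can be sketched with a stand-in compiler. `FakeCompiler` and `HelloPlugin` are hypothetical names used only here. The real Compiler builds its hooks with Tapable and has many more stages than the single `done` stage assumed in this sketch.

```javascript
// Stand-in compiler with a single "done" stage.
class FakeCompiler {
    constructor(plugins) {
        this.listeners = { done: [] };
        this.hooks = {
            done: { tap: (name, fn) => this.listeners.done.push(fn) },
        };
        // Mirrors the step described above: each plug-in's apply receives the compiler.
        plugins.forEach((plugin) => plugin.apply(this));
    }
    run() {
        const stats = { assets: ["main.js"] };
        // Reaching the "done" stage triggers every registered callback.
        this.listeners.done.forEach((fn) => fn(stats));
    }
}

class HelloPlugin {
    constructor() {
        this.seen = [];
    }
    apply(compiler) {
        compiler.hooks.done.tap("HelloPlugin", (stats) => {
            this.seen.push(...stats.assets);
        });
    }
}

const helloPlugin = new HelloPlugin();
new FakeCompiler([helloPlugin]).run();
console.log(helloPlugin.seen);
```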

Plugin 1: Delete files in the output folder

This imitates CleanWebpackPlugin but does not delete folders: removing a non-empty folder requires recursion, and a full implementation of CleanWebpackPlugin would need it. Since the goal here is to demonstrate how a plug-in is written, it is simplified to delete files only.

// RmFilePlugin.js

const path = require("path");
const fs = require("fs");

class RmFilePlugin {
    constructor(options = {}) {
        // Options for the plug-in
        this.options = options;
    }
    // Webpack automatically calls the plug-in's apply method, passing it the Compiler parameter
    apply(compiler) {
        // Get all webpack configurations
        const webpackOptions = compiler.options;
        // Context is the execution environment of Webpack (execution folder path)
        const { context } = webpackOptions
        // Register events on the Compiler object's beforeRun hook
        compiler.hooks.beforeRun.tap("RmFilePlugin", (compiler) => {
            // Get the output path of the bundle
            const outputPath = webpackOptions.output.path || path.resolve(context, "dist");
            // Nothing to delete if the output folder does not exist yet (e.g. on the first build)
            if (!fs.existsSync(outputPath)) return;
            const fileList = fs.readdirSync(outputPath, { withFileTypes: true });
            fileList.forEach(item => {
                // Delete files only; folders are not deleted recursively, to keep the logic simple
                if (item.isFile()) {
                    const delPath = path.resolve(outputPath, item.name)
                    fs.unlinkSync(delPath);
                }
            })
        });
    }
}
// Export the plugin
module.exports = RmFilePlugin;
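
The folder case that the plug-in skips can be handled with `fs.rmSync`, which accepts the `recursive` and `force` options in Node 14.14 and later. The sketch below works on a throwaway folder in the OS temp directory rather than a real output folder. `emptyDir` is a helper name chosen here, not part of CleanWebpackPlugin.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: remove everything inside dir, but keep dir itself.
function emptyDir(dir) {
    for (const entry of fs.readdirSync(dir)) {
        // recursive: true also removes non-empty sub-folders
        fs.rmSync(path.join(dir, entry), { recursive: true, force: true });
    }
}

// Build a small tree in the OS temp folder to try it on.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "rm-demo-"));
fs.writeFileSync(path.join(outDir, "main.js"), "");
fs.mkdirSync(path.join(outDir, "assets"));
fs.writeFileSync(path.join(outDir, "assets", "logo.txt"), "");

emptyDir(outDir);
const remaining = fs.readdirSync(outDir);
console.log(remaining);
fs.rmdirSync(outDir); // clean up the now-empty temp folder
```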

This example is very simple and only uses the compiler object. In most cases plug-in development also needs the compilation object. How do the two differ?

  • As I understand it, Compiler represents the whole life cycle of Webpack from startup to shutdown. Its hooks relate to the run of Webpack itself, such as whether the build environment is ready or whether compilation has started. Compilation focuses on a single compilation, and its hooks cover the details of that compilation, such as modules being loaded, optimized, and split into chunks.

The following example uses the compilation object.

Plugin 2: Remove JS comments

The loader above already implements this function. It is implemented again here as a plug-in to show that a plug-in can also do what that loader does, and can do it more thoroughly.

// DelCommentPlugin.js

const { sources } = require('webpack');

class DelCommentPlugin {
    constructor(options) {
        this.options = options
    }

    apply(compiler) {
        // Register a callback that runs after the compilation object is created
        compiler.hooks.compilation.tap("DelCommentPlugin", (compilation) => {
            // Process the assets
            compilation.hooks.processAssets.tap(
                {
                    name: 'DelCommentPlugin', // Plug-in name
                    // The stage at which the assets should be processed
                    stage: compiler.webpack.Compilation.PROCESS_ASSETS_STAGE_PRE_PROCESS,
                },
                (assets) => {
                    for (const name in assets) {
                        // Only JS assets are processed
                        if (name.endsWith(".js")) {
                            if (Object.hasOwnProperty.call(assets, name)) {
                                const asset = compilation.getAsset(name); // Get the asset by its name
                                const contents = asset.source.source().toString(); // Get the asset's content as a string
                                const result = contents.replace(/\/\/.*/g, "").replace(/\/\*[\s\S]*?\*\//g, ""); // Delete comments
                                // Update the content of the asset
                                compilation.updateAsset(
                                    name,
                                    new sources.RawSource(result)
                                );
                            }
                        }
                    }
                }
            );
        })
    }
}

module.exports = DelCommentPlugin

As with loader, compare the output after using this plug-in.

// main.js

 (function() { 
var __webpack_exports__ = {};

const x = 100;
let y = x; 

console.log(y);



 })()
;

Clearly the comments are removed without problems. Note that the plug-in removes every comment in main.js, including those generated by Webpack, whereas the loader can only remove comments from the source files. A plug-in can change the content of the final output bundle directly.

Writing a simple Webpack by hand

Webpack is a Node application, so at its core it is a large piece of JS code running in a Node environment. Using it looks something like this.

// built.js

const myWebpack = require("../lib/myWebpack");
// Load the custom configuration
const config = require("../config/webpack.config.js");


const compiler = myWebpack(config);
// Start webpack packing
compiler.run();


The config is passed to the myWebpack function, which constructs a Compiler object, and then the Compiler's run method is executed. The run method does two things. First, it finds and records all dependencies starting from the entry file. Second, it assembles the output bundle function as a string. The code for the two steps is analyzed below:

Analyze the dependency table from the entry file

// myWebpack.js

const fs = require("fs");
const path = require("path");
const babelParser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const { transformFromAstSync } = require("@babel/core")

// Compiler constructor
class Compiler {
    constructor(options = {}) {
        this.options = options; // Get the WebPack configuration
        this.entry = this.options.entry || "./src/index.js" // Get the entry file. If not, use the default value
        this.entryDir = path.dirname(this.entry) 
        this.depsGraph = {}; // Dependency table, output of the first step
    }
    // Start webpack packaging
    async run() {
        const { entry, entryDir } = this
        // Get module information from the entry file
        this.getModuleInfo(entry, entryDir);
        console.log(this.depsGraph);
        // Generate the build content after obtaining the module information, the content of the second step, first comment.
        // this.outputBuild()
    }
    // Obtain module information according to the file path
    getModuleInfo(modulePath, dirname) {
        const { depsGraph } = this
        /* The fs module reads the file content by path, and the dependencies between modules can then be worked out from that content (its import and export statements). Doing this by hand is possible, but for convenience the @babel/parser library is used to generate an AST (abstract syntax tree), which represents the code in a form that is easier to work with. */
        const ast = getAst(modulePath);
        // Use the AST and the @babel/traverse library to get the module's dependencies. The principle is to analyze the "import" statements in the code.
        const deps = getDeps(ast, dirname);
        // Use the AST and @babel/core to compile the source code. The transform method of @babel/core can also compile the source directly, without an AST
        const code = getParseCode(ast)
        // depsGraph stores module information that is the source code and its dependencies
        depsGraph[modulePath] = {
            deps,
            code
        }
        // If the module has dependency deps, it continues recursively to find the dependencies below it, so that the loop finds all dependencies at the beginning of the entry file.
        if (Object.keys(deps).length) {
            for (const key in deps) {
                if (Object.hasOwnProperty.call(deps, key)) {
                    // Get module information recursively
                    this.getModuleInfo(deps[key], dirname)
                }
            }
        }
    }
}

// Three utility functions used in getModuleInfo
// Get the abstract syntax tree from the file path
const getAst = (modulePath) => {
    // 1. Read the content of the file
    const file = fs.readFileSync(modulePath, "utf-8");
    // 2. Parse it into an AST (abstract syntax tree)
    const ast = babelParser.parse(file, {
        sourceType: "module", // Parse as an ES module (the parser's default is "script")
    });
    return ast
};
// Get dependencies from the AST
const getDeps = (ast, dirname) => {
    // The dependencies of this module
    const dependSet = {
    }
    // Use the traverse library to collect the dependencies. They could also be collected by hand from the AST or from the source code, but an off-the-shelf library is more convenient
    traverse(ast, {
        // traverse walks program.body in the AST and checks each statement's type
        // This function is triggered when the type is ImportDeclaration
        ImportDeclaration({ node }) {
            const relativePath = node.source.value // The relative path of the imported file
            const absolutePath = path.resolve(dirname, relativePath)
            dependSet[relativePath] = absolutePath // Record the file's absolute path in the dependencies
        }
    })
    return dependSet
};
// Get the final output code from the AST
const getParseCode = (ast) => {
    // Compile syntax that older browsers do not recognize
    // @babel/core can compile the AST directly into compatible code
    const { code } = transformFromAstSync(ast, null, {
        presets: ["@babel/preset-env"]
    })
    return code
}

// The myWebpack function exported by this module
const myWebpack = (config) => {
    return new Compiler(config);
};
module.exports = myWebpack;

If you run the built.js file above now, it prints the dependency table, which looks something like this.

depsGraph = {
    './src/index.js': {
        deps: {
            './add.js': 'E:\\study\\JavaScript\\webpack\\bWebpack\\principle\\myWebpack2\\src\\add.js',
            './sub.js': 'E:\\study\\JavaScript\\webpack\\bWebpack\\principle\\myWebpack2\\src\\sub.js'
        },
        code: '"use strict"; \n' +
            '\n' +
            'var _add = _interopRequireDefault(require("./add.js")); \n' +
            '\n' +
            'var _sub = _interopRequireDefault(require("./sub.js")); \n' +
            '\n' +
            'function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { default: obj }; }\n' +
            '\n' +
            'console.log((0, _add.default)(1, 2)); \n' +
            'console.log((0, _sub.default)(3, 1)); '
    },
    'E:\\study\\JavaScript\\webpack\\bWebpack\\principle\\myWebpack2\\src\\add.js': {
        deps: {},
        code: '"use strict"; \n' +
            '\n' +
            'Object.defineProperty(exports, "__esModule", {\n' +
            ' value: true\n' +
            '}); \n' +
            'exports.default = _default; \n' +
            '\n' +
            'function _default(x, y) {\n' +
            '  return x + y;\n' +
            '} '
    },
    'E:\\study\\JavaScript\\webpack\\bWebpack\\principle\\myWebpack2\\src\\sub.js': {
        deps: {},
        code: '"use strict"; \n' +
            '\n' +
            'Object.defineProperty(exports, "__esModule", {\n' +
            ' value: true\n' +
            '}); \n' +
            'exports.default = _default; \n' +
            '\n' +
            'function _default(x, y) {\n' +
            '  return x - y;\n' +
            '} '
    }
}
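
The recursive collection can also be sketched on a small in-memory set of files. Two things differ from the code above and are assumptions of this sketch: a regular expression stands in for the Babel AST traversal, and each import is resolved relative to the importing file, with a visited check so that circular imports do not recurse forever. The names `files`, `IMPORT_RE` and `collectDeps` are made up for the illustration.

```javascript
const path = require("path");

// In-memory stand-in for files on disk; a.js and b.js import each other (a cycle).
const files = {
    "/src/index.js": 'import add from "./a.js";\nimport sub from "./b.js";',
    "/src/a.js": 'import sub from "./b.js";\nexport default (x, y) => x + y;',
    "/src/b.js": 'import add from "./a.js";\nexport default (x, y) => x - y;',
};

// Regex stand-in for the AST traversal: finds `import ... from "..."`.
const IMPORT_RE = /import\s+[^'"]*from\s+["']([^"']+)["']/g;

function collectDeps(modulePath, graph = {}) {
    if (graph[modulePath]) return graph; // already visited: guards against cycles
    const deps = {};
    for (const match of files[modulePath].matchAll(IMPORT_RE)) {
        // Resolve relative to the importing file's own folder
        deps[match[1]] = path.posix.resolve(path.posix.dirname(modulePath), match[1]);
    }
    graph[modulePath] = { deps };
    Object.values(deps).forEach((dep) => collectDeps(dep, graph));
    return graph;
}

const graph = collectDeps("/src/index.js");
console.log(Object.keys(graph));
```

Each module appears in the table once, even though a.js and b.js import each other.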

The second step is to assemble and output the final bundle file from the dependency table.

Assembling the output function

Assembling the output function directly as a string can be confusing, so first write the function you want to output in a regular JS file. Because the code compiled by Babel follows CommonJS rules, this function implements require and exports internally, and it takes the dependency table as its argument.

(function (depsGraph) {
    // To load the entry file
    function require(module) {
        // Define the require function inside the module
        function localRequire(relativePath) {
            // To find the absolute path to import modules, use require loading
            return require(depsGraph[module].deps[relativePath])
        };
        // Define the exposed object
        var exports = {};
        /* The localRequire function has to be defined inside the module instead of using the require function directly. The reason is that the code transformed by Babel passes relative paths to require, so a layer of conversion to the keys of the dependency table is needed */
        (function (require, exports, code) {
            // code is a string and is executed using eval
            eval(code)
        })(localRequire, exports, depsGraph[module].code);

        // return as the return value of require
        // The following require function can get exposed content
        return exports;
    }
    // Load the entry file
    require("./src/index.js")
})(depsGraph);

This is the final bundle to output. Running this function with the dependency table obtained in the first step has the same effect as running the source code. The last thing to do is to assemble this function as a string in myWebpack.js. The complete code of myWebpack.js follows.
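
As a quick check of that claim before the full listing, the runtime can be run against a small hand-written dependency table. This is a sketch: the table below is invented for the illustration, its code strings only mimic the CommonJS style that Babel emits, and the runtime is wrapped in a function (`runBundle`, a name chosen here) that returns the entry module's exports so the result can be inspected.

```javascript
// Hand-written dependency table; the code strings mimic Babel's CommonJS output.
const depsGraph = {
    "./src/index.js": {
        deps: { "./add.js": "/abs/src/add.js" },
        code: 'var _add = require("./add.js"); exports.sum = _add.default(1, 2);',
    },
    "/abs/src/add.js": {
        deps: {},
        code: "exports.default = function (x, y) { return x + y; };",
    },
};

// Same runtime as above, wrapped so that it returns the entry module's exports.
function runBundle(graph, entry) {
    function require(module) {
        // Translate the relative path used in the module's code into a table key
        function localRequire(relativePath) {
            return require(graph[module].deps[relativePath]);
        }
        var exports = {};
        (function (require, exports, code) {
            eval(code);
        })(localRequire, exports, graph[module].code);
        return exports;
    }
    return require(entry);
}

const entryExports = runBundle(depsGraph, "./src/index.js");
console.log(entryExports.sum);
```

The entry module reaches add.js through `localRequire`, which is the conversion from relative path to table key that the comment in the runtime describes.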

MyWebpack complete source code

const fs = require("fs");
const path = require("path");

const babelParser = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const { transformFromAstSync } = require("@babel/core")


const myWebpack = (config) => {
    return new Compiler(config);
};

// Compiler constructor
class Compiler {
    constructor(options = {}) {
        this.options = options; // Get the WebPack configuration
        this.entry = this.options.entry || "./src/index.js" // Get the entry file. If not, use the default value
        this.entryDir = path.dirname(this.entry) 
        this.depsGraph = {}; // Dependency table, output of the first step
    }
    // Start webpack packaging
    async run() {
        const { entry, entryDir } = this
        // Get module information from the entry file
        this.getModuleInfo(entry, entryDir);
        // Generate the build content after obtaining the module information
        this.outputBuild()
    }
    // Obtain module information according to the file path
    getModuleInfo(modulePath, dirname) {
        const { depsGraph } = this
        const ast = getAst(modulePath);
        const deps = getDeps(ast, dirname);
        const code = getParseCode(ast)
        // depsGraph stores module information that is the source code and its dependencies
        depsGraph[modulePath] = {
            deps,
            code
        }
        // If the module has dependency deps, it continues recursively to find the dependencies below it, so that the loop finds all dependencies at the beginning of the entry file.
        if (Object.keys(deps).length) {
            for (const key in deps) {
                if (Object.hasOwnProperty.call(deps, key)) {
                    // Get module information recursively
                    this.getModuleInfo(deps[key], dirname)
                }
            }
        }
    }
    // Finally, use fs to output js files
    outputBuild() {
        // Assemble the runtime function shown earlier as a string
        const build = `(function (depsGraph) {
    function require(module) {
        function localRequire(relativePath) { return require(depsGraph[module].deps[relativePath]); }
        var exports = {};
        (function (require, exports, code) { eval(code); })(localRequire, exports, depsGraph[module].code);
        return exports;
    }
    require("${this.entry}");
})(${JSON.stringify(this.depsGraph)});`;
        let outputPath = path.resolve(this.options.output.path, this.options.output.filename)
        fs.writeFileSync(outputPath, build, "utf-8")
    }
}
// Get the abstract syntax tree from the file path
const getAst = (modulePath) => {
    // 1. Read the content of the file
    // Without the second argument readFileSync returns a Buffer; with "utf-8" it returns a string
    // Relative paths passed to fs are resolved against the working directory of the Node process,
    // here the myWebpack project folder (where package.json lives)
    const module = fs.readFileSync(modulePath, "utf-8");
    // 2. Parse it into an AST (abstract syntax tree)
    const ast = babelParser.parse(module, {
        sourceType: "module", // Parse as an ES module (the parser's default is "script")
    });
    return ast
};
// Get dependencies from the abstract syntax tree AST
const getDeps = (ast, dirname) = > {
    // Rely on the collection
    const dependSet = {
    }
    // Use the traverse library to collect the dependencies. They could also be collected by hand from the AST or from the source code, but an off-the-shelf library is more convenient
    traverse(ast, {
        // Run the program. Body inside the AST to determine the statement type
        // If type: ImportDeclaration fires the current function
        ImportDeclaration({ node }) {
            // Module relative path "./add.js"
            const relativePath = node.source.value
            const absolutePath = path.resolve(dirname, relativePath)
            dependSet[relativePath] = absolutePath
        }
    })
    return dependSet
};
// Get the final output code according to the abstract syntax tree
const getParseCode = (ast) => {
    // Compile syntax that older browsers do not recognize
    // @babel/core can compile the AST directly into compatible code
    const { code } = transformFromAstSync(ast, null, {
        presets: ["@babel/preset-env"]
    })
    return code
}


module.exports = myWebpack;

Conclusion

The code needed for a complete Webpack is enormous, and the above is only the simplest overall architecture. Even so, this part felt harder to write than the basics summary, and it may contain some mistakes. Corrections are welcome!