Background

In the last article I shared my experience of using webpack to build a Vue project, but I was still stuck at the stage of scaffolding projects with webpack and had no idea how webpack works internally. On top of that, articles on webpack internals are rare on the various forums. Some are clickbait about how to configure or optimize webpack; some copy the whole source code and add a few simple comments; and of course some are written by experts, which are good but obscure and hard to follow.

For these reasons I decided to study how webpack is implemented (it really was hard). But I believe that after reading this article, even a beginner will be able to understand the principles of webpack thoroughly.

Well, enough chit-chat. Let's first look at how webpack's official website defines it, and use that definition as our way in. Let's go~

Essentially, webpack is a static module bundler for modern JavaScript applications. When webpack processes an application, it recursively builds a dependency graph that contains every module the application needs, and then packages all of those modules into one or more bundles.

The keywords: recursive, dependencies, generating one or more bundles.

What is a recursive build, and what is a dependency? See the following figure.

Module a.js references module b.js, and module b.js references module c.js. In this case a, b, and c form a dependency chain. Why build recursively? Think about how a webpack configuration file is set up.

Whether it is a single entry or multiple entries, an entry in a configuration file has only one or more entry files.

For example, if a.js is set as the entry, webpack must package every module that a.js requires (b.js) together with it. But b.js has its own dependency (c.js), so dependencies have to be resolved recursively (whenever a module has dependencies, recurse into them). c.js is packaged first and merged with b.js, and the merged code is then merged with a.js to generate the final bundle.

Starting to see it? The example above is only the simplest analysis of how webpack resolves dependent modules starting from the entry. Now let's look at the code webpack actually generates for a project.

The project structure directory is as follows

├─ node_modules
├─ src
│  ├─ index.js
│  ├─ ticket.js
│  └─ price.js
└─ webpack.config.js

Webpack configuration items.

const path = require('path')
module.exports = {
  entry: './src/index.js',
  output: {
    filename: 'min-bundle.js',
    path: path.join(__dirname, 'dist')
  }
}

Module code

Analyzing the webpack bundle

There are three module files: index.js is webpack's entry file and depends on ticket.js, and ticket.js depends on price.js. After packaging, we want running the min-bundle.js generated by webpack to log "Disney ticket sale, ticket price is 299 RMB/person". So how does webpack do it?

The answer lies in the generated min-bundle.js. After removing as much irrelevant code as possible to make it easier to understand, the main code looks like this:
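The original listing did not survive in this copy, so what follows is a reconstruction based on the step-by-step analysis in the rest of this section, not real webpack output. Details that webpack actually emits, such as its module cache, are left out, and the variable `entryExports` is added only so the result can be inspected.

```javascript
// Simplified shape of min-bundle.js: a self-invoking function that receives the module map
const entryExports = (function (modules) {
  function __webpack_require__(moduleId) {
    // Every module gets its own module object with an exports property
    const module = { i: moduleId, exports: {} }
    // Run the module's wrapper function with this bound to module.exports
    modules[moduleId].call(module.exports, module, module.exports, __webpack_require__)
    return module.exports
  }
  // Start from the entry module
  return __webpack_require__('./src/index.js')
})({
  './src/index.js': function (module, exports, __webpack_require__) {
    eval(`const ticket = __webpack_require__("./src/ticket.js"); console.log('Disney ticket sale, ' + ticket);`)
  },
  './src/ticket.js': function (module, exports, __webpack_require__) {
    eval(`const price = __webpack_require__("./src/price.js"); module.exports = 'ticket price is ' + price.content;`)
  },
  './src/price.js': function (module, exports, __webpack_require__) {
    eval(`module.exports = {content: '299 RMB/person'};`)
  }
})
```

Running this logs the expected sentence once. The entry module exports nothing, so `entryExports` is an empty object.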

See? The generated min-bundle.js is a self-invoking function whose argument is an object. The keys are the project's entry file and the modules it depends on, namely ./src/index.js, ./src/ticket.js and ./src/price.js; each value is a function that wraps the corresponding module's code and runs it with eval. Inside the body of the self-invoking function there is a __webpack_require__ function. Let's analyze it step by step:

First execution

The self-invoking function directly returns __webpack_require__('./src/index.js'), so __webpack_require__ is executed. Its parameter is moduleId, which on the first execution is the project's entry file, ./src/index.js. Now look inside the function: it creates a module object.

const module = {
    i: moduleId,
    exports: {}
}

The point of this object is to give every module its own separate module object. It contains an exports object, so each module can use both module.exports and exports.

Next comes modules[moduleId].call(module.exports, module, module.exports, __webpack_require__). Don't be confused; follow along and we will go through this line piece by piece

  • What is modules? It is the argument passed into the self-invoking function, i.e. the object in the red box in the figure above

  • What about moduleId? On the first execution it is './src/index.js'

  • So the expression becomes modules['./src/index.js']. What does that mean? It looks up the key './src/index.js' in the modules object and gets its value, which is a function. And who does this point to? .call(module.exports, module, module.exports, __webpack_require__) takes four arguments

    arg1: `module.exports`, which sets what this points to (each module has its own module object)
    arg2: the `module` object, so that module.exports can be used inside the module
    arg3: `module.exports`, so that exports can be used inside the module
    arg4: the `__webpack_require__` function. Why? Because the functions stored under the keys of the modules object need to call `__webpack_require__()` to bring in their own dependencies. Where would they get that function from? It is passed in right here, and this is what makes the recursive calls to `__webpack_require__` possible
  • Start executing the function in the modules object whose key is './src/index.js'

    function(module, exports, __webpack_require__) {
       eval(`const ticket = __webpack_require__("./src/ticket.js"); console.log('Disney ticket sale, ' + ticket);`)
    }
  • The function calls __webpack_require__("./src/ticket.js"), so the whole process has to run again. That is expected, because index.js depends on ticket.js

Second execution

The argument of __webpack_require__(moduleId) is now './src/ticket.js' and the steps above repeat. When modules['./src/ticket.js'].call() runs, the function stored under the key './src/ticket.js' in the modules object is executed

function(module, exports, __webpack_require__) {
    eval(`const price = __webpack_require__("./src/price.js"); module.exports = 'ticket price is ' + price.content;`)
}

This function in turn depends on price.js

Third execution

The argument of __webpack_require__(moduleId) becomes './src/price.js', and the same steps run for the function stored under the key './src/price.js' in the modules object

function(module, exports, __webpack_require__) {
    eval(`module.exports = {content: '299 RMB/person'};`)
}

This time __webpack_require__ returns {content: '299 RMB/person'}

Finished? Of course not

price.js is finished, but ticket.js is still waiting for it, so the assignment happens now

Before price.js has been resolved recursively:
eval(`const price = __webpack_require__("./src/price.js"); module.exports = 'ticket price is ' + price.content;`)
After price.js has been resolved recursively, this is effectively:
eval(`const price = {content: '299 RMB/person'}; module.exports = 'ticket price is ' + price.content;`)

Don't worry, we are nearly there: ticket.js is done, but index.js is still waiting

Before ticket.js has been resolved recursively:
eval(`const ticket = __webpack_require__("./src/ticket.js"); console.log('Disney ticket sale, ' + ticket);`)
After ticket.js has been resolved recursively, this is effectively:
eval(`const ticket = 'ticket price is 299 RMB/person'; console.log('Disney ticket sale, ' + ticket);`)

At this point all dependent modules are resolved, and we are back at the code of the original self-invoking function: return __webpack_require__("./src/index.js").

By now __webpack_require__("./src/index.js") has done its job, and the console shows

'Disney ticket sale, ticket price is 299 RMB/person'

Just return, and you’re done! Congratulations!

Phase summary: webpack uses the name of each JS file as a key and the code of that file as the value, stores them one by one in an object, and passes that object in as the argument. It then implements a __webpack_require__ function that imports dependencies recursively.

That concludes the analysis of the bundle generated by webpack. As mentioned above, webpack stores the name of each JS file as a key and the code of that file as the value in an object that is passed in as the argument. So the question is: how does it do that internally? And if loaders and plugins are configured, how is the JS code in each module processed?

Now let's get hands-on and implement a mini webpack of our own to gain a deep understanding of how webpack's packaging, loaders and plug-ins work.

Project preparation

Create two new projects. One is the main program of min-pack, which you can think of as webpack. The other is the developer's own project: to use min-pack you first need some program code of your own, otherwise what would min-pack package?

Implementing min-pack

Preparation

  1. First create a new directory and name it min-pack

  2. Create a lib directory in the root directory and create Compiler.js inside it. This file does the parsing and packaging

  3. In the root directory, create a template directory with output.ejs inside. The EJS template is used to generate the packaged code

  4. Create a bin directory in the project and put the main program of the packaging tool, min-pack.js, in it

    #!/usr/bin/env node
    const path = require('path')
    
    // minpack.config.js is the configuration file of the developer's own project. webpack 4 defaults to zero configuration
    // Let's not do anything complicated here and simply fix the configuration file name as minpack.config.js
    // If you want to use my min-pack, you must have a minpack.config.js configuration file at the root of your project
    // Note: path.resolve resolves minpack.config.js against the developer's working directory
    const config = require(path.resolve('minpack.config.js'))
    
    // Import the main packaging program, Compiler.js
    const Compiler = require('../lib/Compiler')
    // Pass the configuration into the Compiler and call start() to begin packaging
    new Compiler(config).start()

    Note: the first line of the main program must be the #!/usr/bin/env node shebang, which tells the system to run the program with Node

  5. Configure the bin field in the project's package.json

    "bin": {
      "min-pack": "./bin/min-pack.js"
    }

    With this configuration, developers can run min-pack to package their own projects.
  6. Link the package globally with npm link for local testing. Once testing is complete, it can be published to npm for third-party developers to use

Once that is done, you can run min-pack in the other project, the one the developer wants to package. Compiler.js is not implemented yet, though, so nothing can be parsed or built. Let's implement the packaging.

Compiler class

What does Compiler.js do?

If you want to use my min-pack, you must have a minpack.config.js configuration file at the root of your project

  1. Compiler.js receives the contents of minpack.config.js and reads the entry value from it, for example ./src/index.js

  2. Use fs.readFileSync in the node module to read the module file and obtain the source code of the module file

  3. Convert the module source code into an AST syntax tree. What? What is an AST syntax tree?

    • In fact, when UglifyJS or Babel transforms code, what happens behind the scenes is manipulation of the JavaScript abstract syntax tree.
    • To put it simply, the AST exists so that JavaScript code can be manipulated efficiently and cleanly. In step 4 below, every require in the module source has to be replaced with __webpack_require__. How? With regular expressions? With string manipulation? That would be far too crude
  4. Replace every require in the source code with __webpack_require__ (why?)

    • Because the browser does not recognize require. Some will say: I import modules with import and export them with export const xx = 1 or export default {…}, there is no require anywhere. True, but Babel internally converts your import into require, and export and export default into exports, as shown in the figure

    • Recall that when we first analyzed the min-bundle.js generated by webpack, every require() in our entry file and all of its dependencies had been replaced with __webpack_require__. The bundle then implements __webpack_require__ itself; that function creates a module object containing exports: {}, so modules can export with exports or module.exports and import with __webpack_require__.

  5. Take the argument of each require() in the module file, i.e. the path of a dependency, and store it in an array named dependencies

  6. Use the relative path of the module file, i.e. ./src/xxx.js, as the key and the processed source code as the value, and store the pair in an object, which we will call modules for now.

    • Why store it in an object? Go back to the min-bundle.js analyzed above. It is a self-invoking function whose argument is exactly this modules object, and the function body uses __webpack_require__ to recursively call the value stored under each key of modules, i.e. the source code belonging to that key.
  7. After the first module file is parsed, if it has dependencies it is time to parse them too. In step 5 the paths of the dependencies were stored in the dependencies array, so iterate over that array and recurse from step 2 until the last module has no dependencies.

  8. At this point modules is an object with module paths as keys and module source code as values, as shown in the figure below

  9. Now that modules is available, how is the packaged code generated? Don't forget, we have the template output.ejs. Look inside the template:

    Familiar? Isn't it similar to the min-bundle.js generated by webpack that we analyzed at the start? All we have to do in Compiler.js is take the entry file path and the modules object we just generated and fill them into the template using EJS template syntax

  10. When the template is filled in, read the output path from the configuration file and write the rendered output.ejs result to the directory specified in the developer's project via fs.writeFileSync

  11. Packaging finished!
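Steps 2 to 8 can be tried out without installing anything. The sketch below makes two simplifications that the real Compiler does not: the project files live in an in-memory object instead of being read with fs.readFileSync, and a regular expression stands in for the Babel AST traversal (exactly the crude approach warned against above, used here only to keep the example self-contained).

```javascript
const path = require('path')

// In-memory stand-in for the project files
const files = {
  './src/index.js': `const ticket = require('./ticket.js'); console.log('Disney ticket sale, ' + ticket);`,
  './src/ticket.js': `const price = require('./price.js'); module.exports = 'ticket price is ' + price.content;`,
  './src/price.js': `module.exports = { content: '299 RMB/person' };`
}

const modules = {}

function depAnalyse(modulePath) {
  // Step 2: read the module source
  const source = files[modulePath]
  // Step 5: paths of the modules this one depends on
  const dependencies = []
  // Steps 3-4 (simplified): rewrite require('x') to __webpack_require__("./src/x")
  const code = source.replace(/require\((['"])(.+?)\1\)/g, (match, quote, request) => {
    const depPath = './' + path.posix.join('src', request)
    dependencies.push(depPath)
    return `__webpack_require__("${depPath}")`
  })
  // Step 6: module path as key, processed code as value
  modules[modulePath] = code
  // Step 7: recurse into every dependency
  dependencies.forEach((dep) => depAnalyse(dep))
}

depAnalyse('./src/index.js')
console.log(modules)
```

The resulting modules object has one key per module path, each holding code in which require has already been replaced, which is the shape step 8 describes.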

So in summary, the basic idea is

  • Recursively find the dependencies, parse each module into an AST, and change every require to __webpack_require__
  • Use the fs module to read every dependency
  • Use each module's relative path as the key and its code as the value, store them in an object, and use that object to generate the final bundle file
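The last point, turning the modules object into bundle text, is what output.ejs is for. The template itself is not shown in this copy, so the sketch below uses a plain template literal as a stand-in for EJS; `renderBundle` and `demoModules` are hypothetical names, and the demo entry exports a value instead of logging so the result can be checked.

```javascript
// Build the bundle text from the entry path and the modules object
function renderBundle(entry, modules) {
  // One "path: wrapper function" pair per module; JSON.stringify yields safe string literals
  const moduleEntries = Object.keys(modules).map((key) => {
    const wrapper = `function (module, exports, __webpack_require__) { eval(${JSON.stringify(modules[key])}) }`
    return `  ${JSON.stringify(key)}: ${wrapper}`
  })
  return `(function (modules) {
  function __webpack_require__(moduleId) {
    const module = { i: moduleId, exports: {} }
    modules[moduleId].call(module.exports, module, module.exports, __webpack_require__)
    return module.exports
  }
  return __webpack_require__(${JSON.stringify(entry)})
})({
${moduleEntries.join(',\n')}
})`
}

// Already-processed module code, as depAnalyse would store it
const demoModules = {
  './src/index.js': 'module.exports = __webpack_require__("./src/price.js").content',
  './src/price.js': "module.exports = { content: '299 RMB/person' }"
}

const bundleCode = renderBundle('./src/index.js', demoModules)
// Evaluate the generated text to confirm it is a working bundle
const bundleResult = new Function('return ' + bundleCode)()
console.log(bundleResult)
```

In min-pack the same two values, entry and modules, are handed to ejs.render instead, and the rendered string is written to disk.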

Implementation of the Compiler class

const path = require('path')
const fs = require('fs')
const ejs = require('ejs')
// Parse the AST syntax tree
const parser = require('@babel/parser')
// Maintains the state of the entire AST tree, replacing, deleting, and adding nodes
const traverse = require('@babel/traverse').default
// Convert the AST to code
const generator = require('@babel/generator').default

class Compiler {
  constructor(config) {
    this.config = config
    this.entry = config.entry
    // root: the absolute path of the directory where the min-pack command is executed
    this.root = process.cwd()
    this.modules = {}
  }
  
  /**
   * Package dependency analysis
   * @param {string} modulePath Absolute path of the current module
   * @param {string} relativePath Path of the current module relative to the project root
   */
  depAnalyse(modulePath, relativePath) {

    let self = this

    // 1. Read the code of the module file
    let source = fs.readFileSync(modulePath, 'utf-8')

    // 2. Declare a dependency array to store all dependencies of the current module
    let dependencies = []
    
    // 3. Convert the current module code to AST syntax
    let ast = parser.parse(source)

    // 4. Modify the AST syntax tree
    traverse(ast, {
      CallExpression(p) {
        
        if(p.node.callee.name === 'require') {

          p.node.callee.name = '__webpack_require__'
          
          // Extract and process the file path passed in require()
          p.node.arguments[0].value = './' + path.join('src', p.node.arguments[0].value)
          
          // Replace the backslashes (\) that path.join produces on Windows
          p.node.arguments[0].value = p.node.arguments[0].value.replace(/\\+/g, '/')
          
          // Store the current module path in the Dependencies array for a recursive call to the depAnalyse function
          dependencies.push(p.node.arguments[0].value)
        }
      }
    })

    // 5. Convert the processed AST syntax tree into program code
    let resultSourceCode = generator(ast).code

    // 6. Obtain the relative path between the absolute path of the directory where the package is executed and the absolute path of the current module
    let modulePathRelative = './' + path.relative(this.root, modulePath).replace(/\\+/g, '/')
    
    // 7. Store the relative path obtained in 6 as the key and the code processed by the current module as the value in this.modules
    this.modules[modulePathRelative] = resultSourceCode

    // 8. Recursively analyse every dependency of the current module
    dependencies.forEach(dep => {
      return this.depAnalyse(path.resolve(this.root, dep), dep)
    })

  }

  /**
   * Combine the generated this.modules with the template string
   */
  emitFile() {
    const templatePath = path.join(__dirname, '../template/output.ejs')
    // Read the template file
    let template = fs.readFileSync(templatePath, 'utf-8')
    // Render the template
    let result = ejs.render(template, {
      entry: this.entry,
      modules: this.modules
    })

    // Read the output of the configuration file and write the generated result to the configuration output specified file
    let outputPath = path.join(this.config.output.path, this.config.output.filename)

    fs.writeFileSync(outputPath, result)

  }

  start() {
    // 1. Dependency analysis
    this.depAnalyse(path.resolve(this.root, this.entry), this.entry)
    // 2. Generate the final packaged code
    this.emitFile()
  }
  
}

module.exports = Compiler

That is the Compiler implementation. With this step complete, your project code can be packaged with min-pack, so give it a try

Of course, this is only a super-mini version of webpack. Reading this far, you may have noticed that loaders and plugins have been ignored, and you may be wondering how to give your own min-pack something like webpack's loader and plugin functionality.

What is a loader

Webpack can use loader to preprocess files. This allows you to package any static resource other than JavaScript. You can easily write your own loader using Node.js

Simply put, a loader is a JS file that exports a function. The function receives the source code of a module, processes it, and returns the result
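As a concrete illustration, a loader can be as small as the sketch below. The file name loaders/loader1.js matches the configuration used later in the article, but the transformation itself (replacing "RMB" with "yuan") is invented purely for the example.

```javascript
// loaders/loader1.js (hypothetical example loader)
// Receives the module's source code as a string and returns the processed string
function loader1(source) {
  return source.replace(/RMB/g, 'yuan')
}

module.exports = loader1

// Quick check of what the loader does to the price module
console.log(loader1("module.exports = { content: '299 RMB/person' }"))
```

The important part is the contract: a string of source code goes in, a string of source code comes out, so loaders can be chained.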

So how do we combine loaders with our own Compiler.js?

Add loader functionality to min-pack

  • Since a loader is needed to process your code, you must first have the loader, and your minpack.config.js must configure the corresponding rules. The value of use in a rule can be a string, an object, or an array.
    module: {
        rules: [{
            test: /\.js$/,
            use: [
              './loaders/loader1.js',
              './loaders/loader2.js',
              './loaders/loader3.js'
            ]
        }]
    }
  • In Compiler.js, read rules from the configuration file
  • Inside the depAnalyse function of Compiler.js, after reading the source code of a module file, pass the code in as an argument and call all the loader functions in reverse order (loaders are applied from right to left, so they must be called from the back of the array)
  • Finally, return the processed code and carry on with the earlier steps: AST parsing, replacing require, and so on
  • So whenever a loader matches the file type, the loader function is called. If n loaders match a file, the file is processed n times, and the processed code is returned at the end. That is why the loader stage takes the most time in a webpack build: every module that matches has to go through the loader functions
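The reverse iteration described in the list above can be sketched as follows. The article does not show this code, so `runLoaders` is a hypothetical helper; the loaders are defined inline here, whereas min-pack would load each path with require, and only string entries in use are handled.

```javascript
// Hypothetical loaders keyed by path; each appends a marker so the call order is visible
const loaders = {
  './loaders/loader1.js': (source) => source + ' /* loader1 */',
  './loaders/loader2.js': (source) => source + ' /* loader2 */',
  './loaders/loader3.js': (source) => source + ' /* loader3 */'
}

const rules = [{
  test: /\.js$/,
  use: ['./loaders/loader1.js', './loaders/loader2.js', './loaders/loader3.js']
}]

function runLoaders(modulePath, source, rules) {
  let result = source
  for (const rule of rules) {
    // Skip rules whose pattern does not match this file
    if (!rule.test.test(modulePath)) continue
    // Normalise a single string into an array
    const use = Array.isArray(rule.use) ? rule.use : [rule.use]
    // Call the loaders from right to left
    for (let i = use.length - 1; i >= 0; i--) {
      result = loaders[use[i]](result)
    }
  }
  return result
}

const processed = runLoaders('./src/index.js', 'const a = 1', rules)
console.log(processed)
```

In depAnalyse, a call like this would sit between reading the file and parsing it into an AST, so the parser only ever sees code that the loaders have already processed.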

Ah, I’m so tired, I can’t write any more…

What is a plugin

Plugins are an important part of the webpack ecosystem. They give developers a plug-in interface with direct access to the build process

Official definition: plug-ins can hook into all the key events that are triggered during each compilation

Put simply, a plug-in hooks into the lifecycle of the webpack compilation process and implements its functionality there. That is, you decide at which point of the compile cycle your plug-in needs to run, subscribe to the corresponding hook, and implement the functionality inside that hook's callback

Attached: the lifecycle hooks of the webpack compiler

Question: isn't webpack a bundler? Why does it have a lifecycle? And how does it implement that lifecycle?

From the loader handling added to Compiler.js above, it is not hard to see that webpack's compilation process is like an assembly line, and each compilation stage is like a worker on that line. After A finishes its part it hands over to B, B hands over to C, and so on. Each worker has a single responsibility, until processing is complete.

Now I have a mineral water processing plant, let’s see how a bottle of water is produced:

  • First, store the raw water. Without water, what am I supposed to process? This is like writing webpack.config.js: without a configuration file, what am I supposed to pack? (Of course webpack 4 is impressive and works with zero configuration by default)
  • Multi-layer filtering, which is like loaders: every bottle of water goes through the filters
  • UV sterilization, which is like compressing JS and CSS code (UglifyJS, mini-css-extract-plugin). Hey, aren't those plug-ins?
  • Bottling and labelling with an advertisement, which is like the html-webpack-plugin plug-in automatically injecting bundle.js into the generated HTML. Hey, isn't that a plug-in too?
  • Finished!

Now we have a question. How does the machine that processes mineral water know when to sterilize, when to bottle and when to stick on the advertisement? The same question applies to webpack.

In fact webpack relies on a small library, Tapable, to connect the individual production lines as a stream of events. Its core principle is the publish-subscribe pattern. Tapable provides a series of synchronous and asynchronous hooks, which webpack uses to define its own lifecycle. webpack publishes events at the different stages of the process; a plug-in only has to subscribe to the events it needs, and when compilation reaches that stage, webpack runs the callbacks the plug-in subscribed with.

Confused? OK, back to the previous example

  • When the raw water is stored, assign a worker, Zhang San. As soon as processing starts, he broadcasts: "Processing has started!" (publish)
  • At the start of the filtering phase, assign a worker, Li Si. As soon as filtering begins, he broadcasts: "Filtering!" (publish)
  • At the end of the filtering phase, assign a worker, Wang Wu. As soon as filtering is complete, he broadcasts: "Filtering is over!" (publish)
  • When processing is finished, assign a worker, Zhao Liu. As soon as processing ends, he broadcasts: "Processing is finished!" (publish)
  • So here is the question: I have two more steps, sterilization, and bottling plus labelling. Where should they go? Can I sterilize before processing starts? Zhang San says: go away, I haven't even started! Can I sterilize before filtration? Li Si won't allow it either.
  • So in the sterilization step I have to tell the machine internally: when you hear the broadcast "Filtering is over", start sterilizing. What is that called? It is subscribing to Wang Wu's broadcast: once Wang Wu broadcasts, I sterilize!
  • The bottling and labelling step works the same way. It also subscribes to Wang Wu's broadcast and starts bottling after it.
    • Wait, wouldn't that bottle the water before it is sterilized? What a shady business!
    • Well, who decides the order of the sterilizing machine and the bottling machine? The manufacturer, of course, so the order has to be arranged by hand
    • It is the same in webpack: my code is first processed by plug-in A, and the processed code is then handed to plug-in B. Who decides the plug-in order? You do, of course. So when using plug-ins you have to know what each one does and register them in the right order.
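The factory story maps directly onto a few lines of code. `MiniSyncHook` below is a hand-written stand-in that mimics only the tap/call part of Tapable's SyncHook; it is not the library itself, and the hook names are taken from the analogy rather than from webpack.

```javascript
// Minimal stand-in for a synchronous hook, written only to illustrate publish-subscribe
class MiniSyncHook {
  constructor() {
    this.taps = []
  }
  // Subscribe: remember the callback under a name
  tap(name, fn) {
    this.taps.push({ name, fn })
  }
  // Publish: run every subscribed callback in registration order
  call(...args) {
    this.taps.forEach(({ fn }) => fn(...args))
  }
}

const factoryLog = []
const hooks = {
  start: new MiniSyncHook(),      // Zhang San's broadcast
  filterDone: new MiniSyncHook(), // Wang Wu's broadcast
  done: new MiniSyncHook()        // Zhao Liu's broadcast
}

// Both machines subscribe to "filtering is over"; sterilizing is registered first
hooks.filterDone.tap('Sterilizer', () => factoryLog.push('sterilize'))
hooks.filterDone.tap('Bottler', () => factoryLog.push('bottle and label'))
hooks.done.tap('Reporter', () => factoryLog.push('finished'))

// The production line publishes its events as it goes
hooks.start.call()
hooks.filterDone.call()
hooks.done.call()
console.log(factoryLog)
```

Nothing is subscribed to start, so that broadcast has no effect. The two filterDone subscribers run in the order they were registered, which is why the order of plug-ins matters.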

Do you understand how plug-ins work now? If not, don't worry: let's use the Tapable library to add plug-in support to our min-pack.

Implement the lifecycle and publish the events

  1. First install tapable. For how to use it, see the tapable documentation

  2. Then define the lifecycle in the Compiler class

    const { SyncHook } = require('tapable')

    class Compiler {
      constructor(config) {
        this.config = config
        this.entry = config.entry
        // root: the absolute path of the directory where the min-pack command is executed
        this.root = process.cwd()
        this.hooks = {
          start: new SyncHook(), // Published when min-pack starts
          compile: new SyncHook(["relativePath"]), // Published during compilation; receives the path of the module being compiled
          afterCompile: new SyncHook(), // Published after all modules are compiled
          emit: new SyncHook(["filename"]), // Published before bundle.js is written
          afterEmit: new SyncHook(["outputPath"]), // Published after bundle.js is written
          done: new SyncHook() // Published when min-pack finishes
        }
        this.modules = {}
      }
    }

    Above, we defined six lifecycle hooks. When will they be released?

  3. Publish lifecycle hooks

    start() {
        // Publish the start hook
        this.hooks.start.call()
        
        // Publish the compile hook with the path of the module about to be compiled
        this.hooks.compile.call(this.entry)
        
        // Start compiling: analyse dependencies from the entry
        this.depAnalyse(path.resolve(this.root, this.entry), this.entry)
        
        // Publish the afterCompile hook
        this.hooks.afterCompile.call()
        
        // Generate the bundle; emitFile publishes emit and afterEmit itself
        this.emitFile()
        
        // Publish the done hook
        this.hooks.done.call()
      }

    The emit and afterEmit hooks are published inside the emitFile function. Its code was shown above, so part of it is omitted here

    emitFile() {
        // ... code omitted here
        
        // Publish the emit hook before bundle.js is written
        this.hooks.emit.call(this.config.output.filename)
        
        // Write the file with fs (generates bundle.js)
        fs.writeFileSync(outputPath, result)
        
        // Publish the afterEmit hook after bundle.js is written
        this.hooks.afterEmit.call(outputPath)
      }
  4. OK, we now have our lifecycle, and events are published at the specified phases. What's next? Writing plug-ins! At last I can write my own plug-in.

    • But since what we built is only a mini version of webpack, there is no Compilation object. Huh, first time you hear of it? What is a Compilation? That is explained later.
    • So our plug-in can only be of hello-world level. Let's call it HelloWorldPlugins for now

Implementing the HelloWorldPlugins plug-in

How do you write a webpack plug-in? The official definition:

A webpack plug-in consists of the following:

  1. A named JavaScript function.
  2. An apply method defined on the prototype of that function.
  3. A specified event hook, bound to webpack itself, to tap into.
  4. Manipulation of specific data from webpack's internal instances.
  5. A call to the callback provided by webpack once the functionality is complete.

Regarding the third point, it does not have to be just one hook. If your plug-in needs to do different things in different phases, it can bind to several event hooks, although this is not recommended.

Let's look at the code~

module.exports = class HelloWorldPlugins {
  // The apply method
  apply(compiler) {
    // Specify one (in this plug-in, several) event hook bound to webpack itself.
    // Subscribe to the start hook
    compiler.hooks.start.tap('HelloWorldPlugin', () => {
      console.log('webpack starts compiling')
    });

    // Subscribe to the compile hook
    compiler.hooks.compile.tap('HelloWorldPlugin', () => {
      console.log('Compiling')
    });

    // Subscribe to the afterCompile hook
    compiler.hooks.afterCompile.tap('HelloWorldPlugin', () => {
      console.log('webpack compile finished')
    });

    // Subscribe to the emit hook
    compiler.hooks.emit.tap('HelloWorldPlugin', (filename) => {
      console.log('Start packing a file named:', filename)
    });
    
    // Subscribe to the afterEmit hook
    compiler.hooks.afterEmit.tap('HelloWorldPlugin', (path) => {
      console.log('The file is packaged and the file path is:', path)
    });
    
    // Subscribe to the done hook
    compiler.hooks.done.tap('HelloWorldPlugin', () => {
      console.log('End of webpack')
    });
  }
}

Look at the log after running:

At this point the HelloWorldPlugins plug-in is done. Because there is no Compilation object it cannot do anything fancy; the goal is just to understand how webpack plug-ins work. Writing a real webpack plug-in follows the same simple pattern
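One step the article leaves implicit is how a plug-in gets connected to the Compiler. In webpack, plug-in instances are listed under plugins in the configuration and the compiler calls apply on each of them. The sketch below assumes min-pack does the same; `MiniSyncHook`, `MiniCompiler` and `HelloPlugin` are hypothetical stand-ins so the example runs without tapable installed.

```javascript
// Stand-in for tapable's SyncHook (tap/call only)
class MiniSyncHook {
  constructor() {
    this.taps = []
  }
  tap(name, fn) {
    this.taps.push(fn)
  }
  call(...args) {
    this.taps.forEach((fn) => fn(...args))
  }
}

// A plug-in that records which hooks fired
class HelloPlugin {
  constructor(log) {
    this.log = log
  }
  apply(compiler) {
    compiler.hooks.start.tap('HelloPlugin', () => this.log.push('start'))
    compiler.hooks.done.tap('HelloPlugin', () => this.log.push('done'))
  }
}

class MiniCompiler {
  constructor(config) {
    this.hooks = { start: new MiniSyncHook(), done: new MiniSyncHook() }
    // Assumed registration step: hand the compiler to every configured plug-in
    const plugins = config.plugins || []
    plugins.forEach((plugin) => plugin.apply(this))
  }
  start() {
    this.hooks.start.call()
    // ... dependency analysis and bundle generation would happen here
    this.hooks.done.call()
  }
}

const pluginLog = []
new MiniCompiler({ plugins: [new HelloPlugin(pluginLog)] }).start()
console.log(pluginLog)
```

With this in place, a developer would add `plugins: [new HelloWorldPlugins()]` to minpack.config.js, and the plug-in's callbacks would run as the hooks are published.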

A function -> define the apply method -> subscribe to event hooks -> write your own code -> call the callback provided by webpack

The Compiler and Compilation

One question was left open above: what is a Compilation? Many articles online discuss the difference between Compiler and Compilation, but it is actually quite simple

  • The compiler object represents the immutable webpack environment and is specific to webpack. It contains information such as options, loaders and plugins. It corresponds to webpack itself, which in our case is the Compiler we wrote ourselves
  • The compilation object is tied to the project files, which change: every time a file changes during compilation, a new compilation is created. All resource files to be output are available through compilation.assets, and the compiler object can be reached from the compilation.
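To make the role of compilation.assets concrete, here is a sketch of a plug-in in the style of webpack 4 that adds a file list to the output. The `fakeCompiler` and `fakeCompilation` objects are hypothetical stand-ins that let the example run without webpack; in a real build webpack supplies both, and newer webpack versions may favour other APIs for adding assets.

```javascript
// Adds a text file listing every asset that is about to be written
class FileListPlugin {
  apply(compiler) {
    // emit is an asynchronous hook in webpack, so tapAsync with a callback is used
    compiler.hooks.emit.tapAsync('FileListPlugin', (compilation, callback) => {
      const fileList = Object.keys(compilation.assets).join('\n')
      // An asset is an object exposing source() and size()
      compilation.assets['filelist.txt'] = {
        source: () => fileList,
        size: () => fileList.length
      }
      // Tell webpack this plug-in has finished
      callback()
    })
  }
}

// Fake compiler that only remembers the emit handler (stand-in for webpack)
let emitHandler
const fakeCompiler = {
  hooks: { emit: { tapAsync: (name, fn) => { emitHandler = fn } } }
}
new FileListPlugin().apply(fakeCompiler)

// Fake compilation with a single output file
const fakeCompilation = {
  assets: { 'min-bundle.js': { source: () => '', size: () => 0 } }
}
emitHandler(fakeCompilation, () => console.log(Object.keys(fakeCompilation.assets)))
```

This is the kind of work the hello-world plug-in above cannot do: without a compilation object there is no list of output files to read or extend.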

Conclusion

That concludes this analysis of how webpack works. If you have read this far, I trust you have a deeper understanding of webpack's principles. The article is long, so if you find any shortcomings, corrections are welcome. GitHub source: Webpack source analysis