Preface

This article summarizes JS modularization, from understanding the concepts to implementing them. It includes the following parts:

  1. Understanding modularity

    • Understand the value of modularity in developing front-end JS applications
  2. Differences between several popular modularity specifications

    • Understand the differences between CommonJS, AMD and ES Module specifications
  3. Implementing a module system

    • Understand how the module system is implemented in Node
    • Understand how the module system is implemented in Webpack (CommonJS, ES Module)

Understanding modularity

This can be broken down into two questions:

  1. What is modular development? What are the benefits of modular development?
  2. What is the value of modularity to developing front-end JS applications?

Modular development is extremely common and is essential for building large applications. C, Java, Python, and other languages all support it. Modularization means splitting a large application into independent modules that can interact with each other only through specific interfaces. The benefit is separation of concerns: each module is responsible for a single function, which facilitates reuse and maintenance.

There was no unified module standard for JavaScript until ES2015 introduced the ES Module specification, which is now supported by Node.js and the major browsers (currently 93.4% 👏). Some people may ask: before browsers supported it, how did front-end developers do modular development?

In fact, we were writing modular syntax such as import and export, which we use every day, before browsers supported it. That code simply has to be converted into syntax that most browsers understand before it can run in the browser. (More on this later.)

If you’re interested, you can also read about how front-end engineers wrote JS applications in the days before these bundling tools existed. I recommend “The Value of Front-End Modular Development,” a front-end veteran’s first-hand account. In the early days, JS code was included with script tags like this:

<script src="./time.js"></script> 
<script src="./util.js"></script>
<script src="./main.js"></script>

You can think of each script as a “module,” but the scripts all share one global scope rather than having separate ones 😺, which makes the development experience considerably less pleasant. You have to worry about whether your variables will conflict with those in other files. Even if this can be constrained with namespaces and similar techniques, they still have to be defined and managed by hand. And that is just one issue. The order in which scripts are included also has to be maintained manually, and errors occur if a dependency is not included or is loaded after the file that uses it.

So writing large JS applications this way, especially with many people collaborating, leads to a lot of problems, which is exactly where JS modular development shows its value.
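
As a rough illustration of the problem described above, the sketch below simulates two script files sharing one scope, followed by the manual namespace workaround. The names `sharedScope`, `timeScript`, `utilScript`, and `MyApp` are hypothetical and chosen only for this example.

```javascript
// A plain object stands in for the browser's global scope (an assumption
// made so the sketch runs anywhere, including Node.js).
const sharedScope = {};

// time.js: defines a helper named "format" on the shared scope.
function timeScript(scope) {
  scope.format = function (date) {
    return date.toISOString().slice(0, 10);
  };
}

// util.js: unknowingly reuses the same name and overwrites the first helper.
function utilScript(scope) {
  scope.format = function (text) {
    return text.trim();
  };
}

timeScript(sharedScope);
utilScript(sharedScope);

// Only the last definition survives, so the helper from time.js is lost.
const survivingFormat = sharedScope.format('  hello  ');
console.log(survivingFormat); // 'hello'

// Manual workaround: each file puts its members under its own namespace,
// hidden inside an immediately invoked function expression (IIFE).
const MyApp = {};

(function (ns) {
  ns.time = {
    format: function (date) {
      return date.toISOString().slice(0, 10);
    },
  };
})(MyApp);

(function (ns) {
  ns.util = {
    format: function (text) {
      return text.trim();
    },
  };
})(MyApp);

const day = MyApp.time.format(new Date(Date.UTC(2021, 0, 2)));
const trimmed = MyApp.util.format('  hello  ');
console.log(day, trimmed); // '2021-01-02' 'hello'
```

The namespace keeps the two helpers apart, but nothing enforces the convention: every file still has to agree on `MyApp` and on its own sub-key by hand.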

As a result, a number of JS module specifications gradually appeared, such as CommonJS, AMD, CMD, and ES6, along with corresponding implementations, so that developers could write JS applications in a modular way.

Some of the most popular specifications are described below. We don’t need to know every detail of these specifications, but it’s useful to understand their differences and design intentions.

CommonJS & AMD & ES6 modular specification

First of all, we need to be clear about the difference between a module specification and a module implementation. A specification is a definition, for example stating that object A must have property B. An implementation follows such a specification to support the corresponding module syntax; it can be a platform or a library.

A specification can have multiple implementations. CommonJS, AMD, and ES6 are module specifications, while the Node.js module system, RequireJS, and Webpack’s module system are implementations of them. On top of these implementations we can use the corresponding module syntax.

Before starting, I also recommend reading “The History of Front-End Modular Development.” Its author also wrote the well-known Sea.js library, which supports modular JS development and follows the CMD specification.

(1) CommonJS

The CommonJS website explains why the project started:

“The official JavaScript specification defines APIs for some objects that are useful for building browser-based applications. However, the spec does not define a standard library that is useful for building a broader range of applications.

The CommonJS API will fill that gap by defining APIs that handle many common application needs, ultimately providing a standard library as rich as those of Python, Ruby and Java.”

At the time, JavaScript only ran in browsers. The vision of CommonJS was for developers to write applications against the standard CommonJS API and then run them on any CommonJS-compliant interpreter or platform.

So, to be precise, the CommonJS project is more than a module specification. It contains JavaScript API specifications for many scenarios, such as I/O, filesystem, and package management; the module specification is only one part of it.

CommonJS module specification

In the CommonJS Module 1.0 specification, the minimum features that need to be implemented to support modularity are defined.

  1. In a module, there should be a function “require” that can be used directly.

    (1) The “require” function takes a module identifier argument

    (2) “require” returns the API exported by the external module

    (3) If there is a dependency cycle, the object returned by “require” must contain at least the exports that the external module had prepared before it made the require call that led to the current module being executed

    (4) If the requested module cannot be returned, “require” must throw an error.

  2. In a module, there should be a directly available object “exports” to which the exported API is added at module execution.

  3. A module must use the “exports” object as the only way to export its API.

What is a cyclic dependency? The following example helps. There are two files, a.js (module A) and b.js (module B). A requires B and B requires A, so a naive loader would fall into an endless loop.

// a.js
const b = require('./b');
console.log('Print b in module a =>', b);
exports.name = 'a';

// b.js
const a = require('./a');
console.log('Print a in module b =>', a);
exports.name = 'b';

We can also use the following example of a function call loop:

function funcA() {
  const b = funcB();
  return 'a' + b;
}

function funcB() {
  const a = funcA();
  return 'b' + a;
}

Calling funcA() in Node (v15.9.0) throws RangeError: Maximum call stack size exceeded.

Therefore, a module system must handle cyclic dependencies. The specification does not prescribe a standard way to resolve them, but it does state what should be returned when they occur. In the following code, when a.js is run, b.js still receives everything that a.js exported before the require statement that caused the cycle.

// a.js
exports.val = 100; // Exported correctly because it precedes the require statement that causes the cycle
const b = require('./b'); // a cyclic dependency occurs
console.log('Print b in module a =>', b);
exports.name = 'a';

// b.js
const a = require('./a');
console.log('Print a in module b =>', a);
exports.name = 'b';

// Output after running a.js:
// Print a in module b => { val: 100 }
// Print b in module a => { name: 'b' }

An example of a complete CommonJS module is as follows:

// math.js
exports.add = function() { // Export an add method
    var sum = 0, i = 0, args = arguments, l = args.length;
    while (i < l) {
        sum += args[i++];
    }
    return sum;
};

// increment.js
var add = require('math').add; // Import the add method from the math module
exports.increment = function(val) {
    return add(val, 1);
};

// program.js
var inc = require('increment').increment; // Import the increment method from the increment module
var a = 1;
inc(a); // 2

In summary, CommonJS defines a mechanism for exporting and importing modules. Modules do not interfere with each other and can interact only through the “require” and “exports” interfaces, which solves the problem of modular development.

A bit of history: CommonJS was originally called ServerJS, as can be seen on its website. It was intended to promote JS on the server side and was later renamed CommonJS. Node.js borrowed from the CommonJS specification to implement its own module system, with good results.

(2) AMD

AMD stands for Asynchronous Module Definition. It specifies a mechanism by which modules and their dependencies can be loaded asynchronously.

Why do you need AMD when you have CommonJS? Two articles on the RequireJS site (RequireJS is an implementation of AMD), “WHY WEB MODULES?” and “WHY AMD,” explain the thinking. In short: CommonJS is not well suited to the browser environment.

Unlike Node.js, browsers themselves don’t support CommonJS syntax, so if we want to use it, we have to resort to some extra “helper measures.”

// CommonJS require
var add = require('math').add;

There are two ways to do this:

  1. Transform the code before deployment

    Convert the source code before deploying it to the browser, for example by wrapping the module code in a function (this is what Browserify and Webpack later did):

    function (module, exports, require) {
      // Module code goes here
    }
  2. Or use XMLHttpRequest (XHR) to load the module text and parse it in the browser

The authors of RequireJS argue that both approaches place an unnecessary burden on developers. In addition, CommonJS allows only one module per JS file, which is also a limitation in browsers: many modules mean many JS files to load, and because browsers limit the number of concurrent network requests, this is inefficient.

To address these issues, AMD proposed a new module specification designed for the browser environment.

AMD module specification

The AMD API defines a global function define that takes three arguments:

define(id?, dependencies?, factory);

Here “id” is the module identifier (optional), “dependencies” is an array of the identifiers of the modules this module depends on (optional), and “factory” is a function or an object. If it is a function, its return value is taken as the module’s export; if it is an object, that object is the module’s export.

The following is a simple example:

define(['alpha'], function (alpha) {
  return {
    verb: function () {
      return alpha.verb() + 2;
    },
  };
});

Here alpha is the imported dependency, and the factory callback is executed only after all dependencies have been loaded. All module code is encapsulated in the factory function: its dependencies are passed in as arguments, its internal scope is isolated, and the module’s exports are the function’s return value.
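
To make the role of the factory function concrete, here is a minimal synchronous sketch of a `define`-style registry. It is not RequireJS and skips asynchronous loading entirely; `registry`, `miniDefine`, and `miniGet` are hypothetical names used only for illustration.

```javascript
const registry = {};

// Simplified define(id, dependencies, factory): `id` is mandatory here,
// and every dependency must already be defined (no network loading).
function miniDefine(id, dependencies, factory) {
  const resolved = dependencies.map(function (depId) {
    if (!(depId in registry)) {
      throw new Error('Module not defined yet: ' + depId);
    }
    return registry[depId];
  });
  // A function factory is called with its dependencies and its return
  // value becomes the export; an object factory is the export itself.
  registry[id] =
    typeof factory === 'function' ? factory.apply(null, resolved) : factory;
}

function miniGet(id) {
  return registry[id];
}

// An object factory: the object is exported directly.
miniDefine('alpha', [], {
  verb: function () {
    return 1;
  },
});

// A function factory that receives `alpha` as an argument.
miniDefine('beta', ['alpha'], function (alpha) {
  return {
    verb: function () {
      return alpha.verb() + 2;
    },
  };
});

const result = miniGet('beta').verb();
console.log(result); // 3
```

A real AMD loader would additionally fetch missing dependencies over the network and call the factory only once they have all arrived.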

AMD also provides a way to declare multiple modules in a single file, as follows:

// a.js
define('a1', ['my/cart', 'my/inventory'], function (cart, inventory) {
  // module code...
});

define('a2', ['my/cart', 'my/inventory'], function (cart, inventory) {
  // module code...
});

Here a1 and a2 are module identifiers, so that multiple modules defined in one file can be told apart.

RequireJS, a popular implementation of AMD, had its heyday. It is very simple to use: we only need to include the RequireJS library to use AMD module syntax in the browser. It is included as follows:

 <script data-main="scripts/main" src="scripts/require.js"></script>

Just specify the module’s entry, data-main, and everything else is left to RequireJS.

AMD VS. CommonJS

The CommonJS module specification targets the JavaScript language in general rather than the browser platform specifically, so using it in the browser requires some additional “helper measures.” AMD addressed this weakness with an asynchronous module loading specification that is easier to implement in browsers.

Although the AMD specification later became popular, some of its mechanics, such as the timing of module execution, were not accepted by the CommonJS community and many developers.

In CommonJS, a module is loaded and executed the first time it is required.

// b.js
var a = require("./a") // a.js is loaded and executed only when this require call runs

In AMD, the dependency modules have already been loaded and executed by the time the factory function passed to define runs.

define(['alpha'], function (alpha) {
  // By this point the alpha module has already been loaded and executed
  return {
    verb: function () {
      return alpha.verb() + 2;
    },
  };
});

// Another example
define(['a', 'b'], function (a, b) {
  if (false) {
    // Module b has already been executed even though it is never used
    b.foo();
  }
});

So AMD executes dependencies early: every dependency is executed as soon as it is declared, regardless of where, or whether, it is used in the code.
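
The timing difference can be observed with a small simulation. This sketch uses two hypothetical in-memory loaders, `lazyRequire` (CommonJS-style) and `eagerDefine` (AMD-style); neither is a real library API, and asynchronous loading is left out.

```javascript
// Records the order in which module bodies run.
const order = [];

// Module bodies, keyed by id. Running a body appends a note to `order`.
const bodies = {
  a: function () {
    order.push('a executed');
    return { name: 'a' };
  },
  b: function () {
    order.push('b executed');
    return { name: 'b' };
  },
};

const executed = {};

// Runs a module body at most once and caches the result.
function run(id) {
  if (!(id in executed)) {
    executed[id] = bodies[id]();
  }
  return executed[id];
}

// CommonJS style: a dependency runs only when require() is reached.
function lazyRequire(id) {
  return run(id);
}

// AMD style: every declared dependency runs before the factory is called.
function eagerDefine(dependencies, factory) {
  const args = dependencies.map(run);
  return factory.apply(null, args);
}

// CommonJS-style consumer: the branch is never taken, so b never runs.
(function main() {
  lazyRequire('a');
  if (false) {
    lazyRequire('b');
  }
})();
const commonJsOrder = order.slice();

// Reset the state and repeat with the AMD-style loader.
order.length = 0;
delete executed.a;
delete executed.b;

eagerDefine(['a', 'b'], function (a, b) {
  if (false) {
    b.name; // never reached, yet b has already been executed
  }
});
const amdOrder = order.slice();

console.log(commonJsOrder); // [ 'a executed' ]
console.log(amdOrder); // [ 'a executed', 'b executed' ]
```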

This is easier to see by comparison with the following CMD module (the CMD specification combines features of CommonJS and AMD and is not covered separately in this article).

define(function (require, exports) {
  var util = require('./util.js'); // The module is loaded and executed here

  exports.init = function () {
    // ...
  };
});

There are also popular implementations of CMD in the community, such as Sea.js.

In general, though, CommonJS, AMD, and CMD are all community specifications that browsers do not support natively. That changed only when the official ES Module specification came along.

(3) ES Module

ECMAScript is the JavaScript language standard, intended to provide a uniform specification across browsers. ES2015 introduced a definition of modules that is now natively supported by all major browsers, which means modular syntax can run directly in the browser 👏.

The syntax of ES Module is familiar to front-end developers. The basic import and export syntax is shown below.

// Import syntax
import { create, createReportList } from './modules/canvas.js';
import randomSquare from './modules/square.js';

// Export syntax
export { name, draw, reportArea, reportPerimeter };
export default randomSquare;

To run it in the browser, we simply add type="module" to the script tag to indicate that this is a module.

<script type="module" src="./main.js"></script>

ES Module VS. CommonJS

Next, let’s look at some differences between ES Module and CommonJS in how imports and exports work. Many articles, including Ruan Yifeng’s ES6 Module, point out that CommonJS modules are loaded at runtime, whereas ES6 modules have their interfaces determined at compile time.

How should this be understood? The article “ES modules: A cartoon deep-dive” analyzes how ES Module is implemented in browsers. In short, there is a compilation stage before the module code executes, in which every import and export is linked to the storage location of the corresponding variable.

This is why you can’t use variables in an import statement: the compile stage is static analysis and no code is running, so the value of a variable is not available.

import mod from `${path}/foo.js` // error

But sometimes we do need to load different Module code depending on the situation. ES Module provides another way to solve this problem, dynamic import, as shown in the following example:

import('/modules/myModule.js')
  .then((module) => {
    // Do something with the module.
  });

Copy or reference?

The difference between run-time and compile-time is often summarized like this: one gives you a copy of the value, while the other gives you a reference to it.

A piece of code shows the difference between CommonJS and ES Module. Here is a CommonJS example, run with Node.js v15.9.0.

// lib.js
let counter = 3;
function incCounter() {
  counter++;
}
module.exports = {
  counter,
  incCounter,
};

// main.js
let mod = require('./lib');

console.log(mod.counter); // 3
mod.incCounter();
console.log(mod.counter); // 3

Calling mod.incCounter() in main.js changes the value of counter defined in lib.js, but this has no effect on the value already copied into module.exports, so 3 is printed both times.
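
The reason is that `{ counter }` copies the current primitive value into a new property at the moment the object is created. The sketch below reproduces this without separate files and, for contrast, shows a getter-based export that does stay in sync; `createLib` and `createLiveLib` are hypothetical helpers standing in for lib.js.

```javascript
// Stand-in for lib.js as written above: `counter` is copied into the
// exported object once, when the object literal is evaluated.
function createLib() {
  let counter = 3;
  function incCounter() {
    counter++;
  }
  return { counter, incCounter };
}

// Variant that exports a getter, so every read looks up the live variable.
function createLiveLib() {
  let counter = 3;
  function incCounter() {
    counter++;
  }
  return {
    get counter() {
      return counter;
    },
    incCounter,
  };
}

const mod = createLib();
mod.incCounter();
console.log(mod.counter); // 3 (the copy is unaffected)

const liveMod = createLiveLib();
liveMod.incCounter();
console.log(liveMod.counter); // 4 (the getter reads the updated variable)
```

So "copy" here is ordinary JavaScript value semantics applied to the exports object rather than a special rule of the module system; exported objects and functions are still shared by reference.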

For the same example, let’s try ES Module and run it in a browser that supports ES Module.

// lib.js
export let counter = 3;
export function incCounter() {
  counter++;
}

// main.js
import { counter, incCounter } from './lib.js';
console.log(counter); // 3
incCounter();
console.log(counter); // 4

The printed result is 3 and then 4, which means the imported counter is a live reference to the counter variable defined in lib.js.

Conclusion

This article has introduced and analyzed three specifications: CommonJS, AMD, and ES Module. Of course, there are more JS module specifications than these. Today, ES Module is expected to unify the server and browser sides and become the universal module specification.

So much for the module specifications. Next, let’s look at how a module system can be implemented.

JS module system implementation

What does it take to implement a module system? We can start with the implementation in Node.js, one of the first platforms to support modular JS development. Although it differs from the browser environment, there is a lot to learn from it.

Node.js

In the chapter “Module Implementation of Node” of Node.js, importing a module in Node is divided into three steps:

  1. Path analysis
  2. File location
  3. Compilation and execution

The first two steps, “path analysis” and “file location,” find the correct file from the module identifier passed to require.

const utils = require('./utils') // './utils' is the module identifier

Many small details are involved. For example, an identifier might be:

  • A core module provided by Node, such as http or fs
  • A relative or absolute path
  • An installed npm package, such as express or axios

In addition, module identifiers in code are often abbreviated. For example, ./utils has no extension, so it may refer to ./utils.js, ./utils.json, or even ./utils/index.js.

Developers rarely have to think about these details, but the module system has to analyze them all.
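
As a rough sketch of the extension and directory lookup described above, the hypothetical `locateFile` function below tries a few candidate paths in order. Node's real algorithm handles more cases (core modules, `node_modules` lookup, `package.json` fields), so treat the candidate list here as a simplification.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Returns the first existing candidate for a relative identifier,
// or null when nothing matches. Simplified: relative paths only.
function locateFile(baseDir, identifier) {
  const target = path.resolve(baseDir, identifier);
  const candidates = [
    target,
    target + '.js',
    target + '.json',
    target + '.node',
    path.join(target, 'index.js'),
  ];
  for (const candidate of candidates) {
    if (fs.existsSync(candidate) && fs.statSync(candidate).isFile()) {
      return candidate;
    }
  }
  return null;
}

// Usage: build a throwaway project in the temp directory.
const projectDir = fs.mkdtempSync(path.join(os.tmpdir(), 'locate-demo-'));
fs.writeFileSync(path.join(projectDir, 'config.json'), '{}');
fs.mkdirSync(path.join(projectDir, 'utils'));
fs.writeFileSync(path.join(projectDir, 'utils', 'index.js'), '');

const configPath = locateFile(projectDir, './config');
const utilsPath = locateFile(projectDir, './utils');
const missingPath = locateFile(projectDir, './missing');

console.log(path.basename(configPath)); // config.json
console.log(path.basename(utilsPath)); // index.js
console.log(missingPath); // null
```

In application code the built-in `require.resolve` performs the real lookup, so a helper like this is only useful for understanding the idea.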

The third step is “compilation and execution.” After the file is located, Node first compiles it. Node supports .js, .node, and .json files, each loaded in a different way. Every successfully compiled module is stored in a cache, indexed by its file path, to speed up subsequent imports; this is known as the module cache.
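
In Node.js this cache is exposed as `require.cache`, keyed by resolved file path. The sketch below writes a temporary module (the file name `counter-module.js` and its contents are made up for the demonstration) and requires it twice to show that the body runs only once.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module that counts how many times its body runs.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter-module.js');
fs.writeFileSync(
  file,
  'global.__loadCount = (global.__loadCount || 0) + 1;\n' +
    'exports.loadedAt = Date.now();\n'
);

const first = require(file);
const second = require(file);

// Both calls return the same exports object; the body ran a single time.
const sameObject = first === second;
const loadCount = global.__loadCount;

// The cache entry is indexed by the resolved file path.
const cacheKey = require.resolve(file);
const isCached = Object.prototype.hasOwnProperty.call(require.cache, cacheKey);

console.log(sameObject, loadCount, isCached); // true 1 true
```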

Here we focus on the compilation of JS files. Before a JS module is executed, Node wraps the contents of the JS file. For example, take a module file like this:

const math = require('math');
exports.area = function (radius) {
  return Math.PI * radius * radius;
};

When encapsulated, it becomes:

(function (exports, require, module, __filename, __dirname) {
  const math = require('math');
  exports.area = function (radius) {
    return Math.PI * radius * radius;
  };
});

Through this function wrapper, each module file gets its own isolated scope, and each module can directly use require and exports for importing and exporting, all managed by the module system. This is how Node.js supports modularity.
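
The wrapping step can be imitated in a few lines. The following is a simplified sketch, not Node's actual loader: `loadFromSource` is a hypothetical helper that wraps source text in the same five-parameter function and compiles it with `vm.runInThisContext`.

```javascript
const vm = require('vm');
const path = require('path');

// Wraps module source text in a function, compiles it, and runs it with
// a fresh `module`/`exports` pair. `requireFn` is handed to the module
// as its `require`.
function loadFromSource(source, filename, requireFn) {
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {\n' +
    source +
    '\n})';
  const compiled = vm.runInThisContext(wrapped, { filename: filename });
  const module = { exports: {} };
  compiled.call(
    module.exports,
    module.exports,
    requireFn,
    module,
    filename,
    path.dirname(filename)
  );
  return module.exports;
}

// Variables declared inside the source stay local to the wrapper function.
const source =
  'const path = require("path");\n' +
  'const secret = 42;\n' +
  'exports.area = function (radius) {\n' +
  '  return Math.PI * radius * radius;\n' +
  '};\n' +
  'exports.file = path.basename(__filename);\n';

// The path below is a made-up name; no file is read from disk.
const circle = loadFromSource(source, '/virtual/circle.js', require);

console.log(circle.area(1) === Math.PI); // true
console.log(circle.file); // circle.js
console.log(typeof secret); // undefined
```

Because the module body runs inside a function, `secret` never leaks into the surrounding scope, while `exports`, `require`, `__filename`, and `__dirname` are simply parameters supplied by the loader.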

Webpack

If you look at some of the module implementations in the front-end world, you’ll see that they have a lot in common with the Node.js implementation.

First, Webpack. Setting mode to 'development' and devtool to false in webpack.config.js makes the bundled code easy to read.

// webpack.config.js
const path = require('path');
module.exports = {
  entry: path.join(__dirname, 'main.js'),
  output: {
    path: path.join(__dirname, 'output'),
    filename: 'index.js',
  },
  mode: 'development',
  devtool: false,
};

CommonJS Module

First we prepare an entry file, main.js, and a module it depends on, bar.js, and then run Webpack (webpack 5.51.1, webpack-cli 4.8.0) to bundle them.

// main.js
let bar = require('./bar');
function foo() {
  return bar.bar();
}

// bar.js
exports.bar = function () {
  return 1;
};

The complete bundled output is as follows (unnecessary comments are omitted):

(() => {
  // webpackBootstrap
  var __webpack_modules__ = {
    './bar.js': (__unused_webpack_module, exports) => {
      exports.bar = function () {
        return 1;
      };
    },
  };

  // The module cache
  var __webpack_module_cache__ = {};

  // The require function
  function __webpack_require__(moduleId) {
    // Check if module is in cache
    var cachedModule = __webpack_module_cache__[moduleId];
    if (cachedModule !== undefined) {
      return cachedModule.exports;
    }
    // Create a new module (and put it into the cache)
    var module = (__webpack_module_cache__[moduleId] = {
      exports: {},
    });

    // Execute the module function
    __webpack_modules__[moduleId](module, module.exports, __webpack_require__);

    // Return the exports of the module
    return module.exports;
  }

  var __webpack_exports__ = {};
  (() => {
    // ./main.js
    let bar = __webpack_require__('./bar.js');
    function foo() {
      return bar.bar();
    }
  })();
})();

Two parts are enough to understand it completely.

  1. __webpack_modules__

var __webpack_modules__ = {
  './bar.js': (__unused_webpack_module, exports) => {
    exports.bar = function () {
      return 1;
    };
  },
};

__webpack_modules__ is an object that stores each module with its path as the key. The module code we originally wrote is wrapped in a function that receives __unused_webpack_module and exports as arguments.

Note: This is very similar to what Node.js does. The first argument is actually module, but it is not used inside bar.js, so it is named __unused_webpack_module. The third argument would be require; bar.js does not use it, so it is omitted here, but main.js calls it under the name __webpack_require__.

  2. __webpack_require__

var __webpack_module_cache__ = {};

function __webpack_require__(moduleId) {
  // Check if module is in cache
  var cachedModule = __webpack_module_cache__[moduleId];
  if (cachedModule !== undefined) {
    return cachedModule.exports;
  }
  // Create a new module (and put it into the cache)
  var module = (__webpack_module_cache__[moduleId] = {
    exports: {},
  });

  // Execute the module function
  __webpack_modules__[moduleId](module, module.exports, __webpack_require__);

  // Return the exports of the module
  return module.exports;
}

This is the equivalent of the require function we normally use, just under a different name. The module is first looked up in __webpack_module_cache__, and its exports are returned if it is found. If not, a new module object is created and placed in the cache, and the corresponding module function is executed with the module object, module.exports, and __webpack_require__ passed in.

At this point, you should be able to see how Webpack lets browsers “run CommonJS code,” which is really very similar to what Node.js does 😺.

ES Module

How does Webpack implement ES Module export as a reference rather than a copy?

Again, prepare two files:

// bar.js
export let counter = 1;

// main.js
import { counter } from './bar';
console.log(counter);

The bundled output is as follows:

(() => {
  // webpackBootstrap
  'use strict';
  var __webpack_modules__ = {
    './bar.js': (__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
      __webpack_require__.r(__webpack_exports__);
      __webpack_require__.d(__webpack_exports__, {
        counter: () => counter,
      });
      let counter = 1;
    },
  };

  // The module cache
  var __webpack_module_cache__ = {};

  // The require function
  function __webpack_require__(moduleId) {
    // Check if module is in cache
    var cachedModule = __webpack_module_cache__[moduleId];
    if (cachedModule !== undefined) {
      return cachedModule.exports;
    }
    // Create a new module (and put it into the cache)
    var module = (__webpack_module_cache__[moduleId] = {
      // no module.id needed
      // no module.loaded needed
      exports: {},
    });

    // Execute the module function
    __webpack_modules__[moduleId](module, module.exports, __webpack_require__);

    // Return the exports of the module
    return module.exports;
  }

  /* webpack/runtime/define property getters */
  (() => {
    // define getter functions for harmony exports
    __webpack_require__.d = (exports, definition) => {
      for (var key in definition) {
        if (
          __webpack_require__.o(definition, key) &&
          !__webpack_require__.o(exports, key)
        ) {
          Object.defineProperty(exports, key, {
            enumerable: true,
            get: definition[key],
          });
        }
      }
    };
  })();

  /* webpack/runtime/hasOwnProperty shorthand */
  (() => {
    __webpack_require__.o = (obj, prop) =>
      Object.prototype.hasOwnProperty.call(obj, prop);
  })();

  /* webpack/runtime/make namespace object */
  (() => {
    // define __esModule on exports
    __webpack_require__.r = (exports) => {
      if (typeof Symbol !== 'undefined' && Symbol.toStringTag) {
        Object.defineProperty(exports, Symbol.toStringTag, {
          value: 'Module',
        });
      }
      Object.defineProperty(exports, '__esModule', { value: true });
    };
  })();

  var __webpack_exports__ = {};
  // This entry need to be wrapped in an IIFE because it need to be isolated against other modules in the chunk.
  (() => {
    // ./main.js

    __webpack_require__.r(__webpack_exports__);
    /* harmony import */
    var _bar__WEBPACK_IMPORTED_MODULE_0__ = __webpack_require__('./bar.js');

    console.log(_bar__WEBPACK_IMPORTED_MODULE_0__.counter);
  })();
})();

The key parts are extracted below:

var __webpack_modules__ = {
  './bar.js': (__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
    __webpack_require__.r(__webpack_exports__);
    __webpack_require__.d(__webpack_exports__, { // 1. Note here 🌟
      counter: () => counter,
    });
    let counter = 1;
  },
};

/* webpack/runtime/define property getters */
(() => {
  // define getter functions for harmony exports
  __webpack_require__.d = (exports, definition) => {
    for (var key in definition) {
      if (
        __webpack_require__.o(definition, key) &&
        !__webpack_require__.o(exports, key)
      ) {
        Object.defineProperty(exports, key, { // 2. And here 🌟
          enumerable: true,
          get: definition[key],
        });
      }
    }
  };
})();

So it turns out that a layer of proxying is added here 😺. Simplifying the code a little makes this easier to see. The code before and after bundling is as follows:

// Before bundling
export let counter = 1;

// After bundling
let counter = 1;
Object.defineProperty(exports, 'counter', {
  enumerable: true,
  get: () => counter,
});

This way, when another module reads exports.counter, the getter delegates directly to the counter variable defined in the original module, so the current value is always returned.
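
The effect of the getter can be checked in isolation. In this sketch, `moduleExports` and `increment` are hypothetical names standing in for the bundled module's exports object and for code inside the module that later changes `counter`.

```javascript
const moduleExports = {};

// Inside the "module": a local variable plus a getter that exposes it.
let counter = 1;
Object.defineProperty(moduleExports, 'counter', {
  enumerable: true,
  get: () => counter,
});

function increment() {
  counter++;
}

// Inside the "importing module": every read goes through the getter.
const before = moduleExports.counter;
increment();
const after = moduleExports.counter;
console.log(before, after); // 1 2

// No setter is defined, so importers cannot reassign the binding.
// In strict mode the assignment throws a TypeError.
let assignmentFailed = false;
try {
  (function () {
    'use strict';
    moduleExports.counter = 100;
  })();
} catch (error) {
  assignmentFailed = error instanceof TypeError;
}
console.log(assignmentFailed, moduleExports.counter); // true 2
```

The read-only behavior matches how imported bindings behave in ES Module: the importing side can observe changes but cannot assign to the imported name.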

At this point, you should understand the difference between how Webpack handles CommonJS modules and ES modules.

Other topics, such as lazy loading of modules and cyclic dependency handling, will be added later.

References

Finally, all referenced documents and articles are listed below.

  1. AMD
  2. CommonJS
  3. Relation-between-commonjs-amd-and-requirejs
  4. Browserify
  5. Node.js Module
  6. RequireJS
  7. WHY WEB MODULES?
  8. WHY AMD
  9. The Value of Front-End Modular Development (recommended reading 🌟)
  10. The History of Front-End Modular Development (🌟)
  11. Node.js
  12. The principle and implementation of browser loading CommonJS module
  13. ES6 Module (Ruan Yifeng)
  14. ESModule in-depth parsing
  15. Webpack modularity principle – CommonJS
  16. Webpack modular principle -ES Module
  17. Front-end build this decade