As a front-end developer, have you ever wondered why the require method can be used directly in code to load a module, how Node knows which file to pick when loading a third-party package, and, a common question, why an ES6 module seems to export primitive values as if they were references?

If you have been curious about these questions, I hope this article answers them.

CommonJS specification

Prior to ES6, ECMAScript did not provide a way to organize code into modules, so “modularity” was usually built on IIFEs. With the massive adoption of JavaScript on the front end and the rise of server-side JavaScript, the ad hoc browser-side approach to modules was no longer suitable for large-scale applications. This is why the CommonJS specification appeared early on; its goal was to define modules and provide a common way of organizing them.

Module definition and usage

In CommonJS, a file is a module. A module defines its exports by attaching them to exports or module.exports.

exports.count = 1;

Importing a module is just as easy: require the corresponding module to get its exports object.

const counter = require('./counter');
console.log(counter.count);

In Node, CommonJS modules are implemented mainly by the native module named module (the Module class). Some properties of this class help us understand the module mechanism.

Module {
  id: '.', // If this is the main module, its id is fixed to '.'; otherwise it is the absolute path of the module
  exports: {}, // The module's exports
  filename: '/absolute/path/to/entry.js', // The absolute path of the current module
  loaded: false, // Whether the module has finished loading
  children: [], // The modules required by this module
  parent: '', // The first module that required this module
  paths: [ // The module's search paths
    '/absolute/path/to/node_modules',
    '/absolute/path/node_modules',
    '/absolute/node_modules',
    '/node_modules'
  ]
}

Where does require come from?

When writing CommonJS modules, we use require to load modules, exports to export them, and also module, __filename, __dirname and so on. Why can these be used without being imported?

The reason is that when Node loads a JS module, it first reads the file content as text and then wraps it in an outer function that receives these variables as parameters. It then uses vm.runInThisContext to turn the string into a function, which creates a scope for the module and avoids polluting the global scope.

// The two halves of the function wrapper placed around every module
const wrapper = [
  '(function (exports, require, module, __filename, __dirname) { ',
  '\n});'
];

// Wrap a module's source text in the function wrapper
const wrap = function (script) {
  return wrapper[0] + script + wrapper[1];
};

That is why these functions and variables can be accessed directly in a CommonJS module without importing them.

module is the Module instance of the current module (even though the module code has not yet been compiled and executed at that point). exports is an alias of module.exports, and what require returns in the end is the value of module.exports. require itself ultimately calls the Module._load method. __filename and __dirname are the absolute path of the current module file and of the directory that contains it, respectively.
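The wrapping step can be reproduced outside of Node's loader. The following is a minimal sketch, not Node's actual implementation: runAsModule is a hypothetical helper, and the source string and file name are made up for illustration. It wraps source text in the same function signature, compiles it with vm.runInThisContext, and calls the result with a fresh module object.

```javascript
const vm = require('vm');
const path = require('path');

// Hypothetical helper: run a string of source code the way a CommonJS module is run.
function runAsModule(source, filename) {
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) { ' +
    source +
    '\n});';

  // Compile the text into a real function; it cannot see local variables of this file.
  const compiled = vm.runInThisContext(wrapped, { filename });

  const mod = { exports: {} };
  // exports starts out as an alias of module.exports.
  compiled.call(mod.exports, mod.exports, require, mod, filename, path.dirname(filename));
  return mod.exports;
}

const demoExports = runAsModule(
  'exports.count = 1; exports.file = __filename;',
  '/tmp/demo/counter.js'
);
console.log(demoExports);
```

Because the five names are ordinary function parameters, the module code can use them without declaring or importing anything.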

The module lookup process

Using require is easy for developers, but internally, locating modules of different types, packages in node_modules and so on is somewhat involved.

First, when a module object is created, it gets a paths property whose value is computed from the current file path: the node_modules directory at every level from the current directory up to the system root. You can print module.paths inside a module to see it.

[
  '/Users/evan/Desktop/demo/node_modules',
  '/Users/evan/Desktop/node_modules',
  '/Users/evan/node_modules',
  '/Users/node_modules',
  '/node_modules'
]
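How such a list is derived can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not Node's own code: nodeModulePaths is a hypothetical helper, and path.posix is used only so that the result matches the POSIX-style listing above on any platform.

```javascript
const path = require('path');

// Hypothetical helper that mimics how the search path list is derived:
// walk from the starting directory up to the root, adding node_modules at each level.
function nodeModulePaths(fromDir) {
  const paths = [];
  let dir = path.posix.resolve(fromDir);
  for (;;) {
    paths.push(path.posix.join(dir, 'node_modules'));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root
    dir = parent;
  }
  return paths;
}

const searchPaths = nodeModulePaths('/Users/evan/Desktop/demo');
console.log(searchPaths);
```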

In addition, the global paths (if they exist) are also searched:

[
  execPath/../../lib/node_modules, // lib/node_modules relative to the current node executable
  NODE_PATH, // the NODE_PATH environment variable
  HOME/.node_modules, // .node_modules in the HOME directory
  HOME/.node_libraries // .node_libraries in the HOME directory
]

The official documentation describes the lookup process in detail; only the general flow is given here.

Run require(X) from path Y

1. If X is a built-in module (e.g. require('http'))
   a. Return the module.
   b. Stop.

2. If X begins with '/'
   a. Set Y to '/'

3. If X begins with './', '/' or '../'
   a. Try loading files in turn; stop if one is found
      - (Y + X)
      - (Y + X).js
      - (Y + X).json
      - (Y + X).node
   b. Try loading directories in turn; stop if one is found
      - (Y + X + main field of package.json).js
      - (Y + X + main field of package.json).json
      - (Y + X + main field of package.json).node
   c. Throw "not found"

4. Search the module paths in turn; stop if the module is found
5. Throw "not found"
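Parts of this flow can be observed with require.resolve, which returns the resolved file name without executing the module. The sketch below builds a throwaway directory; every file and directory name in it is made up for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a throwaway directory (all names here are hypothetical).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-demo-'));
fs.writeFileSync(path.join(dir, 'x.js'), 'module.exports = "file";');
fs.mkdirSync(path.join(dir, 'pkg'));
fs.writeFileSync(path.join(dir, 'pkg', 'package.json'), JSON.stringify({ main: 'lib.js' }));
fs.writeFileSync(path.join(dir, 'pkg', 'lib.js'), 'module.exports = "directory";');

// Step 1: a built-in module resolves to its own name.
const builtinHit = require.resolve('http');
// Step 3a: a path without an extension falls through to (Y + X).js.
const fileHit = require.resolve(path.join(dir, 'x'));
// Step 3b: a directory resolves through the main field of its package.json.
const dirHit = require.resolve(path.join(dir, 'pkg'));

console.log(builtinHit, path.basename(fileHit), path.basename(dirHit));
```

Only the base names are printed because the temporary directory differs from machine to machine.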

During lookup, symlinks are replaced with their real paths in the file system. For example, suppose lib/foo/node_modules/bar is a symlink to lib/bar, and the bar package calls require('quux'). When the foo module finally runs, require('quux') is looked up at lib/bar/node_modules/quux rather than lib/foo/node_modules/quux.

Module loading

The main module

When you run node index.js, Node calls the static method _load(process.argv[1]) on the Module class to load the module, marks it as the main module, and assigns it to process.mainModule and require.main. You can use these two fields to determine whether the current module is the main module or was required by another module.

CommonJS loads modules synchronously and in a blocking manner while the code runs: when execution reaches require(X), it pauses until the new module has been loaded and then continues with the code that follows.

Although loading is synchronous and blocking, this step is actually very fast. It is not comparable to a browser blocking on downloading, parsing and executing a JS file, because reading a file from disk is much faster than a network request.

Caching and circular references

Looking up file modules is time-consuming. If every require had to traverse directories, performance would be poor. Also, in real development modules may contain side effects, such as calling addEventListener at the top level, which could cause problems if the module were executed again on every require.

Caching in CommonJS solves both repeated lookup and repeated execution. During module loading, the module object is written to the cache with the module's absolute path as the key. On require, if the module is already in the cache, its module.exports is returned directly; if not, Node looks the module up, loads it, and writes it to the cache.

// a.js
module.exports = {
    foo: 1
};

// main.js
const a1 = require('./a.js');
a1.foo = 2;

const a2 = require('./a.js');

console.log(a2.foo); // 2
console.log(a1 === a2); // true

In the above example, we require a.js and modify the foo property, then require a.js again. Both require calls return the same object.

The module cache can be viewed by printing require.cache.

{
    '/Users/evan/Desktop/demo/main.js':
       Module {
         id: '.',
         exports: {},
         parent: null,
         filename: '/Users/evan/Desktop/demo/main.js',
         loaded: false,
         children: [ [Object] ],
         paths:
          [ '/Users/evan/Desktop/demo/node_modules', '/Users/evan/Desktop/node_modules', '/Users/evan/node_modules', '/Users/node_modules', '/node_modules' ] },
    '/Users/evan/Desktop/demo/a.js':
       Module {
         id: '/Users/evan/Desktop/demo/a.js',
         exports: { foo: 1 },
         parent:
          Module {
            id: '.',
            exports: {},
            parent: null,
            filename: '/Users/evan/Desktop/demo/main.js',
            loaded: false,
            children: [Array],
            paths: [Array] },
         filename: '/Users/evan/Desktop/demo/a.js',
         loaded: true,
         children: [],
         paths:
          [ '/Users/evan/Desktop/demo/node_modules', '/Users/evan/Desktop/node_modules', '/Users/evan/node_modules', '/Users/node_modules', '/node_modules' ] } }
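The effect of the cache, and of removing an entry from it, can be checked with a small experiment. This is a sketch under stated assumptions: the temporary file and the global counter demoLoadCount are invented for the demonstration, and deleting from require.cache is shown only to make the caching visible, not as a recommended practice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'a.js');
// The module counts its own executions on a global, purely for the demo.
fs.writeFileSync(
  file,
  'globalThis.demoLoadCount = (globalThis.demoLoadCount || 0) + 1;\n' +
  'module.exports = { foo: 1 };\n'
);

const first = require(file);
const second = require(file);            // served from the cache
const runsWhileCached = globalThis.demoLoadCount;

const cacheKey = require.resolve(file);  // the absolute path used as the key
const wasCached = cacheKey in require.cache;

delete require.cache[cacheKey];          // evict the entry
const third = require(file);             // the file is executed again
const runsAfterEvict = globalThis.demoLoadCount;

console.log(first === second, wasCached, first === third, runsWhileCached, runsAfterEvict);
```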

Caching also solves the problem of circular references. For example, suppose module a requires module b, and module b requires module a.

// main.js
const a = require('./a');
console.log('in main, a.a1 = %j, a.a2 = %j', a.a1, a.a2);

// a.js
exports.a1 = true;
const b = require('./b.js');
console.log('in a, b.done = %j', b.done);
exports.a2 = true;

// b.js
const a = require('./a.js');
console.log('in b, a.a1 = %j, a.a2 = %j', a.a1, a.a2);

The execution result of the program is as follows:

in b, a.a1 = true, a.a2 = undefined
in a, b.done = undefined
in main, a.a1 = true, a.a2 = true

Before module a's code is executed, its Module instance has already been created and written to the cache. At this point exports is an empty object.

{
  '/Users/evan/Desktop/module/a.js':
    Module {
      exports: {},
      // ...
    }
}

Then exports.a1 = true; runs and sets a1 on module.exports to true. At this point the line that sets a2 has not been executed yet.

{
  '/Users/evan/Desktop/module/a.js':
    Module {
      exports: {
        a1: true
      }
      // ...
    }
}

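Execution then enters module b. Its require of a.js finds the entry in the cache and gets module a's exports object, so a1 and a2 print as true and undefined respectively.

After module b finishes, control returns to module a, which continues with exports.a2 = true;. Now both a1 and a2 on module a's exports object are true.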

exports: {
  a1: true,
  a2: true
}

Back in the main module, since require('./a.js') returns a reference to module a's exports object, a1 and a2 both print as true.

Summary:

CommonJS module loading is synchronous and blocking. A module is written to the cache before its code runs, the same module is executed only once, and repeated require calls return the same exports reference.

Note that the cache key is the module's absolute location in the file system. Because resolution depends on where the call is made from, the same require('foo') code is not guaranteed to return the same object reference everywhere. I once ran into two require('egg-core') calls whose results were not equal.
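This situation is easy to reproduce. The sketch below creates two hypothetical applications, app1 and app2, each with its own copy of a package named foo under node_modules; all names and file contents are invented for the demonstration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'dup-demo-'));

// Create app1 and app2, each with its own copy of a package named foo.
for (const app of ['app1', 'app2']) {
  const pkgDir = path.join(root, app, 'node_modules', 'foo');
  fs.mkdirSync(pkgDir, { recursive: true });
  fs.writeFileSync(path.join(pkgDir, 'index.js'), 'module.exports = {};');
  // Each app re-exports whatever require('foo') gives it.
  fs.writeFileSync(path.join(root, app, 'main.js'), "module.exports = require('foo');");
}

const fooFromApp1 = require(path.join(root, 'app1', 'main.js'));
const fooFromApp2 = require(path.join(root, 'app2', 'main.js'));

// Same source text, require('foo'), but two different files and two cache entries.
const sameReference = fooFromApp1 === fooFromApp2;
console.log(sameReference);
```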

ES6 module

ES6 modules are the approach front-end developers are more familiar with. They use the import and export keywords to bring modules in and out. Instead of achieving modularity through closures and function wrapping, ES6 provides modules at the syntax level.

ES6 modules do not have require, module.exports, __filename and so on, and the import statement cannot be used in CommonJS. The two specifications are not compatible; most ES6 module code written day to day ends up being converted to CommonJS code by Babel, TypeScript and similar tools.

To use ES6 modules natively in Node, change the file extension from .js to .mjs, or set the "type" field in package.json to "module". Either one tells Node to load the file as an ES module.

ES6 module loading process

The loading process of an ES6 module is divided into three steps:

1. Find, download, parse and build all module instances.

Before the program starts, ES6 modules find all modules by following the import relationships, build a module graph, and create all module instances. Each module is instantiated only once, which naturally avoids the loading problems of circular references. There is also a module cache: importing the same module repeatedly executes its code only once.

2. Allocate space in memory for the content to be exported (the exported values are not written at this time). Imports and exports are then pointed at these spaces in memory, a process also known as linking.

The work done in this step is the live binding of imports and exports. The following examples help to illustrate it.

// counter.cjs
let count = 1;

function increment () {
  count++;
}

module.exports = {
  count,
  increment
}

// main.js
const counter = require('./counter.cjs');

counter.increment();
console.log(counter.count); // 1

count++ modifies the primitive variable inside the module; it does not change exports.count, which holds a copy of the value, so the printed result is 1.

// counter.mjs
export let count = 1;

export function increment () {
  count++;
}

// main.mjs
import { increment, count } from './counter.mjs'

increment();
console.log(count); // 2

As the result shows, when an exported variable is modified inside an ES6 module, the imported value is affected as well. This feature is called live binding. How it is implemented underneath the specification is left aside for now, but live binding is easier to understand than the “ES6 modules output references to values” description found in many online articles.

CommonJS code closer to an ES6 module might look like this:

exports.counter = 1;

exports.increment = function () {
    exports.counter++;
}

3. Run the module code to fill in the actual values of the variables in the space allocated in step 2.

In step 3, a depth-first post-order traversal is performed over the module graph built in step 1, and an exception is thrown if an uninitialized space is accessed during this process.

// a.mjs
export const a1 = true;
import * as b from './b.mjs';
export const a2 = true;

// b.mjs
import { a1, a2 } from './a.mjs'
console.log(a1, a2);

The above example throws ReferenceError: Cannot access 'a1' before initialization at runtime. If we change the import in b.mjs to import * as a from './a.mjs', we can see what module a's exports look like at that moment.

// b.mjs
import * as a from './a.mjs'
console.log(a);

The output is { a1: <uninitialized>, a2: <uninitialized> }, which shows that the ES6 module has reserved space for the exported variables but has not assigned values to them yet. This differs from the CommonJS case, where a1 is true and a2 is undefined.
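The same reserved-but-unfilled state exists for any let or const binding inside a single file, which gives a rough analogy that runs without ES module setup. This sketch is only an analogy for the module case; the names readA1, errorName, and valueAfterInit are made up.

```javascript
let errorName = null;
let valueAfterInit = null;
{
  // The binding a1 exists for the whole block, but holds no value yet.
  const readA1 = () => a1;

  try {
    readA1(); // too early: the slot is reserved but not filled
  } catch (err) {
    errorName = err.name;
  }

  const a1 = true; // the slot is filled here
  valueAfterInit = readA1();
}

console.log(errorName, valueAfterInit);
```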

In addition, we can deduce some differences between ES6 modules and CommonJS:

  • In CommonJS, require can be called at run time with a computed value, for example require(path.join('xxxx', 'xxx.js')). The static import syntax cannot do this (dynamic import, which returns a Promise, can), because ES6 modules resolve all modules before executing any code.

  • require brings in the entire exports object, while import can bring in only the parts that are needed. This is why tree shaking requires code written as ES6 modules.
  • If an import names something the other module does not export, an error is reported before the code executes, whereas CommonJS reports the error only when the module runs.
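The first difference can be seen from a CommonJS file, where both forms accept a computed specifier but hand back different things. The snippet is a small sketch; the variable names are made up, and the built-in os module is used only because it is always available.

```javascript
// require() is an ordinary function call, so its argument can be computed at run time.
const moduleName = ['o', 's'].join('');
const osModule = require(moduleName);

// import() also accepts a computed specifier, but it hands back a Promise.
const pending = import(moduleName);
const isThenable = typeof pending.then === 'function';

pending
  .then((ns) => console.log('loaded asynchronously:', typeof ns.platform))
  .catch((err) => console.error('dynamic import failed:', err.message));

console.log(typeof osModule.platform, isThenable);
```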

Why can the two be mixed in everyday development?

As mentioned earlier, ES6 modules and CommonJS modules are quite different and cannot be mixed directly. In everyday development it works anyway, because the ES6 modules we write are eventually transformed by build tools into CommonJS modules, which makes them compatible with more environments and lets them work with the CommonJS modules that are common in the community.

Some confusion may arise during the conversion process, such as:

What is __esModule? What is it for?

When ES6 modules are processed with a transformation tool, you often see an __esModule property in the output. As the name suggests, it marks the module as an ES6 module. It exists to make handling easier when the module is referenced.

For example, export default in an ES6 module is attached to exports['default'] when converted to CommonJS, so require('./a.js') cannot read the default value directly. To stay consistent with the behavior of ES6's import a from './a.js', the handling is based on the __esModule flag.

// a.js
export default 1;

// main.js
import a from './a';

console.log(a);

After the transformation:

// a.js
Object.defineProperty(exports, "__esModule", {
  value: true
});
exports.default = 1;

// main.js
'use strict';

var _a = require('./a');

var _a2 = _interopRequireDefault(_a);

function _interopRequireDefault(obj) { return obj && obj.__esModule ? obj : { default: obj }; }

console.log(_a2.default);

The export default in module a is converted to exports.default = 1;. This is also why .default is often needed to get the target value when require is used in front-end projects.

Then, when import a from './a.js' runs, ES module semantics expect the content of export default to be returned. The tool wraps the require result with _interopRequireDefault, which checks whether the module is marked with __esModule; if not, it wraps the exports in { default: obj }. The value of a is then read as _a2.default.
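The two branches of the helper can be exercised directly. In this sketch the helper is copied from the transformed code above, while transpiledEsModule and plainCommonJs are made-up stand-ins for what require would return in each case.

```javascript
function _interopRequireDefault(obj) {
  return obj && obj.__esModule ? obj : { default: obj };
}

// What a transpiled `export default 1` looks like to require().
const transpiledEsModule = {};
Object.defineProperty(transpiledEsModule, '__esModule', { value: true });
transpiledEsModule.default = 1;

// A hand-written CommonJS module whose module.exports is a function.
const plainCommonJs = function greet() {
  return 'hi';
};

// Flagged modules are passed through; everything else is wrapped in { default }.
const fromEs = _interopRequireDefault(transpiledEsModule).default;
const fromCjs = _interopRequireDefault(plainCommonJs).default;

console.log(fromEs, fromCjs === plainCommonJs);
```

Either way the importing code can read .default, so a default import behaves the same for both kinds of module.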

Reference links

  • Nodejs.org/api/modules…
  • Hacks.mozilla.org/2018/03/es-…
  • Segmentfault.com/a/119000000…
  • www.infoq.cn/article/nod…