Preface

Early JavaScript lacked a module system. Scripts had to be loaded and ordered through HTML, which seriously restricted the development of JavaScript. The CommonJS specification gave JavaScript the basic ability to build large applications. Node.js uses the CommonJS Modules specification to implement a simple, easy-to-use module system, paving the way for JavaScript development on the server side.

CommonJS module specification

The CommonJS module specification defines three parts:

  • Module reference: The module context provides a require method, which accepts a module identifier as its parameter and brings that module's API into the context of the current module.

  • Module definition: Inside a module, a module object represents the module itself, and exports is a property of module that holds what the module exports.

  • Module identifier: The module identifier is the parameter passed to the require method. It must be a string, either a relative or absolute path or a name written in lower camel case.

The implementation of Node module

  • Module reference: The context of a Node module provides the require method, which loads other modules, for example const fs = require("fs");.

  • Module definition: Node uses a single file as the basic unit of a module, i.e., one file is one module, and every property or method mounted on the exports object is exported.

// person.js
exports.name = "vincent";
exports.say = function() {
    console.log("hello world");
};

// driver.js
const person = require("./person"); // relative path: person.js sits next to driver.js
exports.say = function() {
    person.say();
    console.log(`I am ${person.name}`);
};

  • Module identifier: Node divides modules into two categories. One is the built-in modules provided by Node, also known as core modules. The other is user-written modules, called user or third-party modules. Module identifiers in Node take the following forms.

    1. Absolute path: /path/my/module
    2. Relative path: ../path/my/module or ./path/my/module
    3. Module name: for example http, fs, or koa
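
The three identifier forms can be tried out with a small sketch. It assumes a throwaway directory and a hypothetical person.js created only for this demo, and it uses createRequire so that relative identifiers are resolved as if from a (non-existent) driver.js inside that directory:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Throwaway directory holding a hypothetical person.js (demo only).
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'ident-')));
const file = path.join(dir, 'person.js');
fs.writeFileSync(file, 'exports.name = "vincent";');

// A require function that resolves as if called from <dir>/driver.js.
const req = createRequire(path.join(dir, 'driver.js'));

const byCoreName = req.resolve('fs');           // core module: the name itself
const byAbsolutePath = req.resolve(file);       // absolute path
const byRelativePath = req.resolve('./person'); // relative path, extension added
const lookupDirs = req.resolve.paths('koa');    // directories searched for a bare name

console.log(byCoreName, byAbsolutePath, byRelativePath);
console.log(lookupDirs.slice(0, 3));
```

Both path forms end up at the same real file, while a bare name such as koa is only turned into a list of node_modules directories to search.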

That is an overview of how Node implements the CommonJS module specification. Node, however, makes some changes to the specification and adds its own features to the require and exports processes.

Module require procedure

  1. Query cache: Like most loaders, Node follows a cache-first loading policy: when the same module is loaded a second time, the cache is checked first. This is similar to the way browsers cache static resources, except that Node caches module objects that have already been compiled and executed. In addition, the Node process loads some of the core modules into memory at startup, so steps 3 and 4 are skipped for them. Built-in modules also take priority during path analysis, so core modules load the fastest.
  2. Path analysis: Node analyzes the module identifier to determine what kind of module is being required. When a module is required by path, the identifier is converted into a real path and cached as an index. If the module is a core module, it is loaded as described in step 1. If it is a third-party module, Node looks it up in the directories listed in the module's paths field (module.paths).

    As the figure above shows, the module path lookup rule can be summarized as: starting from the directory of the current file, search the node_modules directory at each level, moving up one parent directory at a time. This is similar to JavaScript scope chain lookups: the deeper the hierarchy, the slower the lookup. The following is the core logic of the require implementation (some code omitted):

Module.prototype.require = function(id) {
    return Module._load(id, this, /* isMain */ false);
};

Module._load = function(request, parent, isMain) {
  let relResolveCacheIdentifier;
  if (parent) {
    // There is a parent module: join the paths to form a temporary
    // cache index (used to look up the real path)
    relResolveCacheIdentifier = `${parent.path}\x00${request}`;
    const filename = relativeResolveCache[relResolveCacheIdentifier];
    // Look up the module in the cache
    if (filename !== undefined) {
      const cachedModule = Module._cache[filename];
      if (cachedModule !== undefined) {
        // Push the module onto the children array of the parent module
        updateChildren(parent, cachedModule, true);
        return cachedModule.exports;
      }
      // Delete the temporary index
      delete relativeResolveCache[relResolveCacheIdentifier];
    }
  }

  // Resolve the real path
  const filename = Module._resolveFilename(request, parent, isMain);

  const cachedModule = Module._cache[filename];
  // If the cache entry exists, return its module.exports
  if (cachedModule !== undefined) {
    updateChildren(parent, cachedModule, true);
    return cachedModule.exports;
  }

  // No cache entry: look for a core module first
  const mod = loadNativeModule(filename, request, experimentalModules);
  // If developers may require it directly, return its module.exports
  if (mod && mod.canBeRequiredByUsers) return mod.exports;

  // Create the module instance and cache it
  const module = new Module(filename, parent);
  Module._cache[filename] = module;
  if (parent !== undefined) {
    relativeResolveCache[relResolveCacheIdentifier] = filename;
  }

  // Track whether loading succeeded
  let threw = true;
  try {
    // Load the module according to its extension:
    // .js   -> fs.readFileSync -> compile
    // .json -> fs.readFileSync -> JSON.parse
    // .node -> dlopen (C/C++ module)
    // other types omitted
    module.load(filename);
    threw = false;
  } finally {
    // Loading failed: delete the cache entry and the index
    if (threw) {
      delete Module._cache[filename];
      if (parent !== undefined) {
        delete relativeResolveCache[relResolveCacheIdentifier];
      }
    }
  }

  return module.exports;
};
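
The cache-first behavior of Module._load can be observed from user code. The following is a minimal sketch, assuming a hypothetical counter.js written to a temporary directory and a made-up global, __demoLoads, that counts how often the module body runs:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical module that records how many times its body has executed.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(
  file,
  'global.__demoLoads = (global.__demoLoads || 0) + 1;\n' +
  'exports.loads = global.__demoLoads;\n'
);

const first = require(file);
const second = require(file); // served from the cache, body not re-executed

// require.cache is keyed by the resolved (real) filename.
const key = require.resolve(file);
const wasCached = key in require.cache;

// Removing the entry forces the next require to compile and run the file again.
delete require.cache[key];
const third = require(file);

console.log(first === second, wasCached, third.loads);
```

Because the module object is cached after its first execution, the second require returns the very same exports object; only after the cache entry is deleted does the module body run again.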

  3. File location: Step 2 explained the path analysis strategy. The source code below shows more directly how a file is located.
Module._findPath = function(request, paths, isMain) {
  // Absolute path
  const absoluteRequest = path.isAbsolute(request);
  if (absoluteRequest) {
    paths = [''];
  } else if (!paths || paths.length === 0) {
    return false;
  }

  // Try the path cache first, using the cache index
  const cacheKey = request + '\x00' +
                (paths.length === 1 ? paths[0] : paths.join('\x00'));
  const entry = Module._pathCache[cacheKey];
  if (entry) return entry;

  // Check whether the request ends with "/"
  var exts;
  var trailingSlash = request.length > 0 &&
    request.charCodeAt(request.length - 1) === CHAR_FORWARD_SLASH;
  if (!trailingSlash) {
    trailingSlash = /(?:^|\/)\.?\.$/.test(request);
  }

  // For each path
  for (var i = 0; i < paths.length; i++) {
    // Don't search further if path doesn't exist
    const curPath = paths[i];
    if (curPath && stat(curPath) < 1) continue;
    var basePath = resolveExports(curPath, request, absoluteRequest);
    var filename;

    // Query the file type
    var rc = stat(basePath);
    if (!trailingSlash) {
      // Does the file exist?
      if (rc === 0) {  // File.
        // Try to get the real path based on the module type (details omitted)
        filename = findPath();
      }

      // Try appending each known extension to the file name
      if (!filename) {
        if (exts === undefined) exts = Object.keys(Module._extensions);
        filename = tryExtensions(basePath, exts, isMain);
      }
    }

    // The path is a directory and no extension matched:
    // try filename/index[.extension]
    if (!filename && rc === 1) {  // Directory.
      if (exts === undefined) exts = Object.keys(Module._extensions);
      filename = tryPackage(basePath, exts, isMain, request);
    }

    // Cache the path and return it
    if (filename) {
      Module._pathCache[cacheKey] = filename;
      return filename;
    }
  }

  // No file found: return false
  return false;
};
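
The extension and directory fallbacks in Module._findPath can be checked with a small experiment. This is only a sketch: lib.js, data.json and pkg/index.js are hypothetical files created in a temporary directory, and main.js is an imaginary file used solely as the starting point for resolution.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Hypothetical layout: lib.js, data.json and pkg/index.js in a temp directory.
const dir = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'findpath-')));
fs.writeFileSync(path.join(dir, 'lib.js'), 'module.exports = "lib";');
fs.writeFileSync(path.join(dir, 'data.json'), '{"kind": "json"}');
fs.mkdirSync(path.join(dir, 'pkg'));
fs.writeFileSync(path.join(dir, 'pkg', 'index.js'), 'module.exports = "pkg index";');

// Resolve as if from <dir>/main.js (the file itself does not need to exist).
const req = createRequire(path.join(dir, 'main.js'));

const resolved = {
  lib: req.resolve('./lib'),   // no extension given: ".js" is tried
  data: req.resolve('./data'), // falls through to ".json"
  pkg: req.resolve('./pkg'),   // a directory: resolves to its index file
};

console.log(resolved);
```

An identifier without an extension is completed from the keys of Module._extensions, and a directory falls back to its index file, which matches the tryExtensions and tryPackage branches above.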

  4. Compile and execute: The steps above determine the actual path of the module. Once the file is located, Node compiles the module in a way that depends on its file name extension. Compilation falls roughly into three types: JavaScript modules, C/C++ modules, and JSON files. Only JavaScript modules are described here. When compiling a JavaScript module, Node wraps the content of the JS file with a header and a tail.

(function(exports, require, module, __filename, __dirname) {\n
    // JS file code...
\n});

This gives each module file its own scope. The wrapped code is turned into a function object by the runInThisContext() method of the native vm module. Finally, the exports property of the current module object, the require method, the module reference, and the file and directory names are passed into that function as arguments, and it is executed. This is why these variables exist in every module file even though the file never defines them.
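
The wrapping step can be approximated in a few lines. This is a simplified sketch, not Node's actual implementation: compileSketch is a hypothetical helper, the module source is passed in as a string instead of being read from disk, and the surrounding file's own require is handed to the module.

```javascript
const vm = require('vm');
const path = require('path');

// Hypothetical helper that mimics the wrap-compile-execute sequence.
function compileSketch(source, filename) {
  // 1. Wrap the source with the header and tail.
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {\n' +
    source +
    '\n})';
  // 2. Compile the wrapper into a function object.
  const fn = vm.runInThisContext(wrapped, { filename });
  // 3. Call it with the module-specific arguments.
  const module = { exports: {} };
  fn.call(module.exports, module.exports, require, module, filename, path.dirname(filename));
  return module.exports;
}

const result = compileSketch(
  'exports.name = "vincent"; exports.dir = __dirname;',
  '/tmp/person.js' // made-up file name, used only for __filename and __dirname
);

console.log(result.name, result.dir);
```

The module source never declares exports or __dirname, yet it can use both, because they arrive as parameters of the wrapper function.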

Module Exports procedure

When I first started with Node, I was confused about the relationship between module.exports and exports. Both can have properties and methods mounted on them as the exports of the current module. But what exactly is each one, and how do they differ? Look at the following code:

// a.js
exports.name = "vincent";
module.exports.name = "ziwen.fu";
exports.age = 24;

// b.js
const a = require("./a");
console.log(a); // { name: 'ziwen.fu', age: 24 }

exports is a reference to module.exports, whose initial value is an empty object {}. Properties and methods mounted on exports are therefore ultimately exported through module.exports. Continue with the following code:

// a.js
exports = "from exports";

module.exports = "from module.exports";

// b.js
const a = require("./a");
console.log(a);         // from module.exports

When module.exports and exports are each assigned directly, the value of module.exports is what gets exported. exports is passed into the current module context as a parameter, and reassigning a parameter does not change the value outside the function scope. The test code is as follows:

const myModule = function(myExports) {
    myExports = "24";
    console.log(myExports);
};

const myExports = "8";
myModule(myExports);    // 24
console.log(myExports); // 8

Circular dependencies in Node modules

Here’s an example from the Node website:

// a.js
console.log('a starting');
exports.done = false;
const b = require('./b.js');
console.log('in a, b.done = %j', b.done);
exports.done = true;
console.log('a done');

// b.js
console.log('b starting');
exports.done = false;
const a = require('./a.js');
console.log('in b, a.done = %j', a.done);
exports.done = true;
console.log('b done');

// main.js
console.log('main starting');
const a = require('./a.js');
const b = require('./b.js');
console.log('in main, a.done = %j, b.done = %j', a.done, b.done);

// console.log
main starting
a starting
b starting
in b, a.done = false
b done
in a, b.done = true
a done
in main, a.done = true, b.done = true

When main.js loads a.js, a.js in turn loads b.js. At that point, b.js tries to load a.js. To avoid an infinite loop, an unfinished copy of the a.js exports object is returned to b.js. b.js then finishes loading, and its exports object is provided to a.js, which completes the process. ES6 modules offer their own solution to this problem, and Node's ES6 module support has entered the testing phase, so I won't go into details here.

Actual application of Node module

The basic mechanism of Node modules is described above. In most cases we require modules up front: the modules a file needs are imported in advance at the top of the code. Sometimes, however, a program does not need a module immediately, and loading it dynamically is a better choice. The following code, from the loader of a Node web service framework (egg-core), shows one such approach:

// egg-core/lib/loader/utils
loadFile(filepath) {
  // filepath comes from require.resolve(path)
  try {
    // Non-JavaScript module: return the file content directly.
    // Module._extensions holds the file extensions Node modules support.
    const extname = path.extname(filepath);
    if (extname && !Module._extensions[extname]) {
      return fs.readFileSync(filepath);
    }
    // JavaScript module: require it
    const obj = require(filepath);
    if (!obj) return obj;
    // Handle the result of an ES6 module
    if (obj.__esModule) return 'default' in obj ? obj.default : obj;
    return obj;
  } catch (err) {
    // ...
  }
}

// egg-core/lib/loader/context_loader
// Proxy the context object app.context;
// property is the corresponding name in the project
Object.defineProperty(app.context, property, {
  get() {
    // Query the cache
    if (!this[CLASSLOADER]) {
      this[CLASSLOADER] = new Map();
    }
    const classLoader = this[CLASSLOADER];
    // Get the module object instance and cache it
    let instance = classLoader.get(property);
    if (!instance) {
      instance = getInstance(target, this);
      classLoader.set(property, instance);
    }
    return instance;
  },
});

The idea behind the egg loader is to require modules dynamically, wrap them, and mount them on the context object by proxying properties of app.context. property is the module name located through require.resolve. With the help of caching, the program can load modules on demand at run time, and the cost of loading modules and of maintaining module names and paths is reduced.
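
The on-demand pattern can be reduced to a getter plus a cache. The sketch below is a simplification under stated assumptions: defineLazy is a hypothetical helper, the cache lives in a closure rather than on the context object as in egg-core, and the core os module stands in for an application module.

```javascript
// Hypothetical helper: defines a property whose value is produced by load()
// on first access and served from a cache afterwards.
function defineLazy(target, property, load) {
  const cache = new Map();
  Object.defineProperty(target, property, {
    get() {
      if (!cache.has(property)) {
        cache.set(property, load());
      }
      return cache.get(property);
    },
  });
}

let loadCount = 0;
const context = {};

defineLazy(context, 'os', () => {
  loadCount += 1;
  return require('os'); // required only when the property is first read
});

const countBeforeAccess = loadCount;
const firstAccess = context.os;
const secondAccess = context.os;

console.log(countBeforeAccess, loadCount, firstAccess === secondAccess);
```

Nothing is required until the property is read, and repeated reads reuse the cached value, so a module that is never touched is never loaded.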

Currently, require in Node's module mechanism is a synchronous API implemented on top of readFileSync, which is inconvenient when loading large files. ES modules, still experimental, support asynchronous dynamic import and also address the circular dependency problem; they may come to be widely used in Node's module mechanism in the future.

References

  • Nodejs in Plain English
  • commonjs-what-why-and-how
  • Nodejs.org/docs/latest…
  • eggjs.org/api/