Node.js follows the CommonJS specification, whose core mechanism is loading dependent modules through require. This article first introduces the internal logic of require, then analyzes the source code of the require logic in Node v17.1.0, the latest version at the time of writing.

Internal logic of require

This chapter works out the internal logic of require, starting from the ways require is used in everyday Node development.

There are three scenarios for daily use of require loading modules:

  1. Load a Node native module, as in require('fs');
  2. Load files in your own project, usually given as relative or absolute paths, as in require('./utils');
  3. Load NPM dependencies, such as require('lodash');
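
The three scenarios can be told apart from the request string alone. The following is a minimal sketch, assuming a hypothetical helper named classifyRequest (Node itself has no such function); it relies only on the builtinModules list exported by the module package and on path.isAbsolute.

```javascript
const { builtinModules } = require('module');
const path = require('path');

// Hypothetical helper for illustration only; Node has no function with this name.
function classifyRequest(request) {
  // Native modules may also be written with the node: prefix.
  const name = request.startsWith('node:') ? request.slice(5) : request;
  if (builtinModules.includes(name)) return 'native';

  // Relative ('./', '../') and absolute requests point at files in a project.
  const isRelative = request === '.' || request === '..' ||
    request.startsWith('./') || request.startsWith('../');
  if (isRelative || path.isAbsolute(request)) return 'path';

  // Everything else is looked up in node_modules directories.
  return 'package';
}

console.log(classifyRequest('fs'));
console.log(classifyRequest('./utils'));
console.log(classifyRequest('lodash'));
```

The order of the checks matters: the native check comes first, which mirrors the precedence native modules get in the steps listed next.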

According to the Node manual, the internal logic of require('X') is:

  1. If X is a native module:
  • A. Return the module.
  • B. Stop; no further action is required.
  2. If X begins with a relative or absolute path:
  • A. Determine the absolute path of X, based on the path of the module that calls require.
  • B. Treat X as a file and look for X, X.js, X.json, X.node in that order. As soon as one of them exists, return that file and stop.
  • C. Treat X as a directory and look for X/package.json. If it has a main field, return the file main points to. If there is no main field, look for X/index.js, X/index.json, X/index.node in that order. As soon as one of them exists, return that file and stop.
  3. If X has no path:
  • A. Determine the possible installation directories of X, based on the path of the module that calls require.
  • B. In each directory in turn, treat X as a file name or a directory name and apply the logic of 2B and 2C.
  4. Throw a "not found" exception.
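
The steps above can be sketched in user-land code. This is a simplified illustration, not Node's implementation: resolveSketch, loadAsFile, loadIndex and loadAsDirectory are hypothetical names, the global folders are ignored, and symlinks are not resolved. The fixture at the end is a throwaway temporary directory created only to exercise the sketch.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { builtinModules } = require('module');

const EXTENSIONS = ['.js', '.json', '.node'];
const isFile = (p) => fs.existsSync(p) && fs.statSync(p).isFile();
const isDir = (p) => fs.existsSync(p) && fs.statSync(p).isDirectory();

// Step 2B: try X, X.js, X.json, X.node in order.
function loadAsFile(x) {
  if (isFile(x)) return x;
  for (const ext of EXTENSIONS) {
    if (isFile(x + ext)) return x + ext;
  }
  return null;
}

// Second half of step 2C: try X/index.js, X/index.json, X/index.node.
function loadIndex(x) {
  for (const ext of EXTENSIONS) {
    const candidate = path.join(x, 'index' + ext);
    if (isFile(candidate)) return candidate;
  }
  return null;
}

// Step 2C: honor "main" in X/package.json, then fall back to index files.
function loadAsDirectory(x) {
  if (!isDir(x)) return null;
  const pkgPath = path.join(x, 'package.json');
  if (isFile(pkgPath)) {
    const { main } = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
    if (main) {
      const target = path.resolve(x, main);
      const found = loadAsFile(target) || loadIndex(target);
      if (found) return found;
    }
  }
  return loadIndex(x);
}

// Hypothetical resolver following steps 1 to 4; fromDir plays the role of
// the directory of the module that calls require.
function resolveSketch(request, fromDir) {
  if (builtinModules.includes(request)) return request; // step 1
  const hasPath = request.startsWith('./') || request.startsWith('../') ||
    path.isAbsolute(request);
  if (hasPath) { // step 2
    const abs = path.resolve(fromDir, request);
    const found = loadAsFile(abs) || loadAsDirectory(abs);
    if (found) return found;
  } else { // step 3: walk up the directory tree, checking node_modules
    let dir = path.resolve(fromDir);
    while (true) {
      const candidate = path.join(dir, 'node_modules', request);
      const found = loadAsFile(candidate) || loadAsDirectory(candidate);
      if (found) return found;
      const parent = path.dirname(dir);
      if (parent === dir) break;
      dir = parent;
    }
  }
  const err = new Error(`Cannot find module '${request}'`); // step 4
  err.code = 'MODULE_NOT_FOUND';
  throw err;
}

// Throwaway fixture: a project file and one installed dependency.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'resolve-sketch-'));
const depDir = path.join(root, 'node_modules', 'dep');
fs.mkdirSync(path.join(root, 'sub'));
fs.mkdirSync(path.join(depDir, 'lib'), { recursive: true });
fs.writeFileSync(path.join(root, 'utils.js'), 'module.exports = 1;');
fs.writeFileSync(path.join(depDir, 'package.json'), '{"main":"lib/main.js"}');
fs.writeFileSync(path.join(depDir, 'lib', 'main.js'), 'module.exports = 2;');

console.log(resolveSketch('./utils', root));
console.log(resolveSketch('dep', path.join(root, 'sub')));
```

Note that 'dep' is requested from the sub directory yet found in the parent's node_modules, which is the upward walk of step 3A.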

Source code analysis of require

This chapter analyzes the require logic, which lives mainly in the file lib/internal/modules/cjs/loader.js.

1. Definition of require

require is defined on the prototype of the Module constructor, so every Module instance can call the require method.

// Loads a module at the given file path. Returns that module's
// `exports` property.
Module.prototype.require = function(id) {
  validateString(id, 'id');
  if (id === '') {
    throw new ERR_INVALID_ARG_VALUE('id', id,
                                    'must be a non-empty string');
  }
  requireDepth++;
  try {
    return Module._load(id, this, /* isMain */ false);
  } finally {
    requireDepth--;
  }
};

Module._load is called to perform the load logic.

2. The Module._load method

The source of Module._load is long, so only the main logic is kept here:

Module._load = function(request, parent, isMain) {
  // Compute the absolute path, which serves as the module identifier
  const filename = Module._resolveFilename(request, parent, isMain);

  // 1. If the module is cached, return its exports from the cache
  const cachedModule = Module._cache[filename];
  if (cachedModule !== undefined) {
    return cachedModule.exports;
  }

  // 2. If it is a native module, return it directly
  const mod = loadNativeModule(filename, request);
  if (mod?.canBeRequiredByUsers) return mod.exports;

  // 3. Create the module instance
  const module = cachedModule || new Module(filename, parent);

  // 4. Put the instance into the cache
  Module._cache[filename] = module;

  // 5. Load the module; remove it from the cache again if loading throws
  let threw = true;
  try {
    module.load(filename);
    threw = false;
  } finally {
    if (threw) {
      delete Module._cache[filename];
    }
  }

  // 6. Return the exports of the module instance
  return module.exports;
};

As you can see from the above code, the key logic of Module._load lies in two methods:

  • Module._resolveFilename(): determines the absolute path of the module, which is used as the module identifier
  • module.load(): loads the module
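
The caching in Module._load is visible from user code through require.cache. The following sketch writes a throwaway module to a temporary directory (the file name counter.js is arbitrary) and requires it several times:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway module whose exports differ on every evaluation.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-cache-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(file, 'module.exports = { token: Math.random() };');

// The second require is served from the cache, so the same object comes back.
const first = require(file);
const second = require(file);
const sameObject = first === second;

// The cache key is the resolved absolute filename.
const key = require.resolve(file);
const cachedUnderFilename = require.cache[key].exports === first;

// Deleting the cache entry forces the file to be loaded again.
delete require.cache[key];
const reloaded = require(file);
const freshAfterDelete = reloaded !== first;

console.log(sameObject, cachedUnderFilename, freshAfterDelete);
```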

3. Module._resolveFilename source code analysis

Module._resolveFilename = function(request, parent, isMain) {
  // Step 1: If it is a built-in module (including names starting with node:), return the name unchanged
  if (StringPrototypeStartsWith(request, 'node:') ||
      NativeModule.canBeRequiredByUsers(request)) {
    return request;
  }

  // Step 2: Determine all possible lookup paths
  const paths = Module._resolveLookupPaths(request, parent);

  // Step 3: Go through the candidates, deepest directory first, to find a path that really exists
  // Look up the filename first, since that's the cache key.
  const filename = Module._findPath(request, paths, isMain, false);
  if (filename) return filename;

  // Throw an error if nothing was found
  const message = `Cannot find module '${request}'`;
  const err = new Error(message);
  err.code = 'MODULE_NOT_FOUND';
  throw err;
};

Module._resolveFilename relies on two methods: Module._resolveLookupPaths, which determines all possible lookup paths, and Module._findPath, which searches those paths for one that actually exists.
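
The public require.resolve function exposes the same resolution step without loading anything, so the two outcomes described above (a name returned unchanged, or a MODULE_NOT_FOUND error) can be observed directly. The missing request below is a made-up name chosen so that nothing matches it:

```javascript
// Built-in modules resolve to the request itself, with or without node:.
const plain = require.resolve('fs');
const prefixed = require.resolve('node:fs');
console.log(plain, prefixed);

// A request that cannot be located fails with code MODULE_NOT_FOUND.
let notFoundCode;
try {
  require.resolve('./no-such-module-for-this-demo');
} catch (err) {
  notFoundCode = err.code;
}
console.log(notFoundCode);
```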

Module._resolveLookupPaths

Module._resolveLookupPaths returns the entries of parent.paths followed by those of modulePaths.

Module._resolveLookupPaths = function(request, parent) {
  let paths = modulePaths;
  if (parent?.paths?.length) {
    paths = ArrayPrototypeConcat(parent.paths, paths);
  }
  debug('looking for %j in %j', request, paths);
  return paths.length > 0 ? paths : null;
};
// paths has the value:
[
  '/User/user/code/dev-cli/node_modules',
  '/User/user/code/node_modules',
  '/User/user/node_modules',
  '/User/node_modules',
  '/node_modules',
  '/Users/user/.node_modules',
  '/Users/user/.node_libraries',
  '/usr/local/lib/node',
]

modulePaths contains .node_modules and .node_libraries in the home directory, plus lib/node in the Node installation directory. It is defined in the Module._initPaths method:

modulePaths = [
  '/Users/user/.node_modules',
  '/Users/user/.node_libraries',
  '/usr/local/lib/node', // Node installation directory
]

The values in parent.paths are produced by Module._nodeModulePaths, called with the parent module's directory (process.cwd() in this example):

// If process.cwd() is '/User/user/code/dev-cli', parent.paths is:
parent.paths = [
  '/User/user/code/dev-cli/node_modules',
  '/User/user/code/node_modules',
  '/User/user/node_modules',
  '/User/node_modules',
  '/node_modules',
]
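
The list above can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: nodeModulePathsSketch is a hypothetical name, it handles POSIX-style paths only, and the real Module._nodeModulePaths also covers Windows paths.

```javascript
const path = require('path');

// Hypothetical re-implementation for illustration; POSIX-style paths only.
function nodeModulePathsSketch(from) {
  const paths = [];
  let dir = path.posix.resolve(from);
  while (true) {
    // A directory that is itself named node_modules gets no nested entry.
    if (path.posix.basename(dir) !== 'node_modules') {
      paths.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return paths;
}

const sketchPaths = nodeModulePathsSketch('/User/user/code/dev-cli');
console.log(sketchPaths);
```

Because the walk starts at the deepest directory, a dependency installed close to the requiring file takes precedence over one installed further up the tree.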

Module._findPath

Module._findPath tries to find the path that actually exists among all the possible paths.

Module._findPath = function(request, paths, isMain) {
  // If the request is an absolute path, paths is set to [''] so that only the request itself is checked
  const absoluteRequest = path.isAbsolute(request);
  if (absoluteRequest) {
    paths = [''];
  } else if (!paths || paths.length === 0) {
    return false;
  }

  // If the result is already in the path cache, return it directly
  const cacheKey = request + '\x00' + ArrayPrototypeJoin(paths, '\x00');
  const entry = Module._pathCache[cacheKey];
  if (entry)
    return entry;

  // Check whether the request ends with '/'; if not, check whether it ends with a '.' or '..' segment
  let trailingSlash = request.length > 0 &&
    StringPrototypeCharCodeAt(request, request.length - 1) ===
    CHAR_FORWARD_SLASH;
  if (!trailingSlash) {
    trailingSlash = RegExpPrototypeTest(trailingSlashRegex, request);
  }

  let exts;
  // Iterate over paths
  for (let i = 0; i < paths.length; i++) {
    // Don't search further if path doesn't exist
    const curPath = paths[i];
    if (curPath && stat(curPath) < 1) continue;

    const basePath = path.resolve(curPath, request);
    let filename;

    const rc = stat(basePath);
    if (!trailingSlash) {
      if (rc === 0) {  // File.
        // The file exists as requested
        filename = toRealPath(basePath);
      }

      // Otherwise, append each extension and check whether that file exists
      if (!filename) {
        // Try it with each of the extensions
        if (exts === undefined)
          exts = ObjectKeys(Module._extensions);
        filename = tryExtensions(basePath, exts, isMain);
      }
    }

    // For a directory, tryPackage checks main in package.json, then falls back to index.(js|json|node)
    if (!filename && rc === 1) {  // Directory.
      // try it with each of the extensions at "index"
      if (exts === undefined)
        exts = ObjectKeys(Module._extensions);
      filename = tryPackage(basePath, exts, isMain, request);
    }

    // Cache the found path and return it
    if (filename) {
      Module._pathCache[cacheKey] = filename;
      return filename;
    }
  }

  // Not found, return false
  return false;
};
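
The effect of the trailingSlash check is easiest to see when a file and a directory share a base name. The sketch below builds such a fixture in a temporary directory (the name thing is arbitrary) and resolves it with the public require.resolve:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway fixture: a file thing.js next to a directory thing/ with index.js.
// realpathSync is applied because resolved filenames are real paths.
const base = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'find-path-')));
fs.mkdirSync(path.join(base, 'thing'));
fs.writeFileSync(path.join(base, 'thing.js'), 'module.exports = "file";');
fs.writeFileSync(path.join(base, 'thing', 'index.js'), 'module.exports = "directory";');

// No trailing slash: the extensions are tried first, so thing.js wins.
const asFile = require.resolve(path.join(base, 'thing'));
// Trailing slash: the file attempts are skipped and thing/index.js is found.
const asDirectory = require.resolve(path.join(base, 'thing') + '/');

console.log(path.relative(base, asFile));
console.log(path.relative(base, asDirectory));
```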

4. module.load

Module.prototype.load = function(filename) {
  this.filename = filename;
  this.paths = Module._nodeModulePaths(path.dirname(filename));
  // Find the longest extension registered in Module._extensions; if none matches, fall back to .js
  const extension = findLongestRegisteredExtension(filename);
  // Pick the loader registered for that extension
  Module._extensions[extension](this, filename);
  this.loaded = true;
}
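
The body of findLongestRegisteredExtension is not shown in this article. The following is a sketch of the behavior the comment describes, as I understand it: longestRegisteredExtension is a hypothetical stand-in, and its registered parameter plays the role of the keys of Module._extensions.

```javascript
const path = require('path');

// Hypothetical stand-in for the internal helper, for illustration only.
function longestRegisteredExtension(filename, registered = ['.js', '.json', '.node']) {
  const name = path.basename(filename);
  let start = 0;
  let index;
  // Visit the dots from left to right, so longer candidates are tested first.
  while ((index = name.indexOf('.', start)) !== -1) {
    start = index + 1;
    if (index === 0) continue; // a leading dot marks a dotfile, not an extension
    const candidate = name.slice(index);
    if (registered.includes(candidate)) return candidate;
  }
  return '.js'; // default when nothing registered matches
}

console.log(longestRegisteredExtension('/app/data.json'));
console.log(longestRegisteredExtension('/app/util.test.js'));
console.log(longestRegisteredExtension('/app/notes.txt'));
```

For util.test.js the candidate .test.js is tried first and rejected, then .js matches; notes.txt matches nothing and falls back to .js, which is why files with unknown extensions are treated as JavaScript.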

The loaders for the three extensions are as follows:

// Native extension for .js
Module._extensions['.js'] = function(module, filename) {
  // If already analyzed the source, then it will be cached.
  const cached = cjsParseCache.get(module);
  let content;
  if (cached?.source) {
    content = cached.source;
    cached.source = undefined;
  } else {
    content = fs.readFileSync(filename, 'utf8');
  }
  module._compile(content, filename);
};
// Native extension for .json
Module._extensions['.json'] = function(module, filename) {
  const content = fs.readFileSync(filename, 'utf8');
  try {
    module.exports = JSONParse(stripBOM(content));
  } catch (err) {
    err.message = filename + ':' + err.message;
    throw err;
  }
};
// Native extension for .node
Module._extensions['.node'] = function(module, filename) {
  // Be aware this doesn't use `content`
  return process.dlopen(module, path.toNamespacedPath(filename));
};

JSON files are simple to load: the file contents are converted to an object with JSON.parse and assigned to module.exports.
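
Both branches of the .json loader can be exercised from user code. The sketch below writes two throwaway files (the names good.json and bad.json are arbitrary) to a temporary directory; the first starts with a byte order mark, the second is deliberately malformed:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-json-'));

// A leading byte order mark is stripped before the text is parsed.
const good = path.join(dir, 'good.json');
fs.writeFileSync(good, '\uFEFF{"name":"demo","port":8080}');
const config = require(good);
console.log(config.name, config.port);

// Parse errors are rethrown with the file name prepended to the message.
const bad = path.join(dir, 'bad.json');
fs.writeFileSync(bad, '{ not valid json');
let message = '';
try {
  require(bad);
} catch (err) {
  message = err.message;
}
console.log(message.includes('bad.json'));
```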

.node files are loaded through process.dlopen, which we will not delve into here.

Let us look closely at how .js files are loaded. According to the code, the file content is read with fs.readFileSync and then handed to module._compile for compilation.

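Module.prototype._compile = function(content, filename) {
  // Wrap the source in the module function and compile it
  const compiledWrapper = wrapSafe(filename, content, this);
  const dirname = path.dirname(filename);
  // The require function handed to the module is bound to this Module instance
  const require = makeRequireFunction(this);
  let result;
  const exports = this.exports;
  const thisValue = exports;
  const module = this;
  // Call the wrapper with the five injected parameters
  result = ReflectApply(compiledWrapper, thisValue,
                        [exports, require, module, filename, dirname]);
  return result;
};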

Module._compile wraps the contents of the file in a function:

(function (exports, require, module, __filename, __dirname) {
  // js module contents
});
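
The wrap-then-call idea can be imitated in user code. This is a sketch under stated assumptions, not what Node does internally: compileSketch is a hypothetical name, it compiles the wrapper with vm.runInThisContext rather than the internal wrapSafe, and it passes a plain object in place of a real Module instance. Module.wrap is the public helper that produces the wrapper text.

```javascript
const vm = require('vm');
const path = require('path');
const Module = require('module');

// Hypothetical, simplified imitation of the compile step.
function compileSketch(source, filename) {
  const fakeModule = { exports: {}, filename };
  // Surround the source with the five-parameter function shown above.
  const wrapped = Module.wrap(source);
  // Evaluating the wrapper text yields the function itself.
  const compiledWrapper = vm.runInThisContext(wrapped, { filename });
  // `this` and `exports` both start out as the same object.
  compiledWrapper.call(fakeModule.exports, fakeModule.exports, require,
                       fakeModule, filename, path.dirname(filename));
  return fakeModule.exports;
}

const source =
  'module.exports = { file: __filename, dir: __dirname, sameThis: this === exports };';
const result = compileSketch(source, '/virtual/demo.js');
console.log(result);
```

The source string never declares __filename or __dirname, yet it can read them, because they arrive as arguments of the wrapper function.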

Conclusion

In general, loading a file goes through three steps:

  • Path analysis
  • File location
  • Compilation and execution

Native modules and built-in modules are compiled into the binary executable when the Node source code is built. When Node starts, some core modules are loaded directly into memory, so requiring them skips file location and compilation, and they take precedence during path analysis.

Non-native modules must go through the whole process: path analysis, file location, and compilation and execution.

exports, require, module, __filename and __dirname are not global variables. They are parameters injected when the module is loaded.

While reading the source code, I gradually found that features I had assumed were very clever are actually simple underneath. For example, Module._nodeModulePaths(process.cwd()) determines the candidate paths by walking up the current path one / separator at a time and appending node_modules at each level. Of course, the fact that someone else's code is simple does not mean we could have written it so simply ourselves. Studying how others implement things is a way to improve our own coding, and that is one of the big rewards of reading source code.

While reading the source code, you may run into the same question I did: how can loader.js itself use require from the very start? Answering that means exploring the Node initialization process. Stay tuned for my next article.