This series of articles consists of a translation of, and reading notes on, Node.js Design Patterns, Second Edition; the translation is kept up to date on GitHub, where a link is available.

Please follow my columns; upcoming posts will also be published there:

  • The nuggets column of Encounter
  • Programming Thoughts on Zhihu’s Encounter
  • Segmentfault front end station

Node.js Essential Patterns

Asynchrony is the most prominent feature of Node.js, whereas in many other languages, such as PHP, asynchronous code is rarely dealt with.

In synchronous programming, we are used to thinking of code execution as sequential top-down steps of computation. Each operation blocks, meaning that the next operation cannot be executed until one has completed, which is easy to understand and debug.

In asynchronous programming, however, operations such as reading a file or performing a network request run in the background. When we invoke an asynchronous method, the statements that follow continue to execute even though the operation has not yet completed; the background operation finishes at some later point, and the application reacts appropriately when the asynchronous call completes.
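To make the contrast concrete, here is a minimal sketch of my own (data.txt is a hypothetical file) that reads the same file with both styles; the blocking call returns the content directly, while the non-blocking call hands it to a callback at a later time:

const fs = require('fs');

// synchronous (blocking): nothing else runs until the read completes
const syncData = fs.readFileSync('data.txt', 'utf8');
console.log('sync read done, length: ' + syncData.length);

// asynchronous (non-blocking): the read happens in the background and the
// callback reacts when it completes
fs.readFile('data.txt', 'utf8', (err, asyncData) => {
  if (err) throw err;
  console.log('async read done, length: ' + asyncData.length);
});
console.log('this line runs before the asynchronous read completes');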

While this non-blocking approach performs better than the blocking one, it is genuinely harder for programmers to reason about, and asynchronous sequences can become difficult to manage in advanced applications with complex control flows.

Node.js provides a set of tools and design patterns for dealing with asynchronous code as effectively as possible. It is important to learn how to use them to write applications that perform well and are easy to understand and debug.

In this chapter, we’ll look at two of the most important asynchronous patterns: callbacks and event emitters.

The callback pattern

As discussed in the previous chapter, callbacks are the concrete incarnation of the handlers of the reactor pattern, and they are one of the distinctive marks of the Node.js programming style. A callback is a function that is invoked to propagate the result of an asynchronous operation once it completes, and it takes the place of the return instruction used by synchronous operations. JavaScript happens to be a great language for callbacks: functions are first-class citizens, so we can pass a function as an argument, invoke it inside another function, and store the result of the call in a data structure. The other ideal construct for implementing callbacks is the closure. With closures we can preserve the context in which a function was created, so whenever the callback is invoked it still has access to the context in which the asynchronous operation was requested.
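As a small illustration of the closure point (a sketch of my own, with a hypothetical data.txt file), the callback below still "remembers" the filename variable of the invocation that created it, even though it runs long after readLater() has returned:

const fs = require('fs');

function readLater(filename) {
  // the arrow function below is a closure: it captures `filename`, so that
  // context is still available when the callback finally runs
  fs.readFile(filename, 'utf8', (err, data) => {
    if (err) {
      return console.error('Failed to read ' + filename);
    }
    console.log(filename + ' contains ' + data.length + ' characters');
  });
}

readLater('data.txt'); // returns immediately; the closure fires later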

In this section, we examine programming ideas and patterns based on callbacks, rather than return instruction patterns for synchronous operations.

Continuation-passing style (CPS)

In JavaScript, a callback is a function passed as an argument to another function and invoked with the result when the operation completes. In functional programming, this way of propagating a result is called continuation-passing style (CPS). It is a general concept, not necessarily tied to asynchronous operations: it simply means that a result is propagated by passing it to another function (the callback), which is then invoked inside the main body of the logic, instead of being returned directly to the caller.

Synchronous CPS

To understand CPS more clearly, let’s look at this simple synchronization function:

function add(a, b) {
  return a + b;
}

There is nothing special about the example above: the result is passed back to the caller using a return statement, in what is called the direct style. It represents the most common way of returning a result in synchronous programming. The CPS equivalent of the function above looks like this:

function add(a, b, callback) {
  callback(a + b);
}

The add() function above is a synchronous CPS function: instead of returning the result to the caller with a return statement, it passes the result to the callback, and the caller obtains the result only when the callback is invoked. This is how it is used:

console.log('before');
add(1, 2, result => console.log('Result: ' + result));
console.log('after');

Since add() is synchronous, the code above prints the following:

before
Result: 3
after

Asynchronous CPS

Now consider the following example, where the addition is asynchronous:

function additionAsync(a, b, callback) {
  setTimeout(() => callback(a + b), 100);
}

In the code above, setTimeout() simulates an asynchronous invocation of the callback. Now let's call additionAsync() and look at the output.

console.log('before');
additionAsync(1, 2, result => console.log('Result: ' + result));
console.log('after');

The above code should have the following output:

before
after
Result: 3

Because setTimeout() triggers an asynchronous operation, it does not wait for the callback to execute; it returns immediately, giving control back to additionAsync() and then to its caller. This property is crucial in Node.js: whenever an asynchronous request is made, control is handed back to the event loop, allowing new events from the queue to be processed.

(Figure: the event loop in Node.js.)

When the asynchronous operation completes, execution is given to the callback of the asynchronous function that started it. Execution starts from the event loop, so it has a fresh stack. Thanks to closures, this is not a problem in JavaScript: the callback can be invoked at a different point in time and from a different location, and it still works correctly because it preserves the context in which it was created.

A synchronous function blocks until it completes its operation. An asynchronous function returns immediately, and its result is passed to a handler (in our case, a callback) at a later cycle of the event loop.

Non-CPS callbacks

In some cases, the presence of a callback argument might make us assume a function is asynchronous or written in CPS, but that is not always true. Take, for example, the map() method of an Array object:

const result = [1, 5, 7].map(element => element - 1);
console.log(result); // [0, 4, 6]

In the example above, the callback is used only to iterate over the elements of the array, not to pass the result of an asynchronous operation; in fact, the result is returned synchronously using the direct style. Whether a callback is used to pass the result of an operation is usually made clear in the API documentation.

Synchronous or asynchronous?

We have seen that the order in which code is executed changes radically depending on whether a function is synchronous or asynchronous, with significant consequences for the flow, correctness, and efficiency of the entire application. The following is an analysis of both models and their pitfalls. In general, what must be avoided is an API whose nature is inconsistent and therefore hard to detect and reason about. Here is an example of such a trap:

An unpredictable function

One of the most dangerous situations is an API that behaves asynchronously in general, but synchronously under certain conditions. Take the following code as an example:

const fs = require('fs');
const cache = {};

function inconsistentRead(filename, callback) {
  if (cache[filename]) {
    // if the cache hits, the callback is invoked synchronously
    callback(cache[filename]);
  } else {
    // on a cache miss, perform an asynchronous, non-blocking I/O operation
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data);
    });
  }
}

The function above uses a cache to store the results of different file read operations. Remember, this is just an example: it lacks error handling, and the caching logic itself is suboptimal (there is no eviction policy, for instance). More importantly, the function is dangerous because it behaves asynchronously when there is no cached value (the callback is deferred until fs.readFile() returns its result), but synchronously for every subsequent call that hits the cache, where the callback is invoked immediately.

Unleashing Zalgo

Zalgo is the name given to this kind of uncertainty between synchronous and asynchronous behavior, which almost always leads to bugs that are very hard to track down.

Now, let's see how a function whose execution order is unpredictable can easily break an application. Look at the following code:

function createFileReader(filename) {
  const listeners = [];
  inconsistentRead(filename, value => {
    listeners.forEach(listener => listener(value));
  });
  return {
    onDataReady: listener => listeners.push(listener)
  };
}

When the function above is invoked, it creates a new object that acts as a notifier, allowing us to set up multiple listeners for a file read operation. All listeners are invoked at once when the read operation completes and the data is available. The function uses the previously defined inconsistentRead() to implement this. Let's now try calling createFileReader():

const reader1 = createFileReader('data.txt');
reader1.onDataReady(data => {
  console.log('First call data: ' + data);
  // read the same file again
  const reader2 = createFileReader('data.txt');
  reader2.onDataReady(data => {
    console.log('Second call data: ' + data);
  });
});

The following output looks like this:

First call data: some data

Here’s why the second callback was not called:

When reader1 is created, inconsistentRead() behaves asynchronously, because no cached result is available yet. This means we have all the time we need to register the listener, which will be invoked in a later cycle of the event loop, once the read operation completes.

reader2 is then created in a cycle of the event loop in which the cache for the requested file already exists. In this case, the inner call to inconsistentRead() is synchronous, so its callback is invoked immediately, which means that all the listeners of reader2 are also invoked synchronously. However, we register the listener only after reader2 is created, so it will never be called.

The behavior of the inconsistentRead() callback function is unpredictable, as it depends on many factors such as the frequency of the call, the filename passed as an argument, and the time taken to load the file.

In a real application, an error like the one we just saw can be extremely complex to identify and reproduce. Imagine a web server where similar functionality is used by multiple concurrent requests; imagine those requests hanging for no apparent reason, without anything appearing in the logs. This is definitely one of those annoying bugs.

Isaac Z. Schlueter, creator of npm and former Node.js project lead, compared using this kind of unpredictable function to unleashing Zalgo in one of his blog posts. If you are not familiar with Zalgo, check out Isaac Z. Schlueter's original post.

Using synchronous APIs

From the zalgo example above, we know that the API must clearly define its nature: is it synchronous or asynchronous?

The appropriate way to fix the bug in the inconsistentRead() function is to make it completely synchronous. This is possible because Node.js provides a set of synchronous APIs for most basic I/O operations. For example, we can use fs.readFileSync() in place of its asynchronous counterpart. The code now looks like this:

const fs = require('fs');
const cache = {};

function consistentReadSync(filename) {
  if (cache[filename]) {
    return cache[filename];
  } else {
    cache[filename] = fs.readFileSync(filename, 'utf8');
    return cache[filename];
  }
}

We can see that the entire function has been converted to a synchronous, blocking style. Note that there is no reason for a purely synchronous function to use CPS; in fact, it is always best practice to implement a purely synchronous API in the direct style, which removes any confusion about its nature and is also more efficient from a performance standpoint.

Keep in mind that changing an API from CPS to a direct style, or from asynchronous to synchronous, also requires changing the style of all the code that uses it. In our example, we would have to completely modify createFileReader() and adapt it to work synchronously at all times.

Also, when using a synchronous API instead of an asynchronous one, pay special attention to the following caveats:

  • A synchronous API is not applicable to every scenario.
  • A synchronous API blocks the event loop and puts concurrent requests on hold. It breaks the JavaScript concurrency model and can degrade the performance of the entire application. We will see later in the book exactly how this affects our applications.

In our inconsistentRead() function, the risk introduced by blocking synchronously is small, because the synchronous read happens only once per filename, while the cached value is used for all subsequent calls. If we have a limited number of static files, using consistentReadSync() will not greatly affect the event loop. Things change quickly if we have to read many files once each, or if high throughput matters: using synchronous I/O in Node.js is strongly discouraged in such cases. Still, in some situations synchronous I/O may be the simplest and most effective solution, so always evaluate the specific use case in order to choose the most appropriate alternative. The example above shows that it makes perfect sense to use a synchronous blocking API to load a configuration file while an application bootstraps.

Therefore, remember to consider using synchronous blocking I/O only if it does not affect your application’s concurrency.
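For example, here is a minimal sketch of that bootstrap scenario (config.json and its appName field are hypothetical, used only for illustration):

const fs = require('fs');

// the configuration is read exactly once, before the application starts
// serving concurrent requests, so blocking here is harmless
const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));

console.log('Loaded configuration for: ' + config.appName);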

Deferred execution

Another way to fix our inconsistentRead() function is to make it purely asynchronous. The trick here is to schedule the synchronous callback invocation so that it runs in the next cycle of the event loop instead of immediately within the same cycle, making it effectively asynchronous. In Node.js, this can be done with process.nextTick(), which defers the execution of a function until the next cycle of the event loop. Its operation is very simple: it takes a callback as an argument and pushes it to the top of the event queue, ahead of any pending I/O event, then returns immediately. The callback is invoked as soon as control is handed back to the event loop.

Here is inconsistentRead() made fully asynchronous with this technique:

const fs = require('fs');
const cache = {};

function consistentReadAsync(filename, callback) {
  if (cache[filename]) {
    // defer the callback invocation to the next cycle of the event loop
    process.nextTick(() => callback(cache[filename]));
  } else {
    // asynchronous I/O operation
    fs.readFile(filename, 'utf8', (err, data) => {
      cache[filename] = data;
      callback(data);
    });
  }
}

The above function now fixes the bug by ensuring that its callback is called asynchronously in any case.

Another API for deferring the execution of code is setImmediate(). While the two look very similar, their semantics are quite different. Callbacks scheduled with process.nextTick() run before any other I/O event is processed, while with setImmediate() they run after the I/O events already in the queue. Because process.nextTick() callbacks run before any I/O, in some circumstances, such as a recursive call to process.nextTick(), they can starve I/O and put it into an indefinite wait; this cannot happen with setImmediate(). We will study the difference between the two APIs in depth later in the book, when we analyze running synchronous CPU-bound tasks with deferred invocations.

With process.nextTick(), we guarantee that the callback of consistentReadAsync() is always invoked asynchronously.
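The difference in scheduling can be observed with a tiny experiment of my own; the process.nextTick() callback runs as soon as the currently executing code completes, before the setImmediate() callback:

setImmediate(() => console.log('setImmediate callback'));
process.nextTick(() => console.log('nextTick callback'));
console.log('synchronous code');

// prints:
// synchronous code
// nextTick callback
// setImmediate callback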

Node.js callback conventions

In Node.js, CPS APIs and callbacks follow a set of specific conventions. These conventions apply to the core Node.js API, but they are also followed by the vast majority of userland modules and applications. It is therefore important to understand them and to comply with them whenever we design an asynchronous API.

The callback is always the last argument

In all core Node.js functions, the standard convention is that when a function accepts a callback as input, the callback must be passed as the last argument. Take the following core Node.js API as an example:

fs.readFile(filename, [options], callback);

As you can see from the example above, the callback is always placed last, even in the presence of optional arguments. The reason for this convention is that the function call reads better when the callback is defined in place.
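To show how a user-defined API can honor the same convention, here is a hedged sketch (readJSONFile is a hypothetical function, not a real Node.js API) that keeps an optional options argument in the middle and the callback last:

const fs = require('fs');

function readJSONFile(filename, options, callback) {
  if (typeof options === 'function') {
    // options was omitted: shift the arguments
    callback = options;
    options = {};
  }
  fs.readFile(filename, options.encoding || 'utf8', (err, data) => {
    if (err) {
      return callback(err);
    }
    callback(null, JSON.parse(data));
  });
}

readJSONFile('config.json', (err, data) => console.log(err || data));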

Error handling always comes first

In CPS, errors are propagated like any other result, that is, through the callback. In Node.js, any error produced by a CPS-style function is always passed as the first argument of the callback, and any actual result is passed from the second argument onwards. If the operation succeeds without errors, the first argument will be null or undefined. Look at the following code:

fs.readFile('foo.txt'.'utf8', (err, data) => {
  if (err)
    handleError(err);
  else
    processData(data);
});

As the example shows, it is best practice to always check for the presence of an error; neglecting to do so makes bugs harder to find and debug. Another convention to keep in mind is that the error must always be of type Error, which means that simple strings or numbers should never be passed as error objects.

Error propagation

In synchronous, direct-style functions, errors are propagated with the throw statement, which makes the error jump up the call stack until it is caught.

In asynchronous CPS, however, proper error propagation is done by passing the error to the next callback in the chain. A typical pattern looks like this:

const fs = require('fs');

function readJSON(filename, callback) {
  fs.readFile(filename, 'utf8', (err, data) => {
    let parsed;
    if (err)
      // propagate the error and exit the current function
      return callback(err);
    try {
      // parse the file contents
      parsed = JSON.parse(data);
    } catch (err) {
      // catch parsing errors, and propagate them if any occur
      return callback(err);
    }
    // no errors, propagate just the data
    callback(null, parsed);
  });
}

The detail to notice in the example above is how the error is passed to the callback, and how we use a return statement to exit the current function immediately when an error occurs, so that the lines that follow are not executed.

Uncaught exceptions

In the readJSON() function above, we wrapped JSON.parse() in a try...catch block precisely to avoid throwing any exception inside the fs.readFile() callback. If an exception is thrown inside an asynchronous callback, it jumps up to the event loop and is never propagated to the next callback.

In Node.js, this is an unrecoverable state, and the application simply shuts down, printing the error to stderr. To demonstrate it, let's remove the try...catch block from the readJSON() function defined earlier:

const fs = require('fs');

function readJSONThrows(filename, callback) {
  fs.readFile(filename, 'utf8', (err, data) => {
    if (err) {
      return callback(err);
    }
    // assuming JSON.parse executes without error
    callback(null, JSON.parse(data));
  });
}

In the code above, there is no way to catch an exception coming from JSON.parse(). If we try, for example, to parse a file containing invalid JSON, an error such as the following is thrown:

SyntaxError: Unexpected token d
    at Object.parse (native)
    at [...]
    at fs.js:266:14
    at Object.oncomplete (fs.js:107:15)

Now, if we look at the stack trace above, we see that it starts somewhere inside the fs module, exactly at the point where the native API finishes reading the file and returns its result to fs.readFile() via the event loop. This clearly shows that the exception traveled from our callback up the call stack and then straight into the event loop, where it was finally caught and thrown to the console. It also means that wrapping the invocation of readJSONThrows() in a try...catch block will not work, because that block operates on a different call stack from the one in which the callback is invoked. The following code shows this anti-pattern:

try {
  readJSONThrows('nonJSON.txt', function(err, result) {
    // ...
  });
} catch (err) {
  console.log('This will not catch the JSON parsing exception');
}

The preceding catch statement will never receive the JSON parsing exception, because the exception travels back up a call stack that ends in the event loop, not in the code that triggered the asynchronous operation. As mentioned earlier, the application aborts the moment an exception reaches the event loop; however, we still have a chance to perform some cleanup or logging before the process terminates. In fact, when this happens, Node.js emits a special event called uncaughtException just before exiting the process. The following code shows a sample use case:


process.on('uncaughtException', (err) => {
  console.error('This will catch at last the ' +
    'JSON parsing exception: ' + err.message);
  // Terminates the application with 1 (error) as exit code:
  // without the following line, the application would continue
  process.exit(1);
});

It is important to understand that an uncaught exception leaves the application in a state that is not guaranteed to be consistent, which can lead to unforeseeable problems. For example, incomplete I/O requests might still be running, or closures might have become inconsistent. That is why it is always advised, especially in production, to log the error as shown above and then exit the application anyway after an uncaught exception is received.

The module system and its patterns

Modules are not only the bricks for building large applications; they are also the main mechanism for encapsulating internal implementations, methods, and variables behind an interface. In this section, we'll look at the Node.js module system and its most common usage patterns.

About the module

One of the major problems of JavaScript is the absence of namespacing: code running in the global scope pollutes the global namespace, causing name collisions among variables, data, and methods. A popular technique for solving this problem is called the module pattern. Look at the following code:

const module = (() => {
  const privateFoo = () => {
    // ...
  };
  const privateBar = [];
  const exported = {
    publicFoo: () => {
      // ...
    },
    publicBar: () => {
      // ...
    }
  };
  return exported;
})();
console.log(module);

This pattern implements modules using self-executing anonymous functions, exporting only the parts that you want to be invoked publicly. In the above code, the module variable only contains the exported API, and the rest of the module content is not actually accessible externally. As we’ll see later, the idea behind this pattern is used as the basis for the Node.js module system.

Node.js modules explained

CommonJS is a group that aims to standardize the JavaScript ecosystem, and one of its proposals is the CommonJS module specification. Node.js builds its module system on top of this specification, adding some custom extensions. Conceptually, it works like the module pattern shown above: each module runs in a private scope, so every variable defined inside a module does not pollute the global namespace.

Custom module system

To explain how the module system works under the hood, let's build a similar system from scratch. The following code creates a function that mimics the behavior of the original Node.js require() function.

We’ll create a function that loads the module’s contents and wrap it in a private namespace:

function loadModule(filename, module, require) {
  const wrappedSrc = `(function(module, exports, require) {
    ${fs.readFileSync(filename, 'utf8')}
  })(module, module.exports, require);`;
  eval(wrappedSrc);
}

The source code of a module is essentially wrapped in a function, as in the self-executing anonymous function of the module pattern. The difference here is that we pass a set of variables to the module, specifically module, exports, and require. Note how the exports argument of the wrapping function is initialized with module.exports; we will talk about the relationship between the two later.

Keep in mind that this is only an example, and you should not do this in a real project. Features such as eval() or the vm module can easily be misused or exploited for code injection attacks; eval should be used with great care or avoided altogether.

Let’s now look at how module interfaces, variables, and so on are introduced by the require() function:

const require = (moduleName) => {
  console.log(`Require invoked for module: ${moduleName}`);
  const id = require.resolve(moduleName);
  // check whether the module is already cached
  if (require.cache[id]) {
    return require.cache[id].exports;
  }
  // define the module metadata
  const module = {
    exports: {},
    id: id
  };
  // register the new module in the cache
  require.cache[id] = module;
  // load the module
  loadModule(id, module, require);
  // return the exported variables
  return module.exports;
};
require.cache = {};
require.resolve = (moduleName) => {
  /* resolve a full module id from the moduleName */
};

The function above emulates the behavior of the native Node.js require() function, which is used to load modules. Of course, this is just a demonstration and does not accurately or completely reflect the real require() function, but it helps us understand how the Node.js module system internally defines and loads modules. What our homemade module system does is the following:

  • The module name is accepted as input, and the very first thing we do is resolve the full path of the module, which we call id. require.resolve() is in charge of this task, implementing a specific resolving algorithm (discussed later).
  • If the module has already been loaded, it is available in the cache; in this case, we return it immediately.
  • If the module has never been loaded before, we set up the environment for the first load. In particular, we create a module object containing an exports property initialized with an empty object literal. This property will be used by the module's code to export its public API.
  • The newly loaded module object is cached.
  • The module source code is read from its file and the code is evaluated, as described earlier. We provide the module with the module object we just created and with a reference to the require() function. The module exports its public API by manipulating or replacing the module.exports object.
  • Finally, the content of module.exports, which represents the public API of the module, is returned to the caller.

As we can see, the principle behind the Node.js module system is not as sophisticated as one might imagine; it simply comes down to the sequence of operations used to wrap, evaluate, and cache a module's source code and return its exports.

Defining a module

We now know how to define a module by looking at how our custom require() function works. Consider the following example:

// load another module
const dependency = require('./anotherModule');
// a private function of the module
function log() {
  console.log(`Well done ${dependency.username}`);
}
// the exported API
module.exports.run = () => {
  log();
};

Note that everything inside a module is private, unless it is assigned to the module.exports variable. The contents of this variable are then cached and returned when the module is loaded using require().

Defining global variables

Even though all the variables and functions declared in a module are defined in its local scope, it is still possible to define global variables. In fact, the module system exposes a special variable called global, and everything assigned to it ends up in the global scope.

Note: Polluting the global namespace is bad and does not take full advantage of the modular system. Therefore, use global variables only when you really need them.
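A minimal sketch of what this looks like (appVersion is a made-up name, used here only for illustration):

// file globals.js
global.appVersion = '1.0.0';

// file main.js
require('./globals');
console.log(global.appVersion); // '1.0.0', visible from any module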

module.exports versus exports

For many developers who are new to Node.js, the difference between exports and module.exports is a great source of confusion. The exports variable is just a reference to the initial value of module.exports, which, as we have seen, is essentially a simple object literal created before the module is loaded.

This means that we can only attach new properties to the object referenced by the exports variable, as shown in the following code:

exports.hello = () => {
  console.log('Hello');
};

Reassigning the exports variable has no effect, because it does not change the content of module.exports; it only reassigns the local variable itself. The following code is therefore wrong:

exports = () => {
  console.log('Hello');
};

If we wanted to export something other than an object, such as a function, we could reassign module.exports:

module.exports = () => {
  console.log('Hello');
};

The require function is synchronous

Another important detail is that our require() function above is synchronous: it returns the module contents directly, without needing a callback. The same is true of the original Node.js require(). As a consequence, any assignment to module.exports must be synchronous as well. The following code, for example, is incorrect:

setTimeout(() => {
  module.exports = function() {
    // ...
  };
}, 100);

This property has important repercussions on the way we define modules, because it limits us to mostly synchronous code while defining a module. It is actually one of the most important reasons why the core Node.js libraries offer synchronous APIs as alternatives to many of the asynchronous ones.

If we need a module that requires asynchronous steps to initialize, we can always define and export a module that is initialized asynchronously at a later time; the problem is that such a module is not guaranteed to be ready for use immediately after require(). In Chapter 9, we will analyze this problem in detail and present some patterns to solve it elegantly.
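To make the problem tangible, here is a hedged sketch of a module whose state only becomes ready some time after require() returns (asyncModule.js and its isReady() helper are invented for illustration):

// file asyncModule.js
let initialized = false;
// simulate an asynchronous initialization step (for example, opening a DB connection)
setTimeout(() => {
  initialized = true;
}, 100);
module.exports.isReady = () => initialized;

// file main.js
const asyncModule = require('./asyncModule');
console.log(asyncModule.isReady());                        // false: not ready yet
setTimeout(() => console.log(asyncModule.isReady()), 200); // true, eventually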

In fact, early versions of Node.js had an asynchronous variant of require(), but it was soon removed because it overcomplicated a feature that is meant to be used mostly at initialization time, where asynchronous I/O brings more complexity than benefit.

Resolve algorithm

Dependency hell describes the situation in which packages depend on different versions of other, shared packages. Node.js solves this problem elegantly by loading a different version of a module depending on where the module is loaded from. All the merits of this feature go to npm and to the resolving algorithm used by the require() function.

Let's now give a quick overview of the algorithm. As described below, the resolve() function takes a module name (moduleName) as input and returns the full path of the module. This path is then used to load the module's code and also to uniquely identify the module. The resolving algorithm can be divided into the following three branches:

  • File modules: if moduleName starts with /, it is considered an absolute path to the module. If it starts with ./, moduleName is considered a relative path, calculated from the directory of the module issuing the require.
  • Core modules: if moduleName is not prefixed with / or ./, the algorithm first tries to search within the core Node.js modules.
  • Package modules: if no core module matches moduleName, the search continues in the node_modules directory of the requiring module; if nothing is found there, the search moves up to the node_modules of each parent directory, until the root of the filesystem is reached.

For file and package modules, individual files and directories can also match moduleName. In particular, the algorithm will attempt to match the following:

  • <moduleName>.js
  • <moduleName>/index.js
  • the file or directory specified in the main property of <moduleName>/package.json

See the official documentation for the full details of the resolving algorithm.

The node_modules directory is where npm installs each package along with the package's own dependencies. This means that, following the algorithm just described, each package can have its own private dependencies. Consider, for example, the following directory structure:

myApp
├── foo.js
└── node_modules
    ├── depA
    │   └── index.js
    ├── depB
    │   ├── bar.js
    │   └── node_modules
    │       └── depA
    │           └── index.js
    └── depC
        ├── foobar.js
        └── node_modules
            └── depA
                └── index.js

In the example above, myApp, depB, and depC all depend on depA; however, each of them has its own private version of the dependency! Following the rules of the resolving algorithm, require('depA') loads a different file depending on the module requiring it:

  • Calling require('depA') from /myApp/foo.js loads /myApp/node_modules/depA/index.js
  • Calling require('depA') from /myApp/node_modules/depB/bar.js loads /myApp/node_modules/depB/node_modules/depA/index.js
  • Calling require('depA') from /myApp/node_modules/depC/foobar.js loads /myApp/node_modules/depC/node_modules/depA/index.js

The resolving algorithm is a core part of the robustness of Node.js dependency management: it prevents conflicts and version incompatibilities even when an application has hundreds or thousands of packages.

When we call require(), the parsing algorithm is transparent to us. However, it can still be used directly by any module by calling require.resolve().
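For example, a quick sketch of using it directly (assuming a local ./logger module exists in the current project):

// prints the full path that the algorithm resolved for a local module
console.log(require.resolve('./logger'));
// core modules resolve to their own name
console.log(require.resolve('http')); // 'http'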

Module cache

Each module is loaded and evaluated only the first time it is required; any subsequent require() call simply returns the cached version. Looking at our homemade require() function, it is clear that caching is crucial for performance, and it also has some important functional implications:

  • It makes it possible to have cycles within module dependencies
  • It guarantees, to some extent, that the same instance is always returned when the same module is required from within a given package, avoiding conflicts

The module cache is exposed via the require.cache variable, so it can be accessed directly if needed. A practical example is invalidating a cached module by deleting the corresponding key in require.cache, something that is useful in testing but very dangerous under normal circumstances.
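A minimal sketch of that testing trick (assuming the ./logger module from the earlier examples):

const logger1 = require('./logger');
// remove the cached entry: the next require() re-evaluates the module
delete require.cache[require.resolve('./logger')];
const logger2 = require('./logger');
console.log(logger1 === logger2); // false: two distinct module instances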

Circular dependencies

Many consider circular dependencies an intrinsic design issue, but they do happen in real projects, so it is worth knowing at least how they behave in Node.js. Looking again at our homemade require() function, we can immediately get a glimpse of how this works and what to watch out for.

Look at the following two modules:

  • Module a.js:

exports.loaded = false;
const b = require('./b');
module.exports = {
  bWasLoaded: b.loaded,
  loaded: true
};
  • Module b.js:

exports.loaded = false;
const a = require('./a');
module.exports = {
  aWasLoaded: a.loaded,
  loaded: true
};

Then we write the following code in main.js:

const a = require('./a');
const b = require('./b');
console.log(a);
console.log(b);

Executing the code above prints the following result:

{ bWasLoaded: true, loaded: true }
{ aWasLoaded: false, loaded: true }

This result reveals the caveat of circular dependencies. While both modules are completely initialized by the time they are required from the main module, the a.js module is incomplete when it is loaded from b.js; in particular, this state lasts until b.js finishes loading. This behavior should make us verify the order in which the two modules are required in main.js.

The consequence is that b.js receives an incomplete version of a.js. We can now understand that, if we lose track of which module is loaded first, this problem can arise quite easily once the project becomes big enough.

Documentation on circular references

In short, to prevent an infinite loop while loading modules, Node.js caches a module's exports after the first load and returns the cached result for any subsequent load. So in this circular dependency case there is no infinite loop, but, because of the cache, a module may not be exported as expected (see the detailed walkthrough below).

The official documentation illustrates circular dependency with three modules, which is not the simplest case; two modules are enough to express the situation clearly. Much like recursion, solving the simplest case is half of solving a problem of any size (the other half is figuring out how the solution changes as the problem grows).

Since JavaScript executes modules as it loads them, the output above clearly shows the program's trajectory. In this example, a.js requires b.js before it has finished loading; execution enters b.js, which in turn requires a.js.

As mentioned earlier, to avoid an infinite loop Node.js returns the cached entry for a.js at that point, but note that what is cached is only a partially executed copy of a.js. So when b.js requires a.js, all it gets is this unfinished version, which has not yet reached the final module.exports assignment at the end of a.js. What b.js sees is therefore only the early exports object of a.js, in which loaded is still false.

After that, b.js finishes loading, and control returns to the require statement inside a.js, which then completes its own execution.

Module definition pattern

Besides being a mechanism for handling dependencies, the most common use of the module system is to define APIs. When defining an API, the main consideration is the balance between private and public functionality: the goal is to maximize information hiding of the internal implementation while exposing a usable API, and to balance these with extensibility and code reuse.

In this section, we analyze some of the most popular patterns for defining modules in Node.js; each one strikes its own balance between information hiding, extensibility, and code reuse.

Named exports

The most basic way to expose a public API is to use named exports, which consists of assigning all the values we want to make public to properties of the object referenced by exports (or module.exports). In this way, the resulting exported object becomes a container or namespace for a set of related functionality.

Take a look at the following code, which is an implementation of this pattern:

//file logger.js
exports.info = (message) => {
  console.log('info: ' + message);
};
exports.verbose = (message) => {
  console.log('verbose: ' + message);
};

The exported functions are then available as properties of the loaded module, as shown in the following code:

// file main.js
const logger = require('./logger');
logger.info('This is an informational message');
logger.verbose('This is a verbose message');

Most Node.js modules use this definition.

The CommonJS specification only allows the use of the exports variable to expose public members; therefore, the named exports pattern is the only one that is really compatible with the CommonJS specification. The use of module.exports is an extension provided by Node.js to support a broader range of module definition patterns.

Exporting a function

One of the most popular module definition patterns consists of reassigning the whole module.exports variable to a function. Its main strength is that it exposes only a single functionality, which provides a clear entry point for the module, makes it easier to understand and use, and is a good application of the single responsibility principle. This way of defining modules is also known in the community as the substack pattern, as shown in the following example:

// file logger.js
module.exports = (message) => {
  console.log(`info: ${message}`);
};

This pattern can also use the exported function as a namespace for other public APIs. This is a very powerful combination: the module still has the clarity of a single entry point (the main exported function), while additional functionality for secondary or more advanced use cases can be exposed as properties of that function. The following code shows how to extend the module defined above by using the exported function as a namespace:

module.exports.verbose = (message) => {
  console.log(`verbose: ${message}`);
};

This code demonstrates how to call the module we just defined:

// file main.js
const logger = require('./logger');
logger('This is an informational message');
logger.verbose('This is a verbose message');

Even though exporting just one function might seem like a limitation, it is actually a perfect way to put the emphasis on the single functionality that represents the most important feature of the module, while keeping internal, private details hidden and exposing only the properties attached to the exported function itself.

The modularity of Node.js encourages us to adopt the single responsibility principle (SRP) : Each module should be responsible for a single function, and that responsibility should be fully encapsulated by the module to ensure reusability.

Note that the substack pattern here exposes the main functionality of the module by exporting only one function. Use the exported function as a namespace to export other secondary functions.

Exporting a constructor (class)

A module that exports a constructor is a special case of a module that exports a function. The difference is that with this new pattern, we allow users to create new instances using constructors, but we can also extend their prototypes and create new classes (inheritance). Here is an example of this pattern:

// file logger.js
function Logger(name) {
  this.name = name;
}
Logger.prototype.log = function(message) {
  console.log(`[${this.name}] ${message}`);
};
Logger.prototype.info = function(message) {
  this.log(`info: ${message}`);
};
Logger.prototype.verbose = function(message) {
  this.log(`verbose: ${message}`);
};
module.exports = Logger;

We use the above modules in the following way:

// file main.js
const Logger = require('./logger');
const dbLogger = new Logger('DB');
dbLogger.info('This is an informational message');
const accessLogger = new Logger('ACCESS');
accessLogger.verbose('This is a verbose message');

The same pattern can be implemented with ES2015’s class keyword syntax:

class Logger {
  constructor(name) {
    this.name = name;
  }
  log(message) {
    console.log(`[${this.name}] ${message}`);
  }
  info(message) {
    this.log(`info: ${message}`);
  }
  verbose(message) {
    this.log(`verbose: ${message}`);
  }
}
module.exports = Logger;

Since an ES2015 class is just syntactic sugar over prototypes, using this module is exactly the same as using its prototype- and constructor-based version.

Exporting a constructor or a class still provides a single entry point for the module, but compared with the substack pattern, it exposes much more of the module's internals; on the other hand, it gives us much more power when we want to extend the module's functionality.

Variations on this pattern include calls that do not use new. This little trick lets us use our module as a factory. Look at the following code:

function Logger(name) {
  if (!(this instanceof Logger)) {
    return new Logger(name);
  }
  this.name = name;
}

It is pretty simple: we check whether this exists and is an instance of Logger. If either of these conditions is false, it means that Logger() was invoked without new, so we proceed to create a new instance properly and return it to the caller. This technique allows us to use the module as a factory as well:

// file main.js
const Logger = require('./logger');
const dbLogger = Logger('DB');
dbLogger.verbose('This is a verbose message');

Starting with Node.js 6, ES2015's new.target syntax offers a cleaner way to implement this guard. new.target is a meta-property available in all functions, and at runtime it is set (truthy) if the function was invoked with the new keyword. We can rewrite the factory using this syntax:

function Logger(name) {
  if (!new.target) {
    return new Logger(name);
  }
  this.name = name;
}

This code does exactly the same as the previous one, so we can say that ES2015’s new.target syntax sugar makes the code more readable and natural.

Exporting an instance

We can leverage the caching mechanism of require() to easily define stateful instances, that is, objects with state created from a constructor or a factory, which can be shared across different modules. The following code shows an example of this pattern:

//file logger.js
function Logger(name) {
  this.count = 0;
  this.name = name;
}
Logger.prototype.log = function(message) {
  this.count++;
  console.log('[' + this.name + '] ' + message);
};
module.exports = new Logger('DEFAULT');

The newly defined module can be used like this:

// file main.js
const logger = require('./logger');
logger.log('This is an informational message');

Because the module is cached, every module that requires the logger module actually retrieves the same instance of the object, thus sharing its state. This pattern is very much like creating a singleton; however, it does not guarantee the uniqueness of the instance across the entire application, as happens in the traditional singleton pattern. When analyzing the resolving algorithm, we saw that a module can be installed multiple times in an application's dependency tree, which results in multiple instances of the same logical module, all running in the context of the same Node.js application. In Chapter 7, we will analyze the consequences of exporting stateful instances, along with some alternative patterns.

An extension of the pattern we just described consists of exporting not only the instance, but also the constructor used to create it. This allows users to create new instances of the same object, or even to extend it if needed. To do so, we just need to assign a new property to the instance, as shown in the following line of code:

module.exports.Logger = Logger;

We can then use the exported constructor to create additional instances of the class:

const customLogger = new logger.Logger('CUSTOM');
customLogger.log('This is an informational message');

From a usability perspective, this is similar to using an exported function as a namespace: the module exports a default instance of an object, the piece of functionality we need most of the time, while more advanced features, such as creating new instances or extending the object, remain available through less exposed properties.

Modifying other modules or the global scope

A module can even export nothing, which may seem a bit out of place; however, we should not forget that a module can modify the global scope and any object in it, including other modules in the cache. Note that this is generally considered bad practice, but since the pattern can be useful and safe under some circumstances (for example, in testing), it is worth knowing and understanding. We said that a module can modify other modules or objects in the global scope; this usually refers to monkey patching: modifying existing objects at runtime to change or extend their behavior, or to apply temporary fixes.

The following example shows how we can add a new function to another module:

// file patcher.js
// ./logger is another module
require('./logger').customMessage = () => console.log('This is a new functionality');

Write the following code:

// file main.js
require('./patcher');
const logger = require('./logger');
logger.customMessage();

In the code above, patcher must be required before the logger module is used for the first time, so that the patch can be applied.

The technique described here is dangerous to apply. The main concern is that a module modifying the global namespace or other modules is an operation with side effects: it affects the state of entities outside its own scope, which can lead to unpredictable consequences, especially when multiple modules interact with the same entities. Imagine two different modules trying to set the same global variable, or modifying the same property of the same module: the effect is unpredictable (which module wins?) and, most importantly, it ripples across the entire application.

The observer pattern

Another important and fundamental pattern used in Node.js is the observer pattern. Together with the reactor pattern, callbacks, and modules, the observer pattern is one of the pillars of the platform and a prerequisite for using many Node.js core modules and userland modules.

The observer pattern is the ideal solution for modeling the reactive nature of Node.js and a perfect complement to callbacks. We can define it as follows:

The observer pattern defines an object (called the subject) that notifies a set of observers (or listeners) when a change in its state occurs.

The main difference from the callback pattern is that the subject can actually notify multiple observers, while a traditional CPS callback usually propagates its result to only one listener.

EventEmitter class

In traditional object-oriented programming, the observer pattern requires interfaces, concrete classes, and a hierarchy; in Node.js, all of this is much simpler. The observer pattern is already built into the core and is available through the EventEmitter class. The EventEmitter class allows us to register one or more functions as listeners, which are invoked when a particular event type is fired. (Figure: an EventEmitter notifying its listeners.)

EventEmitter is a class (prototype) exported from the events core module. The following code shows how to obtain a reference to it:

const EventEmitter = require('events').EventEmitter;
const eeInstance = new EventEmitter();

The basic methods of EventEmitter are as follows:

  • on(event, listener): this method registers a new listener (a function) for the given event type (a string)
  • once(event, listener): this method registers a new listener that is removed after the event is emitted for the first time
  • emit(event, [arg1], [...]): this method produces a new event and provides additional arguments to be passed to the listeners
  • removeListener(event, listener): this method removes a listener for the specified event type

All of the preceding methods return the EventEmitter instance to allow chaining. A listener has the signature function([arg1], [...]), so it simply accepts the arguments provided at the moment the event is emitted. Inside the listener, this refers to the EventEmitter instance that produced the event. You can already see a big difference between a listener and a traditional Node.js callback: in particular, the first argument is not an error, but whatever data was passed to emit() at the moment of its invocation.
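A small sketch of my own, tying the four methods and the chaining together (the 'ping' and 'ready' event names are arbitrary):

const EventEmitter = require('events').EventEmitter;
const emitter = new EventEmitter();

const onPing = message => console.log('ping: ' + message);

emitter
  .on('ping', onPing)                                    // regular listener
  .once('ready', () => console.log('ready fired once'))  // removed after the first emit
  .emit('ready');                                        // -> 'ready fired once'

emitter.emit('ready');           // nothing: the once() listener is gone
emitter.emit('ping', 'hello');   // -> 'ping: hello'
emitter.removeListener('ping', onPing);
emitter.emit('ping', 'ignored'); // nothing: the listener was removed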

Create and use EventEmitter

Let's see how to use an EventEmitter in practice. The simplest way is to create a new instance and use it directly. The following code shows a function that uses an EventEmitter to notify its subscribers in real time when content matching a particular regular expression is found in a list of files:

const EventEmitter = require('events').EventEmitter;
const fs = require('fs');

function findPattern(files, regex) {
  const emitter = new EventEmitter();
  files.forEach(function(file) {
    fs.readFile(file, 'utf8', (err, content) => {
      if (err)
        return emitter.emit('error', err);
      emitter.emit('fileread', file);
      let match;
      if (match = content.match(regex))
        match.forEach(elem => emitter.emit('found', file, elem));
    });
  });
  return emitter;
}

The EventEmitter created by the previous function emits three events:

  • fileread event: triggered when a file is read
  • found event: triggered when the file content is successfully matched by the regular expression
  • error event: triggered when an error occurs while reading a file

Here's how the findPattern() function can be used:

findPattern(['fileA.txt', 'fileB.json'], /hello \w+/g)
  .on('fileread', file => console.log(file + ' was read'))
  .on('found', (file, match) => console.log('Matched "' + match + '" in file ' + file))
  .on('error', err => console.log('Error emitted: ' + err.message));

In the example above, we registered a listener for each of the three event types produced by the EventEmitter created by the findPattern() function.

Error propagation

EventEmitter cannot simply throw an exception when an error condition occurs, because if the event is emitted asynchronously the exception would be lost in the event loop. Instead, the convention is to emit a special event called error and to pass an Error object as an argument. That is exactly what we did in the findPattern() function defined earlier.

It is always good practice to register a listener for the error event, because Node.js treats it in a special way: if no associated listener is found, it automatically throws the exception and exits the program.
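A minimal sketch of my own illustrating both situations; commenting out the on('error', ...) registration makes the emit() call below crash the process:

const EventEmitter = require('events').EventEmitter;
const emitter = new EventEmitter();

// with this listener, the error is handled gracefully; without it, the
// emit() below would throw and terminate the process
emitter.on('error', err => console.error('Caught: ' + err.message));
emitter.emit('error', new Error('something went wrong'));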

Make any object observable

Sometimes creating a new observable object directly from the EventEmitter class is not enough, because a plain EventEmitter does not provide the additional functionality our real-world scenario requires. We can make a generic object observable by extending the EventEmitter class.

To demonstrate this pattern, let's implement the functionality of findPattern() in a class, as shown in the following code:

const EventEmitter = require('events').EventEmitter;
const fs = require('fs');

class FindPattern extends EventEmitter {
  constructor(regex) {
    super();
    this.regex = regex;
    this.files = [];
  }
  addFile(file) {
    this.files.push(file);
    return this;
  }
  find() {
    this.files.forEach(file => {
      fs.readFile(file, 'utf8', (err, content) => {
        if (err) {
          return this.emit('error', err);
        }
        this.emit('fileread', file);
        let match = null;
        if (match = content.match(this.regex)) {
          match.forEach(elem => this.emit('found', file, elem));
        }
      });
    });
    return this;
  }
}

The FindPattern class we defined extends EventEmitter (in pre-ES2015 code, the same result could be achieved with the util.inherits() function provided by the core util module). In this way, it becomes a fully fledged observable class tailored to our scenario. Here is an example of its use:

const findPatternObject = new FindPattern(/hello \w+/);
findPatternObject
  .addFile('fileA.txt')
  .addFile('fileB.json')
  .find()
  .on('found', (file, match) => console.log(`Matched "${match}" in file ${file}`))
  .on('error', err => console.log(`Error emitted ${err.message}`));

We can now see how the FindPattern object has a full set of methods in addition to being observable, obtained by inheriting the functionality of EventEmitter. This is a fairly common pattern in the Node.js ecosystem. For example, the Server object of the core http module defines methods such as listen(), close(), and setTimeout(), and internally it also inherits from EventEmitter, which allows it to emit events when a new request is received, a new connection is established, or the server is closed.

Other examples of objects extending EventEmitter are Node.js streams. We will analyze streams in more detail in Chapter 5.

Synchronous and asynchronous events

As with callbacks, events can be emitted synchronously or asynchronously. It is crucial never to mix the two approaches for the same event type within the same EventEmitter, and, more generally, to be consistent about whether an event is emitted synchronously or asynchronously, to avoid the Zalgo-style problems caused by inconsistent ordering.

The main difference between emitting synchronously and asynchronously lies in the way observers can be registered. When events are emitted asynchronously, listeners can be registered even after the emitting code has started, because events are guaranteed not to fire until the next cycle of the event loop. That is exactly what happens in the findPattern() function above, and it represents the approach used by most Node.js asynchronous modules.

On the contrary, emitting events synchronously requires that all listeners be registered before the EventEmitter starts emitting any event. Look at the following example:

const EventEmitter = require('events').EventEmitter;

class SyncEmit extends EventEmitter {
  constructor() {
    super();
    this.emit('ready');
  }
}

const syncEmit = new SyncEmit();
syncEmit.on('ready', () => console.log('Object is ready to be used'));

If the ready event were emitted asynchronously, this code would work perfectly; however, the event is emitted synchronously, and the listener is registered only after the event has already been sent, so the listener is never invoked and nothing is printed to the console.

Depending on the use case, it can sometimes make sense to use an EventEmitter synchronously; therefore, whether an EventEmitter emits its events synchronously or asynchronously should be clearly documented, to avoid unnecessary errors and surprises.

Events versus callbacks

When defining an asynchronous API, a common dilemma is deciding whether to use EventEmitter or to simply accept a callback. The general differentiating rule is semantic: a callback should be used when a result must be returned asynchronously; events should be used when there is a need to communicate that something has happened, possibly more than once.

However, because the two are so close, and it is possible to achieve the same application scenarios in both ways, there is a lot of confusion. Take the following code for example:

function helloEvents() {
  const eventEmitter = new EventEmitter();
  setTimeout(() => eventEmitter.emit('hello', 'hello world'), 100);
  return eventEmitter;
}

function helloCallback(callback) {
  setTimeout(() => callback('hello world'), 100);
}

The helloEvents() and helloCallback() functions can be considered equivalent in terms of functionality: the first communicates the completion of the timeout using an event, while the second notifies the caller using a callback, passing the message as an argument. What really differentiates them is readability, semantics, and the amount of code required to implement and use them. While we cannot give a definitive set of rules for choosing one style over the other, here are some hints to help you decide.

As a first observation, callbacks have some limitations when it comes to supporting different types of events. In fact, we can still differentiate between multiple events by passing the event type as an argument to the callback, or by accepting several callbacks, but this cannot exactly be considered an elegant API. In this situation, EventEmitter gives a better interface and leaner code.

Another case where EventEmitter is preferable is when the same event can occur multiple times, or may not occur at all. A callback, in fact, is expected to be invoked exactly once, whether the operation succeeds or fails. If the situation we are modeling is repetitive in nature, or we do not know whether or when it will happen, events are the better fit.

Lastly, an API that uses callbacks can notify only one particular callback, while an EventEmitter allows multiple listeners to receive the same notification.

Combining callbacks with events

There are also circumstances in which events can be combined with callbacks. This pattern is extremely useful when we export an asynchronous function; the node-glob module is an example of such a module.

glob(pattern, [options], callback)

The function takes a filename matching pattern as its first argument, followed by a set of options and a callback that is invoked with the list of all the files matching the given pattern. At the same time, the function returns an EventEmitter that exposes the state of the process: for example, a match event is emitted in real time whenever a file is matched, an end event when the whole list is ready, or an abort event when the process is aborted manually. Look at the following code:

const glob = require('glob');
glob('data/*.txt', (error, files) => console.log(`All files found: ${JSON.stringify(files)}`))
  .on('match', match => console.log(`Match found: ${match}`));

Summary

In this chapter, we first looked at how synchronous code differs from asynchronous code. We then explored how to use the callback mechanism and the event mechanism (EventEmitter) to handle some basic asynchronous scenarios, learned the main differences between the two approaches, and saw when each is better suited than the other for solving a specific problem. This is just the first step toward more advanced asynchronous patterns.

In the next chapter, we’ll look at more complex scenarios and see how callback mechanisms and event mechanisms can be leveraged to handle advanced asynchronous control issues.