Many of you have used Node to create an HTTP server at some point, whether for a simple blog or for a large service handling tens of millions of requests. The process by which Node actually creates that HTTP server is less familiar, though, so this article walks through it in the hope of giving you a better understanding of Node.

Let's start with a flow chart, which should make the source code easier to follow.

A preliminary study

Let’s start with a simple example of creating an HTTP Server. The basic process can be divided into two steps

  • Use createServer to obtain a server object
  • Call server.listen to start the listening service

const http = require('http')

// Create a server object
const server = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Response content');
});

// Start listening for requests on port 3000
server.listen(3000)


The process is simple. Using this example together with the source code, we will now analyze what Node does internally when it creates an HTTP server.

Before we do that, to better understand the code, we need to understand some basic concepts:

Fd – File descriptor

A file descriptor is an abstraction used to refer to an open file. Formally, it is a non-negative integer. In practice, it is an index into the table of open files that the kernel maintains for each process. When a program opens an existing file or creates a new one, the kernel returns a file descriptor to the process.
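
As a small illustration of the file descriptor idea, the sketch below opens a file with Node's fs module and inspects the value the kernel hands back. Opening process.execPath (the path of the running node binary) is only a convenience chosen for this example, because that file is known to exist.

```javascript
const fs = require('fs');

// Open a file that is known to exist: the node executable itself.
// (Any readable file would do; this choice is just for the example.)
const fd = fs.openSync(process.execPath, 'r');

// The descriptor is a non-negative integer.
console.log(typeof fd, fd >= 0);

// Descriptors are a limited resource, so release it when done.
fs.closeSync(fd);
```

The same kind of integer is what server.listen accepts in its fd form, discussed later in this article.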

Handle – handle

A handle is an integer that the Windows operating system uses to identify an object created or used by an application. It behaves much like a reference-counted smart pointer. An application uses a handle when it needs to refer to a block of memory or an object managed by another system, such as a database or the operating system. File descriptors on Unix systems are essentially handles too.

In this article, a handle can be understood as a reference to a related object.

In the code excerpts in this article, an ellipsis (...) marks code that has been omitted because it is less relevant to the discussion and does not affect the main logic, such as parameter handling and property assignment.

http.createServer

createServer is a factory method that returns an instance of the Server class exported by the _http_server module:

const {
  Server,
} = require('_http_server');

// http.createServer
function createServer(opts, requestListener) {
  return new Server(opts, requestListener);
}

_http_server

As the Server constructor in the _http_server module shows, http.Server inherits from net.Server:

function Server(options, requestListener) {
  // Allow http.Server() to be called without `new`
  if (!(this instanceof Server)) return new Server(options, requestListener);

  // Parameter adaptation
  // ...

  // Inherit from net.Server
  net.Server.call(this, { allowHalfOpen: true });

  if (requestListener) {
    this.on('request', requestListener);
  }

  // ...
  this.on('connection', connectionListener);
  // ...
}

// http.Server inherits from net.Server
ObjectSetPrototypeOf(Server.prototype, net.Server.prototype);
ObjectSetPrototypeOf(Server, net.Server);

// ...

This inheritance is easy to understand: net.Server is the Node class used to create TCP or IPC servers. HTTP is an application-layer protocol and TCP is a transport-layer protocol; HTTP data travels over TCP and is then parsed. Node's http module is therefore built on top of the net module, adding its own parsing and handling logic, which is why we see this inheritance relationship.

Similarly, net.Server inherits from the EventEmitter class, which provides the event-related methods and properties; you can look those up on your own.

So far, we can see that createServer merely instantiates a subclass of net.Server. It does not start listening; that is done by the server.listen method.
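
The inheritance chain and the fact that createServer does not listen can both be checked from user code. This is a minimal sketch using only public APIs; the request handler is a placeholder.

```javascript
const http = require('http');
const net = require('net');
const { EventEmitter } = require('events');

const server = http.createServer((req, res) => res.end('ok'));

// The prototype chain described above:
// http.Server -> net.Server -> EventEmitter.
console.log(server instanceof http.Server);   // true
console.log(server instanceof net.Server);    // true
console.log(server instanceof EventEmitter);  // true

// createServer alone does not bind anything; only listen() does.
console.log(server.listening);                // false
```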

server.listen

Once a Server instance has been created, you normally call server.listen to start the service and begin handling requests; Koa's app.listen is one example. The listen method can be called in several ways, which we examine one by one below.

1. server.listen(handle[, backlog][, callback])

The first form is less common. Node lets us start a server that listens for connections on a given handle that has already been bound to a port, a Unix domain socket, or a Windows named pipe.

The handle can be a server, a socket (anything with an underlying _handle member), or an object with an fd (file descriptor) property; the Server object we created through createServer is one example.

Once the handle object has been identified, the listenInCluster method is called. Judging by its name, this is the method that starts the listening service:

// Handle is an object with the _handle attribute
if (options instanceof TCP) {
  this._handle = options;
  this[async_id_symbol] = this._handle.getAsyncId();
  listenInCluster(this, null, -1, -1, backlogFromArgs);
  return this;
}

// When handle is an object with fd attributes
if (typeof options.fd === "number" && options.fd >= 0) {
  listenInCluster(this, null, null, null, backlogFromArgs, options.fd);
  return this;
}

2. server.listen([port[, host[, backlog]]][, callback])

The second form, listening on a port, is the common one. Node lets you create a server that listens on a port on a given host, where host can be an IP address or a domain name. When host is a domain name, Node first uses dns.lookup to obtain the IP address. Finally, after validating the port, it again calls the listenInCluster method.
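
A minimal sketch of the port form follows. Port 0 (which asks the operating system for any free port), the host 127.0.0.1, and the variable name bound are choices made for this example, not part of the original walkthrough.

```javascript
const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
let bound = null;

// listen(port, host, callback): the callback runs once the server is bound.
server.listen(0, '127.0.0.1', () => {
  bound = server.address();
  console.log(`listening on ${bound.address}:${bound.port}`);
  server.close(); // stop listening so the process can exit
});
```

Calling server.address() inside the callback is the reliable way to learn which port was actually assigned when 0 is requested.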

3. [server.listen(path[, backlog][, callback])](http://nodejs.cn/s/yW8Zc1)

In the third form, Node lets you start an IPC server that listens on the specified IPC path: a named pipe on Windows and a Unix domain socket on Unix-like systems.

The path parameter identifies the IPC connection. On Unix systems it is a file system pathname; on Windows it must refer to an entry in \\?\pipe\ or \\.\pipe\.

This form, too, ends by calling the listenInCluster method.

server.listen(options[, callback]) is just another way of specifying a port or an IPC path, so I won't cover it here.

Finally, if the arguments match none of the forms above, an error is thrown.
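
The error case can be sketched as follows. Passing an empty options object is just one convenient way to match none of the call forms; the exact error code has changed between Node versions, so the sketch only checks that something is thrown.

```javascript
const net = require('net');

const server = net.createServer();

let thrown = null;
try {
  // An options object with neither `port` nor `path` (and no handle or fd)
  // matches none of the supported call forms.
  server.listen({});
} catch (err) {
  thrown = err;
}

console.log(thrown instanceof Error); // true
console.log(server.listening);        // false
```

The error is thrown synchronously from listen, which is why a plain try/catch is enough here.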

Summary

At this point, we can see that server.listen parses each of the different call forms and, in every case, ends up calling the listenInCluster method.

listenInCluster

First, a brief introduction to cluster.

We all know that JavaScript runs on a single thread, and a thread runs on only one CPU core at a time. Modern processors are multi-core, so to take full advantage of them we need to run multiple Node.js processes to share the load.

Node provides the cluster module to solve this problem. With cluster we can create multiple processes that all listen on the same port at the same time. That may sound like magic, but don't worry: we will demystify the cluster module below.

Let's look at a simple use of cluster:

const cluster = require('cluster');
const http = require('http');

if (cluster.isMaster) {
  // Fork the worker processes.
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }
} else {
  // Worker processes can share any TCP connection.
  // In this case, the HTTP server is shared.
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello world\n');
  }).listen(8000);
}

In this usage, the process that starts the others is called the master process. It does no request handling itself and is only responsible for starting the other processes. The processes it starts are called worker processes; they receive requests and serve them.

The listenInCluster method does one main thing: it distinguishes the master process (cluster.isMaster) from worker processes and applies a different strategy to each:

  • Master process: call server._listen2 directly to start listening
  • Worker process: use cluster._getServer to process the incoming server object and replace server._handle, then call server._listen2 to start listening

function listenInCluster(...) {
  // Import the cluster module
  if (cluster === undefined) cluster = require('cluster');

  // Master process
  if (cluster.isMaster || exclusive) {
    server._listen2(address, port, addressType, backlog, fd, flags);
    return;
  }

  // A non-master process is a child process started by cluster
  const serverQuery = {
    address: address,
    port: port,
    addressType: addressType,
    fd: fd,
    flags,
  };

  // Hand off to the cluster module
  cluster._getServer(server, serverQuery, listenOnMasterHandle);

  function listenOnMasterHandle(err, handle) {
    // ...
    server._handle = handle;
    server._listen2(address, port, addressType, backlog, fd, flags);
  }
}

The master process

server._listen2 is an alias of setupListenHandle.

setupListenHandle calls createServerHandle to obtain a handle suited to the kind of connection the server is listening for, then calls handle.listen to start listening.

function setupListenHandle(address, port, addressType, backlog, fd, flags) {
  // If a handle already exists, reuse it; otherwise create one
  if (this._handle) {
    // do nothing
  } else {
    let rval = null;
    // If host is omitted and fd is not specified:
    // when IPv6 is available, the server accepts connections on the unspecified IPv6 address (::),
    // otherwise it accepts connections on the unspecified IPv4 address (0.0.0.0).
    if (!address && typeof fd !== 'number') {
      rval = createServerHandle(DEFAULT_IPV6_ADDR, port, 6, fd, flags);

      if (typeof rval === 'number') {
        rval = null;
        address = DEFAULT_IPV4_ADDR;
        addressType = 4;
      } else {
        address = DEFAULT_IPV6_ADDR;
        addressType = 6;
      }
    }

    // fd, IPC, an explicit address, or the IPv4 fallback
    if (rval === null)
      rval = createServerHandle(address, port, addressType, fd, flags);

    // If createServerHandle returns a number, an error occurred:
    // emit it on the next tick and return
    if (typeof rval === 'number') {
      const error = uvExceptionWithHostPort(rval, 'listen', address, port);
      process.nextTick(emitErrorNT, this, error);
      return;
    }

    this._handle = rval;
  }

  // ...

  // Start listening
  const err = this._handle.listen(backlog || 511);

  // ...
  // Trigger the listening callback
}
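
The IPv6-then-IPv4 fallback above can be observed from user code. This is a minimal sketch, assuming the script is run directly rather than as a cluster worker; port 0 (any free port) and the variable name bound are choices made for this example.

```javascript
const net = require('net');

const server = net.createServer();
let bound = null;

// No host and no fd: this takes the unspecified-address path shown above.
server.listen(0, () => {
  bound = server.address();
  // Typically '::' when IPv6 is available, otherwise '0.0.0.0'.
  console.log(bound.address, bound.family);
  server.close(); // stop listening so the process can exit
});
```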

createServerHandle calls into the C++ tcp_wrap and pipe_wrap modules to create the Pipe and TCP handles. Both kinds of handle have a listen method, which wraps libuv's [uv_listen](http://docs.libuv.org/en/v1.x/stream.html?highlight=uv_listen#c.uv_listen), which on Linux in turn wraps [listen(2)](https://man7.org/linux/man-pages/man2/listen.2.html). This is how operating system facilities are invoked to start listening for incoming connections and to call back when a new connection arrives.

Pipe is an abstraction over streaming files on Unix (including local domain sockets and pipes) and over named pipes on Windows. TCP wraps TCP services.

function createServerHandle(address, port, addressType, fd, flags) {
  // ...
  let isTCP = false;
  // When the fd option exists
  if (typeof fd === 'number' && fd >= 0) {
    try {
      handle = createHandle(fd, true);
    } catch (e) {
      debug('listen invalid fd=%d:', fd, e.message);
      // libuv error code for an invalid argument (a negative number)
      return UV_EINVAL;
    }
    // ...
  } else if (port === -1 && addressType === -1) {
    // port and addressType are both -1: listening on a Unix domain socket
    // or a named pipe (IPC)
    // Create a Pipe server
    handle = new Pipe(PipeConstants.SERVER);
    // ...
  } else {
    // Create a TCP server
    handle = new TCP(TCPConstants.SERVER);
    isTCP = true;
  }
  // ...
  return handle;
}

Summary

The server.listen logic in the master process is relatively simple: it calls straight into libuv, which uses operating system facilities to start listening.

Worker processes

If the current process is not the master process, things get a lot more complicated.

The listenInCluster method calls the _getServer method exported by the cluster module. The cluster module decides whether the current process is a child process by checking whether its environment contains NODE_UNIQUE_ID, and exports either the child or the master implementation accordingly, so the behavior differs between the two:

const childOrMaster = 'NODE_UNIQUE_ID' in process.env ? 'child' : 'master';

module.exports = require(`internal/cluster/${childOrMaster}`);

A worker process, which does have the NODE_UNIQUE_ID environment variable, therefore uses the _getServer method exported by the child module.
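
From user code, the result of this selection is visible through the public cluster.isWorker flag. This is a minimal sketch, assuming the script is launched directly rather than through cluster.fork(); the variable name role is just for illustration.

```javascript
const cluster = require('cluster');

// cluster.isWorker is true only in a process started by cluster.fork().
// A script launched directly gets the master-side implementation.
const role = cluster.isWorker ? 'worker' : 'master';
console.log(role);
```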

The worker process’s _getServer method does two things:

  • Sends an internalMessage, that is, an inter-process message, to the master process, which calls queryServer and registers the current worker's information. If this is the first worker the master has seen listening on this port/fd, the master starts an internal TCP server to take over listening on that port/fd, and then records the worker.
  • If the handle is a RoundRobinHandle, replaces the part of the listen method of the worker's net.Server instance that listens on the port/fd, so that the worker no longer does the listening itself.

// obj is an instance of net.Server or Socket
cluster._getServer = function(obj, options, cb) {
  let address = options.address;
  // ...
  // const indexesKey = ... ;
  // indexes is a Map object
  indexes.set(indexesKey, index);

  const message = {
    act: 'queryServer',
    index,
    data: null,
    ...options
  };

  message.address = address;

  // Send an internalMessage to notify the master process
  // Accept a callback from the Master process
  send(message, (reply, handle) => {
    if (typeof obj._setServerData === 'function')
      obj._setServerData(reply.data);

    if (handle)
      // When closing the connection, remove handle to avoid memory leaks
      shared(reply, handle, indexesKey, cb);  // Shared listen socket.
    else
      // fake the listen method
      rr(reply, indexesKey, cb);              // Round-robin.
  });

  // ...
};

When queryServer in the master receives the message, it creates either a RoundRobinHandle or a SharedHandle depending on the conditions (platform, protocol, and so on). These are the two ways in which cluster distributes incoming connections.

The master process also builds a key from the port and address information being listened on, and uses it as a unique identifier under which it records the handle and its associated workers.

function queryServer(worker, message) {
  // ...
  const key = `${message.address}:${message.port}:${message.addressType}:` +
              `${message.fd}:${message.index}`;
  let handle = handles.get(key);

  if (handle === undefined) {
    let address = message.address;
    // ...
    let constructor = RoundRobinHandle;
    if (schedulingPolicy !== SCHED_RR ||
        message.addressType === 'udp4' ||
        message.addressType === 'udp6') {
      constructor = SharedHandle;
    }

    handle = new constructor(key, address, message);
    handles.set(key, handle);
  }

  // ...
  handle.add(worker, (errno, reply, handle) => {
    const { data } = handles.get(key);
    // ...
    send(worker, {
      errno,
      key,
      ack: message.seq,
      data,
      ...reply
    }, handle);
  });
}
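
The effect of the key can be sketched in isolation: two workers that ask to listen on the same address and port produce the same key, so they end up sharing one handle in the master. Everything below is a simplified stand-in; the helper names keyFor and lookupOrCreate and the placeholder handle objects are hypothetical, not Node internals.

```javascript
// Stand-in for the master's `handles` Map. The real values are
// RoundRobinHandle or SharedHandle instances.
const handles = new Map();

// Same key layout as in queryServer above.
function keyFor(message) {
  return `${message.address}:${message.port}:${message.addressType}:` +
         `${message.fd}:${message.index}`;
}

function lookupOrCreate(message) {
  const key = keyFor(message);
  let handle = handles.get(key);
  if (handle === undefined) {
    handle = { key, workers: [] }; // placeholder object for this sketch
    handles.set(key, handle);
  }
  return handle;
}

// Two workers asking for the same address and port...
const a = lookupOrCreate({ address: null, port: 8000, addressType: 4, fd: undefined, index: 0 });
const b = lookupOrCreate({ address: null, port: 8000, addressType: 4, fd: undefined, index: 0 });
// ...and a third asking for a different port.
const c = lookupOrCreate({ address: null, port: 9000, addressType: 4, fd: undefined, index: 0 });

console.log(a === b); // true
console.log(a === c); // false
console.log(a.key);   // null:8000:4:undefined:0
```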

RoundRobinHandle

RoundRobinHandle is the default approach on all platforms except Windows. The master process listens on the port and, each time it accepts a new connection, hands it to a worker process in round-robin fashion. In other words, the connection is placed in a queue, an idle worker is taken from the free pool to handle it, and that worker is returned to the pool once it is done, and so on.

function RoundRobinHandle(key, address, { port, fd, flags }) {
  this.key = key;
  this.all = new Map();
  this.free = new Map();
  this.handles = [];
  this.handle = null;
  // Create a server
  this.server = net.createServer(assert.fail);

  // Enable listening
  // this.server.listen(...)

  this.server.once('listening', () => {
    this.handle = this.server._handle;
    // A connection has been received: distribute it
    this.handle.onconnection = (err, handle) => this.distribute(err, handle);
    this.server._handle = null;
    this.server = null;
  });
}

// ...

RoundRobinHandle.prototype.distribute = function(err, handle) {
  this.handles.push(handle);
  const [ workerEntry ] = this.free;

  if (ArrayIsArray(workerEntry)) {
    const [ workerId, worker ] = workerEntry;
    this.free.delete(workerId);
    this.handoff(worker);
  }
};

RoundRobinHandle.prototype.handoff = function(worker) {
  if (!this.all.has(worker.id)) {
    return;  // Worker is closing (or has closed) the server.
  }

  const handle = this.handles.shift();

  if (handle === undefined) {
    this.free.set(worker.id, worker);  // Add to ready queue again.
    return;
  }

  const message = { act: 'newconn', key: this.key };

  sendHelper(worker.process, message, handle, (reply) => {
    if (reply.accepted)
      handle.close();
    else
      this.distribute(0, handle);

    this.handoff(worker);
  });
};
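
The interplay of distribute and handoff is easier to see in a stripped-down model with no sockets or IPC. The class below is a hypothetical sketch, not Node's implementation: RoundRobinSketch, pending, and handled are names invented here, and a worker is assumed to finish each connection instantly so that everything stays synchronous.

```javascript
// Simplified model: `free` holds idle workers, `pending` holds connections
// that have been accepted but not yet handed to a worker.
class RoundRobinSketch {
  constructor(workerIds) {
    this.free = new Map(workerIds.map((id) => [id, { id, handled: [] }]));
    this.pending = [];
  }

  // Called for every new connection.
  distribute(conn) {
    this.pending.push(conn);
    const [entry] = this.free;        // first idle worker, if any
    if (entry === undefined) return;  // all busy: the connection waits
    const [id, worker] = entry;
    this.free.delete(id);
    this.handoff(worker);
  }

  // Give the worker the oldest pending connection, or mark it idle again.
  handoff(worker) {
    const conn = this.pending.shift();
    if (conn === undefined) {
      this.free.set(worker.id, worker); // back of the idle queue
      return;
    }
    worker.handled.push(conn);
    // In Node the next handoff happens after the worker replies over IPC;
    // here it happens immediately to keep the sketch synchronous.
    this.handoff(worker);
  }
}

const sketch = new RoundRobinSketch([1, 2, 3]);
const workers = [...sketch.free.values()];
['c1', 'c2', 'c3', 'c4'].forEach((conn) => sketch.distribute(conn));

console.log(workers.map((w) => w.handled));
// [ [ 'c1', 'c4' ], [ 'c2' ], [ 'c3' ] ]
```

The rotation comes from Map insertion order: a worker that becomes idle is re-inserted at the end of free, so the next connection goes to whichever worker has been idle longest.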

SharedHandle

SharedHandle works as follows: the master process creates the listening handle and sends it to the worker processes, which then accept connections directly.

function SharedHandle(key, address, { port, addressType, fd, flags }) {
  this.key = key;
  this.workers = new Map();
  this.handle = null;
  this.errno = 0;

  let rval;
  if (addressType === 'udp4' || addressType === 'udp6')
    rval = dgram._createSocketHandle(address, port, addressType, fd, flags);
  else
    rval = net._createServerHandle(address, port, addressType, fd, flags);

  if (typeof rval === 'number')
    this.errno = rval;
  else
    this.handle = rval;
}

// Add storage for worker information
SharedHandle.prototype.add = function(worker, send) {
  assert(!this.workers.has(worker.id));
  this.workers.set(worker.id, worker);
  // Send the handle to the worker process
  send(this.errno, null, this.handle);
};
// ...

PS: Windows does not use RoundRobinHandle, for performance reasons. In theory, the second approach (SharedHandle) should be the most efficient. In practice, however, the vagaries of operating system scheduling can make the distribution very uneven: loads have been observed where two out of eight processes took 70% of the connections. By comparison, round-robin distribution is more balanced.
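
Which of the two approaches is in effect can be read from the public cluster API. This is a minimal sketch, assuming it runs in the master process; the variable name policy is just for illustration. The policy can also be chosen through the NODE_CLUSTER_SCHED_POLICY environment variable, with the values rr and none.

```javascript
const cluster = require('cluster');

// SCHED_RR selects round-robin; SCHED_NONE leaves distribution to the
// operating system (the shared-handle approach).
const policy = cluster.schedulingPolicy === cluster.SCHED_RR ? 'rr' : 'none';
console.log(`${process.platform}: ${policy}`);
```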

Summary

A worker process no longer sets up the listening service on its own. Instead, the master process sets up a single listening handle. With round-robin it accepts each connection and forwards it to a worker for processing; with a shared handle it passes the listening handle to the workers.

Conclusion

How Node creates an HTTP server depends on the situation. When the process is the master, Node calls operating system facilities directly through libuv to start listening. When the process is a child (worker) process, Node relies on the master process to set up listening, and connections reach the child either through round-robin distribution or through a shared handle.

Finally, writing an article takes real effort. If you enjoyed this one, a like, a bookmark, and a share would be much appreciated.