Original link:
feclub.cn/post/conten…

Translation source:
medium.com/@becintec/b…

When you have a Node.js application up and running and serving traffic, you probably still can't rest easy. Sooner or later something unexpected happens: a database connection times out, the process runs out of memory, or a deployment forces the Node.js service to restart. The question then is what happens to the requests the process is serving at that moment. Unsurprisingly, when the process terminates, every request it was serving is cut off with it.

Fortunately, there is a way to handle this: graceful shutdown, which lets a Node.js application exit only after it has finished responding to all in-flight requests. Although adding a graceful shutdown mechanism to a Node.js application is relatively easy, the way Docker and npm start child processes and handle signals can lead to unexpected differences between running the application directly on your machine and running it in a Docker container.

A smooth exit

To test graceful shutdown, let's create a very simple Node.js application.

package.json:

{
  "name": "simple_node_app",
  "main": "server.js",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.13.3"
  }
}

server.js:

'use strict';
const express = require('express');
const PORT = process.env.port || 8080;
const app = express();
app.get('/', function (req, res) {
  res.send('Hello world\n');
});
// Responds after a delay so that we can interrupt a request in flight
app.get('/wait', function (req, res) {
  const timeout = 5;
  console.log(`received request, waiting ${timeout} seconds`);
  const delayedResponse = () => {
    console.log("sending response!");
    res.send('Hello belated world\n');
  };
  setTimeout(delayedResponse, timeout * 1000);
});
app.listen(PORT);
console.log(`Running on http://localhost:${PORT}`);

As expected, when we run our application locally, it doesn’t exit gracefully.

$ npm install && npm start
> start simple_node_app
> node server.js

Send a request from another terminal:

$ curl http://localhost:8080/wait

Then, before the request completes, send a SIGTERM signal to the npm process:

# Find the npm process PID
$ ps -falx | grep npm | grep -v grep
  UID    PID    PPID    CMD
  502    68044  31496   npm
# Send a SIGTERM (15) signal to the process
$ kill -15 68044

You can see that the request is cut off as soon as the npm process terminates.

$ npm start
> node server.js
Running on http://localhost:8080
received request, waiting 5 seconds
Terminated: 15

$ curl http://localhost:8080/wait
curl: (52) Empty reply from server

Handling all the signals

To solve this problem, we need to add explicit signal handling to our server.js file (see this great post by Grigoriy Chudnov).

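const server = app.listen(PORT);

// The signals we want to handle.
// Note: the SIGKILL signal cannot be caught, so it is not listed here.
const signals = {
  'SIGHUP': 1,
  'SIGINT': 2,
  'SIGTERM': 15
};

// Stop accepting new connections, then exit once open requests have finished
const shutdown = (signal, value) => {
  console.log("shutdown!");
  server.close(() => {
    console.log(`server stopped by ${signal} with value ${value}`);
    process.exit(128 + value);
  });
};

// Register a listener for each signal we want to handle
Object.keys(signals).forEach((signal) => {
  process.on(signal, () => {
    console.log(`process received a ${signal} signal`);
    shutdown(signal, signals[signal]);
  });
});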
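The handler above relies on server.close() to wait for open requests. As a minimal, self-contained sketch of that behavior (using Node's built-in http module instead of express, with a hypothetical helper name demoGracefulClose and a shortened delay), the following starts a throwaway server, asks it to close while one response is still pending, and records the order of events:

```javascript
'use strict';
const http = require('http');

// Sends one slow request to a throwaway server, calls close() while the
// response is still pending, and resolves with the observed order of events.
function demoGracefulClose(delayMs = 200) {
  return new Promise((resolve, reject) => {
    const events = [];
    const server = http.createServer((req, res) => {
      events.push('request received');
      // Ask the server to stop while this response is still pending.
      server.close(() => {
        events.push('server closed');
        resolve(events);
      });
      events.push('close requested');
      setTimeout(() => {
        events.push('response sent');
        res.end('Hello belated world\n');
      }, delayMs);
    });

    // Port 0 lets the operating system pick a free port.
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      // agent: false uses a one-off connection without keep-alive.
      const req = http.get(
        { host: '127.0.0.1', port, path: '/wait', agent: false },
        (res) => res.resume() // drain the body so the connection can end
      );
      req.on('error', reject);
    });
  });
}

demoGracefulClose().then((events) => console.log(events.join(' -> ')));
```

The close callback only runs after the pending response has been delivered, which is why the shutdown handler can safely call process.exit inside it.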

Now, repeating the steps above, we can see that the Node.js service does not shut down until the request completes:

$ npm start
> node server.js
Running on http://localhost:8080
received request, waiting 5 seconds
process received a SIGTERM signal
shutdown!
sending response!
server stopped by SIGTERM with value 15

The request then ends as normal:

$ curl http://localhost:8080/wait
Hello belated world

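Note: npm reports an error here because it does not expect the Node.js process to exit. Since Node.js is doing exactly what it is supposed to do, the error can be ignored. The exit status 143 is 128 + 15, the value our shutdown handler passes to process.exit for SIGTERM.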

npm ERR! [email protected] start: `node server.js`
npm ERR! Exit status 143
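The exit status follows the common convention of 128 plus the signal number. A small sketch of that arithmetic, using the signal numbers Node exposes in os.constants.signals (the helper name exitStatusForSignal is ours, not part of any library):

```javascript
'use strict';
const os = require('os');

// Conventional exit status for a process stopped by a signal:
// 128 + the signal's number.
function exitStatusForSignal(signalName) {
  const number = os.constants.signals[signalName];
  if (number === undefined) {
    throw new Error(`unknown signal: ${signalName}`);
  }
  return 128 + number;
}

console.log(exitStatusForSignal('SIGTERM')); // 143, since SIGTERM is 15
```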

Dockerizing the service

Docker is a container tool that efficiently packages, deploys, and manages applications. Containerizing a Node.js service with Docker is simple: just add a Dockerfile, build the image, and run the container.

 # Dockerfile 
 FROM node:boron 
 # Create app directory 
 RUN mkdir -p /usr/src/app 
 WORKDIR /usr/src/app 
 # Install app dependencies 
 COPY package.json /usr/src/app/ 
 RUN npm install --production --quiet 
 # Bundle app source 
 COPY . /usr/src/app 
 EXPOSE 8080 
 CMD ["npm", "start"]

Then, we can build and run the Docker application.

$ docker build -q -t grace . && docker run -p 1234:8080 --rm --name=grace grace
> node server.js

Now, to repeat our previous experiment, we send a request to the application running in Docker and stop the container before the request completes. We send the request to the new port (Docker maps the container's port 8080 to port 1234 on the host) and then call docker stop grace, which sends a SIGTERM signal to the container named grace:

$ curl http://localhost:1234/wait
curl: (52) Empty reply from server

What? Why is the request aborted here when the same code exits gracefully when tested on the host?

How npm start works

To understand why, we need to take a closer look at how npm start runs our script.

When we run npm start locally, it starts the Node.js service directly as a child process. We can tell because the node process's parent process ID (PPID) is the npm process's process ID (PID).

$ ps -falx | grep "node\|npm" | grep -v grep
  UID    PID    PPID    CMD
  502    65378  31800   npm
  502    65379  65378   node server.js

We can confirm that npm starts only one child process by listing all processes that share its process group ID (PGID).

$ ps xao uid,pid,ppid,pgid,comm | grep 65378
  UID    PID    PPID    PGID    CMD
  502    65378  31800   65378   npm
  502    65379  65378   65378   node

However, when we examine the processes inside the Docker container, we find something different.

$ ps falx
  UID   PID  PPID   COMMAND
    0     1     0   npm
    0    16     1   sh -c node server.js
    0    17    16    \_ node server.js

In the Docker container, the npm process starts a shell process, which in turn starts the Node.js process. In other words, npm does not start the Node.js process directly.
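The same relationship can also be inspected from inside the application, without ps. As a rough sketch (the helper name describeProcess is hypothetical), Node exposes its own PID and its parent's PID as process.pid and process.ppid:

```javascript
'use strict';

// Summarizes where this Node.js process sits in the process tree.
function describeProcess() {
  return {
    pid: process.pid,          // this process
    ppid: process.ppid,        // the process that started it (npm, sh, ...)
    isInit: process.pid === 1  // true when node itself is PID 1
  };
}

console.log(describeProcess());
```

Logging this at startup in server.js shows whether node is the container's main process (PID 1) or sits below npm and a shell.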

Let's check whether the difference comes from the way docker run starts the Node.js service or from npm inside the container itself. To do this, we open a shell in the running Docker container with docker exec and run npm start manually to see how it starts its child processes.

# Add an extra port mapping to our container so that we can run two node servers
$ docker run -p 1234:8080 -p 5678:5000 --rm --name=grace grace

# Open a shell in the container from another terminal and check the currently-running processes
$ docker exec -it grace /bin/sh
$ ps falx
  UID    PID    PPID    COMMAND
    0      1       0    npm
    0     15       1    sh -c node server.js
    0     16      15     \_ node server.js

# Start up a second node server on a different port
$ port=5000 npm start
> node server.js
Running on http://localhost:5000

Now let’s enter the container from another terminal and look at the process structure:

$ docker exec -it grace /bin/sh
$ ps falx
  UID    PID    PPID    COMMAND
    0     22       0    /bin/sh
    0     46      22     \_ npm
    0     56      46         \_ sh -c node server.js
    0     57      56             \_ node server.js
    0      1       0    npm
    0     15       1    sh -c node server.js
    0     16      15     \_ node server.js

Here we can see that inside the container, however npm start is invoked, it always starts a shell process, which in turn starts the Node.js process. This differs from running npm on the host, where it starts the node process directly.

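Signaling

I'm not sure why npm behaves differently in these two scenarios, but the extra shell process seems to explain why the same code exits gracefully on the host yet is killed abruptly in Docker: the signal has to pass through more intermediate processes before it reaches node.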
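For a signal to reach node through an intermediate process, that process has to pass it on explicitly. As an illustration only (this is not how npm or sh are implemented, and the helper name forwardSignals is hypothetical), a Node-based launcher could forward signals to its child like this:

```javascript
'use strict';

// Re-sends each listed signal to the child when this (parent) process
// receives it. `child` is anything with a kill(signal) method, such as the
// object returned by child_process.spawn().
function forwardSignals(child, signalNames) {
  for (const name of signalNames) {
    process.on(name, () => {
      child.kill(name);
    });
  }
}

// Hypothetical usage in a launcher script:
//   const { spawn } = require('child_process');
//   const child = spawn('node', ['server.js'], { stdio: 'inherit' });
//   forwardSignals(child, ['SIGHUP', 'SIGINT', 'SIGTERM']);
//   child.on('exit', (code) => process.exit(code === null ? 1 : code));
```

An intermediate process that lacks this kind of forwarding leaves the Node.js process unaware that a shutdown was requested.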

Note: there are many good articles about signal handling when the main process in a Docker container runs as PID 1, for example the articles by Grigoriy Chudnov, Brian DeHamer, and Yelp. There are also many solutions, including Yelp's dumb-init library, the observatory library, and docker run --init.

The solution to this signaling problem is very simple: start the Node.js service in the Dockerfile directly with node server.js instead of npm start.

# Dockerfile 
EXPOSE 8080 
CMD ["node", "server.js"]

This is a frustrating solution, because npm start is meant to provide a uniform entry point to your Node.js service. The command gives you many ways to configure how the service starts, but it falls short when it comes to graceful shutdown.

Now, when docker stop sends SIGTERM to the Node.js service in the container, the service shuts down only after the in-flight request has completed.

$ docker build -q --no-cache -t grace . && docker run -p 1234:8080 --rm --name=grace grace
Running on http://localhost:8080
received request, waiting 5 seconds
process received a SIGTERM signal
shutdown!
sending response!
server stopped by SIGTERM with value 15

Our experiment met expectations:

$ curl http://localhost:1234/wait
Hello belated world

When responses take too long

If you do have a request with a particularly long response time, you will notice something very strange:

app.get('/wait', function (req, res) {
  // increase the timeout
  const timeout = 15;
  console.log(`received request, waiting ${timeout} seconds`);
  const delayedResponse = () => {
    console.log("sending response!");
    res.send('Hello belated world\n');
  };
  setTimeout(delayedResponse, timeout * 1000);
});

When we repeat the previous experiment (build the container, send the request, stop the container), the request is once again cut off instead of completing gracefully:

$ docker build -q --no-cache -t grace . && docker run -p 1234:8080 --rm --name=grace --init grace
Running on http://localhost:8080
received request, waiting 15 seconds
process received a SIGTERM signal
shutdown!

$ curl http://localhost:1234/wait
curl: (52) Empty reply from server

The request is aborted because docker stop waits only 10 seconds by default after sending SIGTERM before it sends SIGKILL. SIGKILL cannot be caught or ignored, so once it is sent there is no chance of a graceful exit. However, docker stop has a --time (-t) option that lengthens the wait before the container is forcibly killed. If a request can take 10 seconds or more, consider using this option.
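Besides raising the docker stop timeout, an application can bound its own shutdown so that it exits on its own terms before SIGKILL arrives. The following is only a sketch of that idea, not part of the original setup; the helper name closeWithDeadline and the 9-second figure in the usage comment are assumptions chosen to sit just under the default 10-second limit:

```javascript
'use strict';

// Closes `server` gracefully, but gives up after `deadlineMs` milliseconds.
// Calls done(true) if all connections finished in time, done(false) otherwise.
function closeWithDeadline(server, deadlineMs, done) {
  let finished = false;
  const finish = (graceful) => {
    if (finished) return; // report only the first outcome
    finished = true;
    clearTimeout(timer);
    done(graceful);
  };
  const timer = setTimeout(() => finish(false), deadlineMs);
  server.close(() => finish(true));
}

// Hypothetical usage inside the shutdown handler shown earlier:
//   closeWithDeadline(server, 9000, (graceful) => {
//     process.exit(graceful ? 128 + value : 1);
//   });
```

With this in place, a request that outlives the deadline is still cut off, but the process gets to log the fact and choose its exit status instead of being killed.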

A graceful conclusion

It is important for a web application to exit gracefully so that it can perform any cleanup and finish the requests it is serving. In a Node.js application this is easy to do by adding explicit signal handling to the process; however, it may not be enough for a Docker-based application, because the process may be started through intermediate child processes that affect signal delivery.

The final conclusions are as follows:

Any intermediate process used to start Node.js, such as a shell or npm, may not pass signals on to the Node.js process. It is therefore best to start the process directly with the node command in the Dockerfile so that the Node.js process receives signals correctly.

Also, because Docker sends a KILL signal once the docker stop timeout expires, services with long-running requests need a longer docker stop timeout so that requests can complete before the application is closed.