Author: synccheng

I have recently been building a general-purpose dynamic HTML service for all document categories in Tencent Docs. To make integration and deployment easy for every category, and to follow the trend toward the cloud, I decided to use Docker to pin down the service contents and manage release artifacts in a unified way. This article shares the optimization experience I accumulated while Dockerizing the service, for your reference.

To start with an example: most developers new to Docker will write the project's Dockerfile something like this:

FROM node:14
WORKDIR /app
COPY . /app
RUN npm install
EXPOSE 8000
CMD ["npm", "start"]

Build, package, upload, all in one go. Then check the image and, gosh, a simple Node web service is a staggering 1.3 GB, and image transfer and build are both slow:

It would be fine if only one instance of this image were deployed, but the service has to be available to all developers for high-frequency integration and deployment of environments (see my last article for a solution to high-frequency integration). First, an oversized image inevitably slows down pulling and updating, which degrades the integration experience. Second, after the project goes live there may be thousands of test-environment instances online at the same time, and that kind of per-container overhead is unacceptable for any project. An optimized solution had to be found.

Having identified the problem, I began to study Docker optimization schemes and prepared to operate on my image.

Optimizing the Node project for production

The first step is certainly the area most familiar to front-end developers: optimizing the size of the code itself. The project was developed in TypeScript, and to save time it was originally compiled straight to ES5 with tsc and run as-is. Two main size problems arise here. One is that the TypeScript source written for the development environment gets no further treatment, so the JS that reaches production is not minified.

The other is that the referenced node_modules is bloated: it still contains many npm packages needed only for development and debugging, such as ts-node and typescript. Now that everything is compiled to JS, these dependencies should naturally be removed.

In general, since server-side code is not delivered to users the way front-end code is, services running on physical machines prize stability over size, so these areas usually go untouched. After Dockerization, however, the larger deployment scale makes these problems very visible, and they need to be optimized for the production environment.

The way to optimize these two points is already familiar to front-end developers, and as it is not the focus of this article I will only cover it roughly. For the first point, use webpack + babel to downlevel and minify the TypeScript source. If you are worried about error tracing, add a sourcemap, although for a Docker image it is somewhat redundant, as I will cover in a moment. For the second point, sort out the dependencies and devDependencies of the npm package so that dependencies not needed at runtime are moved out, letting the production environment install with npm install --production.
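Conceptually, the build flow then looks something like the sketch below (a rough outline only; the exact build script depends on the project, and the two install steps end up in separate build stages, as shown later):

npm install                        # full install so the build toolchain is available
npx webpack --mode production      # downlevel, bundle, and minify the TypeScript source
npm install --production           # in a clean production environment, runtime deps only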

Optimizing the project's image size

Use as minimal a base image as possible

As we know, container technology provides process isolation at the operating-system level: a Docker container is itself a process running on an independent operating system. In other words, a Docker image has to package an operating-system-level environment that can run on its own. One important factor in image size therefore becomes obvious: the size of the Linux distribution packaged into the image.
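You can see this packaged operating system directly. For example, the following quick check (assuming Docker is installed locally) starts a throwaway container and prints the distribution it ships with:

docker run --rm alpine cat /etc/os-release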

Generally speaking, shrinking the bundled operating system comes down to two things. The first is stripping out tool libraries the service does not need under Linux, such as python, cmake, and telnet. The second is choosing a lighter Linux distribution. A proper official image will usually provide slimmed-down variants along both of these lines.

Take node:14, the official Node image, as an example. By default it runs on Debian, a large and complete Linux distribution chosen to ensure maximum compatibility. The variant with unneeded libraries removed is called node:14-slim, and the smallest variant is node:14-alpine. Alpine Linux is a highly streamlined, lightweight distribution containing only basic tools; its Docker image is only 4-5 MB, which makes it ideal as the basis for a minimal Docker image.
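To see the difference yourself, pull the three variants and compare the SIZE column (exact numbers vary with the Node version):

docker pull node:14
docker pull node:14-slim
docker pull node:14-alpine
docker images node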

In our case, the dependencies needed to run the service are fixed, so we chose the Alpine variant as the production base image to keep it as small as possible.

Multi-stage builds

At this point we ran into a new problem. Alpine's basic tool set is rudimentary, while a bundler like webpack may have an extensive chain of plugins behind it, so building the project depends on the environment. These libraries, however, are only needed at compile time and can be dropped at run time. Here we can take advantage of Docker's multi-stage build feature to solve the problem.

First, we can run the dependency installation and build under the full image, and give this stage an alias (build in this case):

FROM node:14 AS build
WORKDIR /app
COPY package*.json /app/
RUN ["npm", "install"]
COPY . /app/
RUN npm run build

We can then start a second stage for the production environment, whose base image can be switched to the Alpine variant. To bring the compiled output across, COPY accepts a --from parameter that fetches files from the build stage.

FROM node:14-alpine AS release
WORKDIR /release
COPY package*.json /release/
RUN ["npm", "install", "--registry=http://r.tnpm.oa.com", "--production"]
COPY --from=build /app/public /release/public
COPY --from=build /app/dist /release/dist
CMD ["node", "./dist/index.js"]

Docker generates the final image from the last stage only, so the earlier stages take up no space in the final image, which solves the problem perfectly.

Of course, as the project grows more complex, you may still hit missing-library errors at run time. If the problematic library is identified and has few dependencies of its own, we can install just what it needs, keeping the image small.

One of the most common problems is a reference to node-gyp or node-sass. Since these libraries compile modules written in other languages into Node modules, we need to manually add the g++, make, and python dependencies:

FROM node:14-alpine AS dependencies
WORKDIR /app
COPY package*.json /app/
# install the toolchain as a virtual group so apk del can remove it again
RUN apk add --no-cache --virtual .gyp python make g++ \
    && npm install --registry=http://r.tnpm.oa.com --production \
    && apk del .gyp

Details: github.com/nodejs/dock…

Plan your Docker layers properly

Optimizing build speed

As we know, Docker builds and organizes images with the concept of layers: each Dockerfile instruction generates a new file layer, and each layer records the file-system changes between the state before and after the instruction runs. The more file layers, the larger the image. Docker speeds up builds with a cache: if a layer's instruction and its inputs have not changed, a rebuild can reuse the local cache for that layer directly.

As shown below, when Using cache appears in the log, the cache has taken effect: the layer's operation is skipped and the original cached result is used directly as the layer's output.

Step 2/3 : npm install
 ---> Using cache
 ---> efvbf79sd1eb

Studying Docker's caching algorithm reveals that during a build, once a layer misses the cache, every subsequent layer that depends on it misses as well. Take this example:

COPY . .
RUN npm install

If we now change any file in the repository, the cache for the npm install layer cannot be reused, even though the dependencies have not changed, because the layer above it has changed.

Therefore, to take advantage of the npm install layer's cache as much as possible, we can restructure the Dockerfile like this:

COPY package*.json .
RUN npm install
COPY src .

This way the node_modules dependency cache can still be used when only the source code is changed.

Thus, we get the optimization principle:

  1. Minimize the files touched at each step, changing only what the next step needs, to minimize cache invalidation during the build.
  2. Postpone ADD and COPY instructions that handle frequently changing files until as late as possible.

Optimizing build size

With speed assured, size optimization also deserves attention. Here are three things to consider:

  1. Docker uploads images to the registry layer by layer, which also maximizes caching. Commands whose results rarely change should therefore be pulled out into their own layers, as in the npm install example above.

  2. The fewer the image layers, the smaller the total upload. So if a command sits at the end of the execution chain, that is, it does not affect the cache of other layers, it is best merged with its neighbors to cut down the layer count. For example, instructions that set environment variables and clean up useless files produce no output used elsewhere, so they can be combined into a single RUN command:

    RUN set ENV=prod && rm -rf ./trash
  3. The Docker cache also lives on the build host, layer by layer, so to cut image transfer and download time it is best to build on a fixed physical machine. Specifying a dedicated host in the pipeline, for example, can greatly reduce image preparation time.

Of course, time and space optimization can never fully have it both ways, which is why we must weigh the number of Docker layers when designing a Dockerfile. For example, optimizing for time means splitting out operations such as file copies, which produces more layers and slightly more space.

My advice here is to prioritize build time and, without compromising it, minimize the build size and cache size.
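When weighing these trade-offs, it helps to see what each instruction actually contributes. docker history lists an image's layers along with the size each one adds, which makes the bloated steps easy to spot (replace the placeholder with your own image tag):

docker history <image-name>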

Manage services with a Docker mindset

Avoid process daemons

When writing traditional back-end services, we always reach for a process daemon such as pm2 or forever to make sure the service is watched and restarted automatically after an unexpected crash. Under Docker, however, far from helping, this adds an extra element of instability.

First of all, Docker is itself a process manager, so the crash handling, restarting, logging, and other tasks a process daemon provides are already offered by Docker or by orchestrators built on it (such as Kubernetes), with no extra implementation needed in the application. Moreover, because of how daemons work, the following problems inevitably arise:

  1. A process daemon increases the memory footprint and the image size.
  2. Because the daemon process itself keeps running normally, Docker's own restart policy never triggers when the service fails, and the crash information never reaches the Docker logs, making troubleshooting and tracing harder.
  3. The extra processes skew the CPU and memory metrics that Docker reports.

Therefore, I still do not recommend process daemons, even though a daemon like pm2 offers a Docker-friendly variant, pm2-runtime.
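A minimal sketch of delegating restarts to Docker instead (the restart policy and names here are illustrative, not the project's actual configuration):

# in the Dockerfile, run node directly with no daemon in between
CMD ["node", "./dist/index.js"]

# at deploy time, let Docker itself restart the service on crashes
docker run -d --restart=on-failure:5 --name=app app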

In fact, this mistake comes from carrying an ingrained habit into a new world. When moving services to the cloud, the difficulty lies not only in adjusting code and architecture, but in changing how we think as developers, something we will come to appreciate more deeply along the way.

Persistent storage of logs

Whether for troubleshooting or auditing, back-end services always need logging. The traditional approach is to classify logs and write them to files in some directory. But in Docker, files local to the container are not persistent and are destroyed when the container's life cycle ends. We therefore need to store logs outside the container.

The simplest approach is to use a Docker volume mount, a feature that bypasses the container's own file system and writes data directly to the host machine. It is used as follows:

docker run -d -it --name=app -v /app/log:/usr/share/log app

The -v flag binds a volume to the container, mounting the host's /app/log directory onto the container's /usr/share/log directory. When the service writes logs into that folder, they are persisted on the host and are not lost when the container is destroyed.
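Alternatively, if you would rather let Docker manage the storage location itself, a named volume does the same job (the volume name here is just an example):

docker volume create app-log
docker run -d -it --name=app -v app-log:/usr/share/log app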

Of course, as the deployment cluster grows, logs scattered across physical hosts become hard to manage, and a service orchestration layer is needed for unified handling. Purely for log management, you can ship logs over the network to a hosted log service (such as Tencent Cloud CLS), or manage the containers in batches with an orchestration system such as Kubernetes and keep log collection as a module there. There are many ways to do this, so I won't elaborate on them here.

Choosing a K8s service controller

Beyond image optimization, service orchestration, and in particular the workload controller chosen for deployment, also has a significant impact. Taking the two most popular Kubernetes controllers, Deployment and StatefulSet, as examples, a brief comparison will help you pick the controller that best suits your service.

StatefulSet is a controller introduced in Kubernetes 1.5. It provides ordered deployment, update, and destruction of the pods it manages. Do we need StatefulSet to manage the pods for our product? The official summary is:

Deployment is used to deploy stateless services and StatefulSet is used to deploy stateful services.

That is precise, but not easy to grasp. So what does stateless mean? In my view, the characteristics of StatefulSet can be understood step by step:

  1. Pods managed by a StatefulSet are deployed, updated, and deleted in a fixed order. This suits services that depend on one another, for example starting a database before the query service that uses it.
  2. Because the pods depend on one another, each pod provides a distinct service, so a StatefulSet does not load-balance across the pods it manages.
  3. And because each pod provides a distinct service, each pod gets its own independent storage, not shared with the others.
  4. To guarantee that updates and deployment happen in order, pod names must be fixed, so they do not carry the random suffix that pods generated by a Deployment do.
  5. And since pod names are fixed, the Service paired with a StatefulSet can address each pod directly by name over DNS without providing a ClusterIP; such a Service is therefore called a Headless Service.

From this it should be clear: if you deploy a single service on K8s, or several services with no dependencies between them, Deployment is the simple and most effective choice, with automatic scheduling and automatic load balancing. If services must start and stop in a fixed order, or the data volumes mounted into each pod must survive destruction, StatefulSet is the better option.

Following the principle of not multiplying entities beyond necessity, Deployment is strongly recommended as the controller for any workload running a single service.
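As a minimal sketch of that recommended path (the names, image, and port are placeholders, not the project's actual configuration):

# run the service as a Deployment; K8s schedules and load-balances the pods
kubectl create deployment html-service --image=app:latest
kubectl scale deployment html-service --replicas=3

# expose the pods behind a regular (non-headless) Service
kubectl expose deployment html-service --port=8000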

Closing thoughts

After this round of research I had almost forgotten the original goal, so I hurried to rebuild the Docker image and check the optimization results.

As you can see, the image size optimization paid off handsomely, a reduction of roughly 10x. And if your project does not need such a recent version of Node, you can cut the image size by about half again.

The image registry then compresses stored image files, and the image version packaged with node:14 ends up within 50 MB after compression.

Of course, beyond the visible size numbers, the more important optimization is the architectural shift from machine-oriented services to containerized cloud services.

Containerization is the visible future. As developers, we should stay sensitive to cutting-edge technologies and practice them actively, turning technology into productivity and contributing to our projects' evolution.

