By default, a container's resource usage is unlimited: it can use as many resources as the host kernel scheduler allows. In practice, however, it is often necessary to limit the host resources a container can use. This article describes how to limit the host memory available to containers.

Why limit container memory usage?

It is important to restrict containers from using too much of the host's memory. On Linux hosts, once the kernel detects that there is not enough memory to allocate, it raises an OOM (Out Of Memory) condition and starts killing processes to free up memory. The bad news is that any process can be targeted by the kernel, including the Docker daemon and other important programs. Even more dangerous, if a process critical to the system is killed, the whole system goes down! Consider a common scenario: a large number of containers exhaust the host memory, and as soon as OOM is triggered the kernel starts killing processes. What if the first process the kernel kills is the Docker daemon? The result is that none of the containers work, which is unacceptable!

Docker mitigates this problem by adjusting the OOM priority of the Docker daemon. When selecting a victim, the kernel scores every process and kills the one with the highest score first, then the next, and so on. Because the OOM priority of the Docker daemon is lowered (note that the OOM priority of container processes is not changed), the Docker daemon scores lower not only than container processes but also than some other processes. This makes the Docker daemon much safer.
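The kernel exposes this scoring through procfs: each process has an oom_score (the computed "badness") and an oom_score_adj (the bias that Docker lowers for its daemon). A minimal sketch that inspects the current shell; the commented dockerd line is an assumption that only applies on a host where the Docker daemon is running:

```shell
# oom_score is the kernel's computed badness for OOM victim selection;
# oom_score_adj (-1000..1000) is the adjustable bias added to it.
cat /proc/self/oom_score
cat /proc/self/oom_score_adj

# On a host running Docker, the daemon's (lowered) adjustment could be
# checked with, e.g.:
#   cat /proc/$(pidof dockerd)/oom_score_adj
```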

We can use the following script to intuitively see the score of all processes in the current system:

#!/bin/bash
for proc in $(find /proc -maxdepth 1 -regex '/proc/[0-9]+'); do
    printf "%2d %5d %s\n" \
        "$(cat $proc/oom_score)" \
        "$(basename $proc)" \
        "$(cat $proc/cmdline | tr '\0' ' ' | head -c 50)"
done 2>/dev/null | sort -nr | head -n 40


This script outputs the 40 highest scoring processes in order:

The first column shows each process's score; on this host mysqld ranks first. node server.js is a container process and generally ranks high. The Docker daemon appears much further down the list, even behind sshd.

With this mechanism in place, can we rest easy? No. Docker's official documentation emphasizes that this is only a mitigation, and offers some suggestions to reduce the risk:

  • Understand the application's memory requirements through testing

  • Ensure that the host running the container has sufficient memory

  • Limit the amount of memory a container can use

  • Configure swap for the host

By limiting the amount of memory a container can use, you can reduce the risk of running out of memory on the host.
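As a quick sanity check for the second and fourth suggestions, the host's total RAM and configured swap can be read from /proc/meminfo; a minimal sketch:

```shell
# Read total RAM and swap from /proc/meminfo (values are reported in kB).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
echo "RAM:  $((mem_kb / 1024)) MB"
echo "Swap: $((swap_kb / 1024)) MB"
```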

Stress testing tool

To test container memory usage, I installed stress in an ubuntu image and built a new image named u-stress. All containers demonstrated in this article are created from the u-stress image (the host running the containers is CentOS7). Here is the Dockerfile for the u-stress image:


FROM ubuntu:latest
RUN apt-get update && \
    apt-get install -y stress


To build the image, run the following command:

$ docker build -t u-stress:latest .


Limit the maximum memory usage

Before getting into the more tedious configuration details, let's complete a simple use case: limit the maximum memory a container can use to 300M. The -m (--memory=) option does this:

$ docker run -it -m 300M --memory-swap -1 --name con1 u-stress /bin/bash


The following stress command creates a worker process that allocates memory via malloc:

# stress --vm 1 --vm-bytes 500M


Use the docker stats command to view the actual situation:

In the docker run command above, the -m option limits the container's maximum memory usage to 300M. At the same time, --memory-swap is set to -1, which means memory usage is limited but swap usage is not (the container can use as much swap as the host has available).

Let's use the top command to check the actual memory usage of the stress process:

In the above screenshot, the pgrep command is used to find the processes related to the stress command. The one with the larger PID is the worker that consumes memory, so we check its memory information. VIRT is the process's virtual memory size, so it is about 500M. RES is the amount of physical memory actually allocated, and we see this value hovering around 300M. It looks like we have succeeded in limiting the amount of physical memory the container can use.
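The same VIRT/RES figures can also be pulled from /proc without top. A hedged sketch that inspects the current shell as a stand-in; on the real host you would substitute the stress worker's PID (found with pgrep stress):

```shell
# VmSize corresponds to top's VIRT and VmRSS to top's RES (values in kB).
# Inspect the current shell as a stand-in for the stress worker:
grep -E '^Vm(Size|RSS)' /proc/self/status

# On the host running the container you would target the stress worker,
# e.g.: grep -E '^Vm(Size|RSS)' /proc/"$(pgrep -n stress)"/status
```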

Limit the available swap size

Note that --memory-swap must be used together with --memory.

Normally, the value of --memory-swap covers both the container's memory and its available swap. So --memory="300m" --memory-swap="1g" means the container can use 300M of physical memory plus 700M (1g - 300m) of swap. That is, --memory-swap is the sum of the physical memory and the swap the container may use.

Setting --memory-swap to 0 is the same as leaving it unset. In that case, if --memory is set, the container can use as much swap as the --memory value, for a total of twice --memory in memory plus swap.

If --memory-swap has the same value as --memory, the container cannot use swap at all. The following demo shows what happens when a process requests a large amount of memory and no swap is available:

$ docker run -it --rm -m 300M --memory-swap=300M u-stress /bin/bash
# stress --vm 1 --vm-bytes 500M


The container's physical memory is limited to 300M, but the process tries to allocate 500M. With no swap available, the process is killed by the OOM killer. If enough swap were available, the program could at least keep running.

The --oom-kill-disable option can be used to prevent the OOM kill from happening, but I think OOM killing is healthy behavior. Why stop it?

In addition to limiting the available swap size, you can also set how aggressively a container uses swap, just like the host's swappiness. Containers inherit the host's swappiness by default; to set a container's swappiness explicitly, use the --memory-swappiness option.
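The host default that containers inherit can be read from procfs; a minimal sketch (the docker run line in the comment is an illustrative invocation, not something verified here):

```shell
# The swappiness value containers inherit by default (0..100):
# 0 = avoid swapping, 100 = swap aggressively.
cat /proc/sys/vm/swappiness

# To override it for a single container, e.g.:
#   docker run -it --memory-swappiness=0 -m 300M u-stress /bin/bash
```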

Conclusion

By limiting the physical memory available to a container, you can prevent an abnormal service inside it from exhausting the host's memory. In that case, restarting the affected container is a reasonable policy, and the risk of the host running out of memory is greatly reduced.

Author: sparkdev

Reference: http://www.cnblogs.com/sparkdev/