1. Kernel namespaces

Docker containers are similar to LXC containers and provide similar security features. When a container is started with docker run, Docker creates a separate set of namespaces and control groups for it behind the scenes.

Namespaces provide the most basic and direct form of isolation: processes running in a container cannot see, let alone affect, processes running in other containers or on the host.
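On a Linux host, the namespaces a process belongs to are exposed as symbolic links under /proc/<pid>/ns, with targets of the form pid:[4026531836]. The following is a minimal sketch for inspecting them, not something Docker itself provides; parse_ns_link and read_namespaces are hypothetical helper names chosen for this illustration.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "net:[4026531992]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self", proc_root="/proc"):
    """Return {entry name: inode} for a process, or {} if it cannot be read."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return result  # not Linux, or no permission to inspect this pid
    for name in entries:
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries that are unreadable or oddly formatted
        result[name] = inode
    return result
```

Two processes share a namespace when the inode numbers for the same entry match. Comparing the output on the host with the output inside a container started with default settings should show different values for entries such as pid and net.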

Each container has its own network stack, which means a container cannot access the sockets or interfaces of another container. However, if the host system is configured accordingly, containers can interact with each other over their network interfaces, just as they would with external hosts. When you publish ports or use links to connect two containers, the containers can communicate with each other (the communication policy can be restricted depending on the configuration).

From a network architecture perspective, all containers communicate with each other through the bridge interface of the local host, just as physical machines communicate through physical switches.

Kernel namespaces were introduced between kernel versions 2.6.15 and 2.6.26 (the latter released in July 2008), and the reliability of these mechanisms has been proven on many large production systems over the years. In fact, the idea and design of namespaces are older still: they originated as an effort to bring the features of OpenVZ into the mainline kernel. The OpenVZ project was first released in 2005, so both the design and the implementation are mature.

2. Control groups

Control groups are another key component of the Linux container mechanism and are responsible for resource accounting and limiting.

They provide many useful features, and they help ensure that each container gets a fair share of the host’s memory, CPU, and disk I/O. More importantly, control groups ensure that the host system is not brought down when resource usage inside a single container gets out of hand.

Although control groups play no part in isolating the data and processes of one container from another, they are essential in fending off denial-of-service (DoS) attacks. They are especially important on multi-tenant platforms, such as public or private PaaS, where they help guarantee consistent uptime and performance even when some applications misbehave. Work on control groups began in 2006, and they were merged into the kernel in version 2.6.24.
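Resource limits of this kind are usually requested when the container is started. The sketch below builds a docker run argument list using the --memory, --cpus, and --pids-limit options of the Docker CLI; the function name limited_run_command and the default values are assumptions chosen only for illustration, not recommended settings.

```python
def limited_run_command(image, memory="256m", cpus="0.5", pids_limit=100):
    """Build a `docker run` argument list with cgroup-backed limits."""
    return [
        "docker", "run", "--rm",
        "--memory", str(memory),          # upper bound on container memory
        "--cpus", str(cpus),              # CPU quota, in units of cores
        "--pids-limit", str(pids_limit),  # cap on the number of processes
        image,
    ]
```

The list can be passed to subprocess.run on a host where the Docker CLI is installed. Building a list rather than a shell string avoids quoting problems when the image name comes from user input.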

3. Server protection

Containers and applications are run through the Docker server (the daemon). The Docker service currently requires root privileges to run, so its security is critical.

First, make sure that only trusted users can access the Docker service. Docker allows users to share directories between the host and a container, and it does so without restricting the container’s access rights to them. For example, if a malicious user started a container and mapped the host root directory / to the container’s /host directory, the container could make arbitrary changes to the host’s file system. Sounds crazy, right? But virtually all virtualization systems allow similar resource sharing, and nothing prevents a user from sharing the host root file system with a virtual machine.

This can have serious security consequences. Therefore, when container creation is offered as a service (for example, through a Web server), take extra care to validate the parameters so that malicious users cannot create destructive containers with crafted parameters.
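As one example of such a parameter check, a provisioning service could refuse any bind mount whose host path falls outside an explicit allow-list. This is only a sketch under stated assumptions: is_allowed_bind_source and the /srv/data root are hypothetical, and the check is purely lexical, so it does not resolve symbolic links on the host.

```python
import posixpath


def is_allowed_bind_source(host_path, allowed_roots):
    """Return True only if host_path lies inside one of allowed_roots.

    Paths are normalised lexically; symlinks are NOT resolved here, so a
    real service would also need to resolve them on the Docker host.
    """
    if not host_path.startswith("/"):
        return False  # reject relative paths and named volumes in this sketch
    normalised = posixpath.normpath(host_path)
    for root in allowed_roots:
        root = posixpath.normpath(root)
        if root == "/":
            continue  # never treat the host root itself as an allowed root
        if normalised == root or normalised.startswith(root + "/"):
            return True
    return False
```

With allowed_roots set to ["/srv/data"], a request for /srv/data/app passes, while /, /srv/data/../../etc, and /srv/database are all rejected.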

To protect the server, Docker changed its REST API endpoint (used by clients to communicate with the server) as of version 0.5.2: it now uses a native Unix socket instead of a TCP socket bound to 127.0.0.1, because the latter was vulnerable to cross-site request forgery attacks. Access to the socket is now controlled with standard Unix permission checks.
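Because access is governed by ordinary file permissions, it is easy to audit. The sketch below reports who may reach the socket; it assumes the common default location /var/run/docker.sock, and the helper names world_accessible and check_socket are hypothetical.

```python
import os
import stat


def world_accessible(mode):
    """True if the permission bits grant any access to 'other' users."""
    return bool(mode & stat.S_IRWXO)


def check_socket(path="/var/run/docker.sock"):
    """Return a short report for the socket at path, or None if absent."""
    try:
        info = os.stat(path)
    except OSError:
        return None
    return {
        "is_socket": stat.S_ISSOCK(info.st_mode),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "world_accessible": world_accessible(info.st_mode),
    }
```

A report with world_accessible set to True means any local user can talk to the server, which, given that the server runs as root, is effectively root access.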

Users can still expose the REST API over HTTP. If you do, restrict access to trusted networks or a VPN, or protect the endpoint with stunnel and SSL client certificates. HTTPS and certificates can provide additional protection.
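On the client side, certificate protection amounts to verifying the server against a known CA and presenting a client certificate. The following sketch uses only the Python standard library ssl module; daemon_tls_context is a hypothetical helper, and the file paths passed to it are assumptions that depend on how the certificates were issued.

```python
import ssl


def daemon_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context for talking to a remote API endpoint.

    ca_file:   CA certificate used to verify the server (system CAs if None).
    cert_file: client certificate presented to the server, if any.
    key_file:  private key matching cert_file.
    """
    # create_default_context enables certificate and hostname verification.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

The resulting context can be passed to http.client.HTTPSConnection or a similar client. The server must be configured separately to require and verify client certificates.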

Recent improvements to the Linux namespace mechanism will make it possible to run fully functional containers as a non-root user. This fundamentally solves the security problem caused by sharing file systems between containers and hosts.

The ultimate goal is to improve two important security features:

  • Map the root user of the container to a non-root user on the local host, to reduce the impact of privilege escalation from the container to the host.
  • Allow the Docker server to run without root privileges, delegating operations that require elevated privileges to secure, well-audited child processes. Each of these child processes is allowed to operate only within a narrow scope, such as virtual network setup, file system management, or configuration.
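The first goal relies on user namespaces, where the kernel publishes the ID mapping for a process in /proc/<pid>/uid_map as lines of three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. The sketch below translates a container UID to the host UID under such a mapping; parse_uid_map and to_host_uid are hypothetical helper names, and the sample range used in the note is illustrative only.

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(f) for f in fields))
    return ranges


def to_host_uid(container_uid, ranges):
    """Translate a UID inside the namespace to the UID seen on the host."""
    for inside, outside, length in ranges:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None  # the UID has no mapping in this namespace
```

With a mapping line of "0 100000 65536", root inside the container (UID 0) corresponds to the unprivileged host UID 100000, so a process that escapes the container does not hold root on the host.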

Finally, it is recommended to run Docker on a dedicated server alongside only the related management services (such as SSH, process monitoring, and management tools like NRPE and collectd). All other business services should run inside containers.

4. Kernel capability mechanism

Capabilities are a powerful feature of the Linux kernel that provide fine-grained access control. The Linux kernel has supported capabilities since version 2.2; they split the privileges of root into finer-grained units that can be applied to processes as well as files.

For example, a Web server process that only needs to bind to a port below 1024 does not need to run as root. It only needs to be granted the net_bind_service capability. Many other capabilities exist for similar cases, so that processes do not need full root privileges.

By default, Docker starts containers with only a restricted subset of the kernel’s capabilities.
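The capabilities a process currently holds can be read from the CapEff line of /proc/<pid>/status, which is a hexadecimal bitmask indexed by capability number. The sketch below checks individual bits; the numeric constants come from the kernel header linux/capability.h, while effective_mask and has_capability are hypothetical helper names.

```python
# Capability numbers as defined in <linux/capability.h>.
CAP_NET_BIND_SERVICE = 10
CAP_SYS_MODULE = 16
CAP_SYS_ADMIN = 21


def effective_mask(status_text):
    """Extract the CapEff hex string from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split()[1]
    return None


def has_capability(mask_hex, cap_number):
    """True if the bit for cap_number is set in the hex capability mask."""
    return bool(int(mask_hex, 16) & (1 << cap_number))
```

A process whose mask has bit 10 set can bind to privileged ports even when it is not running as root, and a mask without bit 16 cannot load kernel modules even when it is.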

Using capabilities has many benefits for Docker container security. A typical server runs a number of privileged processes, including SSH, cron, syslogd, hardware management tools (for example, to load modules), network configuration tools, and so on. Containers are different, because almost all of these privileged tasks are handled by the supporting infrastructure outside the container.

  • SSH access is managed by the SSH service on the host.
  • Cron should normally be executed as a user process, with permissions delegated to the applications it serves.
  • The logging system can be managed by Docker or third-party services.
  • Hardware management is irrelevant, so there is no need to run udevd or similar services in the container.
  • Network management is also set up on the host, and the container does not need to configure the network unless it is required.

As the examples above show, in most cases containers do not require “true” root privileges; a few capabilities are enough. For security purposes, containers can therefore run with the unnecessary privileges disabled:

  • Deny all mount operations.
  • Deny direct (raw) socket access on the local host.
  • Deny certain file system operations, such as creating new device nodes or changing file attributes.
  • Disable module loading.

Thus, even if an attacker gains root privileges inside the container, they cannot obtain higher privileges on the local host, and the damage they can do is limited. By default, Docker uses a whitelist approach: every capability is disabled except those that are required. Users can also enable additional capabilities for a container according to their own needs.
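The same whitelist idea can be applied more strictly per container with the --cap-drop and --cap-add options of docker run. The sketch below drops every capability and adds back only those named by the caller; least_privilege_run is a hypothetical helper, and whether a given image actually works with such a small set is an assumption that has to be tested for each application.

```python
def least_privilege_run(image, keep_caps=("NET_BIND_SERVICE",)):
    """Build a `docker run` argument list that whitelists capabilities."""
    cmd = ["docker", "run", "--rm", "--cap-drop", "ALL"]
    for cap in keep_caps:
        cmd += ["--cap-add", cap]  # re-enable only what the service needs
    cmd.append(image)
    return cmd
```

Starting from an empty set and adding capabilities back one at a time makes each privilege an explicit, reviewable decision.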

5. Other security features

In addition to capability mechanisms, several existing security mechanisms can be leveraged to enhance security with Docker, such as TOMOYO, AppArmor, SELinux, GRSEC, etc.

Of these mechanisms, Docker currently enables only capabilities by default. Users can adopt a variety of schemes to enhance the security of Docker hosts, such as:

  • Enable GRSEC and PAX in the kernel. This adds many compile-time and run-time security checks and uses address randomization to defeat many exploits. No Docker-specific configuration is required to enable it.
  • Use container templates with enhanced security features, such as templates with AppArmor profiles and Red Hat templates with SELinux policies. These templates provide additional security features.
  • Define custom security policies using the access control mechanism of your choice.

Just as third-party tools can add features such as network topologies and shared file systems to Docker containers, many similar mechanisms can harden existing containers without changing the core of Docker.

6. Summary

In general, Docker containers are quite secure, especially if you don’t run processes inside the container as root. In addition, users can use existing tools such as AppArmor, SELinux, and GRSEC to enhance security, or even implement more complex security mechanisms in the kernel itself.