This article focuses on introducing the concepts behind Docker in detail; it does not cover installing a Docker environment and only briefly touches on Docker's common operations and commands.

I. Understanding containers

Docker is the world's leading software container platform, so to understand Docker we must start with containers.

1.1 What is a container?

Let's start with the more official explanation of containers:

Containers in one sentence: a container packages software into a standardized unit for development, shipment, and deployment.

  • Container images are lightweight, standalone, executable packages that contain everything needed to run the software: code, runtime, system tools, system libraries, and settings.
  • Containerized software works for both Linux- and Windows-based applications and runs consistently in any environment.
  • Containers give software independence from differences in the surrounding environment (for example, between development and staging environments), helping to reduce conflicts between teams running different software on the same infrastructure.

Now let's take a look at a more down-to-earth explanation of containers.

To put it more plainly, a container is simply a place to hold things, just as a schoolbag holds all kinds of stationery, a wardrobe holds all kinds of clothes, and a shoe rack holds all kinds of shoes. The "things" we put in today's containers are mostly applications, such as websites, programs, or even entire system environments.

1.2 Physical Machines, VMs, and Containers

The comparison between virtual machines and containers is explained in more detail later; here the goal is simply to deepen your understanding of physical machines, virtual machines, and containers through a few illustrations (the images below come from the Internet).

The physical machine:

The virtual machine:

The container:

(figure: container diagram)

By analogy, containers virtualize the operating system rather than the hardware, and containers on the same host share one set of operating system resources. Virtual machine technology, by contrast, virtualizes a set of hardware and runs a complete operating system on top of it. The isolation level of containers is therefore somewhat lower.


I believe the explanation above gives you a preliminary understanding of containers, a concept that is at once unfamiliar and familiar. Now let's talk about some Docker concepts.

II. Some Concepts of Docker

2.1 What is Docker?

To be honest, it is not easy to say exactly what Docker is, so I will explain it through the four points below.

  • Docker is the world’s leading software container platform.
  • Docker is developed and implemented in Go, a language introduced by Google. It builds on the cgroup and namespace features provided by the Linux kernel, as well as union file systems such as AUFS, to encapsulate and isolate processes; this is virtualization at the operating-system level. Because an isolated process is independent of the host and of other isolated processes, it is also called a container.
  • Docker automates repetitive tasks, such as setting up and configuring development environments, freeing up developers to focus on what really matters: building great software.
  • Users can easily create and use containers and put their own applications into them. Containers can also be versioned, copied, shared, and modified, just like normal code. (See the short example after this list.)
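As a small, hedged illustration of the points above, the commands below pull a public image, run it as a container, and then remove it; the nginx image, container name, and port mapping are only examples, not anything specific to this article:

```bash
# Pull a public image and run it as a container in the background,
# mapping port 8080 on the host to port 80 inside the container.
docker pull nginx
docker run -d --name my-web -p 8080:80 nginx

# The containerized application is now reachable at http://localhost:8080.
# Stop and remove the container when done; the image stays available for reuse.
docker stop my-web
docker rm my-web
```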

2.2 The Ideas Behind Docker

  • The shipping container
  • Standardization: ① a standard way to transport ② a standard way to store ③ standard API interfaces
  • Isolation

2.3 Characteristics of Docker containers

  • lightweight

    Multiple Docker containers running on the same machine can share that machine's operating system kernel; they start quickly and need very little compute and memory. Images are built from file-system layers and share common files, which minimizes disk usage and makes images faster to download.

  • standard

    Docker containers are based on open standards and run on all major Linux distributions, Microsoft Windows, and any infrastructure, including VMs, bare-metal servers, and the cloud.

  • security

    Docker isolates applications not only from each other but also from the underlying infrastructure. Docker provides strong isolation by default, so when an application has a problem, it is a problem in a single container rather than across the entire machine.

2.4 Why use Docker?

  • A Docker image provides a complete runtime environment (everything except the kernel), which keeps the application's runtime environment consistent, so problems like "but this code works fine on my machine" no longer come up. — Consistent runtime environment
  • Startup time is on the order of seconds or even milliseconds, greatly reducing development, testing, and deployment time. — Faster startup
  • Unlike on a shared public server, resources are not easily affected by other users. — Isolation
  • Docker handles sudden spikes in server load well. — Elastic scaling and rapid expansion
  • An application running on one platform can be easily migrated to another without worrying about differences in the runtime environment. — Easy migration
  • With Docker, continuous integration, continuous delivery, and continuous deployment can be achieved by building customized application images. — Continuous delivery and deployment (see the sketch after this list)
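As a rough sketch of "customizing an application image" for continuous delivery, the shell snippet below writes a minimal Dockerfile and builds it. The Python base image, the app.py file (assumed to exist in the current directory), and the my-app:1.0 tag are all hypothetical:

```bash
# Write a minimal Dockerfile for a hypothetical Python app (names are examples).
cat > Dockerfile <<'EOF'
# Base image layer
FROM python:3.9-slim
# Working directory inside the image
WORKDIR /app
# Add the application code (app.py must exist in the build context)
COPY app.py .
# Default command when a container starts from this image
CMD ["python", "app.py"]
EOF

# Build the image and run a container from it; the same image can then be
# shipped to any host or CI pipeline unchanged.
docker build -t my-app:1.0 .
docker run --rm my-app:1.0
```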

III. Containers vs. VMs

When it comes to containers, they are inevitably compared with virtual machines. As far as I'm concerned, it doesn't matter whether one will replace the other; the two can coexist harmoniously.

To put it simply: containers and virtual machines offer similar benefits in resource isolation and allocation, but they work differently, because containers virtualize the operating system rather than the hardware, which makes them more portable and efficient.

3.1 Comparison diagram

Traditional virtual machine technology virtualizes a set of hardware, runs a complete operating system on top of it, and then runs the required application processes on that system. Application processes in a container, by contrast, run directly on the host's kernel; the container has no kernel of its own and performs no hardware virtualization. Containers are therefore much lighter than traditional virtual machines.
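A quick way to see that container processes run on the host kernel (a small check, assuming a Linux host with Docker and access to the public alpine image):

```bash
# Kernel version reported on the host
uname -r

# Kernel version reported inside a throwaway Alpine container;
# it prints the same version, because the container uses the host's kernel.
docker run --rm alpine uname -r
```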

3.2 Containers and VMs

  • A container is an abstraction at the application layer that packages code and dependencies together. Multiple containers can run on the same machine, sharing the operating-system kernel while each runs as an isolated process in user space. Compared with virtual machines, containers take up less space (a container image is typically only tens of megabytes) and start almost instantly.

  • A virtual machine (VM) is an abstraction of the physical hardware that turns one server into many. A hypervisor allows multiple VMs to run on a single machine. Each VM contains a complete operating system, one or more applications, and the necessary binaries and libraries, so it takes up a lot of space and is slow to start.

Docker's official website lists many advantages of Docker, but there is no need to dismiss virtual machine technology entirely, because the two have different use cases. Virtual machines are better at completely isolating an entire operating environment; for example, cloud service providers often use virtual machine technology to isolate different users. Docker is usually used to isolate different applications, such as the front end, the back end, and the database.

3.3 Containers and VMs can coexist

As noted above, it doesn't matter whether one replaces the other; containers and virtual machines can coexist harmoniously, each used where it fits best.


IV. Basic Concepts of Docker

There are three very important basic concepts in Docker; once you understand these three, you understand Docker's whole life cycle:

  • Image
  • Container
  • Repository


4.1 Image: A special file system

An operating system is divided into kernel space and user space. On Linux, after the kernel boots it mounts the root file system to provide user-space support. A Docker image is, in essence, such a root file system.

A Docker image is a special file system: in addition to providing the programs, libraries, resources, and configuration files that a container needs at runtime, it also contains configuration parameters prepared for runtime (such as anonymous volumes, environment variables, and users). An image contains no dynamic data, and its contents are not changed after it is built.
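To see this baked-in configuration for yourself, you can inspect any image you have pulled locally. The sketch below uses the mysql:5.7 image that appears later in this article; the exact values printed depend on the image:

```bash
# Print the environment variables, anonymous volumes, and default command
# recorded in the image's configuration (no container needs to be running).
docker image inspect mysql:5.7 --format '{{.Config.Env}}'
docker image inspect mysql:5.7 --format '{{.Config.Volumes}}'
docker image inspect mysql:5.7 --format '{{.Config.Cmd}}'
```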

When Docker was designed, it made full use of Union FS technology and adopted a layered storage architecture, so an image is actually made up of multiple layers of file systems.

An image is built layer by layer, each layer on top of the previous one. Once a layer is built it never changes; any change made in a later layer happens only in that layer. For example, deleting a file from a previous layer does not actually remove it from that layer; it only marks the file as deleted in the current layer. The file is invisible when the final container runs, but it still travels with the image. Therefore, be careful when building images: each layer should contain only what that layer needs to add, and anything extra should be cleaned up before the layer is finalized.

Layered storage also makes it easier to reuse and customize images. You can even use a previously built image as the base layer and add new layers on top of it to build new images with exactly what you need.
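The layers of any local image can be listed with `docker history`; each row corresponds to one build step (shown here for the mysql:5.7 image used elsewhere in this article, and the exact output depends on the image):

```bash
# Show the layers that make up a local image, newest first.
# Each row corresponds to one instruction from the image's build.
docker history mysql:5.7
```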

4.2 Container: A running instance of an image

The relationship between an image and a container is similar to that between a class and an instance in object-oriented programming: the image is the static definition, and the container is a running instance of the image. Containers can be created, started, stopped, deleted, paused, and so on.

The essence of a container is a process, but unlike a process that runs directly on the host, a container process runs in its own isolated namespace. As mentioned earlier, images use layered storage, and so do containers.

The container storage layer has the same life cycle as the container: when the container is deleted, its storage layer is deleted with it, and any information stored there is lost.

According to Docker best practices, containers should not write any data into their storage layer; the storage layer should remain stateless. All file writes should go to data volumes or bind-mounted host directories. Reads and writes in these locations skip the container storage layer and go directly to the host (or to network storage), giving higher performance and stability. A data volume's lifetime is independent of the container: the container can die and the data volume survives. Once data volumes are used, containers can be deleted and re-created at will without losing data. (A short example follows.)
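A small sketch of this practice, using a named volume with the mysql:5.7 image that also appears later in this article; the volume name and root password below are placeholders:

```bash
# Create a container whose data directory is backed by a named volume.
docker run -d --name db \
  -e MYSQL_ROOT_PASSWORD=example \
  -v mydata:/var/lib/mysql \
  mysql:5.7

# Deleting the container does not delete the volume...
docker rm -f db
docker volume ls   # "mydata" is still listed

# ...so a new container can pick up the same data.
docker run -d --name db2 \
  -e MYSQL_ROOT_PASSWORD=example \
  -v mydata:/var/lib/mysql \
  mysql:5.7
```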

4.3 Repository: A place to centrally store image files

Once an image is built, it can easily run on the current host. But if the image needs to be used on other servers, we need a centralized service to store and distribute images, and Docker Registry is exactly such a service.

A Docker Registry can contain multiple repositories; each repository can contain multiple tags, and each tag corresponds to one image. So an image repository is where Docker centrally stores image files, much like the code repositories we already use.

Typically, a repository contains images of different versions of the same software, and tags are used to distinguish those versions. We specify exactly which version of the software we mean with the format `<repository>:<tag>`. If no tag is given, `latest` is used as the default. (See the example below.)
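For example (using the public ubuntu repository purely as an illustration):

```bash
# Pull a specific version of an image using <repository>:<tag>
docker pull ubuntu:20.04

# Without a tag, Docker assumes the default tag "latest",
# i.e. this is equivalent to "docker pull ubuntu:latest".
docker pull ubuntu
```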

Here are a few notes on public Docker Registry services and private Docker Registries:

A public Docker Registry service is a Registry that is open for users and allows them to manage their images. Such public services typically let users upload and download public images for free, and may offer paid plans for managing private images.

The most commonly used public Registry service is the official Docker Hub, which is also the default Registry and hosts a large number of high-quality official images at hub.docker.com. Docker Hub is officially introduced like this:

Docker Hub is an official Docker service for finding and sharing container images with your team.

Suppose we search on Docker Hub for the image we want:

There are several key pieces of information in the Docker Hub search results that help us select the right image:

  • OFFICIAL: indicates an image provided and maintained by Docker itself, which is relatively stable and secure.
  • STARS: similar to stars on GitHub.
  • DOWNLOADS: how many times the image has been pulled, which roughly reflects how widely it is used.

Of course, besides searching on the Docker Hub website, we can also search Docker Hub from the command line with `docker search`; the results are the same.

```
➜ ~ docker search mysql
NAME                 DESCRIPTION                                     STARS   OFFICIAL   AUTOMATED
mysql                MySQL is a widely used, open-source relation…   8763    [OK]
mariadb              MariaDB is a community-developed fork of MyS…   3073    [OK]
mysql/mysql-server   Optimized MySQL Server Docker images. Create…   650                [OK]
```

Docker Hub can be slow to access from China. Some Chinese cloud providers offer public services similar to Docker Hub, such as the TenxCloud image registry, the NetEase Cloud image service, the DaoCloud image market, and the Alibaba Cloud image registry.

Besides using public services, users can also set up a private Docker Registry locally. Docker officially provides a Docker Registry image that can be used directly as a private Registry service. The open-source Docker Registry image only provides the server-side implementation of the Docker Registry API, which is enough to support the docker commands and does not affect normal use; advanced features such as image maintenance, user management, and access control are not included. (A minimal example follows.)
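A minimal sketch of running that official registry image and pushing an image into it; the port, names, and tags are examples only, and a real deployment would also need TLS or an insecure-registry configuration:

```bash
# Start a private registry on port 5000 using the official registry image.
docker run -d -p 5000:5000 --name my-registry registry:2

# Tag a local image so that it points at the private registry, then push it.
docker tag ubuntu:20.04 localhost:5000/ubuntu:20.04
docker push localhost:5000/ubuntu:20.04

# Any host that can reach the registry can now pull it back.
docker pull localhost:5000/ubuntu:20.04
```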


V. Common Commands

5.1 Basic Commands

```bash
docker version           # check the Docker version
docker images            # list images (equivalent to `docker image ls`)
docker container ls -a   # list all containers, including stopped ones
docker ps                # list the running containers
docker image prune       # remove dangling images; with -a / --all, remove all unused images, not just dangling ones
```

5.2 Pulling an Image

```bash
docker search mysql     # search Docker Hub for mysql images
docker pull mysql:5.7   # pull the mysql 5.7 image
docker image ls         # list all downloaded images
```

5.3 Deleting an Image

Suppose we want to delete the mysql image we just downloaded.

Before deleting an image with `docker rmi <image>` (equivalent to `docker image rm <image>`; the image can be referenced by tag name or by image ID), first make sure the image is not being used by any container. This can be checked with the `docker ps` command discussed earlier.

```
➜ ~ docker ps
CONTAINER ID   IMAGE       COMMAND                  CREATED       STATUS       PORTS                               NAMES
c4cd691d9f80   mysql:5.7   "docker-entrypoint.s…"   7 weeks ago   Up 12 days   0.0.0.0:3306->3306/tcp, 33060/tcp   mysql
```

The mysql image is being used by the container with ID c4cd691d9f80, so we need to stop that container first with `docker stop c4cd691d9f80` or `docker stop mysql`.

Then look up the ID of the mysql image:

```
➜ ~ docker images
REPOSITORY   TAG   IMAGE ID       CREATED        SIZE
mysql        5.7   f6509bac4980   3 months ago   373MB
```

Delete the image by IMAGE ID or by REPOSITORY name:

```bash
docker rmi f6509bac4980   # or delete by name: docker rmi mysql:5.7
```

VI. Build, Ship, and Run

Now that we have covered Docker's concepts and common commands, let's talk about Build, Ship, and Run.

If you look at Docker's website, you'll find its slogan: "Docker: Build, Ship, and Run Any App, Anywhere." So what do Build, Ship, and Run actually do?

  • Build (build the image): an image is like a shipping container packed with files, the runtime environment, and other resources.
  • Ship (ship the image): images are shipped between hosts and registries; the registry here is like a giant dock.
  • Run (run the image): a running image is a container, and the container is where the program actually runs.

The Docker workflow, then, is to pull an image from a registry to the local machine and turn that image into a container with a single command. That is why we often describe Docker as a dockworker or stevedore, which matches Docker's Chinese translation, "porter", perfectly. (A condensed end-to-end example follows.)
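Putting the three steps together in one condensed, hedged sketch; the image name, tag, and registry address reuse the hypothetical examples from earlier sections:

```bash
# Build: turn a Dockerfile in the current directory into an image.
docker build -t my-app:1.0 .

# Ship: tag the image for a registry and push it there
# (localhost:5000 is the example private registry from section 4.3;
# Docker Hub works the same way with your own namespace).
docker tag my-app:1.0 localhost:5000/my-app:1.0
docker push localhost:5000/my-app:1.0

# Run: on any host that can reach the registry, pull the image and run it as a container.
docker pull localhost:5000/my-app:1.0
docker run -d --name my-app localhost:5000/my-app:1.0
```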

VII. A Brief Look at Docker's Underlying Principles

7.1 Virtualization Technology

First of all, Docker is container software based on virtualization technology. So what is virtualization technology?

In simple terms, virtualization can be defined as follows:

Virtualization is a resource management technique that abstracts, transforms, and presents a computer's physical resources (CPU, memory, disk space, network adapters, and so on) so that they can be partitioned and recombined into one or more computing environments. It breaks down the rigid barriers imposed by physical structures, letting users make better use of hardware resources than the original configuration allows, and the new virtual slices of these resources are not constrained by how the existing resources are installed, located, or physically configured. Virtualized resources generally include computing power and data storage.

7.2 Docker is based on LXC virtual container technology

Docker's technology is based on LXC (Linux Containers) virtual container technology.

LXC, whose name comes from the abbreviation of Linux Containers, is an operating-system-level virtualization technology and a user-space interface to the Linux kernel's container features. It packages an application into a software container that contains the application code along with the required operating system core and libraries. Through a unified namespace and a shared API, LXC allocates available hardware resources to different software containers and creates an independent sandbox environment for each application, so Linux users can easily create and manage system or application containers.

LXC is implemented mainly with the cgroup and namespace features provided by the Linux kernel, and it can provide an independent operating-system-like runtime environment for software.

Cgroup and Namespace:

  • Namespaces are the Linux kernel's mechanism for isolating kernel resources. They let one group of processes see only the resources associated with it while another group sees only its own, with neither group aware of the other's existence. This is done by placing the relevant resources of one or more processes into the same namespace. Linux namespaces encapsulate and isolate global system resources so that processes in different namespaces have independent views of those resources; changing a system resource in one namespace affects only the processes in that namespace, and processes in other namespaces are unaffected.

    (For more on namespaces, see this article: www.cnblogs.com/sparkdev/p/…)

  • Cgroups, short for control groups, are a mechanism provided by the Linux kernel to limit, account for, and isolate the physical resources (such as CPU, memory, and I/O) used by a group of processes.

    (The above introduction to cgroups comes from www.ibm.com/developerwo… See that article for more information about cgroups.)

The difference between cgroups and namespaces:

Both group processes together, but their purposes are fundamentally different: namespaces isolate resources between process groups, while cgroups uniformly monitor and limit the resources used by a group of processes. (A small demonstration follows.)
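A small demonstration of both mechanisms, assuming a Linux host with Docker installed; the resource figures are arbitrary examples:

```bash
# Namespaces: a container gets its own PID namespace, so inside it
# the process list starts at PID 1 and host processes are invisible.
docker run --rm alpine ps

# Cgroups: resource limits passed to docker run are enforced through cgroups.
# This container may use at most 256 MB of memory and half a CPU core.
docker run -d --name limited --memory=256m --cpus=0.5 nginx
docker stats --no-stream limited   # shows usage against the 256 MB limit
docker rm -f limited
```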

VIII. Summary

This article mainly explains some common concepts in Docker; it does not go into Docker installation, image usage, container operation, and similar topics. I hope readers will pick up that part from books and the official documentation. If you find the official documentation hard to read, here is a book recommendation: Introduction and Practice of Docker Technology, 2nd Edition.

IX. Recommended Reading

  • 10 minutes to understand Docker and K8S
  • Starting from scratch K8s: Explain the basic concept of K8s containers

X. References

  • Linux Namespace and Cgroup
  • LXC vs Docker: Why Docker is Better
  • CGroup introduction, application example, and principle description