I. Preface
Computing technology iterates quickly, and we often wonder how anyone can keep up with so many things. Honestly, I have been tempted to give up, but as a committed overachiever, how could I? So this weekend I am spending some more time on Docker (not a new technology, admittedly). Enough chatter, let's get to the substance.
II. The history of virtualization technology
1. Physical machine era
How did we deploy applications in the era of physical machines, before virtualization? First we needed a physical machine, then we installed an operating system on it, then we installed the dependencies our application needs, and only then could we run the application on that operating system.
So what are the drawbacks of this application deployment solution?
- Deployment is very slow. The deployment process is complex and inefficient.
- The cost is very high. The cost here is mainly the cost of physical machines: even a small application needs its own physical machine, and each physical machine can only have one operating system installed. If we want to deploy on a different system, we have to purchase a new physical machine.
- Idle resources are difficult to reuse
2. Virtualization era
Virtualization technology emerged to solve the problems of physical machines. When it comes to virtualization, virtual machines immediately come to mind. The following shows the working principle of virtual machines.
A virtual machine (VM) setup has one more layer than a physical machine: the Hypervisor. A Hypervisor is hardware virtualization software that abstracts the underlying hardware and exposes virtual hardware interfaces to multiple operating systems. On top of the Hypervisor layer there are three parallel virtual machines, and each of them has an additional Guest OS layer compared to the physical machine, that is, each virtual machine has its own operating system. In general, virtual machine technology uses the Hypervisor to simulate the hardware (CPU, memory, I/O, etc.) needed to run an operating system, and then installs a new operating system on top of it, so that users can run applications inside the virtual machine and achieve isolation.
Virtual machines are a great improvement over physical machines, but what defects do they still have?
- Each VM occupies resources of the host, and multiple VMs may compete for those resources, which can seriously affect system response.
- Each time a new virtual machine is created, its environment has to be configured again, which slows developers down.
- Each VM itself occupies roughly 100 to 200 MB of memory.
3. Docker era
What is Docker? Here is the introduction from Baidu Baike:
To put it simply, Docker is a container engine. Developers put an application into a container and then publish it. Containers do not affect one another, which achieves isolation.
In fact, Docker is a wrapper around Linux containers, and Linux containers are a virtualization technology provided by Linux. A Linux container does not simulate a complete operating system like a virtual machine does. Instead it isolates a process, which is equivalent to putting a sandbox around that process. From the point of view of a process inside a container, everything it sees is virtualized, as shown below:
The structure is similar to that of a virtual machine, except that a container has no guest operating system and a Docker engine takes the place of the Hypervisor.
III. Implementation principles of Docker
1. Isolation technology
We said that a Docker container is a wrapped process, and you might wonder why it is a process. The reason is simple: a process is the running state of a program, and it holds all the resources the program needs to run (memory, CPU, I/O devices, etc.). A container wraps that whole process and isolates it from other applications.
This isolation is what we will talk about next. Docker's isolation makes use of the Namespace mechanism in Linux, and a Namespace is requested through optional flags when Linux creates a process. Besides the PID Namespace (the CLONE_NEWPID flag shown below), there are also Mount, UTS, IPC, and Network Namespaces, each with its own function. The Mount Namespace lets the isolated process see only the mount point information in its own Namespace. The Network Namespace lets the isolated process see only the network devices and configuration in its own Namespace.
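/* clone() expects a pointer to the top of the child's stack, not its size; stack is assumed to be a buffer of stack_size bytes */
int pid = clone(main_function, stack + stack_size, CLONE_NEWPID | SIGCHLD, NULL);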
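One way to observe which Namespace a process belongs to is to read the symbolic links under /proc/self/ns. The following is a minimal sketch, not part of Docker itself. The function name read_namespace_id is made up for illustration, and it assumes a Linux system with /proc mounted.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/*
 * Read the namespace identifier of the calling process for one namespace
 * type, e.g. "pid", "mnt", "uts", "ipc" or "net".
 * (read_namespace_id is a hypothetical helper name for this sketch.)
 * On success the link text, such as "pid:[4026531836]", is stored in out
 * and 0 is returned. On failure (no /proc, unknown type, empty buffer)
 * the function returns -1. A link longer than the buffer is truncated.
 */
int read_namespace_id(const char *ns_type, char *out, size_t out_len)
{
    char path[64];
    int written = snprintf(path, sizeof path, "/proc/self/ns/%s", ns_type);
    if (written < 0 || (size_t)written >= sizeof path || out_len == 0)
        return -1;

    /* Each entry under /proc/self/ns is a symbolic link whose target
       names the namespace. readlink() does not add a terminating NUL. */
    ssize_t n = readlink(path, out, out_len - 1);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Calling read_namespace_id("pid", buf, sizeof buf) on the host and again inside a container should give different numbers in the brackets, because the container process lives in its own PID Namespace.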
But this isolation approach has its drawbacks. First, a container is just a process on the host machine, so it uses the host's operating system kernel. If we wanted to run a Linux container directly on a Windows host, it could not work, because there is no Linux kernel for the container to share. Second, many resources in the Linux kernel cannot be put into a Namespace, such as the system time. This means that if the time is changed from inside a container, the time on the host machine changes as well, which is definitely not acceptable. Problems like these are what make applications in containers more exposed to attack.
2. Cgroups
A Namespace lets the container see only its own contents, but as a process it still has equal standing with every other process on the host machine. This means that other processes can compete with the container for resources (CPU, memory, etc.). Both extremes are unreasonable: other processes could eat up all the resources on the host machine, or the container could.
So the Linux kernel provides Cgroups to set resource limits for processes. Cgroups cap how much of each resource a process may use, including CPU, memory, disk, network bandwidth and so on.
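As an illustration of what such a limit looks like, here is a minimal sketch that formats the value for the cpu.max file of cgroup v2, which holds a quota and a period in microseconds ("max" means no limit). The function name format_cpu_max and the choice to treat a non-positive CPU count as "no limit" are assumptions of this sketch. It only builds the string. Actually writing it under /sys/fs/cgroup requires suitable privileges and is not shown, and the exact files Docker uses may differ by version and cgroup driver.

```c
#include <stdio.h>

/*
 * Format the value one would write to a cgroup v2 "cpu.max" file:
 * "<quota> <period>" in microseconds, or "max <period>" for no limit.
 * cpus is the amount of CPU time allowed (0.5 = half of one CPU).
 * A value <= 0 is treated as "no limit" (a choice made for this sketch).
 * Returns 0 on success, -1 if the arguments or the buffer are unusable.
 */
int format_cpu_max(double cpus, long period_us, char *out, size_t out_len)
{
    if (period_us <= 0 || out == NULL || out_len == 0)
        return -1;

    int written;
    if (cpus <= 0.0) {
        written = snprintf(out, out_len, "max %ld", period_us);
    } else {
        /* quota = cpus * period, rounded to the nearest microsecond */
        long quota_us = (long)(cpus * (double)period_us + 0.5);
        written = snprintf(out, out_len, "%ld %ld", quota_us, period_us);
    }
    /* snprintf reports the length it wanted; treat truncation as failure */
    return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}
```

For example, format_cpu_max(0.5, 100000, buf, sizeof buf) describes a process group that may use at most 50000 microseconds of CPU time in every 100000 microsecond period, that is, half a CPU.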
3. Images
An application process in a container should see a completely independent file system, so that it can work in its own directories without affecting the host, and without being affected by the host or by other containers. By default, however, a newly created container simply inherits the mount points of the host machine. This means that the files seen in the container are the same as those on the host machine, so how can we change that?
There are two main points:
- When creating the process, pass the Mount Namespace flag. As mentioned above, this flag changes the container process's view of the file system's mount points.
- When creating the process, we also need to tell it which directories have to be remounted, because the Mount Namespace does not take effect for a directory until that directory is remounted.
mount("none", "/tmp", "tmpfs", 0, "");
For example, this line tells the container to remount the /tmp directory as tmpfs.
So what does this have to do with the image we are talking about? A container image is a file system that provides an isolated execution environment for container processes. It contains /bin, /etc, /proc and other directories and files.
It is important to note that an image contains the files, configuration, and directories of an operating system, but not the operating system kernel. The kernel used in a container is the host operating system's kernel, which is shared.
With an image, whether locally, in the cloud, or on any other machine, users can unpack it and get the complete environment the application needs to run. For example, we can make an image from the ISO of the Ubuntu operating system.
Another very important concept for images is layering, which images use to solve the problem of reuse. With the AUFS storage driver, the image layers are normally placed in the /var/lib/docker/aufs/diff directory. Let's look at an example:
/var/lib/docker/aufs/diff/6e3be5d2ecccae7cc...=rw
/var/lib/docker/aufs/diff/6e3be5d2ecccae7cc...-init=ro+wh
/var/lib/docker/aufs/diff/32e8e20064858c0f2...=ro+wh
/var/lib/docker/aufs/diff/2b8858809bce62e62...=ro+wh
/var/lib/docker/aufs/diff/20707dce8efc0d267...=ro+wh
/var/lib/docker/aufs/diff/72b0744e06247c7d0...=ro+wh
/var/lib/docker/aufs/diff/a524a729adadedb90...=ro+wh
This is the information for each image layer under /var/lib/docker/aufs/diff. Structurally, it can be divided into three parts:
Read/write layer
This layer can also be called the container layer. It holds the user's additions, deletions, and changes: every modification made by the user acts only on this layer, and a file here hides the file of the same name in the lower layers. Suppose a file A exists in a read-only layer and I need to change it. The layers are searched from top to bottom for the file, the file is copied up to the container layer, and the change is then applied to that copy, leaving the file in the read-only layer untouched.
Init layer
The init layer holds files that are only valid for the current container. These files are not committed when docker commit is run. Only the read/write layer is committed.
Read-only layer
The read-only layers are the shared part of the image. Developers only make incremental changes on top of them, so only the content that differs from the read-only layers has to be maintained.
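The top-down lookup and the copy to the container layer described above can be sketched with a small in-memory model. This is only an illustration of the idea, not how AUFS is implemented. The struct layer type, the function names, and the fixed sizes are all made up for this sketch, and it leaves out the init layer and deletions.

```c
#include <string.h>

#define MAX_FILES 8
#define PATH_LEN  32
#define DATA_LEN  64

/* One layer of the stack: a flat list of paths and their contents. */
struct layer {
    int writable;                    /* 1 only for the top (container) layer */
    int count;                       /* number of files stored in this layer */
    char paths[MAX_FILES][PATH_LEN];
    char data[MAX_FILES][DATA_LEN];
};

/* Index of path inside one layer, or -1 if the layer does not have it. */
static int find_in_layer(const struct layer *l, const char *path)
{
    for (int i = 0; i < l->count; i++)
        if (strcmp(l->paths[i], path) == 0)
            return i;
    return -1;
}

/* Read: search from the top layer (index 0) downwards; the first hit wins. */
const char *layered_read(struct layer *layers[], int n, const char *path)
{
    for (int i = 0; i < n; i++) {
        int idx = find_in_layer(layers[i], path);
        if (idx >= 0)
            return layers[i]->data[idx];
    }
    return NULL;
}

/*
 * Append text to a file. Only the top layer is ever modified: if the file
 * lives in a lower layer, its content is first copied up to the top layer
 * and the change is applied to that copy. Returns 0 on success, -1 on error.
 */
int layered_append(struct layer *layers[], int n, const char *path,
                   const char *text)
{
    if (n <= 0 || !layers[0]->writable)
        return -1;

    struct layer *top = layers[0];
    int idx = find_in_layer(top, path);
    if (idx < 0) {
        if (top->count == MAX_FILES || strlen(path) >= PATH_LEN)
            return -1;
        /* Copy-up step: take the current content from whichever lower
           layer has the file, or start empty if no layer has it. */
        const char *lower = layered_read(layers, n, path);
        idx = top->count;
        strcpy(top->paths[idx], path);
        strcpy(top->data[idx], lower ? lower : "");
        top->count++;
    }
    if (strlen(top->data[idx]) + strlen(text) >= DATA_LEN)
        return -1;
    strcat(top->data[idx], text);
    return 0;
}
```

With a stack such as { &top, &base }, where base is a read-only layer containing /etc/motd, appending to /etc/motd leaves the content in base unchanged, while later reads return the modified copy from top. Keeping the lower layers immutable is what allows several containers to share them.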