Kubernetes (K8s for short) and container technology are among the hottest technologies of recent years. K8s is an open-source container orchestration tool originally developed by Google. Today I want to start by talking about my understanding of containers, what K8s is, and why it has become so popular.
Why Docker
K8s is a container orchestration tool, so to understand what K8s is, we first need to understand what containers are and why container technology is used.
A Linux container, figuratively speaking, is actually just a process, one that has been tricked by the operating system. Why say that? A process running in a container has the following characteristics:
- It is isolated by the host operating system: it cannot see other processes on the host and believes its own PID is 1
- It is limited by the host's hardware resources: it can use only a portion of the host's CPU, memory, and other resources
- Its storage is restricted by the host operating system: the process takes some directory on the host to be the root directory of its filesystem
The techniques for tricking a process are the three pillars of container technology: namespaces for resource isolation, cgroups for resource limiting, and rootfs for disguising a host directory as the process's root directory. All three already existed in Linux; Docker's innovation was to integrate them and to introduce the concepts of the container image and the image registry. An application process, together with its dependencies and related environment, is packaged into an image file that can be distributed and reused, making it easy to move containers between machines. Thus, anywhere Docker can run, a container image can be run as a container instance containing the application process and its dependent environment.
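The namespace mechanism is easy to observe on any Linux machine, no Docker required: the kernel lists each process's namespaces under `/proc/<pid>/ns`. A minimal sketch:

```shell
# Every Linux process belongs to a set of kernel namespaces (pid, mnt,
# net, uts, ipc, ...). A containerised process is simply given fresh
# namespace entries here instead of sharing the host's.
ls -l /proc/self/ns

# `unshare` (util-linux) creates new namespaces for a command. With a
# new PID namespace, the command below sees itself as PID 1 -- exactly
# the "tricked process" described above (requires root):
#   sudo unshare --pid --fork --mount-proc sh -c 'ps ax'
```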
Here is a brief look at a production scenario, borrowed from the official Kubernetes documentation, to see what problems container technology solves.
The figure on the left shows the traditional way of deploying applications on physical machines. Notice that besides the program itself, running an application requires its configuration, libraries, and other environment dependencies. An application typically passes through a development environment, a staging environment, a beta environment, grayscale testing, a production environment, and so on. Code that runs fine in the development environment may hit all kinds of problems in the other environments because of differences in dependencies, configuration, and security requirements. Operations engineers are kept busy unifying dependencies across environments, and CI/CD logic is complicated and hard to standardize because different application languages have their own special dependency requirements.
The figure on the right shows the same scenario with container technology. A container image is essentially the application process plus all of its runtime environment, configuration, and dependencies. As long as the underlying layer of each environment can run Docker, deployment is identical across environments. Developers no longer worry that differences between production and development will cause the application to misbehave; operations engineers only need to ensure the container image runs properly when deploying an application; and CI/CD automation becomes much easier to implement. Development efficiency and application iteration speed go up, and the operations workload goes down.
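This "image = process + environment" idea shows up concretely in an image build file. A minimal sketch, assuming a hypothetical Python web app with an `app.py` and a `requirements.txt`:

```dockerfile
# Base layer: a fixed OS + runtime, identical in every environment.
FROM python:3.11-slim

# Dependencies are baked into the image, so development, beta, and
# production all run exactly the same library versions.
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

# The application itself.
COPY app.py .

# The process to be "tricked": it runs as PID 1 inside the container.
CMD ["python", "app.py"]
```

Built once with `docker build`, the resulting image runs the same way on any host where Docker is available.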
At this point, a comparison between containers and traditional virtual machines is worthwhile. Both can be understood as virtualization; the essential difference is that a container virtualizes at the operating-system level, while a virtual machine virtualizes at the hardware level. See the picture below.
Because a virtual machine carries an extra Guest OS layer, it incurs the additional overhead of hardware virtualization on the host. And because it contains a complete operating system, a VM image is usually several gigabytes in size, which makes it hard to move quickly between environments. The hypervisor layer is also heavier than the Docker Engine. A Docker container, by contrast, is essentially just a process running on the host, isolated from other processes by Docker. To put it simply, agility and efficiency are the biggest advantages of containers over virtual machines, but there is a trade-off: containers are isolated only at the operating-system level, so the isolation between containers is less thorough than between VMs.
Why Kubernetes
So what is Kubernetes? What problems does it solve? Why is K8s always mentioned alongside container technology these days? Let's talk about why K8s.
Back in the real world, if you have already started packaging your application with Docker images, you can distribute and deploy it easily, regardless of the differences between environments. But doesn't it feel like something is still missing?
In a real business system, an application that merely deploys and runs is not the whole story: you also have to consider access to the application's containers, horizontal scaling, monitoring, security, backup, disaster recovery, and other factors. And a complete business system is rarely a single application; the relationships between applications, and the forms in which they run, must be handled too. For example, a web service may need web servers, caching systems, databases, and other applications. After containerized deployment these may end up in different containers on different hosts, so enabling access between them means handling the access relationships among containers. As another example, an application that optimizes a cache also runs in a container, but it only needs a container instance to run periodically, perform the optimization task, and terminate on completion, which means handling different running forms for different container applications. Managing container applications and the relationships between containers in this way is called container orchestration and scheduling, and Kubernetes is now the de facto standard container orchestration platform.
So, concretely, what can K8s do?
In the official documentation, the functions of K8s are described as follows. Kubernetes addresses many common needs of applications running in production, such as:

- A Pod model for composite application containers
- Mounting external storage
- Secret management
- Application health checks
- Replicating application instances
- Horizontal autoscaling
- Service discovery
- Load balancing
- Rolling updates
- Resource monitoring
- Log collection and storage
- Support for introspection and debugging
- Identity and authorization
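Several of the items above map directly onto fields of a Pod spec. A hedged sketch (names such as `web` and `app-secret` are made up for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx:1.25
    resources:              # resource limits (cgroups under the hood)
      limits:
        memory: "256Mi"
        cpu: "500m"
    livenessProbe:          # application health check
      httpGet:
        path: /healthz
        port: 80
    volumeMounts:
    - name: secret-vol      # secret management
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-vol
    secret:
      secretName: app-secret
```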
As you can see from these features, K8s already looks a lot like a cloud platform, fully capable of meeting the needs of production-grade container application management. Here is a diagram of the simplest K8s system:
A K8s cluster is composed of a master node and multiple worker nodes. Developers submit the application's container image, along with a description file specifying how many instances to run and how, to the K8s master node. Based on the overall state of the cluster, the master deploys the applications onto worker nodes as required. For developers, K8s makes it easy to deploy applications without caring about infrastructure; for operations engineers, the focus shifts from maintaining specific applications to maintaining the K8s cluster. Moreover, neither developers nor operations engineers need to care which node an application lands on; K8s decides everything automatically. Compared with traditional application deployment, doesn't that look good?
Docker's Swarm and the Apache Foundation's Mesos were also leading tools for container orchestration, so why did K8s emerge as the de facto standard? As I understand it, there are two big differences between K8s and Swarm (I won't go into detail on Swarm and Mesos, which I haven't used much):
First, K8s adds another layer of abstraction on top of containers: the Pod.
Unlike the other two tools, the atomic object K8s manages is not the container but the Pod. The official documentation defines a Pod as a group of one or more containers that share storage and network, together with a specification of how to run those containers, so a Pod is really an abstraction. All K8s operations on containers, such as dynamic scaling and monitoring, are in fact management of Pods. So what does this extra layer of abstraction buy us? As mentioned above, a container is essentially a special process. Imagine a web service in which the logs generated by the web application process need to be processed by a big-data agent process. To containerize this business, there are usually two options. One is to run two separate containers that mount the same host directory to share the logs. The other is to pack the web service and the agent into a single container, with a supervisor process as the container's entrypoint managing both. With the first approach, the two containers are pinned together on one host; to scale the business instances horizontally, the operation and storage mounts of both containers must be coordinated, which is logically complicated. With the second approach, every container carries an extra supervisor process, and, more importantly, because the entrypoint is that supervisor, the web application and the big-data agent are invisible to Docker. As long as the entrypoint is alive, Docker considers the container healthy, even if the Nginx inside keeps erroring out and restarting.
Now let's see what happens with the Pod concept. The web service container and the big-data agent container become two container instances in one Pod. First, containers in a Pod share storage and network namespaces, which means the two processes share data directly, with no extra mounting required. Second, the Pod is managed by K8s as a whole: K8s monitors the status of each container in the Pod and, when there is a problem, intervenes automatically according to policy. In this sense, a Pod is more like a traditional virtual machine.
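The web-plus-agent example above could be expressed roughly like this; the image names and paths are placeholders, and an `emptyDir` volume stands in for the shared log storage:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-agent
spec:
  containers:
  - name: web
    image: nginx:1.25
    volumeMounts:
    - name: logs                      # the web server writes logs here...
      mountPath: /var/log/nginx
  - name: log-agent
    image: example/log-agent:latest   # placeholder agent image
    volumeMounts:
    - name: logs                      # ...and the agent reads the same files
      mountPath: /logs
  volumes:
  - name: logs
    emptyDir: {}                      # shared, Pod-lifetime scratch volume
```

Both containers are scheduled together and scaled together, while K8s health-checks each one individually.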
The second and more important difference is the K8s declarative API (newer versions of Swarm also support this; again, I haven't used it, so I won't go into detail). What is a declarative API? Refer to the description file in the system diagram above. Suppose I need to run 10 web service containers in a cluster. With a traditional imperative API, I would call commands step by step to build each container. With the declarative API, I simply tell K8s that I want 10 web containers, and K8s keeps the number of instances at 10 automatically: when a Pod exits because of a problem, K8s starts a new Pod so that the cluster always has 10 running. A configuration file describing the desired cluster state is all it takes to manage the cluster, with no need to worry about the implementation process.
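The "10 web containers" example corresponds to a Deployment manifest: you declare `replicas: 10` and let K8s reconcile the actual state toward it (the `web` name and nginx image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 10              # desired state: always 10 Pods
  selector:
    matchLabels:
      app: web
  template:                 # Pod template the Deployment stamps out
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25
```

`kubectl apply -f web.yaml` submits this desired state; if a Pod dies, the controller starts a replacement to get the count back to 10.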
Finally, to sum up:
- Why Docker: Running applications with container technology is more efficient, lightweight, and resource-saving than using physical machines or virtual machines, and it greatly simplifies deploying and distributing applications across different environments.
- Why Kubernetes: For a production cluster, merely running containers is not enough; container applications must also be orchestrated and managed as the cluster behind a business system, and several advantages of K8s have made it the current de facto standard for container cluster orchestration and management. Last but not least, Docker is not the only container technology, and Kubernetes does not manage only Docker containers. However, in terms of market share, adoption, and popularity in the developer community, they are currently the most mainstream container solutions, and for production environments there is little need to consider alternatives.