This article is published with the authorization of its author, Liu Chao, of the Netease Cloud community.
I’ve been wondering lately why, among the container platforms competing to support microservices, Kubernetes is winning, when in many ways the three major platforms end up virtually identical in functionality.
References:
Docker, Kubernetes, DCOS: don’t talk about faith, talk about technology
Ten models of container platform selection: Docker, DC/OS, K8S, who is in the lead?
After some time of reflection, and interviews with Netease Cloud architects who have practiced Kubernetes since its early days, I’ve distilled my thoughts into today’s post.
Three perspectives on container platforms, from the three architectures of enterprise cloud
It all starts with the three major architectures of the enterprise cloud.
As shown in the figure above, the three major architectures of cloud in the enterprise are the IT architecture, the application architecture, and the data architecture. Different companies, and different roles within them, pay attention to different ones.
For most enterprises, the push to the cloud starts from the IT department, usually driven by operations. They focus on compute, network, and storage, and hope cloud services will relieve CapEx and OpEx.
Companies with consumer-facing (ToC) businesses accumulate large amounts of user data, which they use for big data analysis and data-driven operations, and therefore focus on the data architecture.
Enterprises running Internet applications usually look at the application architecture first: can it meet end customers’ needs and give them a good experience? Business volume can explode in a short period, so they care about high-concurrency application architecture, and want an architecture that can be iterated rapidly to seize the market window.
Before the advent of containers, all three architectures were usually served by virtual machine cloud platforms. When containers appeared, their good qualities made eyes light up: lightweight, self-contained packaging, standardized, easy to migrate, easy to deliver. Container technology spread quickly.
But there are a thousand Hamlets in a thousand people’s eyes. Shaped by their original work, the three roles each see the convenience of containers from their own perspective.
For IT operations engineers who used to manage compute, network, and storage in the machine room, containers look like a lightweight operations model. To them, the biggest difference between a container and a virtual machine is that containers are light and start fast, so they tend to adopt containers in the virtual machine mold.
For data architects, who run all kinds of data processing tasks every day, containers are a task execution mode with better isolation and higher resource utilization than the bare JVM.
From the application architecture perspective, containers are the delivery form of microservices. Containers are not just for deployment but for delivery: the D in CI/CD.
People holding these three perspectives therefore use containers differently and choose container platforms differently.
Kubernetes is the bridge between microservices and DevOps
Swarm: IT operations engineers
In the eyes of IT operations engineers, containers mainly mean lightweight, fast startup, automatic restart, automatic association, and elastic scaling: technologies that seem to promise that operations engineers will no longer need to work overtime.
Swarm’s design clearly fits the management model of the traditional IT engineer.
They want to see clearly how containers are distributed across machines and what state they are in, so they can SSH into any container at will to see what is going on.
They prefer a container to be restarted in place rather than have a new one scheduled somewhere at random, so that everything installed inside the container is still there.
Rather than starting from a Dockerfile, they prefer to snapshot a running container into an image, so that later startups reuse the hundred things that were done by hand inside the container.
The less the container platform itself intrudes, the better. The platform exists to simplify operations; if the platform itself is very complex, like Kubernetes with its many processes whose high availability and upkeep must also be managed, it is not worth it: the work is no less than before, and the cost has grown.
Ideally it is a thin layer, like a cloud management platform, that also makes cross-cloud management convenient, since container images migrate easily across clouds.
Used this way, Swarm sounds familiar to IT engineers: it does everything OpenStack does, only faster.
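As a rough illustration of this ops-style workflow, here is a minimal sketch using standard Docker CLI commands (the service name, image, and container ID are placeholders):

```bash
# Initialize a single-node Swarm and run a replicated service
docker swarm init
docker service create --name web --replicas 3 --publish 8080:80 nginx:alpine

# VM-style inspection: find a container and open a shell inside it
docker ps --filter name=web
docker exec -it <container-id> /bin/sh

# Snapshot a hand-tuned container into a reusable image
docker commit <container-id> web-tuned:v1
```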
The problems with Swarm
However, containers pitched as lightweight virtual machines end up in the hands of customers, whether external customers or the company’s own developers, rather than the IT staff themselves. When those users assume containers are the same as virtual machines and then hit the differences, the complaints pour in.
Take self-healing: after an automatic restart, the software that was manually installed over SSH is gone, sometimes even files on the disk cannot be found, and because the application was not set to start through an Entrypoint, no process is running after the “healing”, so someone still has to log in and start it by hand. The customer then complains: what use is this self-healing of yours?
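The remedy is to bake the installation and startup into the image so that a respawned container comes back complete. A minimal sketch (base image and service are invented for illustration):

```bash
# Build installation and startup into the image instead of doing it over SSH
cat > Dockerfile <<'EOF'
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nginx
# The Entrypoint is what an automatic restart brings back up; without it,
# a respawned container starts with no application process running
ENTRYPOINT ["nginx", "-g", "daemon off;"]
EOF
docker build -t web-baked:v1 .
```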
Another example: some users run ps inside the container, see a process they do not recognize, and kill it, and it turns out to be the Entrypoint process, so the whole container exits. The customer complains that your containers are unstable and always crash.
When a container is automatically rescheduled, its IP address is not preserved, so after a restart the original IP is gone. Many users ask whether the IP can be kept: the address they wrote into configuration files has changed after the restart.
A container’s system disk, that is, the operating system’s disk, is usually fixed in size. It can be configured up front, but it is hard to change later, and there is no way to let each user choose a system disk size. Some users complain: we put a lot of things directly on the system disk, and if it cannot be resized, what kind of cloud computing elasticity is that?
As for data disks: mounting one when the container starts is one thing, but some customers want, as with a cloud host, to attach another disk later, which is hard to do with containers, and earns another scolding from the customer.
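Swarm can mount named volumes, and a mount can even be added to an existing service, though doing so recreates the service’s containers rather than hot-attaching a disk. A hedged sketch (service name, volume names, and image are made up):

```bash
# Attach a named volume when the service is created
docker service create --name db \
  --mount type=volume,source=dbdata,target=/var/lib/data myimage:latest

# "Attach a disk later": adds the mount, but Swarm recreates the containers
docker service update --mount-add type=volume,source=extra,target=/extra db
```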
If users do not even know they are using containers, and come to them the way they come to virtual machines, they will find them hard to use at every turn, and conclude the platform is simply bad.
Swarm itself is relatively easy to use, but once problems like these arise, the people operating the container platform find them hard to resolve.
Swarm has many built-in functions coupled together; once something goes wrong it is not easy to debug, and if the current functionality does not meet your needs it is hard to customize. Many functions are coupled into the Manager, so operating or restarting the Manager has an outsized impact.
Mesos: data operations engineers
From the perspective of big data platform operations, how to schedule big data processing tasks faster, and run more tasks within limited time and space, is what matters most.
So when we judge the merits of a big data platform, we usually measure the number of tasks run per unit of time and the amount of data processed.
From the data operations perspective, Mesos is an excellent scheduler. Since it can run tasks, it can run containers too, and Spark’s integration with Mesos allows tasks to be executed in a finer-grained way.
Without fine-grained scheduling, task execution looked like this: a Master node manages the whole job, and Worker nodes each execute a sub-task. When the overall job starts, resources for the Master and all the Workers are allocated up front and the environments configured, so that sub-tasks can run there. When no sub-task is running, those environments just sit reserved. Clearly, the Workers are not all busy all the time, so a lot of resources are wasted.
In fine-grained mode, when the job starts only the Master is allocated resources; the Workers get none. When a sub-task needs to run, the Master applies to Mesos for resources on the spot. The environment is not prepared in advance, but fortunately there is Docker: start a container and the whole environment is there, and the sub-task runs inside it. When there are no tasks, every node’s resources can be used by other jobs, which greatly improves resource utilization.
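In Spark’s Mesos support this was an explicit switch: coarse-grained mode holds executors and their resources for the whole job, while fine-grained mode acquires resources per task (the mode was later deprecated in Spark 2.x). A sketch, with the master URL and application jar as placeholders:

```bash
# Coarse-grained (the default): executors keep their resources for the whole job
spark-submit --master mesos://mesos-master:5050 \
  --conf spark.mesos.coarse=true my-app.jar

# Fine-grained (deprecated): resources are requested per task and released
# as soon as the task finishes
spark-submit --master mesos://mesos-master:5050 \
  --conf spark.mesos.coarse=false my-app.jar
```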
This is Mesos’s biggest advantage. In the Mesos paper, the headline result is the improvement in resource utilization, and Mesos’s two-level scheduling algorithm is its core.
(Reference: You who claim to understand Mesos two-level scheduling, answer these five questions first!)
Engineers from a big data operations background therefore readily choose Mesos as a container management platform. Mesos originally ran short tasks; Marathon made it possible to run long-lived ones. (Spark, for its part, later deprecated the fine-grained mode, because it was still not efficient enough.)
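Running a long-lived service on Mesos means submitting an application definition to Marathon. A minimal sketch using the classic Marathon JSON format and the DC/OS CLI (app id, image, and resource sizes are made up):

```bash
# Submit a long-running Docker service to Marathon
cat > web-app.json <<'EOF'
{
  "id": "/web",
  "instances": 2,
  "cpus": 0.5,
  "mem": 256,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "nginx:alpine" }
  }
}
EOF
dcos marathon app add web-app.json
```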
The problems with Mesos
Scheduling is the core of the core in big data; it matters in a container platform too, but it is not the whole story. Containers also need orchestration: all the peripheral components that let containers run as long-lived services and reach one another. Marathon is only the first step of a long march.
So most early adopters of Marathon + Mesos ran the two bare, and because the periphery was incomplete, they each had to build their own wrappers. If you are interested, look around the community at the vendors running bare Marathon and Mesos: each has its own load balancing scheme and its own service discovery scheme.
Hence DC/OS was born: a large set of peripheral components layered around Marathon and Mesos to fill out the functions of a container platform. Unfortunately, by then many vendors had customized their own stacks, and most still run bare Marathon and Mesos.
Mesos is a great scheduler, but it handles only part of the scheduling problem. Go deeper and you have to write your own framework and scheduler, and sometimes your own executor as well; they are complicated to develop and costly to learn.
The later DC/OS is far more complete, but it does not feel like Kubernetes, which speaks one language throughout; it is a hodgepodge. In the DC/OS ecosystem, Marathon is written in Scala, Mesos in C++, Admin Router in Nginx + Lua, mesos-dns in Go, marathon-lb in Python, and Minuteman in Erlang, so chasing bugs across it is painful.
Kubernetes
Kubernetes is different. At first it feels strange: before a single container has been created you must absorb a pile of concepts and read a pile of documents; the orchestration files are complex and the components are many, and plenty of people recoil. I just want to create a container; why so many prerequisites? Put Kubernetes’s concepts directly on an interface and ask customers to create containers with them, and the customers will scold you.
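To make the “concepts first” complaint concrete: even the smallest deployment passes through a Deployment, a ReplicaSet, and a Pod before any container runs. A minimal sketch (all names are placeholders):

```bash
# Even "just run a container" goes through Deployment -> ReplicaSet -> Pod
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
      - name: web
        image: nginx:alpine
EOF
kubectl get deployment,replicaset,pod
```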
From a developer’s point of view, using Kubernetes is definitely not like using a virtual machine. Besides writing code, building, and testing, you also have to know that your application runs in containers, not on a machine someone handed you. You must distinguish stateless containers from stateful ones, write Dockerfiles, care about how the environment is delivered, and learn many things you never needed to know before. Honestly, it is not convenient.
From the operations point of view, using Kubernetes is absolutely unlike operating virtual machines. It used to be: I deliver the environment and guarantee network connectivity; how applications call one another is not my concern. In operations’ eyes, Kubernetes pokes into too many things it should not care about, such as service discovery, configuration centers, and circuit breaking with degradation. Those belong at the code level, to Spring Cloud and Dubbo; why should the container platform layer care about them?
Kubernetes + Docker is, however, exactly the bridge between Dev and Ops.
Docker is the delivery form of microservices. Once an application is split into microservices, there are too many services for operations to manage alone without mistakes, so development has to start caring about environment delivery. For example, only the developers know what configuration was changed, which directories were created, and how permissions were set; conveying all that to operations through documents, promptly and accurately, is hard, and even when it works, the maintenance burden on operations is enormous.
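A Dockerfile turns that knowledge into an executable, versioned artifact that development hands to operations. A hedged sketch (base image, paths, user, and config file are invented):

```bash
cat > Dockerfile <<'EOF'
FROM openjdk:17-slim
# Directories, permissions, and config changes that used to live in a
# wiki page become recorded, executable steps
RUN mkdir -p /app/logs && useradd -r appuser && chown -R appuser /app
COPY app.jar /app/app.jar
COPY application.yaml /app/config/application.yaml
USER appuser
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
EOF
```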
So with containers the biggest change is that the environment is delivered early: each developer spends roughly 5% more time, in exchange for saving operations several times that effort (the original estimate: 200%), and stability improves.
On the other side, operations used to just deliver resources: here is your virtual machine, and how the applications inside call one another is not my business. With Kubernetes, operations now takes on service discovery, configuration centers, and circuit breaking with degradation.
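In Kubernetes, the first two of those become platform objects rather than code-level libraries: a Service gives an application a stable DNS name and virtual IP, and a ConfigMap externalizes configuration. A minimal sketch (names are placeholders):

```bash
kubectl apply -f - <<'EOF'
# Service discovery as a platform object: a stable name ("web") and
# virtual IP in front of whichever pods carry the matching label
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
  - port: 80
---
# A configuration center as a platform object
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-config
data:
  LOG_LEVEL: info
EOF
```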
So the two sides are fused together. From the perspective of microservice development, Kubernetes, complex as it is, is reasonably designed and fits the ideas of microservices.
Related reading:
Why Kubernetes is a natural fit for microservices (1)
Why Kubernetes is a natural fit for microservices (2)
Why Kubernetes is a natural fit for microservices (3)