Author: Yao Yuan

Profile: 58.com – Infrastructure Department – Senior Development Engineer

The 58 private cloud platform is a service-instance management platform built by 58.com on container technology for internal services.

It supports on-demand expansion of business instances and scaling within seconds. The platform provides a friendly user-interaction flow and a standardized testing and release process, aiming to free developers and testers from configuring and managing the underlying environment so they can focus on their own business.

This article shares container technology practices from the implementation of the private cloud platform, in three parts:

1. Background: what problems existed and why container technology was adopted.

2. Overall architecture: the architecture of the container platform.

3. Core module design: selection decisions and solutions for key modules.

Why use container technology

Before containerization, we had these problems:

01 Resource Utilization Problems

Different service scenarios have different resource requirements: some services are CPU-intensive, some memory-intensive, some network-intensive. Deploying them without regard to these profiles leads to poor resource utilization. For example, if the services deployed on a machine are all network-intensive, the machine's CPU and memory are wasted.

Some business teams also focus only on the service itself and ignore machine resource utilization.

02 Cross-Impact of Mixed Deployment

For online services, multiple services deployed together on a single machine may interfere with each other.

For example, if one service experiences a sudden surge of network traffic, it can saturate the machine's bandwidth, and every other service on the host is affected.

03 Low Scaling Efficiency

When a service needs to scale out or in, deploying and testing the application from scratch takes a long time. By the time manual deployment is finished, a sudden traffic peak may already have passed.

04 Inconsistent Code Across Environments

Because the internal development process was not standardized, code that had passed testing in the test environment might still be modified and adjusted in the sandbox before being packaged and released.

As a result, the tested code and the code running online could differ, which increased the risk of releases and made online troubleshooting harder.

05 Lack of a Stable Offline Test Environment

During testing, the downstream services a service depended on often provided no stable test environment.

This made it impossible to simulate the full online call chain in the test environment, so many testers tested against online services directly, which carried high risk.

To solve these problems, the Architecture Line cloud team evaluated candidate technologies and, after repeated deliberation, decided to adopt Docker container technology.

Overall Architecture of the 58 Private Cloud

The overall architecture of private cloud is as follows:

01 Infrastructure

The entire private cloud platform takes over all the infrastructure, including servers, storage, and network resources.

02 Container Layer

On top of the infrastructure sits the container layer, which includes Kubernetes, the Agent, and IPAM. Kubernetes is the scheduling and management component for Docker containers.

The Agent is deployed on each host to manage system resources and the underlying infrastructure, handling monitoring collection, log collection, and container rate limiting. IPAM is the Docker network management module, responsible for the IP resources of the entire network.

03 Resource Management

On top of the container layer is the resource management layer, which contains modules such as container management, capacity scaling, rollback and degradation, online publishing, quota management, and resource pool management.

04 Application Layer

Runs user-submitted business instances, which can be written in any programming language.

05 Basic Components

The private cloud platform provides basic components for the container runtime environment, including service discovery, the image center, the log center, and the monitoring center.

06 Service Discovery

Provides a unified service-discovery mechanism for services on the cloud platform, making it easy for other services to reach them.

07 Image Center

Stores service images, with distributed storage and elastic expansion.

08 Log Center

Centrally collects service-instance logs and provides a unified visual entry point for analysis and querying.

09 Monitoring Center

Collects monitoring information from all hosts and containers, provides monitoring views and customizable alarms, and supplies data for intelligent scheduling.

10 Unified Portal

A visualized UI portal standardizes the entire business process, simplifies user workflows, and dynamically manages all resources in the cloud environment.

The new architecture brings a new business workflow.

The platform defines four basic environments:

Test environment: where testers perform functional testing; connected to the offline environment.

Sandbox environment: the pre-release environment for programs; connected to the online environment.

Online environment: the environment in which services are actually provided.

Stable environment: instances running in the offline environment that provide stable test instances for upstream services.

A service builds an image from the code committed to SVN, and that image flows through the four environments for its entire lifecycle. Because every instance is created from the same image, the application that passed testing is guaranteed to be identical to the one running online.
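The "one image, four environments" rule above can be sketched as a tiny promotion state machine. This is illustrative only: the environment names follow the article, but the `ImagePromotion` class and its API are assumptions, not the platform's real interface.

```python
# Hypothetical sketch: an image is built once and only promoted, never
# rebuilt, so the digest stays identical in every environment.
ENV_ORDER = ["test", "sandbox", "online"]

class ImagePromotion:
    def __init__(self, image_digest):
        self.image_digest = image_digest  # same digest in every environment
        self.env_index = -1               # not deployed anywhere yet

    def promote(self):
        """Move the image to the next environment in order."""
        if self.env_index + 1 >= len(ENV_ORDER):
            raise ValueError("image is already online")
        self.env_index += 1
        return ENV_ORDER[self.env_index]

p = ImagePromotion("sha256:abc123")
assert p.promote() == "test"
assert p.promote() == "sandbox"
assert p.promote() == "online"
```

The key property is that `image_digest` never changes during promotion, which is exactly why the tested artifact and the online artifact cannot diverge.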

Core Module Design

There were many details to consider when building the 58 private cloud platform. Five core modules are covered here:

Container management

Network model

Image registry

Log collection

Monitoring and alarms

With these core modules, the platform has a basic framework to work with.

01 Container Management

We investigated three Docker-based management platforms: Swarm, Mesos, and Kubernetes. After comparison, we chose Kubernetes.

Swarm was too rudimentary and was ruled out early. Mesos + Marathon was a mature solution, but its community was not active enough, and using it required familiarity with both frameworks.

Kubernetes is a scheduling and management platform built specifically for container technology. It is more specialized, has a very active community, supports many components and solutions, and is being adopted by more and more companies. In conversations with several of them, we learned they were gradually migrating Docker applications from Mesos to Kubernetes.

The following table shows some comparisons of our team’s concerns:

02 Network Model

The network model is an issue every cloud environment must face, because problems emerge as the network scales up. For network selection, we compared six networking modes according to the characteristics of Docker and Kubernetes, as shown below:

The cloud team performance-tested each network model except Calico, which was skipped because the company's data center did not support BGP.

The iPerf network-bandwidth test results are as follows:

The qperf TCP latency test results are as follows:

The qperf UDP latency test results are as follows:

The test results show that Host mode and Bridge mode come closest to native host performance. The other networking modes show a noticeable gap, which follows from how overlay networks work.

The private cloud platform adopts the Bridge + VLAN networking mode, for the following reasons:

Good performance and simple networking; it connects seamlessly with the existing network, and container-to-container and container-to-host communication both work well.

Faults are easy to debug and can be handled by traditional SAs; it works with any physical device and scales out well.

The company's internal services are RPC-based with their own service-discovery mechanism, which is fully compatible, so existing internal frameworks need few changes.

Since there can be at most 4096 VLANs, the number of VLANs imposes a scale limit on this approach.

In the cloud platform's current network plan, VLANs are sufficient. In the future, as usage grows and technology evolves, we will also study more suitable networking modes in depth.

Some readers have also reported that Calico's IPIP mode delivers very high network performance. However, Calico still had many pitfalls at the time and required a dedicated network team to support it, which the cloud team lacked, so we did not investigate it further.

There is one more problem here. By default, Bridge mode assigns each host a different subnet. This ensures that IPs assigned to containers on different hosts do not conflict, but it wastes a lot of IP addresses.

IP resources on the data-center intranet are limited, so the network could not be configured this way. We therefore developed an IPAM module for global IP management.

The IPAM module's implementation references the open-source project Shrike: allocatable network segments are written into etcd. When a Docker instance starts, it obtains an available IP from etcd through the IPAM module, and the IP is returned when the instance stops. The overall architecture is shown below:
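The allocate-on-start / release-on-stop logic can be sketched as follows. This is a minimal illustration, with a plain dict standing in for etcd; the real module talks to etcd and must handle concurrent allocation across hosts (for example with compare-and-swap transactions), which is omitted here. The `Ipam` class and its method names are assumptions, not Shrike's actual API.

```python
import ipaddress

class Ipam:
    """Toy global IP pool; a dict stands in for the etcd key space."""

    def __init__(self, cidr):
        # Mark every usable host address in the segment as free.
        self.store = {str(ip): "free"
                      for ip in ipaddress.ip_network(cidr).hosts()}

    def allocate(self):
        """Pick any free IP and mark it used (container start)."""
        for ip, state in self.store.items():
            if state == "free":
                self.store[ip] = "used"
                return ip
        raise RuntimeError("IP pool exhausted")

    def release(self, ip):
        """Return an IP to the pool (container stop)."""
        self.store[ip] = "free"

pool = Ipam("10.0.0.0/29")   # 6 usable host addresses
a = pool.allocate()
pool.allocate()
pool.release(a)
assert pool.allocate() == a  # released IPs are reused
```

Because the pool is global rather than per-host, no address space is wasted carving a separate subnet for every machine, which is the point of replacing Bridge mode's default per-host segments.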

In addition, since Kubernetes does not support CNM, we modified the Kubernetes source code. There is another aspect of the network to consider: rate limiting.

03 Image Registry

For the Docker image registry we use the official registry, but we chose our own backend storage, since the default local-disk mode is unsuitable for an online system. The selection comparison is as follows:

By comparison, Ceph was the most suitable, but in the end the cloud team chose HDFS as the registry's backend storage, for the following reasons:

Swift is the officially supported default storage type. However, building a Swift cluster and keeping it running stably requires dedicated staff for in-depth study; with limited personnel, it was not adopted.

The company's HDFS system is managed and maintained by a dedicated data-platform department. They are more professional, so the cloud team could safely host Docker images on HDFS.

However, HDFS also has problems; for example, the NameNode cannot respond promptly under pressure. In the future we plan to migrate the backend storage to the object storage developed by the architecture-line department, to provide a stable and efficient service.

04 Log System

Logging is a big problem when traditional services migrate to a container environment: containers come and go, and a container's storage is deleted when the container exits.

Logs can be exported to a specified location on the host, but containers often drift between hosts. For troubleshooting, we would need to know which host a container was running on at a given moment in the past; and since users have no login permissions on the hosts, they could not easily access those logs anyway.

In the container environment, a new troubleshooting method is required. A common solution here is to adopt a centralized log solution that collects and stores scattered logs in a unified manner and provides flexible query methods.

The private cloud platform uses the following solutions:

Users configure which logs to collect on the management portal. The private cloud platform passes this configuration into the container as environment variables. The Agent deployed on each host reads these environment variables to determine which logs to collect and starts Flume to collect them.

Flume uploads logs to Kafka in a unified manner, and logs uploaded to Kafka preserve strict ordering.

Kafka has two subscribers: one uploads logs to a search service for queries from the management portal; the other uploads logs to HDFS for querying and downloading historical logs, where users can also run Hadoop programs to analyze specific logs.
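The first step of this pipeline, the Agent discovering what to collect from a container's environment, might look like the sketch below. The variable name `LOG_COLLECT_PATHS` and the `"KEY=VALUE"` input shape (as returned by `docker inspect`) are assumptions for illustration; the article does not specify the platform's real contract.

```python
def log_paths_from_env(container_env):
    """Extract log paths to collect from a container's env var list.

    container_env: list of "KEY=VALUE" strings, as a per-host agent
    would see them in a container's inspect output.
    """
    for entry in container_env:
        key, _, value = entry.partition("=")
        if key == "LOG_COLLECT_PATHS":
            # Comma-separated file paths inside the container.
            return value.split(",")
    return []  # nothing configured for this container

env = [
    "PATH=/usr/bin",
    "LOG_COLLECT_PATHS=/opt/app/logs/app.log,/opt/app/logs/gc.log",
]
assert log_paths_from_env(env) == [
    "/opt/app/logs/app.log",
    "/opt/app/logs/gc.log",
]
```

Carrying the collection config in the container's own environment means the Agent needs no separate lookup service: everything required to start a Flume source travels with the container wherever it is scheduled.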

05 Monitoring Alarms

Resource monitoring and alarming are also essential parts of a cloud platform. There is plenty of mature open-source software for container monitoring, and 58 has its own internal monitoring components; the cloud team evaluated the options to get the best fit.

In the end, the cloud team chose WMonitor as the container monitoring component. WMonitor already integrates host monitoring and alarm logic, so we did not need to develop those; we only had to develop the container-monitoring part, and it can be customized well to internal monitoring requirements.

Heapster + InfluxDB + Grafana is the monitoring stack officially provided by Kubernetes. It works fine at small scale, but may become a problem at large scale, because Heapster polls every node for monitoring information.

Afterword

The above describes the 58.com architecture line's exploration of container technology. Many of these technology choices are not about absolute advantages or disadvantages; we simply chose what suited 58's application scenarios. There are many more technical points in the cloud platform; here we have shared a few key ones.
