Technical practice of MSE in Alibaba

Preface

MSE is the cloud version of Alibaba Group's configuration center and service registry. Within the Group it is widely used for distributed consistency coordination, service registration, distributed configuration, and similar scenarios. Take Tmall Double 11 in 2019 as an example: thanks to MSE's continuous investment in stability and performance, the peak number of long-lived connections in a single cluster exceeded 300,000. For business teams, operation and maintenance costs were significantly reduced and the management experience improved greatly. The stability and performance of MSE have been highly recognized by benchmark customers, and the product delivered an excellent result.

Double 11 record

1. MSE 1.0 hosts most of Alibaba Group's online ZooKeeper clusters, serving a variety of the Group's core infrastructure components.
2. On the day of Double 11, all clusters ran stably, with capacity levels and all metrics normal, zero failures, and zero problem reports.
3. During the peak period, the Blink cluster, the largest-capacity cluster hosted by MSE, performed as expected and passed the peak smoothly.

The detailed trend of each metric is as follows:

Number of connections

Average request latency

2.0 architecture upgrade: cloud native integration strengthens product competitiveness

The MSE 2.0 architecture upgrade integrates MSE seamlessly with the cloud native technology stack. Seen from another angle, it is itself a cloud native capability: relying on the technical dividends of cloud native to bring technological innovation, higher efficiency, and better performance to our own product.

1. MSE 1.0 used the Group's internal deployment model: Sigma + the Group's internal network + Group components.

Under the 1.0 architecture, platform capability and operation and maintenance efficiency were low and constrained by the platform, so the product's own capabilities could not improve qualitatively:

Cluster delivery in MSE 1.0 meant requesting machine resources in the data center where the user needed the deployment, deploying the cluster on those machines by hand, and then maintaining the mapping between node IPs and the configuration center. The whole process took at least 3 hours and often required coordinating with multiple parties because of resource delivery problems.
In many scenarios, users within the Group requested ZooKeeper resources of their own, that is, a dedicated cluster. Giving every user a dedicated cluster would make maintenance very expensive, so users had to describe their scenario by email so that we could decide whether a shared public cluster could be delivered instead. This process made fast, secure onboarding difficult.
The Group runs many independent clusters. Machine failures occurred every month, and after a physical server was migrated, nodes had to be restored manually.
Expanding or shrinking a cluster required manually editing the configuration of every node and restarting each node for the change to take effect. Because of the rules of the ZAB protocol, multiple leaders must never appear; otherwise serious consequences such as data loss can follow. The restart order and configuration content therefore had to be strictly controlled, which made O&M highly complex.
To reduce dependencies, users connected directly to the IP addresses of cluster nodes. When a node was replaced or the cluster was expanded, users had to maintain the mapping themselves, and mistakes could make services unavailable.
Server configuration was synchronized by hand, with no mechanism to verify each manual operation. As a result, inconsistent configurations could be loaded when a ZooKeeper cluster restarted, causing data loss and leader election failures.
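The last few problems come down to two checks that were done by hand: every node must declare the same ensemble membership, and restarts must follow a controlled order. A minimal Python sketch of both checks is shown here. It assumes zoo.cfg-style `server.N=host:port:port` lines; the node names, the helper names, and the "followers first, leader last" ordering are illustrative assumptions, not a description of MSE's internal tooling.

```python
from typing import Dict, List


def server_lines(cfg: str) -> List[str]:
    """Return the sorted 'server.N=...' lines of a zoo.cfg-style text."""
    lines = [ln.strip() for ln in cfg.splitlines()]
    return sorted(ln for ln in lines if ln.startswith("server."))


def configs_consistent(configs: Dict[str, str]) -> bool:
    """True if every node declares exactly the same ensemble membership."""
    memberships = [server_lines(cfg) for cfg in configs.values()]
    return all(m == memberships[0] for m in memberships)


def restart_order(roles: Dict[str, str]) -> List[str]:
    """Followers first (sorted for determinism), the current leader last.

    Restarting the leader last is one common convention, assumed here.
    """
    followers = sorted(n for n, r in roles.items() if r != "leader")
    leaders = sorted(n for n, r in roles.items() if r == "leader")
    return followers + leaders


# Hypothetical three-node ensemble; addresses are placeholders.
MEMBERS = (
    "server.1=10.0.0.1:2888:3888\n"
    "server.2=10.0.0.2:2888:3888\n"
    "server.3=10.0.0.3:2888:3888\n"
)
GOOD = {name: "tickTime=2000\n" + MEMBERS for name in ("zk-1", "zk-2", "zk-3")}
# zk-3 was edited by hand and lost one member line.
BAD = dict(GOOD, **{"zk-3": "tickTime=2000\nserver.1=10.0.0.1:2888:3888\n"})
```

Running `configs_consistent` before any restart turns the "inconsistent configuration loaded on restart" failure into a pre-flight error instead of a production incident.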

In May 2019, we began upgrading the MSE 1.0 architecture at the platform level to solve these problems, aiming both to support the Group internally and to offer the product's capabilities externally (survey data on existing engine deployments on the public cloud showed a very large market). During technology selection, we gave full consideration to a cloud native architecture based on K8s. In the cloud era, all kinds of cloud capabilities will actively integrate with the standard cloud native architecture. MSE needs to use these standard cloud native capabilities to improve its overall platform capability and lay the technical foundation for the cloud native work that follows.

2. MSE 2.0 is based on ACK combined with multiple cloud capabilities (VPC, DNS, SLB, efficient cloud disk, ARMS):

In the MSE 2.0 architecture, the underlying container resources are managed uniformly through ACK, which is fully compatible with the open source K8s standard, so MSE also gains support for a variety of advanced K8s features:

Cluster delivery is about 100 times more efficient. With K8s Pod startup plus allocation of cloud disks, SLB, and other resources, a cluster can be delivered in 3 minutes. The 1.0 process of resource application, configuration synchronization, and so on took at least half a day, so efficiency has improved about a hundredfold.
Cluster nodes are deployed across multiple availability zones (AZs), giving disaster recovery (DR) across data centers. Each MSE region supports at least two availability zones, and K8s affinity scheduling is configured at node allocation time so that nodes are spread across availability zones.
SLB is a mature, market-tested product. Relying on its layer-4 load balancing, MSE can distribute clients evenly across nodes, avoiding the load imbalance that arises when users connect directly by IP.
Automatic migration and rebuilding after machine failure reduce O&M costs and improve availability. Previously, we had to request a replacement machine ourselves and, because of state synchronization, manually modify the configuration and restart every node, which was very complicated. Frequent manual operations also put online stability at risk.
For monitoring, we adopted Prometheus, the standard cloud native monitoring system. Because it is compatible with open source, the collection component for MSE business metrics fully reuses the metrics reporting of the open source ZooKeeper component, which doubled R&D efficiency.
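The effect of spreading nodes across availability zones can be illustrated with a small Python sketch. In MSE 2.0 the placement itself is done by the K8s scheduler; the round-robin function below is only a stand-in that shows the outcome, and the node and zone names are made up for the example.

```python
from collections import Counter
from typing import Dict, List


def spread_across_zones(nodes: List[str], zones: List[str]) -> Dict[str, str]:
    """Assign nodes to zones round-robin so zone sizes differ by at most one."""
    if not zones:
        raise ValueError("at least one zone is required")
    return {node: zones[i % len(zones)] for i, node in enumerate(nodes)}


def zone_counts(placement: Dict[str, str]) -> Dict[str, int]:
    """Count how many nodes landed in each zone."""
    return dict(Counter(placement.values()))


# Hypothetical three-node cluster in a region with two zones.
PLACEMENT = spread_across_zones(["zk-0", "zk-1", "zk-2"], ["zone-a", "zone-b"])
```

With three nodes and two zones, `zone_counts(PLACEMENT)` gives two nodes in one zone and one in the other, so no single zone holds the whole cluster.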

Prometheus monitoring provides a powerful interactive back-end dashboard, while the MSE front end uses the data query interface to redraw trend charts of monitoring metrics as needed.
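As a rough illustration of reusing ZooKeeper's own reporting, the Python sketch below converts output in the style of ZooKeeper's `mntr` four-letter command (tab-separated key and value per line) into the Prometheus text exposition format. The source does not say which reporting path MSE's collector uses, so `mntr` is an assumption here, and the sample values and the `node` label are invented for the example.

```python
from typing import Dict

# Invented sample in the shape of `mntr` output: "key<TAB>value" per line.
SAMPLE_MNTR = (
    "zk_version\t3.5.5\n"
    "zk_avg_latency\t2\n"
    "zk_num_alive_connections\t1200\n"
    "zk_server_state\tfollower\n"
)


def parse_mntr(text: str) -> Dict[str, float]:
    """Keep only the numeric metrics; skip version strings and state names."""
    metrics: Dict[str, float] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # not a key/value line
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # non-numeric value such as "follower"
    return metrics


def to_prometheus(metrics: Dict[str, float], node: str) -> str:
    """Render metrics as 'name{node="..."} value' lines, sorted by name."""
    lines = []
    for name, value in sorted(metrics.items()):
        shown = int(value) if value.is_integer() else value
        lines.append(f'{name}{{node="{node}"}} {shown}')
    return "\n".join(lines) + "\n"
```

A collector built this way needs no changes to ZooKeeper itself, which is the kind of reuse the paragraph above describes.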

Productization of technical capabilities to meet the multi-level needs of customers

Compared with MSE 1.0, MSE 2.0 turns many technical capabilities into product features, improving product competitiveness while giving users the technical capabilities to meet their multi-level needs.

1. Cluster delivery mode: domain name delivery. Clients connect to the MSE cluster by domain name. When the cluster instance changes, clients do not need to change the address; the name automatically resolves to the new address, reducing the user's switchover cost.

2. Visualization of cluster node health status. The status page shows the health status and role of each node.

3. Optimized MSE runtime parameters support higher connection performance and lower operation and maintenance costs. For a workload previously run on a self-maintained 64-core physical machine, capacity and performance were evaluated after migration to MSE: CPU utilization was stable at about 15% and GC frequency fell by 80%. At the same business scale, only 1/5 of the machine resources are needed to meet business requirements, saving machine resources for customers.

4. Data node editing function

MSE provides a cluster data management view suited to business scenarios, so that data can be viewed and edited through a graphical console.

5. Version upgrade: one-click rolling upgrade. To upgrade the MSE version, you only need to update the image and click upgrade in the console; all nodes in the MSE cluster are then upgraded in a rolling fashion.
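The idea of a rolling upgrade can be sketched in Python as a loop that upgrades one node at a time and stops at the first node that fails its health check. The `upgrade` and `is_healthy` callbacks are hypothetical hooks supplied by the caller; this is not the MSE console's actual implementation.

```python
from typing import Callable, List


def rolling_upgrade(
    nodes: List[str],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade nodes one by one; halt on the first unhealthy node.

    Returns the nodes that were upgraded and passed their health check.
    """
    done: List[str] = []
    for node in nodes:
        upgrade(node)
        if not is_healthy(node):
            # Stop here so the remaining nodes keep serving on the old version.
            raise RuntimeError(f"{node} unhealthy after upgrade; halting rollout")
        done.append(node)
    return done


# Usage with an in-memory stand-in for the cluster (names are illustrative).
versions = {"zk-0": "1.0", "zk-1": "1.0", "zk-2": "1.0"}
upgraded = rolling_upgrade(
    list(versions),
    lambda n: versions.__setitem__(n, "2.0"),
    lambda n: versions[n] == "2.0",
)
```

Because only one node is touched at a time, the rest of the cluster keeps serving while the upgrade proceeds.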

6. Trend view of cluster monitoring metrics. You can view the real-time value or the historical trend of the cluster's monitoring metrics, for example the change in the cluster's connection count and its 7-day history.

7. User-defined alarms. Users can set alarm thresholds for different monitoring metrics, with notification by SMS, DingTalk group, and email.
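A threshold alarm of this kind can be sketched in a few lines of Python. The rule structure, metric names, and channel names below are illustrative assumptions; the real MSE alarm configuration is set in the console and is not shown in the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AlarmRule:
    metric: str                 # name of the monitored metric
    threshold: float            # fire when the sampled value exceeds this
    channels: Tuple[str, ...]   # notification channels, e.g. ("sms", "email")


def evaluate(rules: List[AlarmRule], sample: Dict[str, float]) -> List[str]:
    """Return one notification message per channel for every rule that fires."""
    fired: List[str] = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and value > rule.threshold:
            for channel in rule.channels:
                fired.append(
                    f"{channel}: {rule.metric}={value} exceeds {rule.threshold}"
                )
    return fired


# Hypothetical rules and a hypothetical metrics sample.
RULES = [
    AlarmRule("connections", 10000, ("sms", "email")),
    AlarmRule("avg_latency_ms", 50, ("email",)),
]
MESSAGES = evaluate(RULES, {"connections": 12000, "avg_latency_ms": 3})
```

Here only the connection rule fires, so `MESSAGES` holds one SMS entry and one email entry.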

New journey and new challenges

Deep product and technology integration: based on the Nacos kernel, unify the Group's multiple internal configuration centers and service registries, commercialize Nacos on the MSE platform, and move all of the Group's internal clusters to the cloud, saving 50% of machine costs overall and tripling staff efficiency.
Commercialization means stricter requirements for product availability and performance and a higher SLA: 99.99% availability for reads, 99.9% availability for writes, and average request latency below 50 ms 99.99% of the time.
Deep integration with the cloud native technology stack: support elastic scaling and serverless services, cut resource costs by at least 50%, and allow product configuration changes without restarts, so that the MSE service is always online.
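To make the availability targets above concrete, the Python sketch below converts an availability percentage into a downtime budget. The 30-day measurement window is an assumption made for the arithmetic; the source does not state the window over which the SLA is measured.

```python
def downtime_budget_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of allowed unavailability over the window (30 days assumed)."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    window_minutes = days * 24 * 60
    return (100 - availability_percent) / 100 * window_minutes


READ_BUDGET = downtime_budget_minutes(99.99)   # about 4.32 minutes
WRITE_BUDGET = downtime_budget_minutes(99.9)   # about 43.2 minutes
```

Under that assumption, the read target leaves roughly four minutes of unavailability per month and the write target roughly forty-three, which is why manual recovery steps like those in the 1.0 architecture would not fit within the budget.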

Read more: https://yqh.aliyun.com/detail/6482?utm_content=g_1000105580


