Author: Vivo Official Website Mall Development Team

Using a high-availability architecture to support important systems and provide 7×24 uninterrupted service for key businesses has become the main choice for many enterprises that need their business to run stably and continuously. Multi-active is an important way to implement a high-availability architecture. This article introduces common multi-active approaches in the industry, such as same-city active-active, two-site three-center, and remote multi-active architecture designs, and details the advantages and disadvantages of each scheme.

I. Why build a multi-active architecture

With the deep development of the mobile Internet, once user growth reaches a certain scale, many enterprises face the challenges of high-concurrency business and massive data, and a traditional single data center runs into capacity bottlenecks. In extreme situations all servers may fail at once, for example because of a data center power outage, a data center fire, or an earthquake. Such force majeure events can take every server down and paralyze the entire business; even with backups elsewhere, bringing the backup systems fully back into service takes a long time. As a reliable, highly available deployment architecture, multi-active has become the primary choice of Internet companies for ensuring business continuity and improving resilience against risk.

1. Multi-active scenarios

The key point of a multi-active architecture is that systems in different geographic locations can all provide business services; "active" means actively serving traffic. A standby, by contrast, is a backup: it normally does not serve external traffic, and making it active requires manual intervention and a large amount of time. Since multi-active is powerful enough to keep the business running through a disaster, does that mean we should adopt a multi-active architecture regardless of the business? In fact, no. There is a price to pay for a multi-active architecture, as follows:

  • Complexity varies with the multi-active scheme chosen. As the service scale and the required disaster recovery level increase, the multi-active scheme adds more complexity to the design of the business system.

  • No matter which multi-active solution is adopted, it is difficult to completely avoid the extra latency caused by cross-data-center or even cross-region service calls.

  • Multi-active brings higher cost; after all, an additional, independent set of the service system has to be built in one or more data centers.

So, while multi-active is powerful, not every business needs it. For example, an enterprise's internal IT systems, management systems, and blog site cannot justify the complexity and cost of remote multi-active; it is usually only worthwhile for important services such as core finance, payments, and transactions.

2. Multi-active schemes

Common multi-active schemes include same-city active-active, two-site three-center, three-site five-center, and remote multi-active. Different schemes have different technical requirements, construction costs, and operation and maintenance costs. We will introduce these schemes one by one and give the advantages and disadvantages of each. Which scheme to choose should be decided based on the specific business scale, the current infrastructure capability, the return on investment, and other factors.

II. Same-city active-active

Same-city active-active means building two data centers in the same city or a nearby region. Because the two data centers are close together and the communication links between them are of good quality, synchronous data replication is easy to implement, ensuring high data integrity and zero data loss. The two data centers each carry part of the traffic; inbound traffic is generally distributed completely at random, while internal RPC calls are kept in a closed loop within the same data center through nearest-routing as far as possible, which is roughly equivalent to deploying two independent, mirrored clusters in the two data centers. The figure below shows a simplified deployment architecture for same-city active-active; the actual deployment and its considerations are far more complex.

Service invocation is basically completed in a closed loop within the same data center, while data is still written to a single point, the data store in the primary data center, and then replicated synchronously and in real time to the backup data center in the same city. When data center A fails, operations staff only need to change the routing manually, for example through GSLB, to direct traffic to data center B. Same-city active-active can guard against data-center-level disasters such as fire, building damage, power supply failure, system failure, and human error.

1. Service routing

  • **ZK cluster:** a ZooKeeper cluster is deployed in each data center, and ZK data is synchronized bidirectionally between data centers in real time, so every data center holds the registration data of all data centers.

  • **Routing scheme:** conditional routing > nearest (same-data-center) routing > cross-data-center routing; cross-data-center calls are avoided as far as possible (a minimal routing sketch follows this list).

  • **Subscription scheme:** consumers subscribe to services from all data centers, while a provider registers only with the ZK cluster of its own data center.
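To make the routing priority concrete, here is a minimal consumer-side sketch under a simplified provider model: providers registered in the consumer's own data center are preferred, and cross-data-center providers are used only as a fallback. The Provider record and the LOCAL_ROOM constant are illustrative assumptions, not the actual registry API.

import java.util.List;
import java.util.stream.Collectors;

// Sketch of same-room-first routing on the consumer side; names are illustrative.
public class NearestRouter {

    record Provider(String address, String room) {}

    // The consumer's own data center, normally injected from the environment.
    static final String LOCAL_ROOM = "IDC-A";

    /** Prefer providers in the local room; only cross rooms when none are available. */
    static List<Provider> select(List<Provider> all) {
        List<Provider> sameRoom = all.stream()
                .filter(p -> LOCAL_ROOM.equals(p.room()))
                .collect(Collectors.toList());
        return sameRoom.isEmpty() ? all : sameRoom;
    }

    public static void main(String[] args) {
        List<Provider> providers = List.of(
                new Provider("10.0.1.1:20880", "IDC-A"),
                new Provider("10.0.2.1:20880", "IDC-B"));
        System.out.println(select(providers)); // only the IDC-A provider is selected
    }
}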

2. Data active-active

  • **MySQL:** MHA is used to ensure data consistency. Reads and writes are separated: reads are routed to the nearest data node in the local data center, while writes are routed to the data center where the master node resides (a read/write routing sketch follows this list).

  • **Redis:** in Redis Cluster mode, reads and writes are routed to the nearest master node. Plain cross-data-center master-slave synchronization has poor write performance; bidirectional multi-node synchronization based on CRDT theory can also be built so that each data center reads and writes locally, but the overall implementation is complicated.
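As a rough illustration of the MySQL read/write split above, the sketch below routes reads to a replica in the local data center and writes to the data center that holds the master. The JDBC URLs and room names are placeholders; a real deployment would sit behind MHA and a proper connection pool.

import java.util.Map;

// Sketch of read/write routing under same-city active-active; the endpoints are placeholders.
public class ReadWriteRouter {

    private static final String LOCAL_ROOM = "IDC-A";
    // Hypothetical endpoints: one writable master (in IDC-A) and one read replica per room.
    private static final String MASTER_URL = "jdbc:mysql://mysql-master.idc-a:3306/shop";
    private static final Map<String, String> READ_URL_BY_ROOM = Map.of(
            "IDC-A", "jdbc:mysql://mysql-replica.idc-a:3306/shop",
            "IDC-B", "jdbc:mysql://mysql-replica.idc-b:3306/shop");

    /** Writes always go to the data center where the master lives. */
    static String routeWrite() {
        return MASTER_URL;
    }

    /** Reads stay in the local data center to avoid the cross-room hop. */
    static String routeRead() {
        return READ_URL_BY_ROOM.get(LOCAL_ROOM);
    }

    public static void main(String[] args) {
        System.out.println("write -> " + routeWrite());
        System.out.println("read  -> " + routeRead());
    }
}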

3. Evaluation of the same-city active-active scheme

Advantages

  • Services are active-active within the same city and data has same-city disaster recovery, giving data-center-level disaster recovery without data loss inside the city.

  • The architecture is relatively simple; the core problem to solve is active-active at the data layer. Because the two data centers are close together and the link quality is good, underlying storage such as MySQL can use synchronous replication to effectively guarantee data consistency between the two data centers.

Disadvantages

  • Database writes involve cross-data-center calls. In complex services with long call chains, frequent cross-data-center calls increase response time and hurt system performance and user experience.

  • Disaster recovery is limited to the same city or region. If the network of that city or region fails, or an unavoidable natural disaster occurs, the service may become unavailable or data may be lost. Core financial services are required to have at least cross-region disaster recovery capability.

  • If the service is large enough (for example, a single application with more than 10,000 machines), having every machine connect to a single master database instance can exhaust the available database connections.

III. Two-site three-center architecture

Two-site three-center means two data centers in the same city plus a remote disaster recovery (DR) center. The remote DR center, built in a distant city, backs up the data of the two same-city centers; its data and services are normally cold. When the city or region hosting the two centers cannot provide service because of an exceptional event, the remote DR center can use the backup data to restore the business.

Evaluation of the two-site three-center scheme

Advantages

  • Services are active-active within the same city and data has same-city disaster recovery, giving data-center-level disaster recovery without data loss inside the city.
  • The architecture is relatively simple; the core problem to solve is active-active at the data layer. Because the two data centers are close together and the link quality is good, underlying storage such as MySQL can use synchronous replication to effectively guarantee data consistency between the two data centers.
  • When both same-city centers fail, the remote disaster recovery center can use the backup data to restore services.

Disadvantages

  • Database writes involve cross-data-center calls. In complex services with long call chains, frequent cross-data-center calls increase response time and hurt system performance and user experience.

  • If the service is large enough (for example, a single application with more than 10,000 machines), having every machine connect to a single master database instance can exhaust the available database connections.

  • When a fault occurs, traffic is generally not switched directly to the remote backup center: because that center is cold and never receives production traffic, verifying that the remote DR data center actually works takes a long time.

Same-city active-active and two-site three-center are not especially complex to build, and compared with same-city active-active, two-site three-center effectively solves the problem of long-distance data disaster recovery. However, it still cannot fix several of the shortcomings of same-city active-active, and addressing the deficiencies of these two architectures requires the more complex solutions introduced next.

IV. Remote multi-active

Remote multi-active means that multiple sites in distant locations provide services concurrently. It is a high-availability architecture design; the main difference from traditional disaster recovery design lies in the word "active": all sites serve external traffic at the same time.

1. Challenges of remote multi-active

(1) Going remote first means facing the latency caused by physical distance. If an application request needs to modify the same row of records in multiple remote units, a great deal of time is spent ensuring the consistency and integrity of the database data across those units.

(2) To cope with the high latency over long distances, data reads and writes should be closed within a unit, and different units must not modify the same row of data; we therefore need to find a dimension along which to split units.

(3) When a unit needs data belonging to another unit, the access must be correctly routed to that unit. For example, when user A transfers money to user B and the data of A and B is not in the same unit, the operation on user B must be routed to the unit that holds B's data.

(4) Data synchronization is challenging: data that is closed within a unit still needs to be synchronized to the corresponding units elsewhere, and for data that uses read/write separation, data written in the center needs to be synchronized out to the units.

2. Unitization

A unit (hereafter referred to as an RZone) is a self-contained set that can complete all business operations: it contains every service the business needs, together with the data assigned to that unit.

The unitized architecture takes the unit as the basic deployment unit of the system and deploys several units across all of the site's data centers. The number of units in each data center is not fixed, and any single unit contains all the applications the system needs. Services are still layered in a unitized architecture; the difference is that every node at each layer belongs to exactly one unit, and when an upper layer calls a lower layer, it only selects nodes within its own unit.

Choosing the dimension along which to split traffic requires analyzing the business itself. In e-commerce and financial businesses, the most important flows are ordering, payment, and trading, so splitting the data by user ID is the best choice: all buyer-related operations are completed in the unit where that buyer resides. Merchant-related operations cannot be unitized and must be deployed in the non-unitized modes described below. Of course, cross-unit or even cross-data-center calls cannot be completely avoided for user operations; for example, in a transfer between buyers A and B whose data lives in different units, the operation on B has to be completed across units. Cross-unit service routing is introduced later; a minimal sketch of the UID-based unit mapping follows.
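As a minimal sketch of splitting by user ID, the example below derives a logical shard from the last two digits of the buyer's UID and maps shard ranges to units. The modulo-100 rule and the two-unit table are assumptions made for illustration, not the actual production sharding rule.

import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of UID-based unit assignment; the shard rule and unit names are illustrative.
public class UidUnitMapping {

    // Hypothetical table: shards 00-49 live in RZone01, shards 50-99 live in RZone02.
    private static final NavigableMap<Integer, String> UNIT_BY_SHARD_START = new TreeMap<>();
    static {
        UNIT_BY_SHARD_START.put(0, "RZone01");
        UNIT_BY_SHARD_START.put(50, "RZone02");
    }

    /** Derive the logical shard from the last two digits of the UID. */
    static int shardOf(long uid) {
        return (int) (uid % 100);
    }

    /** Find the unit whose shard range covers this UID. */
    static String unitOf(long uid) {
        return UNIT_BY_SHARD_START.floorEntry(shardOf(uid)).getValue();
    }

    public static void main(String[] args) {
        System.out.println(unitOf(1000000042L)); // shard 42 -> RZone01
        System.out.println(unitOf(1000000088L)); // shard 88 -> RZone02
    }
}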

3. Non-unitized applications and data

For services and applications that cannot be unitized, there are two possibilities:

(1) Not sensitive to latency, but sensitive to data consistency. Such applications can only be deployed in the same-city active-active mode; when other applications call them, cross-region calls are possible and the latency must be tolerable. We call this kind of application an MZone application.

(2) Sensitive to data-access latency, but able to tolerate brief data inconsistency. Such applications and data can keep a full copy of the data in each data center and synchronize incrementally in real time between data centers. For now we call this kind of application a QZone application.

With the two non-unitized types above added, our data center deployment might look as follows: each data center contains two RZones; MZones keep a deployment similar to the two-site three-center mode, so calling MZone services from remote data centers requires cross-region, cross-data-center calls; and each data center keeps a complete copy of the QZone data, with the data centers synchronizing with each other in real time over data links.

4. Request routing

(1) API entry gateway

To ensure that users correctly reach their own unit, a traffic gateway cluster is deployed in every data center. When a user request arrives at a data center, it first enters the traffic gateway; the gateway is aware of the global traffic sharding rules, computes the unit the user belongs to, and forwards the request to the corresponding unit. In this way the user request is routed to the right unit.

Gateway forwarding can determine the user's unit and route the traffic to the correct place, but HTTP forwarding also incurs some performance loss. To reduce the amount of forwarded HTTP traffic, the user's routing ID can be written into a cookie when the response is returned; on the next request the routing ID is already known, and the request can go directly to the corresponding unit, greatly reducing HTTP forwarding.
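A minimal sketch of this cookie optimization at the gateway, assuming a hypothetical cookie name and the same UID-to-unit rule as the earlier sketch: the gateway trusts an existing routing cookie when present, and otherwise computes the unit and emits a Set-Cookie header so later requests can skip the lookup.

import java.util.Optional;

// Sketch of cookie-assisted unit routing at the API gateway; the cookie name and rule are illustrative.
public class GatewayRouting {

    private static final String ROUTE_COOKIE = "x-unit-route";

    /** Same idea as the earlier UID-to-unit mapping, simplified for the sketch. */
    static String computeUnit(long uid) {
        return uid % 100 < 50 ? "RZone01" : "RZone02";
    }

    /** Prefer the routing cookie when present; otherwise fall back to computing the unit. */
    static String resolveUnit(Optional<String> routeCookie, long uid) {
        return routeCookie.orElseGet(() -> computeUnit(uid));
    }

    /** Header the gateway attaches to the response so the next request can skip the lookup. */
    static String setCookieHeader(String unit) {
        return ROUTE_COOKIE + "=" + unit + "; Path=/; Max-Age=86400";
    }

    public static void main(String[] args) {
        long uid = 1000000088L;
        String unit = resolveUnit(Optional.empty(), uid);   // first request: no cookie yet
        System.out.println("forward to " + unit);
        System.out.println("Set-Cookie: " + setCookieHeader(unit));
        System.out.println("next request -> " + resolveUnit(Optional.of(unit), uid));
    }
}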

(2) Service routing

Even though the application has been unitized, cross-unit calls still cannot be avoided entirely. For example, if user A transfers money to user B, the operation on user B needs a cross-unit call, and the request must be routed to the unit where user B's data resides. In remote multi-active scenarios, middleware such as RPC, MQ, and DB must provide routing capabilities to route requests correctly to the corresponding unit. The following uses RPC routing as an example to illustrate how middleware performs routing under remote multi-active; other middleware (database middleware, cache middleware, message middleware, and so on) works along similar lines.

// RPC interface definition under multi-active: the zone type and, for RZone services,
// a UID parser used for unit routing are declared on the method.
public interface ManualInterventionFacade {
    @ZoneRoute(zoneType = ZoneType.RZone, uidClass = UidParseClass.class)
    ManualRecommendResponse getManualRecommendCommodity(ManualRecommendRequest request);
}

The above shows how an RPC interface is defined under multi-active: the zone type must be specified, and for an RZone service a way to resolve the UID must be provided. The figure below shows the routing and addressing process of the RPC registry, which differs from that of same-city active-active.
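The article does not show the contract of the UidParseClass referenced in the annotation above; as a hypothetical sketch, such a parser might simply extract the UID from the request object so the RPC framework can pick the target RZone. Both the UidParser interface and the request field below are assumptions for illustration.

// Hypothetical sketch of the UID extraction referenced by uidClass above;
// the UidParser contract and the request field are assumptions, not the real framework API.
public class UidParseSketch {

    interface UidParser<T> {
        long parseUid(T request);
    }

    // Stand-in for the ManualRecommendRequest used in the interface above.
    record ManualRecommendRequest(long userId, String scene) {}

    static class ManualRecommendUidParser implements UidParser<ManualRecommendRequest> {
        @Override
        public long parseUid(ManualRecommendRequest request) {
            return request.userId(); // the buyer's UID decides which RZone handles the call
        }
    }

    public static void main(String[] args) {
        ManualRecommendRequest req = new ManualRecommendRequest(1000000088L, "home-page");
        long uid = new ManualRecommendUidParser().parseUid(req);
        System.out.println("route by uid " + uid + " -> " + (uid % 100 < 50 ? "RZone01" : "RZone02"));
    }
}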

5. Data synchronization

**(1) QZone data:** this data only needs eventual consistency; brief inconsistency has no business impact, but it is very sensitive to latency, for example algorithm, risk-control, and configuration data. Basically a full set of this data is deployed as a QZone in each data center, and the copies synchronize with each other.

**(2) MZone data:** this data is very sensitive to consistency and cannot tolerate inconsistency, so it can only use the same-city active-active deployment mode, and the business must be able to tolerate the latency of remote calls.

**(3) RZone data:** each zone has its own master node for this data; if the data does not belong to the local unit, the write must be routed to the node of the corresponding unit. The deployment of this kind of data is shown below.

6. Scheme evaluation

Advantages

  • Disaster recovery capability is greatly improved: both services and data are multi-active across regions.

  • The system can, in theory, scale horizontally; the overall capacity across multiple remote data centers is greatly increased, and in theory there is no capacity ceiling.

  • User traffic is split across multiple data centers and regions, reducing the blast radius of data-center-level and region-level failures.

Disadvantages

  • The architecture is very complex and the deployment, operations, and maintenance costs are very high; the middleware and storage the company relies on need to be adapted in many ways.

  • It is intrusive to the business system: because unitization determines which unit a service call or data write goes to, the business system has to carry a routing identifier (such as the UID).

  • Cross-unit and cross-region service calls, such as the transfer example above, cannot be completely avoided; what we can do is minimize cross-region service calls.

V. Summary

This article has discussed the general ideas behind building multi-active systems, some key technical points of each solution, and a comparison of the different schemes. Establishing complete remote multi-active capability is far more complex than the discussion above: it requires unitized adaptation of the various middleware and storage systems the business depends on, along with complete supporting capabilities for traffic scheduling and for operations management and control.

Due to space limitations, this article does not cover in detail the data replication and high-availability schemes of the various storage systems (such as Redis and MySQL) under multi-active; interested readers can dig into this topic on their own.