Preface

How often do you hear about multi-IDC deployment and remote disaster recovery? What do "two sites, three centers" and "three sites, five centers" actually mean? Curious about how data is synchronized in a multi-IDC deployment?

Today, I will try to clear up the confusion and walk through the data synchronization problem in detail.

A single IDC

The architecture in the figure above shows a single IDC running a MySQL cluster with one master and two slaves, a setup commonly used by small and medium-sized Internet companies.

There are some problems with the above scheme:

1) User experience varies by region. A single IDC can only be deployed in one region, for example Beijing. Users in Beijing can then access the IDC quickly, but users in Shanghai generally see higher access latency.

An RTT from Shanghai to Beijing may be about 20ms.

2) Disaster recovery. Disaster recovery here refers to an equipment-room power failure, a natural disaster, or a fiber-optic cable being cut, not a single-machine fault. Once such a problem occurs, users cannot access the service, and data may even be lost.

One year, a fiber-optic cable serving Alipay's Hangzhou data center was cut.

Multiple IDCs

To solve these problems, we can deploy the service to multiple different IDCs and synchronize data between them, as shown in the following figure.

In this way, we can solve the problems encountered with a single IDC:

1) User experience. Users can access the equipment room nearest to them.

2) Disaster recovery. When an equipment room goes down, traffic from its users can be scheduled to another healthy equipment room. Because data between equipment rooms is synchronized in real time, users can still access their data after the traffic is rescheduled.

A small amount of data may be lost at the moment of failure

This article will not cover traffic scheduling; I will cover traffic scheduling and gray release separately later. Here we focus on data synchronization.

More on disaster recovery

  • Equipment-room DR: The example above uses two IDCs, but two IDCs alone cannot provide equipment-room disaster recovery; at least three IDCs are required. Consider consistency components based on majority (quorum) protocols, such as ZooKeeper, Redis, etcd, and Consul, which require agreement from a majority of nodes. Suppose three nodes are deployed. With only two equipment rooms, one room must hold two nodes and the other one node. If the room with two nodes goes down, only one node remains and a majority cannot be formed. With three equipment rooms, one node per room, any single room can fail and the remaining two nodes still form a majority. This is what we often call "two sites, three centers".
  • City-level DR: In a severe natural disaster, every equipment room in a city may become unreachable. To achieve city-level disaster recovery, the "three sites, five centers" scheme is used: three cities host two, two, and one equipment rooms respectively. If an entire city is lost, the other two cities still have at least three equipment rooms alive, which can still form a majority.
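The quorum arithmetic above can be checked with a short sketch (a toy model; the room names and placements below are illustrative): given the equipment room where each consensus node lives, it verifies whether losing any single room still leaves a strict majority of nodes.

```python
def survives_any_room_failure(placement):
    """placement: list of room names, one entry per consensus node.
    Returns True if losing any single room still leaves a strict majority."""
    total = len(placement)
    majority = total // 2 + 1
    for down in set(placement):
        alive = sum(1 for room in placement if room != down)
        if alive < majority:
            return False
    return True

# Two rooms, three nodes: losing the room with two nodes leaves 1 of 3, no majority.
print(survives_any_room_failure(["bj-1", "bj-1", "bj-2"]))   # False
# Three rooms, one node each: losing any room leaves 2 of 3, still a majority.
print(survives_any_room_failure(["bj", "sh", "gz"]))          # True
```

The same function applied to a five-node, "two plus two plus one" placement shows why five centers tolerate a whole-room failure as well.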

MySQL primary/secondary synchronization

You should already know the basics of data replication in MySQL's master-slave architecture.

Typically, a MySQL cluster has one master and several slaves. All user data is written to the master, which records the changes in its binary log (binlog). Each slave starts an I/O thread that pulls the binlog from the master and writes it to a local relay log, and an SQL thread that replays the relay log against the local data, completing the synchronization.
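As an illustration of the flow above, here is a minimal toy model (not the real wire protocol) of a master binlog, a slave I/O thread, and a slave SQL thread:

```python
class Master:
    def __init__(self):
        self.data, self.binlog = {}, []

    def write(self, key, value):
        # Every write lands in the data set and is appended to the binlog.
        self.data[key] = value
        self.binlog.append((key, value))

class Slave:
    def __init__(self):
        self.data, self.relay_log, self.pos = {}, [], 0

    def io_thread(self, master):
        # Pull binlog events past our current position into the relay log.
        self.relay_log.extend(master.binlog[self.pos:])
        self.pos = len(master.binlog)

    def sql_thread(self):
        # Replay relay-log events against local data, then clear them.
        for key, value in self.relay_log:
            self.data[key] = value
        self.relay_log.clear()

master, slave = Master(), Slave()
master.write("id=1", "alice")
master.write("id=2", "bob")
slave.io_thread(master)
slave.sql_thread()
print(slave.data == master.data)  # True
```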

Data synchronization scheme

So here is an idea: can we write a component that disguises itself as a slave, subscribes to the master's binlog, parses the changed data, and then syncs it to another MySQL cluster?

In this way, data from one cluster can be synchronized to another cluster.

Do we need to write this component ourselves? We could: with the binlog protocol specification and some knowledge of network programming, such a component is implementable.

Of course, we don't need to write it ourselves; there are mature open-source options on the market:

• Alibaba's open-source Canal

• Meituan's open-source Puma

• LinkedIn's open-source Databus

We can use these open-source components to subscribe to the binlog, parse it into change data, and synchronize that data to the target library. The whole process has two steps: first, subscribe to obtain the changed data; second, apply the changed data to the target libraries.

The target libraries mentioned here are not limited to MySQL clusters; they can also be Redis, Elasticsearch, and so on.
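As a sketch of the second step, here is a hypothetical converter from a parsed row-change event to SQL for the target library. The event shape below is a simplification of what binlog-subscription components like Canal emit, not their actual API:

```python
def event_to_sql(event):
    """Convert a parsed row-change event into an SQL statement.
    event: {"table": str, "type": "INSERT"|"UPDATE"|"DELETE", "row": dict}"""
    table, typ, row = event["table"], event["type"], event["row"]
    if typ == "INSERT":
        cols = ", ".join(row)
        vals = ", ".join(repr(v) for v in row.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals})"
    if typ == "UPDATE":
        sets = ", ".join(f"{c} = {v!r}" for c, v in row.items() if c != "id")
        return f"UPDATE {table} SET {sets} WHERE id = {row['id']!r}"
    if typ == "DELETE":
        return f"DELETE FROM {table} WHERE id = {row['id']!r}"
    raise ValueError(f"unknown event type: {typ}")

print(event_to_sql({"table": "user", "type": "INSERT",
                    "row": {"id": 1, "name": "tianshouzhi"}}))
# INSERT INTO user (id, name) VALUES (1, 'tianshouzhi')
```

A real update component would of course use parameterized statements rather than string interpolation; this only illustrates the binlog-to-SQL mapping.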

The figure above shows a typical data synchronization flow based on binlog subscription.

MySQL synchronization across multiple IDCs

Based on the above, MySQL data synchronization across multiple IDCs can also use the binlog scheme.

Data from Beijing users is continuously written to the DB in the equipment room nearest to them. The binlog subscription component subscribes to that DB's binlog, and the downstream update component converts the binlog into SQL and applies it to the target library. The Shanghai side works the same way in the opposite direction. In this way, the two libraries synchronize data to each other in real time.

The above solution works fine when binlog updates are infrequent. However, when updates are frequent, the binlog volume becomes very large, and the component that applies the updates is likely to be overwhelmed.

Optimizing the synchronization scheme

To solve the problem of the binlog volume being too large for the update component to handle, an MQ can be added to the solution for peak shaving, as shown in the following figure:
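A minimal single-process sketch of the peak-shaving idea (in production the queue would be Kafka, RocketMQ, or similar, not an in-memory queue): the binlog subscriber produces into a bounded queue, and the update component consumes at its own pace.

```python
import queue

mq = queue.Queue(maxsize=1000)          # bounded buffer that absorbs bursts

def subscribe_burst(events):
    for e in events:                    # a traffic spike from the binlog
        mq.put(e)                       # blocks (back-pressure) if the MQ is full

def apply_batch(target, batch_size=2):
    """Drain up to batch_size events into the target library."""
    applied = 0
    while applied < batch_size and not mq.empty():
        key, value = mq.get()           # consumer pulls at its own rate
        target[key] = value
        applied += 1
    return applied

target_db = {}
subscribe_burst([("id=1", "a"), ("id=2", "b"), ("id=3", "c")])
print(apply_batch(target_db))           # 2  (first batch)
print(apply_batch(target_db))           # 1  (remainder)
print(target_db)
```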

Problems with synchronization schemes

The architecture above mainly handles incremental data synchronization. But what about the full data when the project first goes online? The usual strategy is for the DBA to dump a complete snapshot of the source library and import that snapshot into the target library; incremental synchronization then takes over.

Looking at the incremental synchronization again, careful readers will notice that the data flow between the Beijing IDC and the Shanghai IDC is bidirectional: Beijing users' data is updated in the Beijing DB and Shanghai users' data in the Shanghai DB, so the synchronization must run in both directions.

There are several problems in the data synchronization process:

How to solve the problem of repeated insertion?

Consider a record in the source library that has no primary key or unique index. When the update component converts this record's binlog into SQL and inserts it into the target library, an exception may be thrown. At that point we do not know whether the insert succeeded, so we must retry. Suppose the insert actually succeeded, but a socket timeout occurred while the target library was responding, producing the exception. If we retry after the timeout, the record is inserted twice, causing data inconsistency.

Therefore, data synchronization tools generally impose a restriction: records must have a primary key or unique index.
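A toy in-memory model of why the key matters: with a unique key, a retried write becomes an idempotent upsert; without one, every retry appends a duplicate row.

```python
def apply_with_key(table, row):
    # Keyed table (dict): writing the same row twice is a harmless upsert.
    table[row["id"]] = row

def apply_without_key(table, row):
    # Keyless table (list): every write, including a retry, appends a new row.
    table.append(row)

keyed, keyless = {}, []
row = {"id": 1, "name": "tianshouzhi"}

for _ in range(2):                      # first attempt + one retry after a socket timeout
    apply_with_key(keyed, row)
    apply_without_key(keyless, row)

print(len(keyed))    # 1 -> consistent
print(len(keyless))  # 2 -> duplicated record
```

In real MySQL terms, the keyed case corresponds to retrying with `INSERT ... ON DUPLICATE KEY UPDATE` or an equivalent idempotent write.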

What about DDL statements?

If a table already contains a large amount of data, say tens of millions or hundreds of millions of rows, a DDL change on it will be very slow, possibly taking minutes or longer, and the DDL operation will lock the table, severely impacting the business.

As a result, the synchronization component typically filters out DDL statements and does not synchronize them. DBAs apply table structure changes on the different database clusters using online DDL tools.

How to resolve unique index conflicts?

Since inserts happen in both libraries, if the two libraries generate the same unique index value, a unique-index conflict occurs when the data is synchronized to the peer. The usual recommendation is to use a globally unique distributed ID generator for these values, ensuring no conflicts. See my earlier article on generating globally unique distributed IDs.

In addition, if a conflict does occur, the synchronization component should record it for later troubleshooting.
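A snowflake-style generator is one common way to build such a globally unique ID service. The bit layout below (41-bit millisecond timestamp, 10-bit worker id, 12-bit per-millisecond sequence) is a typical sketch, not any specific library; giving each IDC a distinct worker id guarantees the two sides never mint the same ID.

```python
import time
import threading

class SnowflakeLike:
    """Snowflake-style ID sketch: (timestamp << 22) | (worker_id << 12) | seq."""

    def __init__(self, worker_id):
        assert 0 <= worker_id < 1024      # 10 bits for the worker id
        self.worker_id = worker_id
        self.last_ms, self.seq = -1, 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            ms = int(time.time() * 1000)
            if ms == self.last_ms:
                self.seq = (self.seq + 1) & 0xFFF   # 12-bit sequence
                if self.seq == 0:                   # exhausted: wait for next ms
                    while ms <= self.last_ms:
                        ms = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = ms
            return (ms << 22) | (self.worker_id << 12) | self.seq

# One generator per IDC, each with its own worker id.
beijing, shanghai = SnowflakeLike(1), SnowflakeLike(2)
ids = [beijing.next_id() for _ in range(3)] + [shanghai.next_id() for _ in range(3)]
print(len(set(ids)) == len(ids))  # True: no collisions across the two IDCs
```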

How to solve the data loopback problem?

This is the most common, and most important, problem in bidirectional data synchronization, and it must be resolved. Let's look at how loopback manifests for INSERT, UPDATE, and DELETE operations.

INSERT operations

Suppose you insert a record into library A. Library A generates a binlog, which is synchronized to library B; applying it causes B to generate a binlog of its own. Because the synchronization is bidirectional, the record is then synchronized back to library A. Since library A already has the record, a conflict occurs.

The UPDATE operation

Consider a record in library A that is updated exactly once, from R to R1. The binlog for R1 is synchronized to library B, which in turn synchronizes R1 back to library A. In this case, library A does not generate a new binlog: A's record is already R1, so applying B's R1 changes nothing.

MySQL does not generate a binlog event when an update does not change the record's value, which effectively terminates the loop. In fact, MySQL does nothing at all when the updated value is unchanged:

The figure above shows a table containing a single record (1, "tianshouzhi"); an update statement then sets the name of the record with id=1 to "tianshouzhi" again, i.e. the value does not change. MySQL reports zero rows changed, meaning no actual update takes place.

Does this mean UPDATE operations never loop back? Not quite. Consider another scenario:

Suppose record R in library A is updated twice in quick succession: first to R1, then to R2. Both changes are synchronized to B, which applies them and generates binlogs for R1 and R2. Since B's data is also synchronized back to A, B's R1 arrives at A first. A's current value is R2, which differs from R1, so A updates the record to R1 and generates a new binlog. Then B's R2 arrives; A's value is now R1, so A updates it back to R2 and generates yet another binlog. A's new binlogs flow back to B, B's back to A, and so on.

The result is an endless loop.
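The ping-pong can be reproduced with a small simulation (a toy model: each side re-emits a "binlog" event only when an incoming value actually changes its record, mirroring MySQL's unchanged-value behavior):

```python
def apply_changes(value, incoming):
    """Apply incoming events; return (new_value, events re-emitted as binlog)."""
    emitted = []
    for new in incoming:
        if new != value:          # MySQL only binlogs updates that change the row
            value = new
            emitted.append(new)
    return value, emitted

a_value, b_value = "R2", "R"      # A was already updated R -> R1 -> R2
a_out = ["R1", "R2"]              # A's two binlog events, heading to B
b_out = []

for round_no in range(4):         # cap the simulation; in reality it never stops
    b_value, b_out = apply_changes(b_value, a_out)  # B applies A's events
    a_value, a_out = apply_changes(a_value, b_out)  # A applies B's echoes
    print(round_no, a_out)        # A keeps re-emitting ['R1', 'R2'] every round
```

A single update dies out (the echoed value matches and nothing is re-emitted), but two back-to-back updates keep both sides oscillating between R1 and R2 indefinitely.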

The DELETE operation

DELETE also has an ordering problem. For example, insert a record into library A and then delete it. After A's delete, B synchronizes the earlier insert back to A, re-creating the record; A's delete then echoes back through B in the same way. Each round generates a new binlog, again producing an endless loop.

Conclusion

Today we introduced the basic multi-IDC MySQL synchronization scheme, some problems encountered along the way, and their solutions. The data loopback problem remains, and I will address it in the next article. Thanks!