Author’s brief introduction

LeiXiaoLong

Qunar DBA

I joined Qunar in August 2019 and have extensive experience in database operation and optimization. I am currently responsible for MySQL/Redis operation and maintenance and for implementing and rolling out automation solutions.

1. Overview

Database migration and splitting are an unavoidable part of our daily operation and maintenance work. This article analyzes the common database migration schemes, the scenarios they suit, and their advantages and disadvantages, and shares the database architecture used at Qunar along with practical experience gained while that architecture evolved.

2. Migration solutions

The design of a data migration solution usually depends on the original data architecture and the business scenario. The ultimate goal is to be as business-friendly as possible, with as little downtime as possible (ideally none), while ensuring data consistency.

2.1 Scheme 1

The most common online migration mode is to set up primary-secondary replication between the old and new clusters to synchronize data, and then cut over during the business off-peak period. The main steps are as follows (a rough consistency-check sketch follows the list):

  • Set up replication so that data flows from the old cluster (primary) to the new cluster (secondary)
  • Stop writes during the off-peak window and verify data consistency between the old and new clusters
  • The business changes its connection to point to the new cluster and redeploys the application
  • Verify the migration results
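
Below is a minimal sketch of the cutover step under stated assumptions: it sets the old primary read-only and then spot-checks table checksums between the old and new clusters. Hosts, credentials, and table names are placeholders, and in practice a dedicated tool such as pt-table-checksum would usually be used instead.

```python
import pymysql  # assumes the PyMySQL driver is available

OLD = dict(host="old-primary", user="dba", password="***")   # placeholder
NEW = dict(host="new-primary", user="dba", password="***")   # placeholder
TABLES = ["orders", "users"]                                 # hypothetical tables to spot-check

def checksum(conn_args, table):
    """Return MySQL's built-in CHECKSUM TABLE value for one table."""
    conn = pymysql.connect(**conn_args)
    with conn.cursor() as cur:
        cur.execute(f"CHECKSUM TABLE {table}")
        value = cur.fetchone()[1]
    conn.close()
    return value

# 1. Stop writes on the old primary during the off-peak window.
old = pymysql.connect(**OLD)
with old.cursor() as cur:
    cur.execute("SET GLOBAL read_only = ON")
    cur.execute("SET GLOBAL super_read_only = ON")
old.close()

# 2. Once replication has caught up, spot-check consistency table by table.
for t in TABLES:
    assert checksum(OLD, t) == checksum(NEW, t), f"checksum mismatch on {t}"
print("spot checks passed; the application can now be pointed at the new cluster")
```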

The advantages and disadvantages of this approach are obvious.

**Advantages:** Simple architecture and DBA-friendly: the DBA only needs to build the new cluster, keep primary-secondary replication running, and ensure data consistency.

**Disadvantages:** Unfriendly to the business, since writes must be stopped during the cutover window. If the cluster serves many applications, they all have to stop and change their connection settings at the same time. In addition, traffic may not be switched completely, resulting in double writes.

2.2 Scheme 2

In this scheme, data migration is carried out through a proxy. The process is divided into three stages:

  1. Set up primary-secondary replication to copy data from the original cluster to the new cluster.
  2. Add a proxy in front of the original cluster for traffic forwarding; the proxy controls which cluster traffic is sent to. At the same time, notify the developers to change their connections to the proxy (the connection can be a virtual IP address, a domain name, or a service name).
  3. Once it is confirmed that all service traffic goes through the proxy, the DBA cuts off replication from the source to the target; after verifying data consistency, the proxy is switched to send traffic to the target database. Note that the window between cutting traffic and cutting replication must be short and replication must not be lagging (a lag-check sketch follows the list). Finally, the proxy is removed by switching the virtual IP address or changing DNS resolution so that the application accesses the target database directly, or the business is asked to change its connection from the proxy to the target database.
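
For step 3, one simple way to confirm that replication has caught up before it is cut is to poll `SHOW SLAVE STATUS` on the target until the reported lag reaches zero. A minimal sketch, with placeholder hosts and credentials:

```python
import time
import pymysql

# Placeholder connection to the target cluster's primary (the replica of the source).
target = pymysql.connect(host="target-primary", user="dba", password="***",
                         cursorclass=pymysql.cursors.DictCursor)

def replication_lag(conn):
    """Return Seconds_Behind_Master, or None if replication is not running."""
    with conn.cursor() as cur:
        cur.execute("SHOW SLAVE STATUS")
        status = cur.fetchone()
    return status["Seconds_Behind_Master"] if status else None

# Wait until the target has fully caught up before stopping replication
# and letting the proxy redirect traffic to it.
while True:
    lag = replication_lag(target)
    if lag == 0:
        print("replication caught up; safe to stop replication and switch the proxy")
        break
    print(f"replication lag = {lag}, waiting...")
    time.sleep(1)

target.close()
```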

To summarize the advantages and disadvantages of this migration scheme:

**Advantages:** After the proxy is added, the DBA controls how incoming traffic is distributed. Developers change the connection as part of a normal release, rolling releases are supported, the two access methods coexist, and no downtime is needed. The DBA's work is also decoupled from the developers' work, so there is no need to release the application and perform the cutover in a fixed window. Because the DBA can tell from the traffic source which path a service uses to reach the database, it is possible to confirm that traffic has been switched completely.

**Disadvantages:** The architecture is more complex, which increases the operation and maintenance cost and difficulty for DBAs. The service interruption depends on the restart time of the proxy, i.e. the time needed to switch traffic from the source cluster to the target cluster; this is usually very fast and can be kept within seconds. Database connections may be briefly interrupted, so services must be able to reconnect, and primary-secondary data consistency must still be guaranteed.

2.3 Summary

Comparing the two migration schemes, each has its own strengths and applicable scenarios. The first scheme suits businesses that can accept an interruption and clusters whose users are clearly known, with no mixing of multiple services; otherwise a maintenance window becomes a huge project. It also requires the business side to be **trustworthy**.

Why is "trustworthy" in bold? There is a story behind it. As mentioned above, the business must change its connection from the old cluster's address to the new cluster's within a specific window. Whether the traffic is switched completely depends entirely on the business side; during the downtime the DBA has no way to tell, and residual writes to the old cluster can only be discovered after the service is brought back up. By then it is too late: the business has started double-writing, the data is inconsistent, and you are left working out how to roll back or backfill. A "scheming" DBA, however, may not trust the business that much and will quietly keep a safety net, for example by setting the old cluster read-only or revoking the application account's privileges, so that the business can no longer write to it. This last line of defense guarantees data consistency at the expense of the business.
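
A minimal sketch of that safety net, assuming a placeholder application account: set the old cluster read-only and strip the account's write privileges so that residual traffic can no longer modify data.

```python
import pymysql

# Placeholder connection to the old cluster's write node.
old = pymysql.connect(host="old-primary", user="dba", password="***")

with old.cursor() as cur:
    # Refuse writes from ordinary accounts; super_read_only also blocks accounts with SUPER.
    cur.execute("SET GLOBAL read_only = ON")
    cur.execute("SET GLOBAL super_read_only = ON")
    # Optionally also revoke write privileges from the application account
    # ('app_user'@'%' is a hypothetical account; the REVOKE must match its existing grants).
    cur.execute("REVOKE INSERT, UPDATE, DELETE ON *.* FROM 'app_user'@'%'")

old.close()
```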

The core idea of the second scheme is to let the database offer two access methods at the same time, both pointing to the same database instance, so that the business can transition smoothly between them. Development and DBA work are decoupled, so there is ample time to release the application, and the DBA has a way to confirm that service traffic has been switched completely, which guarantees data consistency.

3. Qunar cluster architecture

Based on the idea of the second scheme, the following sections introduce how Qunar upgrades and migrates its internal architecture. Let's first look at Qunar's database architecture.

3.1 MMM cluster

The cluster architecture diagram is as follows:

MMM (Master-Master Replication Manager for MySQL) is a long-established high-availability architecture that was widely used at Qunar in the early days. A Monitor and Agents are used to monitor and manage MySQL bidirectional replication. Only one master can be written to at a time; the other master acts as a standby and can only serve reads. Slave nodes can also be added for load balancing. The cluster provides service through virtual IP addresses, with no special requirements on the client.

Because replication is asynchronous, data consistency cannot be fully guaranteed, so MMM suits scenarios where strict consistency is not required but service availability must be preserved as far as possible; it is not recommended for services with high consistency requirements. In addition, MMM cannot handle network partitions and cannot be deployed across data centers.

For details, go to mysql-mmm.org

3.2 PXC cluster

Percona XtraDB Cluster (PXC) is Percona's open-source high-availability solution for MySQL. It combines Percona Server and Percona XtraBackup with the Galera library to provide synchronous multi-master replication. Compared with traditional asynchronous MySQL replication, it guarantees strong data consistency: at any moment the data state on every node in the cluster is identical. The architecture is decentralized and all nodes are equal; reads and writes are allowed on any node, and writes are synchronized to all other nodes. However, PXC currently has quite a few usage restrictions, such as supporting only the InnoDB storage engine and limiting transaction size, so appropriate application scenarios must be chosen.
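
As a quick illustration of how a node's membership and sync state can be checked, the sketch below reads a few standard Galera wsrep status variables (connection parameters are placeholders):

```python
import pymysql  # assumes the PyMySQL driver is available

# Placeholder connection to any PXC node.
conn = pymysql.connect(host="pxc-node-1", user="monitor", password="***", port=3306)

with conn.cursor() as cur:
    cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'")
    cluster_status = cur.fetchone()[1]      # expected: 'Primary'
    cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'")
    local_state = cur.fetchone()[1]         # expected: 'Synced'
    cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'")
    cluster_size = int(cur.fetchone()[1])   # number of nodes in the current component

print(f"cluster_status={cluster_status}, local_state={local_state}, size={cluster_size}")
conn.close()
```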

For details, see: www.percona.com/doc/percona…

The architecture used within Qunar is as follows:

The PXC clusters used at Qunar generally have three nodes. Although PXC supports multi-point writes, writing on multiple nodes easily leads to data conflicts, which add considerable overhead and hurt performance. In production we therefore still write on only one node and serve reads from the other two. Temporary multi-point writes are allowed only during a write-node switchover, to let the service transition smoothly to the new write node. Unlike a traditional primary/secondary switchover, a PXC high-availability switchover does not require briefly setting the database read-only or dropping connections. The cluster is exposed through a namespace (service name): the service layer hides the real IP addresses and ports of the cluster nodes, and the client obtains the cluster topology from the configuration center.

  • Sentinels: a sentinel cluster that monitors the health and topology of the PXC cluster, similar to Redis Sentinel; it also removes the single point that the MMM Monitor represents. When the cluster topology changes, the sentinels update the configuration center to take the faulty node offline and promote a new write node, and they bump the ZooKeeper version information, which triggers clients to reconnect and fetch the cluster configuration again.
  • Config Server: the cluster's configuration center, itself a PXC cluster, which stores each cluster's namespace and node information (IP addresses, ports, read/write roles, online status, and change records).
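
The sketch below is only a rough illustration, assuming a hypothetical znode path and JSON payload, of how a client could watch the configuration in ZooKeeper and refresh its connection pool when the sentinels bump the version (kazoo is a real ZooKeeper client library; everything else here is an assumption):

```python
import json
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")  # placeholder ZooKeeper ensemble
zk.start()

NAMESPACE_PATH = "/db_config/my_namespace"  # hypothetical znode holding the cluster topology

def refresh_pool(node_info):
    """Hypothetical callback: rebuild the client's connection pool from the new topology."""
    print("new topology:", node_info)

@zk.DataWatch(NAMESPACE_PATH)
def on_config_change(data, stat):
    # Fired on registration and whenever the znode changes, e.g. after a failover
    # when the sentinels update node roles and bump the version.
    if data is not None:
        refresh_pool(json.loads(data.decode("utf-8")))
```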

4.1 Migrating from MMM to PXC

For historical reasons, many MMM clusters remain in Qunar's database fleet. Some important "ancestral" services still run on MMM and carry considerable risk; it is fair to say we have simply been "lucky" that nothing has gone wrong. At first it was difficult for DBAs to push the architecture migration forward: PXC was not widely used in the industry and not well understood, so there was little confidence, and the services involved were old and important, so nobody dared to change them lightly.

Years of practice at Qunar have proved PXC to be an excellent high-availability database solution, and the business side gradually came to trust it. With the impact of this year's epidemic, we took the chance to "strengthen our internal foundations", and the DBA team used the opportunity to push forward the plan to migrate the MMM architecture to PXC clusters.

Such a large project needs a complete solution that, above all, meets the following requirements:

  • Business-friendly: no downtime, and rolling releases are supported
  • Rollback is supported: if the business finds problems while running on PXC, the release can be rolled back at any time
  • The migration cycle is long, so the whole process must remain highly available
  • Business releases are not time-constrained and have no fixed window

Against these requirements, the first scheme is ruled out: the old services share databases heavily, so it is impossible to change every address in one go or to accept a long outage. With the second scheme, a proxy would be added and its IP address and port registered in the PXC configuration center, so that services could be released against a namespace while actually reaching the old cluster through the proxy; the proxy would control where traffic is directed, and the configuration center also supports manual modification.

This appears to satisfy the requirements above, but the drawbacks are also obvious: the DBA has to maintain an extra proxy layer and keep it highly available, and a lot of additional resources are needed. With more than 120 MMM clusters to migrate, you can imagine the maintenance cost and resource consumption.

Based on this migration idea and the special nature of the namespace, we finally chose an in-place upgrade plan and dropped the proxy. The general process is as follows:

1. First, use the MMM cluster's VIP as the access address of the PXC cluster, register it with the configuration center and ZooKeeper, and provide a namespace that can serve external traffic.

2. Notify services to release against the PXC namespace. When a service accesses the database through the namespace, it actually reaches the MMM cluster nodes through the VIP. This step can be rolled back at any time, and both VIP and namespace access are supported. High availability is preserved: underneath the namespace is the VIP, which fails over together with MMM, so there is no single point of failure.

3. Observe the switchover of service traffic and use the ZooKeeper connections to determine whether it is complete.

4. After the service release is complete, upgrade the nodes to the PXC version one by one and deploy the sentinels to finish building the PXC cluster. The upgrade proceeds as follows:

  • Take the MMM standby master MASTER2 offline and upgrade it to the first PXC node.

  • MASTER2 rejoins the MMM cluster and resumes the standby-master (read) role; then perform a read/write role switchover so that MASTER2 becomes the write master and serves traffic (the serving node is now the PXC version while MMM high availability is still in effect, so it can be observed for a while; if the business hits compatibility problems caused by PXC restrictions, the read/write roles can be switched back to the original, unmodified version at this point).

  • Take the old master MASTER1 (now the standby master) offline from MMM and upgrade it to a PXC node, with MASTER2 as the donor, forming a two-node PXC cluster (MMM high availability is now broken, so the following steps must be carried out without interruption).

  • Take the remaining slave nodes offline from MMM, upgrade them to PXC nodes, and add them to the PXC cluster formed by MASTER1 and MASTER2. Deploy the sentinel cluster, change the VIP in the configuration center to the real IP addresses, confirm that the VIP carries no traffic (a sketch of such a check follows the list), and take the VIP offline.

5. Bump the ZooKeeper version to make clients reconnect. At this point the business has been fully migrated to the PXC architecture.

6. Finish up: check the monitoring and backup tasks.
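
To confirm that the VIP really carries no traffic before taking it offline, one rough approach is to look at established connections on the node whose local address is still the VIP. The sketch below assumes the VIP is still bound on the database host and that the `ss` utility is available; the addresses are placeholders:

```python
import subprocess

VIP = "10.0.0.100"   # hypothetical virtual IP still bound on this node
PORT = 3306

# List established TCP connections; those whose local address is VIP:3306 are
# clients still coming in through the old VIP instead of the namespace.
out = subprocess.run(["ss", "-Htn", "state", "established"],
                     capture_output=True, text=True, check=True).stdout

vip_conns = [line for line in out.splitlines() if f"{VIP}:{PORT}" in line.split()]
print(f"{len(vip_conns)} connections still arriving through the VIP")
for line in vip_conns:
    print(line)
```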

One important caveat: during the node-by-node upgrade, MMM high availability is broken once the second node is upgraded, while PXC high availability is not in place until the sentinels are deployed. This stretch therefore has to be carried out quickly and without interruption. We use an automated upgrade program to keep the operation standardized and efficient, to minimize the time spent in a single-point state, and to avoid mistakes. In extreme cases the VIP can be bound manually to keep the service available.

The migration scheme has been validated in practice and is strongly endorsed by the developers. Qunar has now completed the upgrade of more than 70% of its MMM clusters. The whole process is automated, so although the migration cycle is long, little human intervention is actually required.

4.2 Migration and Split Cases

Based on the idea of upgrading MMM to PXC in place, we can also handle many other migration scenarios. One business had a database-splitting requirement: cluster A is an MMM cluster with many databases, mixed services, and heavy traffic, and it could no longer cope. A particularly important service, corresponding to one DB in the instance, had to be split out to run independently on a new cluster that supports cross-data-center deployment; and because of the nature of the business, the service could not be interrupted.

With the above migration ideas in mind, let’s look at the migration architecture diagram:

The original cluster is an MMM cluster consisting of MASTER1 and MASTER2, accessed for writes through WVip and for reads through RVip. Business A now needs to be migrated. The general procedure is as follows:

  • Use MASTER1, the write node of the MMM cluster, as the donor node of the new PXC cluster and build the PXC cluster (this assumes MASTER1 has already been upgraded from the normal MySQL version to the PXC version).
  • Register MASTER1's real IP address with the configuration center; at this point MASTER1 can be accessed both through the MMM WVip and through the PXC namespace.
  • The service to be migrated can then transition smoothly from WVip to the namespace.
  • Take MASTER1 offline from the MMM cluster and detach it (MMM is then serving from a single node, so a slave can be added in advance to guard against extreme situations).

The migrated service now accesses the PXC cluster, while the remaining services still access MMM. The final step is to take MASTER1 offline from the PXC cluster and restore it as MMM's standby master.
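
As a final sanity check before detaching a node from either access path, it helps to confirm which sessions are still using the split-out business's database on that node. A minimal sketch based on information_schema.processlist; the host, credentials, and database name are placeholders:

```python
import pymysql

# Placeholder connection to the node about to be detached.
conn = pymysql.connect(host="master1-real-ip", user="monitor", password="***")

with conn.cursor() as cur:
    # List sessions whose default schema is the split-out business database.
    cur.execute(
        "SELECT id, user, host, command, time "
        "FROM information_schema.processlist WHERE db = %s",
        ("business_a_db",)   # hypothetical database name of the split-out service
    )
    sessions = cur.fetchall()

for row in sessions:
    print(row)
print(f"{len(sessions)} sessions still using the business database on this node")
conn.close()
```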

5. Summary

The migration method differs with the database architecture and the business scenario. Besides homogeneous database migrations, heterogeneous migrations are also common and may require third-party tools. In short, there is no best plan, only the most suitable one.

This article has mainly explained how to make the same database node support two access methods, or two addresses, at the same time, whether through a proxy layer or through the namespace, so that business traffic can transition smoothly from the old connection method to the new one, yielding the most business-friendly migration scheme. With this idea we can handle a variety of migration scenarios without a stop-the-world cutover, and the development and DBA work is decoupled, greatly improving efficiency.