Types of data migration

As a business evolves, its storage often needs to be migrated. The following scenarios come up frequently during development:

  1. The business and the team are growing rapidly, so services are split into microservices where appropriate. Each new service needs its own database, and data must be migrated from the source database to the new one.
  2. A single table holds too many records, so it has to be sharded across multiple databases and tables. Data from the old table must be migrated to the new tables.
  3. The original storage choice was wrong. Examples include migrating between relational databases (PG, MySQL, Oracle) or between NoSQL stores (Mongo, Cassandra, HBase).
  4. Data center migration, for example between a self-built data center and the cloud.

Each of these scenarios requires data migration, and while the detailed solutions differ, they also have some similarities.

Data migration scheme

Data migration simply means moving data from one location to another.

Requirements

Data consistency: after migration, no record may be lost and no record may be missing fields.

Write availability: data is being written continuously, so the migration must not block writes. The service has to stay writable throughout.

Interruptible and reversible: being able to interrupt and roll back the migration is a demanding requirement, but it is what makes the strategy safe. If a problem is found in any phase of the migration, it can be rolled back to the original database so that services keep running normally.
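One way to picture the rollback requirement is a single switch that decides which storage serves reads. The sketch below is only an illustration: the class name ReadRouter and its methods are hypothetical, both stores are assumed to be dict-like, and it assumes the original storage keeps receiving every write during the migration, which is what makes flipping back safe.

```python
class ReadRouter:
    """Routes reads to the old or the new store behind one switch (hypothetical helper)."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.read_from_new = False  # reads start on the original storage

    def switch_to_new(self):
        self.read_from_new = True

    def roll_back(self):
        # Assumption: the original storage still receives all writes,
        # so returning to it loses nothing.
        self.read_from_new = False

    def read(self, record_id):
        store = self.new_store if self.read_from_new else self.old_store
        return store.get(record_id)
```

Because the switch is a single flag, a rollback does not require moving any data back; it only changes where reads go.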

The migration plan

To meet the above requirements, a dual-write strategy is generally adopted. That is, every write produces two copies, one in the old storage and one in the new.
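A minimal sketch of dual write follows, under several assumptions: both stores are dict-like, the old storage remains the source of truth so only its failure fails the request, and DualWriter, failed_ids, and compensate are hypothetical names chosen for illustration. A real system would persist the failure queue instead of keeping it in memory.

```python
import logging

logger = logging.getLogger("dual_write")


class DualWriter:
    """Writes each record to the old store, then mirrors it to the new store."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store  # assumed source of truth during migration
        self.new_store = new_store  # migration target
        self.failed_ids = []        # hypothetical in-memory compensation queue

    def write(self, record_id, record):
        # Errors from the old store propagate: it decides the outcome.
        self.old_store[record_id] = record
        try:
            self.new_store[record_id] = record
        except Exception:
            # A failed mirror write must not fail the request. Log it and
            # remember the ID so the write can be replayed later.
            logger.exception("mirror write failed for id=%s", record_id)
            self.failed_ids.append(record_id)

    def compensate(self):
        """Replay failed mirror writes from the old store; keep those that fail again."""
        still_failing = []
        for record_id in self.failed_ids:
            try:
                self.new_store[record_id] = self.old_store[record_id]
            except Exception:
                still_failing.append(record_id)
        self.failed_ids = still_failing
```

Replaying from the old store, instead of from the logged payload, means the compensation always copies the latest value even if the record was updated again in the meantime.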

  1. Convergence. The more access points there are for reads and writes, the more places need to be switched and the more error-prone the migration becomes. Try to converge all access points into one place.
  2. Dual write. Write incremental data to both storage systems at the same time, and make sure the new write path is correct. The old write decides the outcome: if it succeeds, the operation succeeds. If the new write fails, record a failure log, analyze the failure, then fix it and compensate.
  3. Migrate historical data. Iterate through the existing records by ID and write them to the new storage. There are many ways to do this. Synchronization tools such as binlog + Flink can handle it; if the data set is small, a simple full scan is enough.
  4. Data verification. Consistency verification is critical: check that the record counts match on both sides and that each individual record is complete. If the data set is small, a full check is generally performed. If it is large, sample and verify.
  5. Switch reads to the new storage. Once the data is verified, switch reads to the new storage. If any problem appears, switch back to the old storage, troubleshoot, and start over.
  6. After N days of running safely and smoothly on the new storage, you can stop the old read path, and the migration is complete.
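The historical-data and verification steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production tool: the stores are dict-like with sortable IDs, the function names backfill, counts_match, and verify are hypothetical, and the "skip if already present" check stands in for the atomic insert-if-absent or version check that a concurrent system would need.

```python
import random


def backfill(old_store, new_store, batch_size=1000):
    """Copy historical records to the new store in ascending-ID batches."""
    copied = 0
    ids = sorted(old_store)  # stand-in for paging through the table by ID
    for start in range(0, len(ids), batch_size):
        for record_id in ids[start:start + batch_size]:
            # Rows already present were written during the migration and are
            # at least as fresh as the old copy, so they are not overwritten.
            if record_id not in new_store:
                new_store[record_id] = old_store[record_id]
                copied += 1
    return copied


def counts_match(old_store, new_store):
    """Check that both sides hold the same number of records."""
    return len(old_store) == len(new_store)


def verify(old_store, new_store, sample_size=None, seed=0):
    """Return IDs whose record is missing or different in the new store.

    sample_size=None checks every record (full check); otherwise a random
    sample of that many IDs is compared.
    """
    ids = list(old_store)
    if sample_size is not None and sample_size < len(ids):
        ids = random.Random(seed).sample(ids, sample_size)
    return [
        record_id
        for record_id in ids
        if record_id not in new_store or new_store[record_id] != old_store[record_id]
    ]
```

A typical run would call backfill once, then counts_match and verify. An empty list from verify means no mismatch was found among the records checked, which for a sampled run is evidence of consistency and not proof.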

Points to note

  1. For back-end services, storage is the cornerstone and the top priority, and it carries the highest stability requirement. Data must be migrated smoothly, without the business noticing.
  2. Storage is also stateful, which makes migration difficult. Developers therefore need to be forward-looking and careful when choosing a database, so that migration can be avoided in the first place. When potential problems with the database choice are found, decide promptly and migrate as soon as possible. Don't procrastinate on the assumption that problems are unlikely. Otherwise, when a problem does occur, it will be a major failure with losses that are hard to estimate.