WeChat public account [Huanshao’s Growth Path]


Hello, I’m Leo, a Java backend developer in Changzhou. In the last article, we explained how to troubleshoot a downed node in a production database, walking through SELECT 1 checks, external checks, internal statistics, and the related handling schemes. In this article, we look at what to do when data is mistakenly deleted from a production database: do you run away, or do you fix it?

Approach

The content of this article is outlined in the mind map below, which should make it easier to follow.

Mistakenly deleting rows

Deleting rows by mistake is quite common. Sometimes we delete rows to fix a data problem, only to realize afterwards that we deleted the wrong ones. Let’s look at how to deal with this.

binlog_format=ROW and binlog_row_image=FULL

binlog_format=row

We introduced this parameter earlier when covering the binlog. It takes three values: ROW, STATEMENT, and MIXED.

ROW is required here because only row-format logging records enough detail to reconstruct the data. STATEMENT records only the SQL text, which is not enough for recovery, and MIXED is no good either, since it cannot guarantee that any given statement was logged in row format.

binlog_row_image=FULL

This is a companion parameter to the one above. With FULL, the binlog records the complete image of every affected row; with MINIMAL, it records only the columns that are actually needed. The default is FULL.
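As a quick sanity check, you can verify (and, if needed, set) the two parameters with the mysql client. A minimal sketch, with connection details left as placeholders:

```bash
mysql -uroot -p <<'SQL'
SHOW VARIABLES LIKE 'binlog_format';     -- expect ROW
SHOW VARIABLES LIKE 'binlog_row_image';  -- expect FULL
-- Both can be changed at runtime, but persist them in my.cnf as well:
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';
SQL
```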

Getting down to business…

You can use the Flashback tool to recover the data. The principle is to rewrite the contents of the binlog and replay it against the original instance. For this to work, the events are rewritten as follows:

  • For an insert statement, the binlog event type is Write_rows; change it to Delete_rows;
  • Similarly, for a delete statement, change Delete_rows to Write_rows;
  • For Update_rows, the binlog contains both the before and after images of each row, so simply swap the two.

If multiple transactions are involved, say A, B, C in the original order, then to recover the data you replay the inverses in reverse order: C, B, A.
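To make the inversion concrete, here is a hand-written illustration of what flashback does with a mistaken DELETE; the table, columns, and values are all hypothetical:

```bash
mysql test_db <<'SQL'
-- The mistaken statement (logged as Delete_rows events under ROW format):
--   DELETE FROM t_order WHERE id IN (1, 2);
-- Flashback rebuilds the inverse from the FULL row images and replays it as
-- Write_rows events, which is equivalent to executing:
INSERT INTO t_order (id, user_id, amount) VALUES
  (1, 100, 9.90),
  (2, 101, 19.90);
SQL
```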

Suggestion: do not perform these operations directly on the primary. It is safer to restore a backup, or use a replica, as a temporary instance, run the flashback there, confirm the recovered data, and only then restore it back to the primary.

Prevention

  • Set the sql_safe_updates parameter to ON. Then, if we forget the WHERE condition in a DELETE or UPDATE statement, or the WHERE condition does not use an indexed column, the statement fails to execute (see the sketch after this list).
  • Before code goes live, its SQL must pass review (SQL audit).
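A minimal sketch of the sql_safe_updates guard in action; the database and table names are hypothetical:

```bash
mysql test_db <<'SQL'
SET sql_safe_updates = ON;
-- A DELETE with no WHERE clause (or a WHERE that uses no indexed column)
-- now fails with an error similar to:
--   ERROR 1175 (HY000): You are using safe update mode and you tried to
--   update a table without a WHERE that uses a KEY column.
DELETE FROM t_order;
-- To really empty the table you must be explicit, e.g.:
--   DELETE FROM t_order WHERE id >= 0;
SQL
```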

If a large amount of data needs to be removed and the whole table is useless, do not DELETE it row by row: that generates and writes redo log, binlog, rollback (undo) log, and so on for every row. Using TRUNCATE TABLE or DROP TABLE instead is far cheaper.

Why are TRUNCATE TABLE and DROP TABLE cheaper?

We said that binlog_format=ROW must be set, but even with that configuration, there is an internal wrinkle: these two commands are always logged in statement format. The binlog therefore records only the statement itself, not the row data, which is why they are cheap to write, and also why the deleted data cannot be recovered from the binlog by flashback.
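You can see this for yourself in the binlog events; a sketch, with the binlog file name as a placeholder:

```bash
mysql -e "SHOW BINLOG EVENTS IN 'master.000001';"
# Even with binlog_format=ROW, a TRUNCATE appears as a single Query event
# rather than a stream of row events, along the lines of:
#   ... | Query | ... | use `test_db`; TRUNCATE TABLE t_order |
```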

So what if a table or database really is deleted by mistake?




Mistakenly deleting tables or databases

There is still a way if this happens, but it takes more work. The most basic approach is full backup + incremental (binlog) backup: take regular full backups online, and back up the binlog in real time.

This scheme is similar to Redis’s RDB plus AOF. So how does it work in practice?

Let’s say someone drops a database by mistake at 12 noon:

  1. Take the most recent full backup. Suppose backups are made once a day, and the latest one is from 3 a.m.;
  2. Restore a temporary instance from that backup;
  3. From the binlog backups, fetch all the logs from 3 a.m. onward;
  4. Apply those logs to the temporary instance, skipping only the statement that mistakenly deleted the data (a sketch follows this list).
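A minimal sketch of steps 3 and 4; the host, database, and binlog file names are placeholders, and skipping the bad statement itself is covered in the notes below:

```bash
# Replay the binlog backups taken after 3 a.m. into the temporary instance.
mysqlbinlog --database=test_db master.000010 master.000011 \
  | mysql -h temp_host -uroot -p
```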

Extensions

  • If the instance holds more than one database, add the --database option to the mysqlbinlog command to name the database containing the deleted table. This saves you from replaying logs that belong to other databases.
  • If GTID mode is enabled, things are much easier: suppose the mistaken transaction’s GTID is gtid1. First add gtid1 to the temporary instance’s executed-GTID set via an empty transaction, then replay the binlog in order; the bad transaction is skipped automatically.
  • Without GTID mode, it is more tedious: for the binlog file containing the 12 o’clock misoperation, first replay with --stop-position set to just before the bad event, then resume with --start-position set to just after it (see the sketch after this list).
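Sketches of both variants follow; the GTID, file names, and positions are placeholders, not values from a real incident:

```bash
# With GTID mode: add the bad transaction's GTID as an empty transaction,
# then replay in order -- the server skips the already-"executed" GTID.
mysql -h temp_host -uroot -p <<'SQL'
SET gtid_next = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee:10';
BEGIN; COMMIT;
SET gtid_next = 'AUTOMATIC';
SQL
mysqlbinlog --database=test_db master.000012 | mysql -h temp_host -uroot -p

# Without GTID mode: replay the affected file in two slices around the bad event.
mysqlbinlog --database=test_db --stop-position=4289  master.000012 | mysql -h temp_host -uroot -p
mysqlbinlog --database=test_db --start-position=4396 master.000012 | mysql -h temp_host -uroot -p
```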

Performance optimization

This process is fairly slow. Even if only one table was deleted by mistake, the backup restores an entire instance, and mysqlbinlog cannot be told to parse the log for just a single table. This isn’t wasted effort by choice; MySQL simply offers no finer granularity here.

Speeding it up

After restoring the temporary instance from the backup, make it a replica of the production standby. Before starting replication, run CHANGE REPLICATION FILTER REPLICATE_DO_TABLE = (db_name.tbl_name).

This makes the temporary instance replicate only the mistakenly operated table. It also lets the recovery benefit from the parallel replication techniques described earlier, speeding up the whole process.
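A sketch of the setup; host names, credentials, and the table name are placeholders, and GTID auto-positioning is assumed:

```bash
mysql -h temp_host -uroot -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST = 'standby_host',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_pwd',
  MASTER_AUTO_POSITION = 1;      -- assumes GTID mode on both sides
-- Only replicate the table that was mistakenly operated on:
CHANGE REPLICATION FILTER REPLICATE_DO_TABLE = (test_db.t_order);
START SLAVE;
SQL
```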

When the binlog is missing

If, while the temporary instance is catching up, a binlog it needs has already been purged from the standby, we can fetch the missing files from the binlog backup system and put them back on the standby. The steps are as follows (a sketch follows the list):

  1. Download the two missing binlog files and place them in the standby’s log directory;
  2. Open the master.index file in the log directory and add two lines at the beginning, one per recovered file, e.g. ./master.000001 and ./master.000002;
  3. Restart the standby so it re-reads the index and recognizes the two files. The master-slave relationship can then be re-established and synchronization resumes normally.
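Put together, the repair might look like this sketch; paths and file names are placeholders:

```bash
# 1. Copy the missing binlogs from the backup system into the log directory.
cp /binlog_backup/master.000001 /binlog_backup/master.000002 /var/lib/mysql/
cd /var/lib/mysql
# 2. Prepend the two file names to master.index so the server will find them.
printf './master.000001\n./master.000002\n' | cat - master.index > master.index.tmp
mv master.index.tmp master.index
# 3. Restart the standby; it re-reads master.index on startup.
systemctl restart mysqld
```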

This is why the backup system must back up the full binlog stream regularly. Given disk capacity constraints, keeping a fixed retention window of a certain number of days is a reasonable compromise.

Delayed-replication standby

This scheme deliberately delays replication: writes on the primary are not applied to the delayed standby immediately, but only after a configured lag.

Say the delay is one hour: a write on the primary is applied to the standby an hour later. If a misoperation is discovered within that hour, STOP SLAVE can be used to keep the bad statement from ever being applied to the standby.

You can run CHANGE MASTER TO MASTER_DELAY = N to keep the standby N seconds behind the primary.
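A sketch of configuring, and then freezing, a one-hour delayed standby; the host name is a placeholder:

```bash
mysql -h delayed_standby -uroot -p <<'SQL'
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;  -- stay 3600 seconds behind the primary
START SLAVE;
SQL
# The moment a mistake is noticed within the hour, stop the SQL thread so the
# bad statement is never applied:
mysql -h delayed_standby -uroot -p -e "STOP SLAVE SQL_THREAD;"
```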

Preventing mistaken table/database deletion

  1. Separate accounts: give different personnel only the permissions their work requires, reducing the chance of running a destructive command by accident.
  2. Establish operating procedures, so that the name of a table to be deleted cannot easily be mistyped (for example, rename the table first and drop it only after a safe observation period).

Deleting data with rm

The risk here is high. Generally, recovery in this situation relies on the rest of the cluster; if there is no cluster, you are simply out of luck.

As long as only one node’s data is deleted, the HA system kicks in, elects a new primary, and the damaged node can have its data restored and be reconnected to the cluster. That solves the problem.

Since a batch rm can be devastating, it is recommended to keep copies of the data in different machine rooms, and ideally in different cities.

Conclusion

Today, we looked at what to do about mistakenly deleted data besides running away: how to recover at the row, table, database, and OS level, and how to prevent such mistakes in the first place.