As any DBA knows, database backup is critical and is the ultimate panacea for saving a database. Therefore, data in the production system must be backed up. Of course, the selection of backup tools and policies is also very important, which directly affects the recovery efficiency.
Here’s a look at the common backup tools and the most popular Percona XtraBackup backup process.
1. Introduction to backup tools
1.1 Introduction to common Backup tools
There are three common backup tools for MySQL:
- Logical backup, including mysqldump and Mydumper;
- Physical backup, including Mysqlbackup and Percona XtraBackup;
- Binlog backup, including mysqlbinlog.
A binlog backup is a copy of a logical or physical backup and can only be used to restore data.
(1) Logical backup: mysqldump
Mysqldump backs up SQL files. The core of mysqldump is to perform a select on each table and convert it into a corresponding INSERT statement. The backup process for mysqldump is as follows:
- Add read lock to all tables in a library;
- Cyclic backup table data;
- Release read lock;
- Repeat the above three steps;
- Backup completed.
Mysqldump is a backup to a database, and all tables in the database are read-only. No changes can be made to the table until all tables in the database are backed up. This is unacceptable for online environments. If –master-data or –dump-slave is specified, FLUSH TABLES WITH READ LOCK is added at the start of the backup until the end of the backup. Of course, we can choose a secondary library for backup, so that online business will not be affected. One of the biggest benefits of using mysqldump backup is that it supports cross-platform and cross-version data migration and recovery, which physical backup cannot do.
However, because mysqldump is a backup of SQL statements, you need to be careful when using mysqldump. For example, common problems with mysqldump are:
- Migrating some data to the new instance results in deleting the original data;
- Due to the time zone problem, the data related to the restored table and time is incorrect.
- After the backup data is imported into the primary database, the secondary database is not synchronized.
- Because of the character set problem, garbled characters appear in the data after recovery.
- .
So when using mysqldump, it’s important to know what the options are, and what happens to the existing data in the backup SQL file.
The principle of Mydumper is similar to the principle of Mysqldump, but the biggest difference is the introduction of multi-threaded backup, each backup thread backup part of the table, of course, the concurrency granularity can be down to the row level, to achieve the purpose of multi-threaded backup. I’m not going to do that here.
(2) Physical backup: Percona XtraBackup
Percona XtraBackup is a backup tool developed by Percona for physical hot backup of MySQL database. It is based on InnoDB crash recovery function. Its basic working principle is as follows:
- Create a redo log copy process at startup, obtain and record the current log sequence number (LSN), and continuously copy the redo log from this site.
- Start the IDB file copy thread to copy ibDATA1, undo tablespaces and all IBD files.
- FTWRL (or add backup lock) is notified when ibD file copy is finished.
- Back up non-InnoDB data (.frm,.mrg,.myd,.myi…… Etc.);
- Back up slave and binlog information.
- Refresh logs and exit the log copying thread after the redo log is copied.
- Release global locks and record backup metadata. The backup is complete.
Percona XtraBackup applies a copy of the redo log during recovery, applies committed transactions, rolls back uncommitted transactions, and restores the database to a consistent state. Because Percona XtraBackup backs up physical files, there aren’t as many problems with using backup files for recovery or migration as there are with mysqldump.
When XtraBackup is used for backup, the database changes are affected to different degrees according to the backup parameter Settings. The impacts are analyzed in the following sections.
(3) Comparison of backup tools
By comparison, it is found that XtraBackup has the advantages of low impact on database and fast recovery, and is the first choice in daily backup. Mysqldump is more flexible to use, but you need to pay attention to the impact of existing database data.
1.2 Backup Policy
Backup policies include full backup, incremental backup, and binlog backup.
Currently, Qunar database backup mainly adopts XtraBackup full backup and binlog backup. The frequency of full backup varies with the importance level of a database. The backup program architecture is as follows:
Description:
- The backup scheduler schedules backup jobs based on the specified backup policy.
- Database backup adopts XtraBackup streaming backup, which is directly transferred to the backup machine.
- Apply-log after the backup is complete, which not only verifies the availability of the backup, but also reduces apply-log time when using the backup.
- Compressed backup;
- Binlog backup is performed periodically and stored to MFS storage using rsync.
XtraBackup backup process analysis
Percona XtraBackup is currently the most widely used tool for backing up MySQL. During the backup process, the database can perform normal read/write operations or other changes, but occasionally encountered metadata locks caused by backup, or found that the transaction is blocked by a binlog lock. Let’s take a look at the Percona XtraBackup backup process and lock timing.
Note: The following analysis of Percona XtraBackup is based on version 2.4.23. Other versions vary slightly, but the key steps are basically the same.
2.1 Copying redo Logs
XtraBackup creates a background thread at the start of a backup that copies the redo log of the database. XtraBackup scans the redo log header to find the current checkpoint LSN. Then, XtraBackup copies all redo logs, including new redo logs, from the current CHECKPOINT LSN. The thread will continue until the non-transactional table is fully copied before safely exiting. The backup log output records the checkpoint LSN when the copy starts. The following output is displayed:
2.2 Copying IBD Files
Before copying ibD files, the database will scan the ibDATA1, undo tablespaces and all IBD files. The corresponding space ids will be recorded. These space ids are needed to find the contents of the doublewrite buffer and the redo log entries during recovery. Then copy ibDATA1, undo tablespaces and all IBD files in a loop. Parallel can be set to multithreaded backup to improve the copy efficiency of physical files. If this parameter is not specified, the default value is 1.
2.3 Copying Non-IBD Files
After all ibD files are copied, XtraBackup starts backing up non-IBD files. The logic of this part is complicated because the lock is required before backing up non-IBD files. Whether the lock is required depends on the –no-lock parameter.
2.3.1 No-lock Option Parameter Description
If –no-lock is set to TRUE, “FLUSH TABLES WITH READ Lock “is not used for global READ locks, but if DDL or DML operations are performed on non-InnoDB TABLES during backup, the backups will be inconsistent and the restored data will be problematic. Therefore, it is not recommended to set –no-lock to TRUE. The default value is FALSE, which means that global read locks are applied before backing up non-IBD files if this option is not specified.
Let’s look at the process logic to determine whether to add a global lock:
The flow chart is as follows:
To sum up:
1) If –no-lock is FALSE (the default value), a global read lock is applied before the file is copied. If –safe-slave-backup is set to TRUE, the SQL_THREAD is closed before the global lock is applied.
2) If –no-lock is TRUE, the lock will not be applied and the file will be copied directly.
2.3.2 Lock processing logic
The lock logic is mainly implemented by lock_tables_maybe. Take a look at the lock_tables_maybe source code as follows:
The lock_tables_maybe function simplifies the process as follows:
1) If the BACKUP instance has been locked (LOCK TABLES FOR BACKUP/FLUSH TABLES WITH READ LOCK) or lock-dcl-per-table is set, the value is returned.
2) LOCK TABLES FOR BACKUP
3) If backup locks are not supported, run FLUSH TABLES WITH READ LOCK. Before performing this operation, the system checks whether DDL/DML is in progress, the waiting timeout period, and whether to kill pending transactions.
Safe-slave-backup = safe-slave-backup = safe-slave-backup = safe-slave-backup
If this parameter is set during the backup operation performed by the secondary database, it prevents the backup failure caused by the synchronization of the primary database operation by XtraBackup.
If –safe-slave-backup is set to TRUE, “STOP slave SQL_THREAD” will be executed and Slave_open_temp_tables will be zero before copying non-IBD files. A zero Slave_open_temp_tables value indicates that all transactions executed by the SQL thread have completed, ensuring consistent backup. There is also no global lock imposed on XtraBackup by executing transaction blocking.
2.4 Backing up Slave and Binlog Information
After non-IBD files are backed up, slave and binlog information is backed up.
- If –slave-info is specified in the command line, the SHOW slave STATUS command is executed to obtain information about the replication and record it in the xtrabackup_slave_info file. Contains the binlog point (Relay_Master_Log_File, Exec_Master_Log_Pos) or GTID value (Executed_Gtid_Set) of the master library synchronized from the library. Xtrabackup_slave_info: Xtrabackup_slave_info:
SET GLOBAL gtid_purged='6b7bda9f-15f0-11ec-ba14-fa163ea367a4:1-83, 9841546e-15f0-11ec-9557-fa163e736db4:1'; CHANGE MASTER TO MASTER_AUTO_POSITION=1Copy the code
- If –binlog-info is specified in the command line, the SHOW MASTER STATUS command is used to obtain the binlog location information and record it to the xtrabackup_binlog_info file. The main information includes the current binlog file name, binlog locus, and the current GTID. Binlog-info does not need to be specified explicitly because its default value is AUTO, or ON if binlog is enabled. The xtrabackup_binlog_info file contains the following contents:
Mysql - bin. 15 f0-11 000004 2004 6 b7bda9f - ec - ba14 - fa163ea367a4:1-83984546 - e - 15 f0 - ec - 9557-11 fa163e736db4:1
Mysql - bin. 15 f0-11 000004 2004 6 b7bda9f - ec - ba14 - fa163ea367a4:1-83984546 - e - 15 f0 - ec - 9557-11 fa163e736db4:1Copy the code
Note that –slave-info or –binlog-info will first apply a LOCK binlog FOR BACKUP on an instance that supports a BACKUP LOCK. This will block any operation that changes the binlog location.
2.5 Backup End
After backing up all the files and binlog information of the database, the backup work is basically completed. After that, the operations are as follows:
1) Run FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS to FLUSH all redo LOGS.
Stop the redo log replication thread.
3) Release global read lock (backup lock), binlog lock;
4) Enable SQL_THREAD.
5) Copy ib_buffer_pool and ib_lru_dump files.
6) Generate configuration file backup-my.cnf;
7) Print the backup information to xtrabackup_info file, which mainly contains the parameters used in the backup, the start and end time of the backup, the binlog point information, and the LSN point to be returned.
Here is part of the xtrabackup_info record:
Percona XtraBackup and lock related parameters and operations
3.1 Related Parameters
- — lock-DDl: If this option is specified on instances that support BACKUP locks, “Lock TABLES FOR BACKUP” will be executed at the start of the BACKUP to block all DDL operations. The lock is released before the redo log thread is created and continues to be locked until the dump buffer pool. Take a look at the relevant log:
- — lock-Dcl-per-table: Apply metadata locks on each InnoDB table at the start of backup to prevent DDL operations on it. The lock is not released until the redo log thread is created and all backups are completed. Take a look at the relevant backup log:
The locking function is mdl_lock_tables, and the locking release function is MDl_unlock_all. The lock release function is mdL_unlock_all, and the lock release function is MDL_unlock_all. The mdl_lock_tables process is as follows:
- –no-lock See section 2.3.1
The above parameters –lock-ddl and –lock-ddl-per-table were added after Percona XtraBackup 2.4.8 because MySQL 5.7 added a feature called Sorted Index Builds. This can cause some DDL operations to fail to log redo, causing backups to fail. Using — lock-DDl or –lock-ddl-per-table will apply a lock at the start of the backup, preventing DDL operations.
In addition, if –lock-ddl or –lock-ddl-per-table is specified when backing up non-IBD files, locking is not required.
3.2 Locking operations performed on the Database
- LOCK TABLES FOR BACKUP: New MDL type LOCK is used to block ALL DMS of non-transactional TABLES and DDL operations of all types of TABLES, but does not affect LOCK free SELECT operations and DML operations on transactional TABLES. LOCK TABLES FOR BACKUP (DDL) or non-transactional TABLES (DML) will block. Show processList (Waiting FOR BACKUP LOCK)
- LOCK BINLOG FOR BACKUP: Another new TYPE of MDL LOCK is used to block all possible changes to the binary log location or Exec_Master_Log_Pos or Exec_Gtid_Set. Any operation under LOCK BINLOG FOR BACKUP that changes the BINLOG point will be blocked. Show processList shows that the blocked thread is Waiting FOR BINLOG LOCK.
- FLUSH TABLES WITH READ LOCK (FTWRL) : Close all open TABLES and LOCK all TABLES on all databases WITH a global READ LOCK. The database is read-only and any DDL/DML operations are blocked. Running show processList shows that the blocked thread is Waiting for global read lock. In addition, because FTWRL needs to close the table, if there is a large query, FTWRL will wait, resulting in a longer DML/DDL jam. Even with the standby database, there are SQL threads copying updates from the primary database and locking globally, causing delays in the primary and secondary databases.
Note: The LOCK TABLES FOR BACKUP and LOCK BINLOG FOR BACKUP statements are executed only on instances that support BACKUP locks, Percona Server for MySQL has started supporting this more lightweight backup lock in 5.6.16-64.0.
Four, thinking
Q1: To what point in time do you restore files backed up by XtraBackup? A1: Restores to the point in time when LOCK BINLOG FOR BACKUP or FLUSH TABLES WITH READ LOCK was executed, because any change to the BINLOG site is blocked. Redo log and BINLOG are identical.
Q2: With binlog enabled, MySQL crash recovery relies on both binlog and redo log. Why does XtraBackup not back up binlog?
A2: Log FOR BACKUP/FLUSH TABLES WITH READ LOCK. This prevents any change to the BINLOG location. This only requires a commit log based on the redo log. Roll back transactions that do not have commit logs.
Q3: After Percona XtraBackup has been backed up, are redo loci the same as or more than binlog?
A3: By analyzing the backup process, it can be found that the binlog loci are backed up before the redo log thread is stopped, and the redo log loci are released after the redo log thread is stopped, so there are more redo logs. Locking a binlog ensures that all committed redo logs have a commit log. Uncommitted transactions do not have a commit log. If Innodb does not commit a new DML redo log after binlog is locked, there is no commit log. If Innodb does not commit a redo log after binlog is locked, there is no commit log. \
Q4: When will Percona XtraBackup be locked, and what are the factors that affect the length of the lock?
A4: According to the above analysis, the locking operation is only performed when non-IBD files are backed up. The locking duration is mainly related to the number and size of non-transaction tables. The larger the number and size of non-transaction tables are, the longer it takes to copy files, the longer the locking time will be. The rate at which the redo log is generated may vary, but the number of redo logs that are not flushed in a timely manner is usually very small.
Q5: Percona XtraBackup or mysqldump
A5: Select Percona XtraBackup first if the whole instance is backed up, because it has the least impact on the database. If you are backing up a database table, this depends on the amount of data. If the amount of data is not large, you can use mysqldump. Note When backing up a database, you are advised to select the secondary database with the least service connections because backup consumes resources to avoid service interruption.
Five, the summary
Based on the above analysis, the XtraBackup backup process is shown below.