Problems solved by master-slave replication
- Data distribution: The distribution of data to different geographical locations through replication
- Load balancing: Read/write separation and load reading to multiple slave libraries
- Backup: Can be used as real-time backup
- High availability: Use master master replication to achieve high availability
Replication Principle The principle of replication is actually very simple, only divided into the following three steps:
- The binary log is recorded by the master database in the order in which the transaction is committed. The binary log is recorded by the master database in the order in which the transaction is committed. After logging, the master library notifies the storage engine to commit the transaction.
- The slave library starts an IO thread that connects to the master library. The binlog dump thread on the primary library reads the update events in the local binlog file. After receiving logs from the slave library, the logs are recorded in the local relay log relay-log.
- The events in the relay log relay-log are read from the SQL thread in the library and replaced into the slave library. (Prior to version 5.6, SQL threads were single-threaded, resulting in greater latency between master and slave.)
What exactly is recorded in the log file for both replication modes? Mysql supports two log formats, and both log formats reflect their own replication modes
Statement-based replication Statement-based replication is equivalent to logical replication, that is, statements that record operations in binary logs are replicated in the slave library.
In this mode, binary logs occupy less space and have a low bandwidth and high transmission efficiency. But statement-based updates that depend on other factors, such as inserting data using the current time of the timestamp function call as the time value, can also be problematic because of inconsistent time values due to delays between master and slave. Stored procedures and triggers can also have problems.
So in development we should try to put the logic in the code layer, not in mysql, not easy to expand.
Row-based replication Row-based replication is equivalent to physical replication, that is, binary logs record every row of the actual updated data. As a result, the pressure of row replication is high because the log occupies a large space and the transmission occupies a high bandwidth. But it is more accurate than statement based replication and can mask some inconsistencies caused by differences between master and slave libraries. Such as the timestamp function mentioned earlier.
Comparison: statement replication
High transmission efficiency, reduce delay. Statement assignment does not fail when updating non-existent records from the library. Row replication, on the other hand, leads to failure, which leads to earlier discovery of master/slave inconsistencies. Let's say there are a million data pieces in the table, and a SINGLE SQL update for all the tables. A statement based replication would send only one SQL, whereas a row based replication would send a million update recordsCopy the code
Line copy
No query plan needs to be executed. I don't know exactly what statement is being executed.Copy the code
For example, a statement that updates a user’s total score needs to count all the user’s points before writing them to the user table. In the case of statement-based replication, the user’s credits need to be counted again from the slave library, while row-based replication updates the record directly without counting the user’s credits.
Mysql dynamically switches between the two replication modes because both modes have their advantages and disadvantages. The default is statement.
Configuration points
Server_id =1 binlog_format=mixed # After n transaction commits, Mysql will perform a disk synchronization command of fsync. Flushes buffer data to disk. If # is 0, Mysql controls the frequency itself. If sync_binlog=n # is 0, the log buffer will be written to the log file and flushed to disk once per second. The # mysqld process crashes and loses all transactions for a second. If # is 1, the log buffer will write the log file and flush to disk each time. Only one transaction is lost in a crash. Innodb_flush_logs_at_trx_commit =0 innodb_flush_logs_at_trx_commit=0 Automatic replication after a crash may cause more problems. Skip_slave_start =1; skip_slave_start=1; skip_slave_start=1 Log_slave_update # Log deletion expiration time. If the delay is severe, log files will occupy disks expire_logs_days=7Copy the code
The three parameters of Innodb_flush_logs_at_trx_COMMIT are easily confused. The following is a detailed analysis:
Mysql first writes logs to the log buffer, and then writes data from the log buffer to the log file. In this case, the log file is written in memory. Finally, the operating system needs to write data in memory to disk. Parameter 0: mysql writes data from the log buffer to the log file and flusher it to disk every second. This means that when mysql crashes, it will lose all transactions for a second. Parameter 1: The log buffer is written to the log file and flushed to disk for each transaction commit. This means that only one transaction is lost when mysql crashes. Argument 2: Each transaction will write the log buffer to the log file but not to disk at the same time. Mysql controls the writing of the log file to disk every second. When mysql crashes and the operating system does not crash, log_file will only lose one transaction. The operating system still brushes log files to disk, and if the operating system crashes or goes out of power, it loses transactions for a second.Copy the code
Recommended use:
innodb_flush_logs_at_trx_commit=2
sync_binlog=500
Copy the code
Performance will be faster
innodb_flush_logs_at_trx_commit=1
sync_binlog=1
Copy the code
safer
Latency issue
When the TPS concurrency of the master library is high, the slave database SQL may not be able to keep up with the master database because the master database is written by multiple threads while the slave DATABASE SQL thread is single-threaded (producers are faster than consumers, resulting in goods piling up).
Delayed resolution
- Network: Slave libraries are distributed on the same LAN or in an environment with low network latency.
- Hardware: Configure better hardware from the library to improve random write performance.
- Configuration:
From library configuration
Sync_binlog =0 innodb_flush_log_at_trx_commit=2 logs-slave-updates=0 Increment innodb_buffer_pool_sizeCopy the code
Let more operations be done in Mysql memory and less disk operations. Or upgrade Mysql5.7 to use parallel replication.
Architecture: for example, in the transaction, try to read and write to the master library, and in other non-transactions, read to the slave library. Eliminate database inconsistencies caused by partial latency. Increasing the cache reduces some of the load on the slave library.