Why care about the Mysql architecture?
benefits
Improve usability
If a database instance is faulty, services can be provided by other database instances or quickly switched to other database instances without being aware of services and causing service interruption. It also improves data security and availability by replicating data across multiple instances.
To improve performance
Service data access can be distributed to different database instances. Different access operations can be separated based on different data access types to reduce the access pressure of a single database instance.
Possible solution for Mysql cluster
MySQL Cluster
Provided by Mysql itself, advantages: Very high availability, very good performance. At least one copy of each data can be stored on different hosts, and redundant data copies can be synchronized in real time. However, its maintenance is very complicated and there are some bugs, so it is not recommended for core online system at present.
DRBD Disk network mirroring solution
The Distributed Replicated Block Device, which is implemented to mirror the entire Device (disk) over the network. It allows the user to create a real-time image of the local block device on a remote machine, which can be used in conjunction with heartbeat links, or as a kind of network RAID.
Advantages: With powerful software functions, data can be mirrored across physical hosts at the underlying fast device level, and synchronization at different levels can be configured based on performance and reliability requirements. I/O operations are performed sequentially, which meets the strict requirements of databases for data consistency. However, in a non-distributed file system environment, image data cannot be visible at the same time. As performance and reliability conflict, it cannot be used in an environment with stringent performance and reliability requirements, and the maintenance cost is higher than MySQL Replication. In addition, DRBD is also one of the officially recommended high availability solutions for MySQL, so you can consider whether to deploy it according to your actual environment.
MySQL Replication
In practical application scenarios, MySQL Replication is the most widely used design method to improve system scalability. Many MySQL users have multiplied or even multiplied the performance of existing systems by simply adding inexpensive hardware after improving the scalability of their systems with Replication.
Mysql replication
What is Mysql replication?
Replication refers to transferring the DDL and DML operations of the master database to the replication server (also called slave) via binary logs, and then re-executing (also called redo) those logs on the slave to keep the data in sync with the master. MysQL supports simultaneous replication from one primary database to multiple secondary databases. Secondary databases can also function as the primary databases of other servers to implement chained replication.
Note:
Because MySQL implementation is not fully synchronous replication, so there is difference between master-slave library, from the library on the query operation need to consider the differences in these data, generally only update not frequent data or data of real-time demand is not high can pass from library query, high real-time demand of data from the primary database is still needed.
Name Description:
Data Manipulation Language (DML) :
SELECT, UPDATE, INSERT, DELETE. Mainly used to perform some operations on the data of the database.
Data Definition Language (DDL) Database definition language:
This is the SQL we use to CREATE tables, such as CREATE, ALTER, DROP, etc. DDL is used to define or change the structure of a table, data types, links between tables, and constraints for initialization
benefits
1. If the primary library fails, you can quickly switch to the secondary library to provide services.
2. You can perform query operations on the secondary database to reduce the access pressure on the primary database.
3. Some database maintenance work, such as backups, can be performed on the slave database to avoid impacting the services of the master database during backups.
The principle of overview
(1) First, the main MySQL library will record data changes as Events in the binary log file Binlog when the transaction is committed; The sync_binlog parameter on the MySQL primary library controls the flushing of Binlog logs to disk.
(2) The master library pushes the events in the binary Log file Binlog to the Relay Log of the slave library. Then the slave library reperforms data change operations according to the Relay Log and achieves data consistency between the master and slave libraries through logical replication.
MySQL uses three threads to replicate data between the master and slave libraries: Binlog Dump threads running on the master library, I/0 threads and SQL threads running on the slave library. When replication is started on a slave library, I/0 programs are created to connect to the master library. The master library then creates a Binlog Dump thread to read database events and send them to the I/0 thread. After the I0 thread obtains the event data, it updates it to the RelayLog of the slave library, and then reads the updated database event from the RelayLog RelayLog from the SQL thread of the library and applies it.
You can run the SHOW PROCESSLIST command to check the status of the BinlogDump thread on the primary database. From the status of the BinlogDump thread, you can see that the primary database actively pushes logs to the secondary database. Similarly, l/O threads and SQL threads can be seen from the slave library with SHOW PROCESSLIST. The L /O thread waits for the Binlog Dump thread on the main library. Events are sent and updated to the RelayLog RelayLog, and the SQL thread reads the RelayLog and applies the changes to the database.
Various file parsing in replication
The log file
There are two very important types of Log files involved in replication: binary Log files and Relay logs. The binary log file records all data modification operations in MysQL in binary format, including Create, Drop, Insert, Update, and Delete operations, but the binary log file does not record data Select operation, because the Select operation does not modify the data.
You can view the format of Binlog by running show variables. Binlog supports Statement, Row, and Mixed formats, which also correspond to the three replication technologies of MysQL.
Relay Log the Relay Log file has the same format and content as the binary Log file. The only difference is that the SQL thread on the slave database automatically deletes the Relay Log file after executing the events in the Relay Log file. Prevent the Relay Log file on the slave library from occupying too much disk space. In order to ensure that the I/0 thread and SQL thread of the slave library can still know where to start replication after the slave Crash restarts, two log files master.info and relay_log.info are created on the slave library by default to save the replication progress. These two files record the progress of the l/0 thread from the library reading the Binlog of the master library and the progress of the SQL thread applying RelayLog in the form of files on disk respectively.
You can run the show slave status command to view the status of the slave database.
Main parameters:
Master Host: indicates the IP address of the primary library.
Master User User account used by the Master and slave replication on the Master database.
Master Port: indicates the Port number of the primary MySQL database.
Master_Log_File: the file of the master library Binlog that the I/0 thread from the library is currently reading.
Read_Master Log_Pos: the location currently read from the library I/0 thread.
Relay_Log_File: file name of the Relay Log being read and applied from the library SQL thread.
Relay_Log_Pos: Location of the Relay Log currently read and applied from the library SQL thread.
Relay_Master_Log_File: the file name of the Relay Log being read and applied by the slave SQL thread corresponds to the master Binlog.
Exec_Master_Log_Pos: Relay_Log_Pos in RelayLog corresponds to the position of the primary Binlog.
Three replication techniques
The binary log file Binlog has three formats:
Statement: A Binlog is created based on the SQL Statement level. Each SQL Statement that modifies data is saved in the Binlog.
Row: Records changes to each Row on a row-level basis. That is, the changes to each Row are recorded in a Binlog, in great detail, but not in raw SQL. During replication, there is no inconsistency between the master and slave libraries due to stored procedures or triggers, but the amount of logs recorded is much larger than in Statement format.
Mixed: Mixed Statement mode and Row mode. By default, Statement mode is used for recording, but in some cases, Row mode is switched
At the same time, it also corresponds to the three technologies of MysQL replication.
When binlog_format is set to Row, MySQL actually records data changes in Binlog Row by Row. Row is more consistent than Statement (copying records rather than simply manipulating SQL). Of course, the size of binlogs in Row format can grow considerably, and disk space needs to be taken into account.
The binlog_format parameter can be set globally or dynamically in the current session: setting the format globally affects all sessions, while setting the format in the current session only affects the current session. The format of a binary log file (Binlog) can be changed in real time with the SET command.
Relevant command
View the current replication mode
show variables like ‘%binlog%format%’;
Changing the Replication Mode
set global binlog_format = ‘ROW’;
set global binlog_format = ‘STATEMENT’;
Common replication architectures
The three common architectures for replication are one-master, multi-slave, and dual-master /DrualMaster
From more than a master
In the main library read request under pressure is very big scene, more than a master can be configured from the replication separation architecture implementation, speaking, reading and writing, and put a lot of read request is not particularly high real-time demand by load balance distribution to more than one from the library, read the pressure to reduce the main library, in the main library under the condition of the abnormal downtime, can put a switch from library is given priority to continue to provide services.
Multistage copy
The one master, many slave architecture can solve the requirements of most scenarios with particularly high read request pressure. Given that MysQL replication is done by “pushing” Binlog logs from the master to the slave, I/0 stress and network stress on the master increases as the slave grows (each slave has a separate Binlog Dump thread on the master to send events), The multilevel replication architecture solves the extra I/0 and network pressure of the master library in the scenario of one master and multiple slaves.
Dual Master replication /Dual Master
In fact, Master library and Master2 are Master and slave to each other. Client clients can access Master library for write requests, and Master library or Master2 for read requests.
Dual-master multi-level replication architecture
Of course, dual-master replication can also be used in combination with master-slave replication: Configure Slave libraries Slave and Slave2 under Master2 library, so that the Slave library Slave can share the load of reading. The dual-master multi-level replication architecture of MyQL is shown in the figure