Today we are going to take a closer look at some of mysql’s replication mechanisms and what the binlog and relay log structures look like. \

Binlog role

The main purpose of binlog is to record table changes in the database. It only records SQL that have changed the data. SQL that have not changed the data will not be written.

To see the binlog generation process in an actual scenario, prepare SQL:

create table test(text varchar(20));
insert into test values ('test_text');
select * from test;
flush logs;
Copy the code

Check the binlog

show binlog events in 'binlog.000029';
Copy the code

The following information is displayed:

binlog

Alternatively, you can use the mysqlBinlog tool to view the contents of the binlog:

show variables like 'log_%'; Mysqlbinlog --short-form --force-if-open --base64-output=never /usr/local/var/mysql/binlog000029.
Copy the code

From the log we can see that the table creation statement was executed along with a Format_desc header and a Ratate rotation event, which we’ll cover later, looking at what a few fields mean.

Log_name indicates the name of the log file. For example, if you want to query binlog.000029, the default is show binlog events, but this will only query the first binlog, not the current active binlog. If you don’t know what binlogs are, you can use the following command:

show binary logs; #查看binlog列表
show master status; #查看最新的binlog
Copy the code

Pos represents where the file starts.

Event_type Indicates the event type.

Server_id is the ID of the server on which the event was created.

End_log_pos represents the end position of the event in the file. For example, the end position of the first query is 723, and the start position of the file after the second insert is 723.

Info stands for event information and is a readable text content.

Binlog Log structure

The structure of a binlog log is like this. It consists of an index file and a binlog file. The binlog event contains a common header, a commit header, and an event body.

Starting with index files, each line of an index file contains the full name of a binlog file (like host-bin.001), which is affected by commands such as Flush logs that write all logs to disk.

Each binlog file consists of several binlog events. The Format_description event is used as the header and the rotate event is used as the end.

Format_description contains the server information of the binlog file and key information about the file status. If the server is shut down or restarted, a new binlog file is created and a new format_description is written. His format is roughly as follows.

2                binlog-version
string[50]       mysql-server version
4                create timestamp
1                event header length
string[p]        event type header lengths
Copy the code

The log rotation event contains the file name of the next binlog and the location from which it was read. It is added to the end of the file after the server writes the binlog. The rotation event does not exist every time.

if binlog-version > 1 {
8              position
}
string[p]      name of the next binlog
Copy the code

Binlog events contain groups of several transactions, one for each transaction. If create ALTER statements are not part of a transaction statement, they are themselves a group, and each group is either executed at all or not at all.

Binlog event structure

Each binlog event consists of three parts:

  1. Generic header that contains basic information for all events in binlog.
  2. The content of the submission header varies for different types of events
  3. The event body, which stores the main data of the event, also varies for different types of events.

Binlog rotation and cleanup

From the above example, we can see that there is not only one binlog, and based on real scenarios, it is certainly not advisable to write a binlog file all the time, and there are three binlog rotation scenarios:

  1. A new binlog file is generated each time the server is started.
  2. If it reaches the maximum size, you can run the binlog-cache-size command to control the size. If it reaches the maximum size, the device will be replaced.
  3. Display flush, flush logs writes all logs to disk, and a new file is created to write to. As can be seen from the first example, a new log file binlog.000030 is generated after execution and starts at position 4.

As our binlog files grow over time, there are two ways to clear binlogs:

  1. You can set expire-logs-days to control the number of days you want to keep binlog files. The system will automatically clear them.
  2. PURGE BINARY LOGS manually

Relay – log structure

Relay-log is the core of the connection between master and slave. Let’s take a closer look at its structure and use.

image-20200909161115718

Relay-log has a very similar structure to binlog, except that it has a master.info and relay-log.info file.

Master.info records the location of the last binlog read from the master synchronization and all the information necessary to connect to the master and start the replication.

Relay -log.info records the progress of file replication, where the next event starts, and the SQL thread is responsible for updating.

In the last article we mentioned that the entire replication process looks something like this:

Now that we know the structure of binlog and relay-log, let’s rearrange the flow of the whole link. Here we assume that master.info and relay-log.info exist:

  1. The Master receives the client request statement and writes a record to the binary log before the statement ends, which may contain multiple events.
  2. At this point, a Slave connects to the Master, and the Master dump thread reads logs from the binlog and sends them to the Slave I/O thread.
  3. The IO thread reads from master.info to the last position of the last write.
  4. The IO thread writes the log to the relay log, and if it exceeds the specified relay log size, writes the rotation event and creates a new relay log.
  5. Update the last location of master.info
  6. The SQL thread reads from relay-log.info into the last read location
  7. The SQL thread reads the log event
  8. Execute SQL in the database
  9. Update the last location of relay-log.info
  10. Slave Records its own binlog

In this case, IO and SQL threads have the problem of generating duplicate events.

  1. The relay log is first logged, then the master.info location is updated
  2. At this point the server crashes and writing to master.info fails
  3. The server recovers and synchronizes again to get the last location from master.info, causing the event to repeat

Why do it if it’s a problem? If the reverse is done, master. Info is updated before the relay log is logged, and the problem is lost data. Mysql considers loss to be more serious than repetition, so refresh the log first. Mysql decides whether to keep the log large or small.

Long press scan code to add "Python little assistant" ▼ Click to become a community member click on itCopy the code