The interview questions


Have you done MySQL read-write separation? How to implement MySQL read-write separation? How does MySQL master slave replication work? How to solve the delay of MySQL master/slave synchronization?

Interviewer psychoanalysis


In this phase of high concurrency, it is necessary to do read/write separation. Because in fact, most Internet companies, some websites, or apps, actually read more than they write. Therefore, in this case, it is to write a master library, but the master hangs multiple slave libraries, and then reads from multiple slave libraries, which can support higher read concurrency pressure?

Analysis of interview questions


How to implement MySQL read-write separation?

In fact, it is very simple, based on the master-slave replication architecture, in simple terms, we have a master library, multiple slave libraries, and then we just write to the master library, and then the master library will automatically synchronize data to the slave library.

How does MySQL master slave replication work?

The master library writes the changes to the binlog log. After the slave library connects to the master library, the slave library has an IO thread that copies the binlog of the master library to its own local and writes the changes to a relay log. Then an SQL thread from the slave library reads the binlog from the slave log and executes the contents of the binlog, i.e. executes the SQL locally to ensure that it is the same as the data from the master library.

One important point here is that synchronizing data from the master library is serialized, meaning that operations that are parallel on the master library are executed serially on the slave library. Therefore, this is a very important point. Due to the characteristics of copying logs from the master library and serial execution of SQL, in high concurrency scenarios, data from the slave library will be slower than that from the master library, there is a delay. So often, data that was just written to the main library may not be read, and it may take tens or even hundreds of milliseconds to read.

Another problem is that if the master database suddenly goes down and data is not synchronized to the slave database, some data may not be available on the slave database, and some data may be lost.

So MySQL actually has two mechanisms in this piece, one is semi-synchronous replication, which is used to solve the problem of primary database data loss; Parallel replication is used to solve the delay of master/slave synchronization.

Semi-synchronous replication, also known as semi-sync replication, means that after the master database writes to the binlog, it forces data to be synchronized to the slave database immediately. After the slave database writes the log to its own local relay log, it returns an ACK to the master database. The master does not consider the write operation complete until it receives at least one ACK from the slave.

Parallel replication means that multiple threads are started from a library, logs of different libraries are read in parallel in relay logs, and logs of different libraries are replayed in parallel. This is library-level parallelism.

MySQL master/slave synchronization delay

We did deal with the online bug caused by the delay of master/slave synchronization in the past, which was a small production accident.

This is the scene. One student wrote code logic like this. Insert a piece of data, look it up, and update it. At the peak of the production environment, when the write is up to 2000/s, the master-slave replication latency is about tens of milliseconds smaller. Online, there’s always some data every day, and we expect to update some important data status, but we don’t update it at peak times. Users give feedback to customer service, and customer service gives feedback to us.

We use MySQL:

show slave status

Looking at Seconds_Behind_Master, you can see that the secondary database is several ms behind the primary database.

Generally speaking, if the master-slave delay is severe, the following solutions are available:

  • Split a master library into multiple master libraries, the write concurrency of each master library is reduced several times, at this time the master/slave delay can be ignored.
  • Enable parallel replication for multiple libraries supported by MySQL. If the write concurrency of a library is extremely high, and the single-library write reaches 2000/s, parallel replication still doesn’t make sense.
  • Rewrite code, code writing students, be careful, insert data immediately query may not be found.
  • If it does exist and must be inserted first, ask for the query immediately, and then immediately do something in reverse, setting the query directly to the main library. This is not recommended, because if you do it this way, the point of reading and writing separation is lost.