Welcome to this article on MySQL technology shared by the GreatSQL community. If you have any questions, or topics you would like to see covered, please leave them in the comments section below.

1. Overview

Before we start troubleshooting, we need to know what the Update lifecycle is in MySQL and how MySQL executes a transaction.

Once we understand how a transaction executes, we know where to look when something goes wrong.

2. The Update life cycle

Server layer stage

2.1 Connector

After the client initiates a TCP request, MySQL Server handles the communication protocol, thread handling, account authentication, and security checks.

2.2 Parser

MySQL Server performs lexical analysis on the SQL request (identifying keywords such as SELECT and FROM), then performs syntax analysis to determine whether the statement is syntactically correct.

2.3 Optimizer

The optimizer analyzes the SQL statement, selects an appropriate index, and decides whether to use a full table scan based on the estimated result set.

2.4 Executor

InnoDB engine layer stage

2.4.1 Transaction execution phase

1) After the request enters the InnoDB engine, InnoDB first checks whether the data pages involved in the transaction are in the buffer pool (BP). If not, they are loaded from disk into the BP, and the corresponding index data pages are locked.

Questions to think about:

How is data loaded from disk into the BP? How do pages move between the new (young) and old sublists of the BP's LRU list, and how are they evicted? How are the corresponding rows locked?

Answer:

Index pages read from disk via the B+Tree are loaded into the BP buffer pool as follows:

1. Compute a hash from the space ID and page number to pick the buffer pool instance, then load the index page into it.
2. Check Innodb_buffer_pool_pages_free and Innodb_buffer_pool_wait_free to see whether free pages are available; if not, evict pages by flushing dirty pages or discarding pages from the tail of the old sublist of the LRU list.
3. Take a block from the Free List and insert the data page at the midpoint (the head of the old sublist) of the LRU list.
4. Using binary search, locate the record on the page and try to take an exclusive lock on the row records involved in the transaction:
(1) If a row lock on the record is already held by another transaction, the current transaction enters a lock wait.
(2) After entering the lock wait, deadlock detection checks whether this wait would create a deadlock.
(3) When there is no lock wait and no deadlock would result, the exclusive lock is placed on the row record.
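As a quick sanity check on the free-page situation described in step 2, you can query the related status counters. A minimal sketch using standard MySQL status variables (the exact counter set varies by version):

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';
-- Innodb_buffer_pool_pages_free: free pages currently available
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_wait_free';
-- a steadily growing wait_free counter means writes must wait for page eviction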

2) Write the before-image of the data to the Undo log, so that the row can be rolled back to its pre-modification version after the change.

Questions to think about:

  • Why write Undo logs?

  • How is the Undo log stored?

Answer:

The Undo log is a logical log recorded for each row change. It serves two purposes: rollback and MVCC.

When a transaction needs to be rolled back for some reason, the Undo log can be used to roll it back, ensuring transaction consistency. At the various transaction isolation levels, when a locked row is read, the Undo log can be used to reconstruct an earlier version of the row (for example, at the RR level). The Undo log is not deleted until the transaction completes.

The Undo log is stored using segments in the tablespace.

The InnoDB storage engine manages Undo with segments. These are called rollback segments, and each rollback segment holds 1024 Undo log segments. Since 5.6, innodb_undo_logs defines the number of rollback segments; the default is 128. By default, the Undo log is stored in the shared tablespace; if innodb_file_per_table is enabled, it is stored in an independent tablespace.
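To see how Undo is configured on a given instance, you can list the undo-related variables. A small sketch (stock MySQL variable names; which ones exist varies by version):

SHOW GLOBAL VARIABLES LIKE '%undo%';
-- e.g. innodb_undo_logs (rollback segments), innodb_undo_tablespaces,
-- innodb_undo_directory (where undo tablespaces live)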

3) Write the redo log buffer: modify the data in the BP and write the modified values into the redo log buffer, to be synced to disk asynchronously.

Questions to think about:

  • When is the redo log buffer written?

  • How does the log buffer get flushed to the redo log after a commit?

  • What are the log flush rules?

Answer:

When the redo log buffer is written:

1. Check Innodb_log_waits to determine whether the redo log buffer is large enough (see the check below).
2. Update the field values of the row records on the data page at the midpoint of the old sublist of the BP's LRU list.
3. Write the modified fields into the redo log buffer, and add the length of the current redo entry to the LSN.
4. Because of redo group commit, the redo log buffer generated by this transaction may be flushed and synced to disk together with other transactions' redo.
5. Once the field values are updated in the BP buffer pool, the corresponding data page becomes a dirty page.
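For point 1, this is how the check looks in practice; a minimal sketch with standard variable and status names:

SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';
-- times a write had to wait because the redo log buffer was too small
SHOW GLOBAL VARIABLES LIKE 'innodb_log_buffer_size';
-- if Innodb_log_waits keeps growing, consider a larger innodb_log_buffer_size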

How the log buffer is flushed to the redo log:

1. Logs in the log buffer are written to the log file (the OS buffer at this point), and then the system fsync call flushes them to disk. After a commit, innodb_flush_log_at_trx_commit determines when the log buffer is flushed:
2. Value 1 (the default): every transaction commit writes the log buffer --> then the OS buffer --> then calls fsync to flush the log file to disk.
3. Value 0: a commit writes only to the log buffer --> once per second the buffer is written to the OS buffer and fsync is called to flush to disk (at most 1s of data is lost).
4. Value 2: every commit writes only to the OS buffer; fsync() is then called once per second to write the OS buffer logs to the log file on disk (at most 1s of data is lost).
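For reference, this is how you would inspect and change the setting at runtime; a sketch (innodb_flush_log_at_trx_commit is a dynamic global variable):

SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
-- 1 = safest (flush + fsync on every commit); 0/2 trade up to 1s of redo for speed
SET GLOBAL innodb_flush_log_at_trx_commit = 1;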

Log flush rules:

By default, the log is flushed on every commit, because innodb_flush_log_at_trx_commit defaults to 1. But that covers only the flush triggered by the commit action, which is just one of the InnoDB storage engine's flush rules. The full rules are:
1. After a COMMIT is issued; controlled by innodb_flush_log_at_trx_commit.
2. Periodically, at a frequency determined by innodb_flush_log_at_timeout (default 1 second); this flush has nothing to do with commits.
3. When more than half of the log buffer memory is in use.
4. When there is a checkpoint; a checkpoint indicates, to some extent, the LSN position in the log at the time the data pages were flushed.

Checkpoint flush rules:

In InnoDB, flushing data has only one rule: checkpoint. When a checkpoint is triggered, dirty data pages and dirty log pages in the BP are flushed to disk. There are two kinds:
Sharp checkpoint: when a redo log file is reused (switching redo logs), all dirty pages recorded in that redo log are flushed to disk.
Fuzzy checkpoint: flush a small batch of dirty pages at a time rather than all of them, in several cases:
1. Master thread checkpoint: the master thread flushes a certain percentage of dirty pages to disk every second or every ten seconds.
2. flush_lru_list checkpoint: performed by the page cleaner threads, whose number is set by innodb_page_cleaners.
3. async/sync flush checkpoint: flushes synchronously or asynchronously when the redo log is running short of reusable space.
4. dirty page too much checkpoint: triggered when the dirty page ratio exceeds the threshold controlled by innodb_max_dirty_pages_pct.
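You can watch checkpoint progress in the LOG section of the engine status output; a sketch, assuming a reasonably recent MySQL:

SHOW ENGINE INNODB STATUS\G
-- in the LOG section, compare "Log sequence number" with "Last checkpoint at":
-- the gap is the amount of redo not yet covered by a checkpoint
SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct%';
-- thresholds that drive the "dirty page too much" checkpoint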

4) Write to binlog cache

At the same time, the modification is recorded in event format into the binlog cache, where it waits to be flushed to disk.

If the binlog cache is insufficient, it is written to a binlog temporary file.

Questions to think about:

What is the write flow of a transaction's binlog events?

Answer:

Once a transaction commits, both the binlog cache and the binlog temporary file are released (their contents having been written to the binlog file). Multiple DML statements in the same transaction share the same binlog cache and binlog temporary file.

1. The transaction starts.
2. A DML statement executes; its first execution allocates the binlog cache.
3. Events generated while the DML statement executes are continuously written into the binlog cache.
4. When the binlog cache fills up, its contents are written to the binlog temporary file and the cache is emptied. If the temporary file grows beyond max_binlog_cache_size, error 1197 is raised.
5. When the transaction commits, the binlog cache and the temporary file are flushed into the binlog file; the IO_CACHE and the temporary file are truncated to size 0, and the file descriptor is kept.
6. When the connection is closed, the IO_CACHE is released.
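To see whether transactions are spilling from the binlog cache into temporary files (steps 4-5 above), compare the two cache counters; a minimal sketch with standard status and variable names:

SHOW GLOBAL STATUS LIKE 'Binlog_cache%';
-- Binlog_cache_disk_use / Binlog_cache_use = fraction of transactions that spilled to a temp file
SHOW GLOBAL VARIABLES LIKE '%binlog_cache_size';
-- binlog_cache_size (per-session cache) and max_binlog_cache_size (hard cap; error 1197 beyond it)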

5) Write change buffer

If the transaction modifies a secondary index page that is not in the BP, the change is written to a change buffer page and merged into the secondary index later, when it is needed.

Questions to think about:

  • When is the change buffer used?

  • Why only normal (non-unique) index pages?

  • Which scenarios trigger a change buffer merge?

  • Which workloads are suitable or unsuitable for the change buffer?

  • What are the change buffer parameters?

Answer:

When is the change buffer used?

Before MySQL 5.5 it was called the insert buffer and applied only to inserts; it was later renamed the change buffer and now also applies to delete and update operations.

When a normal (non-unique) index data page is not in the BP and a write targets that page, the page is not loaded from disk into the buffer pool; only the change is recorded in the buffer. When the data is later read, it is merged into the BP. This reduces disk I/O for write operations and improves write performance.

Why only normal index pages:

InnoDB must perform a uniqueness check every time a unique index or primary key index is modified, so the index page has to be read even when it is not in the buffer pool; that random I/O is unavoidable in any case, and buffering the change would save nothing.

Which scenarios trigger a change buffer merge:

1. The data page is accessed.
2. The master thread's periodic operations (roughly every 10 seconds).
3. The database BP is running short of space.
4. The database is shut down normally.
5. The redo log is full (rare; a full redo log would leave the database unable to accept writes).

Which workloads are unsuitable or suitable for the change buffer?

Unsuitable:
1. Most of the database's indexes are unique.
2. Data is read back immediately after being written; in that scenario the change buffer must be merged into the BP right after the write, adding a merge cost on top of the I/O.
Suitable:
1. Most of the database's indexes are non-unique.
2. The workload writes much more than it reads, or does not read immediately after writing.
3. The primary in a read-write-split setup can use it.
The change buffer saves one random I/O per write and turns disk writes into periodic batches.

Change buffer parameters:

show global variables like '%innodb_change_buffer%';
innodb_change_buffer_max_size: the size of the write buffer as a percentage of the buffer pool. The default is 25 and the maximum is 50 (consider raising it for write-heavy, read-light workloads; otherwise keep the default 25).
innodb_change_buffering: configures which write operations use the buffer; can be set to all / none / inserts / deletes, and so on.
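Both parameters are dynamic, so they can be adjusted at runtime; a sketch:

SET GLOBAL innodb_change_buffer_max_size = 50;  -- percent of the buffer pool, for a write-heavy workload
SET GLOBAL innodb_change_buffering = 'all';     -- buffer all change types; the default in most releases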

2.4.2 Transaction commit phase

With the binlog enabled, a transaction commit enters two-phase commit mode (prepare and commit).

The two phases involve two parameters (sync_binlog and innodb_flush_log_at_trx_commit).

1) Transaction commit is divided into a prepare phase and a commit phase (two-phase commit)

A transaction commit uses internal XA between the storage engine and the server layer. The two-phase commit protocol guarantees transaction consistency, chiefly the atomicity of the redo log and the binlog.

2) Redo log prepare

The redo log enters the prepare state and the transaction XID is written. The redo log buffer is flushed to the redo log file on disk so that crash recovery is possible; the flush mode is determined by innodb_flush_log_at_trx_commit.

3) Binlog write & fsync: the executor writes the complete transaction from the binlog cache, together with the XID from the redo log prepare record, into the binlog

The binlog is fsync'ed to disk (large transactions take longer to flush) and the binlog cache is emptied; the dump thread then actively sends the events to the slave's I/O thread. The binlog flush mode is determined by sync_binlog. Once the binlog is written, the transaction is considered successful. In summary: while the transaction executes, its log is written to the binlog cache; at commit time, the binlog cache is written into the binlog file. When sync_binlog is 1, the dump thread is notified for primary/secondary replication only after the binlog has been flushed to disk.
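The classic durability setting is the "double 1" combination of the two parameters discussed above; a sketch of how to check and set it:

SHOW GLOBAL VARIABLES WHERE Variable_name IN ('sync_binlog', 'innodb_flush_log_at_trx_commit');
-- "double 1": fsync the binlog on every commit and flush redo on every commit
SET GLOBAL sync_binlog = 1;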

4) Redo log commit

During the commit phase, the redo log generated by the transaction is synced to disk. Once the redo log is marked commit, the transaction has committed successfully.

5) The transaction commit releases the exclusive locks held on the row records

6) Once the binlog and redo log are flushed to disk, dirty page flushing can be triggered

The dirty page is first copied to the doublewrite buffer, the doublewrite buffer is flushed to the shared tablespace (ibdata), and then the dirty page is flushed to disk; at that point the in-memory page and the on-disk data page are identical.

Questions to think about:

  • What is the dirty page flushing mechanism in the BP?

Answer:

When the percentage of dirty pages in InnoDB exceeds innodb_max_dirty_pages_pct_lwm, flushing to disk begins.
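To see how close an instance is to its flushing thresholds, you can compute the current dirty page ratio; a sketch assuming MySQL 5.7+ where performance_schema.global_status is available:

SELECT ROUND(100 *
       (SELECT VARIABLE_VALUE FROM performance_schema.global_status
         WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_dirty') /
       (SELECT VARIABLE_VALUE FROM performance_schema.global_status
         WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_total'), 2) AS dirty_page_pct;
SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct%';
-- compare dirty_page_pct against the _lwm (low-water mark) and the hard limit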

2.4.3 If the transaction rolls back

If a system exception occurs or an explicit ROLLBACK is issued, all data changes are undone using the data in the Undo log.

For in-place updates, the data is rolled back to the oldest version; for delete + insert style updates, the delete-mark on the old record is cleared, and the newly inserted clustered index and secondary index records are removed.
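A trivial illustration of undo-based rollback, using a hypothetical table t1 (the table and column names are examples, not from the original article):

-- hypothetical table t1(id INT PRIMARY KEY, status INT)
BEGIN;
UPDATE t1 SET status = 2 WHERE id = 1;  -- before-image of the row goes to the undo log
ROLLBACK;                               -- InnoDB replays the undo record
SELECT status FROM t1 WHERE id = 1;     -- shows the original value again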

3. Factors that can delay transaction commit

During transaction execution and commit:

1. Lock wait

1. Insert lock waits and gap lock waits under RR mode.
2. Inserts waiting on MDL (metadata) locks.
3. Table locks.
(A quick way to inspect current waits is shown below.)
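A sketch of how to inspect lock waits, assuming MySQL 8.0 with the sys schema installed (on 5.7 the corresponding views have slightly different names, and the MDL instrument may need to be enabled):

SELECT * FROM sys.innodb_lock_waits\G
-- who blocks whom on row/gap locks, with the blocking query and thread ids
SELECT * FROM performance_schema.metadata_locks
 WHERE LOCK_STATUS = 'PENDING'\G
-- sessions stuck waiting on MDL (e.g. queued behind a long-running DDL)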

2. IO

1. Slow SQL causing high I/O.
2. Other programs on the host consuming a large share of I/O.
3. A relatively low BP hit rate.
4. High concurrency.
5. Insufficient InnoDB buffer pool.
6. Updates or deletes touching a very large number of rows (tens of thousands or more).

3. Buffer

1. Innodb_log_waits, to determine whether the redo log buffer is large enough.
2. The redo log buffer flush policy set by innodb_flush_log_at_trx_commit.
3. Whether the binlog cache is large enough, or whether temporary files are being created and consuming I/O.
4. Whether the change buffer is large enough.

4. Flush-to-disk delay

1. sync_binlog.
2. binlog_group_commit_sync_delay.
3. innodb_flush_log_at_trx_commit.
4. The innodb_buffer_pool hit ratio, which reflects the effect of the dirty page flush frequency.
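These settings can all be read in one statement; a sketch with standard variable names:

SHOW GLOBAL VARIABLES WHERE Variable_name IN
 ('sync_binlog', 'binlog_group_commit_sync_delay',
  'binlog_group_commit_sync_no_delay_count', 'innodb_flush_log_at_trx_commit');
-- a non-zero binlog_group_commit_sync_delay deliberately holds commits to batch fsyncs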

4. Troubleshooting slow updates

Troubleshooting approach:

1. Check the system performance of the instance at the time (I/O, CPU, memory) to rule out system-level interference

• CPU high, I/O high, wa large: first check the number of concurrent SQL statements, then look for slow SQL.
• CPU high, I/O moderate, wa small: first look for slow SQL, then check the concurrency.
• CPU low, I/O high, wa low: check which threads have high I/O usage; the likely cause is the page cleaner or overly frequent log flushing.

2. Check the MySQL status

• Check the mysql processlist for lock waits (table locks, row locks, metadata locks, etc.).
• Check whether the processlist shows many threads in Sending data, init, commit, or cleaning up states.
• Check the processlist for signs of concurrency pressure.
• Check whether the InnoDB buffer pool hit ratio is high enough.
• Check whether MySQL tmp space is sufficient and whether Open_tables has reached table_open_cache.
(The queries below sketch these checks.)
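All of these map to standard commands and status variables; for example:

SHOW FULL PROCESSLIST;                               -- states, lock waits, concurrency
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- hit ratio ~= 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
SHOW GLOBAL STATUS LIKE 'Open_tables';
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';       -- compare with Open_tables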

3. Analyze SQL statements

• Use EXPLAIN to analyze the execution plan: whether an index is used, whether a UNION is involved.
• Use EXPLAIN to analyze joins: whether a large table drives a small one in a multi-table join.
• Check whether the SQL generates extra temporary tables.
• Use profiling to analyze a single SQL statement (see the sketch below).
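A sketch of both techniques on a hypothetical table t1 (the table and column names are examples, not from the original article):

-- hypothetical table t1 with an index on user_id
EXPLAIN UPDATE t1 SET status = 2 WHERE user_id = 100;
-- check the key column: NULL here means a full scan, with far wider locking

SET profiling = 1;                 -- deprecated but still available in 5.7/8.0
UPDATE t1 SET status = 2 WHERE user_id = 100;
SHOW PROFILES;                     -- list recent statements and their durations
SHOW PROFILE FOR QUERY 1;          -- per-stage timing of the chosen statement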

4. Analyze when the application's SQL runs slowly

Observe whether a single SQL statement is slow or all statements are slow. If the slowness follows a regular pattern in time, that helps point to the related MySQL parameters.

5. Packet capture and strace analysis

Use tcpdump to capture packets and determine whether MySQL responds slowly or the network is slow. Use strace to analyze what is slow inside MySQL and which function is responsible.

5. Common problems

1. An update with a simple equality condition (e.g. WHERE col = '1') is unexpectedly slow.

2. Updates causing deadlocks.

3. An update touching a few hundred rows takes 10s.

4. On the same table, some updates are fast and some are slow.

Update problems don't stop there. Hopefully this article has given you a better understanding of how to find, troubleshoot, and resolve the problems updates can cause.

This section listed some common cases to help you learn. We also recommend building your own test environment to explore and discover new problems.

Enjoy GreatSQL 🙂

Recommended articles:

Technology sharing | MGR Best Practice mp.weixin.qq.com/s/66u5K7a9u…

Technology sharing | Wanli Database MGR bug fixes mp.weixin.qq.com/s/IavpeP93h…

Technology sharing | Compiling Percona on macOS, and how some functions behave differently on macOS mp.weixin.qq.com/s/jAbwicbRc…

Technology sharing | Managing multiple standalone MySQL instances with systemd mp.weixin.qq.com/s/iJjXwd0z1…

Product | GreatSQL, building a better MGR ecosystem mp.weixin.qq.com/s/ByAjPOwHI…

Product | GreatSQL MGR optimization reference mp.weixin.qq.com/s/5mL_ERRIj…

About GreatSQL

GreatSQL is a MySQL branch maintained by Wanli Database that focuses on improving the reliability and performance of MGR. It supports the InnoDB parallel query feature and is a MySQL branch suitable for financial-grade applications.

Gitee: gitee.com/GreatSQL/Gr…

GitHub: github.com/GreatSQL/Gr…

WeChat & QQ groups:

You can add the GreatSQL Community Assistant on WeChat and send the verification message "add group" to join the GreatSQL/MGR WeChat group.

QQ group: 533341697 / WeChat assistant: WanliDBC
