
Hello, I’m Leo. I currently do Java backend development in Changzhou. This is the second article in my MySQL Learning series, which is written around interview questions from ByteDance, NetEase, Alibaba, Tencent, Meituan, and Kuaishou. I hope it helps more friends land a big-tech offer sooner!

After reading this article, don’t say you can’t learn MySQL.

4. Logs

4.1 Error Logs

The MySQL error log records serious warnings and errors that occur while MySQL is running, as well as the details of every startup and shutdown. The error log is usually named hostname.err.

You can find the location of the error log using the following SQL.

show variables like '%log_error%';

The error log keeps growing until it is rotated or deleted. Prior to MySQL 5.5.7, the error log could be removed with the mysqladmin -uroot -p flush-logs command. In MySQL 5.5.7 and later, you can only archive and back it up as follows:

shell> mv host_name.err host_name.err-old
 
shell> mysqladmin -u root -p flush-logs
 
shell> mv host_name.err-old backup-directory

The error log can be given any name: add log_error=/u02/mysql/mysql.err to the /etc/my.cnf configuration file and restart MySQL.
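
For example, the relevant my.cnf fragment would look like this (a minimal sketch, using the path above):

[mysqld]
log_error = /u02/mysql/mysql.err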

4.2 General Query Log

The MySQL general query log records every request the database receives, whether or not the request executes successfully. The default file name is hostname.log. It is disabled by default, and enabling it in production significantly hurts performance.

Since it is not used much, I won’t go into more detail.
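
For completeness, here is a quick sketch of checking and toggling it at runtime (keep the performance cost in mind):

show variables like '%general_log%';
set global general_log = 1;   -- enable (not recommended in production)
set global general_log = 0;   -- disable again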

4.3 Slow Query Log

MySQL’s slow query log records statements whose response time exceeds a threshold: specifically, any SQL statement that runs longer than long_query_time is written to the slow query log.

The default value of long_query_time is 10, meaning statements that run for more than 10 seconds are logged. The slow query log is disabled by default; you must enable it manually. Unless you need it for tuning, enabling it is not recommended, because it can affect performance. Slow query records can be written to files or to database tables.

Important parameters for the slow query log (a sketch of setting them follows the list):

  • slow_query_log: whether the slow query log is enabled; 1 enables it, 0 disables it.
  • long_query_time: the slow query threshold; statements that run longer than this generate a log entry.
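
A minimal runtime sketch (the threshold value is illustrative):

set global slow_query_log = 1;       -- 1 = on, 0 = off
set global long_query_time = 2;      -- log statements slower than 2 seconds (default is 10)
show variables like '%slow_query%';  -- confirm the setting and the log file location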

The main tool for analyzing the slow query log:

  • mysqldumpslow

A commonly used invocation:

mysqldumpslow -s r -t 20 /mysqldata/mysql/mysql06-slow.log | more

-s specifies how to sort:

  • c: query count
  • l: lock time
  • r: records returned
  • t: query time
  • al: average lock time
  • ar: average records returned
  • at: average query time

-t means top N, that is, return the first N entries.

Then comes the path of the slow query log file.

Finally, it is recommended to pipe the output through | more when running this command; otherwise the output may flood the screen.

4.4 Redo Log

The redo log involves two parts: the redo log buffer and the redo log file. The former is an in-memory cache of log records and is volatile; the latter is the on-disk log file and is persistent.

Write mechanism

While a transaction executes, the redo log records it generates need a temporary home, because you cannot write to the redo log file before the transaction has committed. That temporary home is the redo log buffer. The actual write to the redo log file happens at the commit step. (A single SQL statement executed on its own is also a transaction.)

For log records that have not yet reached the commit step, there are two main possibilities:

  1. MySQL crashes and the buffered log is lost; since the transaction never committed, nothing of value is lost.
  2. The log is persisted to disk ahead of time anyway (explained below).

A redo log record can be in three states:

  • In the redo log buffer: physically, this is MySQL process memory.
  • Written (write) but not persisted (fsync): physically, this is the file system’s page cache.
  • Persisted to disk.

  • The first two states are in-memory operations that involve no disk IO, so they are very fast; the write step is also very fast.
  • fsync is much slower, because it actually persists the data to disk.

Write policy

The write policy is controlled by the parameter innodb_flush_log_at_trx_commit:

  • 0: at each transaction commit, the redo log is only left in the redo log buffer.
  • 1: at each transaction commit, the redo log is persisted (fsync) to disk.
  • 2: at each transaction commit, the redo log is only written to the page cache.

In addition, InnoDB has a background thread that, once per second, writes the logs in the redo log buffer to the file system’s page cache and then calls fsync to persist them to disk.
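
A quick sketch of inspecting and setting the policy at runtime (1 is the safe default):

show variables like 'innodb_flush_log_at_trx_commit';
set global innodb_flush_log_at_trx_commit = 1;  -- fsync the redo log on every commit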

Other flush triggers

Besides the once-per-second background flush, there are two other scenarios in which an uncommitted transaction’s redo log can reach disk:

  • When the redo log buffer is about to reach half of innodb_log_buffer_size, the background thread proactively writes it to disk.

(Note that since the transaction has not committed, this is only a write, with no fsync call: the log goes no further than the file system’s page cache.)

  • When another transaction commits in parallel, its commit persists the whole redo log buffer, taking this transaction’s logs along with it.

For example, suppose transaction A is half-way through and has written some redo logs into the buffer, and another transaction B then commits. If innodb_flush_log_at_trx_commit is set to 1, B’s commit persists the redo log buffer to disk, so transaction A’s logs in the buffer are persisted as well.

Group commit mechanism

The log sequence number (LSN) is monotonically increasing and marks write positions in the redo log: each time a redo log of length length is written, length is added to the LSN. The LSN is also written into InnoDB data pages, which ensures that a redo log is not applied to a data page twice.

Consider the classic example of three concurrent transactions (trx1, trx2, trx3) committing together:

  • trx1 arrives first and is chosen as the leader of the group.
  • By the time trx1 starts writing to disk, the group already holds three transactions, and the LSN has grown to 160.
  • So when trx1 returns, all redo logs with an LSN less than or equal to 160 have been persisted to disk.
  • trx2 and trx3 can then return directly.

Therefore, the more members a group commit gathers, the more disk IOPS it saves. Under single-threaded pressure, however, you can honestly only persist once per transaction.

In concurrent update scenarios, the later fsync is called after the first transaction writes its redo log buffer, the more group members accumulate, and the more IOPS is saved.

4.5 Binlog (Archive Log)

Write mechanism

The binlog write mechanism is the easier one. Any mention of the binlog has to mention the binlog cache, so what is the binlog cache?

The binlog cache is a per-thread buffer for the binary log, whose size is controlled by the parameter binlog_cache_size.

A transaction’s binlog must not be split apart during execution: no matter how big the transaction is, its binlog has to be written out in one piece at commit. That is what the binlog cache is for. If a transaction’s binlog exceeds binlog_cache_size, the excess is temporarily spilled to disk. When the transaction commits, the contents of the binlog cache are written to the binlog file and the cache is emptied.
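
To check whether transactions are overflowing the binlog cache to temporary disk files, compare these status counters (a quick diagnostic sketch):

show variables like 'binlog_cache_size';
show global status like 'Binlog_cache%';  -- compare Binlog_cache_disk_use with Binlog_cache_use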

Each binlog cache is owned by a single thread, while all threads share one binlog file. The write step, which copies a binlog cache into the binlog file, is very fast, because it only reaches the file system’s page cache and involves no disk IO.

Data is persisted to disk only when fsync is executed; that is when disk IO, i.e. IOPS, is actually consumed.

So when does write happen, and when does fsync?

This is controlled by the parameter sync_binlog:

  • 0: every committed transaction only does write, never fsync.
  • 1: every committed transaction does fsync.
  • N (N > 1): every committed transaction does write, and fsync runs once every N transactions.

Therefore, in IO-bound scenarios, setting sync_binlog to a larger value can improve performance. In real business scenarios, setting it to 0 is not recommended, because the amount of log that can be lost becomes uncontrolled; a value between 100 and 1000 is more common.

The trade-off: with sync_binlog set to N, an abnormal host restart loses the binlog of up to the last N transactions.
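
An illustrative adjustment (the value 100 is just an example from the range above):

show variables like 'sync_binlog';
set global sync_binlog = 100;  -- fsync once per 100 commits; up to 100 transactions' binlog can be lost on a crash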

Group commit

The binlog also supports group commit. Writing the binlog is split into two steps:

  • write: copy the log from the binlog cache into the binlog file (the page cache).
  • fsync: persist it to disk.

In the two-phase commit timeline, MySQL deliberately delays the fsync calls: the redo log and binlog are each written first and only fsynced later (steps 3 and 4 in the original figure).

If, by the time the binlog fsync at step 4 runs, several transactions have already finished their binlog write, their binlogs are persisted to disk together. This is exactly the optimization that saves IOPS.

However, step 3 (the redo log fsync) usually executes quickly, so the gap between a binlog’s write and its fsync is short, fewer binlogs can be gathered into one persistence, and binlog group commit is usually not as effective as the redo log’s.

To improve the effect of binlog group commit, you can set binlog_group_commit_sync_delay and binlog_group_commit_sync_no_delay_count; fsync is called as soon as either condition is met (a sketch follows the list):

  • binlog_group_commit_sync_delay: how many microseconds to wait before calling fsync;
  • binlog_group_commit_sync_no_delay_count: after how many transactions have accumulated, fsync is called without waiting out the delay.
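
A sketch of tuning both knobs (the values are illustrative only):

set global binlog_group_commit_sync_delay = 100;          -- wait up to 100 microseconds before fsync
set global binlog_group_commit_sync_no_delay_count = 10;  -- or fsync as soon as 10 transactions are queued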

The performance of the WAL mechanism mainly comes from two points:

  • Both the redo log and the binlog are written sequentially, and sequential disk writes are faster than random writes.
  • The group commit mechanism greatly reduces disk IOPS consumption.

4.6 Undo Log (Rollback Log)

The undo log has two main functions: rollback and multi-version concurrency control (MVCC).

When data is modified, not only the redo log but also the undo log is recorded. If a transaction fails or is rolled back for some reason, the undo log can be used to roll the data back.

The undo log mainly stores logical logs. For example, when we insert a row, the undo log records a corresponding delete; when we update a record, it records the reverse update.
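
Conceptually, it looks like this (a sketch of the logical records only, not the actual undo log format; the table t is hypothetical):

-- statement executed                        -- logical undo record generated
insert into t values (1, 'a');               -- delete from t where id = 1
update t set v = 'b' where id = 1;           -- update t set v = 'a' where id = 1
delete from t where id = 1;                  -- insert into t values (1, 'b')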

This should be easy to understand: a rollback simply applies the opposite of each modification, which is exactly how the rollback is achieved. Because rollback is supported, we can guarantee that when a transaction contains multiple operations, either all of them execute or none of them do. [Atomicity]

Since the undo log keeps the data as it was before modification, it effectively preserves previous versions of each row. MVCC uses this to implement non-blocking reads and writes: a read simply returns the data of an earlier version.

5. Skip Lists and Back-to-Table Lookups

5.1 Why

Skip lists

A skip list is another data structure designed for fast lookup.

When we search a linked list, the time complexity is O(n). Skip lists were created to solve this efficiency problem. A skip list is essentially a multi-level linked list that adds redundancy in exchange for faster search: the classic space-for-time idea. It does not actually cost much space, though, because only pointers to nodes are duplicated, not the data itself.

Advantages

  • Compared with red-black trees, skip lists are easy to implement (you can hand-write one in an interview), and insertion and deletion are not hard either. The sheer number of rotations in a red-black tree is easy to get confused by.
  • The data keeps itself sorted, much like MySQL’s B+ tree, which orders keys from smallest to largest by default. This makes fast range lookups possible without an explicit sort.

Comparing a linked list and a skip list


The query process

Suppose we search for 21:

  • Linked list: the path is 3 → 6 → 7 → 9 → 12 → 17 → 19 → 21.
  • Skip list: 6 → 9 → 17 → 21.

The core idea of a skip list is to build up index levels that accelerate search and insertion. Starting from the top level: if the next node’s key on the current level is less than or equal to the target key, move forward one node; if it is greater, drop down one level and continue comparing. The search always ends on the bottom level.

Insertion process


Determine the number of levels K the new element will occupy (by coin flipping, i.e. completely at random).

Then insert the element into the linked list at each of Level 1 through Level K.

An update array records the insertion position on each level: starting again from the top level, find where the element belongs on each level, then generate the level count and insert.

Example: insert 119, K = 2

Deletion process

The process is similar to insertion.

Back-to-table lookups

Let’s take a common example. We have all been writing login features since we first started coding. So how should we query the account and password at login time?

With a small amount of data, anything works; once the data volume grows, an index becomes a must. The question is how to build that index!

If you index only the account column, you get a back-to-table operation.

MySQL indexes are B+ trees. If you index account, the account values form their own index tree. A login query asks for account + password, but password is not in that tree, so MySQL has to go back to the table, look up the password field, and piece the row together.
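
To make this concrete, here is a hypothetical login table (the schema and names are invented for illustration; real systems should store a password hash, not plaintext):

create table user_account (
  id       bigint primary key auto_increment,
  account  varchar(64)  not null,
  password varchar(128) not null,
  key idx_account (account)   -- secondary index on account only
);

-- account is found in idx_account, but password lives only in the
-- clustered index, so MySQL must go back to the table for it:
select account, password from user_account where account = 'leo';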

5.2 How to Avoid It

A back-to-table lookup adds disk IO overhead, so avoiding it is one way to optimize MySQL. Sticking with the login example: account + password is a high-frequency query, so create a composite index on account + password to avoid the back-to-table lookup.
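
Continuing the hypothetical table above, a composite (covering) index satisfies the query entirely from the secondary index:

alter table user_account
  drop index idx_account,
  add index idx_account_password (account, password);

-- "Using index" in the Extra column of the plan confirms there is no back-to-table step:
explain select account, password from user_account where account = 'leo';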

The rest depends on the business scenario. For example, the Software Designer exam’s official website logs in with ID card number + password. Every scenario is different.