The storage engine

Many articles jump straight into specific storage engines without first explaining what a storage engine actually is. So what exactly is a storage engine? Have you ever wondered how MySQL stores the data we throw at it?

In fact, the idea is simple: a storage engine is a storage solution. It implements how data is inserted, updated, indexed, and so on.

What existing storage engines do we have to choose from?

InnoDB, MyISAM, Memory, CSV, Archive, Blackhole, Merge, Federated, Example

There are many of them, but the only ones in common use are InnoDB and MyISAM, so I'm going to focus on those two.

InnoDB is the most widely used storage engine for MySQL, and it has been the default since version 5.5. Do you know why it is so widely used? Let's put that question aside for now and take a look at the underlying principles of the InnoDB storage engine.

InnoDB's memory architecture is divided into three main parts: the buffer pool, the redo log buffer, and the extra memory pool.

The buffer pool

InnoDB persists data to disk. However, when faced with a large number of requests, the gap between CPU processing speed and disk I/O speed is too large. To improve overall efficiency, InnoDB introduced the buffer pool.

When a query arrives and the requested data is not in the buffer pool, InnoDB reads it from disk and places the matching pages into the buffer pool. Similarly, when a request modifies data, MySQL does not modify the disk directly; it modifies the page already in the buffer pool and flushes it back to disk later. That is the buffer pool: it speeds up reads, speeds up writes, and reduces I/O interaction with the disk.
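The read-through, write-back behavior described above can be sketched in a few lines of Python. This is an illustration of the caching idea only, not InnoDB's actual implementation; the `disk` dict stands in for the data file.

```python
class BufferPool:
    """Sketch: read-through cache with write-back.

    Reads fetch from disk only on a miss; writes modify the in-memory
    page and are flushed back to disk later in one batch.
    """

    def __init__(self, disk):
        self.disk = disk          # dict page_no -> data, stands in for the data file
        self.pages = {}           # cached pages
        self.dirty = set()        # pages modified in memory but not yet on disk
        self.disk_reads = 0

    def read(self, page_no):
        if page_no not in self.pages:         # miss: one disk I/O
            self.disk_reads += 1
            self.pages[page_no] = self.disk[page_no]
        return self.pages[page_no]            # hit: no disk I/O

    def write(self, page_no, data):
        self.read(page_no)                    # make sure the page is cached
        self.pages[page_no] = data            # modify in memory only
        self.dirty.add(page_no)

    def flush(self):
        for p in self.dirty:                  # write dirty pages back in batch
            self.disk[p] = self.pages[p]
        self.dirty.clear()
```

Repeated reads of the same page cost a single disk I/O, and a write touches the disk only at flush time.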

The buffer pool essentially pulls data from disk into memory, and since memory is finite, it will eventually run out of free space. So the buffer pool adopts the LRU algorithm: when there are no free pages left, it evicts the least recently used ones. But the naive algorithm has a problem known as buffer pool pollution.

When you do a large batch scan, or even a full table scan, you might replace all the hot pages in the buffer pool, which can cause MySQL performance to drop precipitously. So InnoDB applies some optimizations to LRU to avoid this problem.
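The core of InnoDB's optimization is midpoint insertion: newly read pages enter the head of an "old" sublist, and only a second access promotes a page into the "young" (hot) sublist, so pages touched once by a scan never displace hot pages. Here is a heavily simplified Python sketch of that idea (the real algorithm also involves `innodb_old_blocks_time` and sublist rebalancing, omitted here):

```python
from collections import OrderedDict

class MidpointLRU:
    """Simplified sketch of midpoint-insertion LRU.

    First access puts a page in the 'old' sublist; a second access
    promotes it to the 'young' sublist. A one-off full table scan
    therefore churns only the old sublist and cannot evict hot pages.
    """

    def __init__(self, young_cap, old_cap):
        self.young = OrderedDict()   # hot pages, most recently used at the end
        self.old = OrderedDict()     # pages seen only once so far
        self.young_cap = young_cap
        self.old_cap = old_cap

    def access(self, page):
        if page in self.young:                      # already hot: refresh position
            self.young.move_to_end(page)
        elif page in self.old:                      # second touch: promote
            del self.old[page]
            self.young[page] = True
            if len(self.young) > self.young_cap:
                self.young.popitem(last=False)      # evict young-sublist LRU
        else:                                       # first touch: into old sublist
            self.old[page] = True
            if len(self.old) > self.old_cap:
                self.old.popitem(last=False)        # evict old-sublist LRU
```

After warming two hot pages with repeated accesses, a simulated full table scan of 100 single-touch pages leaves the hot pages untouched in the young sublist.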

MySQL logs first, writes data second. Before modifying a page, InnoDB records the change in the redo log, and it periodically flushes dirty pages to disk using the checkpoint mechanism.

Besides data pages, the buffer pool also holds index pages, undo pages, the insert buffer, the adaptive hash index, InnoDB lock information, and the data dictionary. Let me briefly cover a few of the more important ones.

The insert buffer

The insert buffer applies to inserts and updates. Consider the worst-case scenario, where the page that needs to be updated is not in the buffer pool. There are two options:

  1. Write each piece of data directly to disk
  2. Buffer the data and write it to disk in a batch once it reaches a certain threshold (for example, 50 entries)

Obviously, the second option is better: it reduces the number of disk I/O interactions.
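The batching idea can be illustrated with a short Python sketch. The threshold of 50 is the hypothetical number from the example above, not an InnoDB parameter, and `disk_writes` counts I/O round trips rather than rows:

```python
class InsertBuffer:
    """Sketch of option 2: buffer changes, merge them into one disk write."""

    def __init__(self, threshold=50):
        self.pending = []          # buffered changes not yet on disk
        self.threshold = threshold
        self.disk_writes = 0       # number of I/O round trips, not rows

    def insert(self, row):
        self.pending.append(row)
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.disk_writes += 1  # one merged write for the whole batch
            self.pending.clear()
```

With a threshold of 50, inserting 100 rows costs 2 disk writes instead of 100.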

Doublewrite

Since we've talked about the insert buffer, I have to mention doublewrite as well, because I think these two InnoDB features complement each other.

The insert buffer improves MySQL's performance, while doublewrite improves data reliability on top of that. We know that if the machine goes down while data is still only in the buffer pool, the redo log can recover it. But what if it goes down while data is being flushed from the buffer pool back to disk?

This condition is called a partial write failure (a torn page), and the redo log alone cannot solve it: redo records describe changes applied to a page, so they are useless if the page itself was corrupted mid-write.


When dirty pages are flushed, they are not written directly to their final location on disk. Instead, they are first copied to the doublewrite buffer in memory, then written to the shared tablespace on disk (you can think of it simply as a disk area), and only then written to their actual positions in the data file.

With the doublewrite mechanism, even if the machine goes down while dirty pages are being flushed, the page copy in the shared tablespace can be retrieved during instance recovery and used to overwrite the corrupted data page.
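The two-step write and the recovery path can be sketched in Python. This is a toy model of the mechanism just described, with dicts standing in for the shared tablespace and the data file; the `crash_mid_write` flag simulates a torn page:

```python
class DoublewriteDisk:
    """Sketch: write the page copy to a shared area first, then in place.

    If a crash tears the in-place write, recovery compares each page
    against its copy and restores the intact version.
    """

    def __init__(self):
        self.shared_area = {}   # doublewrite area in the shared tablespace
        self.data_file = {}     # actual data file pages

    def flush_page(self, page_no, data, crash_mid_write=False):
        self.shared_area[page_no] = data                      # step 1: copy out
        if crash_mid_write:
            self.data_file[page_no] = data[: len(data) // 2]  # torn page
            return
        self.data_file[page_no] = data                        # step 2: in place

    def recover(self):
        for page_no, copy in self.shared_area.items():
            if self.data_file.get(page_no) != copy:           # partial write found
                self.data_file[page_no] = copy                # restore intact copy
```

A crash mid-flush leaves a half-written page; `recover()` detects the mismatch and restores the full copy from the shared area.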

Adaptive hash index

Just as the JVM dynamically compiles hot code into machine code at runtime, InnoDB monitors queries against all indexes and builds hash entries for hot pages on the fly to speed up access.
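A minimal Python sketch of that monitoring idea follows. The threshold of 3 accesses and the plain dict standing in for the B-tree are assumptions for illustration; InnoDB's actual promotion heuristics are more involved:

```python
class AdaptiveHashIndex:
    """Sketch: once a key has been looked up 'hot_threshold' times via
    the B-tree, cache its page in a hash table for O(1) access."""

    def __init__(self, hot_threshold=3):
        self.hits = {}            # per-key access counters
        self.hash_index = {}      # hot key -> page, the fast path
        self.hot_threshold = hot_threshold
        self.btree_lookups = 0    # how many times we had to descend the B-tree

    def lookup(self, key, btree):
        if key in self.hash_index:            # O(1) fast path
            return self.hash_index[key]
        self.btree_lookups += 1               # simulated B-tree descent
        page = btree[key]
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.hot_threshold:
            self.hash_index[key] = page       # key is hot: hash it
        return page
```

After the third lookup of the same key, subsequent lookups skip the B-tree entirely.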

You've probably seen the keyword "page" many times by now, so let's talk about what a page is.

Page

The page is the smallest unit of data management in InnoDB. When we query data, it is loaded from disk into the buffer pool one page at a time; likewise, updates are flushed back to disk one page at a time. The default page size is 16KB, and each page contains several rows of data. The structure of a page is shown below.


Don't worry too much about what each area is; just know what the benefits are. Each page's FileHeader stores pointers to the previous and next pages, so the pages form a doubly linked list. This matters because, in physical storage, the data is not laid out contiguously. You can think of it like the distribution of G1 regions in memory.

The rows contained in a page form a singly linked list among themselves. The row data we store ultimately ends up in User Records, which of course takes up no space at first. As we store more and more data, User Records grows larger while Free Space shrinks, until it is used up and a new data page is allocated.

The rows in User Records are sorted by primary key, so when we search by primary key, InnoDB walks down this singly linked list.
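That ordered singly linked list can be sketched in Python. This shows only the list walk; real InnoDB pages also keep a page directory so the walk can start close to the target rather than at the head:

```python
class Row:
    """One row in a page, linked to the next row in primary-key order."""

    def __init__(self, pk, data):
        self.pk = pk
        self.data = data
        self.next = None   # next row in ascending primary-key order

def find(head, pk):
    """Walk the list in key order; stop early once past where pk would be."""
    node = head
    while node is not None and node.pk <= pk:
        if node.pk == pk:
            return node.data
        node = node.next
    return None   # passed the slot where pk would sit: not present
```

Because the list is sorted, a miss is detected as soon as a larger key is reached, without scanning the rest of the page.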

The redo log buffer

InnoDB also uses a write-ahead logging (WAL) policy for data in the buffer pool: when a transaction modifies data, the redo record is written to the redo log buffer before the buffer pool page is updated.

Data in the redo log buffer is written to the redo log file at a fixed frequency. The changed pages are marked as dirty pages, and InnoDB uses checkpoints to flush dirty pages to disk.
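The ordering guarantee described here (log first, page second, flush later) can be sketched as follows. This is a conceptual model only; in particular, real redo records describe page changes rather than whole values:

```python
class WALBufferPool:
    """Sketch of write-ahead logging: append the redo record before
    changing the in-memory page; checkpoint later flushes dirty pages."""

    def __init__(self):
        self.redo_buffer = []   # redo records not yet made durable
        self.pages = {}         # in-memory page contents
        self.dirty = set()      # pages changed since the last checkpoint
        self.disk = {}          # persisted page contents

    def update(self, page_no, value):
        self.redo_buffer.append((page_no, value))  # 1. log the change first
        self.pages[page_no] = value                # 2. then change the page
        self.dirty.add(page_no)

    def checkpoint(self):
        for page_no in self.dirty:                 # flush dirty pages to disk
            self.disk[page_no] = self.pages[page_no]
        self.dirty.clear()
        self.redo_buffer.clear()                   # pre-checkpoint log is obsolete
```

Between an update and the next checkpoint, the change exists only in memory plus the log; the checkpoint is what makes the page itself durable.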

Logs

I mentioned the redo log above. This section is devoted to logging, which we can divide along two dimensions:

  • MySQL level
  • InnoDB level

MySQL logs

MySQL logs can be divided into error logs, binary logs, query logs, and slow query logs.

  • Error logs are easy to understand: they record serious errors that occur while the service is running. When the database fails to start, this is where to look for the cause
  • The binary log has another name you should be familiar with, the binlog, and it records all changes to the database
  • The query log records all statements received from clients
  • The slow query log records all SQL statements whose response time exceeds a threshold. The threshold is set by long_query_time and defaults to 10 seconds; slow query logging is off by default and must be enabled manually

InnoDB logs

There are only two types of InnoDB Log: Redo Log and Undo Log.

  • Redo Log: a log that records the changes made by transactions. Changes are logged whether or not the transaction has committed. For example, when data is updated, the change is written to the redo log before the cached page is updated; later, the in-memory data is flushed back to disk according to the configured flush policy.

  • Undo Log: records the version of the data before a transaction changed it, and can be used to roll back a failed transaction.

The redo log records changes to specific data pages and can only be used by the current server, whereas the binlog is written at the MySQL server level and works regardless of storage engine. One important function of the binlog is master-slave replication; another is data recovery.

As mentioned above, the binlog records all changes to the database, and it has three logging formats: Statement, Row, and Mixed.

  • Statement format records every SQL statement that modifies data. It records only the SQL itself, not every row the statement affects, which reduces log volume and improves performance. However, because only the statement is recorded, correct execution on the slave node is not guaranteed, so some additional context information has to be recorded as well
  • Row format stores only the modified rows. Compared with Statement format, which records just the SQL, Row format generates a much larger volume of logs. But Row format needs no context information: it records exactly what changed
  • Mixed format is a mixture of Statement and Row

Which format to use depends on the situation. For example, if an UPDATE statement modifies a large number of rows, Statement format saves far more space, but Row format is more reliable.
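The space trade-off just described is easy to see in a toy model: one log entry per statement versus one per affected row. The function and its entry strings are purely illustrative, not the binlog's real on-disk format:

```python
def binlog_entries(fmt, sql, affected_rows):
    """Sketch: statement format logs one entry per SQL statement,
    row format logs one entry per modified row."""
    if fmt == "statement":
        return [sql]                                        # one entry, any row count
    if fmt == "row":
        return [f"row change: {r}" for r in affected_rows]  # one entry per row
    raise ValueError(f"unknown binlog format: {fmt}")
```

An UPDATE touching 1000 rows produces a single Statement entry but 1000 Row entries, which is exactly why Statement saves space and Row is more faithful.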

Difference between InnoDB and MyISAM

Since MyISAM is not commonly used, I'm not going to delve into its underlying principles and implementation. Let's just briefly compare the two storage engines, point by point.

  • Transactions: InnoDB supports transactions, rollback, transaction safety, and crash recovery. MyISAM supports none of these, but its queries are faster than InnoDB's
  • Primary keys: if no primary key is set, InnoDB automatically generates a hidden 6-byte one, while MyISAM allows a table with no index or primary key; MyISAM indexes store the address of the row
  • Foreign keys: InnoDB supports foreign keys, while MyISAM does not
  • Locks: InnoDB supports both row locks and table locks, while MyISAM supports only table locks
  • Full-text indexes: InnoDB did not support full-text indexing before version 5.6 (plug-ins could fill the gap), while MyISAM has always supported it natively
  • Row counts: to get the row count, InnoDB has to scan the table, while MyISAM stores the current total and can read it directly
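The row-count difference in the last point boils down to a cached counter versus a scan, which a small Python sketch makes concrete. These toy classes only model the counting behavior, nothing else about either engine:

```python
class MyISAMTable:
    """Sketch: maintains a running row count, so COUNT(*) is O(1)."""

    def __init__(self):
        self.rows = []
        self.row_count = 0

    def insert(self, row):
        self.rows.append(row)
        self.row_count += 1        # counter kept up to date on every insert

    def count(self):
        return self.row_count      # just read the stored counter

class InnoDBTable:
    """Sketch: no single global counter (each transaction may see a
    different row set), so COUNT(*) has to scan."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def count(self):
        return sum(1 for _ in self.rows)   # full scan, O(n)
```

Both return the same answer; the difference is that one pays O(1) and the other O(n) per count.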

So, to summarize: MyISAM is only suitable for workloads with far more reads than writes. If your system is query-heavy (a reporting system, for example), MyISAM is an option; otherwise, InnoDB is recommended.

End

Due to time constraints, this article only sketched the overall architecture of InnoDB without going deep on some points: for example, how InnoDB improves LRU to solve buffer pool pollution, or how checkpoints work in detail. More on those later if there's time.

If you found this article helpful, please give it a like, a comment, and a share.

You can also search for the official account [SH Full Stack Notes] on WeChat, or scan the QR code to follow it.


Thanks for reading.
