preface

InnoDB storage engine as one of our most commonly used storage engines, fully familiar with its implementation and operation principle, help us to create and maintain database tables better, sort out a MySQL learning notes hope to help you.

InnoDB architecture

  • **InnoDB includes: ** memory pool, background threads and storage files.
  • The memory pool is composed of multiple memory blocks, including cache disk data, redo log buffer, etc.
  • ** Background threads include **Master threads, IO threads, and Purge threads.
  • Table storage structure files implemented by InnoDB storage engine generally include table structure files (.frm), shared table space files (IBDATA1), exclusive table space files (IBD) and log files (redo files, etc.).

1. The memory pool

As we know, if the client reads data from the database directly from disk, there will undoubtedly be some performance bottlenecks, the buffer pool is to improve the overall database read and write performance.

When a client reads data, if the data exists in the buffer pool, the client will directly read the data in the buffer pool. Otherwise, the client will read the data from the disk. For changes in the database, the data is first modified in the buffer pool and then flushed to disk through the Master Thread Thread.

** Theoretically, the larger the buffer pool, the better. ** The buffer pool caches not only index pages and data pages, but also undo pages, insert cache, adaptive hash indexes, InnoDB lock-information and more.

InnoDB allows multiple buffer pool instances to reduce competition for resources within the database and increase the concurrency of the database. Lecture 38 also covers configuration and tuning of buffer pool instances.

InnoDB storage engine puts redo log information into buffer before flushing it to redo log file.

Background threads

**Master threads are responsible for asynchronously refreshing data from the buffer pool to disk. ** In addition, they also include inserting cache and retrieving undo pages. IO threads are responsible for reading and writing I/OS. Purge threads are used to Purge undo logs that have been committed by transactions, and the Pager Cleaner Thread is a new Thread that helps Master threads flush dirty pages to disk. It takes the load off the Master Thread and reduces congestion.

3. Store files

When creating a table in MySQL, a. FRM file is generated. This file is used to store the metadata information of each table, mainly including the table structure definition.

In InnoDB, data is stored by table space. ** is a shared table space by default, and files stored by ** are shared table space files (IBDATA1). Innodb_file_per_table if innodb_file_per_table is set to 1, data, indexes, and other information are stored in an exclusive tablespace. Therefore, an exclusive tablespace file (IBD) is generated. If you don’t have a clear understanding of shared and exclusive tablespaces, I’ll explain.

Log files are mainly redo log files, which mainly record redo logs generated by transactions to ensure the consistency of transactions.

InnoDB logical storage structure

InnoDB logical storage structures are divided into Tablespace, Segment, Extent, Page and row.

1. Alter Tablespace Tablespace

InnoDB provides two ways of storing data in table Spaces, one is shared table space and the other is exclusive table space. By default, InnoDB stores all of its table data in a shared tablespace, ibDatA1.

We can enable exclusive tablespace mode by setting innodb_file_per_table to 1 (1 represents exclusive mode). ** When enabled, each table has its own physical file in the tablespace. ** All data and indexes are stored in this file for easy backup and recovery.

2. Paragraph (Segment)

A table space consists of segments, including data segments, index segments, and rollback segments. As we know, InnoDB’s default data store is based on B + tree implementation.

Here, the index segment refers to the non-leaf node of the B + tree, while the data segment refers to the leaf node of the B + tree. Rollback segments refer to rolling back data.

3. Extent/Page

Extents are the unit structure of a table space, each 1MB in size. ** Pages are the smallest unit of area. ** pages are also the smallest unit of disk management for InnoDB storage engine. By default, each page size is 16KB. To ensure page continuity, InnoDB storage engine requests 4-5 extents from disk at a time.

4. Row

InnoDB storage engine is row-oriented, that is, data is stored in rows. The number of rows per page is strictly defined, allowing up to 16KB/2-200 rows (7992 rows).

InnoDB Transaction Redo log

**InnoDB is a transactional storage engine. **InnoDB transactions are based on redo log and undo log. Redo log is a redo log that provides redo writes and implements transaction persistence. Undo log is a rollback log and provides rollback operations to ensure transaction consistency.

** Redo logs include in-memory redo log buffers ** and disk-stored redo log files that are persistent and never lost.

**InnoDB updates using the Write Ahead Log policy, that is, Write logs first, then Write to disk. ** When a record is updated, InnoDB writes the record to the redo log buffer and updates the memory data. You can customize innodb_flush_log_at_trx_commit to flush logs from the redo log buffer to the redo log file at commit time.

** InnoDB’s redo log is of a fixed size. ** Multiple log files form a loop. When writing to the end, log files return to the beginning. We can configure the number of log files and the size of each log file with the parameters innodb_log_files_in_group and innodb_log_file_size.

** The updated data in the Buffer Pool is not flushed to disk. This memory page is called a dirty page. ** Eventually the dirty pages are flushed to disk, overwriting the data on disk. This process is not necessarily related to the redo log.

** Dirty pages are flushed to disk only when the redo log is full. ** Dirty pages are flushed only when the redo log is full. Dirty pages are also flushed when the redo log is full:

  • If the system memory is insufficient, you need to delete some data pages. If dirty pages are to be deleted, synchronize the dirty pages to disks first.
  • MySQL considers idle time, this case has no performance issues;
  • MySQL will flush all dirty pages to disk before properly shutting down, and there is no performance problem in this case.

** In a production environment, if we turn on slow SQL monitoring, you will occasionally see some SQL that takes a little longer. ** This is because dirty pages may impose a performance overhead on the database when flushed to disk, causing the database operation to jitter.

LRU elimination strategy

Now that we’ve seen how InnoDB updates and inserts work, let’s look at how they work and how they’re optimized.

InnoDB storage engine is a data store based on the collection index implementation, ** that is, in addition to index columns and primary keys are stored in the B + tree, ** other column data is stored in the B + tree leaf nodes. Index pages and data pages are cached in the buffer pool. InnoDB does not need to read pages every time as long as the data exists in the buffer pool, thus improving the query performance of the database.

** Although a buffer pool is a large memory area, it cannot store all data in it due to the various types of data and the large amount of data to be stored. Therefore, the buffer pool needs to cache recent and frequently queried data through the LRU algorithm, while infrequently queried data is filtered out.

**InnoDB makes some optimizations to LRU, ** The familiar LRU algorithm usually puts the most recently queried data at the head of the LRU list, while InnoDB puts the data at a MIDpoint, usually 5/8 of the list length.

This policy prevents infrequently queried operations from flushing out hotspot data. When hotspot data is queried again, it needs to be obtained from disks, which affects the query performance of the database.

If we have a lot of hot data, we can adjust the MIDpoint value to increase the storage of hot data, thus reducing the elimination rate of hot data.

conclusion

** The above InnoDB implementation and operation principle is introduced here. * * in general, as a development engineer, we should master the knowledge of database several big, internal implementation details, and then to go deep into the database to avoid often write some SQL performance problems, train the ability of tuning the database performance, finishing a MySQL learning notes hope I can help to you.