The fastest shortcut in the world is to keep your feet on the ground. Focus on the sharing place.

The InnoDB engine has several key features that give it better performance and reliability:

  • Insert Buffer
  • Double Write
  • Adaptive Hash Index
  • Async IO
  • Flush Neighbor Page

Today our topic is Insert Buffer, because InnoDB engine’s underlying data storage structure is B+ tree, and for indexes we have clustered indexes and non-clustered indexes.

When data is inserted, index changes are inevitable. Clustered indexes, needless to say, are generally incrementally ordered. Non-clustered indexes, on the other hand, are not necessarily what data is, and their discreteness leads to constant changes in the structure at insert time, resulting in reduced insert performance.

So to solve the problem of non-clustered index Insert performance, the InnoDB engine created the Insert Buffer.

Insert Buffer storage

If you look at the figure above, you might think that the Insert Buffer is part of the InnoDB Buffer pool.

** Important: ** True and false, InnoDB Buffer pools do contain Insert buffers, but Insert buffers, like data pages, are physically present (as B+ trees in a shared table space).

Insert Buffer

A few points:

  • A table can have only one primary key index because its physical storage is a B+ tree. (Don’t forget to aggregate the data stored by the index leaf node, which has only one copy)

  • The non-clustered index leaf node stores the primary key of the clustered index

Clustered index inserts

First we know that in the InnoDB storage engine, the primary key is the unique identifier of the row (i.e., the clustered index we’re always talking about). We usually insert data according to the primary key in ascending order, so clustered indexes are sequential, do not need to disk random read.

Such as table:

CREATE TABLE test(
	id INT AUTO_INCREMENT,
	name VARCHAR(30),
	PRIMARY KEY(id)
);
Copy the code

Above I created a primary key ID that has the following properties:

  • The Id column is self-growing
  • When a NULL value is inserted into the Id column, the value is incremented due to AUTO_INCREMENT
  • At the same time, row records in the data page are stored in order according to the value of ID

In general, because of the orderliness of clustered indexes, there is no need to randomly read the data in the page because such sequential inserts are very fast.

But if you insert a column Id into something like a UUID, then your insert is just as random as a non-clustered index. This will cause your B+ tree structure to constantly change, and performance will inevitably suffer.

Inserts of non-clustered indexes

Many times our table will also have many non-clustered indexes, such as if I am querying by b field, and B field is not unique. The following table:

CREATE TABLE test(
	id INT AUTO_INCREMENT,
	name VARCHAR(30),
	PRIMARY KEY(id),
	KEY(name)
);
Copy the code

Here I create an X table that has the following characteristics:

  • There is a clustered index ID
  • There is a non-unique non-clustered index name
  • Data pages are stored in order of primary key IDS when data is inserted
  • Data inserts for secondary index name are not sequential

A non-clustered index is also a B+ tree, except that the leaf node stores the primary key and name of the clustered index.

Since the data in the name column is not guaranteed to be sequential, the inserts in the tree of the non-clustered index must also be sequential.

Of course, if the name column inserts time type data, its non-clustered index inserts are also sequential.

The arrival of Insert Buffer

It can be seen that the discreteness of non-clustered index inserts leads to poor Insert performance, so InnoDB engine designed the Insert Buffer to improve Insert performance.

Let me take a look at how to Insert using Insert Buffer:

First of all, the insert or update operation of the non-clustered index is not directly inserted into the index page each time, but determines whether the inserted non-clustered index page is in the buffer pool.

If yes, insert directly; If not, it is first put into an Insert Buffer object.

It looks as if the tree has been inserted into a leaf node of a non-clustered index, when it is stored elsewhere

Merge Insert buffers and sub-nodes of secondary index pages on a regular basis, often merging multiple inserts together, greatly improves Insert performance for non-clustered indexes.

Insert Buffer usage requirements:

  • Indexes are non-clustered indexes
  • The index is not unique

The InnoDB storage engine uses Insert Buffer to improve Insert performance only if the above two requirements are met.

So why do these two conditions have to be met?

The first point is that the index is not a clustered index, which is sequential and does not need you

The second point must not be unique, because the database does not determine the uniqueness of the inserted record when writing to the Insert Buffer. If you look it up again, it’s going to be discrete reading, so InsertBuffer is meaningless.

Query the Insert Buffer information

We can use the SHOW ENGINE INNODB STATUS command to view the Insert Buffer information:

-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 7545, free list len 3790, seg size 11336, 
8075308 inserts,7540969 merged sec, 2246304 merges
...

Copy the code

After using the command, we can see a lot of information, but here we just look at the INSERT BUFFER:

  • Seg size indicates the current Insert Buffer size 11336 x 16KB

  • Free Listlen represents the length of the free list

  • Size represents the number of record pages that have been merged

  • Inserts represent the number of records inserted

  • Merged Recs represents the number of merged insert records

  • Merges represents the number of merges, which is the number of pages actually read

Merges: Merged Recs is approximately 1:3, which means that the Insert Buffer reduces discrete I/O logic requests for non-clustered index pages by approximately 2/3

Insert Buffer problems

Having said that, there is a problem with Insert buffers:

That is, in write-intensive cases, insert buffers take up too much buffer pool memory (innodb_buffer_pool), up to 1/2 buffer pool memory by default.

Occupying too large a buffer pool will inevitably affect other buffer pool operations

Optimization of Insert Buffer

MySQL5.5 is an update to the Insert Buffer that was previously called the Change Buffer.

The Change Buffer buffers Insert, DELETE, and UPDATE operations. The Change Buffer buffers Insert, DELETE, and UPDATE operations.

  • Insert Buffer

  • Delete Buffer

  • Purgebuffer

Conclusion:

What exactly is an Insert Buffer?

  • The data structure of the Insert Buffer is a B+ tree.

  • Prior to MySQL 4.1 there was an Insert Buffer B+ tree for each table

  • Currently, there is only one Insert Buffer B+ tree globally, which is responsible for all the secondary indexes of the Insert Buffer

  • The B+ tree is stored in the shared table space IBDatA1

Merge Insert Buffers can write to truly non-clustered indexes in the following cases: Merge Insert Buffers

  • When the secondary index page is read into the buffer pool
  • The Insert Buffer Bitmap page traces when the secondary index page is out of space
  • Merge Insert Buffer operations are performed on the Master Thread every second or every 10 seconds

In a word:

The Insert Buffer is used to improve the Insert performance of non-clustered index pages. Its data structure is similar to a B+ tree of data pages and is physically stored in the shared table space IBDatA1.