1. Introduction to InnoDB page

A Page is the smallest unit of disk used by Innodb storage engine to manage data. Common page types include data page, Undo page, system page, transaction data page, etc

2.InnoDB row storage format

The records we insert into MySQL can be stored in four rows in InnoDB, namely Compact, Redundant, Dynamic, and Compressed. We can specify the desired row format in the statement creating or modifying the table

ALTER TABLE TABLE name ROW_FORMAT= row format name ALTER TABLE name ROW_FORMAT= row format nameCopy the code

2.1 COMPACT row format

Compact row recording was introduced in MySQL5.0 to store data efficiently. In simple terms, the more rows a Page can hold, the higher the performance

A complete record is divided into recorded additional information and recorded real data

2.1.1 Additional information recorded

This additional information falls into three categories, variable-length field length lists, NULL value lists, and record header information.

2.1.1.1 Variable-length field length list

Record the length of each column in reverse order, using one byte if the column length is less than 255 bytes, and two bytes otherwise. The actual length of the field depends on the number of columns and the length of each column, and is therefore variable

2.1.1.2 NULL Value list

A byte indicating whether the line has a NULL value. Note: It is important to note here that the fixed-length CHAR data types and the variable-length VCHAR data types do not occupy any storage space when NULL is recorded in Compact.

2.1.1.3 Recording Header Information

The record header information used to describe the record, which is a fixed 5 bytes. Five bytes is 40 binary bits, different bits represent different meanings, where next_record records the relative position of the next record, all records in a page using this field to form a single linked list.

2.1.2 Recorded real data

Transaction ID, Roll Pointer, row_ID (when no primary key is specified);

2.1.2.1 row_id

InnoDB implicitly generates a row_ID as the primary key if the user has not set a primary key and does not have a unique attribute

2.1.2.2 transaction_id and roll_pointer

The transaction number used to record the current record for optimistic lock implementation with repeatable read isolation level

2.2 Redundant line format

MySQL5.0 line record format:

2.2.1 Field Length Offset List

As in Compact, varied-length field length lists are set in reverse order, but the offset list records the offset. Each offset is added to the previous offset. For CHAR NULL values, the maximum space is directly recorded. NULL values for VCHAR do not occupy any storage space.

The difference from the Compact row format is that:

  • It stores the length of all attributes
  • The length is calculated by the difference between two adjacent offsets
  • The first bit of each column’s offset is used as a NULL bit to mark NULL values, and fields of fixed length that are actually NULL are filled with 0 bits, not variable lengths
  • For char(M), even if a variable-length character set is used, space is allocated according to the maximum character length of the character set

2.2.2 Recording header Information

Approximately the same as compact

Row overflow data

We know that the data page size is 16KB, Innodb storage engine ensures that each page has at least two records, when the row record length does not exceed the maximum row record length, all data is stored on the current page. If the records on a Page are too large, the first 768 bytes are captured and stored in the Page and the rest in the BLOB Page. Then the actual recorded data store 20 bytes pointing to the address of these pages (of course, the 20 bytes also include the number of bytes occupied by the data scattered in other pages), so as to find the page where the rest of the data resides

2.3 Dynamic and Compressed format

Innodb 1.0x started to introduce a new file format, Barracuda, with two new row formats: Compressed and Dynamic. The two new record formats use full row overflow for storing data in bloBs.

  • Dynamic row format, columns stored on the off-page page, depending on the size of the row, will be the longest row in the off-page until the data page can hold two rows. TEXT or BLOB columns <=40bytes always exist on the data page. This prevents the large column of values from being placed in b-tree nodes as in Compact (the data Page only holds 20-byte Pointers and the actual data is stored in the Off Page instead of 768 prefix bytes in the previous Compact and Redundant formats).

  • The physical structure of Compressed data is similar to Dynamic. Another function of Compressed data format is that the data stored in the Compressed row can be Compressed by zlib algorithm, so it can effectively store BLOB, TEXT, VARCHAR and other large length data (reduced by 40%, But the CPU is more demanding).