A quick look at the data page structure
- The data page represents this
16KB
The size of the storage space can be divided into multiple parts, different parts have different functions, each part is shown in the figure:
- a
InnoDB
The data page storage space is roughly divided into7
A part of
One, records in the page storage
- The records will be stored as we specify
Row format
Stored in theUser Records
Part of the - When the page was first generated, it wasn’t
User Records
This part, every time we insert a record, it’s going to go fromFree Space
Part, that is, unused storage space to apply for a record size space partition toUser Records
Part, - when
Free Space
Part of the space is completelyUser Records
After partial replacement, it means that the page is used up, and if new records are inserted, you need to apply for a new page
1. Record the secret of header information
Let’s create a table:
mysql> CREATE TABLE page_demo(
-> c1 INT.-> c2 INT.-> c3 VARCHAR(10000),
-> PRIMARY KEY (c1)
-> ) CHARSET=ascii ROW_FORMAT=Compact;
Query OK, 0 rows affected (0.03 sec)
Copy the code
c1
andc2
Columns are used to store integers,c3
Columns are used to store strings
- in
page_demo
The row format diagram for the table draws the related header attributes as wellc1
,c2
,c3
Column information (other information not drawn does not mean that they do not exist, but for the sake of easy understanding, the ~ is omitted from the diagram), the simplified row format diagram is like this:
- Continue inserting data
mysql> INSERT INTO page_demo VALUES(1.100.'aaaa'), (2.200.'bbbb'), (3.300.'cccc'), (4.400.'dddd');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
Copy the code
- As is shown in
-
delete_mask
-
This property indicates whether the current record has been deleted. It occupies one binary bit. A value of 0 indicates that the record has not been deleted, and a value of 1 indicates that the record has been deleted.
-
Instead of being deleted immediately, all the deleted records will form a so-called junk list. The space occupied by the records in the list is called reusable space. If new records are inserted into the table later, the storage space occupied by the deleted records may be overwritten.
-
-
min_rec_mask
- The mark is added to the smallest record in the non-leaf node at each level of the B+ tree
-
n_owned
This is under wraps for now, but later it will be the main character
-
heap_no
-
This property represents the current record’s position on the page. As you can see from the figure, the four records we inserted are 2, 3, 4, and 5 on the page. We’re missing 0’s and 1’s
-
Mysql adds two records to the edges of each page, which are sometimes called pseudo records or virtual records because they are not inserted by us. One of these two pseudo records represents the minimum record and the other represents the maximum record
-
Records are available in ratio sizes and are compared to primary keys
-
No matter how many of our own records we insert into the page, Mysql automatically generates two pseudo records, a minimum record and a maximum record. The construction of these two records is quite simple, consisting of a 5-byte record header information and a fixed 8-byte section, as shown in the figure
-
- Since these two records are not our own defined records, they are not stored
page
theUser Records
For parts, they are placed separately in a section calledInfimum + Supremum
, as shown in the figure:
-
The values heap_no are 0 and 1, respectively
-
record_type
- This property represents the type of the current record. There are four types of records.
0
Represents a normal record,1
Represents B+ tree non-leaf node record,2
Represents the minimum record,3
Represents the maximum record. We can also see from the figure that the records we insert ourselves are ordinary records, theirrecord_type
Values are0
, while minimum record and maximum recordrecord_type
Values, respectively2
and3
- This property represents the type of the current record. There are four types of records.
-
next_record
- It represents the address offset from the real data in the current record to the real data in the next record
The biggest record
thenext_record
The value of0
- If we delete one record from the list, the list will change. For example, if we delete the second record:
-
As can be seen from the figure, the following changes occurred before and after the deletion of the second record:
- The second record is not removed from the storage space, but is recorded
delete_mask
Value is set to1
. - Recorded in article 2
next_record
The value becomes 0, which means there is no next record for the record. - Recorded in article 1
next_record
It points to the third record. - There’s something else you might have missed, which is
The biggest record
then_owned
Values from5
Turned out to be4
“We’ll talk more about this later.
- The second record is not removed from the storage space, but is recorded
-
If we insert this record into the table again
- As you can see from the picture,
InnoDB
Instead of applying for new storage space for the new record because of its insertion, the storage space of the original deleted record is directly reused.
2, Page Directory
- Now that we know that records are concatenated in a single linked list by primary key on a page, what if we want to find a record on a page by primary key
SELECT * FROM page_demo WHERE c1 = 3;
Copy the code
-
This section describes the MySQL query process
- Divide all normal records (including maximum and minimum records, but not those marked as deleted) into groups.
- The header for the last record of each group (that is, the largest record in the group)
n_owned
Property to indicate how many records the record has,That is, there are several records in the group. - Will each group ofLast recordThe address offsets are extracted separately and stored in sequence close to each other
page
The tail of PI, this place is called PIPage Directory
, that is,Page directory
You should go back to the head to see the diagrams for each section of the page. These address offsets in the page catalog are calledslot
English name:Slot
), so this page directory is made up ofslot
Composed of.
-
For example, the page_demo table now has 6 normal records, InnoDB will divide them into two groups, the first group has only one minimum record, the second group is the remaining 5 records, see the diagram below:
-
Note the N_OWNED attribute in the header information for the minimum and maximum records
- minimum-recorded
n_owned
A value of1
This means that the group ending in the minimum record has only1
One record, the minimum record itself. - maximum-recorded
n_owned
A value of5
This means that the group that ends with the largest record has only5
A record, including the maximum record itself and our own insert4
Records.
- minimum-recorded
-
The group where the minimum record resides can have only one record, the group where the maximum record resides can have only 1 to 8 records, and the remaining groups can have only 4 to 8 records.
-
So the n_owned of the smallest record is 1, and the n_owned of the current largest record is 5
-
Each time a record is inserted, it finds a slot in the page directory where the primary key is larger than the primary key of the current record and the difference between the primary key and the primary key of the current record is the smallest. Then, the n_OWNED value of the record corresponding to this slot is increased by 1 to indicate that another record is added to the group until the number of records in the group is equal to 8.
-
If the number of records in a group is equal to eight and a record is inserted, the group is divided into two groups. One group has four records and the other has five records. This process creates a new slot in the page directory to record the offset of the largest record in the new group.
-
Add another 12 pieces of data
- So let’s say we want to find the primary key of PI
6
The record of
-
Calculate the position of the middle slot :(0+4)/2=2, so check the primary key of the record corresponding to slot 2 is 8, and since 8 > 6, set high=2, and keep low unchanged.
-
Recalculate the position of the middle slot :(0+2)/2=1, so check that the primary key for slot 1 is 4, and since 4 < 6, set low=1, and leave high unchanged.
-
Because the value of high-low is 1, make sure that the record with a primary key of 6 is in the group corresponding to slot 2. At this point we need to find the record in slot 2 with the smallest primary key, and then traverse the records in slot 2 along the unidirectional list.
-
We can take the record for slot 1 (primary key 4), and the next record for that record is the record with the smallest primary key in slot 2, which has the primary key 5. So we can start with the record with the primary key of 5 and go through the records in slot 2 until we find the record with the primary key of 6. Since a group can contain only 1 to 8 records, the cost of traversing the records in a group is small.
So finding a record in a data page with a specified primary key is a two-step process:
- The slot in which the record is located is determined by dichotomy and the record with the smallest primary key in the group in which the slot is located is found.
- By record
next_record
Property traverses the records in the group to which the slot belongs.
Page Header
- In order to get oneData pageFor example, how many records have been stored in the page, what is the address of the first record, how many slots are stored in the page directory, etc
Page Header
PI, it’s PIpage
The second part of the structure, which occupies the fixed56
Each byte is used to store various state information. The following table describes what each byte does:
The name of the | Occupied space | describe |
---|---|---|
PAGE_N_DIR_SLOTS |
2 byte |
Number of slots in the page catalog |
PAGE_HEAP_TOP |
2 byte |
The minimum address of the space that is not yet in use, i.eFree Space |
PAGE_N_HEAP |
2 byte |
The number of records in this page (including minimum and maximum records and records marked for deletion) |
PAGE_FREE |
2 byte |
The address of the first record that has been marked for deletion (each deleted record passesnext_record It also forms a single linked list, which can be reused.) |
PAGE_GARBAGE |
2 byte |
Number of bytes occupied by deleted records |
PAGE_LAST_INSERT |
2 byte |
The last place to insert a record |
PAGE_DIRECTION |
2 byte |
Record the direction of insertion |
PAGE_N_DIRECTION |
2 byte |
The number of consecutive records inserted in one direction |
PAGE_N_RECS |
2 byte |
The number of records on the page (excluding minimum and maximum records and those marked for deletion) |
PAGE_MAX_TRX_ID |
8 byte |
Changes the maximum transaction ID for the current page, which is defined only in the secondary index |
PAGE_LEVEL |
2 byte |
The level of the current page in the B+ tree |
PAGE_INDEX_ID |
8 byte |
Index ID, which indicates the index to which the current page belongs |
PAGE_BTR_SEG_LEAF |
10 byte |
The header information for the B+ tree leaf segment is defined only on the Root page of the B+ tree |
PAGE_BTR_SEG_TOP |
10 byte |
The header information for the non-leaf segments of the B+ tree is defined only on the Root page of the B+ tree |
4. File Header
- Store general information about the page, such as the number of the page, the page before it, the page after it, and the page after it
38
Byte, which is made up of the following:
:
The name of the | Occupied space | describe |
---|---|---|
FIL_PAGE_SPACE_OR_CHKSUM |
4 byte |
Page checksum |
FIL_PAGE_OFFSET |
4 byte |
Page number |
FIL_PAGE_PREV |
4 byte |
The number of the previous page |
FIL_PAGE_NEXT |
4 byte |
The page number of the next page |
FIL_PAGE_LSN |
8 byte |
Log Sequence Number where the page was last modified |
FIL_PAGE_TYPE |
2 byte |
The type of the page |
FIL_PAGE_FILE_FLUSH_LSN |
8 byte |
Defined in only one page of the system tablespace, which means that the file has been refreshed to at least the corresponding LSN value |
FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID |
4 byte |
Which table space the page belongs to |
Against this chart, let’s take a look at some of the most important parts:
-
FIL_PAGE_SPACE_OR_CHKSUM
This represents the checksum of the current page. What is a checksum? That is, for a very, very long byte string, we’re going to use some algorithm to compute a short value to represent that long byte string, and that short value is called the checksum. This compares the checksum of two long bytes before comparing them. If the checksum is different, the two long bytes must be different, so the time loss of directly comparing two long bytes is eliminated.
-
FIL_PAGE_OFFSET
Each page has a separate page number, just like your ID number. InnoDB uses the page number to uniquely locate a page.
-
FIL_PAGE_TYPE
This represents the type of the current page. As we mentioned earlier, InnoDB divides pages into different types for different purposes. The ones we described above are data pages that store records.
Type the name hexadecimal describe FIL_PAGE_TYPE_ALLOCATED
0x0000 New allocation, not yet in use FIL_PAGE_UNDO_LOG
0x0002 The Undo log page FIL_PAGE_INODE
0x0003 Segment information node FIL_PAGE_IBUF_FREE_LIST
0x0004 Insert Buffer free list FIL_PAGE_IBUF_BITMAP
0x0005 The Insert Buffer bitmap FIL_PAGE_TYPE_SYS
0x0006 The system page FIL_PAGE_TYPE_TRX_SYS
0x0007 Transaction system data FIL_PAGE_TYPE_FSP_HDR
0x0008 Table space header information FIL_PAGE_TYPE_XDES
0x0009 Extension Description page FIL_PAGE_TYPE_BLOB
0x000A Overflow page FIL_PAGE_INDEX
0x45BF Index pages, that’s what we call them Data page
The type of data page where we store records is actually FIL_PAGE_INDEX, which is called the index page. As for what an index is, let’s break it down next time
-
FIL_PAGE_PREV and FIL_PAGE_NEXT
Five, the summary
-
A data page can be roughly divided into seven sections, which are
File Header
, which represents some general information of the page, and is a fixed 38 bytes.Page Header
, which represents some information that is proprietary to the data page, accounting for a fixed 56 bytes.Infimum + Supremum
, two virtual pseudo records, respectively representing the smallest and largest records in the page, occupy fixed26
Bytes.User Records
: The part that actually stores the record we inserted. The size is not fixed.Free Space
: An unused portion of a page of uncertain size.Page Directory
: The relative position of some records in the page, that is, the address offset of each slot in the page, the size is not fixed, the more records inserted, the more space this part of the space.File Trailer
: The part of the page used to verify that the page is complete, occupying a fixed 8 bytes.