A quick look at the data page structure

  • The data page represents this16KBThe size of the storage space can be divided into multiple parts, different parts have different functions, each part is shown in the figure:

  • aInnoDBThe data page storage space is roughly divided into7A part of

One, records in the page storage

  • The records will be stored as we specifyRow formatStored in theUser RecordsPart of the
  • When the page was first generated, it wasn’tUser RecordsThis part, every time we insert a record, it’s going to go fromFree SpacePart, that is, unused storage space to apply for a record size space partition toUser RecordsPart,
  • whenFree SpacePart of the space is completelyUser RecordsAfter partial replacement, it means that the page is used up, and if new records are inserted, you need to apply for a new page

1. Record the secret of header information

Let’s create a table:

mysql> CREATE TABLE page_demo(
    ->     c1 INT.->     c2 INT.->     c3 VARCHAR(10000),
    ->     PRIMARY KEY (c1)
    -> ) CHARSET=ascii ROW_FORMAT=Compact;
Query OK, 0 rows affected (0.03 sec)
Copy the code
  • c1andc2Columns are used to store integers,c3Columns are used to store strings

  • inpage_demoThe row format diagram for the table draws the related header attributes as wellc1,c2,c3Column information (other information not drawn does not mean that they do not exist, but for the sake of easy understanding, the ~ is omitted from the diagram), the simplified row format diagram is like this:

  • Continue inserting data
mysql> INSERT INTO page_demo VALUES(1.100.'aaaa'), (2.200.'bbbb'), (3.300.'cccc'), (4.400.'dddd'); 
Query OK, 4 rows affected (0.00 sec) 
Records: 4 Duplicates: 0 Warnings: 0
Copy the code
  • As is shown in

  • delete_mask

    • This property indicates whether the current record has been deleted. It occupies one binary bit. A value of 0 indicates that the record has not been deleted, and a value of 1 indicates that the record has been deleted.

    • Instead of being deleted immediately, all the deleted records will form a so-called junk list. The space occupied by the records in the list is called reusable space. If new records are inserted into the table later, the storage space occupied by the deleted records may be overwritten.

  • min_rec_mask

    • The mark is added to the smallest record in the non-leaf node at each level of the B+ tree
  • n_owned

    This is under wraps for now, but later it will be the main character

  • heap_no

    • This property represents the current record’s position on the page. As you can see from the figure, the four records we inserted are 2, 3, 4, and 5 on the page. We’re missing 0’s and 1’s

    • Mysql adds two records to the edges of each page, which are sometimes called pseudo records or virtual records because they are not inserted by us. One of these two pseudo records represents the minimum record and the other represents the maximum record

    • Records are available in ratio sizes and are compared to primary keys

    • No matter how many of our own records we insert into the page, Mysql automatically generates two pseudo records, a minimum record and a maximum record. The construction of these two records is quite simple, consisting of a 5-byte record header information and a fixed 8-byte section, as shown in the figure

  • Since these two records are not our own defined records, they are not storedpagetheUser RecordsFor parts, they are placed separately in a section calledInfimum + Supremum, as shown in the figure:

  • The values heap_no are 0 and 1, respectively

  • record_type

    • This property represents the type of the current record. There are four types of records.0Represents a normal record,1Represents B+ tree non-leaf node record,2Represents the minimum record,3Represents the maximum record. We can also see from the figure that the records we insert ourselves are ordinary records, theirrecord_typeValues are0, while minimum record and maximum recordrecord_typeValues, respectively2and3
  • next_record

    • It represents the address offset from the real data in the current record to the real data in the next record
    • The biggest recordthenext_recordThe value of0

  • If we delete one record from the list, the list will change. For example, if we delete the second record:

  • As can be seen from the figure, the following changes occurred before and after the deletion of the second record:

    • The second record is not removed from the storage space, but is recordeddelete_maskValue is set to1.
    • Recorded in article 2next_recordThe value becomes 0, which means there is no next record for the record.
    • Recorded in article 1next_recordIt points to the third record.
    • There’s something else you might have missed, which isThe biggest recordthen_ownedValues from5Turned out to be4“We’ll talk more about this later.
  • If we insert this record into the table again

  • As you can see from the picture,InnoDBInstead of applying for new storage space for the new record because of its insertion, the storage space of the original deleted record is directly reused.

2, Page Directory

  • Now that we know that records are concatenated in a single linked list by primary key on a page, what if we want to find a record on a page by primary key
SELECT * FROM page_demo WHERE c1 = 3;
Copy the code
  • This section describes the MySQL query process

    • Divide all normal records (including maximum and minimum records, but not those marked as deleted) into groups.
    • The header for the last record of each group (that is, the largest record in the group)n_ownedProperty to indicate how many records the record has,That is, there are several records in the group.
    • Will each group ofLast recordThe address offsets are extracted separately and stored in sequence close to each otherpageThe tail of PI, this place is called PIPage Directory, that is,Page directoryYou should go back to the head to see the diagrams for each section of the page. These address offsets in the page catalog are calledslotEnglish name:Slot), so this page directory is made up ofslotComposed of.
  • For example, the page_demo table now has 6 normal records, InnoDB will divide them into two groups, the first group has only one minimum record, the second group is the remaining 5 records, see the diagram below:

  • Note the N_OWNED attribute in the header information for the minimum and maximum records

    • minimum-recordedn_ownedA value of1This means that the group ending in the minimum record has only1One record, the minimum record itself.
    • maximum-recordedn_ownedA value of5This means that the group that ends with the largest record has only5A record, including the maximum record itself and our own insert4Records.
  • The group where the minimum record resides can have only one record, the group where the maximum record resides can have only 1 to 8 records, and the remaining groups can have only 4 to 8 records.

  • So the n_owned of the smallest record is 1, and the n_owned of the current largest record is 5

  • Each time a record is inserted, it finds a slot in the page directory where the primary key is larger than the primary key of the current record and the difference between the primary key and the primary key of the current record is the smallest. Then, the n_OWNED value of the record corresponding to this slot is increased by 1 to indicate that another record is added to the group until the number of records in the group is equal to 8.

  • If the number of records in a group is equal to eight and a record is inserted, the group is divided into two groups. One group has four records and the other has five records. This process creates a new slot in the page directory to record the offset of the largest record in the new group.

  • Add another 12 pieces of data

  • So let’s say we want to find the primary key of PI6The record of
  1. Calculate the position of the middle slot :(0+4)/2=2, so check the primary key of the record corresponding to slot 2 is 8, and since 8 > 6, set high=2, and keep low unchanged.

  2. Recalculate the position of the middle slot :(0+2)/2=1, so check that the primary key for slot 1 is 4, and since 4 < 6, set low=1, and leave high unchanged.

  3. Because the value of high-low is 1, make sure that the record with a primary key of 6 is in the group corresponding to slot 2. At this point we need to find the record in slot 2 with the smallest primary key, and then traverse the records in slot 2 along the unidirectional list.

  4. We can take the record for slot 1 (primary key 4), and the next record for that record is the record with the smallest primary key in slot 2, which has the primary key 5. So we can start with the record with the primary key of 5 and go through the records in slot 2 until we find the record with the primary key of 6. Since a group can contain only 1 to 8 records, the cost of traversing the records in a group is small.

So finding a record in a data page with a specified primary key is a two-step process:

  1. The slot in which the record is located is determined by dichotomy and the record with the smallest primary key in the group in which the slot is located is found.
  2. By recordnext_recordProperty traverses the records in the group to which the slot belongs.

Page Header

  • In order to get oneData pageFor example, how many records have been stored in the page, what is the address of the first record, how many slots are stored in the page directory, etcPage HeaderPI, it’s PIpageThe second part of the structure, which occupies the fixed56Each byte is used to store various state information. The following table describes what each byte does:
The name of the Occupied space describe
PAGE_N_DIR_SLOTS 2byte Number of slots in the page catalog
PAGE_HEAP_TOP 2byte The minimum address of the space that is not yet in use, i.eFree Space
PAGE_N_HEAP 2byte The number of records in this page (including minimum and maximum records and records marked for deletion)
PAGE_FREE 2byte The address of the first record that has been marked for deletion (each deleted record passesnext_recordIt also forms a single linked list, which can be reused.)
PAGE_GARBAGE 2byte Number of bytes occupied by deleted records
PAGE_LAST_INSERT 2byte The last place to insert a record
PAGE_DIRECTION 2byte Record the direction of insertion
PAGE_N_DIRECTION 2byte The number of consecutive records inserted in one direction
PAGE_N_RECS 2byte The number of records on the page (excluding minimum and maximum records and those marked for deletion)
PAGE_MAX_TRX_ID 8byte Changes the maximum transaction ID for the current page, which is defined only in the secondary index
PAGE_LEVEL 2byte The level of the current page in the B+ tree
PAGE_INDEX_ID 8byte Index ID, which indicates the index to which the current page belongs
PAGE_BTR_SEG_LEAF 10byte The header information for the B+ tree leaf segment is defined only on the Root page of the B+ tree
PAGE_BTR_SEG_TOP 10byte The header information for the non-leaf segments of the B+ tree is defined only on the Root page of the B+ tree

4. File Header

  • Store general information about the page, such as the number of the page, the page before it, the page after it, and the page after it38Byte, which is made up of the following:

:

The name of the Occupied space describe
FIL_PAGE_SPACE_OR_CHKSUM 4byte Page checksum
FIL_PAGE_OFFSET 4byte Page number
FIL_PAGE_PREV 4byte The number of the previous page
FIL_PAGE_NEXT 4byte The page number of the next page
FIL_PAGE_LSN 8byte Log Sequence Number where the page was last modified
FIL_PAGE_TYPE 2byte The type of the page
FIL_PAGE_FILE_FLUSH_LSN 8byte Defined in only one page of the system tablespace, which means that the file has been refreshed to at least the corresponding LSN value
FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID 4byte Which table space the page belongs to

Against this chart, let’s take a look at some of the most important parts:

  • FIL_PAGE_SPACE_OR_CHKSUM

    This represents the checksum of the current page. What is a checksum? That is, for a very, very long byte string, we’re going to use some algorithm to compute a short value to represent that long byte string, and that short value is called the checksum. This compares the checksum of two long bytes before comparing them. If the checksum is different, the two long bytes must be different, so the time loss of directly comparing two long bytes is eliminated.

  • FIL_PAGE_OFFSET

    Each page has a separate page number, just like your ID number. InnoDB uses the page number to uniquely locate a page.

  • FIL_PAGE_TYPE

    This represents the type of the current page. As we mentioned earlier, InnoDB divides pages into different types for different purposes. The ones we described above are data pages that store records.

    Type the name hexadecimal describe
    FIL_PAGE_TYPE_ALLOCATED 0x0000 New allocation, not yet in use
    FIL_PAGE_UNDO_LOG 0x0002 The Undo log page
    FIL_PAGE_INODE 0x0003 Segment information node
    FIL_PAGE_IBUF_FREE_LIST 0x0004 Insert Buffer free list
    FIL_PAGE_IBUF_BITMAP 0x0005 The Insert Buffer bitmap
    FIL_PAGE_TYPE_SYS 0x0006 The system page
    FIL_PAGE_TYPE_TRX_SYS 0x0007 Transaction system data
    FIL_PAGE_TYPE_FSP_HDR 0x0008 Table space header information
    FIL_PAGE_TYPE_XDES 0x0009 Extension Description page
    FIL_PAGE_TYPE_BLOB 0x000A Overflow page
    FIL_PAGE_INDEX 0x45BF Index pages, that’s what we call themData page

    The type of data page where we store records is actually FIL_PAGE_INDEX, which is called the index page. As for what an index is, let’s break it down next time

  • FIL_PAGE_PREV and FIL_PAGE_NEXT

Five, the summary

  1. A data page can be roughly divided into seven sections, which are

    • File Header, which represents some general information of the page, and is a fixed 38 bytes.
    • Page Header, which represents some information that is proprietary to the data page, accounting for a fixed 56 bytes.
    • Infimum + Supremum, two virtual pseudo records, respectively representing the smallest and largest records in the page, occupy fixed26Bytes.
    • User Records: The part that actually stores the record we inserted. The size is not fixed.
    • Free Space: An unused portion of a page of uncertain size.
    • Page Directory: The relative position of some records in the page, that is, the address offset of each slot in the page, the size is not fixed, the more records inserted, the more space this part of the space.
    • File Trailer: The part of the page used to verify that the page is complete, occupying a fixed 8 bytes.