The nature of indexes

Indexes are sorted data structures that help MySQL retrieve data efficiently. Indexes are created based on database tables.

The data structure used by the Inodb storage engine in MySQL to store indexes

In MySQL, the data structure used for storing indexes in Inodb storage engine is B+ tree

  1. Why not use binary trees, B-trees, red-black trees,
  • Binary tree: If the index using binary tree data structure, when the index of the database table field is a field of increasing, the binary trees composed of the index field becomes a chain table, binary tree, the following figure, when the index to look up data, can create a full table scan, each traverse a field, will conduct a disk IO, So it is not appropriate to use binary tree as the data structure for storing index.

  • Red and black tree: in fact the red-black tree is a kind of balanced binary tree, if the index data structure using the red-black tree does than using binary tree is more efficient, but when large amounts of data in the table, will cause the height of the red-black tree is too high, if find the data at the bottom of a red-black tree is need to traverse the bottom, still need to make multiple disk IO.

  • B tree: Balanced binary trees are not fully utilizedDisk prefetch functionB tree is a data structure created to make full use of the disk prefetch function. Each node of B tree can store multiple indexes. It sets the node size to the size of the disk page, making full use of the disk prefetch function. An entire node is read each time a disk page is read. And because each node stores so many indexes, the depth of the tree is very small. And then toThe number of disk reads performed would be very small, and more of the data read would be looked up in memory, but it is still not the preferred data structure for storing indexes.

  • B + tree: B+ tree is a variant of B tree. Leaf nodes in a B+ tree are composed of indexes and data. Non-leaf nodes are redundant and only used to store indexes (not data, to make room for more indexes) to help build a B+ tree. Each leaf node has bidirectional Pointers to adjacent leaf nodes (with Pointers between leaf nodes, range lookup can be supported), and the index fields increase from left to right. Designers cleverly used the disk to proofread principle of database system, will be the size of a node is set to equal to a page, so that each node (each node including multiple index field) to an I/O can be completely loaded memory, if can’t find in the load memory data, depending on the pointer will point to the next branch node loaded into the memory of binary search.

The difference between MyISAM storage engine and InoDB storage engine

  • MyISAM does not support transactions, but each operation is atomic, and supports table-level locking. Each operation locks the entire table, storing the total number of rows in the table. A MyISAM table has three files: index file, table structure file, and data file. InnoDB supports ACID transactions, supports row-level lock-level foreign key constraints, therefore supports concurrent writes, and does not store total rows.
  • MyISAM stores index files in the engine. MYI and data files. MYD is separate (non-clustered index), that is, the leaves of the B+ tree store the index and the address on disk of the row corresponding to that index (the pointer to the data file stored in the data field of the index file). MyISAM does not support database transactions, nor does it support row locking or foreign keys. Therefore, inserting or updating a database table will lock the entire table, which is inefficient.
  • In the InoDB storage engine, the B+ tree leaf node that stores the index stores the complete data record, that is, the clustered index. InoDB must have a primary key, which must be a clustered index. If you do not manually set the primary key, the unique index will be used. If there is no unique index, the hidden ID of a row inside the database will be used as the primary key. The indexes built on the clustered index are called secondary indexes. Secondary indexes always need secondary search to access data. Non-clustered indexes are secondary indexes. InoDB leaves of non-primary key indexes store not the address of the data row but the index and the value of the primary key field of the current table.

The difference between clustered and non-clustered indexes

  • Clustered index: Data is stored together with the index, and is organized in a certain order, find the index to find the data. Data can be obtained directly by clustered indexes, and the second query is more efficient than non-clustered indexes. Clustered indexes are efficient for range lookups because their data is sorted in size order.

  • Non-clustered index: The leaf node does not store data, but stores the address of the row corresponding to the index field. That is, it needs to search twice to find the address of the row according to the index and then query the data in the disk according to the address.

Classification of indexes

  1. Normal index. A common index is the most basic index and has no limits.
  2. Join indexes. Indexes created on multiple fields of a table follow the leftmost matching rule (described below).
  3. Unique index [unique]. The unique index requires that the value of the index column must be unique. There cannot be duplicate values, but null values are allowed.
  4. Primary key index. A primary key index is a special unique index. Only one primary key index can exist in a table and is used to uniquely identify a record. Null values are not allowed.
  5. Full-text index [fulltext]. Full-text index is used to retrieve field contains or does not contain the specified keyword, index of its internal structure is the same as the search engine of inverted index structure, the principle is applied to the text field to word segmentation, then appear for each word record an index, the index entry saved in all of the record of the word information, Full-text indexes can only be used to store tables with InnoDB or MyISAM engines. Full-text indexes can only be created for columns of type CHAR, VARCHAR or TEXT.

Joint index

  • A joint index, also known as a composite index, is used to index multiple columns on a table.
  • Leftmost matching rule: For federated indexes, Mysql uses the fields in the index from left to right. A query may use only part of the index, but must include the leftmost part. For example, if the index is key index (a,b,c), you can search for (a), (a,b), or (a,b,c), but cannot search for (b,c).

Joint index test:

CREATE TABLE `user2` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(225) DEFAULT NULL,
  `pwd` varchar(225) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  KEY `union_index` (`id`,`name`,`pwd`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8


Copy the code
  • (ID), walk index
EXPLAIN SELECT * FROM USER2 WHERE  id=2  
Copy the code

  • (id, name), walk index
EXPLAIN SELECT * FROM USER2 WHERE  id=2  AND NAME='admin' 
Copy the code

  • (ID, name, PWD), walk index
EXPLAIN SELECT * FROM USER2 WHERE  id=2  AND NAME= 'root' AND pwd='root'
Copy the code

  • Skip id (name, PWD), the index is invalid
EXPLAIN SELECT * FROM USER WHERE  user.name='cheng'  AND user.pwd='admin'
Copy the code

  • Skip name (ID, PWD),id is indexed, PWD is not indexed
EXPLAIN SELECT * FROM USER2 WHERE  id=2  AND pwd='admin'
Copy the code

  • Note: when all columns in the index contain all columns in the table, the index will be entered regardless of whether the left-most matching rule is met.

Miss u too (˘⌣˘) : I really miss too

Refer to the article www.cnblogs.com/aspirant/p/…