preface

A database index is a sorted data structure in a database management system (DBMS) to help quickly query and update data in a database table.

The body of the

type

Normal index (Normal)

Also known as a non-unique index, it is the most common index with no restrictions.

Unique index

Unique index requirements cannot be repeated. A primary key is a unique index, but it must not be empty.

FullText index

For large data, such as message content or an article with several kilobytes of data, full-text indexes can be used to improve efficiency if you need to query with like. Only text fields such as TEXT, CHAR, vARCHar can create full-text indexes. MylSAM and InnoDB support full-text indexing.

Select * from fulltext where match(content) against(‘ today weather ‘IN NATURAL LANGUAGE MODE)

structure

Binary tree

All nodes in the left subtree are smaller than the parent, and all nodes in the right subtree are larger than the parent.

Disadvantages: Uncontrollable depth difference between left and right subtrees, prone to “inclined tree”, i.e. a linked list.

AVL Tree balanced binary Tree

The defect of uncontrollable depth difference between left and right subtrees of binary tree is solved. The absolute value of depth difference between left and right subtrees cannot exceed 1. Nodes store key values, data addresses, and node references.

Disadvantages: When there are only two (binary) lines of data, the depth increases, and you need to perform MULTIPLE I/OS to query the later data, which is inefficient.

Multi-path balanced binary Tree

The contents of nodes are the same as those of AVL Tree. The number of ways is always 1 more than the key. As the number of routes increases, the I/O count decreases significantly.

Multipath balanced binary Tree (B+Tree)

B+Tree is a variant of B-tree, not that there is anything wrong with b-tree, but that B+Tree implementation is better. The number of routes equals the number of keywords. Root and secondary nodes do not store data (that is, the full address of the recorded data); only leaf nodes store data. Each leaf node adds a pointer to the adjacent leaf node, and the last data points to the first data of the next leaf node, forming an ordered linked list structure.

Advantage:

  1. B Tree is a variant of B Tree. It can solve all the problems solved by B Tree (the node stores more keywords, and the number of paths increases).
  2. Stronger table scanning capability: To perform full table scanning, only leaf nodes need to be traversed, not the entire B+ tree.
  3. B+Tree has more disk capacity: non-leaf nodes do not save data areas, so more keywords can be saved, and more keywords can be loaded on the disk at a time.
  4. Better sorting ability: The last data of the leaf node points to the first data of the next leaf node.
  5. Efficiency is more stable: always get on the leaf node, IO number is stable.

Data storage file

You can see that each InnoDB has two files (.frm and.ibd). MylSAM has three files (.frm,.myd,.myi). FRM stores table structures, which any storage engine has.

MyISAM

MYD is a data file that stores data records. MYI is an index file that stores indexes. The index structure is still B+Tree, but it is a non-clustered index (described below), and all index leaf nodes hold the address pointing to the data store. As is shown in

InnoDB

Indexes and data are stored in a file. The data is stored directly on the leaf node of the clustered index. Indexes are data, and data are indexes.

Clustered index (clustered index)

If a table has a primary key index, the primary key index is the clustered index, which determines the physical order in which the rows are stored.

Secondary indexes

Other indexes that are not clustered indexes are secondary indexes. The leaf node stores the value of the corresponding clustered index. Why don’t you store the address? Because the address will change.

Process: As shown in the figure above, the name field in InnoDB is the index. When the user queries the data of name=Edison, it will find the clustered index of name=Edison in the secondary index, that is, ID =7. Then through this ID to aggregate index tree to find the corresponding data address of this ID, finally get the data. That’s what we call a return table.

Q: What if a table has no primary key?

  1. If we define a Primary Key, InnoDB will select it as the clustered index.
  2. If no Primary Key is explicitly defined, InnoDB selects a unique index that does not contain a null value as the Primary Key index.
  3. What if none of the above indexes are available? InnoDB chooses the built-in 6-byte ROWID as the hidden clustered index, which increases the primary key as row records are written.

Index usage principles

Discrete degree of column

Leftmost matching principle

To create an index(a,b,c), run the following command: index(a) index(a,b) index(a,b,c) Where b=? And where b =? and c = ? You can’t use indexes

Cover index

In a secondary index, an overwrite index is used when a select column needs to be retrieved only from the index, not from the data area. That is, there is no need to return to the table.

Index condition push down (ICP)

This function is enabled by default. Functions completed after 5.6 are only applicable to secondary indexes. The goal of ICP is to reduce the number of I/O operations by reducing the number of read data accessing the full row of the table. Push-down means that the filtering is done on the storage engine side instead of the Server layer.

SQL :select * from t where lastName = ‘wang’ and firstName like ‘%zi’; Normally, only lastName can be indexed, but firstName cannot. So the query process is:

  1. Select * from secondary index where lastName=’wang’
  2. The corresponding data rows are then queried back to the Server layer.
  3. Server layer filters based on firstName like ‘%zi’.

There is a problem here. If wang has a large number of data and only one data ending in ZI, then those are redundant queries.

Using WHERE: indicates that the data retrieved from the data engine does not meet all conditions and needs to be filtered at the Server layer.

The optimization of the push down is:

  1. Select * from secondary index where lastName=’wang’
  2. The secondary index filters out only one index ending in ZI
  3. Then query the corresponding data in the table and return it to the Server layer.

Index creation

  1. Create indexes on (on) and group by fields used by WHERE to judge order and join.
  2. Don’t have too many indexes: waste space and slow updates.
  3. Too long field, create prefix index.
  4. Do not index fields with low differentiation: The low dispersion results in more rows being scanned.
  5. Frequently updated fields, do not build: page split
  6. Random unordered fields: page split
  7. Composite indexes are hashed first
  8. Create composite indexes instead of single-column indexes

When is an index not needed

  1. Use functions, expressions, and computational symbols on indexed columns.
  2. The string is not quoted and is implicitly converted.
  3. like ‘%zi’
  4. Negative query: not in /not like

Explain

www.cnblogs.com/gomysql/p/3…

SQL Execution sequence

select distinct 
        <select_list>
from
    <left_table><join_type>
join <right_table> on <join_condition>
where
    <where_condition>
group by
    <group_by_list>
having
    <having_condition>
order by
    <order_by_condition>
limit <limit number>
Copy the code

SQL Execution sequence

From <left_table><join_type> 2, on <join_condition> 3, <join_type> join <right_table> 4, where <where_condition> 5, group Having <having_condition> 7, select count (*), distinct <select_list> 9, order by <order_by_condition> 10 and limit < limit_number >Copy the code