An overview of the

This is a reading note on the indexing chapter of High Performance MySQL. Indexes are implemented in the storage engine layer, so different storage engines implement them differently.

Benefits of Indexes

  1. Indexes greatly reduce the amount of data that the server needs to scan.
  2. Indexes help the server avoid sorting and temporary tables.
  3. Indexes can turn random I/O into sequential I/O.

Adaptive hash index

InnoDB monitors index lookups. When it notices that some index values are accessed very frequently, it builds a hash index for them in memory on top of the B-tree index, which speeds up lookups of those values.

However, hash indexes can only serve equality lookups, and the adaptive hash index takes up part of the buffer pool.
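
A minimal sketch of how this feature can be inspected and toggled, assuming a MySQL version that exposes the `innodb_adaptive_hash_index` system variable and an account that is allowed to set global variables:

```sql
-- Check whether the adaptive hash index is enabled
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- Hash index activity is reported in the
-- "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of this output
SHOW ENGINE INNODB STATUS;

-- Turn it off if it causes contention (assumption: whether this helps
-- depends on the workload and should be measured, not guessed)
SET GLOBAL innodb_adaptive_hash_index = OFF;
```

The hash index is built and dropped by InnoDB itself; the only control offered here is switching the feature on or off.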

Pseudo-hash index

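For lookups on long values such as URLs, you can add an extra column that stores a hash of the value, index that column, and search by the hash. The query should still compare the original value to rule out hash collisions.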

SELECT id FROM url WHERE url_crc=CRC32("http://www.mysql.com") AND url="http://www.mysql.com";
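
The lookup above presupposes a table with a separate indexed hash column. A possible layout is sketched here; the column types and the index name `idx_url_crc` are assumptions made for illustration, not taken from the note:

```sql
-- Hypothetical table layout for the pseudo-hash lookup
CREATE TABLE url (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    url     VARCHAR(255) NOT NULL,
    url_crc INT UNSIGNED NOT NULL DEFAULT 0,  -- CRC32() returns a 32-bit unsigned value
    PRIMARY KEY (id),
    KEY idx_url_crc (url_crc)                 -- small integer index instead of an index on url
);

-- The application (or a trigger) has to keep url_crc in sync with url
INSERT INTO url (url, url_crc)
VALUES ('http://www.mysql.com', CRC32('http://www.mysql.com'));

-- Lookup by hash; the url comparison filters out CRC32 collisions
SELECT id FROM url
WHERE url_crc = CRC32('http://www.mysql.com')
  AND url = 'http://www.mysql.com';
```

The integer index is much smaller than an index on the full string, which is the point of the technique.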

Prefix indexes

For long values, you can index only a prefix of the column. The prefix has to be selective enough, ideally close to the selectivity of the full column, for the index to remain effective. The drawback is that MySQL cannot use a prefix index for GROUP BY or ORDER BY operations.
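
One way to pick a prefix length is to compare the selectivity of several candidate prefixes with that of the full column. The sketch below assumes a hypothetical table `city_demo` with a `city VARCHAR(50)` column; the lengths 3, 5 and 7 are arbitrary examples:

```sql
-- Selectivity = distinct values / total rows (closer to sel_full is better)
SELECT
    COUNT(DISTINCT city) / COUNT(*)          AS sel_full,
    COUNT(DISTINCT LEFT(city, 3)) / COUNT(*) AS sel_3,
    COUNT(DISTINCT LEFT(city, 5)) / COUNT(*) AS sel_5,
    COUNT(DISTINCT LEFT(city, 7)) / COUNT(*) AS sel_7
FROM city_demo;

-- Once a prefix length gets close to sel_full, index only that prefix
ALTER TABLE city_demo ADD KEY idx_city_prefix (city(7));
```

Average selectivity can hide skew, so it may also be worth checking how many rows share the most common prefixes.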

Separate single-column indexes vs. multi-column indexes

Putting a separate single-column index on each of several columns rarely works well for WHERE clauses that filter on more than one of them. For such queries MySQL may perform an index merge, which consumes CPU and memory. The optimizer does not factor this cost into its estimate of the query, so sometimes a full table scan would actually be faster.

If index_merge shows up in the EXPLAIN output, check whether the query or the indexes need to be optimized.

The column order of a multi-column index also matters, in particular which column comes first, because the index can only be used through its leftmost prefix.
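
The difference can be sketched as follows. The table `film_actor_demo` and its index names are hypothetical, and the plans mentioned in the comments are what one would typically expect, not guaranteed output:

```sql
-- Hypothetical table with two separate single-column indexes
CREATE TABLE film_actor_demo (
    film_id  INT UNSIGNED NOT NULL,
    actor_id INT UNSIGNED NOT NULL,
    KEY idx_film (film_id),
    KEY idx_actor (actor_id)
);

-- With only the single-column indexes, EXPLAIN may report
-- type = index_merge for a query like this
EXPLAIN SELECT film_id, actor_id
FROM film_actor_demo
WHERE actor_id = 1 AND film_id = 1;

-- A multi-column index lets a single index serve both conditions
ALTER TABLE film_actor_demo ADD KEY idx_actor_film (actor_id, film_id);

-- Can use the leftmost prefix (actor_id) of the new index
EXPLAIN SELECT film_id FROM film_actor_demo WHERE actor_id = 1;

-- film_id alone is not a leftmost prefix of idx_actor_film, so that index
-- cannot be used for this lookup; the optimizer may fall back to idx_film
EXPLAIN SELECT actor_id FROM film_actor_demo WHERE film_id = 1;
```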

Index merge

MySQL scans several indexes on a single table, each with its own condition, and then merges the results (a union for OR conditions, an intersection for AND conditions).

Clustered index

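The row data is stored together with the index: the leaf pages of a clustered index hold the rows themselves. In InnoDB the clustered index is the primary key.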

Advantages and disadvantages:

1. Best suited to I/O-bound workloads. If all the data fits in memory, the order of access matters little and clustering brings not much benefit.

2. Insert speed depends heavily on insertion order. Inserting rows in primary key order is fastest.

3. Updating the clustered index columns (the primary key) is expensive, because the whole row has to be moved to a new location, which may also cause page splits.

4. Full table scans may be slower, especially if rows are stored sparsely or out of order because of page splits.

5. The leaf nodes of a secondary index store the primary key columns, so a lookup through a secondary index needs two index lookups: one in the secondary index and one in the clustered index.
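
Points 2 and 5 can be illustrated with a small table. The table `userinfo_demo` and its columns are hypothetical; the sketch assumes InnoDB:

```sql
-- Hypothetical InnoDB table: the primary key is the clustered index
CREATE TABLE userinfo_demo (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- new rows arrive in key order
    name  VARCHAR(64)  NOT NULL,
    email VARCHAR(128) NOT NULL,
    PRIMARY KEY (id),
    KEY idx_name (name)                          -- leaf entries hold (name, id)
) ENGINE=InnoDB;

-- Lookup through the secondary index: find id in idx_name first,
-- then fetch the row from the clustered index by id
SELECT email FROM userinfo_demo WHERE name = 'alice';

-- Lookup by primary key: a single traversal of the clustered index
SELECT email FROM userinfo_demo WHERE id = 1;
```

An AUTO_INCREMENT key is one common way to get sequential inserts; a random key such as a UUID would spread inserts across pages instead.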

Covering index

If an index contains all the columns a query needs (the selected columns as well as the columns used in conditions), it is called a covering index for that query.

Benefits: the query is answered from the index alone, with no lookup back to the table rows, so it is fast. Indexes are also much smaller than the full rows, so they fit in memory more easily.

Covering indexes have to store the values of the indexed columns, so in MySQL only B-tree indexes can act as covering indexes.
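
Whether a query is covered can be checked with EXPLAIN. The table `inventory_demo` and the index `idx_store_film` below are hypothetical, and the Extra values in the comments are what one would normally expect to see:

```sql
-- Hypothetical table and index
CREATE TABLE inventory_demo (
    inventory_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    store_id     INT UNSIGNED NOT NULL,
    film_id      INT UNSIGNED NOT NULL,
    note         VARCHAR(255),
    PRIMARY KEY (inventory_id),
    KEY idx_store_film (store_id, film_id)
) ENGINE=InnoDB;

-- Covered: every referenced column is in idx_store_film, so EXPLAIN is
-- expected to show "Using index" in the Extra column
EXPLAIN SELECT store_id, film_id FROM inventory_demo WHERE store_id = 1;

-- Not covered: note is not in the index, so the table rows must be read
EXPLAIN SELECT store_id, film_id, note FROM inventory_demo WHERE store_id = 1;
```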

A covering index can answer a query without going back to the table only when it contains every column the query needs. When it does not, we can still use a covering index in a subquery to find the matching rows first and fetch the full rows only afterwards, which reduces the number of rows accessed. This is called a deferred join. (Newer versions of MySQL may not need this complicated trick.)

For example:

EXPLAIN SELECT * FROM products WHERE actor='SEAN CARREY' AND title LIKE '%APOLLO%';

No index can cover this query, because it selects all columns. Even with a multi-column index on (actor, title), MySQL only uses the index to find the entries with actor='SEAN CARREY', then goes back to the table rows to evaluate the LIKE condition.

In theory, the LIKE filter could be applied to the index entries first, and only the matching rows fetched from the table. However, in the MySQL versions these notes refer to (before 5.6), the server cannot evaluate a LIKE with a leading wildcard inside the index.

Optimization: extend the index to (actor, title, prod_id) and rewrite the query:

EXPLAIN SELECT * FROM products JOIN (SELECT prod_id FROM products WHERE actor='SEAN CARREY' AND title LIKE '%APOLLO%') AS t1 ON (t1.prod_id=products.prod_id);

This is a deferred join: the subquery filters using only the covering index, and the full rows are fetched from the table only for the rows that pass the filter.

The underlying limitation is that the storage engine API in earlier versions of MySQL does not let the server pass query conditions down to the storage engine: rows are pulled from the storage engine and then filtered in the server. MySQL 5.6 adds index condition pushdown (ICP), which improves this case.

ICP

If part of the WHERE condition can be evaluated using only columns from the index, the MySQL server pushes that part of the condition down to the storage engine. The storage engine then uses the column values in each index entry to discard entries that do not satisfy the pushed condition, before reading the row. ICP can only be used with secondary indexes.

SELECT * FROM products WHERE actor='SEAN CARREY' AND title LIKE '%APOLLO%'; -- with an index on (actor, title)

For this query, the storage engine finds the index entries that match actor, then applies the LIKE filter to those index entries. Only the rows that pass are read from the base table and returned to the MySQL server layer.
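
The effect of ICP can be observed by toggling it for the session and comparing the plans. This sketch assumes MySQL 5.6 or later and a `products` table with an index on (actor, title); the Extra values in the comments are what one would typically expect:

```sql
-- With ICP enabled, Extra is expected to show "Using index condition"
SET optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM products
WHERE actor = 'SEAN CARREY' AND title LIKE '%APOLLO%';

-- With ICP disabled, the LIKE filter runs in the server layer after the
-- rows have been fetched (Extra typically shows "Using where")
SET optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM products
WHERE actor = 'SEAN CARREY' AND title LIKE '%APOLLO%';
```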

Using indexes for sorting

Conditions:

  • The index column order is the same as the ORDER BY column order, and all columns are sorted in the same direction
  • If the query joins several tables, all the columns in the ORDER BY clause must refer to the first table
  • The ORDER BY columns must form a leftmost prefix of the index
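
A few cases, sketched against a hypothetical table `rental_demo` with one three-column index. The constant-leading-column case is an exception described in High Performance MySQL; the expected plans in the comments assume an ascending index and are not guaranteed output:

```sql
-- Hypothetical table and index
CREATE TABLE rental_demo (
    rental_id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    rental_date  DATETIME     NOT NULL,
    inventory_id INT UNSIGNED NOT NULL,
    customer_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (rental_id),
    KEY idx_date_inv_cust (rental_date, inventory_id, customer_id)
) ENGINE=InnoDB;

-- Eligible: ORDER BY is a leftmost prefix of the index, one direction
EXPLAIN SELECT rental_id FROM rental_demo
ORDER BY rental_date, inventory_id;

-- Eligible: the leading column is fixed by a constant, so the remaining
-- ORDER BY columns continue the leftmost prefix
EXPLAIN SELECT rental_id FROM rental_demo
WHERE rental_date = '2005-05-25 00:00:00'
ORDER BY inventory_id, customer_id;

-- Not eligible: mixed sort directions (expect "Using filesort" in Extra)
EXPLAIN SELECT rental_id FROM rental_demo
WHERE rental_date = '2005-05-25 00:00:00'
ORDER BY inventory_id DESC, customer_id ASC;

-- Not eligible: inventory_id alone is not a leftmost prefix
EXPLAIN SELECT rental_id FROM rental_demo
ORDER BY inventory_id;
```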

Indexes and locking

Indexes let queries lock fewer rows: a query that never touches rows it does not need takes fewer row locks, which increases concurrency. Separately, avoiding index and data fragmentation can speed up queries.