Index types
From the perspective of data structures:
- B+ tree indexes, the default index structure in InnoDB and MyISAM.
- Hash indexes are built on hash tables and work only for exact-value matches. Collisions are resolved with linked lists. Lookups are very fast, O(1) on average.
- Full-text indexes look for keywords in text rather than comparing whole values against the index, similar to what search engines do.
- Spatial indexes (R-tree). MyISAM supports spatial indexes, which can be used to store geographic data. They index data across all dimensions, so queries can combine any of the dimensions efficiently.
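The difference between a hash index and an ordered (B+ tree style) index can be sketched in a few lines of Python. This is only an illustration, not how MySQL implements either structure: a `dict` stands in for the hash table, a sorted key list with binary search stands in for the B+ tree, and the `rows` data is made up.

```python
import bisect

# Hypothetical id -> name data, used only for this illustration.
rows = {17: "carol", 3: "alice", 42: "dave", 8: "bob"}

# Hash-style index: average O(1) lookup, but only for an exact key.
hash_index = dict(rows)
exact = hash_index.get(8)

# Ordered index (stand-in for a B+ tree): sorted keys allow range scans.
sorted_keys = sorted(rows)


def range_scan(low, high):
    """Return names for keys in [low, high] using binary search on sorted keys."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [rows[k] for k in sorted_keys[start:end]]


in_range = range_scan(5, 20)
```

The hash lookup can answer "id = 8" directly, but it has no notion of key order, so a question like "ids between 5 and 20" needs the ordered structure.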
From the perspective of physical storage:
- Clustered indexes. InnoDB stores the index and the data rows in the same structure: the data field of each leaf node holds the complete row.
Advantages: Data access is fast. Because the index and the data live in the same B+ tree, reading a row is faster than through a non-clustered index. Related data can also be kept together. For example, clustering mail by user ID means all the mail for one user can be fetched by reading only a few data pages from disk.
Disadvantages: Insertion speed depends heavily on insertion order, and out-of-order writes can cause frequent page splits that move large amounts of data. When a row must go into a page that is already full, the storage engine splits that page into two to make room. Updating the clustered key columns is expensive because each updated row has to be moved to a new location.
Insert data in primary key order whenever possible, and use monotonically increasing clustered key values for new rows.
- Non-clustered indexes. In MyISAM the index file is separate from the data file, and the data field of each leaf node stores the address of the row data.
Advantages: Updates cost less than with a clustered index.
Disadvantages: Like clustered indexes, non-clustered indexes rely on ordered data. A second lookup (going back to the table) may be needed: once the pointer or primary key for an index entry is found, the row still has to be fetched from the data file or table using that pointer or primary key.
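The effect of insertion order on page splits in a clustered index can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not InnoDB's actual algorithm: pages are plain sorted lists, `PAGE_CAPACITY` is a made-up tiny page size, and a key appended past the end of the last full page simply opens a new page.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical tiny page size so splits are easy to see


def insert_all(keys):
    """Insert keys into sorted pages; return how many rows page splits moved."""
    pages = [[]]
    rows_moved = 0
    for key in keys:
        # Locate the page whose key range should hold this key.
        first_keys = [p[0] for p in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Append past the end of the last page: open a new page, move nothing.
            pages.append([key])
            continue
        # Insert into the middle of a full page: split it and move the upper half.
        mid = PAGE_CAPACITY // 2
        upper = page[mid:]
        del page[mid:]
        rows_moved += len(upper)
        pages.insert(idx + 1, upper)
        bisect.insort(page if key < upper[0] else upper, key)
    return rows_moved


sequential = list(range(100))
# Evens first, then odds: every odd key lands between existing keys.
out_of_order = list(range(0, 100, 2)) + list(range(1, 100, 2))

moved_sequential = insert_all(sequential)
moved_out_of_order = insert_all(out_of_order)
```

In this model the monotonically increasing keys never move an existing row, while the out-of-order keys repeatedly split full pages, which is the reason for preferring increasing primary keys.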
From a logical point of view:
- Normal (single-column) indexes. The only purpose of a normal index is to speed up queries. A table can have multiple normal indexes, and the indexed column may contain duplicate values and NULLs.
INDEX index_name ( column )
- Unique indexes. A unique index is also a constraint: the indexed column cannot contain duplicate values, although NULLs are allowed. A table can have multiple unique indexes.
UNIQUE ( column )
- Primary key indexes: the index on the table's primary key column.
PRIMARY KEY ( column )
- Composite indexes are created on multiple columns and are used according to the leftmost matching principle.
INDEX index_name ( column1, ... )
- Prefix indexes cover only the leading part of a string column.
INDEX index_name ( column(10) )
- Full-text indexes look for keywords in text rather than comparing whole values against the index, similar to what search engines do.
FULLTEXT ( column )
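The leftmost matching principle mentioned for multi-column indexes follows from how the entries are ordered. The sketch below is an illustration only: it models a hypothetical index on (last_name, first_name) as a sorted list of tuples, and the names and function names are invented for the example.

```python
import bisect

# Hypothetical index on (last_name, first_name): entries are sorted as tuples,
# so they are ordered by last_name first and by first_name only within ties.
index = sorted([
    ("Lee", "Ann"), ("Lee", "Bob"), ("Ng", "Ann"),
    ("Smith", "Ann"), ("Smith", "Zoe"),
])


def seek_by_last_name(last_name):
    """Leading column constrained: binary search finds one contiguous run."""
    start = bisect.bisect_left(index, (last_name,))
    end = start
    while end < len(index) and index[end][0] == last_name:
        end += 1
    return index[start:end]


def scan_by_first_name(first_name):
    """Leading column missing: matches are scattered, so every entry is checked."""
    return [entry for entry in index if entry[1] == first_name]


smiths = seek_by_last_name("Smith")
anns = scan_by_first_name("Ann")
```

Entries for one `last_name` sit next to each other, so a seek is enough. Entries for one `first_name` are spread across the whole list, which is why a condition on the second column alone cannot use the index ordering.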
From the point of view of role:
- Primary key indexes: the index on the table's primary key column. If a table has no explicit primary key, InnoDB looks for a unique index whose columns are all NOT NULL and uses the first one as the primary key; if there is none, InnoDB automatically creates a hidden 6-byte auto-incrementing row ID as the primary key.
- Secondary (auxiliary) indexes: every index other than the primary key index. In MyISAM, secondary indexes and the primary key index have roughly the same structure. In InnoDB, the leaf nodes of a secondary index store the primary key value, so locating a row through a secondary index takes two index lookups. Using primary key values as pointers makes the secondary index larger, but rows can be moved without updating the secondary index.
- Covering indexes. If an index contains (covers) the values of all the columns a query needs, it is called a "covering index." The query can then be answered directly from the index, without going back to the table.
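The back-to-table lookup described above, and how an index that already contains every requested column avoids it, can be sketched with two dictionaries. This is an illustrative model only: the `users` data, column names, and the `select` helper are assumptions made up for the example.

```python
# Clustered index: primary key -> full row (hypothetical users table).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}
# Secondary index on name: each entry holds the primary key, as in InnoDB.
secondary_name = {"alice": 1, "bob": 2}


def select(columns, name):
    """Answer SELECT columns WHERE name = ?; return (result, index lookups)."""
    lookups = 1
    pk = secondary_name[name]
    # Values that the secondary index entry itself can supply.
    available = {"name": name, "id": pk}
    if set(columns) <= set(available):
        # Every requested column is in the index: no trip back to the table.
        return {c: available[c] for c in columns}, lookups
    # Otherwise fetch the full row from the clustered index (second lookup).
    lookups += 1
    row = clustered[pk]
    return {c: row[c] for c in columns}, lookups


covered = select(["id"], "bob")
not_covered = select(["age"], "bob")
```

Asking only for `id` is answered from the secondary index in one lookup, while asking for `age` needs a second lookup in the clustered index.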
Index optimization and SQL troubleshooting
- Enable the slow query log and look for slow queries.
- In an offline environment, run EXPLAIN first and check that the execution plan is what you expect: ① whether the key field shows that an index is used, and which one; ② whether the type field is ALL (a full table scan); ③ whether the rows field, the number of rows MySQL expects to scan, is too large. This number is an estimate, because data is organized in pages and the statistics come from sampling; ④ whether the Extra field shows that a sort is needed (the index order cannot be used) or that a temporary table is used; ⑤ for a composite index, whether key_len shows that the whole index is used.
- Understand the business scenario. See how the business uses the SQL and optimize for that use.
- Rule out cache interference. If response time (RT) swings between fast and slow, the cache may be the cause, because cached results are invalidated whenever the table is updated.
- Use covering indexes. When an index contains the values of all the columns a query needs, the query can be answered from the index without going back to the table. For example, an InnoDB secondary index contains the primary key value, so a query that only needs the primary key and is resolved through a secondary index is covered.
- Use composite indexes. For example, if you look up inventory by name, a composite index on name and inventory lets the query read the inventory straight from the index, instead of finding the primary key by name and then going back to the table. Weigh this against the business scenario so that the indexes do not take up too much space.
- Mind the leftmost prefix rule: a composite index is usable only when the query constrains its leading columns, in the order in which the index defines them. A query that filters only on the leftmost column of a composite index, including a fuzzy (LIKE) match without a leading wildcard, can still use that index, so no separate single-column index is needed.
- Choose an appropriate column order. When sorting and grouping are not a concern, it is usually good to put the most selective columns first in the index.
- Index condition pushdown, an optimization applied automatically since MySQL 5.6. Take a composite index on (name, age) and a query that matches name with LIKE and also filters on age. Without pushdown, only name can be used to search the index: every entry that matches name is looked up in the table, and the age condition is checked on the full row. With pushdown, once name has matched, the age condition is checked directly against the index entry, and only entries that pass go back to the table, so fewer rows are fetched. The idea is to make full use of the data in the index and filter out non-matching entries before reading whole rows.
- Use prefix indexes. When you need to index a string column, a prefix index saves space. If the prefix is not selective enough, you can store the string in reverse order or index a hash of it instead.
- Watch for implicit type conversion. For example, if id is a character column and the query compares it with an integer, MySQL effectively wraps the column in a conversion function and the index is not used. Joining two tables whose columns use different character sets can have the same effect, for example utf8mb4 (up to 4 bytes per character) and utf8 (up to 3 bytes).
- Be cautious about indexing frequently updated columns. Columns that are rarely queried do not need an index.
- Follow the three-star index design guideline where you can, although all three stars are generally hard to achieve and the choice depends on actual cost and the business scenario: ① the columns in the WHERE clause form a single-column or composite index; ② the ORDER BY columns are included in the index so that no separate sort is needed; ③ use covering indexes whenever possible.
- Be aware of the scenarios in which an index cannot be used.
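The index pushdown item in the list above can be illustrated by counting how many full rows have to be fetched. The sketch below is a toy model, not MySQL's implementation: the index on (name, age), the table contents, and the `query` helper are all hypothetical, and a simple loop with `startswith` stands in for the index range scan.

```python
# Hypothetical table: primary key -> full row.
table = {
    1: {"name": "zhang_a", "age": 10},
    2: {"name": "zhang_b", "age": 25},
    3: {"name": "zhang_c", "age": 10},
    4: {"name": "li_a", "age": 10},
}
# Hypothetical index on (name, age); each entry also carries the primary key.
index_entries = sorted((row["name"], row["age"], pk) for pk, row in table.items())


def query(prefix, age, pushdown):
    """WHERE name LIKE 'prefix%' AND age = ?; return (primary keys, table lookups)."""
    matches, table_lookups = [], 0
    for name, entry_age, pk in index_entries:
        if not name.startswith(prefix):
            continue  # stands in for the range scan on the name column
        if pushdown and entry_age != age:
            continue  # age is checked on the index entry, no table access
        table_lookups += 1
        row = table[pk]  # going back to the table for the full row
        if row["age"] == age:
            matches.append(pk)
    return matches, table_lookups


without_icp = query("zhang", 10, pushdown=False)
with_icp = query("zhang", 10, pushdown=True)
```

Both calls return the same rows, but with pushdown the entry whose age does not match is discarded while still in the index, so one fewer row is read from the table.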