Preface

What is a MySQL index? Do you really know?

An index is a data structure that helps MySQL retrieve data efficiently. Querying is one of the most important functions of a database, and we all want queries to run as fast as possible, so database designers optimize from the angle of the query algorithm.

How MySQL uses indexes

Indexes are used to quickly find rows with specific column values. Without an index, MySQL must start with the first row and read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index on the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. This is much faster than reading every row sequentially.
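
As a minimal sketch of that difference, assume a hypothetical table named orders with a customer_id column (neither name comes from the text above). EXPLAIN shows how MySQL plans to resolve the same query before and after the index is added:

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (
    id          INT NOT NULL AUTO_INCREMENT,
    customer_id INT NOT NULL,
    total       DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (id)
);

-- No index on customer_id yet, so MySQL has to read the whole table.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Add a secondary index on the column used in the WHERE clause.
CREATE INDEX idx_customer ON orders (customer_id);

-- The same query can now be resolved through idx_customer.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```

On a table with enough rows, the type column of the first plan would typically show ALL and the second would show ref, although the exact plan depends on the data and the MySQL version.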

Most MySQL indexes (PRIMARY KEY, UNIQUE, INDEX, and FULLTEXT) are stored in B-trees. Exceptions: indexes on spatial data types use R-trees; MEMORY tables also support hash indexes; InnoDB uses inverted lists for FULLTEXT indexes.

MySQL uses indexes for the following operations:

1. To quickly find rows that match a WHERE clause.

2. To pick the best candidate when several indexes could be used. MySQL normally chooses the index that finds the smallest number of rows (the most selective index).

3. To look up rows through a multi-column index (also called a “composite index” or “union index”). The optimizer can use any leftmost prefix of the index to find rows. For example, a three-column index on (col1, col2, col3) supports indexed lookups on (col1), (col1, col2), and (col1, col2, col3).

4. To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if the columns are declared with the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared with the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.

5. For comparisons between non-binary string columns, both columns should use the same character set.

6. To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable index (for example, ORDER BY key_part1, key_part2). If all key parts are followed by DESC, the key is read in reverse order.

7. To satisfy an ORDER BY clause directly from the index in some cases, which avoids the extra sorting involved in a filesort operation.

8. To retrieve values without reading the data rows in some cases. (An index that provides all the results a query needs is called a covering index.) If a query uses only columns that are included in some index of the table, the selected values can be read from the index tree, which is faster.

Finally, indexes are less important for queries on small tables, or for queries on large tables that need to access most of the rows. When a query needs to access most rows, reading sequentially is faster than working through an index.
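
Two of the uses listed above, sorting through an index and covering indexes, can be sketched as follows. The employees table and the idx_dept_hire index are hypothetical names chosen for this illustration:

```sql
-- Hypothetical table with a two-column index.
CREATE TABLE employees (
    id        INT NOT NULL,
    dept_id   INT NOT NULL,
    hire_date DATE NOT NULL,
    name      VARCHAR(50) NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_dept_hire (dept_id, hire_date)
);

-- dept_id is the leftmost prefix and hire_date is the next key part,
-- so the rows can be read from the index already in sorted order.
EXPLAIN SELECT * FROM employees WHERE dept_id = 3 ORDER BY hire_date;

-- Only indexed columns are selected, so idx_dept_hire can act as a
-- covering index and the data rows do not need to be read.
EXPLAIN SELECT dept_id, hire_date FROM employees WHERE dept_id = 3;
```

When an index covers a query, the Extra column of the EXPLAIN output reports Using index. Whether the optimizer actually chooses these plans depends on the table contents.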

Avoiding full table scans

When MySQL uses a full table scan to resolve a query, the output from EXPLAIN shows ALL in the type column. This usually happens when:

1. The table is so small that a full table scan is faster than an index lookup. This is common for tables with fewer than 10 rows and a short row length.

2. There are no usable restrictions on indexed columns in the ON or WHERE clause.

3. You are comparing indexed columns with constant values, and MySQL has calculated (based on the index tree) that the constants cover too large a part of the table, so a table scan would be faster.

4. You are using a key with low cardinality (many rows match each key value) through another column. In this case, MySQL assumes that using the key would probably require many key lookups and that a table scan would be faster.

For small tables, table scans are usually appropriate and have a negligible impact on performance.

For large tables, you can try the following techniques to keep the optimizer from incorrectly choosing a table scan:

1. Use ANALYZE TABLE tbl_name to update the key distribution statistics.

2. Use FORCE INDEX to tell MySQL that a table scan is very expensive compared to using the given index.
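
Both techniques might look like this in practice. The shipments table and the idx_status index are hypothetical and exist only to make the sketch self-contained:

```sql
-- Hypothetical table with a secondary index.
CREATE TABLE shipments (
    id     INT NOT NULL,
    status TINYINT NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_status (status)
);

-- Refresh the key distribution statistics the optimizer relies on.
ANALYZE TABLE shipments;

-- Tell the optimizer that a table scan should be treated as very
-- expensive compared with using idx_status.
SELECT * FROM shipments FORCE INDEX (idx_status) WHERE status = 2;
```

FORCE INDEX overrides the optimizer's own estimate, so it is probably best kept for cases where EXPLAIN shows a scan that is known to be the slower choice.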

Column indexes

The B-tree data structure lets an index quickly find a specific value, a set of values, or a range of values corresponding to operators such as =, >, ≤, BETWEEN, and IN in a WHERE clause.

Each storage engine defines the maximum number of indexes per table and the maximum index length. All storage engines support at least 16 indexes per table and a total index length of at least 256 bytes.

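Index prefixes

With col_name(N) syntax in an index specification for a string column, you can create an index that uses only the first N characters of the column. For InnoDB tables, prefixes can be up to 767 bytes long with the REDUNDANT or COMPACT row format, and up to 3072 bytes with the DYNAMIC or COMPRESSED row format.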
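
A short sketch of the syntax, using a hypothetical articles table (the table, column, and index names are assumptions, not taken from the text above):

```sql
-- Hypothetical table; only the first 10 characters of title are indexed.
CREATE TABLE articles (
    id    INT NOT NULL,
    title VARCHAR(255) NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_title_prefix (title(10))
);

-- The same kind of index can be added to an existing table.
CREATE INDEX idx_title_prefix20 ON articles (title(20));

-- Rough selectivity check: the closer this ratio is to 1,
-- the better a 10-character prefix distinguishes the rows.
SELECT COUNT(DISTINCT LEFT(title, 10)) / COUNT(*) AS prefix_selectivity
FROM articles;
```

Comparing the ratio for a few prefix lengths is one way to choose N: a shorter prefix saves space, a longer one filters more precisely.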

FULLTEXT indexes

The FULLTEXT index is used for full-text search. Only InnoDB and MyISAM storage engines support FULLTEXT indexes, and only CHAR, VARCHAR and TEXT columns are supported. Indexing is always done on the entire column, and column prefix indexing is not supported.
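
A minimal sketch of a full-text search, assuming a hypothetical posts table (all names here are illustrative):

```sql
-- Hypothetical table with a FULLTEXT index over two text columns.
CREATE TABLE posts (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(200),
    body  TEXT,
    PRIMARY KEY (id),
    FULLTEXT INDEX ft_title_body (title, body)
) ENGINE=InnoDB;

-- The column list in MATCH() must match the column list of the index.
SELECT id, title
FROM posts
WHERE MATCH (title, body) AGAINST ('database' IN NATURAL LANGUAGE MODE);
```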

Spatial indexes

A spatial index is a data structure that arranges spatial objects in a certain order according to their position and shape, or according to the spatial relationships between them.
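
As a hedged sketch, the following assumes MySQL 8.0 (the SRID column attribute is not available in earlier versions) and a hypothetical places table:

```sql
-- Hypothetical table. Spatial indexes require NOT NULL columns; in
-- MySQL 8.0 the column also needs an SRID attribute for the optimizer
-- to use the index.
CREATE TABLE places (
    id       INT NOT NULL,
    location POINT NOT NULL SRID 0,
    PRIMARY KEY (id),
    SPATIAL INDEX idx_location (location)
);

-- Find places whose location falls inside a bounding rectangle.
SELECT id
FROM places
WHERE MBRContains(
    ST_GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))'),
    location
);
```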

Indexes in the MEMORY storage engine

By default, the MEMORY storage engine uses HASH indexes, but also supports BTREE indexes.
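
The choice between the two index types can be made per index. The session_cache table below is a hypothetical example:

```sql
-- Hypothetical MEMORY table mixing both index types.
CREATE TABLE session_cache (
    session_id CHAR(32) NOT NULL,
    user_id    INT NOT NULL,
    expires_at DATETIME NOT NULL,
    PRIMARY KEY (session_id),                    -- HASH by default
    INDEX idx_expires USING BTREE (expires_at)   -- BTREE for ranges
) ENGINE=MEMORY;

-- Equality lookup: a good fit for the hash index.
SELECT user_id FROM session_cache WHERE session_id = 'abc123';

-- Range condition: needs the B-tree index.
SELECT session_id FROM session_cache WHERE expires_at < NOW();
```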

Multiple-column indexes

MySQL can create compound indexes (that is, indexes on multiple columns). An index can contain up to 16 columns.

Suppose a table is defined as follows:

CREATE TABLE test (
    id         INT NOT NULL,
    last_name  CHAR(30) NOT NULL,
    first_name CHAR(30) NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_name (last_name,first_name)
);

The idx_name index is an index over the last_name and first_name columns. It can be used for queries that specify values for both last_name and first_name, and also for queries that specify only last_name, because that column is a leftmost prefix of the index.

Therefore, the idx_name index can be used for the following queries:

SELECT * FROM test WHERE last_name='Jones';

SELECT * FROM test WHERE last_name='Jones' AND first_name='John';

SELECT * FROM test WHERE last_name='Jones' AND (first_name='John' OR first_name='Jon');

SELECT * FROM test WHERE last_name='Jones' AND first_name >='M' AND first_name < 'N';

However, the idx_name index cannot be used for lookups in the following queries:

SELECT * FROM test WHERE first_name='John';

SELECT * FROM test WHERE last_name='Jones' OR first_name='John';

Consider the following SQL:

SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

If a multi-column index exists on col1 and col2, the appropriate rows can be fetched directly. If separate single-column indexes exist on col1 and col2, the optimizer tries to use the index merge optimization, or it tries to find the most restrictive index by deciding which index excludes more rows and uses that index to fetch the rows.

If the table has a multi-column index, the optimizer can use any leftmost prefix of that index to find rows. For example, a three-column index on (col1, col2, col3) supports indexed lookups on (col1), (col1, col2), and (col1, col2, col3).

If the columns do not form a leftmost prefix of the index, MySQL cannot use the index to perform a lookup.

Consider the following SELECT statements:

SELECT * FROM tbl_name WHERE col1=val1;
SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

SELECT * FROM tbl_name WHERE col2=val2;
SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;

If a composite index exists on (col1, col2, col3), only the first two queries use it. The last two queries do not use the index to perform lookups because (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).

Comparison of B-tree and Hash indexes

B-tree index characteristics

B-tree indexes can be used to compare columns in expressions using the =, >, >=, <, <=, BETWEEN operators. The index can also be used for LIKE comparisons if the argument to LIKE is a constant string that does not begin with a wildcard character.

The following SELECT statements do not use indexes:

/* the LIKE value begins with a wildcard character */
SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';
/*  the LIKE value is not a constant */
SELECT * FROM tbl_name WHERE key_col LIKE other_col;

Any index that does not span all AND levels in the WHERE clause is not used to optimize the query. In other words, to be able to use an index, a prefix of that index must be used in every AND group.

The following WHERE clauses use indexes:

... WHERE index_part1=1 AND index_part2=2 AND other_column=3

    /* index = 1 OR index = 2 */
... WHERE index=1 OR A=10 AND index=2

    /* optimized like "index_part1='hello'" */
... WHERE index_part1='hello' AND index_part3=5

    /* Can use index on index1 but not on index2 or index3 */
... WHERE index1=1 AND index2=2 OR index1=3 AND index3=3;

The following WHERE clauses do not use indexes:

   /* index_part1 is not used */
... WHERE index_part2=1 AND index_part3=2

    /*  Index is not used in both parts of the WHERE clause  */
... WHERE index=1 OR A=10

    /* No index spans all rows  */
... WHERE index_part1=1 OR index_part2=10

Sometimes MySQL does not use an index even when one is available. One reason is that the optimizer estimates that using the index would require access to a very large percentage of the rows in the table. (In this case, a table scan is likely to be faster because it requires fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL uses the index anyway, because it can find the few rows to return much more quickly.

Hash index characteristics

Hash indexes have somewhat different characteristics from the B-tree indexes just discussed:

1. Hash indexes are used only for equality comparisons with the = or <=> operators (but these are very fast). They are not used for comparison operators that find a range of values. Systems that rely on this kind of single-value lookup are known as “key-value stores”. To use MySQL for such applications, use hash indexes wherever possible.

2. The optimizer cannot use a hash index to speed up ORDER BY operations. (A hash index cannot be used to search for the next entry in order.)

3. MySQL cannot determine approximately how many rows lie between two values (the range optimizer uses this to decide which index to use).

4. Only whole keys can be used to search for rows. (With a B-tree index, any leftmost prefix of the key can be used to find rows.)

B-tree

A tree data structure that is widely used in database indexes. The structure is kept sorted at all times, which enables fast lookups for exact matches (the equality operator) and ranges (for example, the greater than, less than, and BETWEEN operators). This type of index is available for most storage engines, such as InnoDB and MyISAM.

Because a B-tree node can have many children, a B-tree is not the same as a binary tree, where each node has at most two children.

The term B-tree is used here to refer to the general class of index design. Because the structures used by MySQL storage engines include sophistications not present in the classic B-tree design, they may be regarded as variants.

Hash index

An index type intended for queries that use equality operators rather than range operators. It is available for MEMORY tables. Although hash indexes are the default for MEMORY tables for historical reasons, that storage engine also supports B-tree indexes, which are often a better choice for general-purpose queries.

Optimize data size

Design tables so that they take up the least amount of space on disk. This can lead to significant improvements by reducing the amount of data written to and read from disk. Smaller tables typically require less main memory to process their contents during query execution. Any reduction in space for table data results in smaller indexes, which can be processed faster.

MySQL supports many different storage engines (table types) and row formats. For each table, you can decide which storage and indexing method to use. Choosing the right table format for your application can greatly improve performance.

Table Columns

1. Use the most efficient (smallest) data types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types where possible to get smaller tables. MEDIUMINT is often a better choice than INT because a MEDIUMINT column uses 25% less space (3 bytes instead of 4).

2. If possible, declare the column as NOT NULL. It can make SQL operations faster by making better use of indexes and eliminating the overhead of testing each value for NULL. It also saves some storage space, one bit per column. If NULL values are really needed in the table, use them. Just avoid the default setting, which allows NULL values in every column.
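
A small sketch of both recommendations, using a hypothetical page_views table whose names and value ranges are assumptions made for the example:

```sql
-- Hypothetical table: every column uses the smallest type that fits
-- the expected values, and every column is declared NOT NULL.
CREATE TABLE page_views (
    page_id   MEDIUMINT UNSIGNED NOT NULL,          -- 3 bytes, up to 16777215
    view_date DATE NOT NULL,                        -- 3 bytes
    views     SMALLINT UNSIGNED NOT NULL DEFAULT 0, -- 2 bytes, up to 65535
    PRIMARY KEY (page_id, view_date)
);
```

The trade-off is headroom: a counter that could exceed 65535 would need a wider type, so the sizes should follow the real range of the data.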

Row Format

1. To reduce space even further by storing table data in compressed form, specify ROW_FORMAT=COMPRESSED when creating InnoDB tables.
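
For illustration, a compressed table might be declared like this. The archived_logs table is hypothetical, and the sketch assumes innodb_file_per_table is enabled, which compressed tables need in file-per-table tablespaces:

```sql
-- Hypothetical InnoDB table stored in compressed form.
-- KEY_BLOCK_SIZE is optional; 8 requests 8KB compressed pages.
CREATE TABLE archived_logs (
    id      BIGINT NOT NULL,
    message TEXT,
    PRIMARY KEY (id)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
```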

Indexes

1. The primary key of a table should be as short as possible. This makes identifying each row easy and efficient. For InnoDB tables, the primary key columns are duplicated in every secondary index entry, so a short primary key saves considerable space if you have many secondary indexes.

2. Create only the indexes you need to improve query performance. Indexes are good for retrieval but slow down insert and update operations. If you access a table mostly by searching on a combination of columns, create a single composite index on them rather than a separate index for each column. The first part of the index should be the most commonly used column. If you always use many columns when selecting from the table, the first column in the index should be the one with the most duplicates, to obtain better compression of the index.

3. If a long string column is very likely to have a unique prefix within its first few characters, it is better to index only that prefix, using MySQL's support for indexing the leftmost part of a column. Shorter indexes are faster, not only because they require less disk space, but also because they give more hits in the index cache and therefore fewer disk seeks.

Joins

1. Declare columns with the same information in different tables with the same data type to speed up joins based on corresponding columns.

2. Keep column names simple so that the same name can be used across different tables, which simplifies join queries. For example, in a table named customer, use a column name of name instead of customer_name. To make your names portable to other SQL servers, consider keeping them shorter than 18 characters.

Normalization

1. In general, try to keep all data non-redundant (what database theory calls third normal form). Instead of repeating lengthy values such as names and addresses, assign them unique IDs, repeat these IDs across multiple smaller tables as needed, and join the tables in queries by referencing the IDs in the join clause.
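
A minimal sketch of this idea, assuming two hypothetical tables, countries and customers:

```sql
-- The lengthy value (the country name) is stored once...
CREATE TABLE countries (
    id   SMALLINT UNSIGNED NOT NULL,
    name VARCHAR(60) NOT NULL,
    PRIMARY KEY (id)
);

-- ...and referenced everywhere else by its short ID.
CREATE TABLE customers (
    id         INT NOT NULL,
    name       VARCHAR(50) NOT NULL,
    country_id SMALLINT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_country (country_id)
);

-- Join on the ID to get the readable value back.
SELECT c.name, co.name AS country
FROM customers AS c
JOIN countries AS co ON co.id = c.country_id;
```

Note that country_id and countries.id are declared with the same type, which also follows the earlier advice about join columns.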

Optimize data types

Numeric types

1. For unique IDs or other values that can be represented as either strings or numbers, prefer numeric columns. Large numeric values take up fewer bytes of storage than the corresponding strings, so transferring and comparing them is faster and uses less memory.

Character and string types

1. When comparing values from different columns, declare these columns as much as possible using the same character set and collation rules to avoid string conversions when running queries.

2. For column values less than 8KB, use binary VARCHAR instead of BLOB. The GROUP BY and ORDER BY clauses can generate temporary tables, and if the original table does not contain any BLOB columns, these temporary tables can use the MEMORY storage engine.

3. If a table contains string columns such as name and address, but many queries do not retrieve these columns, consider splitting the string columns into separate tables and using join queries with foreign keys if necessary. When MySQL retrieves any value from a row, it reads a data block containing all the columns of that row (and possibly other adjacent rows). Keeping each row small and containing only the most commonly used columns allows for more rows per block. This compact table reduces disk I/O and memory usage for common queries.

4. When using a randomly generated value as the primary key in an InnoDB table, it is best to precede it with an ascending value such as the current date and time (if possible). When consecutive primary key values are stored physically next to each other, InnoDB can insert and retrieve them more quickly.
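
One possible way to apply the last point is a composite primary key whose first part is a timestamp. The events table is hypothetical, and the sketch assumes a MySQL version that provides RANDOM_BYTES():

```sql
-- Hypothetical table: the ascending created_at value comes first in the
-- primary key, so new rows land near each other in the clustered index.
CREATE TABLE events (
    created_at DATETIME(6) NOT NULL,
    token      BINARY(16) NOT NULL,   -- randomly generated part
    payload    VARCHAR(255),
    PRIMARY KEY (created_at, token)
) ENGINE=InnoDB;

INSERT INTO events (created_at, token, payload)
VALUES (NOW(6), RANDOM_BYTES(16), 'example');
```

The cost of this design is a wider primary key, which is repeated in every secondary index entry, so it is a trade-off rather than a universal rule.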

Other

1. MySQL uses temporary tables when ORDER BY and GROUP BY use different columns, or when ORDER BY or GROUP BY uses columns from a table other than the first table in the join order.

2. MySQL has a hard limit of 4096 columns per table, but the effective maximum for a given table may be lower. InnoDB has a limit of 1017 columns per table.
