MySQL index creation and design principles

1. Index classification

MySQL indexes include normal indexes, unique indexes, full-text indexes, single-column indexes, multi-column indexes, and spatial indexes.

  • By function, there are four kinds of indexes: normal indexes, unique indexes, primary key indexes, and full-text indexes.
  • By physical implementation, there are two kinds: clustered indexes and non-clustered indexes.
  • By the number of fields covered, there are single-column indexes and composite (joint) indexes.

1. Normal indexes

A normal index exists only to improve query efficiency and carries no restrictions. It can be created on a field of any data type; whether its values must be unique or non-null is determined by the integrity constraints of the field itself. Once the index is created, queries can use it. For example, with a normal index on the name field of the student table, queries that filter on name can use the index.

2. Unique indexes

The UNIQUE keyword marks an index as a unique index. When a UNIQUE constraint is declared on a field, MySQL automatically creates the corresponding unique index. The values in a unique index must be unique, but NULL values are allowed. A table can have multiple unique indexes.

For example, with a unique index on the email field of the student table, every non-NULL email value must be unique. A unique index lets a single record be located more quickly.
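The "unique, but NULL allowed" behaviour can be tried out with a few inserts. This is a minimal sketch; the table student_email_demo and its values are hypothetical and exist only for illustration.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE student_email_demo (
    id    INT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(100),
    UNIQUE INDEX uk_email (email)
);

INSERT INTO student_email_demo (email) VALUES ('a@example.com');
INSERT INTO student_email_demo (email) VALUES (NULL);
-- Accepted: NULL values do not conflict with each other in a unique index.
INSERT INTO student_email_demo (email) VALUES (NULL);
-- Rejected with a duplicate-key error: the value already exists.
INSERT INTO student_email_demo (email) VALUES ('a@example.com');
```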

3. Primary key index

A primary key index is a special unique index that adds a NOT NULL constraint on top of uniqueness, that is, NOT NULL + UNIQUE. A table can have at most one primary key index.

Why? This follows from the physical implementation of the primary key index: in InnoDB the primary key index is the clustered index, and the rows in a data file can be stored in only one order.

4. Single-column index

A single-column index is created on a single field of a table and covers only that field. It can be a normal index, a unique index, or a full-text index, as long as it corresponds to exactly one field. A table can have multiple single-column indexes.

5. Multi-column (composite, joint) indexes

A multi-column index is created on a combination of fields in a table. It covers several fields and can serve queries on those fields, but it is used only when the first of those fields appears in the query condition.

For example, create a multi-column index idx_id_name_gender on the fields id, name, and gender of a table. The index is used only if the query condition includes the field id. Composite indexes follow the leftmost-prefix rule.
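The leftmost-prefix behaviour can be checked with EXPLAIN. The following is a minimal sketch: the table student_demo is hypothetical, and the plan actually chosen depends on the MySQL version and on table statistics.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE student_demo (
    id     INT NOT NULL,
    name   VARCHAR(30) NOT NULL,
    gender CHAR(1) NOT NULL,
    info   VARCHAR(255),
    INDEX idx_id_name_gender (id, name, gender)
);

-- The conditions start at the leftmost column, so the index can be used.
EXPLAIN SELECT * FROM student_demo WHERE id = 1 AND name = 'Tom';

-- The leftmost column id is missing, so the index cannot be used for the lookup.
EXPLAIN SELECT * FROM student_demo WHERE name = 'Tom' AND gender = 'M';
```

In the first plan the key column is expected to show idx_id_name_gender; in the second it is typically NULL, which means a full table scan.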

6. Full-text indexes

Full-text indexing (also known as full-text retrieval) is a key technology used by search engines. It uses [word segmentation technology] and other algorithms to analyze the frequency and importance of keywords in text, and then ranks and selects the search results according to those rules. Full-text indexes suit large data sets well and offer little benefit on small ones.

The FULLTEXT keyword marks an index as a full-text index. A full-text index supports full-text lookup on the columns it is defined on, and those columns may contain duplicate and NULL values. Full-text indexes can be created only on fields of the CHAR, VARCHAR, or TEXT family of types. They speed up queries on string fields that hold large amounts of text. For example, the information field in the student table is of type TEXT and holds a lot of text; a full-text index on information speeds up searches within it.

There are two typical types of full-text indexes: natural language full-text indexes and Boolean full-text indexes.

A natural language search computes the relevance of each document to the query. Relevance is based on the number of matched keywords and on how often each keyword appears in the document. The less often a word appears across the whole index, the more it contributes to relevance. Conversely, very common words are not searched: if a word appears in more than 50% of the records, a natural language search ignores it (this 50% threshold applies to MyISAM full-text indexes).
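The two search modes are written with MATCH ... AGAINST. This is a minimal sketch; the table articles_demo and the search terms are hypothetical, and the column list in MATCH must match the column list of the full-text index.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE articles_demo (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT INDEX ft_title_body (title, body)
) ENGINE = InnoDB;

-- Natural language mode (the default): matching rows are ranked by relevance.
SELECT id, title,
       MATCH(title, body) AGAINST('database index') AS score
FROM articles_demo
WHERE MATCH(title, body) AGAINST('database index' IN NATURAL LANGUAGE MODE);

-- Boolean mode: + marks a word that must be present, - a word that must be absent.
SELECT id, title
FROM articles_demo
WHERE MATCH(title, body) AGAINST('+mysql -oracle' IN BOOLEAN MODE);
```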

MySQL has supported full-text indexes since version 3.23, but before 5.6.4 only MyISAM supported them; InnoDB gained support in 5.6.4. The built-in parser does not segment Chinese text, which used to require a third-party word segmentation plug-in. Starting with version 5.7.6, MySQL ships a built-in ngram full-text parser that supports Chinese, Japanese, and Korean. Before testing or using full-text indexes, check whether your MySQL version, storage engine, and column data types support them.
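A quick way to check the server and to try the ngram parser is sketched below. The table articles_cjk_demo is hypothetical; WITH PARSER ngram requires MySQL 5.7.6 or later.

```sql
-- Check the server version and which storage engines are available.
SELECT VERSION();
SHOW ENGINES;

-- Hypothetical table whose full-text index uses the built-in ngram parser.
CREATE TABLE articles_cjk_demo (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT INDEX ft_ngram (title, body) WITH PARSER ngram
) ENGINE = InnoDB;

-- The token length is controlled by the ngram_token_size system variable (default 2).
SHOW VARIABLES LIKE 'ngram_token_size';
```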

With the arrival of the big data era, relational databases struggle to meet the demands of full-text search and are gradually being replaced for this purpose by dedicated search engines such as Solr and Elasticsearch.

7. Spatial index

The SPATIAL keyword marks an index as a spatial index. A spatial index can be created only on fields of a spatial data type, and it makes retrieving spatial data more efficient. The spatial data types in MySQL include GEOMETRY, POINT, LINESTRING, and POLYGON. Before MySQL 5.7 only the MyISAM storage engine supported spatial indexes; InnoDB supports them as of 5.7. Indexed fields cannot be NULL. Beginners rarely need this kind of index.

Summary: Different storage engines support different types of indexes

  • InnoDB: supports B-tree and full-text indexes, but not hash indexes.
  • MyISAM: supports B-tree and full-text indexes, but not hash indexes.
  • Memory: supports B-tree and hash indexes, but not full-text indexes.
  • NDB: supports hash indexes, but not B-tree or full-text indexes.
  • Archive: does not support B-tree, hash, or full-text indexes.

Suggestion: Don’t build more than 5 indexes for a table!

2. Create index

MySQL supports several ways to create an index on one or more columns:

  • specify the index columns in the CREATE TABLE statement when defining the table;
  • use the ALTER TABLE statement to create an index on an existing table;
  • or use the CREATE INDEX statement to add an index to an existing table.

2.1. Create an index when creating a table

Implicit creation: MySQL automatically adds an index on fields declared with a primary key, unique, or foreign key constraint.

CREATE TABLE dept(
    dept_id INT PRIMARY KEY AUTO_INCREMENT,  # creates the primary key index on dept_id
    dept_name VARCHAR(20)
);

CREATE TABLE emp(
    emp_id INT PRIMARY KEY AUTO_INCREMENT,   # creates the primary key index on emp_id
    emp_name VARCHAR(20) UNIQUE,             # creates a unique index on emp_name
    dept_id INT,
    CONSTRAINT emp_dept_id_fk FOREIGN KEY(dept_id) REFERENCES dept(dept_id)  # creates a normal index for the foreign key
);

Create an index explicitly

If the index is explicitly created when the table is created, the basic syntax is as follows:

CREATE TABLE table_name [col_name data_type]
[UNIQUE | FULLTEXT | SPATIAL] [INDEX | KEY] [index_name] (col_name [length]) [ASC | DESC]

  • UNIQUE, FULLTEXT, and SPATIAL are optional and select a unique index, a full-text index, or a spatial index respectively.
  • INDEX and KEY are synonyms. Either one declares an index.
  • index_name specifies the name of the index. If it is omitted, col_name is used as the index name by default.
  • col_name is the column to be indexed. It must be one of the columns defined in the table.
  • length is optional and gives the index length. It can be specified only for string fields.
  • ASC or DESC specifies whether index values are stored in ascending or descending order.

1. Create a normal index

Create a normal index on the year_publication field of the book table:

CREATE TABLE book(
    book_id INT,
    book_name VARCHAR(100),
    authors VARCHAR(100),
    info VARCHAR(100),
    comment VARCHAR(100),
    year_publication YEAR,
    INDEX(year_publication)  # declares the index; KEY can be used instead, with the same effect
);

To view the index:

# Method 1:
SHOW CREATE TABLE book\G
# Method 2:
SHOW INDEX FROM book;

Once the index exists, the indexed field can be used in WHERE conditions to search for records, which greatly improves query efficiency on large tables.

2. Create a unique index

Explicit creation, for example:

CREATE TABLE test1(
    id INT NOT NULL,
    name VARCHAR(30) NOT NULL,
    UNIQUE INDEX uk_idx_id(id)  # creates the unique index uk_idx_id
);

After this statement is executed, run SHOW INDEX to view the indexes of the table:

SHOW INDEX FROM test1\G

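3. Create a primary key index

Once a primary key is defined, the database creates the index automatically; in InnoDB it is the clustered index. Syntax: a primary key index can be created only by defining a primary key constraint.

Create the index together with the table:

CREATE TABLE student (
    id INT(10) UNSIGNED AUTO_INCREMENT,
    student_no VARCHAR(200),
    student_name VARCHAR(200),
    PRIMARY KEY(id)
);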

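Delete the primary key index (this fails while the key column still has the AUTO_INCREMENT attribute, so remove that attribute first):

ALTER TABLE student DROP PRIMARY KEY;

Modify the primary key index: a primary key cannot be changed in place. Drop the original primary key, then create a new one.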

4. Create a single-column index

For example:

CREATE TABLE test2(
    id INT NOT NULL,
    name CHAR(50) NULL,
    INDEX single_idx_name(name(20))
);

After this statement is executed, run SHOW INDEX to view the indexes of the table:

SHOW INDEX FROM test2\G

5. Create a composite index

For example, create table test3 with a composite index on its id, name, and age fields. The SQL statement is as follows:

CREATE TABLE test3(
    id INT(11) NOT NULL,
    name CHAR(30) NOT NULL,
    age INT(11) NOT NULL,
    info VARCHAR(255),
    INDEX multi_idx(id,name,age)
);

6. Create a full-text index

Example 1: Create table test4 with a full-text index on its info field. The SQL statement is as follows:

CREATE TABLE test4(
    id INT NOT NULL,
    name CHAR(30) NOT NULL,
    age INT NOT NULL,
    info VARCHAR(255),
    FULLTEXT INDEX futxt_idx_info(info)
) ENGINE=MyISAM;

The ENGINE clause can be omitted in MySQL 5.7 and later, because InnoDB, the default engine, supports full-text indexes in those releases.

Example 2: Create a table with full-text indexes for the title and body fields.

CREATE TABLE articles (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR (200),
    body TEXT,
    FULLTEXT index (title, body)
) ENGINE = INNODB ;

7. Create a spatial index

When creating a spatial index, the spatial field must be declared NOT NULL. For example, create table test5 with a spatial index on a column of the spatial type GEOMETRY. The SQL statement is as follows:

CREATE TABLE test5(
    geo GEOMETRY NOT NULL,
    SPATIAL INDEX spa_idx_geo(geo)
) ENGINE=MyISAM;

2.2. Create an index on an existing table

To create an index on an existing table, use the ALTER TABLE statement or the CREATE INDEX statement.

1. Use the ALTER TABLE statement to create an index. The basic syntax is:

ALTER TABLE table_name ADD [UNIQUE | FULLTEXT | SPATIAL] [INDEX | KEY]
[index_name] (col_name[length],...) [ASC | DESC]

2. Use the CREATE INDEX statement to create an index. CREATE INDEX adds an index to an existing table; in MySQL it is mapped to an ALTER TABLE statement. The basic syntax is:

CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name
ON table_name (col_name[length],...) [ASC | DESC]
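As a concrete illustration of both statements, here is a minimal sketch. The table book_demo and the index names are hypothetical.

```sql
-- Hypothetical table without any index.
CREATE TABLE book_demo (
    book_id   INT NOT NULL,
    book_name VARCHAR(100),
    authors   VARCHAR(100),
    info      VARCHAR(100)
);

-- ALTER TABLE: add a normal index and a unique index.
ALTER TABLE book_demo ADD INDEX idx_info (info);
ALTER TABLE book_demo ADD UNIQUE INDEX uk_idx_bid (book_id);

-- CREATE INDEX: add a composite index; book_name is indexed on its first 30 characters.
CREATE INDEX idx_bname_authors ON book_demo (book_name(30), authors);

-- List the indexes that now exist on the table.
SHOW INDEX FROM book_demo;
```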

3. Drop index

1. Use ALTER TABLE to drop an index. The basic syntax is:

ALTER TABLE table_name DROP INDEX index_name;

For comparison, the corresponding ALTER TABLE statements that add each kind of index:

/* 1. Adds a primary key: index values must be unique and cannot be NULL */
ALTER TABLE tabName ADD PRIMARY KEY(column_list);

/* 2. Adds a unique index: values must be unique (except NULL, which may occur several times) */
ALTER TABLE tabName ADD UNIQUE indexName(column_list);

/* 3. Adds a normal index: values may occur several times */
ALTER TABLE tabName ADD INDEX indexName(column_list);

/* 4. Adds a FULLTEXT index, used for full-text search */
ALTER TABLE tabName ADD FULLTEXT indexName(column_list);

Tip: The unique index on a column that has the AUTO_INCREMENT attribute cannot be dropped.

2. Use the DROP INDEX statement to drop an index. The basic syntax is:

DROP INDEX index_name ON table_name;

Tip: When a column is dropped from a table and that column is part of an index, the column is also removed from the index. If all columns that make up an index are dropped, the entire index is dropped.
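The effect of dropping indexed columns can be observed step by step. This is a minimal sketch with a hypothetical table drop_col_demo; the comments describe what SHOW INDEX is expected to report.

```sql
-- Hypothetical table with a three-column composite index.
CREATE TABLE drop_col_demo (
    id   INT NOT NULL,
    name CHAR(30) NOT NULL,
    age  INT NOT NULL,
    info VARCHAR(255),
    INDEX multi_idx (id, name, age)
);

-- multi_idx is listed with three columns: id, name, age.
SHOW INDEX FROM drop_col_demo;

-- Dropping one indexed column removes it from the index as well.
ALTER TABLE drop_col_demo DROP COLUMN name;
SHOW INDEX FROM drop_col_demo;   -- multi_idx now lists id and age

-- Dropping the remaining indexed columns drops the index itself.
ALTER TABLE drop_col_demo DROP COLUMN id, DROP COLUMN age;
SHOW INDEX FROM drop_col_demo;   -- multi_idx is no longer listed
```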

4. View the index

SHOW INDEX FROM table_name;    # output in table form
SHOW INDEX FROM table_name\G   # output in vertical form, one field per line
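The same information can also be queried from information_schema, which is convenient when only some of the columns are of interest. This is a sketch that assumes the book table from section 2.1 exists in the current database.

```sql
-- One row per indexed column; SEQ_IN_INDEX gives the column's position in the index.
SELECT INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, NON_UNIQUE, INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'book'
ORDER BY INDEX_NAME, SEQ_IN_INDEX;
```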


5. Index design principles

To use indexes efficiently, you must consider which fields to index and what type of index to create. Poor index design, or missing indexes, hurts database and application performance, and efficient indexes are essential for good performance. The following principles should guide index design.

5.1. When to create an index

1. The field has a uniqueness requirement

An index can itself act as a constraint: a unique index or a primary key index enforces uniqueness. So if a field in a table must be unique, create a unique index or a primary key index on it directly. The index then lets a single record be identified more quickly.

For example, the student number in the student table is unique, and a unique index on it quickly locates a student's record. Student names, by contrast, can repeat, so looking up by name is slower.

Fields that are unique in the business sense, including combinations of fields, must have a unique index. (Source: Alibaba)

Note: Do not assume that a unique index noticeably slows down inserts. The insert cost is negligible, while the lookup speedup is significant.

2. Create indexes for fields that are frequently used as query conditions

If a field is frequently used in the WHERE condition of SELECT statements, create an index for it. Especially when the table is large, a normal index can greatly improve query efficiency.

For example, suppose we query the student_info table (1 million rows) for the row with student_id = 123110.

  • The execution time differs hugely depending on whether the student_id field is indexed.
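The difference can be seen in the execution plan before and after the index is added. This is a minimal sketch; student_info_demo is a hypothetical stand-in for student_info, and only the student_id column matters here.

```sql
-- Hypothetical table shape, used only for illustration.
CREATE TABLE student_info_demo (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    student_id  INT NOT NULL,
    course_id   INT NOT NULL,
    name        VARCHAR(20),
    create_time DATETIME
);

-- Without an index on student_id the plan is a full table scan (type = ALL).
EXPLAIN SELECT * FROM student_info_demo WHERE student_id = 123110;

ALTER TABLE student_info_demo ADD INDEX idx_sid (student_id);

-- With the index, the plan is expected to show key = idx_sid.
EXPLAIN SELECT * FROM student_info_demo WHERE student_id = 123110;
```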

3. Index columns that are frequently used in GROUP BY and ORDER BY

An index stores and retrieves data in a defined order, so when we group with GROUP BY or sort with ORDER BY, the grouped or sorted fields should be indexed. If several columns are sorted on, create a composite index on those columns.
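A composite index for sorting and grouping might look like the following sketch. The table student_sort_demo and the index name are hypothetical, and the plan the optimizer picks depends on version and data.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE student_sort_demo (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    student_id  INT NOT NULL,
    create_time DATETIME NOT NULL
);

CREATE INDEX idx_sid_ctime ON student_sort_demo (student_id, create_time);

-- Sorting on the index columns, in index order, can read rows that are already sorted.
EXPLAIN SELECT student_id, create_time
FROM student_sort_demo
ORDER BY student_id, create_time;

-- Grouping on the leading index column can also walk the index.
EXPLAIN SELECT student_id, COUNT(*)
FROM student_sort_demo
GROUP BY student_id;
```

If the Extra column of a plan shows "Using filesort" or "Using temporary", the index is not being used for the sort or the grouping.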

4. Index the WHERE condition columns of UPDATE and DELETE

When rows are located by some condition and then updated or deleted, an index on the WHERE field can greatly improve efficiency, because the rows must first be retrieved by the WHERE condition before they are updated or deleted. The gain is even larger when the updated fields are not indexed, because updating a non-indexed field requires no index maintenance.

5. Create indexes for DISTINCT fields

Sometimes a field needs to be de-duplicated. When DISTINCT is used, an index on that field can also improve query efficiency.

For example, suppose we want the distinct student_id values in the student_info table. Without an index on student_id, execute:

SELECT DISTINCT(student_id) FROM `student_info`;

Result: 600637 records, running time 0.683s.

After creating an index on student_id, execute the same statement again:

SELECT DISTINCT(student_id) FROM `student_info`;

Result: 600637 records, running time 0.010s.

The query is much faster, and the student_id values come back in ascending order. This is because the index stores the data in sorted order, which makes de-duplication much faster.

6. Notes on creating indexes for multi-table JOINs

First, try not to join more than 3 tables. Each additional table is equivalent to one more nested loop, so the cost grows very quickly and seriously affects query efficiency.

Second, create indexes for the WHERE conditions, because WHERE filters the data. Without WHERE filtering, a join over large tables can be extremely expensive.

Finally, create an index on the join field (the field associated with the other table), and make sure that field has the same type in every table. For example, course_id should be INT(11) in both the student_info and course tables, not INT in one and VARCHAR in the other.

For example, with an index only on student_id, execute:

SELECT student_info.course_id, name, student_info.student_id, course_name
FROM student_info JOIN course
ON student_info.course_id = course.course_id
WHERE name = '462eed7ac6e791292a79';

Result: 1 row, running time 0.189s.

After creating an index on name, the same statement runs in 0.002s.

7. Use the smallest column type that fits for indexed columns

8. Use string prefixes to create indexes

For example, create a shop table. Because the address field is long, create a prefix index on it:

create table shop(address varchar(120) not null);
alter table shop add index(address(12));

The question is how many characters to keep. Keep too many and the index saves little storage space; keep too few and too many prefixes repeat, so the selectivity of the field drops. How do we compare the selectivity of different lengths? First look at the selectivity of the whole field over all the data:

select count(distinct address) / count(*) from shop;

Then compute the selectivity for different prefix lengths and compare it with the full-column value.

Formula:

count(distinct left(column_name, index_length)) / count(*)

For example:

select count(distinct left(address,10)) / count(*) as sub10, -- selectivity of the first 10 characters
count(distinct left(address,15)) / count(*) as sub15, -- selectivity of the first 15 characters
count(distinct left(address,20)) / count(*) as sub20, -- selectivity of the first 20 characters
count(distinct left(address,25)) / count(*) as sub25 -- selectivity of the first 25 characters
from shop;

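This leads to another problem: the effect of index column prefixes on sorting. Because a prefix index stores only the first characters of each value, MySQL cannot use it to satisfy ORDER BY on the column and has to sort the rows separately (filesort).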
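The sorting limitation can be observed with EXPLAIN. This is a minimal sketch; shop_demo is a hypothetical copy of the shop table, and the exact plan depends on the MySQL version and the data.

```sql
-- Hypothetical table with a prefix index on the first 12 characters of address.
CREATE TABLE shop_demo (
    id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    address VARCHAR(120) NOT NULL
);
ALTER TABLE shop_demo ADD INDEX idx_addr_prefix (address(12));

-- The prefix index cannot supply the full sort order,
-- so the Extra column typically shows "Using filesort".
EXPLAIN SELECT address FROM shop_demo ORDER BY address LIMIT 10;
```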

Further reading: Alibaba Java Development Manual

[Mandatory] When creating an index on a VARCHAR field, the index length must be specified. It is not necessary to index the whole field; choose the index length according to how well the text distinguishes rows.

Note: Index length and selectivity pull against each other. For string data, an index of length 20 generally has a selectivity above 90%. Use count(distinct left(column_name, index_length)) / count(*) to measure it.

9. Columns with high selectivity (many distinct values) are good candidates for indexing

10. Put the most frequently used column at the left of a composite index

This reduces the number of indexes that have to be built. It also raises the utilization of the composite index, because of the leftmost-prefix rule.

11. When several fields all need indexing, a composite index is better than separate single-column indexes

5.2. Limit the number of indexes

In practice we also need balance: more indexes are not always better. Limit the number of indexes on each table; no more than 6 per table is recommended. The reasons:

  • Each index takes disk space; the more indexes, the more disk space is needed.
  • Indexes slow down INSERT, DELETE, UPDATE, and similar statements, because every index must be adjusted and updated when the table data changes.
  • When choosing how to execute a query, the optimizer uses statistics to evaluate every index that could be used and to generate the best execution plan. If many indexes are usable for the same query, the optimizer takes longer to generate the plan, which lowers query performance.

5.3. When not to create an index

1. Do not index fields that are not used in WHERE

Columns that do not appear in WHERE conditions (or in GROUP BY and ORDER BY) usually do not need an index. For example:

SELECT course_id, student_id, create_time
FROM student_info
WHERE student_id = 41251;

Because the lookup is by student_id, there is no need to index the other fields, even though they appear in the SELECT list.

2. Do not use indexes for small tables

When a table has few rows, for example fewer than 1000, there is no need to create indexes.

3. Do not create indexes on columns with a large amount of duplicate data

For example, to fetch 500,000 rows out of a million (say, all rows where gender is male) through an index, MySQL would access the index 500,000 times and then the table 500,000 times, which can cost more than scanning the table without the index.

4. Avoid creating too many indexes on frequently updated tables

Likewise, frequently updated fields are poor candidates for indexes, because every update must also maintain the index.

5. Using unordered values as indexes is not recommended

For example, ID card numbers, UUIDs (which are compared as ASCII strings in the index and may cause page splits on insert), MD5 values, hashes, and long unordered strings.

6. Drop indexes that are no longer used or rarely used
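One way to look for candidates is the sys schema, sketched below. This assumes MySQL 5.7 or later with the sys schema installed and the Performance Schema enabled; the view reports only indexes that have not been used since the server last started, so treat the result as a hint, not as proof.

```sql
-- Indexes in the current database with no recorded use since server start.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();
```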

7. Do not define redundant or duplicate indexes

① Redundant indexes. Example table definition:

CREATE TABLE person_info(
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    birthday DATE NOT NULL,
    phone_number CHAR(11) NOT NULL,
    country VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    KEY idx_name_birthday_phone_number (name(10), birthday, phone_number),
    KEY idx_name (name(10))
);

The idx_name_birthday_phone_number index already supports fast searches on the name column, so the separate idx_name index on name is redundant: it only adds maintenance cost and brings no benefit to searches.

② Duplicate indexes. We might also create several indexes on the same column, for example:

CREATE TABLE repeat_index_demo (
    col1 INT PRIMARY KEY,
    col2 INT,
    UNIQUE uk_idx_c1 (col1),
    INDEX idx_c1 (col1)
);

col1 is the primary key, and it also has a unique index and a normal index defined on it. The primary key already creates the clustered index, so the unique index and the normal index are duplicates. Avoid this.
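Redundant and duplicate indexes can also be located with the sys schema instead of by reading table definitions. This is a sketch that assumes MySQL 5.7 or later with the sys schema installed; review the suggested statement before running it.

```sql
-- Each row names a redundant index, the index that makes it redundant,
-- and a ready-made statement for dropping it.
SELECT table_name,
       redundant_index_name, redundant_index_columns,
       dominant_index_name, dominant_index_columns,
       sql_drop_index
FROM sys.schema_redundant_indexes
WHERE table_schema = DATABASE();
```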