Follow github.com/hsfxuebao for more. If you find this helpful, please consider giving it a star.
1. Declaration and use of indexes
1.1 Classification of indexes
MySQL indexes include normal indexes, unique indexes, full-text indexes, single-column indexes, multi-column indexes, and spatial indexes.
- By functional logic, there are four kinds of indexes: normal index, unique index, primary key index, and full-text index.
- By physical implementation, indexes fall into two types: clustered and non-clustered.
- By the number of fields covered, indexes are either single-column or composite (union) indexes.
1.1.1 Common Indexes
A normal index exists purely to improve query efficiency and carries no restrictions of its own. It can be created on any data type; whether its values must be unique or non-null is determined by the integrity constraints of the field itself. Once the index is created, queries can use it. For example, create a normal index on the field name of table student, and queries that filter on name can use that index.
1.1.2 Unique Index
An index can be declared unique with the UNIQUE parameter. In a unique index every value must be unique, but null values are allowed. A table can have multiple unique indexes.
For example, if a unique index is created on the email field of table student, each value of email must be unique. A unique index lets a record be located more quickly.
1.1.3 Primary key Index
A primary key index is a special unique index that adds a non-null constraint, that is, NOT NULL + UNIQUE. A table can have at most one primary key index.
Why? This follows from the physical implementation of the primary key index: the data in a file can only be stored in one order.
1.1.4 Single-column Index
A single-column index is created on a single field in a table and indexes only that field. It can be a normal index, a unique index, or a full-text index, as long as it corresponds to exactly one field. A table can have multiple single-column indexes.
1.1.5 Multi-column (composite, union) index
A multi-column index is built over a combination of fields in a table. It points to the fields listed when it was created, and queries on those fields can use it. However, the index is used only when the first of those fields appears in the query condition. For example, if a multi-column index idx_id_name_gender is created on the fields id, name, and gender, it is used only when the field id appears in the query condition. Composite indexes follow the leftmost prefix rule.
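The leftmost prefix rule can be checked with EXPLAIN. The following is a minimal sketch, not taken from the original: the table person_demo and its columns are hypothetical, and the exact plan depends on the MySQL version and the data.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE person_demo (
    id     INT NOT NULL,
    name   VARCHAR(30) NOT NULL,
    gender CHAR(1) NOT NULL,
    note   VARCHAR(100),
    INDEX idx_id_name_gender (id, name, gender)
);

-- The leftmost column id appears in the condition,
-- so the optimizer can use idx_id_name_gender.
EXPLAIN SELECT * FROM person_demo WHERE id = 1 AND name = 'Tom';

-- id is missing from the condition, so the index cannot be used
-- for a lookup; the plan typically falls back to a full table scan.
EXPLAIN SELECT * FROM person_demo WHERE name = 'Tom' AND gender = 'M';
```

In the first plan the key column should show idx_id_name_gender; in the second it is usually NULL.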
1.1.6 Full-text index
Full-text indexing (also known as full-text retrieval) is a key technology used by search engines today. It uses word segmentation and other algorithms to analyze the frequency and importance of keywords in the text, and then filters out the search results we want according to certain algorithmic rules. Full-text indexes are well suited to large data sets and are less useful for small ones.
You can set the index to a full-text index using the FULLTEXT parameter. Full-text lookup of values is supported on the columns where the index is defined, allowing the insertion of duplicate and null values in these index columns. Full-text indexes can only be created on fields of the CHAR, VARCHAR, or TEXT type and their series types. Full-text indexes can speed up query for fields of the string type with a large amount of data. For example, the information field in table Student is of type TEXT, which contains a lot of textual information. When a full-text index is built on field information, the speed of querying field information can be increased.
There are two typical types of full-text indexes: natural language full-text indexes and Boolean full-text indexes.
- A natural language search calculates the relevance of each document to the query. Here, relevance is based on the number of matched keywords and the number of times each keyword appears in the document. The less often a word appears across the index, the more relevant a match on it is. Conversely, very common words are not searched: if a word appears in more than 50% of the records, a natural language search ignores it (this 50% threshold applies to MyISAM full-text indexes).
MySQL has supported full-text indexes since version 3.23.23. Before MySQL 5.6.4 only MyISAM supported them; InnoDB has supported them since 5.6.4. The stock parser does not segment Chinese words, so a third-party word segmentation plug-in was required. Since version 5.7.6, MySQL ships a built-in ngram full-text parser that supports word segmentation for Asian languages. When testing or using full-text indexes, first check that your MySQL version, storage engine, and data type support them.
With the advent of the big data era, relational databases have struggled to meet the demand for full-text search and are gradually being replaced for that purpose by Solr, ElasticSearch, and other specialized search engines.
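As an illustration of the built-in ngram parser mentioned above, here is a hedged sketch for MySQL 5.7.6 or later with InnoDB. The table name articles_cjk and the search term are made up for the example.

```sql
-- Hypothetical table; WITH PARSER ngram selects the built-in ngram parser.
CREATE TABLE articles_cjk (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT INDEX ft_title_body (title, body) WITH PARSER ngram
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4;

-- The text is tokenized into n-grams (controlled by ngram_token_size),
-- so a Chinese term can be matched without spaces between words.
SELECT id, title
FROM articles_cjk
WHERE MATCH(title, body) AGAINST('数据库' IN NATURAL LANGUAGE MODE);
```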
1.1.7 Supplementary: Spatial Index
You can declare an index as a spatial index with the SPATIAL parameter. A spatial index can only be built on a spatial data type, and it makes retrieving spatial data more efficient. The spatial data types in MySQL include GEOMETRY, POINT, LINESTRING, and POLYGON. Historically only the MyISAM storage engine supported spatial indexes (InnoDB added support in MySQL 5.7), and the indexed fields cannot be null. Beginners rarely need this kind of index.
Summary: Different storage engines support different types of indexes
- InnoDB: supports B-tree and full-text indexes, but does not support Hash indexes.
- MyISAM: supports B-tree and full-text indexes, but does not support Hash indexes.
- Memory: supports B-tree and Hash indexes, but does not support full-text indexes.
- NDB: supports Hash indexes, but does not support B-tree or full-text indexes.
- Archive: does not support B-tree, Hash, or full-text indexes.
1.2 Creating An Index
MySQL supports several ways to create indexes on one or more columns: specifying index columns in the CREATE TABLE definition, adding indexes to an existing table with the ALTER TABLE statement, or adding indexes to an existing table with the CREATE INDEX statement.
1.2.1 Creating an Index When Creating a Table
When creating a TABLE using CREATE TABLE, you can define the data type of the column as well as the primary key constraint, foreign key constraint, or unique constraint. Regardless of which constraint is created, it is equivalent to creating an index on the specified column at the same time.
CREATE TABLE dept(
dept_id INT PRIMARY KEY AUTO_INCREMENT,
dept_name VARCHAR(20)
);
CREATE TABLE emp(
emp_id INT PRIMARY KEY AUTO_INCREMENT,
emp_name VARCHAR(20) UNIQUE,
dept_id INT,
CONSTRAINT emp_dept_id_fk FOREIGN KEY(dept_id) REFERENCES dept(dept_id)
);
The basic syntax for creating an index when explicitly creating a table is as follows:
CREATE TABLE table_name [col_name data_type]
[UNIQUE | FULLTEXT | SPATIAL] [INDEX | KEY] [index_name] (col_name [length]) [ASC | DESC]
- UNIQUE, FULLTEXT, and SPATIAL are optional parameters that select a unique, full-text, or spatial index respectively.
- INDEX and KEY are synonyms; both create an index.
- index_name names the index and is optional; if it is omitted, MySQL uses col_name as the index name by default.
- col_name is the field to be indexed and must be one of the columns defined in the table.
- length is an optional parameter giving the index length; only string fields can specify an index length.
- ASC or DESC specifies whether index values are stored in ascending or descending order.
- Create a normal index:
For example, create a normal index on the year_publication field of table book. The SQL statement is as follows:
CREATE TABLE book(
book_id INT ,
book_name VARCHAR(100),
authors VARCHAR(100),
info VARCHAR(100) ,
comment VARCHAR(100),
year_publication YEAR,
INDEX(year_publication)
);
- Create unique index:
CREATE TABLE test1(
id INT NOT NULL,
name varchar(30) NOT NULL,
UNIQUE INDEX uk_idx_id(id)
);
After this statement is executed, run SHOW INDEX to view the indexes on the table:
SHOW INDEX FROM test1 \G
- Primary key index:
Once a column is declared as the primary key, the database creates the index automatically; in InnoDB it is the clustered index. The syntax is as follows:
- Create index with table:
CREATE TABLE student (
id INT(10) UNSIGNED AUTO_INCREMENT ,
student_no VARCHAR(200),
student_name VARCHAR(200),
PRIMARY KEY(id)
);
- Delete primary key index:
ALTER TABLE student
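DROP PRIMARY KEY;
To change a primary key index, drop the original index first and then create the new one. Note that MySQL rejects DROP PRIMARY KEY while the key column is still AUTO_INCREMENT, so for the student table above that attribute must be removed first.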
- Create a single-column index:
CREATE TABLE test2(
id INT NOT NULL,
name CHAR(50) NULL,
INDEX single_idx_name(name(20))
);
After this statement is executed, run SHOW INDEX to view the indexes on the table:
SHOW INDEX FROM test2 \G
- Create a composite index:
For example, create table test3 with a composite index on the id, name, and age fields. The SQL statement is as follows:
CREATE TABLE test3(
id INT(11) NOT NULL,
name CHAR(30) NOT NULL,
age INT(11) NOT NULL,
info VARCHAR(255),
INDEX multi_idx(id,name,age)
);
After this statement is executed, use SHOW INDEX to view:
SHOW INDEX FROM test3 \G
- Create full-text index:
Example 1: Create table test4 and create a full-text index on the info field of the table. SQL statement as follows:
CREATE TABLE test4(
id INT NOT NULL,
name CHAR(30) NOT NULL,
age INT NOT NULL,
info VARCHAR(255),
FULLTEXT INDEX futxt_idx_info(info)
) ENGINE=MyISAM;
The trailing ENGINE clause can be omitted in MySQL 5.7 and later, because InnoDB supports full-text indexes in those versions.
Example 2:
CREATE TABLE articles (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
title VARCHAR (200),
body TEXT,
FULLTEXT index (title, body)
) ENGINE = INNODB ;
This creates a table with a full-text index over the title and body fields.
Example 3:
CREATE TABLE `papers` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(200) DEFAULT NULL,
`content` text,
PRIMARY KEY (`id`),
FULLTEXT KEY `title` (`title`,`content`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
A full-text search is written differently from a LIKE query. A LIKE query looks like this:
SELECT * FROM papers WHERE content LIKE '%query string%';
A full-text index is queried with MATCH ... AGAINST instead:
SELECT * FROM papers WHERE MATCH(title,content) AGAINST ('query string');
Note:
- Find out version support before using full-text indexes;
- Full-text search is many times faster than LIKE with % wildcards, but its results may be less precise;
- If a large amount of data needs to be indexed, you are advised to add data before creating indexes.
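The two typical full-text search modes mentioned earlier can be tried against the papers table from Example 3. This is only a sketch: the search words are placeholders, and the results depend on the data, the stopword list, and the minimum token length.

```sql
-- Natural language mode (the default): rows are ranked by relevance.
SELECT id, title,
       MATCH(title, content) AGAINST('mysql index' IN NATURAL LANGUAGE MODE) AS score
FROM papers
WHERE MATCH(title, content) AGAINST('mysql index' IN NATURAL LANGUAGE MODE);

-- Boolean mode: + requires a word, - excludes a word.
SELECT id, title
FROM papers
WHERE MATCH(title, content) AGAINST('+mysql -oracle' IN BOOLEAN MODE);
```

The column list in MATCH() must match the column list of the full-text index, here (title, content).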
- Create a spatial index:
When creating a spatial index, the spatial field must be declared NOT NULL. For example, create table test5 with a spatial index on a field of the spatial type GEOMETRY. The SQL statement is as follows:
CREATE TABLE test5(
geo GEOMETRY NOT NULL,
SPATIAL INDEX spa_idx_geo(geo)
) ENGINE=MyISAM;
1.2.2 Creating an Index on an Existing table
To CREATE an INDEX on an existing TABLE, use the ALTER TABLE statement or CREATE INDEX statement.
- Create an index with the ALTER TABLE statement. The basic syntax is:
ALTER TABLE table_name ADD [UNIQUE | FULLTEXT | SPATIAL] [INDEX | KEY] [index_name] (col_name[length],...) [ASC | DESC]
- The CREATE INDEX statement can add an INDEX to an existing table.
CREATE INDEX is mapped to an ALTER TABLE statement with the basic syntax:
CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name
ON table_name (col_name[length],...) [ASC | DESC]
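As a concrete illustration of both forms, the statements below add indexes to the book table created earlier. The index names are made up for the example, and the sketch assumes book still has its original columns.

```sql
-- ALTER TABLE form: a normal index on the first 50 characters of comment.
ALTER TABLE book ADD INDEX idx_cmt (`comment`(50));

-- ALTER TABLE form: a unique index on book_id.
ALTER TABLE book ADD UNIQUE INDEX uk_idx_bid (book_id);

-- CREATE INDEX form: a composite index with prefix lengths.
CREATE INDEX idx_aut_info ON book (authors(20), info(50));

-- Verify the result.
SHOW INDEX FROM book;
```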
1.3 Deleting an Index
- Drop an index with the ALTER TABLE statement. The basic syntax is:
ALTER TABLE table_name DROP INDEX index_name;
- Drop an index with the DROP INDEX statement. The basic syntax is:
DROP INDEX index_name ON table_name;
Tip: When you drop a column in a table, if the column to be deleted is part of the index, that column will also be deleted from the index. If all columns that make up the index are dropped, the entire index is dropped.
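The behavior described in the tip can be observed with a small experiment. The table drop_col_demo below is hypothetical and exists only for this sketch.

```sql
-- Hypothetical table with a composite index over a, b, and c.
CREATE TABLE drop_col_demo (
    a INT,
    b INT,
    c INT,
    d INT,
    INDEX idx_a_b_c (a, b, c)
);

-- Dropping b removes it from the index; idx_a_b_c now covers (a, c).
ALTER TABLE drop_col_demo DROP COLUMN b;
SHOW INDEX FROM drop_col_demo;

-- Dropping the remaining indexed columns removes the index entirely.
ALTER TABLE drop_col_demo DROP COLUMN a, DROP COLUMN c;
SHOW INDEX FROM drop_col_demo;
```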
2. MySQL8.0 index new features
2.1 Supports descending indexes
Descending indexes store key values in descending order. Although the DESC syntax for indexes has been accepted since MySQL 4, the DESC definition was actually ignored; only MySQL 8.x truly supports descending indexes (and only in the InnoDB storage engine).
Before version 8.0, MySQL still created ascending indexes and scanned them in reverse when needed, which greatly reduced efficiency. In some scenarios descending indexes matter a great deal. For example, if a query sorts on multiple columns with different sort directions, a descending index lets the database avoid an extra filesort operation and so improves performance.
Create table ts1 in both MySQL 5.7 and MySQL 8.0:
CREATE TABLE ts1(a int,b int,index idx_a_b(a,b desc));
View the structure of table ts1 in MySQL 5.7.
As you can see from the results, the index is still in the default ascending order.
View the structure of table ts1 in MySQL 8.0.
As you can see from the results, the index is already in descending order. Let's continue by testing how the descending index behaves in the execution plan. Insert roughly 800 random rows into ts1 in both MySQL 5.7 and MySQL 8.0 (the loop below inserts 799):
DELIMITER //
CREATE PROCEDURE ts_insert()
BEGIN
    DECLARE i INT DEFAULT 1;
    WHILE i < 800 DO
        INSERT INTO ts1 SELECT RAND()*80000, RAND()*80000;
        SET i = i + 1;
    END WHILE;
    COMMIT;
END //
DELIMITER ;
# Call the procedure
CALL ts_insert();
View the execution plan for table ts1 in MySQL 5.7:
EXPLAIN SELECT * FROM ts1 ORDER BY a,b DESC LIMIT 5;
As you can see from the results, the number of rows scanned in the execution plan is 799, and Using filesort appears.
Tip: Using filesort is a slow external sort in MySQL and is best avoided. In most cases, administrators can optimize indexes to avoid Using filesort and thus speed up database execution.
View the execution plan for table TS1 in MySQL 8.0. As you can see from the results, the number of scans in the execution plan is 5 and Using filesort is not used.
Note: A descending index helps only for the specific sort order it matches, and can even be less efficient if used incorrectly. For example, for ORDER BY a DESC, b DESC on this table, MySQL 5.7 performs better than MySQL 8.0.
Change the sorting criteria to ORDER BY a DESC, b DESC, then view the execution plan for table ts1 in MySQL 5.7:
EXPLAIN SELECT * FROM ts1 ORDER BY a DESC,b DESC LIMIT 5;
View the execution plan for table ts1 in MySQL 8.0.
As the results show, with the modified sort order the MySQL 5.7 execution plan is significantly better than that of MySQL 8.0.
2.2 Hiding Indexes
In MySQL 5.7 and earlier, an index can only be removed by dropping it explicitly. If a problem appears after the index is dropped, the only remedy is to create it again explicitly. If the table holds a very large amount of data, this operation consumes a lot of system resources and is very expensive.
Since MySQL 8.x, invisible (hidden) indexes are supported. You only need to mark the index you intend to drop as hidden, and the query optimizer stops using it (even if you use FORCE INDEX, the optimizer does not use the index). Once you have confirmed that the system is unaffected after hiding the index, you can drop it completely. This approach of hiding an index first and dropping it afterwards is called a soft delete.
Similarly, if you want to verify how dropping an index would affect query performance, you can temporarily hide the index first.
Note: a primary key cannot be set as a hidden index. When a table has no explicitly defined primary key, its first unique non-null index becomes the implicit primary key, and that index cannot be hidden either.
Indexes are visible by default. In statements such as CREATE TABLE, CREATE INDEX, or ALTER TABLE, you can use the VISIBLE or INVISIBLE keyword to set an index's visibility.
1. Create a hidden index when creating the table. Use the INVISIBLE keyword in the CREATE TABLE statement; the syntax is as follows:
CREATE TABLE tablename(
    propname1 type1 [CONSTRAINT1],
    propname2 type2 [CONSTRAINT2],
    ...
    propnamen typen,
    INDEX [indexname] (propname1 [(length)]) INVISIBLE
);
Compared with an ordinary index definition, the only addition is the INVISIBLE keyword, which marks the index as invisible.
2. Create a hidden index on an existing table:
A hidden index can be set for an existing table in the following syntax:
CREATE INDEX indexname
ON tablename(propname[(length)]) INVISIBLE;
3. Create a hidden index with the ALTER TABLE statement:
ALTER TABLE tablename
ADD INDEX indexname (propname [(length)]) INVISIBLE;
4. Change the visibility of an index. The following statements switch an existing index between hidden and visible:
ALTER TABLE tablename ALTER INDEX index_name INVISIBLE; # switch to a hidden index
ALTER TABLE tablename ALTER INDEX index_name VISIBLE; # switch to a visible index
If you switch the index_cname index to visible and view the execution plan with EXPLAIN, you will find that the optimizer selects the index_cname index.
Note: When an index is hidden, its content is still updated in real time as a normal index. If an index needs to be hidden for a long time, it can be removed because the presence of an index affects insert, update, and delete performance.
Toggling an index's visibility is therefore a convenient way to see how much the index actually helps with tuning.
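Put together, a soft delete might look like the following sketch for MySQL 8.0. The table classes_demo and the index idx_cname are hypothetical names chosen for the example.

```sql
-- Hypothetical table with a normal index on cname.
CREATE TABLE classes_demo (
    classno INT,
    cname   VARCHAR(20),
    loc     VARCHAR(40),
    INDEX idx_cname (cname)
);

-- Step 1: hide the index instead of dropping it.
ALTER TABLE classes_demo ALTER INDEX idx_cname INVISIBLE;

-- Step 2: confirm the visibility flag (the IS_VISIBLE column exists in MySQL 8.0).
SELECT INDEX_NAME, IS_VISIBLE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'classes_demo';

-- Step 3: check that the optimizer no longer picks the index.
EXPLAIN SELECT * FROM classes_demo WHERE cname = 'demo';

-- Step 4a: if nothing regresses, drop the index for good ...
ALTER TABLE classes_demo DROP INDEX idx_cname;
-- Step 4b: ... or, if queries slow down, run this instead of 4a:
-- ALTER TABLE classes_demo ALTER INDEX idx_cname VISIBLE;
```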
5. Make hidden indexes visible to the query optimizer:
MySQL 8.x provides a new way to test indexes: a switch on the query optimizer (use_invisible_indexes) can make hidden indexes visible to it. If use_invisible_indexes is set to off (the default), the optimizer ignores hidden indexes. If it is set to on, the optimizer still considers hidden indexes when generating the execution plan, even though they are marked invisible.
(1) Run the following command on the MySQL command line to check the switch Settings of the query optimizer:
select @@optimizer_switch \G
The following attribute configurations are found in the output information.
use_invisible_indexes=off
This property is off, which means hidden indexes are not visible to the query optimizer by default.
(2) To make hidden indexes visible to the query optimizer, run the following command on the MySQL command line:
mysql> set session optimizer_switch="use_invisible_indexes=on";
Query OK, 0 rows affected (0.00 sec)
The statement executes successfully. Check the query optimizer switch settings again.
mysql> select @@optimizer_switch \G
*************************** 1. row ***************************
@@optimizer_switch:
index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_
intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_co
st_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on
,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on
,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on,use_invisible_ind
exes=on,skip_scan=on,hash_join=on
1 row in set (0.00 sec)
This time, the following property configuration appears in the output.
use_invisible_indexes=on
The value of use_invisible_indexes is now on, which means hidden indexes are visible to the query optimizer.
(3) Use EXPLAIN to view the index usage when field invisible_column is used as the query condition.
EXPLAIN SELECT * FROM classes WHERE cname = '1 ';
The query optimizer uses hidden indexes to query data.
(4) If you want to make the hidden index invisible to the query optimizer, you only need to run the following command.
mysql> set session optimizer_switch="use_invisible_indexes=off";
Query OK, 0 rows affected (0.00 sec)
Look again at the query optimizer switch Settings.
mysql> select @@optimizer_switch \G
At this point, the value of use_invisible_indexes is off again.
Keep in mind that hiding an index serves no purpose if hidden indexes are visible to the optimizer.
3. Index design principles
In order to use indexes efficiently, you must consider which fields and types of indexes to create when creating an index. Poor index design or lack of indexes can hinder database and application performance.
3.1 Data Preparation
- Create a database, create a table
CREATE DATABASE atguigudb1;
USE atguigudb1;
# 1. Create the student table and the course table
CREATE TABLE `student_info` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `student_id` INT NOT NULL,
    `name` VARCHAR(20) DEFAULT NULL,
    `course_id` INT NOT NULL,
    `class_id` INT(11) DEFAULT NULL,
    `create_time` DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `course` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `course_id` INT NOT NULL,
    `course_name` VARCHAR(40) DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
- Create the stored functions needed to generate the simulated data:
# Function 1: create a random string function
DELIMITER //
CREATE FUNCTION rand_string(n INT)
    RETURNS VARCHAR(255) # this function returns a string
BEGIN
    DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ';
    DECLARE return_str VARCHAR(255) DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        SET return_str = CONCAT(return_str, SUBSTRING(chars_str, FLOOR(1+RAND()*52), 1));
        SET i = i + 1;
    END WHILE;
    RETURN return_str;
END //
DELIMITER ;
# Function 2: create a random number function
DELIMITER //
CREATE FUNCTION rand_num (from_num INT ,to_num INT) RETURNS INT(11)
BEGIN
DECLARE i INT DEFAULT 0;
SET i = FLOOR(from_num +RAND()*(to_num - from_num+1)) ;
RETURN i;
END //
DELIMITER ;
If creating the function reports the following error:
This function has none of DETERMINISTIC......
Because the binary log (bin-log) is enabled, we must declare a characteristic for our function.
In master/slave replication, the master records write operations in the bin-log. The slave reads the bin-log and executes the statements to synchronize the data.
If functions are used to manipulate data, the results on the slave can become inconsistent with those on the master. So, by default, MySQL does not enable the setting that allows function creation.
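Besides changing the server setting, the error can also be avoided by declaring one of the characteristics that the message asks for (DETERMINISTIC, NO SQL, or READS SQL DATA). The sketch below is an alternative that the original does not use; rand_num_demo is a hypothetical name, and NO SQL is chosen only because the body contains no SQL statements.

```sql
DELIMITER //
-- Hypothetical variant of rand_num that declares a characteristic explicitly.
CREATE FUNCTION rand_num_demo(from_num INT, to_num INT) RETURNS INT
NO SQL
BEGIN
    -- Same arithmetic as rand_num: a random integer in [from_num, to_num].
    RETURN FLOOR(from_num + RAND() * (to_num - from_num + 1));
END //
DELIMITER ;

SELECT rand_num_demo(1, 10);
```

Because RAND() is non-deterministic, this declaration only satisfies the check; whether it is safe for replication depends on the binary log format in use.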
- Check whether MySQL allows function creation:
show variables like 'log_bin_trust_function_creators';
- Enable the setting that allows function creation:
set global log_bin_trust_function_creators=1; # this is a global variable, so the GLOBAL keyword is required
- If mysqld is restarted, the setting above is lost again. To make it permanent:
- On Windows, add the following under [mysqld] in my.ini:
log_bin_trust_function_creators=1
- On Linux, add the following under [mysqld] in /etc/my.cnf:
log_bin_trust_function_creators=1
- Create a stored procedure to insert simulated data:
# Stored procedure 1: insert rows into the course table
DELIMITER //
CREATE PROCEDURE insert_course(max_num INT)
BEGIN
    DECLARE i INT DEFAULT 0;
    SET autocommit = 0; # disable autocommit
    REPEAT # loop
        SET i = i + 1; # increment the counter
        INSERT INTO course (course_id, course_name)
        VALUES (rand_num(10000,10100), rand_string(6));
    UNTIL i = max_num
    END REPEAT;
    COMMIT; # commit the transaction
END //
DELIMITER ;
# Stored procedure 2: insert rows into the student_info table
DELIMITER //
CREATE PROCEDURE insert_stu(max_num INT)
BEGIN
    DECLARE i INT DEFAULT 0;
    SET autocommit = 0; # disable autocommit
    REPEAT # loop
        SET i = i + 1; # increment the counter
        INSERT INTO student_info (course_id, class_id, student_id, name)
        VALUES (rand_num(10000,10100), rand_num(10000,10200), rand_num(1,200000), rand_string(6));
    UNTIL i = max_num
    END REPEAT;
    COMMIT; # commit the transaction
END //
DELIMITER ;
- Calling a stored procedure:
CALL insert_course(100);
CALL insert_stu(1000000);
3.2 Which conditions are appropriate for creating an index
1. Fields with a uniqueness constraint
An index can itself act as a constraint: unique indexes and primary key indexes enforce uniqueness. So if a field in a table is unique, we can directly create a unique index or a primary key index on it. This lets a record be located more quickly through the index.
For example, the student number in the student table is a unique field. Creating a unique index on it makes it quick to look up a particular student. If names were used instead, duplicates could exist, which would slow the query down.
Fields that are unique in business terms, even composite ones, must have unique indexes. Note: do not assume that a unique index hurts insert speed. The loss is negligible, while the gain in lookup speed is significant.
2. Fields frequently used as WHERE query conditions
If a field is frequently used in the WHERE condition of SELECT statements, create an index for it. Especially with a large amount of data, a normal index can greatly improve query efficiency. For example, suppose we want to query the student_info table (with 1 million rows) for the student whose student_id=123110.
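A sketch of that comparison, assuming the student_info table from section 3.1 has been populated; the index name idx_sid is made up for the example.

```sql
-- Without an index on student_id, the plan is a full table scan (type = ALL).
EXPLAIN SELECT course_id, class_id, name, create_time, student_id
FROM student_info
WHERE student_id = 123110;

-- Add a normal index on the filtered column.
ALTER TABLE student_info ADD INDEX idx_sid (student_id);

-- The same query can now use idx_sid (type = ref) and reads far fewer rows.
EXPLAIN SELECT course_id, class_id, name, create_time, student_id
FROM student_info
WHERE student_id = 123110;
```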
3. GROUP BY and ORDER BY columns
Indexing is to store or retrieve data in a certain ORDER, so when we use GROUP BY to query data, or use ORDER BY to sort data, we need to index the grouped or sorted fields. If you have more than one column to sort, you can build composite indexes on those columns.
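For instance, with the student_info table from section 3.1, a composite index can serve both grouping and sorting. This is a sketch only; the index name is hypothetical and the actual plans depend on the data and the MySQL version.

```sql
-- Hypothetical composite index: the grouping/filter column first, the sort column second.
ALTER TABLE student_info ADD INDEX idx_sid_ctime (student_id, create_time);

-- GROUP BY on the leading column can read the index in order
-- instead of building a temporary table.
EXPLAIN SELECT student_id, COUNT(*) AS num
FROM student_info
GROUP BY student_id
LIMIT 100;

-- For a fixed student_id, rows are already ordered by create_time in the index,
-- so ORDER BY does not need a filesort.
EXPLAIN SELECT student_id, create_time
FROM student_info
WHERE student_id = 123110
ORDER BY create_time DESC;
```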
4. WHERE condition columns of UPDATE and DELETE
When data is queried according to certain criteria and then updated or deleted, creating an index for the WHERE field can greatly improve efficiency. This works because we need to retrieve the record based on the WHERE condition column before updating or deleting it.
The efficiency gains are even greater when non-indexed fields are updated, because non-indexed field updates do not require index maintenance.
5. Create indexes for DISTINCT fields
Sometimes a field needs to be de-duplicated with DISTINCT, and an index on that field also improves query efficiency. For example, to query the distinct student_id values in the student_info table without an index on student_id, run:
SELECT DISTINCT(student_id) FROM `student_info`;
Result (600637 records, running time 0.683s).
After creating an index on student_id, run the same statement again:
SELECT DISTINCT(student_id) FROM `student_info`;
Result (600637 records, running time 0.010s).
The query is clearly faster, and the student_id values are returned in ascending order. This is because an index keeps the data sorted in a defined order, so de-duplication is much faster.
6. Precautions for creating indexes when multiple table joins are performed
First of all, try not to have more than 3 join tables, because each additional table is equivalent to adding a nested loop, which increases by an order of magnitude very quickly and seriously affects the efficiency of the query.
Second, create indexes for the WHERE conditions, because WHERE filters the data. Without WHERE filtering, a very large data volume makes the query extremely expensive.
Finally, create an index on the field used for the join, and make sure the field has the same type in every joined table. For example, course_id is of type int(11) in both the student_info and course tables; it must not be int in one and varchar in the other.
For example, if we create an index only for student_id, execute the SQL statement:
SELECT course_id, name, student_info.student_id, course_name
FROM student_info JOIN course
ON student_info.course_id = course.course_id
WHERE name = '462eed7ac6e791292a79';
Result (1 row, running time 0.189s).
After creating an index on name, the same SQL statement runs in 0.002s.
7. Create indexes on columns with a small type
By type size we mean the size of the range of data that the type represents.
When we define the table structure, we must explicitly specify each column's type. Take the integer types: TINYINT, MEDIUMINT, INT, and BIGINT occupy progressively more storage space and represent progressively larger ranges. When indexing an integer column, use the smallest type that the range of values allows: if INT is enough, do not use BIGINT; if MEDIUMINT is enough, do not use INT. This is because:
- The smaller the data type, the faster comparison operations are during queries.
- The smaller the data type, the less storage the index occupies, so more records fit in one data page. This reduces the performance cost of disk I/O and means more data pages can be cached in memory, which speeds up reads and writes.
This recommendation applies especially to a table's primary key, because the primary key is stored not only in the clustered index but also in the nodes of every secondary index. A smaller primary key type therefore saves more storage space and makes I/O more efficient.
8. Use string prefixes to create indexes
Create a merchant table. Because the address field is long, create a prefix index on it:
create table shop(address varchar(120) not null);
alter table shop add index(address(12));
The question is how much to truncate. Keep too many characters and the goal of saving index storage space is not met; keep too few and there are too many duplicates, so the selectivity of the field drops. How do we calculate the selectivity for different lengths?
First look at the selectivity of the field over the full data:
select count(distinct address) / count(*) from shop;
Then calculate the selectivity for different prefix lengths and compare it with the selectivity of the full field.
Formula:
count(distinct left(column_name, index_length)) / count(*)
For example:
select count(distinct left(address,10)) / count(*) as sub10, -- first 10 characters
       count(distinct left(address,15)) / count(*) as sub11, -- first 15 characters
       count(distinct left(address,20)) / count(*) as sub12, -- first 20 characters
       count(distinct left(address,25)) / count(*) as sub13  -- first 25 characters
from shop;
This leads to another problem: the effect of index column prefixes on sorting
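The sorting issue can be seen in a small experiment. The sketch below uses a hypothetical copy of the merchant table, shop_demo, with only a prefix index; because the index stores just the first 12 characters of each address, it cannot fully order the values, so MySQL generally has to sort the rows itself.

```sql
-- Hypothetical table with only a prefix index on address.
CREATE TABLE shop_demo (
    address VARCHAR(120) NOT NULL,
    INDEX idx_addr_prefix (address(12))
);

-- The prefix index cannot satisfy ORDER BY on the full column,
-- so the Extra column of the plan typically shows "Using filesort".
EXPLAIN SELECT address FROM shop_demo ORDER BY address LIMIT 12;
```

If ordering by the column matters more than index size, a full-column index avoids this extra sort.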
Further reading: Alibaba Java Development Manual
[Mandatory] When creating an index on a VARCHAR field, the index length must be specified. It is not necessary to index the whole field; decide the index length from the actual distinctiveness of the text.
Note: Index length and distinctiveness pull against each other. For string data, an index of length 20 generally reaches a distinctiveness above 90%. You can use count(distinct left(column_name, index_length)) / count(*) to measure it.
9. Highly selective (well-distributed) columns are suitable for indexing
Columns whose values are mostly the same, such as gender (male/female), are not suitable for indexing.
10. Place the most frequently used column on the left of a composite index
This also means fewer indexes need to be created. At the same time, because of the leftmost prefix rule, it increases how often the composite index can be used.
11. When several fields all need indexing, a composite index is better than separate single-column indexes
3.3 Limit the number of indexes
In practice we also need balance: more indexes is not always better. The number of indexes on each table should be limited, and it is recommended that a single table have no more than 6. The reasons:
- Each index occupies disk space; the more indexes there are, the more disk space is needed.
- Indexes affect the performance of INSERT, DELETE, and UPDATE statements, because the indexes must be adjusted and updated whenever the data in the table changes.
- When choosing how to optimize a query, the optimizer evaluates every usable index against statistics to produce the best execution plan. If many indexes are candidates for the same query, the optimizer takes longer to generate the plan, which reduces query performance.
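To find tables that exceed this guideline, the index metadata in information_schema can be counted. This is a sketch that assumes the current database is the one to inspect; the threshold 6 simply mirrors the recommendation above.

```sql
-- Count the indexes (including the primary key) on each table
-- in the current database and list tables with more than 6.
SELECT TABLE_NAME,
       COUNT(DISTINCT INDEX_NAME) AS index_count
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
GROUP BY TABLE_NAME
HAVING COUNT(DISTINCT INDEX_NAME) > 6
ORDER BY index_count DESC;
```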
3.4 What Conditions are Not Suitable for Creating indexes
1. Do not set indexes for fields that are not used in WHERE
Columns that are not used in WHERE conditions (or in GROUP BY and ORDER BY) do not need an index.
2. Do not use indexes for small tables
If a table has very few records, say fewer than 1000, there is no need to build an index. With so few rows, an index has little effect on query efficiency.
Example: Create table 1:
CREATE TABLE t_without_index(
a INT PRIMARY KEY AUTO_INCREMENT,
b INT
);
Create stored procedure 1:
DELIMITER //
CREATE PROCEDURE t_wout_insert()
BEGIN
    DECLARE i INT DEFAULT 1;
    WHILE i <= 900 DO
        INSERT INTO t_without_index(b) SELECT RAND() * 10000;
        SET i = i + 1;
    END WHILE;
    COMMIT;
END //
DELIMITER ;
-- Call the procedure
CALL t_wout_insert();
Create table 2:
CREATE TABLE t_with_index(
a INT PRIMARY KEY AUTO_INCREMENT,
b INT,
INDEX idx_d(b)
);
Create stored procedure 2:
DELIMITER //
CREATE PROCEDURE t_with_insert()
BEGIN
    DECLARE i INT DEFAULT 1;
    WHILE i <= 900 DO
        INSERT INTO t_with_index(b) SELECT RAND() * 10000;
        SET i = i + 1;
    END WHILE;
    COMMIT;
END //
DELIMITER ;
-- Call the procedure
CALL t_with_insert();
Query comparison:
mysql> select * from t_without_index where b = 9879;
+------+------+
| a    | b    |
+------+------+
| 1242 | 9879 |
+------+------+
1 row in set (0.00 sec)

mysql> select * from t_with_index where b = 9879;
+-----+------+
| a   | b    |
+-----+------+
| 112 | 9879 |
+-----+------+
1 row in set (0.00 sec)
Both queries report the same elapsed time: with this little data, the index brings no measurable benefit.
Conclusion: When a table has few rows, for example fewer than 1000, there is no need to create an index.
3. Do not create indexes on columns with a large number of duplicate data
Build indexes on columns that are often used in conditional expressions and that have many distinct values. If a field contains a large amount of duplicate data, do not index it. Such an index does not improve query efficiency and can seriously slow down data updates.
Example 1: To find 500,000 rows out of a million rows (such as male data), once the index is created, you need to access the index 500,000 times and then access the table 500,000 times, which can add up to more overhead than not using the index at all.
Example 2: Suppose you have a student table with a total of 1 million students and only 10 men, or 1 in 100,000 of the population. The student table student_gender has the following structure. The value of student_gender in the data table is 0 or 1. 0 indicates female, and 1 indicates male.
CREATE TABLE student_gender(
student_id INT(11) NOT NULL,
student_name VARCHAR(50) NOT NULL,
student_gender TINYINT(1) NOT NULL,
PRIMARY KEY(student_id)
)ENGINE = INNODB;
If we want to screen out the men in this student list, we can use:
SELECT * FROM student_gender WHERE student_gender = 1
Result: 10 rows returned, running time 0.696 s.
Conclusion: When the data in a field is highly repetitive, for example when the duplication rate is higher than 10%, there is no need to index the field.
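The access counts in the two examples above can be compared with a small Python sketch. It is a deliberately rough model under stated assumptions: it only counts accesses the way Example 1 does (one index read plus one table read per matching row, against one read per row for a full scan) and ignores the different cost of random and sequential I/O. The function name row_accesses is hypothetical.

```python
def row_accesses(total_rows, matching_rows):
    """Return (via_secondary_index, via_full_scan) access counts for one lookup."""
    via_index = 2 * matching_rows  # one index entry read plus one table read per match
    via_scan = total_rows          # every row is read once
    return via_index, via_scan


print(row_accesses(1_000_000, 500_000))  # Example 1: half of the rows match
print(row_accesses(1_000_000, 10))       # Example 2: only 10 rows match
```

In this model the index saves nothing when half of the rows match, and since index-driven table reads are typically random accesses, the real cost can be higher than a scan. When only a handful of rows match, the index path touches far fewer rows.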
4. Avoid creating too many indexes on frequently updated tables
First meaning: Frequently updated fields do not necessarily need an index. When data is updated, the indexes must be updated as well; if there are too many indexes, updating them becomes a burden and reduces efficiency.
Second meaning: Avoid creating too many indexes on frequently updated tables, and keep the columns in the index as few as possible. In this case, although the query speed is increased, it will reduce the speed of updating the table.
5. It is not recommended to use unordered values as indexes
For example, ID card numbers, UUIDs (which must be converted to ASCII for comparison in the index and may cause page splits on insert), MD5 values, hashes, and unordered long strings.
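The page-split effect mentioned above can be illustrated with a toy simulation in Python. This is a sketch under loud assumptions, not InnoDB's actual algorithm: pages are plain sorted lists, page_capacity is a hypothetical tiny page size, and an insert past the end of the last page simply starts a new page instead of splitting.

```python
import bisect
import random


def count_page_splits(keys, page_capacity=4):
    """Insert keys into a toy leaf level of sorted pages and count mid-page splits."""
    pages = [[]]  # each page is a sorted list of keys
    splits = 0
    for key in keys:
        # Locate the page whose key range should hold this key.
        firsts = [p[0] for p in pages if p]
        idx = max(bisect.bisect_right(firsts, key) - 1, 0)
        page = pages[idx]
        if len(page) < page_capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end: start a fresh page, nothing is moved.
            pages.append([key])
            continue
        # The page is full and the key lands inside it: split the page in half.
        mid = page_capacity // 2
        left, right = page[:mid], page[mid:]
        bisect.insort(left if key < right[0] else right, key)
        pages[idx:idx + 1] = [left, right]
        splits += 1
    return splits


ordered_keys = list(range(1000))                               # e.g. AUTO_INCREMENT ids
unordered_keys = random.Random(42).sample(range(10**6), 1000)  # e.g. hash-like values
print(count_page_splits(ordered_keys), count_page_splits(unordered_keys))
```

In this model, monotonically increasing keys never trigger a split, while unordered keys keep landing in full pages and force existing rows to be moved.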
6. Delete indexes that are no longer used or rarely used
Some of the original indexes may no longer be needed when the data in the table is updated substantially or the data is used in a different way. Database administrators should periodically find these indexes and remove them to reduce the impact of indexes on update operations.
7. Do not define redundant or duplicate indexes
Sometimes, intentionally or not, we create several indexes on the same column. For example, index(a, b, c) already acts as index(a), index(a, b), and index(a, b, c). Consider the following table:
CREATE TABLE person_info(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
birthday DATE NOT NULL,
phone_number CHAR(11) NOT NULL,
country varchar(100) NOT NULL,
PRIMARY KEY (id),
KEY idx_name_birthday_phone_number (name(10), birthday, phone_number),
KEY idx_name (name(10))
);
The idx_name_birthday_phone_number index already allows fast searches on the name column. Creating the separate idx_name index on name is redundant: it only adds maintenance cost and brings no benefit to searches.
On the other hand, we might create a duplicate index for a column, for example:
CREATE TABLE repeat_index_demo (
col1 INT PRIMARY KEY,
col2 INT,
UNIQUE uk_idx_c1 (col1),
INDEX idx_c1 (col1)
);
Here col1 is the primary key, yet it is also given a unique index and a normal index. The primary key itself already creates a clustered index, so the unique index and the normal index are duplicates. This situation should be avoided.
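The two cases above (a redundant prefix index and duplicate indexes) can be detected mechanically. The following Python sketch applies the leftmost-prefix rule to index definitions given as plain column lists. It is a simplification under stated assumptions: prefix lengths such as name(10) and uniqueness constraints are ignored, and the function name find_redundant_indexes is hypothetical.

```python
def find_redundant_indexes(indexes):
    """Return (redundant_index, covering_index) pairs based on the leftmost-prefix rule.

    indexes maps an index name to its ordered column list, in declaration order.
    """
    names = list(indexes)
    redundant = []
    for i, name in enumerate(names):
        cols = tuple(indexes[name])
        for j, other in enumerate(names):
            if i == j:
                continue
            other_cols = tuple(indexes[other])
            is_prefix = other_cols[:len(cols)] == cols
            # A strict prefix is redundant; for identical column lists keep the first one.
            if is_prefix and (len(cols) < len(other_cols) or j < i):
                redundant.append((name, other))
                break
    return redundant


person_info = {
    "PRIMARY": ["id"],
    "idx_name_birthday_phone_number": ["name", "birthday", "phone_number"],
    "idx_name": ["name"],
}
repeat_index_demo = {
    "PRIMARY": ["col1"],
    "uk_idx_c1": ["col1"],
    "idx_c1": ["col1"],
}
print(find_redundant_indexes(person_info))
print(find_redundant_indexes(repeat_index_demo))
```

A real check would also need to compare index types and uniqueness before dropping anything; the sketch only flags candidates for review.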
3.5 Summary
Indexes are a double-edged sword, improving query efficiency but also slowing down inserts and updates and taking up disk space.
The ultimate purpose of choosing indexes is to make queries faster. The principles given above are the most basic guidelines, but you should not limit yourself to them. Keep practicing in your future study and work, analyze each application according to its actual situation, and choose the most appropriate indexing approach.
References
MySQL Technical Insider: InnoDB Storage Engine (2nd Edition)
Database Index Design and Optimization