Why do I need a prefix index
The problem
Have you ever hit this error when creating an index on one or more columns of a table?
Specified key ‘uniq_code’ was too long; max key length is 767 bytes.
The table structure is as follows:
CREATE TABLE `t_account` (
  `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto-increment ID',
  `date` VARCHAR(50) NOT NULL DEFAULT ' ' COMMENT 'date',
  `nick_name` VARCHAR(50) NOT NULL DEFAULT ' ' COMMENT 'nickname',
  `account` VARCHAR(50) NOT NULL DEFAULT ' ' COMMENT 'account',
  `city` VARCHAR(100) NOT NULL DEFAULT ' ' COMMENT 'city',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uniq_code` (`nick_name`, `account`, `city`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Test';
Why
In MySQL 5.6, innodb_large_prefix defaults to 0 (OFF), which limits the key length of each indexed column to 767 bytes.
In MySQL 5.7 (from 5.7.7), innodb_large_prefix defaults to 1 (ON), which lifts the 767-byte limit, but the key length of an indexed column still cannot exceed 3072 bytes.
As for why the number is 767, that is a detail of the storage engine implementation; the official documentation does not explain it either: https://dev.mysql.com/doc/refman/8.0/en/create-index.html
The number of bytes a VARCHAR(n) column uses depends on the character set:
- GBK: each character takes at most 2 bytes.
- utf8: each character takes at most 3 bytes.
- utf8mb4: each character takes at most 4 bytes.
This table uses utf8mb4, so each character can take up to 4 bytes. The index covers 50 + 50 + 100 = 200 characters, which is 800 bytes in total, so it exceeds the limit.
With the utf8 character set, VARCHAR(255) needs at most 255 × 3 = 765 bytes, which just fits under the 767-byte limit, but that is no reason to declare every column as VARCHAR(255). Choose each column's length from the business requirements; overly long columns make for poor composite indexes.
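The byte arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper names `key_bytes` and `max_chars` are made up for illustration, the per-character widths are the worst-case values listed above, and the extra length bytes MySQL stores for a VARCHAR are ignored.

```python
# Worst-case bytes per character for the character sets discussed above.
MAX_BYTES_PER_CHAR = {"gbk": 2, "utf8": 3, "utf8mb4": 4}

def key_bytes(char_lengths, charset):
    """Worst-case index bytes for each column (hypothetical helper)."""
    width = MAX_BYTES_PER_CHAR[charset]
    return [n * width for n in char_lengths]

def max_chars(limit_bytes, charset):
    """Largest character count guaranteed to fit in limit_bytes (hypothetical helper)."""
    return limit_bytes // MAX_BYTES_PER_CHAR[charset]

# Column lengths taken from the t_account table definition.
columns = {"nick_name": 50, "account": 50, "city": 100}
per_column = key_bytes(columns.values(), "utf8mb4")

print(dict(zip(columns, per_column)))  # {'nick_name': 200, 'account': 200, 'city': 400}
print(sum(per_column))                 # 800
print(max_chars(767, "utf8"))          # 255
print(max_chars(767, "utf8mb4"))       # 191
```

The last two lines show why VARCHAR(255) fits under 767 bytes with utf8 but not with utf8mb4, where only 191 characters are guaranteed to fit.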
The solution
- Shorten the columns, or remove some columns from the index. Either change alters the design of the table itself, so it is not a friendly fix.
- Index only the first n characters of each column. Choose n by measuring the length of the real business data.
  UNIQUE KEY `uniq_code` (`nick_name`(20), `account`(20), `city`(20))
  Here the first 20 characters of each of the three columns form the unique index, so the key stays within the limit. This is what we call a prefix index. Note that a UNIQUE prefix index enforces uniqueness on the prefixes only, so two rows that agree on all three 20-character prefixes are rejected as duplicates.
- Raise the maximum length of a single-column index.
  To change the limit, add the following to the my.ini configuration file and restart:
  # raise the single-column index limit from 767 bytes to 3072 bytes
  innodb_large_prefix=1
  With this parameter enabled, the table must also use dynamic or compressed row storage:
  - The system variable innodb_file_format is Barracuda
  - ROW_FORMAT is DYNAMIC or COMPRESSED
How do I determine the length of the prefix index
As mentioned above, a prefix index solves the problem of an index exceeding the length limit, but how do we choose a suitable prefix length for each column?
We can compare the selectivity of candidate prefixes with the selectivity of the full column, computed as follows:
Full column selectivity:
SELECT COUNT(DISTINCT column_name) / COUNT(*) FROM table_name;
Selectivity of a prefix of a given length:
SELECT COUNT(DISTINCT LEFT(column_name, prefix_length)) / COUNT(*) FROM table_name;
The closer the prefix selectivity is to the full-column selectivity, the better the prefix index performs, so look for the shortest prefix that comes close enough.
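The comparison can be automated by running the prefix query for increasing lengths. The sketch below is an illustration under several assumptions: it uses Python's built-in sqlite3 with an in-memory table instead of MySQL, so `substr(col, 1, n)` stands in for MySQL's `LEFT(col, n)`; the table `t_city`, the sample rows, the helper names, and the 0.95 threshold are all made up.

```python
import sqlite3

def prefix_selectivity(conn, table, column, max_len):
    """Return (full_selectivity, {prefix_length: selectivity})."""
    # Identifiers are interpolated directly, so pass only trusted names.
    full = conn.execute(
        f"SELECT COUNT(DISTINCT {column}) * 1.0 / COUNT(*) FROM {table}"
    ).fetchone()[0]
    by_length = {}
    for n in range(1, max_len + 1):
        # SQLite has no LEFT(); substr(col, 1, n) plays the same role here.
        by_length[n] = conn.execute(
            f"SELECT COUNT(DISTINCT substr({column}, 1, ?)) * 1.0 / COUNT(*) "
            f"FROM {table}",
            (n,),
        ).fetchone()[0]
    return full, by_length

def shortest_prefix(full, by_length, ratio=0.95):
    """Smallest length whose selectivity reaches `ratio` of the full-column value."""
    for n in sorted(by_length):
        if by_length[n] >= ratio * full:
            return n
    return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_city (city TEXT NOT NULL)")
cities = ["Shanghai", "Shantou", "Shenzhen", "Shenyang",
          "Beijing", "Beihai", "Chengdu", "Chengde"]
conn.executemany("INSERT INTO t_city VALUES (?)", [(c,) for c in cities])

full, by_length = prefix_selectivity(conn, "t_city", "city", 8)
for n, s in by_length.items():
    print(n, round(s, 3))
print("full:", full, "chosen prefix length:", shortest_prefix(full, by_length))
```

On real data the threshold is a judgment call: a lower ratio gives a shorter, smaller index at the cost of more rows to check per lookup.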
Advantages and disadvantages of prefix indexes
- Advantage: the index takes up less space and is faster to scan.
- Disadvantage: MySQL cannot use a prefix index for ORDER BY or GROUP BY.
- Disadvantage: a prefix index cannot be used as a covering index.
- Disadvantage: it may increase the number of rows scanned.
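For example, to index ID card numbers, whose leading digits are shared by many rows, you can index a hash of the value instead, or store the value in reverse order and build the prefix index on that.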
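A small Python sketch shows why reversing or hashing helps when values share a long common prefix. The ID-like strings below are made up for illustration (they are not real ID card numbers), and `zlib.crc32` is used only as a stand-in for whatever hash the extra indexed column would hold.

```python
import zlib

def prefix_selectivity(values, n):
    """Fraction of distinct values left when only the first n characters are kept."""
    return len({v[:n] for v in values}) / len(values)

# Made-up ID-like strings: identical leading digits, differing only at the end.
ids = [f"11010519900101{i:04d}" for i in range(1000)]
reversed_ids = [v[::-1] for v in ids]

print(prefix_selectivity(ids, 6))           # 0.001
print(prefix_selectivity(reversed_ids, 6))  # 1.0

# Alternative: keep a fixed-width integer hash in an extra column and index that.
hashes = [zlib.crc32(v.encode("ascii")) for v in ids]
hash_selectivity = len(set(hashes)) / len(hashes)
print(hash_selectivity)
```

With a hash column, different values can in principle collide, so a lookup should still compare the full original value in addition to the hash.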
Back to creating composite indexes
The same idea applies when you are not sure which column should come first in a composite index on a table.
The following example decides whether customer_id or staff_id should lead a composite index on the payment table by measuring the selectivity of each column:
SELECT
  COUNT(DISTINCT staff_id) / COUNT(*) AS staff_id_selectivity,
  COUNT(DISTINCT customer_id) / COUNT(*) AS customer_id_selectivity,
  COUNT(*)
FROM payment;
# staff_id_selectivity: 0.0001
# customer_id_selectivity: 0.0373
# COUNT(*): 16049
The results show that customer_id is more selective, so customer_id should be the first column of the composite index, giving (customer_id, staff_id).
In summary
The closer a prefix's selectivity is to the full-column selectivity, and the higher that selectivity is, the better the index performs.
That is, the more distinct values the indexed column has relative to the number of rows in the table, the more effective the index.