MySQL implicitly cast index invalid

The problem

In my work, I found that there was an interface that only executed one SQL query statement, and the SQL clearly used the primary key column, but the speed was very slow.

After EXPLAINN in MySQL, we find that the primary key index is not used, but a full table scan is performed.

repetition

Table DDL with user_id as primary key index:

CREATE TABLE user_message (' user_id 'varchar(50) NOT NULL COMMENT' user ID', 'msg_id' int(11) NOT NULL COMMENT 'user ID', PRIMARY KEY (`user_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;Copy the code

SQL > select * from rows where primary key is used and rows where primary key is not used;

EXPLAIN SELECT COUNT(*) FROM user_message WHERE user_id = 1; id|select_type|table |partitions|type |possible_keys|key |key_len|ref|rows |filtered|Extra | --+-----------+------------+----------+-----+-------------+-------+-------+---+-----+--------+------------------------+ 1 | SIMPLE | user_message | | index | PRIMARY | PRIMARY | 206 | | 10000 | | 10.0 Using the where; Using index|Copy the code

The user_id field in the data table is of VARCHAR type, and the user_id field in the SQL statement is of INT type. The primary key index is invalidated after the primary key index is converted.

Implicit conversion

MySQL’s official documents: dev.mysql.com/doc/refman/… , introduces the rules for implicit conversion of MySQL types:

When the operand types on both sides of the operator are inconsistent, MySQL performs type conversions to make the operands compatible. These conversions occur implicitly. The following describes the implicit conversion of the comparison operation: • If one or both arguments are NULL, the result of the comparison is NULL; Except for the <=> equality comparison operator, for NULL <=> NULL, the result is true and no conversion is required. • If both parameters in a comparison operation are strings, they are compared as strings. • If both arguments are integers, they are compared as integers. • If a hexadecimal value is not compared to a number, it is treated as a binary string. • If one of the arguments is a TIMESTAMP or DATETIME column and the other is a constant, the constant is converted to a TIMESTAMP before the comparison is performed. Do not do this for arguments to IN(). To be safe, always use the full date-time, date, or time string when making comparisons. For example, to get the best results when using BETWEEN with date or time values, use CAST() to explicitly convert those values to the desired data type. • Single-row subqueries in one or more tables are not considered constant. For example, if an integer returned by a subquery is to be compared with a DATETIME value, the comparison is done as two integers and the integers are not converted to a time value. See the previous article, in which case CAST() is used to convert the result of the subquery to a DATETIME integer value. • If one of the arguments is a decimal value, the comparison depends on the other argument. If the other argument is a decimal or integer value, the arguments are compared as decimal values; If the other argument is a floating-point value, the arguments are compared as floating-point values. • In all other cases, parameters are compared as floating point numbers (real numbers). For example, compare a string to a numeric operand as a floating-point comparison.

According to the last rule above, in the previous SQL statement, the comparison between a string and an integer is converted to two floating-point numbers, with the string type “1” on the left converted to float 1.0 and the INT type 1 on the right converted to float 1.0.

It should be possible to use indexes if both are floating-point numbers. Why is it not used in execution?

MySQL converts a string to a floating point as follows:

1. All strings that do not start with a number are converted to 0:

 SELECT CAST('abc' AS UNSIGNED)
 ​
 CAST('abc' AS UNSIGNED)|
 -----------------------+
                       0|
Copy the code

2. String conversions that begin with a number are truncated, from the first character to the first non-numeric content:

 SELECT CAST(' 0123abc' AS UNSIGNED)
 ​
 CAST(' 0123abc' AS UNSIGNED)|
 ----------------------------+
                          123|
Copy the code

So, in MySQL, strings like “1”, “1”, “1a”, “01” are converted to numbers that are 1.

MySQL executes the above SQL statement by converting the value of each primary key column to a floating point number (CAST is executed on the primary key) and comparing it with the conditional parameter. Using a function on an index column causes the index to be invalidated, resulting in a full table scan.

We can use the primary key index by simply changing the parameter passed in the previous SQL to a string:

EXPLAIN SELECT COUNT(*) FROM user_message WHERE user_id = '1'; id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra | --+-----------+------------+----------+----+-------------+-------+-------+-----+----+--------+-----------+ 1|SIMPLE | user_message | | ref | PRIMARY | PRIMARY | 202 | const | 135 | | 100.0 Using index |Copy the code

conclusion

1. If the condition column is a string and the condition parameter is an integer, it is converted to a floating point number first, and then the whole table is scanned, resulting in index invalidation.

2. Condition parameters should be as close as possible to the type of the column, avoiding implicit conversion or performing conversion functions on passed parameters to the same type as the index column.

reference

1, analysis of MySQL implicit conversion