define
Collate is used to determine the sequence of text fields such as text and vARCHar
It can be on the definition of fields (name, Nick), tables (Employee), as shown below
CREATE TABLE `employee` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(256) COLLATE utf8mb4_bin COMMENT 'name',
`nick` varchar(256) COLLATE utf8mb4_bin COMMENT 'nickname'.PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
Copy the code
It can also be used on libraries
create database test default character set utf8mb4 collate utf8mb4_general_ci;
Or use it directly in a query statement
select id from employee order by name collate utf8mb4_general_ci
Collate takes effect in the following order:
- The SQL statement
- Column level
- Table level
- Repository level
- Instance level
Relationship between UTF8 and UTfMB4
- Utf8 is 3 bytes long and UTfmb4 is 4 bytes long
- Utfmb4 is fully compatible with UTF8, so use UTFMB4 whenever possible
- Some expressions are expressed in 4 bytes and can only be expressed in UTFMB4
The difference between
The key point is whether the _CI suffix, case sensitive, is case-sensitive
- Utf8mb4_bin: case sensitive, binary data storage
- Utf8mb4_general_ci: case insensitive, may be inconsistent when sorting special characters, fast
- Utf8mb4_unicode_ci: case insensitive, precise comparison, and slow speed
pit
Since the database default collation is UTF8MB4_general_CI (case insensitive), if not specified, inconsistent keys (size inconsistent) will conflict with strings as primary keys, as shown below
MySQL [test]> CREATE TABLE `business` (
`business_id` varchar(32) COMMENT 'Business Unique Identifier'.PRIMARY KEY (`business_id`)
) ENGINE=InnoDB;
Query OK, 0 rows affected (0.02 sec)
MySQL [test]> insert into business values("key");
Query OK, 1 row affected (0.00 sec)
MySQL [test]> insert into business values("KEY");
ERROR 1062 (23000): Duplicate entry 'KEY' for key 'PRIMARY'
Copy the code
The solution
You can avoid this problem by changing the way the fields are sorted
MySQL [test]> alter table business modify column business_id varchar(32) collateutf8mb4_bin; GINE=InnoDB;
Query OK, 0 rows affected (0.02 sec)
MySQL [test]> insert into business values("KEY");
Query OK, 1 row affected (0.00 sec)
Copy the code
As you can see, after changing the sorting method of the field, you can insert it
After this problem occurs, you can only change the sorting mode from the field, instance, library, table modification can not affect the field sorting; When creating this field, it implicitly inherits the sorting mode of the upper level. When the upper level (instance, library, table) changes the sorting mode of the field, it does not affect the sorting mode of the field. Only the specified sorting mode of the field can be displayed
MySQL [test]> ALTER TABLE `business` CHARACTER SET utf8mb4 collate utf8mb4_bin;
Query OK, 0 rows affected (0.02 sec)
MySQL [test]> insert into business values("KEY");
ERROR 1062 (23000): Duplicate entry 'KEY' for key 'PRIMARY'
Copy the code
As shown above, even if we change the sorting mode of the table, the inconsistent keys will still collide, i.e., the sorting mode will not change
Select * from table_name where table_name is not sorted; select * from table_name where table_name is not sorted;
MySQL [tms_test]> show create table business;
+----------+---------------+
| Table | Create Table
+----------+---------------+
| business | CREATE TABLE `business` (
`business_id` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT 'Business Unique Identifier'.PRIMARY KEY (`business_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
+----------+---------------+
Copy the code
reference
Mysql utF8 is different from UTF8MB4, utf8MB4_bin, UTF8MB4_general_ci, utF8MB4_unicode_ci
What is COLLATE in MYSQL?