This is the fourth and final article in the series “Java and MySQL Character Set and Encoding”. This series was originally intended to understand MySQL encoding problems. In the process of sorting out, Java byte encoding problems came to mind, so this series was born.
Character encoding and comparison rules
Character encodings have been covered in previous sections. Here is just a little bit, MySQL castrate UTF-8.
1.1 UTF-8 in MySQL
Normally, utF-8 is 1 to 4 bytes long in various parts, such as the system and the Java language itself. MySQL utF-8 is utF8MB3 (UTF-8 most bytes 3), which supports a maximum of 3 bytes. As a result, some UTF-8 characters that are supported by other systems cannot be represented in MySQL, such as emojis.
To be compatible with utF-8, MySQL uses UTF8MB4 encoding scheme to support it. Let’s take a look at what character sets a MySQL system supports:
show charset;
Copy the code
1.2 Comparison Rules
The comparison rule is to compare the size of two characters. The most common way to do this is to use binary comparison sizes. Some may involve case-insensitive comparisons, which can lead to a variety of comparison rules. Each encoding scheme has a number of comparison rules. Common comparison rules (utF-8 as an example) :
show collation like 'utf8%';
Copy the code
2 character encoding and setting of comparison rules
There are four levels of character sets and comparison rules in MySQL:
Server Level Database level Table Level Column levelCopy the code
2.1 Server Level
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';
Copy the code
Of course, we can set this up when the service starts.
[server]
character_set_server=gbk
collation_server=gbk_chinese_ci
Copy the code
2.2 Database Level
SHOW VARIABLES LIKE 'character_set_database';
SHOW VARIABLES LIKE 'collation_database';
Copy the code
You can run the preceding two commands to view the information. Note: Database-level encodings can only be specified at creation time and cannot be changed after creation.
2.3 table level
You can specify when creating a table. If not specified, the database’s encoding and comparison rules are used.
2.4 column level
Columns are generally not encoded in a specific way, but can be changed by command if desired. Take the following example:
ALTER TABLE TABLE_name [[DEFAULT] CHARACTER SET CHARACTER SET [COLLATE]Copy the code
2.5 to modify
For changing character encoding and comparison rules, follow the following rules, and simply say that the two are linked:
- If you modify only the character set, the comparison rule changes to the default comparison rule of the modified character set.
- If only the comparison rule is modified, the character set is changed to the character set corresponding to the modified comparison rule.
3 Character encoding transcoding in MySQL communication
3.1 Processes involved
In the process of MySQL communication, it involves the whole process from client -> server -> client, so the coding conversion will be involved in the middle. As shown below (the picture is from the nuggets booklet “How MySQL works: Understanding MySQL from the root”).
3.2 Related Configurations
The MySQL transcoding configuration is as follows:
character_set_client
character_set_connection
character_set_results
Copy the code
Where character_set_client determines what encoding is used to decode the encoding from the client. Character_set_connection is used to convert the character set received by the Character_set_client to the specific encoding set used. Character_set_results refers to the coding set given to the client when the result is returned.
Let’s take a look at some of the database configurations:
3.3 SETTING demo for JDBC Coding
Change the spring JDBC encoding to GBK:
UncategorizedSQLException is the coding problem.
So let’s go back to UTF-8, and we’ll find it’s normal.
4 summary
This article first talked about MySQL character encoding, the theoretical knowledge of comparison rules, and looked at the specific data configuration. And then talk about the character encoding transcoding in MySQL communication. Finally, we addressed the previous article about the interaction between JDBC configuration and Character_set_client.
5 References
How MySQL Works: Understanding MySQL from the Root