1. Foreword

For anyone doing Internet development, MySQL is a familiar tool. Whether you are a DBA, a developer, or a tester, you deal with it almost every day. Still, I have fallen into the following pits; have you run into them too?

2. Signed and unsigned

At the start of project development, when our company was drafting its development specification, we held a dedicated discussion on whether int fields in MySQL should be defined as signed or unsigned.

Idea 1:

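For fields whose values are guaranteed to be non-negative, define them as UNSIGNED. This does not reduce storage, but it doubles the largest value the same 4 bytes can hold (up to 4294967295 instead of 2147483647). The following statement creates unsigned fields: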

create table test_unsigned(a int UNSIGNED, b int UNSIGNED);

Idea 2:

Define all int fields as signed, which is the default. The following statements create a signed field and insert a negative value:

create table test_signed(a int);
insert into test_signed values(-1);

Execution Result:

Inserting -1 into column a succeeds.

Between these two options, after much discussion, we chose to define int fields as signed. Why is that?

create table test_unsigned(a int UNSIGNED, b int UNSIGNED);
insert into test_unsigned values (1, 2);

This creates a table test_unsigned with two unsigned columns, a and b, and inserts one row with a=1, b=2.

select b - a from test_unsigned;

No problem: it returns 1.

But if the SQL is changed to this:

select a - b from test_unsigned;

Execution Result:

The statement fails with an error, because the result, -1, cannot be represented as an unsigned value.

Therefore, when using unsigned fields, watch out for the pit where a subtraction produces a negative result. We recommend signed fields to avoid such unnecessary problems.
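If a subtraction between unsigned columns cannot be avoided, one option is to cast both operands to signed before subtracting. The following is a minimal sketch against the test_unsigned table above, assuming a reasonably recent MySQL. The error number in the comment is what MySQL typically reports for this case; it is not copied from the original run.

```sql
-- Fails: unsigned - unsigned yields an unsigned result, so -1 is out of range
-- (typically ERROR 1690, BIGINT UNSIGNED value is out of range).
select a - b from test_unsigned;

-- Works: cast both operands to signed first, so the result may be negative.
select cast(a as signed) - cast(b as signed) as diff from test_unsigned;
```

Another option is the NO_UNSIGNED_SUBTRACTION SQL mode, which makes the result of such a subtraction signed. It changes behavior for the whole session or server, so the explicit cast is usually easier to reason about.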

3. Auto-increment

If you have ever created a table, you know that its primary key can be defined as auto-increment. The database then generates the primary key value itself, so the code does not need to specify it, and the value increases with each insert. A table is typically created like this:

create table test_auto_increment (a int auto_increment primary key);

But what if I changed it to something like this?

create table test_auto_increment (a int auto_increment);

Execution Result:

The statement fails. The full error message is:

1075 - Incorrect table definition; there can be only one auto column and it must be defined as a key, Time: 0.006000

An auto-increment field must be defined as a key, so we need to add primary key.
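The error message asks for a key, not specifically a primary key. As a minimal sketch (the table name, the extra column b, and the index name idx_a are hypothetical), an ordinary index on the auto-increment column is also accepted:

```sql
-- Hypothetical table: the auto-increment column has a plain index, not a primary key.
create table test_auto_increment_key (
    a int auto_increment,
    b varchar(20),
    key idx_a (a)
);

-- Column a is omitted, so MySQL generates its values.
insert into test_auto_increment_key (b) values ('x'), ('y');
select * from test_auto_increment_key;
```

In practice a primary key remains the usual choice, since a plain index does not guarantee uniqueness.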

Here’s another interesting experiment:

insert into test_auto_increment(a) values (null),(50),(null),(8),(null);

Guess what the result will be?

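The first null becomes 1. The explicit values 50 and 8 are stored as given. Each later null becomes one more than the largest value so far, so the second null becomes 51 and the third becomes 52. Queried in primary key order, the table contains 1, 8, 50, 51, 52.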

Next question: can a negative number be inserted into the primary key?

insert into test_auto_increment values(-3);

The answer is yes: a negative number can be inserted into the primary key.

What about this statement, which inserts 0 into the primary key?

insert into test_auto_increment values(0);

Execution Result:

The statement also succeeds, but the value 0 is not stored: for an auto-increment column MySQL treats 0 like null and generates the next value instead.
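One way to check what actually happens to the 0 is to look at the generated value right after the insert. This is a minimal sketch against the test_auto_increment table above, assuming the default SQL mode:

```sql
insert into test_auto_increment values (0);

-- Returns the value MySQL generated in place of 0.
select last_insert_id();

-- No row with a = 0 appears; a new row with the generated value does.
select * from test_auto_increment order by a;
```

If a literal 0 really has to be stored, MySQL offers the NO_AUTO_VALUE_ON_ZERO SQL mode, under which only null triggers value generation.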

4. Field length

When creating a table, we specify a length after the field type, such as varchar(20) or bigint(20). So the question is: does the length of a varchar count bytes or characters?

create table test_varchar(a varchar(20));
insert into test_varchar values('苏三说技术');
select length(a), character_length(a) from test_varchar;

Results after execution:

We can see that length returns 15 for the five Chinese characters, meaning they occupy 15 bytes, while character_length returns 5, meaning there are 5 characters. So the length of a varchar is counted in characters. A complex character such as a Chinese character cannot be represented in one byte; in utf8 it takes three bytes. Different database encodings use different numbers of bytes per character, as the encoding comparison table shows.
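The character-based limit can also be seen at the boundary. The sketch below uses a hypothetical table and assumes a utf8 or utf8mb4 column with a matching connection character set. The error number in the comment is what MySQL typically reports in strict mode.

```sql
-- Hypothetical table for illustration.
create table test_varchar_limit (a varchar(20));

-- 20 Chinese characters are 60 bytes in utf8, yet still fit in varchar(20).
insert into test_varchar_limit values (repeat('中', 20));

-- 21 characters exceed the limit: rejected in strict SQL mode
-- (typically ERROR 1406, Data too long for column), truncated with a warning otherwise.
insert into test_varchar_limit values (repeat('中', 21));

-- The first row reports 60 bytes and 20 characters.
select length(a), char_length(a) from test_varchar_limit;
```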

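varchar and char are the types whose length counts characters. For the integer types, the storage size in bytes is fixed by the type itself, and the number in parentheses has a different meaning.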

In int(n), what does the n mean? Start with an example:

create table test_bigint (a bigint(4) ZEROFILL);
insert into test_bigint values(1);
insert into test_bigint values(123456);
select * from test_bigint;

ZEROFILL pads the displayed value with leading zeros when it is shorter than the display width.

Execution Result:

For comparison, the storage sizes of MySQL's common integer types are: tinyint 1 byte, smallint 2 bytes, mediumint 3 bytes, int 4 bytes, bigint 8 bytes.

As this comparison shows, bigint occupies 8 bytes, but column a is declared with a display width of 4, so a value with fewer than 4 digits is padded with leading zeros (1 is shown as 0001). A value with 4 or more digits is displayed as is, for example 123456. Note, however, that some MySQL clients may show only the first 4 digits of a longer value, such as 1234.

So in bigint(4), the 4 is the display width in digits. The column still occupies 8 bytes and can hold any bigint value.
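A quick way to confirm that the display width does not limit the stored range is to put the largest bigint value into columns with different widths. This is a minimal sketch with a hypothetical table. Recent MySQL 8.0 releases accept this syntax but warn that integer display width is deprecated.

```sql
-- Hypothetical table: same type, different display widths.
create table test_width (narrow bigint(4), wide bigint(20));

-- 9223372036854775807 is the largest signed bigint value.
insert into test_width values (9223372036854775807, 9223372036854775807);

-- Both columns return the full value.
select narrow, wide from test_width;
```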

5. Case insensitivity

English letters come in upper and lower case: are a and A the same? We would assume they are different, but how does the database handle them?

create table test_a(a varchar(20));
insert into test_a values('a');
insert into test_a values('A');
select * from test_a where a = 'a';

What is the result of the execution?

I expected only a to be returned, but A came back as well. Why is that?

The default collation of this table is utf8_general_ci, which ignores case. So a query for lowercase a unexpectedly returns uppercase A as well.

So what if we only want the lowercase a? First, which collations does MySQL support?

show collation;

Among the supported collations, utf8_bin compares strings by their binary values and is therefore case sensitive.

Change the collation of the field:

ALTER TABLE test_a MODIFY COLUMN a VARCHAR(20) BINARY CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;

Query the data again:

select * from test_a where a = 'a';

Execution Result:

Sure enough, this time only the lowercase a is returned.
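If changing the column definition is not an option, a single query can be made case sensitive instead. One possible sketch is to compare the column as a binary string, which works regardless of the column's collation:

```sql
-- Byte-wise comparison: matches 'a' but not 'A', whatever the column collation is.
select * from test_a where cast(a as binary) = 'a';
```

The trade-off is that wrapping the column in a cast generally prevents MySQL from using an index on it, so for frequently run queries the column-level collation change is the better fit.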

6. Special characters

On one project I provided a message feature, and a user entered an emoji, which caused the interface to fail with an error.

The root cause was that the database and table used the utf8 character encoding at the time. MySQL's utf8 encodes a character in at most 3 bytes, but an emoji takes 4 bytes, so utf8 cannot store emoji.

How to solve this problem?

Change the character encoding to utf8mb4. utf8mb4 uses up to 4 bytes per character, but it is only supported in MySQL 5.5.3 or later.

Modify the MySQL configuration file (for example my.cnf on Linux or my.ini on Windows) as follows:

[client]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
collation-server     = utf8mb4_general_ci

Restart MySQL and check the encoding with the following command. The client, connection, database, results, and server character sets should all be utf8mb4. This changes the encoding of the entire database.

SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%';


You can also change the encoding of a single table:

alter table test_a convert to character set utf8mb4 collate utf8mb4_bin;

Or change the encoding of a single field:

ALTER TABLE test_a CHANGE a a VARCHAR(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

In addition, it is recommended to define the character encoding as utf8mb4 when creating databases and tables, to avoid unnecessary problems.
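As a minimal sketch of that recommendation, the statements below create a database and a table with utf8mb4 from the start and store an emoji. The names demo_emoji, message, and content are hypothetical, and the example assumes MySQL 5.5.3 or later and a client that sends UTF-8.

```sql
-- Make the connection itself use utf8mb4, otherwise the emoji is lost before it reaches the table.
set names utf8mb4;

-- Hypothetical database and table, both created with utf8mb4.
create database demo_emoji character set utf8mb4 collate utf8mb4_general_ci;
create table demo_emoji.message (
    content varchar(200)
) character set utf8mb4 collate utf8mb4_general_ci;

insert into demo_emoji.message values ('hello 😀');

-- The emoji counts as 1 character but occupies 4 bytes.
select content, length(content), char_length(content) from demo_emoji.message;
```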

The article comes from Su SAN said technology