1. Introduction
In the early days of MySQL data types, even for a long time int(10) was thought to mean that only 10 digits could be stored; Varchar (10) can store fewer than 10 Chinese characters, because one Chinese character requires several bytes. In fact, these perceptions are all wrong, and you can get the argument right both on the website and in practice.
2. Number type
Why is it wrong to say that int(M) can only store M digits? Take a look at the following example
Create a table
create table data_type(id int(3));
Copy the code
Based on the datatype declaration and error argument, we can only hold 3 digits in the ID field. We can try to insert a value with more than 3 digits into the ID field and see what happens
mysql> insert into data_type values(1000001);
Query OK, 1 row affected (0.00 sec)
Copy the code
In practice we can see that a value of 7 digits can be inserted normally. Why is this so? Take a look at the official documentation
2.1 Interpretation of M in official documents
For integer data types,
M
indicates the maximum display width. The maximum display width is 255. Display width is unrelated to the range of values a type can store
M only represents the maximum display width, and the display width has nothing to do with the range of stored values
2.2 Display width
From the above explanation, you may still have no idea what display width means. Let’s take a look at the following example
Create a table
create table data_type(id int(5) ZEROFILL);
Copy the code
Insert three pieces of data
insert into data_type values(1);
insert into data_type values(100);
insert into data_type values(10000);
Copy the code
View the results
By now I think you should understand the meaning of M, M only represents the display width, not the range of stored value. In the case of declaring ZEROFILL, when the number of bits of stored value is less than M, 0 will be used for filling, when the number of bits of stored value is greater than or equal to M, it will be displayed normally
Note: The preceding example may not be repeated on a third-party client. If possible, use the command line
2.3 Number of occupied bytes and Storage range
Since M only represents the display width, what is the number of bytes taken up by the integer type versus the storage range? Take a look at the official presentation
From the official documentation, it is clear how many bytes each type occupies and the corresponding storage range
As you know, every table has an ID of type Bigint (20) as the primary key. The primary key usually does not have a negative number. Declaring the ID as bigint(20) unsigned makes the storage range twice as large as it used to be
3. The value is a string
How many Chinese characters do you think VARCHar (M) can store? Do I need to do M / 4 or M / 3 calculations? Take a look at the following example
Create a table
create table data_type(name varchar(3))engine=innodb charset=utf8mb4;
Copy the code
Insert data
mysql> insert into data_type values('张');
Query OK, 1 row affected (0.01 sec)
mysql> insert into data_type values('Joe');
Query OK, 1 row affected (0.00 sec)
mysql> insert into data_type values('Zhang Xiaosan');
Query OK, 1 row affected (0.00 sec)
mysql> insert into data_type values('Zhang Xiaosan');
ERROR 1406 (22001): Data too long for column 'name' at row 1
Copy the code
It can be seen that varchar(3) can insert 3 Chinese characters, but the insertion fails if more than 3 Chinese characters are inserted. Here we can draw a conclusion that M does not represent bytes
3.1 Interpretation of M in official documents
String Data Type SyntaxA variable-length string.
M
represents the maximum column length in characters.
M stands for character length, which means it can hold M numbers, English letters and Chinese characters
3.2 Number of occupied bytes
Since M represents character length, how many bytes does varchar(M) occupy?
3.2.1 Character Code
Different character encodings occupy different bytes. For example, latin1 encodings 1 character to 1 byte. Ucs2 character encoding 1 character consists of 2 bytes. The UTF8 character encoding is 1 character in 3 bytes; Utf8mb4 character encoding 1 character is 4 bytes;
The serial number | A character encoding | Occupied bytes |
---|---|---|
1 | latin1 | 1 |
2 | ucs2 | 2 |
3 | utf8 | 3 |
4 | utf8mb4 | 4 |
Is it correct for varchar(M) to occupy 4M bytes when the character encoding is UTF8MB4?
3.2.2 Length Number of bytes occupied
String Data Type SyntaxMySQL stores
VARCHAR
values as a 1-byte or 2-byte length prefix plus data. The length prefix indicates the number of bytes in the value. AVARCHAR
column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.
We need 1 or 2 bytes to store the length of the string.
For example, a
VARCHAR(255)
column can hold a string with a maximum length of 255 characters. Assuming that the column uses thelatin1
character set (one byte per character), the actual storage required is the length of the string (L
), plus one byte to record the length of the string. For the string'abcd'
,L
is 4 and the storage requirement is five bytes. If the same column is instead declared to use theucs2
double-byte character set, the storage requirement is 10 bytes: The length of'abcd'
is eight bytes and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).
Varchar (255) in latin1 encoding to store ABCD needs to occupy 4 * 1 + 1 = 5 bytes; Varchar (255) The storage of ABCD in UCS2 takes up 4 * 2 + 2 = 10 bytes.
In ucS2, the length of vARCHar (255) = 255 x 2 Is greater than 255 bytes. Therefore, two bytes are required to represent the length
# 4. Summary
The M in the numeric type represents the display width and does not represent the range of stored values; The number of bytes vARCHar takes depends on the length of the character encoding and declaration