Thanks to the Internet, we can use fragments of spare time, such as waiting for a bus or riding the subway, to learn and look things up anytime and anywhere. The Internet also makes it easy to share knowledge quickly and to discuss it with people who have the same interests and needs.
But sharing that is this convenient also makes knowledge uneven in quality, and it is easy to pick up wrong information. Most of these errors arise because technology moves quickly and authors do not have time to update what they have already published. To avoid confusing later learners, today we look at some common errors in MySQL design specifications.
Primary key design
Incorrect design specification: use an auto-increment ID as the primary key; do not use UUID, MD5, hash, or string values as primary keys.
This design specification appears in many articles. The advantages of auto-increment primary keys are a small storage footprint, ordered values, and ease of use.
Here’s a look at the disadvantages of auto-increment primary keys:
- Because the auto-increment value is generated on the server side, it must be protected by an auto-increment lock. With a large number of concurrent insert requests, this lock can become a performance bottleneck.
- An auto-increment primary key is unique only within the current instance, not globally, so it cannot be used in a distributed architecture.
- The value is exposed as public data, which easily causes security problems. If a product ID is an auto-increment primary key, users can change the ID value to fetch other products and, in serious cases, can work out how many products exist in the database.
- It can cause performance problems with MGR (MySQL Group Replication).
As noted above, the auto-increment value is generated on the MySQL server and protected by an auto-increment lock, which can become a bottleneck under a large number of insert requests. In MySQL, the innodb_autoinc_lock_mode parameter controls how long the auto-increment lock is held. Although we can tune innodb_autoinc_lock_mode for maximum performance, the parameter brings other problems of its own. Therefore, in concurrent scenarios, UUID primary keys or custom primary keys generated by the business are recommended.
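To see which mode an instance is running in, the variable can be read from any session. This is a minimal sketch, assuming MySQL 5.7 or 8.0 with InnoDB; the meanings in the comments follow the documented values of innodb_autoinc_lock_mode.

```sql
-- Read the current auto-increment lock mode.
-- 0 = traditional, 1 = consecutive, 2 = interleaved
-- (the default is 2 in MySQL 8.0 and 1 in MySQL 5.7).
SELECT @@innodb_autoinc_lock_mode AS autoinc_lock_mode;
```

The variable is not dynamic, so changing it means setting it in the server configuration and restarting. Interleaved mode is the least restrictive, but it is not safe with statement-based replication, which is one of the trade-offs to weigh.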
We can call the UUID() function directly in MySQL to get a UUID value.
mysql> SELECT UUID();
+--------------------------------------+
| UUID()                               |
+--------------------------------------+
| 23ebaa88-ce89-11eb-b431-0242ac110002 |
+--------------------------------------+
1 row in set (0.00 sec)
Note that a UUID stores its timestamp with the time bits in reverse order: the low time bits come first and the high time bits come last. As a result, the first four bytes of a UUID change "randomly" over time rather than increasing monotonically. Such non-sequential values cause scattered IO when inserted, which creates a performance bottleneck. This is the biggest drawback of UUID compared with auto-increment.
To solve this problem, MySQL 8.0 introduced the function UUID_TO_BIN, which converts a UUID string as follows:
- It moves the high time bits to the front (when its second argument, the swap flag, is TRUE), which solves the out-of-order problem of UUID inserts.
- It removes the meaningless "-" characters, saving storage space.
- It stores the string as a binary value, reducing the space from 36 bytes to 16 bytes.
Here we use UUID_TO_BIN to convert the previous UUID string 23ebaa88-ce89-11eb-b431-0242ac110002 to the following binary value:
mysql> SELECT UUID_TO_BIN('23ebaa88-ce89-11eb-b431-0242ac110002', TRUE) AS UUID_BIN;
+------------------------------------+
| UUID_BIN                           |
+------------------------------------+
| 0x11ebce8923ebaa88b4310242ac110002 |
+------------------------------------+
1 row in set (0.01 sec)
In addition, MySQL 8.0 provides the BIN_TO_UUID function, which converts the binary value back to a UUID string.
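Put together, the two functions can back a primary key column directly. The following is a minimal sketch, assuming MySQL 8.0; the table orders_demo and its columns are hypothetical names used only for illustration.

```sql
-- Hypothetical demo table: the primary key is a 16-byte ordered UUID.
CREATE TABLE orders_demo (
    id   BINARY(16)  NOT NULL,
    note VARCHAR(64) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- The second argument TRUE swaps the time-low and time-high parts,
-- so newly generated keys sort roughly by creation time.
INSERT INTO orders_demo (id, note)
VALUES (UUID_TO_BIN(UUID(), TRUE), 'first order');

-- Pass the same swap flag to BIN_TO_UUID to get the original string back.
SELECT BIN_TO_UUID(id, TRUE) AS id_text, note
FROM orders_demo;
```

The swap flag has to match on both sides: a value written with TRUE and read back with FALSE (or without the flag) yields a different UUID string.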
Although the UUID_TO_BIN/BIN_TO_UUID functions are not available before MySQL 8.0, you can solve this with custom functions, or implement the equivalent functions at the application layer in your own programming language.
Of course, many readers also worry about UUID performance and storage space. I ran an insert performance test, and the results are shown in the following table:
As you can see, the ordered UUID provided by MySQL 8.0 performs best, even better than auto-increment IDs. In addition, because UUID_TO_BIN produces a 16-byte value, each key is only 8 bytes larger than an auto-increment ID, and the total storage space is only 3 GB more than with auto-increment IDs.
And because a UUID is globally unique, the benefits of UUID far outweigh those of auto-increment IDs. You may be used to auto-increment primary keys, but in concurrent scenarios, globally unique primary keys such as UUID are recommended.
Of course, although UUID is good, in distributed scenarios the primary key also needs to carry some additional information to keep secondary-index queries efficient, so a custom primary key designed around the business is recommended. When concurrency and data volume are not that high, the ordered UUID is still recommended. Do not assume that a UUID cannot be a primary key.
Design of financial fields
Wrong design specification: finance-related amounts must use the DECIMAL type, because FLOAT and DOUBLE are imprecise floating-point types while DECIMAL is an exact fixed-point type. Therefore, financial fields such as user balance and product price are generally DECIMAL, which can be exact to the cent.
However, the design standards for high-volume Internet services do not recommend DECIMAL; they recommend converting the amount to an integer type instead. That is, amounts are better stored in cents than in dollars. For example, 1 dollar is stored in the database as the integer 100.
Here are the advantages of BIGINT:
- DECIMAL is a binary-encoded type and is less computationally efficient than BIGINT.
- A BIGINT field is fixed-length and stores efficiently, whereas the size of a DECIMAL field depends on its declared width; fixed-length fields perform better in data design.
- Storing amounts in cents in a BIGINT still covers amounts at the quadrillion level, which is more than enough.
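The points above can be sketched as a table that keeps balances in cents. This is a minimal sketch, assuming MySQL with InnoDB; the table account_demo and the column balance_cents are hypothetical names used only for illustration.

```sql
-- Hypothetical demo table: amounts are stored in cents as integers.
CREATE TABLE account_demo (
    user_id       BIGINT NOT NULL,
    balance_cents BIGINT NOT NULL DEFAULT 0,
    PRIMARY KEY (user_id)
) ENGINE = InnoDB;

-- 1 dollar is stored as the integer 100.
INSERT INTO account_demo (user_id, balance_cents) VALUES (1, 100);

-- Arithmetic stays in exact integers: add 2.50 dollars as 250 cents.
UPDATE account_demo
SET balance_cents = balance_cents + 250
WHERE user_id = 1;

-- Convert to dollars only when presenting the value.
SELECT user_id,
       balance_cents,
       balance_cents / 100 AS balance_dollars
FROM account_demo;
```

Keeping the conversion at the edge (display or reporting) means every stored value and every update is an exact integer operation.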
Use of enumeration fields
Wrong design specification: Avoid using ENUM types
In earlier projects, when we met fields such as user gender, whether a product is on sale, or whether a comment is hidden, we simply defined them as TINYINT and then documented in a comment what 0 and 1 mean.
The problems with this design are also obvious:
- Unclear meaning: the table may have been designed by a colleague. If you do not remember it well, you have to read the field comments every time, and sometimes you even have to check the database to confirm what a field means while coding.
- Dirty data: although the application can restrict the inserted values in code, the values can still be changed through SQL statements and visualization tools.
For fields with fixed option values, it is recommended to use the ENUM string type together with strict SQL mode (SQL_MODE).
In MySQL 8.0.16 and later, you can use a check constraint directly instead of the ENUM type.
We typically use single characters such as “Y” and “N” when defining enumeration values, which does not take up much space. However, if the option values are not fixed and may grow with the business, enumeration fields are not recommended.
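Both approaches can be sketched side by side. This is a minimal sketch, assuming MySQL 8.0.16 or later with strict SQL mode enabled (STRICT_TRANS_TABLES is part of the default sql_mode in MySQL 8.0); the tables user_demo and comment_demo and the constraint name chk_is_hidden are hypothetical.

```sql
-- Option 1: ENUM restricts the column to a fixed set of strings.
CREATE TABLE user_demo (
    id     BIGINT NOT NULL AUTO_INCREMENT,
    gender ENUM('M', 'F') NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

INSERT INTO user_demo (gender) VALUES ('M');    -- accepted
-- INSERT INTO user_demo (gender) VALUES ('X'); -- rejected in strict mode

-- Option 2: a CHECK constraint (enforced from MySQL 8.0.16).
CREATE TABLE comment_demo (
    id        BIGINT  NOT NULL AUTO_INCREMENT,
    is_hidden CHAR(1) NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT chk_is_hidden CHECK (is_hidden IN ('Y', 'N'))
) ENGINE = InnoDB;

INSERT INTO comment_demo (is_hidden) VALUES ('N');    -- accepted
-- INSERT INTO comment_demo (is_hidden) VALUES ('X'); -- rejected by the constraint
```

In both cases the database itself rejects invalid values, so data written through ad hoc SQL or a visualization tool is held to the same rule as data written by the application.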
Index limit
Bad design specification: limit the number of indexes on a table to no more than 5
MySQL does not limit a table to five indexes (InnoDB allows up to 64 secondary indexes per table). Create the indexes that your business queries need.
Use of subqueries
Bad design practice: Avoid using subqueries
In fact, this specification is correct for older versions of MySQL. Earlier versions had limited optimization for subqueries, so in many OLTP scenarios online services were required to avoid subqueries as far as possible.
However, subquery optimization has improved greatly in MySQL 8.0, so subqueries can be used safely in the new version.
Subqueries are easier for people to understand than joins. For example, suppose we want to count the users who did not publish an article in 2020:
SELECT COUNT(*)
FROM user
WHERE id NOT IN (
    SELECT user_id
    FROM blog
    WHERE publish_time >= '2020-01-01' AND publish_time <= '2020-12-31'
);
As you can see, the logic of the subquery is clear: NOT IN selects the users who do not appear in the article table for that period.
Written as a LEFT JOIN, the same query looks like this:
SELECT COUNT(*)
FROM user LEFT JOIN blog
    ON user.id = blog.user_id
    AND blog.publish_time >= '2020-01-01' AND blog.publish_time <= '2020-12-31'
WHERE blog.user_id IS NULL;
As you can see, the LEFT JOIN also meets the requirement, but it is harder to understand.
We used EXPLAIN to look at the execution plans of both SQL statements and found that they are the same.
As the figure above shows, both the subquery and the LEFT JOIN are eventually converted to a left hash join, so the two statements take the same time to execute. That is, in MySQL 8.0 the optimizer automatically rewrites the IN subquery into the best join execution plan, which significantly improves performance.
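To repeat the comparison on your own instance, the two plans can be printed in tree form. This is a minimal sketch, assuming MySQL 8.0.16 or later (where EXPLAIN FORMAT=TREE is available) and the user and blog tables from the queries above; it also assumes blog.user_id is declared NOT NULL, because NOT IN behaves differently when the subquery can return NULL.

```sql
-- Plan for the subquery form.
EXPLAIN FORMAT=TREE
SELECT COUNT(*)
FROM user
WHERE id NOT IN (
    SELECT user_id
    FROM blog
    WHERE publish_time >= '2020-01-01' AND publish_time <= '2020-12-31'
);

-- Plan for the LEFT JOIN form.
EXPLAIN FORMAT=TREE
SELECT COUNT(*)
FROM user LEFT JOIN blog
    ON user.id = blog.user_id
    AND blog.publish_time >= '2020-01-01' AND blog.publish_time <= '2020-12-31'
WHERE blog.user_id IS NULL;
```

The exact plan depends on the server version, the available indexes, and the table statistics, so the output on another instance may differ from the one described here.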
Conclusion
After reading the content above, I believe you have a new understanding of MySQL. These common errors can be summarized as follows:
- A UUID can also be used as a primary key. An ordered UUID performs better than an auto-increment primary key.
- Besides DECIMAL, you can also try BIGINT for financial fields, storing amounts in cents.
- For fields with fixed option values, ENUM fields are recommended before MySQL 8, and check constraints from MySQL 8 (8.0.16 and later); do not use 0, 1, and 2 to represent the options.
- The number of indexes on a table is not limited to five. You can add or drop indexes based on business needs.
- MySQL 8 optimizes subqueries, so they can be used with confidence.