Table structure design

Numeric types

The integer types

| Type | Storage (bytes) | Minimum ~ maximum [signed] | Minimum ~ maximum [unsigned] |
|---|---|---|---|
| TINYINT | 1 | -128 ~ 127 | 0 ~ 255 |
| SMALLINT | 2 | -32768 ~ 32767 | 0 ~ 65535 |
| MEDIUMINT | 3 | -8388608 ~ 8388607 | 0 ~ 16777215 |
| INT | 4 | -2147483648 ~ 2147483647 | 0 ~ 4294967295 |
| BIGINT | 8 | -9223372036854775808 ~ 9223372036854775807 | 0 ~ 18446744073709551615 |
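The ranges in the table follow directly from each type's storage size; a quick sanity check (a sketch in Python, assuming the usual two's-complement layout):

```python
# Derive the signed/unsigned range of each MySQL integer type
# from its storage size in bytes (two's-complement assumption).
def int_range(n_bytes: int, unsigned: bool = False):
    bits = n_bytes * 8
    if unsigned:
        return 0, 2**bits - 1
    return -(2**(bits - 1)), 2**(bits - 1) - 1

for name, size in [("TINYINT", 1), ("SMALLINT", 2), ("MEDIUMINT", 3),
                   ("INT", 4), ("BIGINT", 8)]:
    print(name, int_range(size), int_range(size, unsigned=True))
```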

Integer types have signed and unsigned attributes that determine the range of values; the default is signed. If sql_mode includes NO_UNSIGNED_SUBTRACTION, subtraction involving unsigned operands is allowed to produce a signed (negative) result.

Floating point and high precision

FLOAT and DOUBLE are neither high-precision nor SQL-standard types, and they are not recommended. Use the DECIMAL type instead.
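The problem with binary floating point can be seen in one line; a sketch using Python's float and decimal module as stand-ins for DOUBLE and DECIMAL:

```python
# Binary floating point cannot represent 0.1 or 0.2 exactly,
# while decimal fixed-point arithmetic is exact.
from decimal import Decimal

float_sum = 0.1 + 0.2                           # like DOUBLE
decimal_sum = Decimal("0.1") + Decimal("0.2")   # like DECIMAL

print(float_sum == 0.3)                 # False (0.30000000000000004)
print(decimal_sum == Decimal("0.3"))    # True
```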

Business Design practices

  • The AUTO_INCREMENT attribute generates incrementing integer values.

    • Use BIGINT, not INT, as the auto-increment primary key type
    • Auto-increment values are not persisted and may backtrack after a restart (pre-8.0)

    When the upper limit of the integer type is reached, inserting another auto-increment row makes MySQL report a duplicate-key error.

  • In large-scale Internet business design, the DECIMAL type is not recommended for monetary fields; instead, convert the amount to an integer type (for example, store cents in a BIGINT).
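A minimal sketch of that conversion, done at the application boundary (the helper names here are illustrative, not from any library):

```python
# Store monetary amounts as integer cents in a BIGINT column
# instead of DECIMAL; convert only at the application boundary.
from decimal import Decimal

def to_cents(amount: str) -> int:
    """'19.99' -> 1999; exact, no float rounding."""
    return int(Decimal(amount) * 100)

def from_cents(cents: int) -> Decimal:
    return Decimal(cents) / 100

print(to_cents("19.99"))   # 1999, safe to store in BIGINT
print(from_cents(1999))    # 19.99
```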

String type

The MySQL string types are CHAR, VARCHAR, BINARY, BLOB, TEXT, ENUM, and SET.

CHAR(N) holds a fixed-length string; N ranges from 0 to 255. VARCHAR(N) stores variable-length strings; N is a number of characters, bounded by the maximum row size of 65535 bytes.

For content longer than that, use TEXT or BLOB, whose largest variants can store up to 4GB. The difference is that BLOBs have no character set attribute and are pure binary storage.

Character set

The recommended character set is UTF8MB4

SELECT CAST(0xF09F988E as char charset utf8mb4) as emoji 

As of 8.0, the default character set is utf8mb4; before 8.0 it was latin1. This can be set in the configuration file:

[mysqld]
character-set-server=utf8mb4

With the GBK character set, one character occupies at most two bytes; with utf8mb4, at most four. In multi-byte character sets, CHAR behaves like VARCHAR: both are effectively variable-length storage.
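The per-character byte widths can be checked directly; a sketch using Python's "gbk" and "utf-8" codecs as stand-ins for MySQL's gbk and utf8mb4:

```python
# Byte width of one character under different character sets.
print(len("A".encode("utf-8")))    # 1 byte
print(len("中".encode("gbk")))     # 2 bytes: GBK caps at 2 bytes/char
print(len("中".encode("utf-8")))   # 3 bytes
print(len("😎".encode("utf-8")))   # 4 bytes: why utf8mb4 is needed for emoji
```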

Collation

Collation is a rule that compares and sorts strings. You can run the SHOW CHARSET command to see the default Collation rules for a character set

SHOW CHARSET LIKE '%utf8%'
SHOW COLLATION  LIKE 'utf8mb4%'

A collation ending in _ci is case-insensitive, one ending in _cs is case-sensitive, and one ending in _bin compares by the stored binary value of the characters.

SELECT CAST('a' as char) COLLATE utf8mb4_0900_as_cs = CAST('A' as CHAR) COLLATE utf8mb4_0900_as_cs as result;
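The _ci vs _cs distinction can be mimicked roughly in Python (a sketch only: real collations compare ICU-style weight tables, not folded strings):

```python
# Rough model of case-insensitive (_ci) vs case-sensitive (_cs) collations.
def compare_ci(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()   # like utf8mb4_0900_ai_ci

def compare_cs(a: str, b: str) -> bool:
    return a == b                          # like utf8mb4_0900_as_cs / _bin

print(compare_ci("a", "A"))   # True
print(compare_cs("a", "A"))   # False, matching the SQL example above
```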

Modify the character set

ALTER TABLE emoji_test CHARSET utf8mb4

This changes the default character set of the table to utf8mb4. Columns added later will use utf8mb4, but the character set of existing columns is not changed.

To convert the existing columns and data as well, run:

ALTER TABLE emoji_test CONVERT TO CHARSET utf8mb4;

Business Design practices

Gender field design

Before 8.0, a gender field can be modeled as an ENUM, which only allows the defined values to be inserted. If sql_mode is set to strict mode, inserting an undefined value reports an error.

CREATE TABLE `User` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `sex` enum('M','F') COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

SET sql_mode = 'STRICT_TRANS_TABLES';

Since MySQL 8.0.16, the database natively supports CHECK constraints:

CREATE TABLE `User` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `sex` char(1) COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  CONSTRAINT `user_chk_1` CHECK (((`sex` = _utf8mb4'M') or (`sex` = _utf8mb4'F')))
) ENGINE=InnoDB;

The date type

The common date types in the MySQL database are YEAR, DATE, TIME, DATETIME, and TIMESTAMP.

DATETIME

The DATETIME format is YYYY-MM-DD HH:MM:SS, which occupies a fixed 8 bytes.

As of MySQL 5.6, the DATETIME type supports fractional seconds; the N in DATETIME(N) is the number of fractional-second digits (up to 6).

CREATE TABLE User (
    id BIGINT NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    sex CHAR(1) NOT NULL,
    password VARCHAR(1024) NOT NULL,
    money INT NOT NULL DEFAULT 0,
    register_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    last_modify_date DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
    CHECK (sex = 'M' OR sex = 'F'),
    PRIMARY KEY(id)
);

TIMESTAMP

What is actually stored is the number of seconds since '1970-01-01 00:00:00' UTC, in 4 bytes, so the upper limit is 2038-01-19 03:14:07.
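That 2038 upper bound follows directly from the 4-byte signed counter; a quick check (Python's datetime standing in for the server's conversion):

```python
# The TIMESTAMP upper bound is the largest signed 32-bit number of
# seconds after the 1970-01-01 epoch.
from datetime import datetime, timezone

max_seconds = 2**31 - 1  # 2147483647
limit = datetime.fromtimestamp(max_seconds, tz=timezone.utc)
print(limit)  # 2038-01-19 03:14:07+00:00
```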

As of MySQL 5.6, the TIMESTAMP type also supports fractional seconds. The difference is that with fractional seconds TIMESTAMP occupies up to 7 bytes, while DATETIME occupies 8 bytes whether or not fractional seconds are stored.

TIMESTAMP is time-zone-aware. The time_zone parameter specifies the current time zone; its default is SYSTEM, meaning the operating system time zone is used.

SET time_zone = '-08:00';
SET time_zone = 'America/Los_Angeles';
SET time_zone = 'Asia/Shanghai';

China is '+08:00' and US Pacific is '-08:00'; you can also set a named time zone directly.

Business Design practices

Choosing between DATETIME and TIMESTAMP

The TIMESTAMP upper limit is approaching, and there are potential performance problems. Converting TIMESTAMP from seconds takes few instructions and is not a direct performance cost, but with the default SYSTEM time zone, every conversion goes through the operating system time zone: an underlying OS time zone conversion function must be called, with an extra lock to guarantee the OS time zone is not changed mid-call. Performance is therefore worse than DATETIME, and jitter can appear under heavy concurrency. The optimization is to use an explicit time zone set in the configuration file rather than the system time zone:

[mysqld]
time_zone = "+08:00"

You can test the performance with mysqlslap:

mysqlslap --create-schema=test -uroot --number-of-queries=1000000 --concurrency=100 --query='SELECT NOW()'

Unstructured storage

The JSON type has been supported since MySQL 5.7, and 8.0 addresses the performance bottleneck of logging JSON updates. MySQL 8.0 is highly recommended if you want to use the JSON data type in production.

The JSON type is a good fit for rarely modified, relatively static data.

JSON values are divided into JSON objects and JSON arrays.

Business Design practices

The user login

CREATE TABLE UserLogin (
    userId BIGINT NOT NULL,
    loginInfo JSON,
    PRIMARY KEY(userId)
);

SET @a = '{"cellphone" : "13918888888", "wxchat" : "cellphone", "QQ" : "82946772"}';

INSERT INTO UserLogin VALUES (1, @a);

Two common functions: JSON_EXTRACT extracts the desired field from JSON data, and JSON_UNQUOTE removes the quotation marks from the extracted JSON string value, converting it to a plain string.

SELECT
    userId,
    JSON_UNQUOTE(JSON_EXTRACT(loginInfo,"$.cellphone")) cellphone,
    JSON_UNQUOTE(JSON_EXTRACT(loginInfo,"$.wxchat")) wxchat
FROM UserLogin;

Is equivalent to

SELECT
    userId,
    loginInfo->>"$.cellphone" cellphone,
    loginInfo->>"$.wxchat" wxchat
FROM UserLogin;
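What JSON_EXTRACT plus JSON_UNQUOTE compute can be mimicked outside the database (a sketch with Python's json module, using the same loginInfo document as above):

```python
# "$.cellphone" with JSON_UNQUOTE: parse the document, then read the
# key; the result is the unquoted string value.
import json

login_info = '{"cellphone": "13918888888", "wxchat": "cellphone", "QQ": "82946772"}'
doc = json.loads(login_info)

print(doc["cellphone"])  # 13918888888
print(doc["wxchat"])     # cellphone
```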

To index a field inside a JSON document, create a generated column from it and index that column:

ALTER TABLE UserLogin ADD COLUMN cellphone VARCHAR(255) AS (loginInfo->>"$.cellphone");
ALTER TABLE UserLogin ADD UNIQUE INDEX idx_cellphone(cellphone);

Is equivalent to

CREATE TABLE UserLogin (
    userId BIGINT,
    loginInfo JSON,
    cellphone VARCHAR(255) AS (loginInfo->>"$.cellphone"),
    PRIMARY KEY(userId),
    UNIQUE KEY uk_idx_cellphone(cellphone)
);

User portrait

CREATE TABLE UserTag (
    userId bigint NOT NULL,
    userTags JSON,
    PRIMARY KEY (userId)
);

INSERT INTO UserTag VALUES (1,'[2,6,8,10]');
INSERT INTO UserTag VALUES (2,'[3,10,12]');

MySQL 8.0.17 supports multi-valued indexes, which can be created on JSON arrays; data can then be retrieved quickly via MEMBER OF, JSON_CONTAINS, and JSON_OVERLAPS:

ALTER TABLE UserTag ADD INDEX idx_user_tags ((cast((userTags->"$") as unsigned array)));

EXPLAIN SELECT * FROM UserTag
WHERE 10 MEMBER OF(userTags->"$")\G

EXPLAIN SELECT * FROM UserTag
WHERE JSON_CONTAINS(userTags->"$", '[2, 10]')\G

EXPLAIN SELECT * FROM UserTag
WHERE JSON_OVERLAPS(userTags->"$", '[2, 3, 10]')\G
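The semantics of the three predicates are easy to confuse; a sketch with Python sets, using the first UserTag row inserted above:

```python
# MEMBER OF / JSON_CONTAINS / JSON_OVERLAPS semantics, mimicked on
# the userTags array for userId 1.
import json

user_tags = json.loads('[2, 6, 8, 10]')

# 10 MEMBER OF (userTags->"$"): is a single value in the array?
print(10 in user_tags)                          # True

# JSON_CONTAINS(..., '[2, 10]'): is EVERY listed value present?
print(set([2, 10]) <= set(user_tags))           # True

# JSON_OVERLAPS(..., '[2, 3, 10]'): is ANY listed value present?
print(bool(set([2, 3, 10]) & set(user_tags)))   # True
```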

Don't be bound by normal-form rules

The three normal forms

  • First Normal Form (1NF): every field is atomic; values cannot be subdivided
  • Second Normal Form (2NF): every row is uniquely identified and partial dependencies are eliminated; non-key fields depend on the whole primary key
  • Third Normal Form (3NF): transitive dependencies are eliminated; non-key fields depend only on the primary key, not on other non-key fields

Business Design practices

Autoincrement primary key design

Using an auto-increment BIGINT as the primary key is suitable only for non-core business tables, such as alarm tables and log tables. Core business tables should not use an auto-increment primary key.

  • Auto-increment values have the backtracking problem (pre-8.0)

  • Auto-increment values are generated on the server, creating concurrency performance problems

    Auto-increment happens on the server side and is protected by the AUTO-INC lock, so a flood of insert requests can hit a performance bottleneck. innodb_autoinc_lock_mode controls how long the auto-increment lock is held. Consider a multi-row insert:

    INSERT INTO ... VALUES (NULL, ...), (NULL, ...), (NULL, ...);

    | innodb_autoinc_lock_mode | Lock hold time | Acquire/release count for the insert above |
    |---|---|---|
    | 0 | Table-level lock held for the whole statement (pre-5.1 behavior, not recommended) | 1 |
    | 1 | Released at the end of each SQL statement | 1 |
    | 2 | Released after each value is allocated | 3 |
  • The auto-increment primary key is guaranteed to be unique only in the current instance, not globally

  • Exposing data values may cause security problems

  • MGR(MySQL Group Replication) may cause performance problems

  • Distributed architecture design issues

Table structure design should follow this specification

  • Each table must have a primary key;
  • Auto-increment primary keys are recommended only for non-core business tables and should even be avoided;
  • Core business tables should use a UUID or a custom-designed primary key;
  • Keep only one copy of each piece of data where possible and join on primary keys when querying, avoiding redundancy;
  • In some scenarios, anti-paradigm design can be implemented using JSON data types to improve storage efficiency.
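A minimal sketch of client-side key generation for a core table, matching the BINARY(16) column used in the examples below. uuid4 is one option; ordered schemes are often preferred for InnoDB insert locality, and this sketch makes no claim about which is right for a given workload:

```python
# Generate a globally unique primary key on the client instead of
# relying on server-side auto-increment. The 16-byte value fits a
# BINARY(16) column.
import uuid

def new_primary_key() -> bytes:
    return uuid.uuid4().bytes  # 16 random bytes, unique across instances

key = new_primary_key()
print(len(key))  # 16
```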

Table compression

The more records a page holds, the better the performance of the database. After compression is enabled, the storage space occupied is smaller, and the same unit of storage can store more data.

Compression can be applied per record, per page, or per tablespace, but in practical engineering page compression is the most common:

  • Per-record compression: every read and write must compress or decompress, which leans on CPU capacity and degrades performance noticeably. In addition, since a single record is usually small (often under 1K), the compression ratio is poor.
  • Tablespace compression: very efficient, but it requires the tablespace file not to grow, which is difficult for a disk-based relational database.

Compression does not necessarily mean a drop in performance; it may even improve it, because in most database workloads CPU capacity is in surplus while I/O load is the main bottleneck.

With page compression, MySQL can compress a 16K page to 8K or even 4K, halving or further shrinking the I/O size when pages are written to or read from disk, which improves overall database performance.
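The saving is easy to see with zlib, the algorithm InnoDB's COMPRESSED row format uses (the fake page content below is invented for illustration; real row data is less regular and compresses less well):

```python
# Compress a fake 16K "page" of repetitive row data with zlib and
# compare sizes; regular data lands far below the 8K target.
import zlib

page = (b"userId=42;name=alice;sex=F;" * 700)[:16384]  # fake 16K page
compressed = zlib.compress(page)
print(len(page), "->", len(compressed))
```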

Of course, compression is a balancing act and does not always improve database performance: the "balance" is a trade-off between the I/O saved and the CPU spent compressing and decompressing. Regardless, compression shrinks the stored data, and the benefit for storage space is huge.

COMPRESS page compression

COMPRESS is the page compression feature MySQL provided before version 5.7. Specify ROW_FORMAT=COMPRESSED when creating the table and set the compression target with the KEY_BLOCK_SIZE option. Although it is enabled through the ROW_FORMAT option, this is not record-level compression; it compresses at the granularity of the page.

CREATE TABLE Log (
  logId BINARY(16) PRIMARY KEY,
  ......
)
ROW_FORMAT=COMPRESSED
KEY_BLOCK_SIZE=8

COMPRESS page compression suits tables that are not performance-sensitive, such as log, monitoring, and alarm tables; the compression ratio is about 50%. Although it effectively reduces storage space, its implementation carries a large performance overhead and causes a significant drop in performance. The main reason: for each compressed page, the buffer pool holds two copies, one compressed and one uncompressed.

For example, two 8K compressed pages also exist as 16K decompressed pages in memory. This design keeps the 8K copy for subsequent updates and the 16K copy for reads, so reads do not have to decompress every time. But it increases the memory overhead, so the buffer pool caches less useful data, and MySQL performance naturally drops significantly.

In order to solve the problem of decreased compression performance, TPC compression was introduced from MySQL 5.7.

TPC compression

Transparent Page Compression (TPC) is a new page compression feature available in 5.7. It takes advantage of the file system's punch-hole feature:

CREATE TABLE Transaction (
  transactionId BINARY(16) PRIMARY KEY,
  .....
)
COMPRESSION=ZLIB | LZ4 | NONE;

Since punch holes are a file-system feature, pages can only be compressed down to the file system's smallest unit, 4K, and the compressed page is 4K-aligned. Its intrusion on database performance is small (less than 20%) and it may even improve performance. Unlike COMPRESS page compression, TPC keeps only the 16K decompressed page in memory, with no extra storage overhead in the buffer pool.
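The 4K alignment means the on-disk footprint shrinks only in whole file-system blocks; a sketch of the arithmetic (assuming a 4K block, as above):

```python
# With punch-hole (TPC) compression, the on-disk size is the
# compressed length rounded up to the file-system block.
BLOCK = 4096

def on_disk_size(compressed_len: int) -> int:
    return -(-compressed_len // BLOCK) * BLOCK  # ceil to a 4K multiple

print(on_disk_size(3000))    # 4096: one block
print(on_disk_size(9000))    # 12288: three blocks
print(on_disk_size(16384))   # 16384: incompressible page, no saving
```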

All in-memory page reads and writes are the same as for uncompressed pages, with no overhead; compression is triggered only once, when the page is flushed to disk. And since a 16K page is compressed to 8K or 4K, write performance can actually improve.

The COMPRESS page compression feature works for performance-insensitive tables, such as log, monitoring, and alarm tables, whose main requirement is storage space.

For the more core transaction-flow tables, I prefer TPC compression. Flow records are core data storage and usually accompany the core business: in an e-commerce transaction, the user's money is deducted, the order is placed, and the flow is recorded, which is a miniature model of a core business.

Users therefore do have performance requirements for flow tables. In addition, flow volume is very large, so enabling compression stores the data more efficiently.

If you are concerned about performance jitter caused by compression, the advice is: since flow tables are usually stored by month or day, do not enable TPC on the current flow table, and enable TPC compression on the historical flow tables.

TPC page compression can be enabled with ALTER TABLE xxx COMPRESSION = ZLIB, but this compresses only newly written data, not the existing data. The ALTER TABLE operation itself only modifies metadata, so it completes instantly.

To compress the entire TABLE, execute the OPTIMIZE TABLE command:

ALTER TABLE Transaction202102 COMPRESSION=ZLIB;
OPTIMIZE TABLE Transaction202102;