Specification for table structure design
- Instead of performing operations in the database, complex operations are moved to the program side to save resources such as CPU of database services. Meanwhile, the index fields that have undergone common or functional operations cannot be used as indexes, but the fields that have undergone functional operations can be used as functional indexes.
- Control single table data volume, pure
INT
Not more than 1000W lines, includingCHAR
No more than 500W lines - Reasonable breakdown by user USERID, date, region, etc
- It is recommended that a single database contains no more than 300-400 tables
- The number of sub-entries in a single table must be limited to 1024
- The number of table fields is small and precise, which is conducive to EFFICIENT IO, full table traversal, fast table repair, improved concurrency, and fast ALTER table
- The maximum number of fields in a single table is 20 to 50
- Single table does not exceed 50 pure
INT
A maximum of 20 fieldsCHAR(10)
field - A single line cannot exceed 200 bytes
- Sacrifice some paradigms and add some redundancy
Field design specification
- Choose the appropriate numeric field type (depending on the business
tinyint/int/bigint
, take 1/4/8 bytes respectively.) - Fields expressing the concept of yes or no, used
is_xxx
The data type isUNSIGNED TINYINT
(1 indicates yes, 0 indicates no) (Note: this rule is not valid for the time being. TINYINT is changed to Y and N, and an error is reported during data synchronization.) - The decimal type is
DECIMAL
, prohibited useFloat and double
, there is precision defect - Time type usage
BIGINT
Bigint = bigint; bigint = bigint; bigint = bigint - Suitable for service scenarios where the field length is fixed or similar
CHAR
, can reduce debris, high query performance - Suitable for business scenarios where the field length varies greatly or the number of updates is small
VARCHAR
, can reduce space - Convert strings to numbers because numeric indexes are more efficient, faster to query, and take up less space than string indexes
- Avoid the use of
ENUM
Because adding a new type requires modifying the table structure - Avoid the use of
NULL
Field, because it is difficult to query optimization, containingNULL
Invalid compound index,NULL
Column index requires extra space,NULL
Can only useIS NULL
orIS NOT NULL
, and in= /! =/in/not in
Have a big pit - Use less and split
TEXT/BLOB
, the processing performance is much lower thanVARCHAR
If necessary, split the table into separate tables (it is recommended to split large fields and fields with low access frequency into separate tables for storage and separate hot and cold data) - Instead of storing images in the database, use the file system to store images, and only store the image storage address in the database (similar to files)
- use
INT UNSIGNED
Store IPv4, do not use itCHAR(15)
- use
VARCHAR(20)
Store mobile phone numbers. Do not use integers (+/-/() may be involved in country codes, such as +86;VARCHAR
Fuzzy queries can be made, for exampleLike '138%'
) - The number of letters around the underscore must be greater than or equal to 2; otherwise, Java does not comply with the Java specification when generating get and set methods
- All date field names end in ‘_date’. Several important date fields, announcement date pub_date, change date chan_date, trade date trade_date, listing date list_date, delist_date
- Non-date fields cannot end with ‘_date’
- All unique encoded field names end in ‘_key’. For example, security unique code sec_key, institution unique code org_key, date unique code date_key,
- Non-coded fields cannot end with ‘_key’
The index specification
- Autoincrement column or global ID as InnoDB primary key, recommended to use business independent
AUTO_INCREMENT
Column or global ID generator as proxy primary key - Non-unique index name The default value is
Idx_ field name
Named after the - Unique index usage
Uk_ field name
Named after the - The primary key index
pk_
Named after the - You are advised to limit the number of indexes in a single table to five
- Adding indexes can improve queries, but slow down updates. More indexes are not always better. The number of index fields should not exceed 20% of the table fields
- Character fields must be indexed with a prefix,
like '%name%'
Don’t use indexes - Do not perform operations on the index column; otherwise, the index cannot be used, resulting in a full table scan
- Try not to use foreign keys, because high concurrency is prone to deadlocks (foreign keys coupling between tables, affecting SQL performance such as UPDATE/DELETE, may cause deadlocks, easy to become a database bottleneck in high concurrency)
- Don’t do it unless you have to
JOIN
Query if to be performedJOIN
The query,JOIN
The fields must be of the same type and indexed - (a), (a,b), (a,b), (a,b, C)
- Prevent index invalidation due to implicit conversion caused by different field types
The SQL standard
- SQL statements as simple as possible, an SQL can only be in a CPU operation, 5000+QPS high concurrency, a large SQL may block the entire database
- Disassemble into multiple simple SQL, cache hit ratio is higher, use more CPU; Reduce table lock time, especially MyISAM
- Transaction/connection usage principle: use immediately, use immediately close
- Transaction-independent operations are taken out of the transaction to reduce resource usage
- Replace long transactions with multiple short transactions without breaking consistency
- Avoid the use of stored procedures, triggers, events (debugging, troubleshooting, migration are difficult, poor scalability) as far as possible, reduce the use of MySQL functions to process the results, the program end is responsible for
- As far as possible need not
SELECT *
Obtain only required data columns; otherwise, more CPU, memory, I/O, and network bandwidth resources are consumed - The same field will
OR
toIN
Pay attention to controlIN
N is recommended to be less than 200
Select id from t where phone= ‘159’ or phone= ‘136’;
= >
Select id from t where phone in (‘ 159 ‘, ‘136’);
- Different fields will
OR
Instead ofUNION
Select id from t where phone = ‘159’ or name = ‘John’;
= >
Select id from t where phone= ‘159
union
Select id from t where name= ‘jonh’
Avoid negative query and % prefix fuzzy query, cannot use the index, resulting in full table scan
Use COUNT(*) as little or as little as possible
SELECT id FROM user LIMIT 10000, 10 SELECT id FROM user LIMIT 10000, 10 SELECT id FROM user WHERE ID >= 10000 LIMIT 10
If the result does not need to be deduplicated, use UNION ALL instead of UNION, which has deduplicated overhead
High concurrent DB disallows more than two table joins
Why don’t Internet companies recommend join?
1. It is not conducive to write operations. When a read operation is performed, the read data is locked, blocking other services from updating this data. If multiple aggregation functions are involved, multiple tables are locked at the same time. Therefore, read and write operations cannot be performed, affecting the overall system performance.
2. It is not conducive to maintenance. When services change, for example, a table in a JOIN is modified, existing SQL may become unavailable.
3. The horizontal scaling of database computing resources relative to the Service layer is much more difficult.
Compare column values with data types (to prevent index invalidation due to implicit conversions caused by different field types)
Avoid large SQL, large transactions, large quantities of time to occupy system resources and block the system, a SQL can only be in one CPU operation
Try not to INSERT... SELECT
Load Data Imports data in batches, avoiding peak hours
Look at the execution plan using EXPLANIN
Observation Slow query logs
Show processList Displays the process status
MySQL subquery optimization is poor IN most cases, especially IN WHERE subquery IN ID, generally can be overwritten by JOIN
General agreement
Never show lock on the program side, external lock on the database is not controllable, high concurrency is a disaster
Table storage engine must use InnoDB
Utf8mb4 (utF8MB4 is a superset of UTF8, used when storing 4 bytes such as emoticons)
MySQL database tables are case-sensitive and field names are case-insensitive (Linux VS Win)
Table names do not use plural nouns
Disable keywords
The three required field ids are create_AT and update_AT
The best way to name a table is to add "business name _ purpose of the table".
The library name should be the same as the service name
MySQL table building specification
-
The database name, table name, and field name must be separated by underscores (_) and lowercase letters.
-
The database name, table name, and field name contain a maximum of 12 characters.
-
Use nouns instead of verbs for database, table, and field names.
-
InnoDB storage engine is recommended
-
Stores exact floating-point numbers using DECIMAL instead of FLOAT and DOUBLE.
-
It is recommended to use UNSIGNED to store non-negative values.
-
No length is added to the integer definition
-
Using short data types, such as TINYINT UNSIGNED if the value ranges from 0 to 80.
-
Do not use TEXT, BLOB types.
-
VARCHAR(N) indicates the number of characters rather than bytes. For example, VARCHAR(255) can store a maximum of 255 Chinese characters. You need to select N based on the actual width. N Is as small as possible. The maximum length of all VARCHAR fields in a MySQL table is 65535 bytes. Memory operations such as sorting and creating temporary tables use the length of N to allocate memory.
-
Select UTF8 for the table character set.
-
Prohibit using VARBINARY, BLOB to store images, files, etc in the database.
-
Storage time — doubtful
mysql Data type and number of bytes occupied
TINYINT 1 byte
SMALLINT Specifies 2 bytes
MEDIUMINT The value is 3 bytes
INT 4 bytes
INTEGER Indicates 4 bytes
BIGINT Specifies 8 bytes
FLOAT(X) 4 if X < = 24 or 8 if 25 < = X < = 53
FLOAT 4 bytes
DOUBLE 8 bytes
DOUBLE PRECISION 8 bytes
REAL is 8 bytes
DECIMAL(M,D) M bytes (D+2 if M < D)
NUMERIC(M,D) M byte (D+2, if M < D)
Date and time types
The amount of storage required by the column type
DATE 3 bytes
DATETIME 8 bytes
TIMESTAMP 4 bytes
TIME 3 bytes
YEAR 1 byte
String type
The amount of storage required by the column type
CHAR(M) M bytes, 1 <= M <= 255
VARCHAR(M) L+1 byte, where L <= M and 1 <= M <= 255
TINYBLOB, TINYTEXT L+1 byte, where L< 2 ^ 8
BLOB, TEXT L+2 bytes, where L< 2 ^ 16
MEDIUMBLOB, MEDIUMTEXT L+3 bytes, where L< 2 ^ 24
LONGBLOB, LONGTEXT L+4 bytes, where L< 2 ^ 32
ENUM(‘value1′,’value2’,…) 1 or 2 bytes, depending on the number of enumerated values (Max. 65535)
SET(‘value1′,’value2’,…) 1,2,3,4 or 8 bytes, depending on the number of collection members (up to 64 members)
The machine executes SQL statements
From a_table on a_table.id = b_table.a_id left join b_table where a_table.id = 1 group by a_table.age select a_table.id order by a_table.age desc limit 100
What is the index
An Index is a data structure that helps MySQL retrieve data efficiently
The index type
The primary key index
Single-valued index: An index contains only a single column
Unique index: The value of the indexed column must be unique, allowing empty values
Compound index: That is, an index contains multiple columns
The basic grammar
Create index ();
Mysql >ALTER TABLE table_name ADD PRIMARY KEY (column) 2.UNIQUE mysql>ALTER TABLE table_name Mysql >ALTER TABLE table_name ADD INDEX index_name (column) 4.FULLTEXT(FULLTEXT) mysql>ALTER TABLE table_name ADD FULLTEXT ( column ) 5. Mysql >ALTER TABLE table_name ADD INDEX index_name (column1, column2, column3)
DROP INDEX [indexName] ON mytable;
Select table_name FROM table_name
Index structure
BTree index, Hash index, full-text index, r-tree index
Focus on BTree index
The case where an index needs to be created
1. Unique indexes are automatically created for primary keys
2. Create indexes for fields that are frequently used as query conditions
3. Query the fields associated with other tables and create indexes based on the foreign key relationship
4. Frequently updated fields are not suitable for creating indexes
Select * from ‘Where’ Where (select * from ‘Where’)
6. Tendency to create composite indexes under high concurrency
7. Query sorted fields, if the sorted fields through the index to access will greatly improve the speed of sorting
8. Query statistics or group fields
When not to create an index
1. Too few table records
2. Frequently added, deleted and modified watches
3. Table fields with repetitive and evenly distributed data, so only frequently queried and frequently ordered data columns should be indexed.
Note that if a data column contains a lot of duplicate content, there is not much practical value in indexing it.
Explain
Use the EXPLAIN keyword to simulate the optimizer’s execution of SQL statements to see how MySQL handles your SQL statements. Analyze performance bottlenecks in your query or structure
Id: serial number of a select query, containing a set of numbers indicating the order in which the select clause or operation table is executed in the query
The ids are the same and the execution sequence is from top to bottom
The ID is different. For subqueries, the ID sequence increases. A larger ID has a higher priority and is executed earlier
The ids are the same and different and exist at the same time
2, select_type: query type, mainly used to distinguish common query, joint query, sub-query, and other complex query
1.SIMPLE: a SIMPLE select query that does not contain subqueries or unions
2.PRIMARY: If the query contains any complex subparts, the outermost query is marked as
3.SUBQUERY: Contains subqueries in a SELECT or WHERE list
4.DERIVED: Subqueries contained in the FROM list are labeled as DERIVED and MySQL recursively executes these subqueries, putting the results in temporary tables.
5.UNION: If the second SELECT appears after the UNION, it is marked as UNION; If UNION is included in the subquery of the FROM clause, the outer SELECT will be marked: DERIVED
6.UNION RESULT: SELECT the RESULT from UNION table
3, table: display the row data about which table
4, type: displays what type of query is used in order from best to worst: system>const>eq_ref>ref>range>index>ALL
1. System: the table has only one row (equal to the system table). This is a special case of const type
2. Const: used to compare primary key or unique index. Because it only matches one row, it’s fast. By placing the primary key in the WHERE list, MySQL can convert the query to a constant
Eq_ref: unique index. For each index key, only one record in the table matches it. This is common in primary key or unique index scans
4. Ref: non-unique index scan that returns all rows matching a single value. It is essentially an index access that returns all rows that match a single value. However, it may find multiple rows that match the criteria, so it should be a mixture of lookup and scan
5. Range: Retrieves only rows in a given range, using an index to select rows. A range scan is better than a full table scan because it only needs to start at one point of the index and end at another point without scanning all the indexes
6. Index: Full index Scan. The index type is different from ALL. This is usually faster than ALL because index files are usually smaller than data files. (all and index are both read from the full table, but index is read from the index, and all is read from the hard disk)
7. All: FullTable Scan, which traverses the entire table to find matching rows
At least range, preferably ref
Possible_keys: Possible_keys: Displays one or more possible indexes in the table. If an index exists on a field involved in a query, the index is listed but not necessarily used by the query
6. Key: the actual index
Key_len: indicates the number of bytes used in the index. This column can be used to calculate the length of the index used in the query. With no loss of accuracy, the shorter the length, the better. The value key_len displays is the maximum possible length of the index, not the actual length, that is, key_len is calculated from the table definition, not retrieved from the table
Ref: indicates that the index column is used and, if possible, is a constant. Those columns or constants are used to find values on index columns
9. Rows: Approximate the number of rows to be read to find the desired record based on table statistics and index selection
10. Extra: Additional information
- 1.Using filesort: mysql uses an external index sort for data, instead of reading data in the order of the indexes in the table. MySQL can not use index to complete a sort operation called “file sort”
MySQL uses temporary tables to sort query results. Common in sort order by and group by queries
- USING index: The Coveing index is used in the select operation, avoiding access to rows of the table. Efficient! Using WHERE indicates that the index is used to perform a key lookup. If a using WHERE is not present at the same time, the surface index is used to read data rather than perform lookup actions.
4.Using WHERE: The surface uses where filtering
5. Using join buffer: The join buffer is used
6. Impossible WHERE: The value of the WHERE clause is always false and cannot be used to retrieve any tuples
7. Select Tables Optimized Away: Optimize MIN/MAX operation based on index without GROUPBY clause or
MyISAM storage engine optimizes the COUNT(*) operation without waiting until the execution stage to evaluate,
The optimization is completed in the generation phase of the query execution plan.
8. Distinct: Optimize DISTINCT to stop looking for the same value once the first matching tuple is found
- Using index condition: a new feature in mysql5.6 that uses indexes to query data
Avoid index invalidation
1. If you create a composite index, follow the index order and do not skip the index
2, do not perform any operation on the index column such as: calculation, function, (automatic/manual) type conversion, will cause index failure and full table scan
If the index is placed after a range, it will be invalid, such as: > < etc
4, try to use overwrite index (index column and query column), reduce select*
5, 6. Mysql does not equal (! = or <>) will cause a full table scan
6, 7. Is not null
Mysql > select * from ‘aa%’ where ‘aa%’ = ‘aa%’
8. String indexing without single quotation marks is invalid
9. Use or spargently, as it will cause index failure when connecting
Order by index; otherwise, “in-file sort” will be generated
11, Group by creates Using temporary; Using Filesort, you can use Java functional programming to achieve grouping
Slow Query logs
View the SQL in the slow query logs and optimize them
Show profiles
Mysql provides resources that can be used to analyze the resource consumption of statement execution in the current session. Can be used for tuning measurements of SQL
Steps:
Execute the SQL statement to be optimized
Perform the show profiles;
show profile (type) for id;
type:
| ALL – show ALL cost information
| BLOCK IO – display BLOCK IO associated overhead
| the CONTEXT SWITCHES – CONTEXT switch overhead associated
| CPU – related to CPU overhead information
| the IPC – is a sign that send and receive relevant cost information
| – display MEMORY overhead associated MEMORY
| PAGE FAULTS – display a PAGE fault related cost information
| SOURCE – display and Source_function Source_file, Source_line related cost information
| SWAPS, according to the exchange of frequency associated overhead information
The following conditions must be optimized:
3.1 Converting HEAP to MyISAM: The result of the query is too large for memory, moving to disk;
3.2 Creating TMP table: Create a temporary table, copy data to the temporary table, and then delete it.
Copying to TMP tables on disk: Danger!!
3.5 locked
3.6 If the number of queries is too large without paging, sending data will be time-consuming
The index specification
- Autoincrement column or global ID as InnoDB primary key, recommended to use business independent
AUTO_INCREMENT
Column or global ID generator as proxy primary key - Non-unique index name The default value is
Idx_ field name
Named after the - Unique index usage
Uk_ field name
Named after the - The primary key index
pk_
Named after the - You are advised to limit the number of indexes in a single table to five
- Adding indexes can improve queries, but slow down updates. More indexes are not always better. The number of index fields should not exceed 20% of the table fields
- Character fields must be indexed with a prefix,
like '%name%'
Don’t use indexes - Do not perform operations on the index column; otherwise, the index cannot be used, resulting in a full table scan
- Try not to use foreign keys, because high concurrency is prone to deadlocks (foreign keys coupling between tables, affecting SQL performance such as UPDATE/DELETE, may cause deadlocks, easy to become a database bottleneck in high concurrency)
- Don’t do it unless you have to
JOIN
Query if to be performedJOIN
The query,JOIN
The fields must be of the same type and indexed - (a), (a,b), (a,b), (a,b, C)
- Prevent index invalidation due to implicit conversion caused by different field types
- Avoid negative query and % prefix fuzzy query, cannot use the index, resulting in full table scan
Which fields are good for indexing?
- Index columns should be built on columns with high data distinction. The higher the distinction is, the higher the non-repetition rate of data is and the better the effect of new indexes is.
Why create composite indexes?
- When performing a query in MySQL, only one index can be used.
Last_name, first_name, age
MySQL selects the strictest index (with the least number of records in the result set). - (a), (a,b), (a,b), (a,b, C)
- If there is a composite index (
col_a,col_b,col_c
), the index is used in the following cases:
Col_a = “value”;
Col_a = value and col_B = value;
Col_a = “value” and col_b = “value” and col_C = “value”;
Col_b = “value” and col_A = “value” and col_C = “value”;
- For the last statement, MySQL will automatically optimize to the same as the third statement (
=
和in
It can be out of order, for examplea = 1 and b = 2 and c = 3
To establish (a,b,c
) the indexes can be in any order, and MySQL’s query optimizer will optimize them into a form that the indexes recognize. Indexes are not used in the following cases:
Col_b = “value”;
Col_b = “value” and col_c = “value”;
Understand single-column indexes, multi-column indexes, and left-most prefixes
Example: Now we want to find the user ID that meets the following criteria:
Mysql >SELECT ‘id’ FROM user WHERE last_name = ‘L’ AND ‘first_name’ = ‘ZW’ AND ‘age’ = 30
Single-column index:
ALTER TABLE user ADD INDEX last_name (last_name);
Select last_name=’L’, select last_name=’L’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’, select last_name=’ ZW’.
The index of the LAST_name column is much more efficient than performing a full scan of the table, but we still require far more records to be scanned than we actually need. Although we can drop the index on the last_name column and create an index on either first_name or AGE, the search efficiency is similar regardless of which column we create the index on.
Multi-column index:
ALTER TABLE user ADD INDEX lname_fname_age (last_ame,first_name,age);
To improve search efficiency, we need to consider using multi-column indexes. Since the index files are stored in b-tree format, we can get the final results without scanning any records. In fact, there is a field order problem in a multi-column index, generally put the higher degree of differentiation in the first, so that the joint index is more effective.
Leftmost prefix:
Select last_name (last_name) from lname_fname_age; select last_name (last_name) from lname_fname_age; (last_name,first_name) and (last_name,first_name,age) combined indexes.
Note: When creating a multi-column index, the most frequently used column in the WHERE clause is placed at the far left, depending on business requirements. MySQL will keep matching to the right until it encounters range queries (>, <, between, like) and then stops matching
When do you build an index?
In general, columns that appear in WHERE and JOIN need to be indexed.
The inadequacy of the index
- All this is about the benefits of using indexes, but using them too much can lead to abuse. So indexes also have their drawbacks:
While indexes greatly speed up queries, they slow down the speed of updating tables, such as INSERTS, UPDATES, and DELETES. When updating a table, MySQL not only saves the data, but also the index file.
Index files that take up disk space. In general, this is not a serious problem, but if you create multiple composite indexes on a large table, the index files can swell quickly. Indexes are only one factor in improving efficiency. If your MySQL database has large tables, you need to spend time on building the best indexes or optimizing queries.
Considerations for using indexes
- Indexes do not contain columns with NULL values
Any column that contains a NULL value will not be included in the index, and any column in the composite index that contains a NULL value is invalid for the composite index. So we don’t want the default value of the field to be NULL when we design the database.
- Using short indexes
Index the string, specifying a prefix length if possible. For example, if you have a CHAR(255) column, do not index the entire column if multiple values are unique within the first 10 or 20 characters. Short indexes not only improve query speed but also save disk space and I/O operations.
- Index column sort
MySQL queries use only one index, so if an index is already used in the WHERE clause, the column in order BY will not be indexed. Therefore, the database default sort can meet the requirements of the situation do not use the sort operation; Try not to include more than one column sort, and create composite indexes for those columns if necessary.
- Like statement operation
The use of the like operation is generally discouraged, and it is also a question of how to use it if you must. Like “%aaa%” does not use indexes whereas like “aaa%” can use indexes.
- Do not perform operations on columns
select id from user where YEAR(adddate)<2007;
The operation will be performed on each row, which will cause the index to fail and perform a full table scan, so we can instead
Select id from user where adddate< ‘2018-01-01’;
- Do NOT use the NOT IN, <>, and BETWEEN operations