I've been busy with work lately and feeling a little lazy about writing, so I've been slacking off for quite a few days.
Most game databases use MySQL: being open source and free is its trump card. It may not be as powerful as Oracle, but it is more than sufficient for everyday use, which is why most companies choose MySQL for their data layer. Since I have mainly worked with the InnoDB engine, today I will talk about the principles of database optimization based on InnoDB. I hope this helps you solve similar problems when you run into them. OK, let's get started.
1. The execution sequence of query statements
```sql
SELECT [DISTINCT]
FROM
JOIN (e.g., LEFT JOIN) ... ON
WHERE
GROUP BY
HAVING
UNION
ORDER BY
LIMIT
```
Order of execution: FROM (with JOIN/ON) → WHERE → GROUP BY → HAVING → SELECT (DISTINCT) → UNION → ORDER BY → LIMIT
1. First, FROM (together with JOIN ... ON) determines and assembles the source data
2. WHERE then filters the rows by its conditions
3. GROUP BY groups the remaining rows
4. HAVING then filters the groups
5. ORDER BY sorts the result
6. LIMIT finally caps the number of rows returned
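Mapping these steps onto a concrete query makes the order easier to remember (the `orders` table and its columns are made up for illustration):

```sql
-- Written order vs. logical execution order:
SELECT customer_id, SUM(amount) AS total   -- 5. project the columns
FROM orders                                -- 1. pick the data source
WHERE status = 'paid'                      -- 2. filter rows
GROUP BY customer_id                       -- 3. group the remaining rows
HAVING SUM(amount) > 100                   -- 4. filter the groups
ORDER BY total DESC                        -- 6. sort the result
LIMIT 10;                                  -- 7. cap the number of rows
```

Note that the aliases defined in SELECT (like `total`) are usable in ORDER BY but not in WHERE, precisely because WHERE runs before SELECT.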
Some people still cannot remember the execution order of an SQL query. My suggestion: condense it into a short mnemonic phrase and read it repeatedly until it sticks; the next time you hit a problem, recall the phrase and you can work it out. Rote memorization works here, so keep the phrase as short as possible.
2. Data type selection
• Smaller data types are usually better: they use less disk, memory, and CPU cache.
• Prefer simple data types: for example, integers are cheaper to compare than character strings.
• Try to avoid NULL: nullable columns use more storage space and require special handling inside MySQL, and NULL values complicate how indexes and comparisons behave.
TIMESTAMP uses much less space than DATETIME, and it is more convenient than storing a raw integer timestamp; an integer column brings no real benefit here.
IPv4 addresses are often stored in a VARCHAR(15), but an IPv4 address is really an unsigned 32-bit integer, not a string; the dotted notation splits it into four octets only to make it easier for people to read.
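MySQL ships the INET_ATON() and INET_NTOA() functions for exactly this conversion, so the address can live in an INT UNSIGNED column (the `access_log` table below is a made-up example):

```sql
-- Store IPv4 addresses as unsigned 32-bit integers, not VARCHAR(15).
CREATE TABLE access_log (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ip INT UNSIGNED NOT NULL              -- 4 bytes instead of up to 16
);

INSERT INTO access_log (ip) VALUES (INET_ATON('192.168.1.10'));

SELECT INET_NTOA(ip) FROM access_log;     -- back to dotted-quad for humans
```

The integer form is smaller, compares faster, and makes range queries over IP blocks trivial.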
Object-relational mapping (ORM) systems (and the “frameworks” that use them) are often inefficient, with some ORM systems storing arbitrary types of data into arbitrary types of back-end data stores. This design appeals to developers because it allows them to work in an object-oriented manner, regardless of how the data is stored.
Principle: as small as possible (less storage space), fixed-length where possible (predictable storage space), and an integer where possible.
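Putting these rules together in one hypothetical table definition (names and sizes are illustrative, not prescriptive):

```sql
CREATE TABLE player (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,     -- integer key, small
    level      TINYINT UNSIGNED NOT NULL DEFAULT 1,         -- 1 byte is enough for a level
    name       VARCHAR(32) NOT NULL,                        -- sized to the data, not VARCHAR(255)
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP -- smaller than DATETIME
);
```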
3. Index optimization
Primary key index: InnoDB clusters data by primary key, so rows are physically stored in primary-key order. Updating a primary key column is expensive, because InnoDB is forced to move each updated row to a new location.
A common mistake with multi-column indexes is to create a separate index on each column, often caused by the vague advice to "index every column in the WHERE clause".
In fact, separate single-column indexes do not improve MySQL query performance in most cases, and sometimes they make queries slower.
Indexes also take extra storage space; used improperly, MySQL must load both index and data at the same time, slowing the query down.
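A sketch of the difference, assuming a hypothetical `users` table that is always queried by last name and first name together:

```sql
-- Anti-pattern: one index per WHERE column.
CREATE INDEX idx_last  ON users (last_name);
CREATE INDEX idx_first ON users (first_name);
-- MySQL will typically use only one of these (or attempt an index merge).

-- Better: a single composite index covering the whole condition.
CREATE INDEX idx_name ON users (last_name, first_name);

SELECT * FROM users
WHERE last_name = 'Smith' AND first_name = 'Alice';
```

The composite index also serves queries on `last_name` alone (the leftmost prefix), so it usually replaces the single-column index on that column.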
Pay attention to the order of the index columns.
The correct order depends on the queries that use the index and on how best to satisfy their sorting and grouping needs.
A general rule of thumb:
1. Place the most selective column first
2. Do not perform function operations on index columns when querying
3. Use indexes whenever possible
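Rule 2 in particular trips people up: applying a function to an indexed column prevents MySQL from using the index. A hedged sketch, assuming an `orders` table with an index on `created_at`:

```sql
-- Bad: the function is applied to the indexed column, so the index is ignored
-- and MySQL falls back to a full scan.
SELECT * FROM orders WHERE DATE(created_at) = '2023-01-01';

-- Good: keep the column bare and move the computation to the constants,
-- turning the condition into an index-friendly range.
SELECT * FROM orders
WHERE created_at >= '2023-01-01 00:00:00'
  AND created_at <  '2023-01-02 00:00:00';
```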
4. Query performance optimization
1. Query lifecycle: client → server query cache → parser → preprocessor → query optimizer → query execution engine → storage engine → data. (Note that the query cache was removed in MySQL 8.0.)
Hit the cache as much as possible, read as little data as possible, and examine as few rows as possible.
2. Select only the columns you need instead of SELECT * (and use COUNT(*) when all you need is a row count).
When the table structure changes, code built on SELECT * may silently pick up the change and break, because the query fields in the code are forgotten; this is a latent risk.
3. Let the small table drive the large table.
Read as little associated data as possible; the less data, the faster.
Note: with a LEFT JOIN, rows that have no match in the secondary table come back with NULL values.
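"Small table drives large table" means the smaller result set should be the driving (outer) side of the join. One common illustration uses EXISTS versus IN (the `dept`/`employee` tables are hypothetical):

```sql
-- Outer set is small (dept, tens of rows), inner set is large (employee, millions):
-- EXISTS lets each small-table row probe the big table through its index.
SELECT d.name
FROM dept d
WHERE EXISTS (SELECT 1 FROM employee e WHERE e.dept_id = d.id);

-- Inner set is the small one: IN materializes the short list once,
-- then the big table is filtered against it.
SELECT e.name
FROM employee e
WHERE e.dept_id IN (SELECT id FROM dept WHERE region = 'EU');
```

In both shapes the goal is the same: the side with fewer rows does the driving, so the big table is only touched through an index.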
Principles for optimizing data access:
1. Check whether unnecessary data is being requested from the database, to reduce data loading and network transfer.
2. Check whether a large number of unnecessary rows are being scanned (compare rows returned against rows scanned), to reduce data read time.
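EXPLAIN is the quickest way to check point 2: its `rows` column shows how many rows MySQL estimates it will examine, which you can compare against how many your query actually returns. For example:

```sql
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- Things to look at in the output:
--   type : ref/range is good; ALL means a full table scan
--   key  : which index (if any) was chosen
--   rows : estimated rows examined; a huge value for a small
--          result set means you are scanning far more than you return
```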
Conclusion
• Use small, simple, appropriate data types, and avoid NULL where possible
• Avoid over-design, such as schema designs that force complex queries
• Mix normalized and denormalized design sensibly: trade space for time, with appropriate redundancy for speed
• Single-row access is slow; it is best to read blocks that contain as many of the rows you need as possible
• Sequential access over a range of data is fast
• As data volume grows, improper indexing causes dramatic performance degradation
To write good queries you must understand schema design, index design, and more; they are inextricably linked. Only by weighing the needs of all parties and trading off the advantages and disadvantages can you design a high-performance database system.
Rule of thumb: do as little work as possible, and ideally none at all.