One, overview

1. Why optimize

  • An application's throughput bottleneck is often the processing speed of the database
  • As an application is used, its data keeps growing, and the pressure on the database keeps increasing
  • Data in a relational database is stored on disk, which is slow to read and write (compared to data in memory)

2. How to optimize

  • The table and field design phase, where more efficient storage and computation are considered
  • Optimizations provided by the database itself, such as indexes
  • Horizontal scaling: master/slave replication, read/write separation, load balancing, and high availability
  • Optimization of typical SQL statements (usually a small effect)

Two, field design

1. Typical scheme

①. When precision is required

  • decimal
  • Or scale the decimal into an integer (for example, store an amount in cents)
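A minimal sketch of both options (table and column names are hypothetical):

  -- option 1: an exact decimal type
  create table orders (price decimal(10, 2));
  -- option 2: scale to an integer and store cents
  create table orders_cents (price_cents int unsigned);
  insert into orders_cents values (1999); -- 19.99 stored as 1999 cents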

②. Use integers to represent strings (for example, IP addresses)

  • inet_aton('ip')
  • inet_ntoa(num)
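A quick sketch of the round trip (the server table is hypothetical):

  create table server (ip int unsigned);
  insert into server values (inet_aton('192.168.1.10'));
  select inet_ntoa(ip) from server; -- returns '192.168.1.10'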

③. Use not null whenever possible

  • The logic for computing with null values is complicated

④. Choosing between fixed and variable length

  • Longer numeric data can use decimal
  • char is fixed length (content exceeding the length is truncated); varchar is variable length and stores the length of the data as extra overhead; text likewise stores the length of the content

⑤. The number of fields should not be too large; field comments are necessary; field names should be self-explanatory; fields can be reserved for future expansion

2. Normal forms

①. First normal form: field atomicity (relational databases are column-based, so this holds by default)

②. Second normal form: eliminate partial dependence on the primary key (this matters when the primary key is composite); use a business-independent field as the primary key

③. Third normal form: eliminate transitive dependence on the primary key; high cohesion, for example a commodity table can be split into a commodity summary table and a commodity details table
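A minimal sketch of that split (all names are hypothetical):

  -- summary table: the fields read on every listing page
  create table commodity_brief (
      id int unsigned primary key,
      name varchar(64),
      price decimal(10, 2)
  );
  -- details table: large, rarely-read fields, one-to-one with the summary
  create table commodity_detail (
      commodity_id int unsigned primary key, -- same id as commodity_brief
      description text
  );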

Three, storage engine selection (MyISAM vs InnoDB)

1. Functional differences

InnoDB supports transactions, row-level locking, and foreign keys

2. Storage differences

①. Storage layout: MyISAM stores data and indexes in separate files (.MYI, .MYD), while InnoDB stores them together (the .frm file holds the table structure)

②. Table mobility: a MyISAM table can be moved by copying its .MYI and .MYD files; InnoDB has additional associated files

③. Fragmented space: MyISAM leaves fragmented space behind when data is deleted (it still occupies table file space), so you need to run optimize table table_name periodically to reclaim it manually; InnoDB does not.

④. Ordered storage: InnoDB inserts rows in primary key order, so table data is ordered by primary key by default (writes take longer because the insertion point must be found in the B+ tree, but lookups are efficient).

3. Selection differences

① Mostly reads, few writes: use MyISAM

② Mixed reads and writes: use InnoDB

  • Transactions and foreign keys are supported, ensuring data integrity and consistency
  • Strong concurrency (row-level locking)

Four, indexes

1. What is an index

An identifying keyword extracted from the data and mapped to the corresponding rows

2. Types

Primary key index (primary key): the key must be unique and not null

Normal index (key): a composite index is ordered by its first field only

Unique index (unique key): the key must be unique

Full-text index (fulltext key): not supported for Chinese text

3. Index management syntax

①. View indexes

  • show create table student
  • desc student

②. Build an index

  • first_name varchar(16), last_name varchar(16), key name (first_name, last_name)
  • alter table student add key/unique key/primary key/fulltext key key_name (first_name, last_name)
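Putting the first bullet in context, a complete statement might look like this (sketch):

  create table student (
      id int unsigned auto_increment primary key,
      first_name varchar(16),
      last_name varchar(16),
      key name (first_name, last_name) -- composite index on both name fields
  );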

③ Drop index

  • alter table student drop key key_name
  • To drop a primary key index on an auto-increment column, first use alter table ... modify to remove the auto-increment, then drop the index

4. Execution plans: explain

Analyzes whether and which indexes are used when a SQL statement executes
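A typical usage sketch (assuming the student table above):

  explain select * from student where first_name = 'Zhang';
  -- the key column of the output shows which index, if any, was chosen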

5. Index usage scenarios

  • where: if the filter fields are indexed, the lookup goes through the index
  • order by: if the sort field is indexed, the index is already ordered, so the corresponding rows can be taken from the index directly, which is more efficient than fetching all the data and sorting it
  • join: if the field in the join ... on condition is indexed, the lookup becomes efficient
  • Index covering: the query is answered directly from the index without reading table data (see the sketch below)
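A covering-index sketch (again assuming key name (first_name, last_name) on student):

  explain select first_name, last_name from student where first_name = 'Zhang';
  -- Extra: Using index  => every requested field lives in the index, so no table read is needed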

6. Syntax details

Even when an index exists, some query forms will not use it

  • where id+1 = ? should be written as where id = ?-1, i.e. the indexed field must stand alone on one side of the expression (see the sketch after this list)
  • like does not use the index when the wildcard precedes the keyword: '%keyword' cannot use the index, whereas 'keyword%' can
  • With or, the index is used only when the condition fields on both sides are indexed; if either side is not indexed, a full table scan is performed
  • Status values: for fields like gender, where one key corresponds to many rows, using the index is considered less efficient than a full table scan
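A sketch of the first two points:

  -- bad: the expression hides the indexed field, forcing a scan
  select * from student where id + 1 = 100;
  -- good: the indexed field stands alone, so the index can be used
  select * from student where id = 100 - 1;
  -- bad: a leading wildcard cannot use the index
  select * from student where first_name like '%ang';
  -- good: a prefix match can use the index
  select * from student where first_name like 'Zh%';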

7. Storage structure of indexes

  • BTree: a balanced multi-way search tree; search cost is about log_nodeSize(N), where nodeSize is the number of keywords in a node (which depends on keyword length and page size)

  • B+Tree: an upgraded BTree; data and keywords are stored together, saving the time of mapping a keyword to the separate location of its data
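A rough worked example of the cost claim (numbers are illustrative): if each node holds nodeSize = 100 keywords, a tree of height h can index about 100^h rows, so N = 1,000,000 rows need a height of only log_100(10^6) = 3. Each level costs roughly one page read, which is why the wide fan-out matters.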

Five, query cache

1. Caches select query results; the key is the SQL text and the value is the query result

The key is matched on the exact SQL text: statements that are functionally identical but differ by extra spaces or other minor changes produce different keys and will not hit the same entry

2. Enabling

query_cache_type

  • 0 - disabled
  • 1 - enabled; every select is cached by default, but a specific statement can opt out with select sql_no_cache ...
  • 2 - enabled; nothing is cached by default, and a specific statement opts in with select sql_cache ...
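A sketch of the per-statement hints:

  select sql_no_cache * from student; -- skip the cache for this query
  select sql_cache * from student;    -- cache this query when query_cache_type = 2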

3. Set the cache size

query_cache_size

4. Reset the cache

reset query cache

5. Cache invalidation

Any change to a table invalidates all cached results based on that table (management is at table granularity)

Six, partitioning

1. By default a table corresponds to one set of storage files, but once the data volume is large (commonly around 10 million rows), the data should be split across multiple storage files to keep each individual file fast to process

2. Partitioning algorithms (partition by)

  • hash - the partition field is an integer
  • key - the partition field is a string
  • range - based on comparison; only supports values less than
  • list - based on a set of status values

3. Partition management

  • create table article (...) partition by key(title) partitions 10
  • alter table article add partition (...)
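A fuller sketch of range partitioning (the schema is hypothetical):

  create table article (
      id int unsigned,
      created_year int
  )
  partition by range (created_year) (
      partition p0 values less than (2010),
      partition p1 values less than (2020),
      partition p2 values less than maxvalue
  );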

4. Choose a field that queries commonly filter on as the partition field; otherwise partitioning is pointless

Seven, horizontal and vertical splitting

1. Horizontal

Multiple tables with the same structure store the same type of data

A separate, dedicated table can guarantee id uniqueness across the split tables
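One common sketch of such an id table (names are hypothetical):

  create table id_seq (id bigint unsigned auto_increment primary key);
  -- grab a new globally-unique id before inserting into any of the split tables
  insert into id_seq values (null);
  select last_insert_id();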

2. Vertical

Split the fields across multiple tables, where the records in those tables are in one-to-one correspondence

Eight, clusters

1. Master/slave replication

①. First synchronize the slave with the master manually

  • stop slave
  • Dump the data from the master and import it into the slave once
  • show master status (under a read lock); record File and Position
  • On the slave: change master to ...
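A sketch of the change master to step (host, credentials, and coordinates are placeholders):

  change master to
      master_host='192.168.0.1',
      master_user='repl',
      master_password='***',
      master_log_file='mysql-bin.000001', -- the File recorded above
      master_log_pos=154;                 -- the Position recorded above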

②. start slave, then check that Slave_IO_Running and Slave_SQL_Running are both YES

③. The master can both read and write, but the slave must only read; otherwise master/slave replication breaks and the manual synchronization must be redone

④. mysqlreplicate can quickly configure master/slave replication

2. Read/write separation (based on master/slave replication)

①. Use raw connections directly

WriteDatabase provides write connections

ReadDatabase provides read connections

②. Use Spring AOP and AspectJ to switch data sources dynamically

  • RoutingDataSourceImpl extends AbstractRoutingDataSource and overrides determineCurrentLookupKey; it is injected into the SqlSessionFactory, with defaultTargetDataSource and targetDataSources configured (the concrete data source is selected by matching the return value of determineCurrentLookupKey against the targetDataSources keys)

  • DataSourceAspect is the aspect component: a @Pointcut covers all methods of all DAO classes, and a @Before advice takes the JoinPoint, reads the method name via joinPoint.getSignature().getName(), matches its prefix against a map of method-name prefixes to read/write type, and binds read or write to the current thread (the thread is about to execute the DAO method, which the routing data source intercepts next)

  • DataSourceHandler uses a ThreadLocal: the before advice binds the data source key to the thread that will execute the method, and the routing data source later reads the key back from the current thread

3. Load balancing

Algorithms:

  • Round-robin
  • Weighted round-robin
  • By load

4. High availability

Provide a redundant standby machine for a single-point service

  • Heartbeat detection
  • Virtual IP
  • Master/slave replication

Nine, typical SQL

1. Online DDL

To avoid long table-level locks

  • Copy strategy: copy the table row by row into a new table, logging the SQL executed against the old table during the copy and replaying it afterwards
  • MySQL 5.6 online DDL greatly shortens the lock time

2. Import data in batches

①. Disable indexes and constraints first, and recreate them after the import

②. Avoid inserting row by row

To guarantee consistency, InnoDB wraps each SQL statement in its own transaction by default (which is also time-consuming). For a batch import, open a transaction manually before importing and commit manually afterwards.
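A sketch of both points (the table name is hypothetical):

  set unique_checks = 0;      -- relax checks during the import
  set foreign_key_checks = 0;
  start transaction;
  insert into article values (1, '...'), (2, '...'), (3, '...'); -- multi-row inserts
  commit;
  set foreign_key_checks = 1;
  set unique_checks = 1;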

3. limit offset,rows

Avoid large offsets (i.e. large page numbers)

offset works by scanning and skipping rows; it is better to filter them out with a condition than to read and discard them
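A sketch of the rewrite (assumes an indexed, continuous auto-increment id):

  -- slow: scans and throws away 100000 rows
  select * from article limit 100000, 10;
  -- fast: seeks directly via the primary key index
  select * from article where id > 100000 limit 10;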

4. select *

Query only the fields you need, to reduce network transfer delay (minor impact)

5. order by rand()

order by rand() generates a random number for every row and sorts by it; instead, generate random primary key values in the application
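A common SQL-side alternative sketch (assumes a dense auto-increment id):

  select * from article
  where id >= (select floor(rand() * (select max(id) from article)))
  limit 1;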

6. limit 1

If you know you only need one row, it is recommended to add limit 1 to the query

Ten, slow query log

1. Locates SQL statements that execute inefficiently so they can be optimized

2. Configuration items

  • Enable it with slow_query_log
  • Critical time: long_query_time

3. The slow query log records SQL that exceeds the critical time; it is saved as xxx-slow.log under datadir
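A runtime configuration sketch:

  set global slow_query_log = 1;  -- enable the log
  set global long_query_time = 1; -- log anything slower than 1 second
  show variables like 'slow_query%';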

Eleven, profile

1. Automatically records the execution time of each SQL statement and the time spent in each step of a specific SQL statement

2. Configuration item: profiling

Enable it with set profiling = 1

3. View the recorded statements: show profiles

4. View the time taken by each step of a specific SQL statement

show profile for query Query_ID
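A full session sketch:

  set profiling = 1;
  select count(*) from article;
  show profiles;            -- lists Query_ID, Duration, and the statement
  show profile for query 1; -- per-step timing for that statement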

Twelve, typical server configuration

1. max_connections: the maximum number of client connections

2. table_open_cache: the number of cached table file handles, which speeds up reading and writing table files

3. key_buffer_size: the size of the (MyISAM) index cache

4. innodb_buffer_pool_size: the size of the InnoDB buffer pool, the prerequisite for the various InnoDB features to perform well

5. innodb_file_per_table: one .ibd file per table; otherwise InnoDB uses a shared tablespace
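A sketch for inspecting the current values:

  show variables like 'max_connections';
  show variables like 'innodb_buffer_pool_size';
  show variables like 'innodb_file_per_table';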

Thirteen, mysqlslap stress testing

1. Automatically generates SQL and executes it to test performance

mysqlslap --auto-generate-sql -uroot -proot

2. Concurrent testing

mysqlslap --auto-generate-sql --concurrency=100 -uroot -proot

3. Multiple rounds of testing, reflecting the average situation

mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 -uroot -proot (runs three rounds)

4. Test the storage engine

  • --engine=innodb: mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 --engine=innodb -uroot -proot measures InnoDB performance over 3 rounds

  • --engine=myisam: mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 --engine=myisam -uroot -proot measures MyISAM performance over 3 rounds