Data types
Integer types
TINYINT    1 byte
SMALLINT   2 bytes
MEDIUMINT  3 bytes
INT        4 bytes
BIGINT     8 bytes
Each of the above integer types has an optional UNSIGNED attribute. For example, INT occupies 4 bytes: the signed range is -2^31 to 2^31-1, while an UNSIGNED INT also occupies 4 bytes but ranges from 0 to 2^32-1.
Real type
A real number is a number with a decimal part. MySQL supports both exact and approximate real types.
FLOAT    4 bytes, approximate computation
DOUBLE   8 bytes, approximate computation
DECIMAL  stores exact decimals and supports exact computation; suitable for financial data
DECIMAL is suitable for storing financial data and is more expensive to compute than FLOAT or DOUBLE, but more accurate. Is there a way to get the best of both? Instead of DECIMAL, you can store the amount in a BIGINT after multiplying it by the required power of ten. If the financial data must be accurate to one ten-thousandth, multiply every amount by ten thousand before storing it.
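A minimal sketch of that approach, with a hypothetical payment table whose amounts must be exact to one ten-thousandth (table and column names are made up):
create table payment (
    id bigint unsigned not null auto_increment primary key,
    amount_e4 bigint not null   -- the amount multiplied by 10000, e.g. 12.3456 is stored as 123456
);
insert into payment (amount_e4) values (123456);
select amount_e4 / 10000 as amount from payment;   -- convert back to a decimal only for display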
String type
VARCHAR  variable-length string, up to 65,535 bytes
CHAR     fixed-length string, up to 255 characters
varchar
- Use only as much space as is necessary; shorter strings use less space. But if MySQL’s ROW_FORMAT=FIXED, then each row uses fixed-length storage, which wastes space
- One or two extra bytes are required to record the length of the string: if the maximum length of the column <=255 bytes, only 1 byte is used, otherwise 2 bytes are used
- Because it is variable length, an update can make the row longer. If the row no longer fits in its page, InnoDB has to split the page so the row fits, which causes fragmentation
char
- Mysql allocates sufficient space based on the defined string length
- Because the length is fixed, in-place updates do not easily cause fragmentation
Applicable scenario
- VARCHAR is more appropriate when the maximum length of the column is much larger than the average length and updates are rare, so fragmentation is not a problem
- CHAR is more suitable when all values are close to the same length
- For very short columns, CHAR is also more storage efficient than VARCHAR. For example, char(1) requires only one byte, and varchar(1) requires two bytes, because there is also a record-length byte
Date and time types
DATETIME   8 bytes, time-zone independent, maximum year 9999
TIMESTAMP  4 bytes, time-zone dependent, maximum year 2038
DATE       3 bytes, year-month-day, format: 0000-00-00
TIME       3 bytes, hour:minute:second, format: 00:00:00
YEAR       1 byte, range 1901 to 2155, format: 0000
A DATETIME or TIMESTAMP column can be assigned a value as a string in either 'YYYY-MM-DD HH:MM:SS' or 'YYYYMMDDHHMMSS' format, and SELECT displays it as 'YYYY-MM-DD HH:MM:SS'. For example, inserting '2008-08-08 08:08:08' into a DATETIME column displays as 2008-08-08 08:08:08, and inserting '20080808080808' also displays as 2008-08-08 08:08:08.
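A small sketch of the two accepted input formats (hypothetical table t with a DATETIME column dt):
create table t (dt datetime);
insert into t (dt) values ('2008-08-08 08:08:08'), ('20080808080808');
select dt from t;   -- both rows are displayed as 2008-08-08 08:08:08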
Commonly used SQL
Access control
use mysql; select * from user;                          -- view existing accounts
create user username identified by 'password';          -- create an account
set password for username = password('new password');   -- change a password
drop user username;                                      -- delete an account
show grants for username;                                -- check access permissions
Character set
Databases are used to store and retrieve data. Different character sets determine how data is stored and retrieved. Different character sets have default collation rules
show character set;                -- show all character sets
show collation;                    -- show all collation rules
show variables like 'character%';  -- show the current character set settings
show variables like 'collation%';  -- show the current collation settings
-- character_set_server is the server-level character set, i.e. the default character set;
-- change it (and the matching collation) to modify the defaults
The storage engine
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                        | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables      | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                             | NO           | NO   | NO         |
| MRG_MYISAM         | YES     | Collection of identical MyISAM tables                          | NO           | NO   | NO         |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears) | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, and foreign keys     | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                             | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                         | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                          | NO           | NO   | NO         |
| FEDERATED          | NO      | Federated MySQL storage engine                                 | NULL         | NULL | NULL       |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
The database
mysql -h hostname -P port -u username -p   -- connect to the server (enter the password when prompted)
show databases;                            -- list databases
use dbname;                                -- select a database
show create database dbname;               -- show the database's CREATE statement
drop database dbname;                      -- drop a database
The data table
Basic operation
show tables;                          -- list tables
create table tbl_name (               -- create a table
    field_name type constraints,      -- multiple constraints are separated by spaces
    field_name type constraints,
    field_name type constraints
);
rename table old_name to new_name;    -- rename a table
show columns from tbl_name;           -- show the table's columns
describe tbl_name;                    -- shortcut
desc tbl_name;                        -- even shorter
drop table tbl_name;                  -- drop a table
Alter table structure
alter table tbl_name add new_field type [constraints];                   -- add a field (at the end by default)
alter table tbl_name add new_field type [constraints] first;             -- add a field as the first column
alter table tbl_name add new_field type [constraints] after field_name;  -- add a field after an existing one
alter table tbl_name modify field_name type [constraints];               -- change a field's type/constraints
alter table tbl_name change old_field new_field new_type [constraints];  -- rename a field and change its type
alter table tbl_name drop field_name;                                    -- drop a field
alter table tbl_name modify field_name type not null auto_increment;     -- make a field auto-increment
alter table tbl_name modify field_name type null;                        -- allow NULL again
alter table tbl_name add constraint constraint_name primary key (field_name);                               -- add a primary key
alter table tbl_name add constraint constraint_name foreign key (field_name) references other_tbl(field);   -- add a foreign key
alter table tbl_name drop primary key;                                   -- drop the primary key constraint
alter table tbl_name drop foreign key constraint_name;                   -- drop a foreign key constraint
The index operation
create [unique] index idx_name on table_name(column_list);  -- create an index
drop index idx_name on table_name;                          -- drop an index
show index from table_name;                                 -- show a table's indexes
Data manipulation
The query
-- written order
select distinct <select_list>
from <left_table> <join_type> join <right_table> on <join_condition>
where <where_condition>
group by <group_by_list>
having <having_condition>
order by <order_by_list>
limit <limit_condition>

-- logical execution order
from <left_table> <join_type> join <right_table> on <join_condition>
where <where_condition>
group by <group_by_list>
having <having_condition>
select distinct <select_list>
order by <order_by_list>
limit <limit_condition>
WHERE clause operators
=  !=  <  <=  >  >=
between value1 and value2         -- inclusive of value1 and value2
not between value1 and value2
is null / is not null
in (value1, value2, value3)
not in (value1, value2, value3)
like                              -- supports the wildcards % and _: % matches zero or more characters, _ matches exactly one
Group by having
The server first produces a result set from the SELECT statement, then merges rows that share the same value of the grouping field(s) into a single row per group. Columns that do not appear in the GROUP BY clause may have several different values within a group, yet only one row is output per group, so those multiple values must be reduced to a single value before they can be placed in the output row. That reduction is exactly what aggregate functions do, which is why they are called aggregate functions.
Multiple fields are separated by commas (,) and grouped from left to right by specified fields
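For example, a sketch with a hypothetical orders table: sum the amounts per customer and keep only the groups whose total exceeds 100.
select customer_id, count(*) as order_cnt, sum(amount) as total
from orders
group by customer_id
having sum(amount) > 100;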
order by <field_name> [desc|asc]
Multiple fields are separated by commas (,) and sorted from left to right by specified fields
limit
limit 5     -- return at most 5 rows, starting from offset 0
limit 2,5   -- return at most 5 rows, starting from offset 2
Combination query
select statement 1 union select statement 2        -- duplicate rows are removed
select statement 1 union all select statement 2    -- duplicate rows are kept
-- The two statements must select the same number of columns, with compatible types.
-- Useful when you query different tables but want to combine the returned results.
Commonly used functions
- Aggregation function
count(*)               -- counts all rows, including NULLs
count(field)           -- ignores rows where the field is NULL
max(field)             -- ignores NULL values
min(field)             -- ignores NULL values
sum(field)             -- ignores NULL values
avg(field)             -- ignores NULL values
count(distinct field)  -- ignores NULLs and counts only distinct values
min(distinct field)    -- ignores NULLs and considers only distinct values
sum(distinct field)    -- ignores NULLs and sums only distinct values
avg(distinct field)    -- ignores NULLs and averages only distinct values
- The text function
concat(params...)   -- concatenates its arguments (comma-separated)
trim(field)         -- trims whitespace on both sides
ltrim(field)        -- trims whitespace on the left
rtrim(field)        -- trims whitespace on the right
lower(field)        -- converts to lowercase
upper(field)        -- converts to uppercase
length(field)       -- returns the length
- Time function
year(field)     -- returns the year part of a date
month(field)    -- returns the month part of a date
day(field)      -- returns the day part of a date
time(field)     -- returns the time part of a datetime
hour(field)     -- returns the hour part of a time
minute(field)   -- returns the minute part of a time
date_format()   -- returns a formatted date or time
datediff()      -- computes the difference in days between two dates
now()           -- returns the current date and time
curdate()       -- returns the current date
curtime()       -- returns the current time
insert
-- Insert literal rows; the field list may be omitted, which means all columns of the table
insert into table_name (field1, field2, ..., fieldN)
values (value1, value2, ..., valueN),
       (value11, value22, ..., valueNN);

-- Insert the result of a SELECT; the number of selected columns must match the field list
insert into table_name (field1, field2, ..., fieldN)
select ...;
update
update table_name set field1 = value1 [, field2 = value2 ...] [where clause];
delete
delete from table_name [where clause];
Optimization
Logical architecture
MySQL is divided into a server layer and a storage engine layer.
Layer 1, the server layer: includes the connector, query cache, analyzer, optimizer, and executor. The connector manages connections and verifies permissions; the query cache returns results directly on a cache hit; the analyzer performs lexical and syntax analysis; the optimizer generates the execution plan and selects indexes; the executor drives the engine and returns data. Functionality common to all storage engines is implemented in this layer, including built-in functions, views, triggers, and stored procedures.
Layer 2, the storage engine layer: responsible for storing and retrieving data. Storage engines are pluggable; MySQL uses InnoDB by default.
How is an SQL query executed
select * from user where id=3;
1. Connector: the client runs mysql -h host -P port -u user -p and connects to MySQL through the connector
2. Query cache: after receiving a query, MySQL first looks it up in the query cache. On a miss it continues down the pipeline, then puts the result from the storage engine into the query cache and returns it to the client. The cache behaves like a key-value store: the query SQL is the key and the result set is the value
- It is not recommended to use the query cache. As soon as a table is updated, all the query caches on the table will be invalidated. The invalidation of the query cache is very frequent and does more harm than good. The query cache applies to static tables that are updated over a long period of time.
show variables like '%query_cache%';
3. Analyzer: the analyzer performs lexical analysis, for example recognizing the string user as the user table and the string id as the id column. It also performs syntax analysis to check that the SQL statement conforms to MySQL's grammar — i.e. what you want to do
4. Optimizer: after the analyzer, MySQL knows what the client wants to do; the optimizer decides how best to do it, choosing which index to use, generating the execution plan, and so on — i.e. how to do it
5. Actuator:
1) Call the storage engine interface to fetch the first row of the user table and check whether id equals 3; skip it if not, add it to the result set if so
2) Call the storage engine interface to fetch the next row and repeat the same check, until the last row of the table
3) Return the accumulated result set to the client
How is an SQL update statement executed
update user set address='hk' where id=3;
1. The update invalidates all query cache entries for the table
2. Execution flow: connector – analyzer – optimizer – executor – storage engine. In the accompanying flow chart of the executor and storage engine, dark boxes indicate storage engine operations and light boxes indicate executor operations.
InnoDB is a transactional storage engine, so the flow ends with a transaction commit. Writing the two logs uses a two-phase commit.
3. Two important logs are involved: the redo log and the binlog
Redo log
- This log is unique to the InnoDB storage engine
- Redo logs are fixed in size; for example, you can configure a group of 4 files of 1 GB each (see the sketch after this list for how to check the configuration). The checkpoint marks the position up to which entries can be erased, write pos marks the current write position, and the area between them is writable free space. When write pos catches up with the checkpoint, writing must stop until the checkpoint is pushed forward.
- The redo log records what change was made on which data page
- When a record needs to be updated, the InnoDB engine writes the record to the redo log and updates it to memory. The InnoDB engine updates the operation record to disk when appropriate.
- The redo log ensures that previously committed records are not lost after an abnormal database restart
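A quick way to see how the redo log group is configured on a given server (a sketch; these are the standard InnoDB system variables, superseded by innodb_redo_log_capacity in MySQL 8.0.30+):
show variables like 'innodb_log_files_in_group';  -- number of redo log files, e.g. 4
show variables like 'innodb_log_file_size';       -- size of each file, e.g. 1 GB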
Binlog (archive log)
- This log belongs to the MySQL server layer, so all engines can use it
- The file is append-only; old entries are never overwritten
- It records the logical content of the statement (see the sketch below for the related settings)
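The binlog configuration can be inspected in a similar way (a sketch using standard MySQL system variables):
show variables like 'log_bin';        -- whether the binlog is enabled
show variables like 'binlog_format';  -- STATEMENT, ROW, or MIXED
show variables like 'sync_binlog';    -- how often the binlog is flushed to disk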
Database transaction
Transactions have the ACID properties: atomicity, consistency, isolation, durability.
Problems that can arise when transactions run concurrently:
- Dirty read – one transaction reads data that another transaction has modified but not yet committed
1. Mary's original salary is 1000; the finance staff changes it to 8000 (without committing the transaction)
2. Mary reads her salary and sees that it has changed to 8000
3. Finance notices the mistake and rolls back the transaction, so Mary's salary is 1000 again
The 8000 that Mary read is dirty data
- Non-repeatable read – the same data is read more than once within one transaction, and between the reads another transaction modifies and commits that data, so the two reads return different values. Because the data read twice inside one transaction is not the same, this is called a non-repeatable read
1. In transaction 1, Mary reads her salary as 1000; the transaction is not yet finished
2. In transaction 2, the finance staff changes Mary's salary to 2000 and commits
3. In transaction 1, when Mary reads her salary again, it has changed to 2000
- Phantom read – occurs when transactions do not execute in isolation; for example, the first transaction modifies rows that satisfy some condition, covering all rows in the table, while a second transaction inserts a new row into the same table. The user running the first transaction then finds, as if in a hallucination, rows that it did not modify
1. Transaction 1 reads all employees whose salary is 1000 (say, 10 records)
2. Transaction 2 inserts a new employee record with a salary of 1000
3. Transaction 1 reads all employees whose salary is 1000 again and now gets 11 records
MySQL supports the following four transaction isolation levels:
- Read Uncommitted – Before a transaction is committed, its changes can be seen by other transactions
- Read committed – a transaction's changes are seen by other transactions only after it commits. This avoids dirty reads
- Repeatable read – The data a transaction sees during execution is always the same as the data the transaction sees when it starts
- Serialization – For the same row, write locks are added to write records, and read locks are added to read records. In case of read/write conflicts, the last accessed transaction must wait for the previous transaction to complete
Query the isolation level of the current database
show variables like 'transaction%';
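To change the isolation level for the current session, a minimal sketch (the variable is named transaction_isolation in MySQL 5.7.20+ and 8.0, tx_isolation in older versions):
set session transaction isolation level read committed;
show variables like 'transaction_isolation';   -- verify the change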
Concurrency control
Read-write lock
Read locks are shared and threads do not block each other; Write locks are exclusive. Holding a write lock in one thread blocks write locks and read locks in other threads. Read locks are also known as shared locks, and write locks are also known as exclusive locks.
Lock strategy
Locking consumes resources: obtaining locks, checking locks, and releasing locks all take time. If the system spends a lot of time managing locks rather than accessing data, performance suffers. A lock strategy is therefore the balance a system strikes between lock overhead and data safety, which in turn affects performance.
Lock policies supported by mysql
Table locks
Lowest overhead, but the lock granularity is large, so it supports little concurrency.
Row locks
Row locks exist only in the storage engine layer; they are not implemented in the MySQL server layer. Their granularity is small, which supports maximum concurrency, but the locking overhead is high.
When locks are taken
Implicit locking
InnoDB uses a two-phase locking protocol. During transaction execution, locks are taken automatically as needed, depending on the isolation level. Locks are released only on COMMIT or ROLLBACK, and all locks are released at the same time.
Explicit locking
InnoDB also supports specific statements for explicit locking
select ... lock in share mode;  -- take a shared (read) row lock
select ... for update;          -- take an exclusive (write) row lock
A deadlock
Two or more transactions occupy the same resource and request to lock the resource occupied by the other, resulting in a vicious circle. Deadlocks can occur when multiple transactions lock resources in different order, as well as when multiple transactions lock the same resource at the same time. The following simulates two transactions locking resources in different order:
Transaction 1
start transaction;
update user set address='hk' where id=1;
update user set address='hk' where id=2;
commit;
Transaction 2
start transaction;
update user set address='macau' where id=2;
update user set address='macau' where id=1;
commit;
How does the database handle deadlocks? Database systems implement various deadlock detection and deadlock timeout mechanisms. The InnoDB storage engine uses deadlock detection: when a deadlock is detected it immediately returns an error and rolls back the transaction that holds the fewest row-level exclusive locks.
For example, suppose the numbers indicate the order of operations, and pay particular attention to steps 9 and 10: after step 9, transaction 1 is blocked; after step 10, InnoDB detects a deadlock, so transaction 2 immediately receives a "Deadlock found" error and is rolled back. Step 9 then stops blocking, because transaction 2's rollback released its locks. This execution flow also illustrates the transaction isolation mechanism.
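The related settings and diagnostics can be checked as follows (a sketch; innodb_deadlock_detect is available in MySQL 5.7.15 and later):
show variables like 'innodb_deadlock_detect';    -- ON by default: roll back a victim as soon as a deadlock is found
show variables like 'innodb_lock_wait_timeout';  -- fallback: how many seconds a blocked transaction waits before giving up
show engine innodb status\G                      -- the LATEST DETECTED DEADLOCK section describes the most recent deadlock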
Multi-version concurrency control MVCC
Counter tables
If an application keeps counters in a table, it may run into concurrency problems when updating the counter.
Create the counter table
CREATE TABLE `counter` (
`cnt` int(10) unsigned NOT NULL
)
Increment the counter
update counter set cnt=cnt+1;
-- This single row effectively acts as a global mutex: every transaction that
-- wants to execute the statement has to run serially
For better concurrent update performance, you can store the counter in multiple rows, randomly select one row at a time for update, and modify the table structure to
CREATE TABLE `counter` (
`slot` tinyint unsigned not null primary key,
`cnt` int(10) unsigned NOT NULL
)
Pre-populate the table with 100 slot rows, then update a randomly chosen slot:
update counter set cnt=cnt+1 where slot=100*rand();
To get the statistics, use the following query
select sum(cnt) from counter;
If the requirement changes to starting a new counter periodically (say, every day), change the table to
CREATE TABLE `counter` (
`day` date not null,
`slot` tinyint unsigned not null,
`cnt` int(10) unsigned not null,
primary key(day,slot)
)
In this case the rows are not pre-generated as in the previous example; instead, use ON DUPLICATE KEY UPDATE
insert into counter(day,slot,cnt)
values(CURRENT_DATE, 100*rand(), 1)
on duplicate key update cnt=cnt+1;
If you want to keep the table from growing too large, you can write a periodic task that merges all the results into slot 0 and deletes the other slots, as sketched below.
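A minimal sketch of that periodic merge for the slot-based table above (the day-based variant would additionally group by day):
start transaction;
select sum(cnt) into @total from counter for update;  -- lock every slot and read the grand total
update counter set cnt = @total where slot = 0;       -- fold the total into slot 0
delete from counter where slot <> 0;                  -- drop the remaining slots
commit;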
alter table
MySQL's ALTER TABLE performance is a big problem for large tables. MySQL performs most table structure changes by creating a new, empty table with the desired structure, copying all rows from the old table into the new one, and then dropping the old table. This can take a very long time, especially when memory is tight and the table is large with many indexes. In general, most ALTER TABLE operations interrupt MySQL's service.
Common techniques
- Perform the ALTER TABLE on a machine that is not serving traffic, then switch it with the master that is serving traffic
- Use a "shadow copy": create a new table with the desired structure alongside the source table, then swap the two tables with rename and drop operations
Examples of Special Scenarios
- Modifying a column's default value. Column defaults are stored in the table's .frm file, so the file can be changed directly without touching the table data
alter table user modify age int not null default 18;  -- rebuilds the whole table: slow
alter table user alter age set default 18;            -- changes only the .frm file, no table data touched: very fast
The index
An index is a data structure used by a storage engine to quickly find records. Index optimization is the most effective way to optimize query performance and can easily improve query performance by several orders of magnitude.
MySQL indexing is implemented at the storage engine layer, not at the server layer.
Primary key index, non-primary-key (secondary) index, unique index, ordinary index, covering index
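Rough sketches of how each kind is created, using a hypothetical user table (the index and column names are made up):
alter table user add primary key (id);            -- primary key index
create index idx_name on user(name);              -- ordinary (non-primary-key) secondary index
create unique index uniq_email on user(email);    -- unique index
create index idx_name_age on user(name, age);     -- this index "covers" the query below:
select name, age from user where name = 'tom';    -- every needed column comes from the index itself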
Query optimization
The problem
The index
As noted above, an index is a data structure the storage engine uses to find rows quickly; index optimization is the most effective way to improve query performance, often by several orders of magnitude. Indexed columns are also kept in sorted order.
Causes of slow SQL execution
- No index created
- An index was created but is not being used (the index is invalidated)
- Too many join associated queries
- Server tuning and parameter Settings (such as buffering, thread count)
Characteristics
- advantage
Improves query efficiency and reduces data IO
The sorting cost is reduced and CPU consumption is reduced
- disadvantage
Index files take up storage space
Indexes speed up queries but slow down writes: on INSERT/UPDATE/DELETE, MySQL must not only save the data but also update every index that includes the changed columns
The index type
Single-valued index: An index contains only one column
Compound index: an index that contains multiple columns
Unique index: The value of the indexed column must be unique, but null values are allowed
B+ tree retrieval principle
A B+ tree with a fan-out of about 1200 can, with three levels of branching, index on the order of 1200^3 ≈ 1.7 billion entries, and looking up any one of them costs only about 3 page reads (I/Os), so the performance gain is huge. Without an index, locating each row can take an I/O of its own, so a scan can easily cost millions of I/Os, which is obviously very expensive.
When to create an index
- The primary key automatically creates a unique index
- Fields that are frequently queried should be indexed
- Query fields associated with other tables are indexed by foreign key relationships
- When choosing between a single-column index and a composite index, prefer a composite index under high concurrency (see the sketch after this list)
- A sorted field in a query that can be accessed through an index to speed sorting
- Statistics or group fields in the query
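For example, a single composite index can serve both the WHERE filter and the ORDER BY from the cases above (a sketch with a hypothetical orders table):
create index idx_user_created on orders(user_id, created_at);
explain select * from orders where user_id = 3 order by created_at;
-- the plan should show type=ref on idx_user_created and no "Using filesort"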
When not to create an index
- Too few table records
- Frequently updated fields are not suitable for creating indexes, see disadvantages
- The WHERE condition does not create indexes for fields that are not available
- Columns whose values are highly repetitive and evenly distributed gain little from an index, for example a gender column that holds only male/female. The closer an index's selectivity is to 1, the more efficient the index is
Index selectivity: Refers to the ratio of the number of different values in the index column to the number of records in the table. If there are 2000 records in a table with 1980 different values in the index column, the index selectivity is 1980/2000=0.99.
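Selectivity can be measured directly; for example, for a hypothetical city column on a user table:
select count(distinct city) / count(*) as selectivity from user;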
Performance analysis
EXPLAIN
mysql> explain select * from user where id=2;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | user | NULL | const | PRIMARY | PRIMARY | 1 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set (0.09 sec)
Used to query SQL execution plans
Use the Explain keyword to simulate the optimizer’s execution of SQL queries to see how MySQL handles SQL statements and to analyze performance bottlenecks in query statements or table structures.
What it can show
- The order in which tables are read
- The type of data access operation used to read the data
Query Result Parsing
id
If the ids are the same, execution proceeds from top to bottom
If the ids differ (as with subqueries), the id grows with nesting depth; a larger id means higher priority and earlier execution
Select_type — query type
- simple
A simple SELECT query that contains no subqueries
- primary
The outermost part of a query containing any complex subparts is marked as
- subquery
Subqueries are included in select or WHERE lists
- derived
Subqueries contained in the FROM list are marked as derived. MySQL executes these subqueries recursively and places the result in a temporary table
- union
If the second SELECT appears after a UNION, it is marked as union; if the UNION is contained in a subquery in the FROM clause, the outer SELECT is marked as derived
- union result
The SELECT that retrieves the result from the UNION temporary table
Table – Shows which table the data is about
Type – Access type
From best to worst:
system>const>eq_ref>ref>range>index>all
In general, make sure the query is at least range and preferably ref
- system
This is a special case of const type that is not normally encountered and can be ignored
- const
Const is used for primary key or unique index. Because it only matches one row, it’s fast.
- eq_ref
Unique index scan. For each index key, only one record in the table matches it. Common with primary key indexes or unique index scans.