1. What are the isolation levels of databases?

Read Uncommitted: as the name suggests, a transaction can read data from another, uncommitted transaction.

Read Committed: a transaction can only read data that other transactions have already committed; uncommitted changes are invisible to it.

Repeatable Read: once a transaction has started and read data, modifications by other transactions are not visible to it, so reading the same rows repeatedly returns the same result.

Serializable: Serializable is the highest transaction isolation level. At this level, transactions are serialized and executed sequentially, which avoids dirty reads, non-repeatable reads, and phantom reads. However, this isolation level is inefficient and costly in database performance, and is generally not used.

It is worth noting that the default transaction isolation level of most databases, such as SQL Server and Oracle, is Read Committed, while the default isolation level of MySQL is Repeatable Read.

2. What is a phantom read?

Phantom read: a phenomenon that occurs when transactions are not executed in isolation. For example, the first transaction modifies data in a table, affecting all rows in the table; at the same time, a second transaction inserts a new row into the same table. The user running the first transaction then finds, as if in an illusion, a row in the table that its modification did not touch.

For example, a programmer spends 2,000 yuan one day. His wife opens a transaction and checks his spending records with a full-table scan, confirming that the total is indeed 2,000 yuan. At that moment the programmer spends another 10,000 yuan on a computer, that is, a new spending record is inserted and committed. When the wife then prints the list of spending records and commits her transaction, the total is 12,000 yuan, as if the first read had been an illusion. This is a phantom read.
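The receipt story above can be sketched with two SQLite connections to the same database file (SQLite is used here only so the example is runnable anywhere; the table name and amounts are illustrative, and under MySQL's Repeatable Read the second read would still see 2,000):

```python
import os
import sqlite3
import tempfile

# Two connections stand in for the wife's and the programmer's sessions.
path = os.path.join(tempfile.mkdtemp(), "spend.db")
wife = sqlite3.connect(path)
programmer = sqlite3.connect(path)

wife.execute("CREATE TABLE spend (yuan INTEGER)")
wife.execute("INSERT INTO spend VALUES (2000)")
wife.commit()

# The wife checks the total: 2000 yuan.
first = wife.execute("SELECT SUM(yuan) FROM spend").fetchone()[0]

# Meanwhile the programmer inserts and commits a new spending record.
programmer.execute("INSERT INTO spend VALUES (10000)")
programmer.commit()

# The wife's next read now totals 12000 -- a "phantom" row has appeared.
second = wife.execute("SELECT SUM(yuan) FROM spend").fetchone()[0]
print(first, second)  # 2000 12000
```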

3. What storage engines does MySQL have? The main differences between the two common storage engines, MyISAM and InnoDB, are as follows:

1) InnoDB supports transactions while MyISAM does not, which is very important. A transaction is an advanced processing mechanism: if any statement in a series of inserts, deletes, and updates fails, the whole series can be rolled back and restored; MyISAM cannot do this.

2) MyISAM is suitable for query- and insert-intensive applications, while InnoDB is suitable for applications that are frequently modified or require high data safety.

3) InnoDB supports foreign keys, MyISAM does not

4) InnoDB is the default engine since MySQL5.5.5

5) InnoDB does not support FULLTEXT indexes

6) InnoDB does not store a table's row count. For example, for select count(*) from table, InnoDB must scan the whole table to compute the row count, while MyISAM simply reads the saved value. Note that when the count(*) statement contains a WHERE condition, MyISAM also needs to scan the whole table.

7) For an auto-increment field, InnoDB requires an index containing only that field, whereas a MyISAM table can put it in a joint index together with other fields.

8) When deleting all rows (delete from table without a WHERE clause), InnoDB deletes the table row by row, which is very slow, while MyISAM rebuilds the table.

update table set a=1 where user like '%lee%';

Some people say that MYISAM can only be used for small applications, but this is just a bias.

If there is a large amount of data, the architecture should be upgraded to solve the problem, for example by splitting tables and databases and separating reads from writes, rather than relying solely on the storage engine.

Nowadays InnoDB is generally used, mainly because MyISAM takes full-table locks, so reads and writes are serialized and concurrency is poor; for read-write-intensive applications MyISAM is generally avoided.

In a word:

1. The MyISAM type does not support advanced processing such as transaction processing, while the InnoDB type does.

2. MyISAM tables emphasize performance and execute faster than InnoDB tables, but they provide no transaction support, while InnoDB offers advanced database features such as transactions and foreign keys.

4. How to safely modify the same row of data under high concurrency?

1) Use pessimistic locks: only one thread performs the operation at a time, and external requests for changes are excluded.

2) Use the FIFO (First In, First Out) cache queue idea: putting requests into a queue ensures that no request waits forever for the lock; it is somewhat like forcing multiple threads into a single thread.

3) Use optimistic locks (recommended): compared with pessimistic locking, this adopts a looser locking mechanism, mostly implemented with a version number that is checked at update time.
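A minimal sketch of the version-number scheme, using SQLite for portability (the `account` table, column names, and amounts are made up for illustration; the same UPDATE-with-version-guard pattern applies to MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY,"
             " balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")

def withdraw(conn, account_id, amount):
    """Retry until the UPDATE's version guard matches (no lost update)."""
    while True:
        balance, version = conn.execute(
            "SELECT balance, version FROM account WHERE id=?",
            (account_id,)).fetchone()
        cur = conn.execute(
            "UPDATE account SET balance=?, version=version+1 "
            "WHERE id=? AND version=?",
            (balance - amount, account_id, version))
        if cur.rowcount == 1:   # our snapshot was still current
            conn.commit()
            return
        # someone else bumped the version first -> reread and retry

withdraw(conn, 1, 30)
withdraw(conn, 1, 20)
print(conn.execute("SELECT balance, version FROM account WHERE id=1")
      .fetchone())  # (50, 2)
```

The key point is that the write succeeds only if the row's version is still the one that was read, so a concurrent writer forces a reread instead of silently overwriting.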

Read more: blog.csdn.net/riemann_/ar…

5. What are optimistic locks and pessimistic locks? What are InnoDB's standard row-level locks?

Optimistic Concurrency Control (OCC, optimistic locking) is an approach to concurrency control. It optimistically assumes that concurrent transactions from multiple users will not interfere with each other, so each transaction processes its own data without taking locks and checks for conflicts only when applying its changes.

Pessimistic Concurrency Control (PCC, pessimistic locking) is the opposite of the optimistic lock. Pessimistic locking assumes that operations on data will conflict, so every operation on the same data must first acquire a lock, similar to synchronized in Java; pessimistic locking therefore costs more time. InnoDB's two standard row-level locks are shared locks and exclusive locks; both are implementations of pessimistic locking.

6. What are the general steps of SQL optimization? How to view an execution plan and understand the meaning of each field?

[1]. Use the show status command to learn the execution frequency of various SQL statements. After the mysql client connects successfully, the show [session | global] status command provides server status information; the mysqladmin extended-status command can obtain the same messages.

1) Com_select: the number of select operations; one query adds only 1. 2) Com_insert: the number of insert operations; a batch insert adds only 1. 3) Com_update: the number of update operations. 4) Com_delete: the number of delete operations.

The Com_ parameters accumulate over tables of all storage engines. The following apply to InnoDB only: 1) Innodb_rows_read: the number of rows returned by select queries. 2) Innodb_rows_inserted: the number of rows inserted by insert operations. 3) Innodb_rows_updated: the number of rows updated by update operations. 4) Innodb_rows_deleted: the number of rows deleted by delete operations.

With these parameters it is easy to tell whether the current database workload is dominated by inserts and updates or by queries, and to see the approximate proportion of each type of SQL. Note that these counters count executions: both committed and rolled-back statements are accumulated.

For transactional applications, Com_commit and Com_rollback can be used to learn about transaction commit and rollback. For databases with frequent rollback operations, this may indicate application writing problems.

Connections: the number of attempts to connect to the MySQL server. Uptime: the server's working time. Slow_queries: the number of slow queries.

[2]. Locate SQL statements that execute inefficiently:

Use the slow query log to locate inefficient SQL statements. When mysqld is started with --log-slow-queries[=file_name], it writes a log file containing all SQL statements whose execution time exceeds long_query_time seconds.

You can run the show processlist command to check the status of the threads currently running in MySQL, including whether a thread is locked and so on; this lets you watch SQL execution in real time and optimize some table-locking operations.

[3] analyze the execution plan of inefficient SQL through explain: After querying an inefficient SQL statement, you can run the explain or desc command to obtain information about how MySQL executes the SELECT statement, including how tables are joined during the select statement execution and the join order.

Here are the parameters returned by the explain statement:

1) id: the serial number of the select query, a set of numbers indicating the order in which the select clauses or table operations are executed. If the query contains subqueries, the ids differ and increase: the larger the id, the higher the priority and the earlier it is executed; rows with the same id are executed from top to bottom. 2) select_type: mainly used to distinguish the complexity of the query: ordinary query, union query, subquery, and so on.

SIMPLE: A SIMPLE select query that does not contain subqueries or unions.

PRIMARY: if the query contains any complex subquery, the outermost query is marked PRIMARY.

SUBQUERY: Contains subqueries in SELECT or WHERE.

DERIVED: subqueries contained in the FROM list are marked DERIVED; MySQL executes these subqueries recursively and puts the results into temporary tables.

UNION: if a second SELECT appears after UNION, it is marked UNION; if the UNION is contained in a subquery of the FROM clause, the outer SELECT is marked DERIVED.

UNION RESULT: the SELECT that retrieves the result from the UNION temporary table.

3) table: shows which table this row of the plan refers to.

4), type: The main types are as follows:

From best to worst: system > const > eq_ref > ref > range > index > ALL. A query should generally reach at least the range level, and preferably the ref level.

5) possible_keys: the indexes that could be applied to this table. If indexes exist on the queried fields, they are listed here, but they may not actually be used by the query.

6) key: the index actually used; if this is NULL, no index was used. If a covering index is used (covering index: the queried fields match the fields of the created index in name and number), it appears only in the key column.

7) key_len: the number of bytes used in the index; the shorter the better, provided accuracy is not compromised. The value of key_len is the maximum possible length of the index fields, not the actual length used; that is, key_len is calculated from the table definition rather than retrieved from the table.

8) ref: shows which columns or constants are used together with the index to look up values on the index.

9) rows: Based on table statistics and index selection, roughly estimate the number of rows to find the desired record.

10) Extra: Contains important information that is not suitable for display in other columns.

7. What is a deadlock, and how can database deadlocks be resolved?

There are four necessary conditions for a deadlock:

(1) Mutual exclusion: a resource can only be used by one process at a time.

(2) Request and hold: when a process is blocked requesting resources, it holds on to the resources it has already acquired.

(3) No preemption: resources a process has obtained cannot be forcibly taken away before it has finished using them.

(4) Circular wait: several processes form a head-to-tail circular chain, each waiting for a resource held by the next.

These four conditions are all necessary for a deadlock: whenever a deadlock occurs they must all hold, and as long as any one of them is not satisfied, no deadlock can occur.
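The last condition is the one most often broken in practice. A small sketch with Python threads (thread and lock names are made up): both workers always acquire the two locks in the same global order, so no circular wait, and therefore no deadlock, can form.

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
results = []

def worker(name):
    for _ in range(1000):
        with lock_a:        # always acquire A before B -> no cycle
            with lock_b:
                results.append(name)

t1 = threading.Thread(target=worker, args=("t1",))
t2 = threading.Thread(target=worker, args=("t2",))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(results))  # 2000
```

If one worker instead took B before A, the circular-wait condition could be satisfied and the program could hang.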

There are two ways to resolve database deadlocks:

(1) Restart the database. (2) Kill the process that is holding the contested resource.

8. MySQL index principles and index types: how to create a reasonable index and how to optimize it.

The principle of MySQL indexes:

1) We filter out the final result by continuously narrowing the range of data to be examined, and at the same time turn random events into sequential events; in other words, with an indexing mechanism we can always use the same search method to locate the data.

2) An index is a means of improving data-query performance through carefully designed structures and algorithms, turning disk IO into memory IO as far as possible.

The index types in MySQL:

1) Ordinary index (index): only speeds up lookups.

2) Primary key index (primary key): speeds up lookups + unique constraint + not null. Unique index (unique): speeds up lookups + unique constraint.

3) Joint (composite) indexes:

① primary key(id, name): joint primary key index. ② unique(id, name): joint unique index. ③ index(id, name): joint ordinary index.

4) Full-text index (fulltext): used for searching within long text; this is what it does best.

5) Spatial index (spatial): worth knowing about, but almost never used.

9. Differences between clustered and non-clustered indexes.

Clustered index: the index and the data records are stored clustered together.

Non-clustered index: index files are stored separately from data files. A leaf page of the index stores only the primary key value, so the corresponding data block must then be looked up to locate the record.

10. What does select ... for update do? Does it lock the table, lock rows, or something else?

select ... for update is a manual locking statement that we often use. With the for update clause we can manually implement data lock protection at the application level; it belongs to pessimistic locking, and the locks it adds are exclusive locks. In InnoDB, if the WHERE condition can use an index it locks only the matching rows; otherwise it locks the whole table.

11. Why are indexes implemented with B trees? How and when do they split, and why are they balanced?

Why use B+ trees? Concisely, because:

1). Files are too large to be stored in memory, so they must be stored on disk

2). The index structure should be organized to minimize the number of disk I/O in the search process.

3). Locality principle and disk prefetch, prefetch length is generally an integer multiple of page (in many operating systems, page size is usually 4K)

4). Database systems make clever use of disk prefetching by setting the size of a node equal to one page, so that each node can be loaded fully with a single I/O (the two arrays inside a node occupy consecutive addresses). In a red-black tree the height h is significantly larger, and locality cannot be exploited because nodes that are logically close (parent and child) may be physically far apart.

A node splits only when its keys exceed its capacity (for example, 1024 keys): as data grows and a node's key array fills up, the node is split in order to maintain the B-tree invariants, just as red-black trees and AVL trees rotate to maintain their properties.
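The split step itself can be sketched in a few lines (this is only the split of a single full node, not a complete B-tree; the function name and the toy order of 5 are made up): the median key moves up to the parent and the remaining keys become two sibling nodes.

```python
def split_node(keys, order):
    """Split a full B-tree node: the median key is promoted to the
    parent, and the halves on either side become two sibling nodes."""
    assert len(keys) == order      # the node has overflowed its capacity
    mid = order // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

left, median, right = split_node([1, 3, 5, 7, 9], 5)
print(left, median, right)  # [1, 3] 5 [7, 9]
```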

12. What is ACID in the database?

A (atomic) : atomic, either all submit or all fail, not some succeed and some fail.

C (consistent) : consistent. The data consistency constraint is not destroyed after the start and end of a transaction

I (isolation): isolation; concurrent transactions do not affect or interfere with each other.

D (durability): persistence; updates made to the database by committed transactions must be kept permanently. Even if a crash occurs, they cannot be rolled back or lost.
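Atomicity in particular is easy to demonstrate: in the sketch below (SQLite for portability; the `acct` table and amounts are illustrative), a transfer's second statement fails, and rolling back undoes the first statement as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acct (id INTEGER PRIMARY KEY, bal INTEGER)")
conn.execute("INSERT INTO acct VALUES (1, 100)")
conn.commit()

try:
    conn.execute("UPDATE acct SET bal = bal - 50 WHERE id = 1")  # debit
    conn.execute("INSERT INTO acct VALUES (1, 0)")  # PK violation -> error
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()  # the debit is undone together with the failed insert

bal = conn.execute("SELECT bal FROM acct WHERE id = 1").fetchone()[0]
print(bal)  # 100: all-or-nothing, never "some succeed and some fail"
```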

13. A table has nearly ten million rows, and CRUD on it is relatively slow; how to optimize?

With data at the tens-of-millions level, the storage footprint is large, so the data is presumably not stored in one continuous physical area but chained across multiple physical fragments; long string comparisons may also take more time to locate and match, all of which adds up.

1) As a relational database, what is the reason for the emergence of such a large table? Whether the table can be split to reduce the number of single table fields and optimize the table structure.

2) Where the primary key is usable, check the field order of the primary key index and make the order of the conditions in the query statement consistent with the field order of the primary key index.

3) Use manual transaction control in the program logic: instead of auto-committing every inserted row, define a counter and commit manually in batches, which can effectively improve the running speed.
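The counter idea above can be sketched as follows (SQLite again stands in for MySQL; the table name and batch size of 1000 are arbitrary): `isolation_level=None` gives autocommit mode, so transactions are opened and committed explicitly, one per batch instead of one per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual control
conn.execute("CREATE TABLE big (id INTEGER)")

BATCH = 1000
conn.execute("BEGIN")
for i in range(10000):
    conn.execute("INSERT INTO big VALUES (?)", (i,))
    if (i + 1) % BATCH == 0:
        conn.execute("COMMIT")   # flush one batch in a single commit
        conn.execute("BEGIN")
conn.execute("COMMIT")           # flush any final partial batch

total = conn.execute("SELECT COUNT(*) FROM big").fetchone()[0]
print(total)  # 10000
```

Committing every 1000 rows amortizes the per-commit cost (log flush, lock release) over the whole batch.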

More analysis can be found at blog.csdn.net/riemann_/ar…

14. How to optimize MySQL queries to avoid full table scans?

Avoid IS NULL checks on fields in the WHERE clause.

Avoid using the != or <> operators in the WHERE clause; they cause the engine to abandon the index and perform a full table scan instead.

Avoid using or to join conditions in where clauses.

in and not in should also be used with caution.

Avoid like queries whose pattern does not start with a literal prefix (i.e. with a leading wildcard).

Avoid using variables as parameters in the WHERE clause, such as num = @num.

Avoid expression operations on fields in the WHERE clause, such as num / 2 = XX.

Avoid applying functions to fields in the WHERE clause.
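The expression and function rules can be seen directly in a query plan. A SQLite sketch (MySQL behaves analogously; the table and index names are made up): a bare equality on the indexed column uses the index, while wrapping the column in an expression or a function forces a scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE u (num INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_num ON u(num)")

def plan(sql):
    """Return the EXPLAIN QUERY PLAN output as one string."""
    return str(conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())

uses_index = "idx_num" in plan("SELECT * FROM u WHERE num = 5")
scan_expr = "SCAN" in plan("SELECT * FROM u WHERE num / 2 = 5")
scan_func = "SCAN" in plan("SELECT * FROM u WHERE abs(num) = 5")
print(uses_index, scan_expr, scan_func)  # True True True
```

The fix is always the same: rewrite the condition so the bare column stands alone on one side (e.g. `num = 10` instead of `num / 2 = 5`).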

15. How to write SQL that makes effective use of a composite index?

For example, suppose table tname has a composite index index(a, b). A query such as select * from tname where a = XX and b = XX, or where a = XX and b like 'TTT%', follows the leftmost-prefix rule and can use the index effectively.
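The leftmost-prefix rule can be verified from the plan. A SQLite sketch (table and index names are illustrative; MySQL's composite indexes follow the same rule): the composite index (a, b) serves a condition on both columns, but not a condition on b alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tname (a INT, b INT, c TEXT)")
conn.execute("CREATE INDEX idx_ab ON tname(a, b)")

good = str(conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM tname WHERE a = 1 AND b = 2"
).fetchall())
bad = str(conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM tname WHERE b = 2"
).fetchall())
print("idx_ab" in good, "SCAN" in bad)  # True True
```

Because the index is sorted by a first and b second, skipping a leaves the b values scattered, so the engine falls back to a scan.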

16. The difference between in and exists in MySQL.

The in statement in MySQL hash-joins the outer table with the inner table, while the exists statement loops over the outer table and queries the inner table for each row. It has long been thought that exists is more efficient than in, but this is not accurate; it depends on the situation:

① If the two tables are about the same size, there is little difference between in and exists.

② If one table is small and the other large, use exists when the subquery's table is the large one, and in when the subquery's table is the small one.

③ not in vs. not exists: if a query uses not in, both the inner and the outer table are fully scanned with no index used, whereas the not exists subquery can still use indexes on its table. Therefore not exists is faster than not in regardless of table sizes.

Note also that exists returns only TRUE or FALSE, while in can return UNKNOWN when a NULL is present.
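The NULL trap is worth seeing once. A SQLite sketch (the two table names are made up): a single NULL in the subquery makes NOT IN return no rows at all, because `1 NOT IN (2, NULL)` evaluates to UNKNOWN, while NOT EXISTS still returns the expected row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (v INTEGER);
    CREATE TABLE archived (v INTEGER);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO archived VALUES (2), (NULL);
""")
not_in = conn.execute(
    "SELECT v FROM orders WHERE v NOT IN (SELECT v FROM archived)"
).fetchall()
not_exists = conn.execute(
    "SELECT v FROM orders o WHERE NOT EXISTS "
    "(SELECT 1 FROM archived a WHERE a.v = o.v)"
).fetchall()
print(not_in, not_exists)  # [] [(1,)]
```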

17. What problems can a database auto-increment primary key cause?

[1]. When sharding a database that uses auto-increment primary keys, problems such as duplicate primary keys may occur. [2]. When importing data, the primary key may also cause conflicts.

Can refer to the article: yq.aliyun.com/articles/38…

18. Did you encounter database and table sharding in your projects? How did you do it?

Refer to the article: www.cnblogs.com/butterfly10…

19. How to deal with MySQL master-slave delay?

In fact, there is no way to completely eliminate master-slave synchronization delay, because all SQL must be re-executed on the slave server; if the master is constantly being updated and written to, the likelihood of delay increases. We can, however, take some mitigation measures.

a) The simplest way to reduce slave synchronization delay is to optimize the architecture so that the master's SQL executes as quickly as possible. In addition, the master does the writing and needs high data safety, e.g. sync_binlog=1 and innodb_flush_log_at_trx_commit=1, while the slave does not need such high safety: you can set sync_binlog to 0 or disable the binlog, and set innodb_flush_log_at_trx_commit to 0, to improve SQL execution efficiency. Another option is to use better hardware for the slave than for the master.

b) Use one slave server purely as a backup, without serving queries; with its load reduced, executing the SQL in the relay log naturally becomes more efficient.

c) Add more slave servers, the purpose being to spread the read pressure and thus reduce server load.

More excellent articles:

www.cnblogs.com/wenxiaofei/…

blog.csdn.net/zhengzhaoya…

studygolang.com/articles/14…