The most detailed explanation of Mysql interview questions in history

1, MySQL replication principle and process

Basic principle flow, 3 threads and the correlation between them;

Master: binlog thread — records all statements that change the database data in the master binlog;

Slave: IO thread — after using the start slave, it is responsible to pull the binlog content from the master and put it into its own relay log.

From: SQL thread — execute statements in relay log;

2. The difference between MyISam and InnoDB in MySQL is at least 5 points

(1) Ask 5 different points;

InnoDB supports things, MyISAM does not
InnoDB supports row-level locking, while MyISAM supports table-level locking
3>.InnoDB supports MVCC, MyISAM does not
4>.InnoDB supports foreign keys, while MyISAM does not
InnoDB does not support full-text indexing, while MyISAM does.

(2) Four features of InnoDB engine

Insert Buffer, Double Write, Adaptive hash index (AHI), Read Ahead

(2) selectcount(*) which is faster and why

Myisam is faster because myISAM maintains a counter that can be called directly.

Varchar (50) = 50; varchar(50) = 50

(1) Varchar vs. char

Char is a fixed-length type and vARCHar is a variable-length type

(2) the meaning of 50 in VARCHar (50)

Up to 50 characters. Varchar (50) takes the same space as hello (200), but the latter consumes more memory for sorting because order by COL uses fixed_length (same with memory engine).

(3) the meaning of 20 in int (20)

The length of the display character

For example, if it is an ID that records the number of rows, it will display 00000000001 ~~00000000010 when 10 bytes are inserted. If the number of digits exceeds 11, it will display only 11 bits. If you do not add the parameter that makes it add 0 before less than 11 bits, it will not add 0 before

20 indicates that the maximum display width is 20, but it still occupies 4 bytes of storage, and the storage range remains unchanged.

(4) Why is mysql designed this way

Does not make sense for most applications, but specifies some tool used to display the number of characters; Int (1) and int(20) store and calculate the same;

4. Innodb transaction and log implementation

(1) How many kinds of logs are there?

Error log: Records error information, warning information or correct information.

Query logging: Records information about all requests to the database, whether or not they are executed correctly.

Slow query log: Set a threshold. All SQL statements whose running time exceeds this threshold are recorded in the slow query log file.

Binary log: Records all the changes made to the database.

Trunk logs: Trunk logs are also binary logs used to recover the slave library

Transaction logs: redo log redo log and rollback log undo

(2) Four isolation levels of things

Isolation level

Read uncommitted (RU)
Read Submitted (RC)
Repeatable (RR)
serial

(3) How transactions are implemented through logging, the more in-depth the better.

Transaction logging is implemented through redo and InnoDB log buffer. When a transaction is started, the LSN (log sequence number) of the transaction is recorded. When a transaction is executed, the transaction log is inserted into the log cache of InnoDB’s log storage engine. When a transaction commits, the storage engine’s log buffer must be written to disk (controlled by innodb_flush_LOG_at_trx_COMMIT), that is, the log must be written before data is written. This approach is called “write-ahead logging”

MySQL binlog format

Statement: Each SQL Statement that modifies data is recorded in a binlog.

Advantages: You do not need to record the changes of each row, reducing the amount of binlog logs, saving I/O, and improving performance. (How much performance and log volume can be saved compared to a ROW depends on the SQL application. Normally, the amount of log volume generated by modifying or inserting a row is less than that generated by a Statement. However, if you perform an update operation with a bar, delete a whole table, alter a table, etc., the number of log volume generated by modifying or inserting a row is less than that generated by a Statement. ROW generates a large number of logs, so you should consider whether to use ROW logs based on the actual situation of the application, how much more logs are generated, and the I/O performance issues.

Disadvantages: Since only executing statements are recorded, in order for the statements to run correctly on the slave, some information must be recorded about the execution of each statement to ensure that all statements get the same results on the slave as they did on the master side. In addition, mysql replication, such as some specific functions, slave and master can be consistent on a number of related issues (such as sleep(), 1454098, and user-defined functions(UDF) problems).

Statements using the following functions also cannot be copied:

LOAD_FILE()
UUID()
USER()
FOUND_ROWS()
SYSDATE() (unless — sysdate-IS-now option is enabled at startup)

At the same time in INSERT… SELECT produces more row-level locks than RBR

Row: Does not record information about the CONTEXT of an SQL statement.

Advantage: Binlog does not record context-specific information about the SQL statement being executed, only record which record has been modified. So the rowLevel log content clearly records the details of each row of data modification. And there are no specific cases where stored procedures, or functions, or trigger calls and triggers cannot be copied correctly

Disadvantages: All statements executed in the log are recorded as the modification of each row, which may generate a large amount of log content. For example, if an UPDATE statement changes multiple records, each change in the binlog will be recorded, resulting in a large amount of binlog. In particular, when a statement such as ALTER TABLE is executed, each record in the table is logged because the table structure is changed.

Mixedlevel: A combination of the above two levels is used. Common statement changes use the statment format to store binlogs. For example, statements cannot perform primary/secondary copy operations. MySQL stores binlogs in the ROW format. The MySQL database selects a log format based on each SQL Statement executed. That is, Statement and ROW are selected. The row level mode of MySQL has been optimized. Not all changes are recorded in the row level mode. For example, statement mode is used to record table structure changes. In the case of statements that modify data, such as UPDATE or DELETE, all row changes are recorded.

MySQL database CPU up to 500%

Show processList show processList show processList show processList
2, check the timeout log or error log (do several years of development, generally will be query and large quantities of insert will lead to CPU and I/O rise, of course, does not rule out the network state suddenly broken, resulting in a request server only to accept half, such as where clause or paging clause is not sent, of course, a pit experience)

7, SQL optimization methods

(1) Explain the meaning of various items;

select_type
Copy the code

Represents the type of each SELECT clause in the query

type
Copy the code

Represents the way MySQL finds the desired rows in the table. Also called “access type”

possible_keys
Copy the code

Indicates which index MySQL can use to find rows in a table. If an index exists on a column involved in the query, the index will be listed, but not necessarily used by the query

key
Copy the code

Displays the actual index used by MySQL in the query. If no index is used, NULL is displayed

key_len
Copy the code

Represents the number of bytes used in the index, which is used to calculate the length of the index used in the query

ref
Copy the code

Represents the join match criteria for the above table, that is, which columns or constants are used to find values on indexed columns

Extra
Copy the code

Contains additional information that is not suitable for display in other columns, but is important

(2) The meaning and usage scenarios of profile;

Query how long the SQL will take to execute, and see how much CPU/Memory usage, Systemlock, Table lock, etc

8. Implementation principle of backup plan, mysqlDump and XtranBackup

(1) Backup plan;

Every company here is different, don’t you just say one hour and everything

(2) Backup and recovery time;

Here with the machine, especially the hard disk speed has a relationship, the following list a few for reference only

Mysqldump 20G 2 minutes (mysqldump)

30 minutes for 80G (mysqldump)

111G 30 minutes (mysqldump)

288GB in 3 hours (Xtra)

4 hours of 3T (XTRA)

The logical import time is usually five times longer than the backup time

(3) Implementation principle of Xtrabackup

InnoDB maintains a redo log file, which we can call a transaction log file. The transaction log stores recorded changes to each InnoDB table data. When InnoDB starts, InnoDB checks the data files and transaction logs and performs two steps: It applies (rolls forward) committed transaction logs to the data files and rolls back data that has been modified but not committed.

Mysql > select * from ‘mysqldump’ where ‘insert’ = ‘1’; .value()? What if the backup requires the master copy point information?

--skip-extended-insert

[root@helei-zhuanshu ~]# mysqldump -uroot -p helei --skip-extended-insertEnter password: KEY `idx_c1` (`c1`), KEY `idx_c2` (`c2`) ) ENGINE=InnoDB AUTO_INCREMENT=51 DEFAULT CHARSET=latin1; / *! 40101 SET character_set_client = @saved_cs_client */; -- -- Dumping datafortable `helei` -- LOCK TABLES `helei` WRITE; / *! 40000 ALTER TABLE `helei` DISABLE KEYS */; INSERT INTO 'helei' VALUES (1,32,37,38,'the 2016-10-18 06:19:24'.'susususususususususususu');

INSERT INTO `helei` VALUES (2,37,46,21,'the 2016-10-18 06:19:24'.'susususususu'); INSERT INTO 'helei' VALUES (3,21,5,14)'the 2016-10-18 06:19:24'.'susu');
Copy the code

10, 500 DB, restart in the fastest time

You can use the batch SSH tool PSSH to run restart commands on the machines that need to be restarted. You can also use a multithreaded tool such as salt (provided the client has salt installed) or Ansible (ansible requires SSH to avoid login) to operate multiple servers at the same time

11. Innodb read and write parameter optimization

(1) Read parameters

The global buffer pool andlocalBuffer;Copy the code

(2) Write parameters;

innodb_flush_log_at_trx_commit

innodb_buffer_pool_size
Copy the code

(3) IO related parameters;

innodb_write_io_threads = 8

innodb_read_io_threads = 8

innodb_thread_concurrency = 0
Copy the code

(4) Cache parameters and applicable scenarios of cache.

query cache/query_cache_type
Copy the code

Query Cache is not suitable for all tables. The main cause of query cache failure is that the corresponding table is changed

The first one: read a lot of words to see the proportion, in simple terms, if it is a list of users, or the proportion of data is relatively fixed, such as the list of goods, it can be opened, the premise is that these libraries are more centralized, the practice of the database is relatively small.

The second is that when we “cheat”, such as when we are bidding on the query cache, we can still get the QPS surge effect, but of course the connection pool configuration is the same. In most cases, if the amount of writing is in the majority, the number of visits is not much, so do not open, for example, social networking sites, 10% of people produce content, the rest 90% are in consumption, open or effect is very good, but if you are QQ messages, or chat, it is very deadly.

It is not recommended to open query cache for a site with a high number of concurrent requests

12. How do you monitor your database? How do you search your slow logs?

There are many monitoring tools, such as Zabbix and LEPUS, which I’m using here

13, Have you done master-slave consistency check? If so, how? If not, what are you going to do?

There are many tools for checking primary-secondary consistency, such as checksum, mysqlDIff, and pt-table-checksum

14. Does your database support emoji? If not, how to operate?

If the character set is UTF8, you need to upgrade to UTF8_MB4 to support it

15, How do you maintain the data dictionary of the database?

We all have different maintenance methods. I generally annotate it directly in the production library and export it into Excel for easy circulation by using tools.

Select * from table_name where table_name = ‘X’; select * from table_name where table_name = ‘X’; select * from table_name where table_name = ‘X’

Disassembly problems: connection consumption + storage disassembly space; Possible problems: query performance;

1, if you can tolerate the space problems caused by splitting, it is best to physically place the primary key of the frequently queried table together (partition) order IO to reduce join consumption. Finally, this is a text column plus a full-text index to minimize join consumption
2, if you can tolerate the loss of query performance without splitting: the above scheme will certainly have problems under some extreme conditions, then not splitting is the best choice

17, MySQL InnoDB engine row lock by what is done (or implemented)? Why is that?

InnoDB does row locking based on indexes

Select * from tab_with_index where id = 1 for update;

For UPDATE can perform row locking based on conditions, and ids are columns with index keys,

If id is not an index key then InnoDB will complete the table lock, and concurrency will be out of the question

18. Open question: It is said to be Tencent’s

A table of 600 million a, a table of 300 million B, through the external TID association, how can you query the 200 data records meeting the conditions from 50000 to 50200 in the fastest way?

1. If table A is self-growing and TID is continuous, the ID of table B is the index

select * from a,b where a.tid = b.id and a.tid>500000 limit 200;
Copy the code

2. If the TID of table A is not continuous, then overwrite index is required. TID is either a primary key or secondary index. Table B ID also needs to have an index.

select * from b , (select tid from a limit50000200) awhere b.id = a .tid;
Copy the code

What is a stored procedure? What are the pros and cons?

Stored procedures are pre-compiled SQL statements.

1, the understanding of the more straightforward: the stored procedure is a set of records, it is composed of some T – SQL statement block, these T – SQL code as a way to realize some function (for single tables or more tables to add and delete), and then come up with a name for this block of code, in use this function called when he went.
2. Stored procedure is a precompiled code block with high execution efficiency. A stored procedure replaces a large number of T_SQL statements, which can reduce network traffic, improve communication rate and ensure data security to a certain extent

20. What is the index? What are the effects and advantages and disadvantages?

1. An index is a structure that sorts the values of one or more columns in a database table. It is a data structure that helps MySQL obtain data efficiently
2. An index is a way to speed up the retrieval of data in a table. The index of a database is similar to the index of a book. In books, indexes allow users to quickly find the information they need without having to go through the entire book. In databases, indexes also allow database programs to quickly find data in tables without having to scan the entire database.

MySQL database has several basic index types: normal index, unique index, primary key index, full text index

1, index to speed up the database retrieval speed
2. Indexes slow down maintenance tasks such as inserts, deletes, and modifications
3. A unique index ensures the uniqueness of each row
4, through the use of indexes, you can use optimization hider in the process of query, improve the performance of the system
5. Indexes need to occupy physical and data space

What is the transaction?

Transaction is the basic unit of concurrency control. A transaction is a sequence of operations that are either performed or not performed, and it is an indivisible unit of work. A transaction is the unit of data consistency maintained by a database, and data consistency is maintained at the end of each transaction.

Do indexed queries always improve query performance? why

In general, querying data through an index is faster than a full table scan. But we must also be aware of the costs.

1. Indexes need space to store and need regular maintenance. The index itself is modified whenever a record is added or deleted from the table or an index column is modified. This means that each INSERT,DELETE, and UPDATE record will cost an additional 4 or 5 disk I/ OS. Because indexes require extra storage and processing, unnecessary indexes can slow query response times. Using INDEX queries may not improve query performance. INDEX RANGE SCAN queries are applicable to two situations:
2. Based on a range retrieval, a typical query returns a result set less than 30% of the number of records in the table
3. Search based on non-unique index

23. Briefly say the drop, DELETE and TRUNCate sections

In SQL, DROP, DELETE, and TRUNCate all indicate deletion, but the three have some differences

Delete and TRUNCate delete only the data of the table but not the structure of the table
2, speed, generally speaking: DROP > TRUNCate > DELETE
3. The DELETE statement is DML, and the operation will be put into rollback seinterfaces, and only take effect after the transaction is submitted.
Truncate, DROP is DDL and the operation takes effect immediately. The original data is not added to the rollback segment and cannot be rolled back. The action does not trigger.

24. In what scenarios are DROP, DELETE, and TRUNCate used?

1. When a table is no longer needed, use drop
2. When you want to delete some rows, use delete with the WHERE clause
3. Use TRUNCate when deleting all data in a reserved table

What are superkey, candidate key, primary key and foreign key respectively?

1. Superkeys: The set of attributes that uniquely identify a tuple in a relationship is called a relational superkey. An attribute can be used as a superkey, or a combination of attributes can be used as a superkey. Superkeys contain candidate keys and primary keys.
2. Candidate key: is the minimum superkey, that is, the superkey with no redundant elements.
3. Primary key: a combination of data columns or attributes that uniquely and completely identify stored data objects in a database table. A data column can have only one primary key, and the value of the primary key cannot be missing, that is, cannot be Null.
4. Foreign key: The primary key of another table in a table is called the foreign key of that table.

26. What are views? And what are the usage scenarios for views?

A view is a virtual table that has the same functions as a physical table. Views can be added, modified, looked up, or manipulated, usually by rows or subsets of columns in one or more tables. Changes to the view do not affect the base table. It makes it easier for us to get data compared to multi-table queries.
2. Only part of the fields are exposed to visitors, so create a virtual table, called a view.
3, the query data comes from different tables, and the inquirers want to query in a unified way, so that can also build a view, the query results of multiple tables together, the inquirers only need to directly obtain data from the view, do not have to consider the data from different tables brought by the difference

Say three paradigms.

First Normal Form (1NF) : Fields in database tables are single-attribute and non-divisible. This single attribute is made up of basic types, including integer, real, character, logical, date, and so on.
Second normal Form (2NF) : There is no partial function dependence of non-key fields on any candidate key field in a database table (partial function dependence refers to the presence of some field in the combined key determining the non-key field), that is, all non-key fields are completely dependent on any set of candidate keys.
Third normal Form (3NF) : On the basis of second normal form, the data table conforms to third normal form if there is no transfer function dependence of non-key fields on any candidate key field. The so-called transfer function dependence refers to the dependence of C transfer function on A if there is A “A → B → C” decision relation. Therefore, database tables that meet the third normal form should not have the following dependencies: key field → non-key field X → non-key field Y

What are optimistic and pessimistic locks for databases?

The task of concurrency control in a database management system (DBMS) is to ensure that the isolation and unity of transactions and the unity of the database are not broken when multiple transactions simultaneously access the same data in the database. Optimistic concurrency control (optimistic locking) and pessimistic concurrency control (pessimistic locking) are the main techniques used in concurrency control.

Pessimistic locking: Shielding all operations that might violate data integrity, assuming concurrency conflicts

Optimistic locking: Data integrity violations are checked only at commit time, assuming no concurrency conflicts will occur.