1. MySQL replication principles and process
(1) Basic principle and flow of replication
1. Master: the binlog thread records all statements that change data into the master's binlog;
2. Slave: after START SLAVE, the I/O thread pulls binlog content from the master and writes it into its own relay log;
3. Slave: the SQL thread executes the statements in the relay log.
(2) How many threads does MySQL replication involve, and how are they related?
MySQL replication is based on the interaction of the following three threads:
1. The Master binlog dump thread is responsible for transmitting the Master binlog events to the slave.
2. The I/O thread on the Slave receives the binlog sent by the Master and writes it to the relay log.
3. The SQL thread on Slave reads the relay log and executes it.
4. With multi-threaded replication (whether the schema-level pseudo-parallel replication of 5.6 or the true parallel replication of MariaDB and MySQL 5.7), the SQL thread acts only as a coordinator: it reads events from the relay log and hands them to worker threads, and the worker threads execute the actual binlog events (see the sketch below);
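A minimal sketch of wiring these threads up on a slave (host, credentials, and binlog coordinates are hypothetical):

```sql
-- run on the slave; START SLAVE spawns the I/O thread and the SQL thread
CHANGE MASTER TO
  MASTER_HOST = 'master.example.com',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_pass',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 4;
START SLAVE;
SHOW SLAVE STATUS\G   -- Slave_IO_Running / Slave_SQL_Running show the two slave threads
-- on the master, SHOW PROCESSLIST lists the "Binlog Dump" thread serving this slave
```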
(3) How does MySQL ensure data consistency during replication and reduce synchronization delay?
Consistency mainly includes the following aspects:
1. In MySQL 5.5 and earlier, the position of the relay log executed by the slave SQL thread could only be saved in a file (relay-log.info), which by default is synced to disk only once every 10,000 transactions. This means that if the slave crashes and restarts, the recorded SQL-thread position can be inconsistent with the data actually in the database, causing replication errors, and if replication does not error out, silent data inconsistency. MySQL 5.6 introduced the relay_log_info_repository parameter: when it is set to TABLE, MySQL stores the SQL-thread position in mysql.slave_relay_log_info, so the position update and the user transaction executed by the SQL thread are bound into one transaction. When the slave crashes unexpectedly, InnoDB crash recovery restores the SQL-thread position and the user data to a consistent state.
2. MySQL 5.6 introduces GTID replication. Each GTID transaction is executed at most once on each instance, which greatly improves data consistency of replication.
3. Semi-synchronous replication was introduced in MySQL 5.5: after installing the plugin, enabling the parameter, and setting the timeout, the master waits for a slave acknowledgment before returning success to the client. Before 5.7, however, the master committed the transaction first and only then waited for the slave ACK (AFTER_COMMIT), so while the master was waiting the transaction was already committed and visible to other sessions on the master; if the master then went down and the binlog had never reached the slave, a failover lost a transaction that clients had already seen. This issue was not resolved until 5.7;
4. In 5.7 the semi-synchronous parameter rpl_semi_sync_master_wait_point defaults to AFTER_SYNC, meaning the transaction is not committed until the binlog has been synced and an ACK has been received from the slave; only then does the commit complete, making the replication truly lossless.
5. With lossless replication in 5.7, suppose the master crashes unexpectedly and, after restart, some binlog is found never to have been transmitted to the slave. Two cases are possible: (1) semi-sync had already timed out and fallen back to asynchronous mode before the crash, or (2) it had not. How do you determine whether it had fallen back to asynchronous mode, and how do you handle each case?
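A sketch of the settings behind the points above (normally placed in my.cnf; the variable names are the real 5.6/5.7 ones, values illustrative):

```sql
-- crash-safe SQL-thread position (point 1); replication must be stopped when changing it
SET GLOBAL relay_log_info_repository = 'TABLE';
-- GTID replication (point 2) is enabled in my.cnf: gtid_mode = ON, enforce_gtid_consistency = ON
-- lossless semi-sync (points 3 and 4), after installing the semisync plugin:
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_SYNC';
```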
Reducing delay:
MySQL 5.7 introduced group-commit-based parallel replication: transactions that committed in the same group on the master can be applied concurrently by multiple worker threads on the slave, instead of parallelism being limited to one worker per database or table. This is much the same idea as the multi-threaded replication introduced in MariaDB 10.0.
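A sketch of enabling this on a 5.7 slave (values illustrative):

```sql
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';  -- group-commit based; DATABASE is the 5.6-style scheme
SET GLOBAL slave_parallel_workers = 8;             -- applier worker threads; 0 disables parallel apply
START SLAVE SQL_THREAD;
```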
(4) Replication bugs encountered at work and how they were solved
In 5.6, multi-schema multi-threaded replication sometimes stopped on its own; we wrote a script to detect this and restart the slave. To be completed…
2. MyISAM vs. InnoDB: at least 5 differences
(1) At least 5 differences
1. InnoDB supports transactions; MyISAM does not.
2. InnoDB supports row-level locking; MyISAM supports only table-level locking.
3. InnoDB supports MVCC; MyISAM does not.
4. InnoDB supports foreign keys; MyISAM does not.
5. InnoDB did not support full-text indexes before MySQL 5.6; MyISAM does.
6. An InnoDB table cannot be copied to another machine by simply copying its table files; MyISAM supports this.
7. InnoDB tables support multiple row formats; MyISAM does not.
8. InnoDB tables are index-organized; MyISAM tables are heap tables.
(2) Four features of InnoDB engine
1. Insert buffer
2. Double Write
3. Adaptive Hash index (AHI)
4. Read ahead
(3) Improvements to InnoDB across MySQL versions
Key improvements to the InnoDB engine in MySQL 5.6:
1. Online DDL
2. Memcached NoSQL interface
3. Transportable tablespaces (ALTER TABLE ... DISCARD/IMPORT TABLESPACE)
4. The buffer pool's page list (space, page_no) can be dumped at shutdown and reloaded at restart to warm the cache; see the sketch after this list
5. Index and table statistics are persisted to mysql.innodb_table_stats and mysql.innodb_index_stats, providing stable execution plans
6. The COMPRESSED row format supports compressed tables
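A sketch of two of these features in use (the table name t is hypothetical):

```sql
-- dump / reload the buffer pool page list on demand (point 4):
SET GLOBAL innodb_buffer_pool_dump_now = ON;
SET GLOBAL innodb_buffer_pool_load_now = ON;
-- transportable tablespace (point 3):
ALTER TABLE t DISCARD TABLESPACE;
-- ... copy t.ibd (plus the .cfg exported with FLUSH TABLES t FOR EXPORT) from the source ...
ALTER TABLE t IMPORT TABLESPACE;
```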
Major InnoDB improvements in MySQL 5.7:
1. VARCHAR column length can, in some cases, be extended with online DDL
2. The buffer pool can be resized online
3. A configurable proportion of the buffer pool can be dumped (innodb_buffer_pool_dump_pct)
4. New general tablespaces (CREATE TABLESPACE) are supported, and multiple tables can be created in one
5. On-disk internal temporary tables are stored in InnoDB, in a dedicated temporary tablespace (previously they were MyISAM)
6. Transparent tablespace compression
(4) SELECT COUNT(*): which engine is faster, and why?
For COUNT(*) without a WHERE clause, MyISAM is faster, because MyISAM maintains an exact row counter that can be returned directly; InnoDB has to scan an index to count.
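A small illustration (t_myisam and t_innodb are hypothetical tables):

```sql
SELECT COUNT(*) FROM t_myisam;              -- answered from MyISAM's stored row counter, no scan
SELECT COUNT(*) FROM t_innodb;              -- InnoDB must scan the smallest available index
SELECT COUNT(*) FROM t_myisam WHERE k = 1;  -- with a WHERE clause, both engines must scan
```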
(5) How the two engines implement indexes
InnoDB is an index-organized table: rows are stored in the clustered (primary key) index, and secondary indexes store the primary key value. MyISAM is a heap table: rows are stored separately, and both primary and secondary indexes store the physical row address.
3. VARCHAR vs. CHAR, and the meaning of the number in varchar(50)
(1) Varchar vs. char
Under single-byte character sets, CHAR(N) is always stored at a fixed length internally and has no entry in the variable-length field length list. Under multi-byte character sets, if the number of bytes stored exceeds N, CHAR(N) is indistinguishable from VARCHAR(N): both store variable-length data plus an entry in the variable-length field length list. If fewer than N bytes are stored, N bytes are still written, padded with trailing spaces up to N. VARCHAR(N) always stores variable-length data and an entry in the variable-length field length list.
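One user-visible consequence, sketched on a throwaway table: under the default sql_mode, CHAR strips trailing spaces on retrieval while VARCHAR keeps them.

```sql
CREATE TABLE cv (c CHAR(5), v VARCHAR(5));
INSERT INTO cv VALUES ('ab ', 'ab ');
SELECT CONCAT(c, '|'), CONCAT(v, '|') FROM cv;  -- 'ab|' vs. 'ab |'
```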
(2) The meaning of 50 in varchar(50)
VARCHAR(50) and VARCHAR(200) take the same space to store 'hello', but the larger declaration consumes more memory during sorting, because ORDER BY allocates fixed-length buffers based on the declared column length (the MEMORY engine behaves the same way). In early versions of MySQL, the 50 stood for a number of bytes; it now stands for a number of characters.
(3) The meaning of 20 in int(20)
It is the display width. It does not affect storage or value range; it only affects the number of leading zeros padded when the column is defined with ZEROFILL, which is convenient for report display.
(4) Why did MySQL design it this way?
It makes no difference for most applications; it just tells some tools how many digits to display. INT(1) and INT(20) are stored and computed identically.
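A quick demonstration of display width (hypothetical table):

```sql
CREATE TABLE t_width (a INT(4) ZEROFILL);   -- ZEROFILL implies UNSIGNED
INSERT INTO t_width VALUES (42);
SELECT a FROM t_width;   -- displays 0042; storage and value range are unchanged
```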
4. InnoDB transactions and how they are implemented through logs
(1) How many kinds of logs are there?
Two: the undo log and the redo log.
(2) How the logs are stored
Redo: page changes are written to the redo log buffer, then to the redo log file via the OS file cache (write), and finally flushed to disk (fsync).
Undo: before MySQL 5.6, undo logs could only be stored inside the ibdata* system tablespace; from 5.6 on, innodb_undo_tablespaces allows undo logs to be placed outside ibdata*.
(3) How transactions are implemented through logging, the more in-depth the better
The basic process is as follows:
When a transaction modifies a page, it first writes the undo record; the undo write itself is protected by redo. It then modifies the page, and that modification is also recorded in redo. Redo (which therefore includes the undo changes) must be persisted to disk before the data pages are, which is write-ahead logging. When a transaction needs to roll back, undo is what restores the data pages to their before-images.
During crash recovery, a transaction with no commit record in the redo log is rolled back using the undo generated before the crash; a transaction with a commit record is rolled forward through redo until it is complete, and committed.
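A simplified, annotated timeline for a single-row change (not literal internals, just the ordering described above):

```sql
-- UPDATE t SET v = 2 WHERE id = 1;   (t is a hypothetical table)
BEGIN;
-- 1. write the undo record (before-image v = 1); the undo write itself generates redo
-- 2. modify the page in the buffer pool; log the change into the redo log buffer
COMMIT;
-- 3. at commit, the redo log buffer is flushed to disk (per innodb_flush_log_at_trx_commit)
-- 4. the dirty data page is flushed later, always after its redo has been persisted (WAL)
```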
5. MySQL binlog formats
(1) The meaning of various log formats
1. Statement: every SQL statement that modifies data is recorded in the binlog.
Advantages: there is no need to record each row change, which reduces binlog volume, saves I/O, and improves performance. (How much is saved relative to row format depends on the application: normally a statement that modifies or inserts rows generates less log than the equivalent row events. Row format generates very large logs for UPDATE statements touching many rows, whole-table deletes, and the like, so the choice must weigh the application's actual workload, expected log volume, and I/O capacity.)
Disadvantages: since only the statements are recorded, extra information about each statement's execution context must also be recorded to guarantee that every statement produces the same results on the slave as it did on the master. Certain functions also cause consistency problems between master and slave, such as SLEEP() and user-defined functions (UDFs).
Statements using the following functions also cannot be replicated reliably:
* LOAD_FILE()
* UUID()
* USER()
* FOUND_ROWS()
* SYSDATE() (unless the --sysdate-is-now option is enabled at startup)
INSERT ... SELECT also takes more row-level locks than it would under row-based replication.
2. Row: does not record the SQL statement or its context; it records which rows were modified and how.
Advantages: the binlog carries no dependency on the context of the executed statement; it only records, clearly and completely, how each row of data was modified. Stored procedures, functions, and triggers that cannot be replicated correctly under statement format are no longer a problem.
Disadvantages: every statement is logged as the modifications of each affected row, which may produce a very large binlog; a single UPDATE that changes many records logs one event per modified row. (Note that DDL such as ALTER TABLE is still logged as a statement even in row format.)
3. Mixed: a combination of the two levels. Ordinary statements are stored in statement format; for statements that cannot replicate correctly statement-based, MySQL automatically switches to row format. MySQL decides the logging form for each specific SQL statement it executes, choosing between statement and row. The row-level mode has also been optimized: not every change is recorded as rows. For example, table structure changes are logged as statements, while statements that modify data, such as UPDATE or DELETE, are logged as row changes.
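A few commands for working with the formats (the binlog file name is illustrative):

```sql
SET GLOBAL binlog_format = 'ROW';      -- or 'STATEMENT' / 'MIXED'
SHOW VARIABLES LIKE 'binlog_format';
SHOW BINARY LOGS;
-- to read row events when inspecting a binlog, use the client tool:
--   mysqlbinlog --base64-output=decode-rows -v mysql-bin.000001
```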
(2) Application scenarios
When a single SQL statement modifies many rows, statement format consumes far less space than row format, but row format is more reliable.
(3) Combined with question 1: pros and cons of each format in replication
Statement format takes less space and transmits to the slave faster, but it is less reliable than row format. Row format costs more space but is reliable when manipulating many rows of data.
6. MySQL CPU rises to 500%: how do you handle it?
When the CPU rises to 500%, first run the operating system's top command to check whether mysqld is the consumer. If not, find the high-CPU processes and deal with those. If it is mysqld, run SHOW PROCESSLIST and look at the sessions running inside: find the expensive SQL, and check whether its execution plan is accurate, whether an index is missing, or whether there is simply too much data. In general it is important to kill these threads (and confirm that CPU usage drops), then re-run the SQL after making appropriate adjustments (adding indexes, rewriting the SQL, changing memory parameters). It is also possible that no single query is expensive, but a large number of sessions connect at once and the CPU spikes; in that case, work with the application side to analyze why the connection count surged and adjust accordingly, for example by limiting the number of connections.
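Typical triage commands (the query and thread id are hypothetical):

```sql
SHOW FULL PROCESSLIST;                                 -- find long-running / expensive sessions
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;   -- check the offender's plan
KILL 12345;                                            -- kill the offending thread id
```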
7. SQL optimization
(1) The meaning of each item in EXPLAIN output
Id: the identifier of each independently executed operation, indicating the order in which objects are accessed. A larger id executes first; equal ids execute from top to bottom.
Select_type: the type of each SELECT clause in the query.
Table: the name of the object being accessed, usually a table name or alias, though other formats occur.
Partitions: the matched partitions.
Type: the join (access) type.
Possible_keys: the indexes that might be used.
Key: the index actually used.
Key_len: the length, in bytes, of the index key parts actually used.
Ref: what is compared against the index on this row: either a constant (const) or a column from another table, such as the join column of the driving table.
Rows: the estimated number of rows to scan in this step.
Filtered: rows * filtered / 100 estimates the number of rows this step produces.
Extra: important additional information.
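A worked example on hypothetical tables (orders, customers), showing what to look for:

```sql
EXPLAIN SELECT o.id, c.name
FROM   orders o
JOIN   customers c ON c.id = o.customer_id
WHERE  o.status = 'shipped';
-- healthy signs: type is ref/const rather than ALL, key is non-NULL,
-- rows/filtered are small, and Extra avoids "Using filesort" / "Using temporary"
```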
(2) The meaning and application scenarios of profiling
Profiling analyzes where a statement's time is spent. When slow SQL cannot be solved with EXPLAIN alone, use profiling to break the statement down in more detail, find which stage consumed most of the time, and identify the performance bottleneck.
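A minimal profiling session (SHOW PROFILE is deprecated in 5.7 in favor of performance_schema, but it still illustrates the idea; the query is hypothetical):

```sql
SET profiling = 1;
SELECT COUNT(*) FROM orders WHERE status = 'shipped';
SHOW PROFILES;               -- recent statements with ids and total durations
SHOW PROFILE FOR QUERY 1;    -- per-stage time breakdown for statement 1
```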
(3) Index-related readings in EXPLAIN
In general, the EXPLAIN result should show an index in use (type is const, ref, and so on, and the key column has a value) rather than a full table scan (type shows ALL). For example, selective columns that appear in WHERE conditions should be indexed. Join columns of the driven table also need indexes, possibly combined with the WHERE-condition columns in a composite index. Where sorting or GROUP BY is needed, consider indexes that satisfy the ordering directly.
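A small sketch of the last point, on a hypothetical orders table: put the WHERE column and the ORDER BY column into one composite index so the sort is satisfied by the index itself.

```sql
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);
EXPLAIN SELECT id FROM orders
WHERE  status = 'shipped'
ORDER  BY created_at;   -- Extra should no longer show "Using filesort"
```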
8. Backup plans: mysqldump and XtraBackup
(1) Backup plan
For instances up to about 100GB, mysqldump works for a daily full backup, because it is lighter and more flexible and the resulting file is smaller (smaller still when compressed); schedule it during off-peak hours. For instances over 100GB, XtraBackup is faster than mysqldump: take a weekly full backup plus daily incremental backups, also during off-peak periods.
(2) Backup and restoration time
Physical backup restores fast; logical backup restores slowly. Speed depends on the machine, especially disk speed; the following numbers are for reference only:
20GB: 2 minutes (mysqldump)
80GB: 30 minutes (mysqldump)
111GB: 30 minutes (mysqldump)
288GB: 3 hours (XtraBackup)
3TB: 4 hours (XtraBackup)
Logical import usually takes more than five times as long as the backup itself.
(3) How to handle backup and restore failures
Prepare thoroughly before restoring so that errors are avoided during recovery: validate the backup after it is taken, and check permissions and disk space before restoring. If an error does occur, adjust according to the error message.
(4) How mysqldump and XtraBackup work
mysqldump
mysqldump is a logical backup. Adding the --single-transaction option produces a consistent backup: the backup session sets its isolation level to REPEATABLE READ (SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ) and then explicitly opens a transaction (START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */), which guarantees that all data read inside the transaction comes from that transaction's snapshot; the table data is then read out. With --master-data=1 the session additionally takes FLUSH TABLES WITH READ LOCK, records SHOW MASTER STATUS (the current binlog position), unlocks immediately, and then reads the tables. When all the data has been dumped, the transaction ends.
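Roughly the statements mysqldump issues for --single-transaction --master-data, simplified for illustration:

```sql
FLUSH TABLES WITH READ LOCK;        -- only when --master-data is requested
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION /*!40100 WITH CONSISTENT SNAPSHOT */;
SHOW MASTER STATUS;                 -- binlog file and position written into the dump header
UNLOCK TABLES;                      -- tables are then read inside the snapshot
```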
Xtrabackup:
XtraBackup is a physical backup. It copies the tablespace files while continuously scanning and copying the redo log generated during the copy. When the InnoDB part of the backup finishes, it runs FLUSH ENGINE LOGS to ensure all redo has been flushed to disk (this interacts with the two-phase commit between redo log and binlog). Because XtraBackup does not copy the binlog, all redo must reach disk, or the last group of committed transactions could be lost. That moment is when the InnoDB backup completes: the copied data files are not consistent by themselves, but replaying the redo collected during the copy window brings them to a consistent state (the same mechanism as crash recovery). For MyISAM and other engines it then takes FLUSH TABLES WITH READ LOCK and copies the files, which makes for a near-perfect hot backup.
9. In a mysqldump backup, what if you want one INSERT ... VALUES () per row in the SQL file? And what if the backup must carry the master's replication position?
Use --skip-extended-insert to get one row per INSERT statement, and --master-data=1 (or 2) to record the master's binlog coordinates in the dump.
Sample output with --skip-extended-insert (abridged):
```
[root@helei-zhuanshu ~]# mysqldump -uroot -p helei --skip-extended-insert
Enter password:
...
  KEY `idx_c1` (`c1`),
  KEY `idx_c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=51 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
--
-- Dumping data for table `helei`
--
LOCK TABLES `helei` WRITE;
/*!40000 ALTER TABLE `helei` DISABLE KEYS */;
INSERT INTO `helei` VALUES (1,32,37,38,'2016-10-18 06:19:24','susususususususususususu');
INSERT INTO `helei` VALUES (2,37,46,21,'2016-10-18 06:19:24','susususususu');
INSERT INTO `helei` VALUES (3,21,5,14,'2016-10-18 06:19:24','susu');
```
10. 500 DB instances: how to restart them in the fastest time
Use a batch SSH tool such as pssh to run the restart command on all the machines that need it. A parallel execution framework such as SaltStack (provided the clients have the salt agent installed) or Ansible (which requires passwordless SSH to the hosts) can likewise operate on many servers at once.
11. InnoDB read and write parameter optimization
(1) Read parameters
Global buffers and local (session) buffers.
Global buffers: innodb_buffer_pool_size, innodb_log_buffer_size, innodb_additional_mem_pool_size
Local buffers (session variables at the server layer, not InnoDB): read_buffer_size, join_buffer_size, sort_buffer_size, key_buffer_size, binlog_cache_size
(2) Write parameters
innodb_flush_log_at_trx_commit
innodb_buffer_pool_size
innodb_change_buffer_max_size (insert/change buffering)
innodb_doublewrite
innodb_write_io_threads
innodb_flush_method
(3) I/O-related parameters
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 0
sync_binlog
innodb_flush_log_at_trx_commit
innodb_lru_scan_depth
innodb_io_capacity
innodb_io_capacity_max
innodb_log_buffer_size
innodb_max_dirty_pages_pct
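A hedged example of inspecting and adjusting a couple of these at runtime (appropriate values depend on your storage; SSD-class numbers shown):

```sql
SHOW GLOBAL VARIABLES LIKE 'innodb_io_capacity%';
SET GLOBAL innodb_io_capacity = 2000;        -- background flushing budget, roughly the disks' IOPS
SET GLOBAL innodb_max_dirty_pages_pct = 75;
```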
(4) Query cache parameters and their application scenarios
query_cache_type and the related query_cache% parameters
The query cache is not suitable for all tables. The main cause of query cache invalidation is that any change to a table invalidates every cached result for that table.
First scenario: read-mostly data. If the data is relatively static (a user list or a product catalog, for example), enabling it can pay off, provided those tables are concentrated in a few databases and writes are rare.
Second scenario: "gaming" a demo or benchmark. For instance, when bidding for a project, turning the query cache on can still deliver a dramatic QPS boost (with the connection pool configured the same). In most cases, though, if writes are the majority and reads are few, do not enable it. For a social-network-style site where 10% of users produce content and the other 90% consume it, enabling it works well; for something like QQ messages or chat, it is deadly.
Enabling the query cache is not recommended for a site with heavily concurrent writes.
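Two read-only checks of whether the cache is earning its keep (note the query cache was removed entirely in MySQL 8.0):

```sql
SHOW VARIABLES LIKE 'query_cache_type';   -- 0 = OFF, 1 = ON, 2 = DEMAND (only SQL_CACHE queries)
SHOW GLOBAL STATUS LIKE 'Qcache%';        -- compare Qcache_hits with Com_select to gauge benefit
```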
12. How do you monitor your database? How do you analyze your slow logs?
There are many monitoring tools, such as Zabbix and Lepus (the latter is what I use here). Slow logs are usually analyzed with tools such as mysqldumpslow or pt-query-digest.
13. Have you done master-slave consistency checks? If so, how? If not, how would you?
There are many tools for checking master-slave consistency, such as checksum, mysqldiff, and pt-table-checksum.
14. A table contains a TEXT column. Do you split it out into a separate table or keep it in place? Write down the reasons for your choice.
A: Splitting brings problems: extra join cost plus the storage overhead of the split; not splitting risks query performance.
If you can tolerate the space problems that come with splitting, it is best to physically co-locate the split data with the primary key of the frequently queried table (partitioning), so sequential I/O reduces the join cost; finally, since this is a text column, add a full-text index to offset the join cost further.
If you can tolerate the query performance loss of not splitting: the scheme above is bound to hit problems under extreme conditions, so leaving the column alone is then the best option.
15. What does the InnoDB row lock actually lock onto? Why?
A: InnoDB implements row locking on indexes:
select * from tab_with_index where id = 1 for update;
FOR UPDATE takes row locks based on the condition only when the filter column (id here) is an indexed column. If id is not indexed, InnoDB falls back to locking the entire table, and concurrency is out of the question.
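A minimal demonstration, assuming a hypothetical table with no index on id:

```sql
CREATE TABLE tab_no_index (id INT, name VARCHAR(10)) ENGINE = InnoDB;
-- session 1:
BEGIN;
SELECT * FROM tab_no_index WHERE id = 1 FOR UPDATE;
-- session 2 now blocks even for WHERE id = 2 FOR UPDATE, because with no
-- index on id InnoDB must scan and lock every row, effectively the whole table
```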
16. How do you restore a single database or a single table from a full mysqldump backup?
Full backup:
```
[root@HE1 ~]# mysqldump -uroot -p --single-transaction -A --master-data=2 > dump.sql
```
Restore only the contents of the erp database (--one-database / -o):
```
[root@HE1 ~]# mysql -uroot -pMANAGER erp --one-database < dump.sql
```
To restore a single table t from the full backup, first extract its table structure:
```
[root@HE1 ~]# sed -e '/./{H;$!d;}' -e 'x;/CREATE TABLE `t`/!d;q' dump.sql
DROP TABLE IF EXISTS `t`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `t` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `age` tinyint(4) NOT NULL DEFAULT '0',
  `name` varchar(30) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
```
Then extract the table's data:
```
[root@HE1 ~]# grep 'INSERT INTO `t`' dump.sql
INSERT INTO `t` VALUES (0,0,''),(1,0,'aa'),(2,0,'bbb'),(3,25,'helei');
```
17. The biggest MySQL problem you have encountered in your current job, and how you solved it
Pick a difficult case you have actually handled, or use the deadlock case discussed in class: the deadlock caused by delayed purge plus an INSERT into a unique index; refer to the course notes for that case study.
18. InnoDB transaction isolation levels and the phenomena at each level
(1) The four transaction isolation levels
1. Read uncommitted
2. Read committed
3. Repeatable read
4. Serializable
(2) Phenomena at each level
Read Uncommitted: dirty, uncommitted data from other sessions can be read.
Read Committed: allows non-repeatable reads but not dirty reads; once another session commits, its data becomes visible to this one.
Repeatable Read: disallows dirty reads and non-repeatable reads; InnoDB additionally prevents phantom reads at this level (through MVCC and next-key locking), which goes beyond the SQL standard.
Serializable: transactions execute strictly one after another rather than concurrently; this is the highest isolation level.
Each isolation level exhibits different phenomena and uses different locking and concurrency mechanisms; the higher the isolation level, the worse the database's concurrency.
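A quick way to inspect and change the level for a session (the variable is @@tx_isolation before MySQL 5.7.20 and @@transaction_isolation afterward):

```sql
SELECT @@tx_isolation;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- this session now sees data committed by others before each statement:
-- no dirty reads, but non-repeatable reads are possible
COMMIT;
```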