Preface
In our LeByte open class, we covered some key points about several flow charts and schematics related to the MySQL database. I have organized them into notes so we can study them together.
1. Master/slave replication diagram of MySQL
The principle of MySQL master-slave replication is a frequent interview question for backend positions at large tech companies, so it is well worth understanding.
The principle of master-slave replication can be summarized in three steps as follows:
The master records all data-changing SQL statements in its binary log, the bin-log (binlog thread);
The slave's I/O thread copies the SQL statements from the master's bin-log into its own relay log;
The slave's SQL thread re-executes these statements from the relay-log file (SQL execution thread).
As shown below:
The master-slave replication above proceeds in five steps:
Step 1: The master writes update events (update, insert, delete) to its binlog.
Step 2: The slave initiates a connection to the master.
Step 3: The master creates a binlog dump thread and sends the binlog contents to the slave.
Step 4: After starting, the slave creates an I/O thread that reads the binlog content sent by the master and writes it to the relay log.
Step 5: The slave also creates an SQL thread that reads the relay log, executes the update events starting from Exec_Master_Log_Pos, and applies the updates to the slave database.
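The five steps above map to a small amount of setup on each side. A minimal sketch, in which the host name, account, password, and log coordinates are all placeholders:

```sql
-- On the master (my.cnf): server-id=1, log-bin=mysql-bin, then restart.
-- Create a replication account (name and password are placeholders):
CREATE USER 'repl'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On the slave (my.cnf): server-id=2, then restart.
-- Point the slave at the master's current binlog position
-- (file name and position come from SHOW MASTER STATUS on the master):
CHANGE MASTER TO
  MASTER_HOST = 'master-host',
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'repl_password',
  MASTER_LOG_FILE = 'mysql-bin.000001',
  MASTER_LOG_POS = 4;
START SLAVE;   -- starts the I/O thread and the SQL thread
-- SHOW SLAVE STATUS\G then reports Exec_Master_Log_Pos, among other fields.
```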
2. Logical architecture diagram of MySQL
It helps to have an architectural picture in your mind of how the MySQL components work together.
The MySQL logical architecture is divided into three layers:
The first layer is responsible for connection handling, authentication, security, and so on.
Each client connection gets a thread in the server process. The server maintains a thread pool, so a thread does not have to be created and destroyed for every new connection.
When a client connects to the MySQL server, the server authenticates it by user name and password, or by SSL certificate.
Once the client is connected, the server continues to verify that it has permission to execute each particular query.
The second layer compiles and optimizes SQL.
This layer includes query parsing, analysis, optimization, caching, and all built-in functions.
For a SELECT statement, the server checks the query cache before parsing the query; if the result is found in the cache, it is returned directly without any parsing or optimization. (Note that the query cache was removed in MySQL 8.0.)
All functionality that spans storage engines is implemented in this layer: stored procedures, triggers, and views.
The third layer is the storage engine.
The storage engine is responsible for storing and retrieving data in MySQL.
Storage engines communicate with the upper layer through APIs that mask the differences between engines, making those differences transparent to the query layer.
The storage engines do not parse SQL, and different storage engines do not communicate with each other; they simply respond to requests from the server layer above.
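Because the engine layer is pluggable, it can be inspected directly from SQL; a quick sketch (the table name is a placeholder):

```sql
-- List the storage engines compiled into this server and which is the default:
SHOW ENGINES;

-- Check which engine a particular table uses:
SHOW TABLE STATUS LIKE 'my_table'\G
```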
3. InnoDB logical storage structure diagram
In the logical storage structure of the InnoDB storage engine, all data is stored in a logical space called a tablespace. A tablespace consists of segments, extents, and pages. Pages are sometimes called blocks in some documents.
InnoDB logical storage structure diagram is as follows:
Tablespace
The tablespace is the highest level of the InnoDB storage engine's logical structure; all data is stored in tablespaces.
By default, InnoDB has a single shared tablespace, ibdata1, in which all data is stored.
If innodb_file_per_table is enabled, each table gets its own tablespace, but note that it holds only the table's data, indexes, and insert buffer bitmap. Other data, such as undo (rollback) information, insert buffer index pages, system transaction information, and the doublewrite buffer, still lives in the original shared tablespace.
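The setting can be checked and changed at runtime; a small sketch:

```sql
-- Check whether each table gets its own .ibd tablespace file:
SHOW VARIABLES LIKE 'innodb_file_per_table';

-- Enable it for newly created tables (existing tables are unaffected
-- until they are rebuilt, e.g. with ALTER TABLE ... ENGINE=InnoDB):
SET GLOBAL innodb_file_per_table = ON;
```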
Segment
A table space consists of segments, such as data segments, index segments, and rollback segments.
InnoDB storage engine tables are index-organized, so the data is the index and the index is the data. The data segment holds the leaf nodes of the B+ tree, and the index segment holds the non-leaf nodes.
Segment management is handled by the InnoDB engine itself; a DBA cannot, and need not, control it.
Extent
An extent is a space made up of contiguous pages; every extent is 1MB in size, without exception.
To ensure that pages within an extent are contiguous, the InnoDB storage engine requests 4-5 extents from disk at a time.
By default, InnoDB pages are 16KB in size, so a single extent contains 64 contiguous pages.
Page
Pages are the smallest unit of InnoDB disk management;
In the InnoDB storage engine, the default page size is 16KB;
Starting with InnoDB 1.2.x, you can set the page size to 4K, 8K, or 16K with the innodb_page_size parameter.
Common page types in the InnoDB storage engine include data pages, undo pages, system pages, transaction system pages, insert buffer bitmap pages, and insert buffer free list pages.
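The configured page size can be verified from SQL; note that it cannot be changed on a live instance:

```sql
-- innodb_page_size is read-only at runtime; it must be set in my.cnf
-- before the data files are first created:
SHOW VARIABLES LIKE 'innodb_page_size';
```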
4. Schematic diagram of InnoDB page structure
4.1 InnoDB page structure monomer diagram
The InnoDB data page consists of the following 7 sections, as shown in the figure below:
The File Header, Page Header, and File Trailer have fixed sizes of 38, 56, and 8 bytes respectively. These sections record metadata about the page, such as its checksum and the level of the B+ tree index at which the page sits. The User Records, Free Space, and Page Directory sections hold the actual row records, so their sizes are dynamic.
The table below gives a rough description of the seven parts:
4.2 Storage flow chart recorded on a page
Whenever we insert a record, space equal to the record's size is taken from the Free Space section, that is, the unused storage, and assigned to the User Records section. When all of Free Space has been consumed by User Records, the page is full, and inserting a further record requires allocating a new page. This process is illustrated below:
4.3 Data structure diagram of different Innodb pages
A table can hold thousands of records, and a page is only 16KB, so many pages may be needed to hold all the data. The File Header is the first part of the InnoDB page; its FIL_PAGE_PREV and FIL_PAGE_NEXT fields hold the page numbers of the previous and next pages, that is, the pointers of a doubly linked list of pages.
5. Innodb index structure diagram
Let’s first look at a sample data table, assuming Col1 is the primary key, as follows:
5.1 B+ tree clustered index structure diagram
A clustered index is the index built on the primary key;
A clustered index stores the table's rows in its leaf nodes.
5.2 Structure diagram of non-clustered indexes
Assuming that the index is Col3, the index structure diagram is as follows:
A non-clustered index is an index built on a non-primary-key column;
A non-clustered index stores the indexed column and the primary key in its leaf nodes;
When querying through a non-clustered index, the primary key found in the leaf node is used to fetch the desired row from the clustered index (this extra primary-key lookup is called a back-table query);
If every column the query needs is contained in the index itself, the index is called a covering index, and the back-table lookup is avoided.
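A sketch of the back-table vs. covering-index distinction, using a hypothetical table shaped like the sample above (the table and column names are illustrative):

```sql
-- Col1 is the clustered (primary key) index; Col3 has a secondary index.
CREATE TABLE t (
  Col1 INT PRIMARY KEY,
  Col2 INT,
  Col3 VARCHAR(20),
  KEY idx_col3 (Col3)
) ENGINE = InnoDB;

-- Back-table query: the secondary index yields Col1, which is then used
-- to look up Col2 in the clustered index:
SELECT Col2 FROM t WHERE Col3 = 'Alice';

-- Covering index: every selected column is already in idx_col3
-- (secondary index leaves store the index column plus the primary key),
-- so no back-table lookup is needed; EXPLAIN shows "Using index":
EXPLAIN SELECT Col1, Col3 FROM t WHERE Col3 = 'Alice';
```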
5.3 InnoDB Lock type mind map
5.4 Locking Mechanism
Optimistic locking and pessimistic locking are two concurrency control ideas that can be used to solve the lost update problem.
Optimistic locking
Every time data is fetched, we optimistically assume there will be no concurrency problem;
Therefore, no lock is taken while reading and processing the data;
At update time, however, a version number or timestamp is used to detect conflicts: if there is a conflict, the transaction is retried or aborted; if not, it commits.
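The version-number variant can be sketched in plain SQL; the account table and its columns here are hypothetical:

```sql
-- 1. Read the row and remember its version:
SELECT balance, version FROM account WHERE id = 1;   -- say version = 5

-- 2. Update only if nobody has changed the row in the meantime:
UPDATE account
SET    balance = balance - 100,
       version = version + 1
WHERE  id = 1 AND version = 5;
-- If the affected-row count is 0, another transaction won the race,
-- so re-read and retry (or abort) instead of overwriting its update.
```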
Pessimistic locking
Every time data is fetched, we pessimistically assume that it will be modified by others and that concurrency problems will occur;
Therefore, an exclusive lock is taken before the data is accessed and processed;
The data stays locked throughout processing, and the lock is released only when the transaction commits or rolls back.
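The pessimistic approach maps directly onto a locking read; a sketch against the same hypothetical account table:

```sql
START TRANSACTION;
SELECT balance FROM account WHERE id = 1 FOR UPDATE;  -- exclusive row lock
-- ... compute the new balance in the application ...
UPDATE account SET balance = balance - 100 WHERE id = 1;
COMMIT;   -- the lock is released here (or on ROLLBACK)
```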
5.5 Lock granularity
Table lock: low overhead, fast to acquire; coarse granularity, high probability of lock conflicts, low concurrency; no deadlocks.
Row lock: high overhead, slow to acquire; deadlocks can occur; fine granularity, low probability of lock conflicts, high concurrency.
Page lock: overhead and locking speed between table and row locks; deadlocks can occur; granularity between table and row locks, moderate concurrency.
5.6 Lock compatibility
Shared lock
Also known as read lock (S lock);
If one transaction holds a shared lock, other transactions can also acquire the shared lock but cannot acquire an exclusive lock: they can read the data but not write it.
SELECT … LOCK IN SHARE MODE acquires a shared lock.
Exclusive lock
Also known as write lock (X lock).
If transaction T places an exclusive lock on data A, other transactions cannot place any type of lock on data A. Transactions that are granted exclusive locks can both read and modify data.
SELECT … FOR UPDATE explicitly acquires an exclusive lock.
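The compatibility rules can be seen by running two sessions side by side; a sketch against the hypothetical account table:

```sql
-- Session A: take a shared (S) lock on the row.
START TRANSACTION;
SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;

-- Session B: another shared lock succeeds immediately...
SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;  -- OK
-- ...but an exclusive (X) lock blocks until session A commits:
SELECT * FROM account WHERE id = 1 FOR UPDATE;          -- waits
```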
5.7 Lock modes
Record lock: a lock on the corresponding index record for a row, locking a row record;
Gap lock: a lock on the index record interval, locking an interval;
Next-key lock: a combination of a record lock and a gap lock on the gap before the index record, locking the row record plus the interval;
Intention locks: designed to allow locks of multiple granularities to coexist.
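Gap and next-key locks show up in range queries under REPEATABLE READ; a sketch, assuming a hypothetical table t whose indexed id column currently holds the values 5 and 10:

```sql
-- A locking read on a range takes next-key locks: the matching records
-- plus the gaps around them, which prevents phantom inserts.
START TRANSACTION;
SELECT * FROM t WHERE id BETWEEN 5 AND 10 FOR UPDATE;

-- Another session's INSERT INTO t (id) VALUES (7) now blocks,
-- because the gap between 5 and 10 is covered by a next-key lock.
```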
This article is reprinted from LeByte.