The difference between a normal index and a unique index is that the values in a column covered by a normal index may repeat, while the values covered by a unique index may not.
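As a minimal sketch of that difference (the table and column names here are made up for illustration), the two kinds of index differ only in their definition:

```sql
-- Hypothetical table: k carries a normal (non-unique) index,
-- k_uniq carries a unique index.
CREATE TABLE test (
  id INT PRIMARY KEY,
  k INT NOT NULL,
  k_uniq INT NOT NULL,
  INDEX idx_k (k),            -- normal index: duplicate values allowed
  UNIQUE INDEX uk_k (k_uniq)  -- unique index: duplicate values rejected
) ENGINE = InnoDB;

INSERT INTO test (id, k, k_uniq) VALUES (1, 10, 10);
INSERT INTO test (id, k, k_uniq) VALUES (2, 10, 20);   -- ok: k may repeat
-- INSERT INTO test (id, k, k_uniq) VALUES (3, 30, 10); -- would fail: duplicate k_uniq
```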
First, the query process
Take select id from test where k=10; as an example. InnoDB searches the B+ tree level by level from the root down to the leaf layer; after locating the corresponding data page, it performs a binary search within the page to position the record.
- For a normal index, after finding the first record that satisfies k=10, the scan continues until it hits the first record that does not satisfy k=10.
- For a unique index, the result is returned as soon as the first record satisfying k=10 is found.
Because a unique index guarantees uniqueness, at most one record can match, whereas a normal index may match several.
What is the difference in performance between the two cases?
As we know, InnoDB reads and writes data in units of data pages. That is, by the time the record with k=10 has been found, the data page containing it is already in memory, and the records on it are read sequentially, so reading one record and reading several cost almost the same. Of course, if the next matching record happens to sit on the next data page, the scan takes slightly longer, but that is a low-probability event. In short, normal indexes and unique indexes are about equally efficient for queries.
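The page is the unit of all those reads and writes; InnoDB's default page size is 16 KB, and the value in use on a given instance can be confirmed with:

```sql
-- Page size used for all InnoDB reads and writes (16384 bytes by default)
SHOW VARIABLES LIKE 'innodb_page_size';
```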
Second, the update process
1. Change buffer
When a data page needs to be updated, there are two situations:
- If the data page is in memory, it is updated directly and marked dirty.
- If the data page is not in memory, the update is cached in the change buffer. The next time that data page needs to be read, it is loaded into memory and the operations cached in the change buffer are applied to it, which keeps the data consistent.
Despite its name, the change buffer is persistent data: it has a copy in memory and is also written to disk.
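The in-memory copy lives in the buffer pool, and its maximum size is configurable as a percentage of the buffer pool. The current settings can be checked with:

```sql
-- innodb_change_buffer_max_size is a percentage of the buffer pool
-- (25 by default, 50 at most); innodb_change_buffering controls which
-- operations may be buffered
SHOW VARIABLES LIKE 'innodb_change_buffer%';
```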
2. Merge
The process of applying the operations cached in the change buffer to the original data page is called merge. A merge is triggered when the data page is read, when a background thread merges periodically, or when the database shuts down normally.
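Merge activity can be observed in the standard engine status output; the relevant section reports how many buffered operations have been merged:

```sql
-- The "INSERT BUFFER AND ADAPTIVE HASH INDEX" section of this output
-- reports change buffer size and merge counts
SHOW ENGINE INNODB STATUS\G
```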
Obviously, caching updates in the change buffer has two advantages: 1. it reduces random disk reads, so statements execute faster; 2. it avoids reading data pages into memory just to apply an update, which improves memory utilization.
3. Application scenarios
Of course, using the change buffer does not improve performance in every situation.
Unique index or normal index?
For a unique index, every update must load the data page into memory to check whether the uniqueness constraint would be violated. Since the page has to be read anyway, the cost of the random read has already been paid, so caching the operation brings no benefit; updates to unique indexes therefore cannot use the change buffer.
For a normal index, there is no uniqueness constraint to check, so the update can be cached in the change buffer without reading the data page first, which reduces the number of random disk I/Os.
Is a normal index always better?
If the business scenario requires reading a record immediately after writing it, does the change buffer still help?
No: the read triggers a merge right after the update has been written to the change buffer. The number of random disk I/Os does not decrease, and the cost of maintaining the change buffer is added on top.
In other words, the change buffer suits write-heavy, read-light workloads, where many updates can be cached before the next merge and the benefit is greatest. Typical examples of this access pattern are billing and logging systems.
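When a workload really is read-immediately-after-write, change buffering can be restricted or turned off; innodb_change_buffering accepts none, inserts, deletes, changes, purges, or all (the default):

```sql
-- Disable change buffering for workloads that read rows back right after writing them
SET GLOBAL innodb_change_buffering = 'none';

-- Re-enable it for write-heavy, read-light tables such as logging or billing data
SET GLOBAL innodb_change_buffering = 'all';
```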
Third, change buffer and redo log
The redo log is the core of WAL (write-ahead logging): it improves performance mainly by saving the I/O cost of random writes to disk, turning them into sequential log writes. The change buffer, by contrast, saves the I/O cost of random reads from disk.
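To see the division of labor, here is a hedged sketch of one more insert into the test table sketched earlier, assuming (purely for illustration) that the target leaf page of the normal index idx_k is not currently in the buffer pool:

```sql
INSERT INTO test (id, k, k_uniq) VALUES (4, 40, 40);

-- In outline, under the assumption above:
--   idx_k (normal index) : the change is recorded in the change buffer,
--                          saving a random read of the page right now.
--   uk_k (unique index)  : the page must be read into memory anyway to
--                          check uniqueness, so it is updated directly.
--   Both modifications are written to the redo log, which saves the
--   random writes; the pages are flushed to disk later.
```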