Hello, I’m Leo. Currently, I am engaged in Java backend development in Changzhou. This is the second article in the MySQL Learning collation series. This series will be output together with byte, netease, Ali, Tencent, Meituan and Kuaishou. Hope to help more friends as soon as possible into the factory!

After reading this article, don’t say you have learned MySQL.

After reading this article, stop saying that you have learned MySQL.

Small knowledge, big challenge! This article is participating in the creation activity of “Essential Tips for Programmers”.

6. The ID of the primary key is automatically added

MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL MySQL

6.1 the advantages and disadvantages

  • Using the primary key to increment the ID saves half the disk space compared to the UUID
  • Range like query, the performance of the self-added ID is better than that of the UUID. (But with caching, there is little difference in performance.)
  • Write test, increment ID is 4 times UUID

6.2 Implementation Principles

location

The self-increment is stored in the table structure definition. The actual table structure definition is stored in the.frm file, but the self-increment is not saved.

A self-value-added saving policy

  • The MyISAM engine’s self-increment is saved in data files.
  • In innodb5.7 and earlier versions, self-increment is stored in memory and not persisted. After each restart, the first time you open the table, you will find the maximum value of self-increment and use the maximum value +1 as the current self-increment ID.
  • Innodb8.0 saves the auto-increment in Redolog and relies on Redolog to restore the value before the restart

Self-value-added modification mechanism

If a field is increment, insert a row of data

  1. Alter TABLE AUTO_INCREMENT increment (int id, int id, int id); alter table AUTO_INCREMENT (INT id, int id, int id, int id);
  2. If the id field specifies a value when inserting data, the value specified in the statement is used.
  • If the value to be inserted is less than auto-increment, the auto-increment of the table remains the same
  • If the value to be inserted is greater than or equal to the self-increment, you need to change the current self-increment to the new one

The increment increment algorithm is as follows: start with auto_increment_offset and increment with auto_INCREment_increment until the first value greater than X is found as the new increment. The default value is 1

The timing of self-value-added modifications

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c` (`c`)
) ENGINE=InnoDB;
Copy the code

Suppose there is one (1,1,1) data in the table. If (null, 1,1) is inserted. The execution process is as follows

  1. The executor calls the InnoDB engine interface to write a line that is passed with the value (0,1,1);
  2. InnoDB finds that the user does not specify the value of increment id, obtains the current increment value of t by 2, and then inserts 2 into (2,1,1).
  3. And then you change the auto-increment to 3
  4. Insert because c is the only index. So inserting (2,1,1) will give you an error.

That’s when the problem arises, the value of 2 is not inserted, and the self-increment is not changed back, so you have a discontinuous situation

Another case is transaction issues. So that’s 2 mechanics and 1 strategy. Let’s just cut the shit out of it.

To avoid two transactions applying to the same increment ID, you must lock and apply sequentially.

  • Transaction A applies for an ID value when executing an insert statement. If transaction A applies for an ID value of 3, then the increment value will be 4
  • Transaction B (ID = 4); transaction B (primary key = 5); transaction B (primary key = 5)

Above are two transactions executing an insert statement if one of them fails or is rolled back. So there’s a gap that’s missing. The original data with ID 4 is missing. It’s also 5, and it hasn’t changed back to 4.

6.3 ID Discontinuity Problem

  1. The auto-increment was not changed back after unsuccessful insertion, resulting in discontinuity
  2. Transaction rollback problem

6.4 Why does MySQL not change ID back

MySQL didn’t change the ID back because of performance concerns!

We can take two counter examples. If you want to return, you must determine whether the returned ID exists. So how to judge, is definitely to look up the table.

If the table ID exists before applying for an ID, performance is compromised. Getting an ID would have been a quick operation, but now you have to check whether the ID exists on the primary key index.

Performance: dare cut my performance I hammer death you.

In another case, the lock is released after a transaction commit is confirmed. In this way, although security can be guaranteed, but the granularity of the lock is too large, and the system concurrency is greatly reduced.

So the business side of the balance, still choose performance, did not change the ID back.

6.6 on the lock

Here is a brief history for the interviewer

Version 5.0

The system uses the safer, less concurrency approach we described above. A statement applies for a self-added lock, which will be released after the statement is executed.

5.1.22 version

Compared to version 5.0, a policy innodb_autoinc_lock_mode was introduced

  • If the value is set to 0, the auto-add lock policy is released after the statement is executed, which is the policy of version 5.0
  • If set to 1, the INSERT statement is released immediately after the request. If it is inser-select, it will wait for the statement to complete.
  • If the value is set to 2, all self-added locks are released immediately after being applied to increase the concurrency

Analyze the insert-select problem

Recall the transactions that were written to the log. If the binlog format is statement, how do they record it?

If insert-select uses the release immediately after application scheme. Multiple data only apply primary key ID, will appear two transaction intersection form. That is, the ID data for a transaction is not contiguous.

Once both transactions are executed at the same time, either transaction A or transaction B is written first. Whenever a binlog is taken from the library and executed, insert-select returns with consecutive ids, the library’s data is inconsistent.

MySQL currently references version 5.1.22

6.7 What Can I Do If THE NUMBER of ids Exceeds the Upper Limit

This knowledge point is also a topic that big factory often asks. Because the data volume of big factory is relatively large, such a scene will indeed be encountered.

If the upper limit of primary key ID is increased, the original data will be overwritten. The line is 4,294,967,295, nearly 4.3 billion.

From this point of view, we should still actively create autoincrement primary keys in InnoDB tables. This is because it is more acceptable to report a primary key conflict error when a table is inserted after its self-increment ID reaches the upper limit.

Overwriting data, after all, means data loss, affecting data reliability; Report primary key conflict, insert failure, affect availability. In general, reliability takes precedence over availability.

If a table does not have a primary key, MySQL creates a hidden field, row_ID, by default. By default, primary key ids are 8 bytes long, and row_id is 6 bytes long.