1 lock system

InnoDB is a common storage engine in MySQL. We often encounter slow SQL or deadlocks, so it is necessary to understand the locking mechanism under DDL and DML statements.

InnoDB storage engine locks can be divided into different dimensions:

  • According to the lock granularity, it can be divided into table lock and row lock

  • According to the compatibility of locks, they can be divided into shared locks and exclusive locks

1.1 the lock mode

1.1 intention to lock

I Intended Shared lock (LOCK_IS)
  • Table level lock: To add a shared lock to the corresponding record row, obtain the intended shared lock under the corresponding table or the table level lock with higher lock strength
Ii Intended Exclusive Lock (LOCK_IX)
  • Table level lock, if you need to add exclusive lock on the corresponding record row, you must first obtain the corresponding intention exclusive lock under the corresponding table or table level lock with higher lock strength

Common intent locks, such as SELECT… SQL > SELECT * from table_share; SQL > SELECT * from table_share FOR UPDATE, add table level intention exclusive lock first.

1.2 Shared Lock (LOCK_S)

The shared lock is used to read a row in a transaction. It is not expected that the data row will not be modified by other transaction locks, but the LOCK_S lock generated by all read operations does not conflict, so as to improve the read concurrency

The SELECT… IN SHARE MODE

Normal queries assign LOCK_S locks to records at the SERIALIZABLE isolation level

For normal INSERT/UPDATE, a LOCK_S lock is applied when a duplicate key is detected (or if there is a duplicate key marked for deletion)

1.3 Exclusive Lock (LOCK_X)

The main purpose of exclusive locking is to avoid concurrent modification control of the same record. UPDATE or DELETE, SELECT… FOR UPDATE, all records are locked exclusively. Only the transaction that owns the lock can read and modify the row, while other transactions block. The lock is exclusive, and only one transaction can hold an exclusive lock on the same row at a time

1.4 Auto-Add Lock (LOCK_AUTO_INC)

An AUTO_INCREMENT lock is triggered when an inserted table contains AUTO_INCREMENT columns. When an insert table has an autoincrement column, the database needs to automatically generate an AUTO_INC table lock for the table before generating an autoincrement column. Insert operations of other transactions are blocked to ensure that the generated autoincrement column is unique. The lock logic of AUTO_INC is related to InnoDB’s lock mode, which is controlled by the innodb_autoinc_lock_mode parameter

1.2 Row lock types

Row locks are applied to clustered indexes and secondary indexes to avoid dirty and phantom reads by locking at different isolation levels

1.2.1 Record Lock (LOCK_REC_NOT_GAP)

A record lock is a row lock. Its main purpose is to lock the current database’s record rows. Record locks are used at the RC and RR isolation levels

1.2.2 Gap Lock (LOCK_GAP)

A GAP lock is a lock that is added to an index record range to lock a range without locking the record itself. It can be understood as an interval lock. Generally, GAP locks are used in RR isolation. If you do not want to use GAP locks, you can avoid GAP locks by switching to the RC isolation level or by enabling the innodb_locks_unsafe_for_binlog option

1.2.3 LOCK_ORDINARY

A combination of record locks and gap locks. Next-key LOCK in MySQL RR isolation level can solve phantom read problems in RR isolation level. Phantom reading is the execution of the same query within the same transaction, will find different row records. For example, if the index contains the values 100, 101, and 230, the existing RR locks are as follows:

  • (- up, 100]
  • (100, 101]
  • (101, 230]
  • (230 + up)

1.2.4 Insert Intention (LOCK_INSERT_INTENTION)

INSERT INTENTION is a type of GAP lock, which is mainly used to improve INSERT concurrency and avoid phantom exception problems. If multiple transactions are inserted into the same GAP, they do not need to wait for each other. For example, there are records 5 and 11 on the current index, and records 4 and 6 are inserted at the same time by two concurrent transactions. They add GAP locks for (5,11), but do not conflict with each other (because inserted records do not conflict)

2 Source code implementation

The general overview of lock-related structures is shown below:

2.1 the lock system

/** * lock system structure */
struct lock_sys_t{
  / / the mutex
  ib_mutex_t	mutex;
  // The hash key of the record lock hash table is formed by the record lock spaceID and page no, and value is lock_t
  hash_table_t*	rec_hash;
  ib_mutex_t	wait_mutex;	
  // Wait to lock the suspended thread
  srv_slot_t*	waiting_threads;
  ibool		rollback_complete;
  // Maximum lock wait time
  ulint		n_lock_max_wait_time;
  os_event_t	timeout_event;
  // Whether there are active timeout threads
  bool		timeout_thread_active;
};
Copy the code
  • The primary member variable is onehash table, used to manage the lock object for global active transaction creation

2.2 lock structure

/** * Lock object structure */
struct lock_t {
  // The transaction that owns the lock
  trx_t*		trx;
  // The chain list held by the transaction
  UT_LIST_NODE_T(lock_t) trx_locks;
  / / lock mode and type lock_type | type_mode
  ulint		type_mode;
  // Record the lock hash chain node
  hash_node_t	hash;
  
  // Record the lock index
  dict_index_t*	index;
  // Lock information is managed in union mode to save space
  union {
    / / table locks
    lock_table_t	tab_lock;
    / / row locks
    lock_rec_t	rec_lock;
  } un_member;
  };
Copy the code
  • Attribute variable hash: When the lock is inserted into lock_sys->hash, the linked list is formed using the variable hash.

  • Type_mode: lock mode and type

The lock mode

/** * Lock mode */
enum lock_mode {
    // S intent lock
  LOCK_IS = 0.// X intent lock
  LOCK_IX,
  // S share lock
  LOCK_S,
  // X exclusive lock
  LOCK_X,	  
  / / on the lock
  LOCK_AUTO_INC
  };
Copy the code

The lock type

// Lock mode mask code
#define LOCK_MODE_MASK	0xFUL
// Table lock 5 bit size 16
#define LOCK_TABLE	16
// Row lock 6 bit size 32
#define	LOCK_REC	32
// Lock type mask code
#define LOCK_TYPE_MASK	0xF0UL
// The lock waits for the identifier
#define LOCK_WAIT	256

// Row lock mode type
// indicates next-key lock, which locks the record itself and the corresponding gap
#define LOCK_ORDINARY	0
// The gap before the record is locked (the record itself is not locked)
#define LOCK_GAP	512
/ / record lock
#define LOCK_REC_NOT_GAP 1024
// Insert the intent lock
#define LOCK_INSERT_INTENTION 2048
// indicates that the lock is created by another transaction (e.g. implicit lock conversion)
#define LOCK_CONV_BY_OTHER 4096 
Copy the code

Uint32_t LOCK_T ::type_mode

The 0-3 bits indicate the lock mode, including intentional shared lock, intentional exclusive lock, shared lock, exclusive lock or autoincrement lock

Four bits indicate that the lock is a table type, one indicates that it is a table lock, and 0 indicates that it is not

8 bits indicates whether the lock is waiting

Digits 9 to 31 indicate the record lock type

In c++ code implementation, different modes of lock representation methods are shown in the following table

The lock mode (type_mode) Lock the notation
Record locks LOCK_X | LOCK_REC_NO_GAP
Clearance lock LOCK_X | LOCK_GAP
Key in the lock LOCK_X | LOCK_ORDINARY
Insert intent lock LOCK_X | LOCK_GAP | LOCK_INSERT_INTENTION

Lock type judgment

    switch (lock_get_type_low(lock)) {
      / / table locks
      case LOCK_TABLE:
      iter->bit_no = ULINT_UNDEFINED;
      break;
      / / record lock
      case LOCK_REC:
      iter->bit_no = lock_rec_find_set_bit(lock);
      ut_a(iter->bit_no ! = ULINT_UNDEFINED);break;
      default:
      ut_error;
    }

    /** * get the lock type */
    ulint lock_get_type_low(const lock_t*	lock)
    {
      ut_ad(lock);
      return(lock->type_mode & LOCK_TYPE_MASK);
    }
Copy the code

Lock_type now only uses bits 5 and 6, indicating whether a table lock is a row lock

Lock mode judgment

enum lock_mode lock_get_mode(
  const lock_t*	lock){
  ut_ad(lock);
  return(static_cast<enum lock_mode>(lock->type_mode & LOCK_MODE_MASK));
}
Copy the code

Record lock type judgment

// Check the gap lock
ulint lock_rec_get_gap(const lock_t*	lock){
  ut_ad(lock);
  ut_ad(lock_get_type_low(lock) == LOCK_REC);
  return(lock->type_mode & LOCK_GAP);
}
// Record the lock judgment
ulint lock_rec_get_rec_not_gap(const lock_t*	lock)
{
  ut_ad(lock);
  ut_ad(lock_get_type_low(lock) == LOCK_REC);
  return(lock->type_mode & LOCK_REC_NOT_GAP);
}
// Insert intent lock judgment
ulint lock_rec_get_insert_intention(const lock_t*	lock)	
{
  ut_ad(lock);
  ut_ad(lock_get_type_low(lock) == LOCK_REC);
  return(lock->type_mode & LOCK_INSERT_INTENTION);
}
Copy the code

The un_member member variable indicates that lock_t is either a table lock or a row lock

/** * table lock structure */
/** A table lock */
struct lock_table_t {
  // Database table structure
  dict_table_t*	table;
  // Lock on the same table in the database
  UT_LIST_NODE_T(lock_t)locks;
};

/** ** row lock structure */
struct lock_rec_t {
  // Id of the table space
  ulint	space;
  // Corresponding page page number
  ulint	page_no;
  // The bitmap structure is used to determine the number of bits in a page that are locked
  ulint	n_bits;
};
Copy the code

[space, page_no] determines which page the lock corresponds to, and heap_no is used to indicate the row number on the page. A row can be uniquely identified with [space, page_no, heap_no]. InnoDB uses an N_bits bitmap to indicate which rows are locked

Table locks and record locks share the data structure LOCK_T

Row locking is managed in the unit of page. A transaction creates only one lock_T object on the same page. The heap NO, which is uniquely identified in the page, is queried into bitmap to determine whether the row is locked

2.3 Lock Compatibility and lock strength

  • Add locking strength matrix
/* STRONGER-OR-EQUAL RELATION (mode1=row, mode2=column) * IS IX S X AI * IS + - - - - * IX + + - - - * S + - + - - * X + + + + + * AI - - - - + * See lock_mode_stronger_or_eq(). */
static const byte lock_strength_matrix[5] [5] = {
 /** IS IX S X AI */
 /* IS */ {  TRUE,  FALSE, FALSE,  FALSE, FALSE},
 /* IX */ {  TRUE,  TRUE,  FALSE, FALSE,  FALSE},
 /* S */ {  TRUE,  FALSE, TRUE,  FALSE,  FALSE},
 /* X */ {  TRUE,  TRUE,  TRUE,  TRUE,   TRUE},
 /* AI */ {  FALSE, FALSE, FALSE, FALSE,  TRUE}
};
Copy the code

When adding a lock, first determine whether the current transaction has been added to the lock of the same level or a higher level, such as LOCK_X unnecessary to add LOCK_S

  • Lock mutex matrix
/* LOCK COMPATIBILITY MATRIX * IS IX S X AI * IS + + + - + * IX + + - - + * S + - + - - * X - - - - - * AI + + - - - * *  Note that for rows, InnoDB only acquires S or X locks. * For tables, InnoDB normally acquires IS or IX locks. * S or X table locks are only acquired for LOCK TABLES. * Auto-increment (AI) locks are needed because of * statement-level MySQL binlog. * See also lock_mode_compatible(). */
static const byte lock_compatibility_matrix[5] [5] = {
 /** IS IX S X AI */
 /* IS */ {  TRUE,  TRUE,  TRUE,  FALSE,  TRUE},
 /* IX */ {  TRUE,  TRUE,  FALSE, FALSE,  TRUE},
 /* S */ {  TRUE,  FALSE, TRUE,  FALSE,  FALSE},
 /* X */ {  FALSE, FALSE, FALSE, FALSE,  FALSE},
 /* AI */ {  TRUE,  TRUE,  FALSE, FALSE,  FALSE}
};
Copy the code

If the lock is the same transaction as the current lock, return false without waiting. If the lock is compatible with the basic lock type of the current lock, it does not need to wait. Otherwise, the lock wait is entered

3 reference

Official document InnoDB Locking

InnoDB transaction lock system introduction

Recommended reading

Chapter 2 of the JVM series – Class files to virtual Machines

Dapr Combat (part 1)

Dapr Combat part ii

DS version control core principles revealed

DS 2.0 era API operation posture

, recruiting

Zhengcaiyun Technology team (Zero) is a passionate, creative and executive team based in picturesque Hangzhou. The team has more than 300 r&d partners, including “old” soldiers from Alibaba, Huawei and NetEase, as well as newcomers from Zhejiang University, University of Science and Technology of China, Hangzhou Electric And other universities. Team in the day-to-day business development, but also in cloud native, chain blocks, artificial intelligence, low code platform system, middleware, data, material, engineering platform, the performance experience, visualization technology areas such as exploration and practice, to promote and fell to the ground a series of internal technical products, continue to explore new frontiers of technology. In addition, the team is involved in community building, Currently, There are Google Flutter, SciKit-Learn, Apache Dubbo, Apache Rocketmq, Apache Pulsar, CNCF Dapr, Apache DolphinScheduler, and Alibaba Seata and many other contributors to the excellent open source community. If you want to change something that’s been bothering you, want to start bothering you. If you want to change, you’ve been told you need more ideas, but you don’t have a solution. If you want change, you have the power to make it happen, but you don’t need it. If you want to change what you want to accomplish, you need a team to support you, but you don’t have the position to lead people. If you want to change the original savvy is good, but there is always a layer of fuzzy window…… If you believe in the power of believing, believing that ordinary people can achieve extraordinary things, believing that you can meet a better version of yourself. If you want to be a part of the process of growing a technology team with deep business understanding, sound technology systems, technology value creation, and impact spillover as your business takes off, I think we should talk. Any time, waiting for you to write something and send it to [email protected]

Wechat official account

The article is published synchronously, the public number of political cloud technology team, welcome to pay attention to