Summary: This article will introduce the common data structures and their meanings in MDL systems, then discuss the acquisition mechanism and deadlock detection of MDL from an implementation perspective, and finally share how to monitor MDL status in practice.
The author source of mooring song | | ali technology to the public
A background
In order to meet the transaction isolation and consistency requirements of database under concurrent requests, and to play a role in various storage engines of MySQL plug-in, MySQL implements the Metadata Locking (MDL) mechanism in the Server layer. For example, when a transaction accesses a resource in the database, it can restrict other concurrent transactions from deleting that resource. This is a logical lock. Unlike the limited mutex provided by the operating system kernel, MDL can flexibly customize the lock object, lock type and priority of different lock types, and can even dynamically adjust the compatibility of different lock types in different system states. Greatly convenient database to various query requests for reasonable concurrency control.
This article will introduce the common data structures and their meanings in MDL systems, then discuss the acquisition mechanism and deadlock detection of MDL from an implementation perspective, and finally share how to monitor MDL status in practice.
Ii basic Concepts
1 MDL_key
MDL objects are described in key-value pairs, with each key representing a unique lock object (value representing a database resource). Key is represented by MDL_key, which represents the name of the object as a string.
The complete string consists of namespaces, names for each level of the hierarchy, and multiple namespaces to distinguish objects with the same name of different types. A namespace consists of different object types that can be created in a database, such as GLOBAL, SCHEMA, TABLE, FUNCTION, and PROCEDURE.
The name of an object can be composed of many layers, depending on the type. For example, table objects are uniquely described by database and table names; If it is a SCHEMA object, there is only the database name level. Names are separated by the string terminator ‘\0’. Therefore, the string composed of these parts as a whole can be used as a unique representation of some kind of database object for key.
2 enum_mdl_type
For the same database object, different queries have different access patterns, such as SELECT statements that want to read the contents of the object, INSERT/UPDATE statements that want to modify the contents of the object, and DDL statements that want to modify the structure and definition of the object. These statements have different requirements for object impact and concurrency isolation, so MySQL defines different types of MDL and compatibility between them to control concurrent access to these statements.
The types of MDL are represented by enum_MDl_type. The most common types include:
- MDL_SHARED(S), which can share metadata for accessing objects, such as the SHOW CREATE TABLE statement
- MDL_SHARED_READ(SR), which can share data from objects accessed, such as SELECT statements
- MDL_SHARED_WRITE(SW), which can modify object data, such as INSERT/UPDATE statements
- MDL_SHARED_UPGRADABLE(SU), an upgradable shared lock that later upgradable to stronger locks (such as X locks that block concurrent access), such as phase 1 of DDL
- MDL_EXCLUSIVE(X), an exclusive lock that blocks concurrent access to the object by other threads, can modify the metadata of the object, such as phase 2 of the DDL
Different query statements request different types of MDL, combined with flexible compatibility customization between different types of MDL, allowing concurrency control over conflicting statements. The default compatibility between different types of MDL for the same object is described below.
Different types of MDL compatibility
MySQL divides lock types into range locks and object locks.
1) Range locking
Scope locks have a few types (IX, S, and X) and are mainly used for objects in the GLOBAL, COMMIT, TABLESPACE, BACKUP_LOCK, and SCHEMA namespaces. The compatibility of these types is simple and mainly restricts concurrent operations as a whole. For example, global read locks block transaction commits, and DDL updates meta information of table objects block SCHEMA level modification operations by requesting a schema-wide intent exclusive lock (IX).
These types of MDL compatibility relationships are defined by two matrices. For the same object, one is the compatibility of the acquired MDL type with the new request type; The other is the pending compatibility of the MDL request type to the new request type. Because IS(INTENTION_SHARE) IS compatible with other locks in all cases, it can be ignored in MDL systems.
| Type of active | Request | scoped lock | type | IS(*) IX S X | ---------+------------------+ IS | + + + + | IX | + + - - | S | + - + - | X | + - - - | | Pending | Request | scoped lock | type | IS(*) IX S X | ---------+-----------------+ IS | + + + + | IX | + + - - | S | + + + - | X | + + + + |Copy the code
Here: “+” — means that request can be satisfied
“-” — means that request can’t be satisfied and should wait
2) Object lock
Object locks are rich in MDL types and apply to most basic objects in a database. Their compatibility matrix is as follows:
Request | Granted requests for lock | type | S SH SR SW SWLP SU SRO SNW SNRW X | ----------+---------------------------------------------+ S | + + + + + + + + + - | SH | + + + + + + + + + - | SR | + + + + + + + + - - | SW | + + + + + + - - - - | SWLP | + + + + + + - - - - | SU | + + + + + - + - - - | SRO | + + + - - + + + - - | SNW | + + + - - - + - - - | SNRW | + + - - - - - - - - | X | - - - - - - - - - - | Request | Pending requests for lock | type | S SH SR SW SWLP SU SRO SNW SNRW X | ----------+--------------------------------------------+ S | + + + + + + + + + - | SH | + + + + + + + + + + | SR | + + + + + + + + - - | SW | + + + + + + + - - - | SWLP | + + + + + + - - - - | SU | + + + + + + + + + - | SRO | + + + - + + + + - - | SNW | + + + + + + + + + - | SNRW | + + + + + + + + + - | X | + + + + + + + + + + | Here: "+" -- means that request can be satisfied "-" -- means that request can't be satisfied and should waitCopy the code
In the process of MDL acquisition, the MDL of granted/pending status incompatible with the requested MDL can be judged by the two compatibility matrices to determine whether the request can be satisfied or not. If it cannot be satisfied, the MDL enters the pending state.
MDL system also determines the strength of lock type by compatibility matrix, as follows:
/** Check if ticket represents metadata lock of "stronger" or equal type than specified one. I.e. if metadata lock represented by ticket won't allow any of locks which are not allowed by specified type of lock. @return true if ticket has stronger or equal type false otherwise. */ bool MDL_ticket::has_stronger_or_equal_type(enum_mdl_type type) const { const MDL_lock::bitmap_t *granted_incompat_map = m_lock->incompatible_granted_types_bitmap(); return ! (granted_incompat_map[type] & ~(granted_incompat_map[m_type])); }Copy the code
If type is incompatible with an MDL compatible with m_type, then type is stronger. Otherwise, the m_TYPE type is the same or stronger. Or the weaker type is incompatible with the MDL type, and the stronger MDL is incompatible with both.
Three important data structures
1 Relationship Diagram
2 MDL_request
Represents a request by a statement to the MDL, consisting of MDL_key, enum_MDl_type, and enum_MDl_duration. MDL_key and enum_MDl_type determine the object and lock type of the MDL.
There are three types of enum_MDL_duration, representing the holding period of the MDL: single statement level, transaction level, and explicit period.
The lifecycle of MDL_request is outside the MDL system, controlled by the user, and can be a temporary variable. However, the MDL lifecycle acquired through this request is persistent, controlled by the MDL system, and is not released with the destruction of the MDL_request.
3 MDL_lock
For an object in the database, only one lock object MDL_lock corresponding to its name (MDL_key) exists. MDL_lock is created and managed in memory by lock-free HASH when the objects of the database are accessed for the first time; When subsequent access comes, access to the same object will refer to the same MDL_lock.
MDL_lock contains both the M_WAITING queue that is currently waiting for the lock object and the M_GRANTED queue that has been granted the lock object. The elements in the queue are represented by MDL_ticket.
MDL_lock_strategy composed of static bitmap objects is used to store the compatibility matrix of scope lock and object lock. The compatibility of the lock can be obtained according to the namespace of MDL_lock.
4 MDL_ticket
MDL_lock and enum_MDl_type together form MDL_ticket, which represents the access permission of the current thread to the database object. MDL_ticket is created each time a query requests an MDL lock, memory is allocated by the MDL system and destroyed at the end of the transaction.
The MDL_ticket contains two sets of Pointers to connect all tickets obtained by the thread to tickets in the waiting or granted state of the lock object in which the ticket participates.
5 MDL_context
The context in which a thread acquirement an MDL lock, one for each connection, contains all mdL_tickets acquired by that connection. They are stored in their own linked lists for different lifecycles, managed by MDL_ticket_store.
All locks acquired by a connection can be divided into three types based on their lifetime: statement level, transaction level, and explicit lock. Statement-level and transaction-level locks have automatic lifecycles and scopes that accumulate over the course of a transaction. Statement-level locks are automatically released after the outermost statement. Transaction-level locks are released after COMMIT, ROLLBACK, and ROLLBACK TO SAVEPOINT. They are not released manually. Tickets with an explicit lifetime are acquired for locks across transactions and checkpoint, including HANDLER SQL locks, LOCK TABLES locks, and user-level locks GET_LOCK()/RELEASE_LOCK(). Statement-level and transaction-level locks are added to the front of the corresponding linked list in reverse chronological order, and when we roll back to a checkpoint, the corresponding ticket is released from the front of the list to the last ticket acquired before checkpoint creation.
When a thread wants to acquire an MDL lock, it first looks in its MDL_ticket_store to see if it has already acquired an MDL_ticket of a stronger type of the same lock object within the transaction. Therefore, MDL_ticket_store provides an interface for searching MDL_tickets based on MDL_request requests. One interface is to search mdL_tickets in the MDL_ticket list of different lifecycles. If the number of MDL_tickets obtained by the current thread exceeds the threshold (default: 256), all MDL_tickets are maintained in an additional STD ::unordered_multimap to speed up the lookup.
MDL_ticket_store::MDL_ticket_handle MDL_ticket_store::find( const MDL_request &req) const { #ifndef DBUG_OFF if (m_count >= THRESHOLD) { MDL_ticket_handle list_h = find_in_lists(req); MDL_ticket_handle hash_h = find_in_hash(req); DBUG_ASSERT(equivalent(list_h.m_ticket, hash_h.m_ticket, req.duration)); } #endif /*! DBUG_OFF */ return (m_map == nullptr || m_count < THRESHOLD) ? find_in_lists(req) : find_in_hash(req); }Copy the code
MDL acquisition process
Almost all query statements (including DML and DDL phase 1) are initiated at the Parse phase by LEX and YACC to initiate MDL lock requests to the tables that need to be accessed based on the type of statement, such as SELECT statements SR and INSERT statements SW. The ALTER TABLE statement is SU. This is done in the following call stack:
PT_table_factor_table_ident::contextualize()
|--SELECT_LEX::add_table_to_list()
|--MDL_REQUEST_INIT -> MDL_request::init_with_source()
Copy the code
Before the statement is executed, the open_tables_FOR_query function is used to open all tables to be accessed and obtain TABLE objects. In this process, the MDL lock is acquired before the table resources are acquired to prevent concurrent reads and writes to the meta information of the same table. All MDL lock requests are made by calling MDL_context::acquire_lock in the context of the current thread: MDL_context
| - open_table open_tables_for_query () () / / open loop each table | - open_table_get_mdl_lock () | - MDL_context: : acquire_lock () / / Acquire the lock, lock if there are any conflicts, then waiting for the conflict of the lock is released | - MDL_context: : try_acquire_lock_impl ()Copy the code
1 MDL_context::try_acquire_lock_impl
Let’s focus on the MDL_context:: try_acquire_LOCK_IMPL process. This function contains the acquisition of various types of locks (compatible and incompatible) and lock conflict detection, passing in the current MDL_request and output the obtained MDL_ticket.
Based on MDL_request, the current thread searches for a ticket with a stronger type and the same or different life cycle in the same object MDL_ticket. If it already holds the same life cycle, return it directly; For those with different life cycles, clone a return of the same period according to ticket.
As mentioned above, locks that are unobtrusive and obtrusive can be divided into unobtrusive and obtrusive locks according to the compatibility of lock types. In the process of obtaining locks, they also correspond to fast path and slow path respectively, indicating different degrees of difficulty in obtaining locks.
Unobtrusive(fast path)
For MDL requests that are unobtrusive (SR/SW, etc.), because this is the overwhelming number of unobtrusive requests and has good compatibility, you don’t need to record the specific MDL_ticket, just how many requests have been fetched. So use the integer atomic variable STD ::atomic M_fast_PATH_state in MDL_lock to count the number of unobtrusive lock types granted. Each unobtrusive lock has a different value. Leaving a fixed bit range to store the cumulative result of this lock type is equivalent to counting all unobtrusive locks granted using a longlong type, which can be modified without locking via CAS. IS_DESTROYED/HAS_OBTRUSIVE/HAS_SLOW_PATH
/** Array of increments for "unobtrusive" types of lock requests for per-object locks. @sa MDL_lock::get_unobtrusive_lock_increment(). For per-object locks: - "unobtrusive" types: S, SH, SR and SW - "obtrusive" types: SU, SRO, SNW, SNRW, X Number of locks acquired using "fast path" are encoded in the following bits of MDL_lock::m_fast_path_state: - bits 0 .. 19 - S and SH (we don't differentiate them once acquired) - bits 20 .. 39 - SR - bits 40 .. 59 - SW and SWLP (we don't differentiate them once acquired) Overflow is not an issue as we are unlikely to support more than 2^20 - 1 concurrent connections in foreseeable future. This encoding defines the below contents of increment array. */ {0, 1, 1, 1ULL << 20, 1ULL << 40, 1ULL << 40, 0, 0, 0, 0, 0},Copy the code
Obtain the unobtrusive integer increment value of the corresponding type based on the request type of MDL_request. If the increment value is 0, it indicates that the lock is obtrusive and requires a slow path.
/**
@returns "Fast path" increment for request for "unobtrusive" type
of lock, 0 - if it is request for "obtrusive" type of
lock.
@sa Description at method declaration for more details.
*/
MDL_lock::fast_path_state_t MDL_lock::get_unobtrusive_lock_increment(
const MDL_request *request) {
return MDL_lock::get_strategy(request->key)
->m_unobtrusive_lock_increment[request->type];
}
Copy the code
If the value is not 0, which means that the lock is unobtrusive, then the fast path is used to increment MDL_lock:: M_FAST_PATH_state directly through CAS. However, you need to make sure that the object is not obtrusive by any other thread, because some of the unobtrusive and obtrusive lock types are mutually exclusive, and only in the absence of obtrusive, Other unobtrusive locks are compatible with each other so that they can be obtained without having to judge the lock ownership of other threads.
MDL_lock::fast_path_state_t old_state = lock->m_fast_path_state; do { /* Check if hash look-up returned object marked as destroyed or it was marked as such while it was pinned by us. If yes we need to unpin it and retry look-up. */ if (old_state & MDL_lock::IS_DESTROYED) { if (pinned) lf_hash_search_unpin(m_pins); goto retry; } /* Check that there are no granted/pending "obtrusive" locks and nobody even is about to try to check if such lock can be acquired. In these cases we need to take "slow path". */ if (old_state & MDL_lock::HAS_OBTRUSIVE) goto slow_path; } while (! lock->fast_path_state_cas( &old_state, old_state + unobtrusive_lock_increment));Copy the code
After the CAS is complete, set the status and reference of the related data structure, and add the current MDL_ticket to the MDL_ticket_store of the thread.
/*
Since this MDL_ticket is not visible to any threads other than
the current one, we can set MDL_ticket::m_lock member without
protect of MDL_lock::m_rwlock. MDL_lock won't be deleted
underneath our feet as MDL_lock::m_fast_path_state serves as
reference counter in this case.
*/
ticket->m_lock = lock;
ticket->m_is_fast_path = true;
m_ticket_store.push_front(mdl_request->duration, ticket);
mdl_request->ticket = ticket;
mysql_mdl_set_status(ticket->m_psi, MDL_ticket::GRANTED);
Copy the code
Obtrusive(slow path)
For MDL requests that are obtrusive (such as SU/SRO/X), the MDL_ticket is stored in the M_GRANTED linked list corresponding to MDL_lock. Therefore, the list and other bitmaps need to be traversed to determine whether there is a lock conflict with MDL_ticket that other threads have acquired or are waiting for.
Before you need to go slow path to acquire the lock, the current thread needs to materialize the lock in MDL_lock:: M_FAST_PATH_STATE that was previously acquired by the current thread through fast Path and remove it from the bitmap. Add to MDL_lock:: M_GRANTED. Because the bitmap contained in MDL_lock:: M_FAST_PATH_state is not thread specific, and there is no lock conflict between multiple locks acquired by the current thread, so before using the bitmap to judge, Ensure that the tickets of MDL_lock:: M_FAST_PATH_state belong to other threads.
/** "Materialize" requests for locks which were satisfied using "fast path" by properly including them into corresponding MDL_lock::m_granted bitmaps/lists and removing it from packed counter in MDL_lock::m_fast_path_state. */ void MDL_context::materialize_fast_path_locks() { int i; for (i = 0; i < MDL_DURATION_END; i++) { MDL_ticket_store::List_iterator it = m_ticket_store.list_iterator(i); MDL_ticket *matf = m_ticket_store.materialized_front(i); for (MDL_ticket *ticket = it++; ticket ! = matf; ticket = it++) { if (ticket->m_is_fast_path) { MDL_lock *lock = ticket->m_lock; MDL_lock::fast_path_state_t unobtrusive_lock_increment = lock->get_unobtrusive_lock_increment(ticket->get_type()); ticket->m_is_fast_path = false; mysql_prlock_wrlock(&lock->m_rwlock); lock->m_granted.add_ticket(ticket); /* Atomically decrement counter in MDL_lock::m_fast_path_state. This needs to happen under protection of MDL_lock::m_rwlock to make it atomic with addition of ticket to MDL_lock::m_granted list and to enforce invariant [INV1]. */ MDL_lock::fast_path_state_t old_state = lock->m_fast_path_state; while (! lock->fast_path_state_cas( &old_state, ((old_state - unobtrusive_lock_increment) | MDL_lock::HAS_SLOW_PATH))) { } mysql_prlock_unlock(&lock->m_rwlock); } } } m_ticket_store.set_materialized(); }Copy the code
Once the materialization is complete, you can view the locktype states of the current lock type that the lock is waiting for (m_WAITING), the granted ticket type (m_GRANTED), and unobtrusive (MDL_lock:: M_FAST_PATH_STATE), Check whether the currently requested lock type can be obtained based on the compatibility matrix above. This process is mainly in MDL_lock:: CAN_grant_lock.
bool MDL_lock::can_grant_lock(enum_mdl_type type_arg, const MDL_context *requestor_ctx) const { bool can_grant = false; bitmap_t waiting_incompat_map = incompatible_waiting_types_bitmap()[type_arg]; bitmap_t granted_incompat_map = incompatible_granted_types_bitmap()[type_arg]; /* New lock request can be satisfied iff: - There are no incompatible types of satisfied requests in other contexts - There are no waiting requests which have higher priority than this request. */ if (! (m_waiting.bitmap() & waiting_incompat_map)) { if (! (fast_path_granted_bitmap() & granted_incompat_map)) { if (! (m_granted.bitmap() & granted_incompat_map)) can_grant = true; else { Ticket_iterator it(m_granted); MDL_ticket *ticket; /* There is an incompatible lock. Check that it belongs to some other context. */ while ((ticket = it++)) { if (ticket->get_ctx() ! = requestor_ctx && ticket->is_incompatible_when_granted(type_arg)) break; } if (ticket == NULL) /* Incompatible locks are our own. */ can_grant = true; } } } return can_grant; }Copy the code
In m_WAITING and M_GRANTED, in addition to the linked list connecting tickets, bitmap is also used to collect all types of tickets in the linked list for direct comparison. If an incompatible type is found in m_GRANTED, you need to traverse the linked list to determine whether the ticket of the incompatible type was acquired by the current thread. Lock conflicts occur only when the ticket is acquired by a non-current thread. Unobtrusive locks, if available, are added directly to the MDL_lock:: M_GRANTED list.
2 Lock wait and notification
If the MDL_ticket is successfully obtained, the MDL is obtained and you can continue the query process. If the unobtrusive lock is not available (either because the unobtrusive lock was forced to slow path by the obtrusive lock, or because the obtrusive lock itself is not available), then lock wait is required. The lock wait process is uniformly handled regardless of whether it is unobtrusive or obtrusive.
Each thread’s MDL_context contains an MDL_wait member, because lock wait and deadlock detection are thread-based objects that subscribe to notifications by adding the MDL_ticket of the corresponding request to the lock wait queue. A set of mutex, Condition variable, and enumeration states are used to complete wait and notification between threads. There are five waiting states:
// WS_EMPTY since EMPTY conflicts with #define in system headers on some
// platforms.
enum enum_wait_status { WS_EMPTY = 0, GRANTED, VICTIM, TIMEOUT, KILLED };
Copy the code
WS_EMPTY is the initial state, and the others are the waiting result states. As can be seen from the command, the waiting result may be:
- GRANTED, the thread has acquired the waiting MDL lock
- VICTIM, the thread that is the VICTIM of a deadlock, requires the transaction to be re-executed
- TIMEOUT: Indicates that the wait times out
- KILLED. The thread was KILLED while waiting
The waiting thread first adds the ticket it wants to get to the M_WAITING queue of MDL_lock, and then calls the function of MDL_wait to wait timeout according to the configured waiting time:
/** Wait for the status to be assigned to this wait slot. */ MDL_wait::enum_wait_status MDL_wait::timed_wait( MDL_context_owner *owner, struct timespec *abs_timeout, bool set_status_on_timeout, const PSI_stage_info *wait_state_name) { enum_wait_status result; int wait_result = 0; mysql_mutex_lock(&m_LOCK_wait_status); while (! m_wait_status && ! owner->is_killed() && ! is_timeout(wait_result)) { wait_result = mysql_cond_timedwait(&m_COND_wait_status, &m_LOCK_wait_status, abs_timeout); } if (m_wait_status == WS_EMPTY) { if (owner->is_killed()) m_wait_status = KILLED; else if (set_status_on_timeout) m_wait_status = TIMEOUT; } result = m_wait_status; mysql_mutex_unlock(&m_LOCK_wait_status); return result; }Copy the code
When other threads holding incompatible locks complete a query or transaction, they release all locks together, depending on whether they were acquired from a fast path or a slow path. Restore the state of MDL_lock:: M_FAST_PATH_STATE and the MDL_lock:: M_GRANTED list. Otherwise, if MDL_lock:: M_WAITING has a waiting ticket, MDL_lock::reschedule_waiters() is called to wake up the thread that can get the lock, and the wait state is set to GRANTED:
void MDL_lock::reschedule_waiters() { MDL_lock::Ticket_iterator it(m_waiting); MDL_ticket *ticket; while ((ticket = it++)) { if (can_grant_lock(ticket->get_type(), ticket->get_ctx())) { if (! ticket->get_ctx()->m_wait.set_status(MDL_wait::GRANTED)) { m_waiting.remove_ticket(ticket); m_granted.add_ticket(ticket); . /** Set the status unless it's already set. Return false if set, true otherwise. */ bool MDL_wait::set_status(enum_wait_status status_arg) { bool was_occupied = true; mysql_mutex_lock(&m_LOCK_wait_status); if (m_wait_status == WS_EMPTY) { was_occupied = false; m_wait_status = status_arg; mysql_cond_signal(&m_COND_wait_status); } mysql_mutex_unlock(&m_LOCK_wait_status); return was_occupied; }Copy the code
The waked wait thread continues execution if it finds that the ticket is GRANTED; Otherwise, errors are reported according to different situations.
3 Deadlock detection
Before each thread enters the lock wait, a deadlock detection is performed to prevent the current thread from falling into a deadlock. Before detecting deadlocks, materialize the unobtrusive locks obtained by the current thread so that they appear in the MDL_lock:: M_GRANTED linked list and deadlock detection can detect them. And set the current thread’s wait lock MDL_context:: M_WAITing_for to the current ticket. Each thread that enters the wait will set a wait object. Deadlocks can be detected along this wait chain.
/** Inform the deadlock detector there is an edge in the wait-for graph. */ void will_wait_for(MDL_wait_for_subgraph *waiting_for_arg) { /* Before starting wait for any resource we need to materialize all "fast path" tickets belonging to this thread. Otherwise locks acquired which are represented by these tickets won't be present in wait-for graph and could cause missed deadlocks. It is OK for context which doesn't wait for any resource to have "fast path" tickets, as such context can't participate in any deadlock. */ materialize_fast_path_locks(); mysql_prlock_wrlock(&m_LOCK_waiting_for); m_waiting_for = waiting_for_arg; mysql_prlock_unlock(&m_LOCK_waiting_for); }Copy the code
MDL_wait_for_subgraph
An abstract class representing an edge in the wait graph is traversed by a deadlock detection algorithm. MDL_ticket, derived from MDL_wait_for_subgraph, implements the accept_visitor() function to have the helper inspection class look for wait rings along the edge.
Deadlock_detection_visitor
The helper class that detects wait rings in the wait graph contains state information during the detection process, such as m_start_node, the starting thread for deadlock detection; The thread m_VICTIM is selected according to the weight after the lock is issued during the search. The thread depth of the search, if the thread’s wait chain is too long, exceeds the threshold (default 32), deadlock is considered to have occurred even if no deadlock is detected.
Enter_node () and leave_node() functions are implemented to enter and exit the next thread node, and inspect_edge() is used to find whether the current thread node is already the starting node to judge loop formation. Victimization Determines a victim by comparing the victim’s deadlock weight with opt_change_victim_to().
/**
Inspect a wait-for graph edge from one MDL context to another.
@retval true A loop is found.
@retval false No loop is found.
*/
bool Deadlock_detection_visitor::inspect_edge(MDL_context *node) {
m_found_deadlock = node == m_start_node;
return m_found_deadlock;
}
/**
Change the deadlock victim to a new one if it has lower deadlock
weight.
@param new_victim New candidate for deadlock victim.
*/
void Deadlock_detection_visitor::opt_change_victim_to(MDL_context *new_victim) {
if (m_victim == NULL ||
m_victim->get_deadlock_weight() >= new_victim->get_deadlock_weight()) {
/* Swap victims, unlock the old one. */
MDL_context *tmp = m_victim;
m_victim = new_victim;
m_victim->lock_deadlock_victim();
if (tmp) tmp->unlock_deadlock_victim();
}
}
Copy the code
Testing process
Deadlock detection approach is, first, breadth-first, starting from the current thread to wait for the locks, traverse MDL_lock queue waiting queue and granted, see if there are not acquire the current thread, and wait for the lock is not compatible with the lock, if this holds thread is the same as the starting point of the algorithm traverses the thread, then lock waits chain deadlock; Then depth-first, the holding or waiting thread of the incompatible lock starts. If the thread is also in the waiting state, the above process is recursively repeated until the waiting starting thread is found. Otherwise, no deadlock is judged. The code logic is as follows:
MDL_context: : find_deadlock () | - MDL_context: : visit_subgraph (MDL_wait_for_graph_visitor *) / / if m_waiting_for exists, Call the corresponding ticket accept_visitor () | -- - MDL_ticket: : accept_visitor (MDL_wait_for_graph_visitor *) / / lock acquisition case according to the corresponding MDL_lock detection | - MDL_lock: : visit_subgraph () / / award list of recursive traversal lock (m_granted) and waiting list (m_waiting) to judge whether there is waiting for the starting node (deadlocks) / / Recursive in order list and waiting list MDL_context to find deadlocks | - Deadlock_detection_visitor: : enter_node () / / in the first place to the current node | - traversing awarded list (m_granted), Judging compatibility | - if not compatible, call Deadlock_detection_visitor: : inspect_edge () to determine whether a deadlock | - traversing waiting list (m_waiting), ditto | traversing award list, judging compatibility | - if not compatible, A recursive call to MDL_context::visit_subgraph() looks for connected subgraphs. If a thread waiting for ticket have clear status, not WS_EMPTY, can direct return to | traversing waiting list, Ditto | - Deadlock_detection_visitor: : leave_node () / / / * We leave the current node do a breadth first search - first - that is, inspect all edges of the current node, and only then follow up to the next node. In workloads that involve wait-for graph loops this has proven to be a more efficient strategy [citation missing]. */ while ((ticket = granted_it++)) { /* Filter out edges that point to the same node. */ if (ticket->get_ctx() ! = src_ctx && ticket->is_incompatible_when_granted(waiting_ticket->get_type()) && gvisitor->inspect_edge(ticket->get_ctx())) { goto end_leave_node; }}... /* Recurse and inspect all adjacent nodes. */ granted_it.rewind(); while ((ticket = granted_it++)) { if (ticket->get_ctx() ! = src_ctx && ticket->is_incompatible_when_granted(waiting_ticket->get_type()) && ticket->get_ctx()->visit_subgraph(gvisitor)) { goto end_leave_node; }}...Copy the code
Victim weight
When a deadlock is detected, the thread with the lowest weight of the ticket is selected as the victim according to the weight of each thread waiting for the ticket to give up waiting and release the lock it holds. In the Deadlock_detection_visitor::opt_change_victim_to function.
The weights are fairly crude, regardless of the phase of the transaction or the content of the statement being executed. There is only a default weight based on the type of lock resource and the type of lock, in the MDL_ticket::get_deadlock_weight() function.
- DEADLOCK_WEIGHT_DML. The minimum weight of DML statements is 0
- DEADLOCK_WEIGHT_ULL, the user manually locks with a center weight of 50
- DEADLOCK_WEIGHT_DDL, the maximum weight of DDL statements is 100
Thus, in the event of a deadlock, the preference is for the DML statement to be rolled back and the DDL statement to continue execution. When a statement of the same type causes a deadlock, the preference is to let the thread that enters the wait chain later become the victim, and let the thread that waits longer continue to wait.
After the current thread sets the state of the VICTIM thread on the deadlock ring to VICTIM and wakes up, the current thread enters the wait state.
Five MDL monitoring
Performance_schema is a read-only variable that needs to be reset. Add:
[mysqld]
performance_schema = ON
Copy the code
Set performance_schema. setup_INSTRUMENTS to performance_schema. setup_INSTRUMENTS
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/metadata/sql/mdl';
Copy the code
We can then access the Performance_schema.metadatA_LOCKS table to monitor MDL fetches, such as having two threads in the following states:
connect-1 > BEGIN; | Query OK, 0 rows affected (0.00 SEC) | | connect 1 > SELECT * FROM t1; | +------+------+------+ | | a | b | c | | +------+------+------+ | | 1 | 2 | 3 | | | 4 | 5 | 6 | | + -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- + | 2 rows in the set (0.00 SEC) | # DDL will hang | connect - 2 > ALTER TABLE t1 ADD INDEX i1 (a);Copy the code
Performance_schema. Metadata_locks: Thread 1 holds SHARED_READ locks for T1. Thread 2, which needs to acquire the EXCLUSIVE lock, is in the wait state.
mysql > SELECT * FROM performance_schema.metadata_locks; +-------------+--------------------+----------------+-------------+-----------------------+---------------------+------- --------+-------------+--------------------+-----------------+----------------+ | OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | COLUMN_NAME | OBJECT_INSTANCE_BEGIN | LOCK_TYPE | LOCK_DURATION | LOCK_STATUS | SOURCE | OWNER_THREAD_ID | OWNER_EVENT_ID | +-------------+--------------------+----------------+-------------+-----------------------+---------------------+------- --------+-------------+--------------------+-----------------+----------------+ | TABLE | test | t1 | NULL | 140734873224192 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:6759 | 68 | 23 | | GLOBAL | NULL | NULL | NULL | 140734862726080 | INTENTION_EXCLUSIVE | STATEMENT | GRANTED | sql_base.cc:6137 | 69 | 6 | | SCHEMA | test | NULL | NULL | 140734862726240 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | sql_base.cc:6124 | 69 | 6 | | TABLE | test | t1 | NULL | 140734862726400 | SHARED_UPGRADABLE | TRANSACTION | GRANTED | sql_parse.cc:6759 | 69 | 6 | | BACKUP LOCK | NULL | NULL | NULL | 140734862726560 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | sql_base.cc:6144 | 69 | 6 | | TABLESPACE | NULL | test/t1 | NULL | 140734862727040 | INTENTION_EXCLUSIVE | TRANSACTION | GRANTED | lock.cc:811 | 69 | 6 | | TABLE | test | #sql-5a52_a | NULL | 140734862726720 | EXCLUSIVE | STATEMENT | GRANTED | sql_table.cc:17089 | 69 | 6 | | TABLE | test | t1 | NULL | 140734862726880 | EXCLUSIVE | TRANSACTION | PENDING | mdl.cc:4337 | 69 | 6 | | TABLE | performance_schema | metadata_locks | NULL | 140734887891904 | SHARED_READ | TRANSACTION | GRANTED | sql_parse.cc:6759 | 67 | | 4 +-------------+--------------------+----------------+-------------+-----------------------+---------------------+------- + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + 9 rows in the set (0.00 SEC)Copy the code
Optimization of PolarDB on MDL
In MySQL Community Edition, access to partitioned table data (DML) and partition maintenance (DDL) block each other, mainly because DDL needs to acquire MDL_EXCLUSIVE lock on partitioned table. As a result, partition maintenance can only be performed during off-peak hours, and the need to create/delete partitions for partitioned tables is frequent, which greatly limits the use of partitioned tables.
In PolarDB, we introduced partition level MDL locks, which reduced the lock granularity of DML and DDL to partition level, improved concurrency, and enabled “online” partition maintenance. Data access and partition maintenance of partitioned tables do not affect each other. Users can maintain partitions more freely without affecting service traffic of partitioned tables, which greatly enhances the flexibility of using partitioned tables.
This feature has been released in PolarDB 8.0.2.2.0 and above, welcome users to use.
Seven reference
[1] Source code mysql/ mysql-server 8.0.18: github.com/mysql/mysql… [2] MySQL, source code analysis, commonly used SQL statement of MDL lock source analysis: mysql.taobao.org/monthly/201… [3] Online partition maintenance function: help.aliyun.com/document\_d…
The original link
This article is the original content of Aliyun and shall not be reproduced without permission.