One, foreword

There are four lock states in total, from lowest to highest: no lock, biased lock, lightweight lock, and heavyweight lock. What do these four states represent, and why are locks upgraded? Before JDK 1.6, synchronized was always a heavyweight lock and relatively inefficient. Starting with JDK 1.6, the JVM optimized synchronized by introducing biased locking and lightweight locking to make acquiring and releasing locks more efficient. Since then a lock has four states (no lock, biased lock, lightweight lock, heavyweight lock), and the state escalates step by step as contention increases. The process is irreversible: a lock can only be upgraded (from a lower level to a higher one), never downgraded. For example, a biased lock that has been upgraded to a lightweight lock cannot revert to a biased lock. The purpose of this policy is to improve the efficiency of acquiring and releasing locks.

Two, the four states of the lock

In the initial implementation of synchronized, "blocking or waking up a Java thread requires the operating system to switch the CPU state, which consumes processor time. If the content of the synchronized block is simple, this switch may take longer than executing the user code itself." This was how synchronized was originally implemented, and it was why developers criticized synchronized as inefficient before JDK 6, which introduced "biased locking" and "lightweight locking" to reduce the cost of acquiring and releasing locks.

Therefore, there are four lock states, from low to high: no lock, biased lock, lightweight lock, and heavyweight lock. The lock state can only be upgraded, never downgraded.

As shown in the figure:

Three, the idea and characteristics of each lock state

| Lock state | Stored content | Flag bits |
| --- | --- | --- |
| No lock | Object hashCode, object generational age, biased-lock bit (0) | 01 |
| Biased lock | Biased thread ID, bias epoch (timestamp), object generational age, biased-lock bit (1) | 01 |
| Lightweight lock | Pointer to the lock record in the thread's stack | 00 |
| Heavyweight lock | Pointer to the monitor (mutex) | 10 |

Four, lock comparison

| Lock | Advantages | Disadvantages | Applicable scenario |
| --- | --- | --- | --- |
| Biased lock | Locking and unlocking add no extra cost; only a nanosecond-level gap compared with executing a non-synchronized method | If threads contend for the lock, revoking the bias adds extra cost | Only one thread ever accesses the synchronized block |
| Lightweight lock | Competing threads do not block, improving the program's response time | A thread that never wins the lock wastes CPU by spinning | Response time matters and the synchronized block executes quickly |
| Heavyweight lock | Contending threads do not spin and so do not waste CPU | Threads block and response time is slow | Throughput matters and the synchronized block executes slowly |

Five, the Synchronized lock

Synchronized locks are stored in Java object headers. What is an object header?

5.1 Java Object Headers

Taking the HotSpot virtual machine as an example, the HotSpot object header contains two main parts: a Mark Word and a Klass Pointer.

Mark Word: by default stores the object's hashCode, generational age, and lock flag bits. This information is unrelated to the definition of the object itself, so the Mark Word is designed as a non-fixed data structure that stores as much data as possible in a very small space. It reuses its storage space based on the state of the object, which means the data stored in the Mark Word changes as the lock flag bits change at runtime.

Klass Pointer: a pointer to the object's class metadata, which the virtual machine uses to determine which class the object is an instance of.

Synchronized locks are stored in Java object headers. So what does the Mark Word in the object header look like, and what does it store?

On a 64-bit VM:

In a 32-bit VM:

Let's take a 32-bit virtual machine as an example and see how the bits of its Mark Word are allocated.

No lock: the object header allocates 25 bits to store the object's hashCode, 4 bits for the object's generational age, 1 bit for the biased-lock flag, and 2 bits for the lock flag bits (01).

Biased lock: the same 25-bit region is divided further: 23 bits store the thread ID and 2 bits store the epoch. As before, 4 bits store the generational age and 1 bit stores the biased-lock flag (0 means no lock, 1 means biased lock), while the lock flag bits remain 01.

Lightweight lock: 30 bits store the pointer to the lock record in the stack, and 2 bits store the lock flag bits, which are 00.

Heavyweight lock: as with the lightweight lock, 30 bits store a pointer, here to the heavyweight lock (the monitor), and 2 bits store the lock flag bits, which are 10.

GC mark: 30 bits are reserved but unused, and the 2-bit lock flag is 11.

The lock flag bits of both the no-lock and biased-lock states are 01; the 1-bit biased-lock flag in front of them distinguishes the two states.

From markOop.hpp (OpenJDK):

public:
  // Constants
  enum { age_bits                 = 4,
         lock_bits                = 2,
         biased_lock_bits         = 1,
         max_hash_bits            = BitsPerWord - age_bits - lock_bits - biased_lock_bits,
         hash_bits                = max_hash_bits > 31 ? 31 : max_hash_bits,
         cms_bits                 = LP64_ONLY(1) NOT_LP64(0),
         epoch_bits               = 2
  };
  • age_bits: the generational-age field we mentioned, occupying 4 bits
  • lock_bits: the lock flag bits, occupying 2 bits
  • biased_lock_bits: whether the lock is biased, occupying 1 bit
  • max_hash_bits: the number of bits available for the hashCode in the unlocked state: 32 - 4 - 2 - 1 = 25 bits on a 32-bit VM, and 64 - 4 - 2 - 1 = 57 bits on a 64-bit VM, of which only 31 are actually used for the hashCode
  • hash_bits: 31 if max_hash_bits is greater than 31 (the 64-bit case); otherwise the actual value (25 on a 32-bit VM)
  • cms_bits: 0 bits on a 32-bit VM, 1 bit on a 64-bit VM
  • epoch_bits: the epoch field, occupying 2 bits
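The arithmetic behind these constants can be sketched in Java (a hypothetical translation of the C++ enum above, just to check the numbers; class and method names are mine):

```java
// Sketch of the markOop.hpp bit-width constants (names mirror the C++ enum).
public class MarkWordBits {
    static final int AGE_BITS = 4;          // age_bits
    static final int LOCK_BITS = 2;         // lock_bits
    static final int BIASED_LOCK_BITS = 1;  // biased_lock_bits

    // max_hash_bits = BitsPerWord - age_bits - lock_bits - biased_lock_bits
    static int maxHashBits(int bitsPerWord) {
        return bitsPerWord - AGE_BITS - LOCK_BITS - BIASED_LOCK_BITS;
    }

    // hash_bits is capped at 31, so a 64-bit VM uses only 31 of its 57 available bits
    static int hashBits(int bitsPerWord) {
        int max = maxHashBits(bitsPerWord);
        return max > 31 ? 31 : max;
    }

    public static void main(String[] args) {
        System.out.println("32-bit VM: max_hash_bits=" + maxHashBits(32)
                + ", hash_bits=" + hashBits(32)); // 25 and 25
        System.out.println("64-bit VM: max_hash_bits=" + maxHashBits(64)
                + ", hash_bits=" + hashBits(64)); // 57 and 31
    }
}
```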

5.2 the Monitor

Monitor can be understood as a synchronization tool or a synchronization mechanism and is often described as an object. Each Java object has an invisible lock, called an internal lock or Monitor lock.

A Monitor Record is a thread-private data structure; each thread has a list of available Monitor Records, and there is also a global list of available records. Each locked object is associated with a monitor, and the Owner field of the monitor stores the unique identity of the thread that holds the lock, indicating that the lock is occupied by that thread.

Synchronized is implemented through an internal object called a monitor lock, and the monitor lock in essence relies on the underlying operating system's Mutex Lock. However, switching between threads requires the operating system to switch from user mode to kernel mode, which is expensive and relatively slow; this is why synchronized was inefficient. A lock that relies on the operating system's Mutex Lock in this way is therefore called a heavyweight lock.
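Concretely, a synchronized block compiles to the monitorenter and monitorexit bytecodes that acquire and release this monitor (visible with `javap -c`), while a synchronized method is marked with the ACC_SYNCHRONIZED flag instead. A minimal sketch (class name is mine):

```java
// Both forms of synchronized acquire the same per-object monitor.
public class MonitorDemo {
    private int value;

    public void add(int delta) {
        synchronized (this) {   // compiles to monitorenter: acquire this object's monitor
            value += delta;
        }                       // compiles to monitorexit: release the monitor
    }

    public synchronized int get() { // method form: ACC_SYNCHRONIZED flag, same monitor
        return value;
    }

    public static void main(String[] args) {
        MonitorDemo d = new MonitorDemo();
        d.add(5);
        d.add(3);
        System.out.println(d.get()); // 8
    }
}
```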

As contention increases, a lock can be upgraded from a biased lock to a lightweight lock to a heavyweight lock (the upgrade is one-way: only from low to high, with no downgrading). Biased locking and lightweight locking are enabled by default since JDK 1.6. Biased locking can be disabled with -XX:-UseBiasedLocking.

Six, the classification of locks

6.1 No lock

No lock means that no resource is locked. All threads can access and modify the same resource, but only one thread can modify the resource successfully.

The hallmark of lock-free operation is that the modification runs inside a loop: the thread keeps trying to modify the shared resource. If there is no conflict, the modification succeeds and the loop exits; otherwise it retries. If multiple threads modify the same value, one will succeed and the others will retry until they succeed.
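The retry loop described above can be sketched with AtomicInteger.compareAndSet (an illustration with my own names, not the JVM's internal code):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasLoopDemo {

    // Lock-free increment: read the current value, then retry the CAS until our update wins.
    static void increment(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();
            if (counter.compareAndSet(current, current + 1)) {
                return; // no conflict: our modification succeeded, exit the loop
            }
            // another thread changed the value first; loop and retry
        }
    }

    // Run `threads` threads, each incrementing `perThread` times; return the final count.
    static int run(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) increment(counter);
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // 40000: every increment eventually succeeds
    }
}
```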

6.2 Biased lock

When a synchronized block is executed for the first time, the lock object becomes a biased lock (the lock flag bits in the object header are modified via CAS); literally, a lock "biased in favor of the first thread to acquire it." After executing the synchronized block, the thread does not actively release the biased lock. When it reaches the synchronized block a second time, the thread checks whether the thread holding the lock is itself (the holding thread's ID is also in the object header); if so, it proceeds normally. Since the lock was never released, it does not need to be re-acquired. If only one thread ever uses the lock, biased locking clearly adds almost no overhead and performs well.

Biased locking means that when a piece of synchronized code is always accessed by the same thread, that is, there is no competition between threads, that thread automatically acquires the lock on subsequent accesses, reducing the cost of lock acquisition and thus improving performance.

When a thread accesses a synchronized block and acquires the lock, the biased thread ID is stored in the Mark Word. Instead of using CAS to lock and unlock every time the thread enters and exits the synchronized block, it simply checks whether the Mark Word stores a bias pointing to the current thread. Acquiring and releasing a lightweight lock relies on multiple CAS atomic instructions, while a biased lock needs only one CAS instruction, when the thread ID is installed.

A thread holding a biased lock releases it only when another thread tries to compete for it; the holding thread never actively releases the biased lock on its own.

Revoking a biased lock requires waiting for a global safepoint, a point in time at which no bytecode is executing. The JVM suspends the thread holding the biased lock and checks whether the lock object is still locked. If the thread is not active, the object header is set to the lock-free state and the bias is revoked, reverting to the no-lock state (flag bits 01) or a lightweight lock (flag bits 00).

6.3 Lightweight Lock (Spin Lock)

A lightweight lock arises when a biased lock is accessed by another thread: the biased lock is upgraded to a lightweight lock. The other thread then tries to acquire the lock by spinning (see the discussion of spinning later in the article) rather than blocking, which improves performance.

A lightweight lock is acquired in two main situations: (1) when the biased-locking feature is disabled; (2) when a biased lock is upgraded to a lightweight lock because multiple threads competed for it.

Once a second thread enters lock contention, the biased lock is upgraded to a lightweight (spin) lock. To clarify what lock contention means: if multiple threads take turns acquiring a lock, but each acquisition succeeds without waiting, there is no lock contention. Lock contention occurs only when a thread tries to acquire a lock, finds it already occupied, and has to wait for it to be released.

If contention continues in the lightweight-lock state, the thread that failed to grab the lock spins, that is, it loops repeatedly, checking whether the lock can be acquired. The acquisition is performed by modifying the lock flag bits in the object header via CAS: first compare whether the current flag is "released"; if so, set it to "locked". The compare and the set happen atomically. The thread then records itself as the current lock holder.

Spinning for a long time is very resource-intensive: one thread holds the lock while the others burn CPU cycles without performing any useful work, a phenomenon called busy-waiting. If multiple threads use a lock but no contention, or only slight contention, occurs, synchronized uses lightweight locks and tolerates short periods of busy-waiting. This is a tradeoff: a short busy-wait in exchange for avoiding the overhead of switching threads between user mode and kernel mode.
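The CAS-based spinning described here can be illustrated with a minimal user-level spin lock (a sketch with my own names, not the JVM's implementation; it assumes JDK 9+ for Thread.onSpinWait):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Busy-wait until the CAS from "released" (false) to "locked" (true) succeeds.
    public void lock() {
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // JDK 9+: hint to the CPU that we are busy-waiting
        }
    }

    public void unlock() {
        locked.set(false); // mark the lock as released
    }

    // Demo: two threads increment a plain int under the spin lock; return the total.
    static int demo(int perThread) {
        SpinLock lock = new SpinLock();
        int[] shared = {0};
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                lock.lock();
                try { shared[0]++; } finally { lock.unlock(); }
            }
        };
        Thread a = new Thread(task), b = new Thread(task);
        a.start(); b.start();
        try { a.join(); b.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return shared[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(10_000)); // 20000: no increment is lost
    }
}
```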

6.4 Heavyweight Locks

Obviously, there is a limit to spinning (a counter records the number of spins, 10 iterations by default, adjustable via a VM parameter). If lock contention is severe and a thread reaches the maximum spin count, it upgrades the lightweight lock to a heavyweight lock (again via CAS on the lock flag bits, without changing the ID of the thread holding the lock). When a subsequent thread tries to acquire the lock and finds it is a heavyweight lock, it suspends itself (instead of busy-waiting) and waits to be woken up later.

A heavyweight lock means that when one thread acquires the lock, all other threads waiting to acquire the lock block.

In short, all control is handed to the operating system, which handles scheduling between threads and thread state changes. This leads to frequent state switches, suspensions, and wake-ups, which consume a lot of system resources.
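This blocking is visible from Java itself: once one thread holds a monitor, a second thread trying to enter the same synchronized block is reported by the JVM as BLOCKED rather than spinning. A sketch (class and method names are mine):

```java
import java.util.concurrent.CountDownLatch;

public class BlockedStateDemo {
    static final Object LOCK = new Object();

    // Returns the state observed for a thread stuck entering a held monitor.
    static Thread.State observeWaiter() {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                held.countDown();                 // signal: monitor acquired
                try { release.await(); } catch (InterruptedException ignored) {}
            }
        });
        holder.start();
        try { held.await(); } catch (InterruptedException ignored) {} // wait until holder owns the monitor

        Thread waiter = new Thread(() -> { synchronized (LOCK) { /* no-op */ } });
        waiter.start();
        while (waiter.getState() != Thread.State.BLOCKED) {
            Thread.yield();                       // wait for the JVM to park the waiter
        }
        Thread.State observed = waiter.getState();

        release.countDown();                      // let holder exit, unblocking waiter
        try { holder.join(); waiter.join(); } catch (InterruptedException ignored) {}
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter was " + observeWaiter()); // waiter was BLOCKED
    }
}
```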

Seven, the summary

This article described the four lock states and how a lock is upgraded step by step. If anything in the article is wrong or unclear, feel free to point it out in the comments section below. Thank you.