This column shares knowledge for large-scale BAT interviews and will be updated continuously. If you find it useful, please follow.

Interviewer: How does the virtual machine implement the synchronized keyword? Can you talk about memory visibility and lock escalation?

Psychological analysis: The interviewer clearly wants to probe further into concurrency and see whether you have actually done concurrent processing. Most developers tend to ignore concurrency when developing apps, so this question will stump most people.

Candidate: The implementation principle of locking, lock optimization, and the Java object header

1. Memory semantics for locks

The underlying synchronization is implemented using the operating system’s Mutex Lock.

  • Memory visibility: when a variable is locked, its value is cleared from working memory, so before the execution engine can use the variable it must load or assign its value again; when a variable is unlocked, it must first be synchronized back to main memory (the store and write operations).
  • Operation atomicity: two synchronized blocks holding the same lock can only be entered serially.
Memory semantics for locks:
  • When a thread releases the lock, the JMM flushes the shared variables in the thread’s local memory to main memory
  • When a thread acquires a lock, the JMM invalidates the thread’s local memory, which forces critical-section code protected by the monitor to read shared variables from main memory
Memory semantics for lock release and lock acquisition:
  • When thread A releases a lock, thread A essentially sends a message (thread A’s modifications to the shared variables) to the next thread that will acquire the lock.
  • When thread B acquires the lock, it essentially receives a message from a previous thread (that thread’s modifications to the shared variables before releasing the lock).
  • Thread A releasing the lock and thread B acquiring it is essentially thread A sending a message to thread B through main memory
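The "message through main memory" semantics above can be demonstrated with a small sketch. The class and method names here are made up for illustration, and the `join()` call exists only to make the result deterministic; the point is that the writer's lock release and the reader's lock acquisition on the same monitor pass the write to `shared` through main memory, even though the field is not volatile.

```java
public class LockMessagePassing {
    private static final Object lock = new Object();
    private static int shared = 0;   // deliberately not volatile

    static int readAfterWrite() throws InterruptedException {
        Thread writer = new Thread(() -> {
            synchronized (lock) {    // acquire: local memory invalidated
                shared = 42;         // write the "message" under the lock
            }                        // release: shared is flushed to main memory
        });
        writer.start();
        writer.join();
        synchronized (lock) {        // acquire: forces a read from main memory
            return shared;           // guaranteed to observe 42
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readAfterWrite());
    }
}
```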




2. Synchronized lock

Synchronized uses locks that are stored in Java object headers.

The JVM implements method synchronization and code-block synchronization by entering and exiting Monitor objects. Code-block synchronization is implemented with the monitorenter instruction, which the compiler inserts at the start of the synchronized block, and the monitorexit instruction, which it inserts at the end of the method and at exception exit points. Every object has a monitor associated with it, and while that monitor is held, the object is locked.

When monitorenter executes, it first attempts to acquire the lock on the object. If the object is not locked, or the current thread already owns the lock, the lock counter is incremented. Accordingly, the lock counter is decremented by 1 when the monitorexit instruction executes, and when it reaches 0 the lock is released. If the thread fails to acquire the object lock, it blocks and waits until the lock is released by another thread.

Two points to note:

1. A synchronized block is reentrant for the same thread, so a thread will not deadlock on a lock it already holds;

2. A synchronized block blocks subsequent threads from entering until the thread inside it completes execution.
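Point 1 (reentrancy) can be shown with a minimal example; the class is hypothetical. `outer()` re-enters the monitor it already holds, so the lock counter goes from 1 to 2 instead of the thread blocking on itself.

```java
public class ReentrantDemo {
    private int depth = 0;

    public synchronized int outer() {
        depth++;        // monitor entered: lock counter is now 1
        return inner(); // re-enter the same monitor: counter 1 -> 2, no self-deadlock
    }

    public synchronized int inner() {
        depth++;        // same thread already owns the monitor, so entry is immediate
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrantDemo().outer()); // completes without blocking
    }
}
```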

3. Mutex Lock

A Monitor Lock is essentially implemented by relying on the underlying operating system’s Mutex Lock. Each object corresponds to a “mutex” tag, which ensures that only one thread can access the object at any one time.

Mutex: Used to protect critical sections and ensure that only one thread accesses data at a time. To access a shared resource, the mutex is first locked. If the mutex is locked, the calling thread blocks until the mutex is unlocked. After access to the shared resource is complete, the mutex is unlocked.

How Mutex works:




  • (1) Request the mutex
  • (2) If the request succeeds, the mutex is held
  • (3) If it fails, spin: busy-wait on the mutex, repeatedly attempting to acquire it, until it is obtained or the spin_count limit is reached
  • (4) Depending on the working mode, yield or sleep
  • (5) After the sleep expires, the thread is actively woken, or the yield completes, repeat steps (1) to (4) until the mutex is acquired
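The spin-then-yield steps above can be sketched as a toy user-space lock in Java. This is an illustration only, not how an OS mutex is actually implemented; `SPIN_COUNT` and all the names are invented.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinThenYieldLock {
    private static final int SPIN_COUNT = 100;        // hypothetical spin limit
    private final AtomicBoolean held = new AtomicBoolean(false);

    public void lock() {
        while (true) {
            for (int i = 0; i < SPIN_COUNT; i++) {    // step (3): spin, retrying the acquire
                if (held.compareAndSet(false, true)) return; // steps (1)-(2): acquired
            }
            Thread.yield();                           // step (4): give up the CPU
        }                                             // step (5): repeat until acquired
    }

    public void unlock() { held.set(false); }

    static int raceTwoThreads() throws InterruptedException {
        SpinThenYieldLock lock = new SpinThenYieldLock();
        int[] counter = {0};
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                lock.lock();
                try { counter[0]++; } finally { lock.unlock(); }
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return counter[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(raceTwoThreads()); // 20000: the lock kept the increments atomic
    }
}
```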

Since Java threads are mapped to the operating system’s native threads, blocking or waking a thread requires the operating system’s help to transition from user mode to kernel mode, and that state transition costs a lot of processor time. Synchronized is therefore a heavyweight operation in the Java language. In JDK 1.6 the virtual machine added some optimizations, such as a spin wait before asking the operating system to block a thread, to avoid frequently dropping into kernel mode:

Synchronized now performs almost as well as ReentrantLock in the java.util.concurrent package, thanks to the lock optimizations added in JDK 1.6 (see below). ReentrantLock only provides richer features than synchronized, not necessarily better performance, so synchronized is preferred wherever it can meet the requirements.

4. Java object header





At runtime, the data stored in Mark Word changes as the lock flag bit changes, using the 32-bit JDK as an example:





5. Lock optimization

Biased locks, lightweight locks, and heavyweight locks

Synchronized is implemented through an internal object called a monitor lock, and the monitor lock in essence relies on the underlying operating system’s Mutex Lock. However, switching between threads in the operating system requires switching from user mode to kernel mode, which is expensive and relatively slow; this is why synchronized has historically been inefficient. A lock that relies on the operating system’s Mutex Lock in this way is therefore called a “heavyweight lock.”

Java SE 1.6 introduced biased locks and lightweight locks to reduce the performance cost of acquiring and releasing locks. There are four lock states, in order from lowest to highest: no lock, biased lock, lightweight lock, and heavyweight lock. Locks can be upgraded but not downgraded.

5.1 Biased locking

HotSpot authors have found that in most cases locks are not only not contested by multiple threads, but are always acquired multiple times by the same thread. Biased locking is intended to improve performance when only one thread executes a synchronized block.

When a thread accesses a synchronized block and acquires the lock, it stores the lock-biased thread ID in the object header and in a lock record in its stack frame. From then on, the thread does not need any CAS operation to lock or unlock when entering and exiting this synchronized block; it simply tests whether the Mark Word in the object header stores a biased lock pointing to the current thread. Biased locking exists to eliminate the unnecessary lightweight-lock execution path when there is no multithreaded contention: acquiring and releasing a lightweight lock relies on several CAS atomic instructions, while a biased lock needs only a single CAS instruction, when installing the ThreadID. (Once multithreaded contention appears the biased lock must be revoked, so biased locking only pays off when the performance loss of the revoke operation is less than the cost of the CAS instructions it saves.)

Biased lock acquisition process:

  • (1) Check whether the biased-lock bit in the Mark Word is set to 1 and whether the lock flag bits are 01 — i.e., confirm the object is in the biasable state.
  • (2) If it is in the biasable state, test whether the thread ID points to the current thread. If so, go to step (5); otherwise, go to step (3).
  • (3) If the thread ID does not point to the current thread, compete for the lock with a CAS operation. If the CAS succeeds, set the thread ID in the Mark Word to the current thread’s ID and then execute (5); if it fails, perform (4).
  • (4) If the CAS fails to acquire the biased lock, there is contention (a failed CAS means some other thread already holds the bias, because threads never release a biased lock voluntarily). When the global safepoint is reached, the thread holding the biased lock is suspended and checked for liveness (it may have finished executing, but it will not have released the bias actively). If that thread is no longer active, the object header is set to the lock-free state (flag bits “01”), and the lock can then be re-biased to the new thread. If the thread is still alive, the biased lock is revoked and upgraded to the lightweight-lock state (flag bits “00”); the lightweight lock is then held by the thread that originally held the bias, which continues executing its synchronized code, while the competing thread spins, waiting for the lightweight lock.
  • (5) Execute the synchronized code.
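The decision logic of steps (1)–(4) can be modeled with a toy sketch, using an `AtomicLong` as a stand-in for the ThreadID field of the Mark Word. This is an illustration only — the real bias bit, epoch, and safepoint-based revocation are omitted, and all names are invented.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BiasedLockSketch {
    // 0 means "biasable, not yet biased"; otherwise the ThreadID the lock is biased to
    private final AtomicLong biasedTo = new AtomicLong(0);

    /** Returns true if the thread may enter; false models "contention -> revoke/upgrade". */
    public boolean tryBiasedEnter(long threadId) {
        long owner = biasedTo.get();
        if (owner == threadId) return true;                         // steps (2)/(5): compare IDs only
        if (owner == 0) return biasedTo.compareAndSet(0, threadId); // step (3): the one-time CAS
        return false;                                               // step (4): contention detected
    }

    public static void main(String[] args) {
        BiasedLockSketch lock = new BiasedLockSketch();
        System.out.println(lock.tryBiasedEnter(1)); // first thread: CAS installs the bias
        System.out.println(lock.tryBiasedEnter(1)); // same thread: ID comparison, no CAS
        System.out.println(lock.tryBiasedEnter(2)); // other thread: would trigger revocation
    }
}
```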

Biased lock release process:

As in step (4) above, a biased lock uses a release-on-contention mechanism: the thread holding a biased lock releases it only when another thread tries to compete for it; it never releases the bias actively. Revoking a biased lock requires waiting for the global safepoint (a point at which no bytecode is executing). The thread holding the biased lock is first suspended, and the lock object is checked to see whether it is still locked; after revocation, the lock returns to the unlocked state (flag bits “01”) or the lightweight-lock state (flag bits “00”).

Disabling biased locking:

Biased locking is enabled by default in Java 6 and Java 7. Since biased locking is intended to improve performance when only one thread executes a synchronized block, if you are sure that the locks in your application are normally contended, you can turn biased locking off with the JVM argument -XX:-UseBiasedLocking, and your application will then start in the lightweight-lock state by default.

5.2 Lightweight Lock

Lightweight locks are designed to improve performance when threads execute synchronized blocks nearly alternately.

Lightweight lock locking process:

  • (1) When the code enters the synchronization block, if the lock state of the synchronization object is lock-free (the lock flag bits are “01” and the biased-lock bit is “0”), the virtual machine first creates a space called the Lock Record in the current thread’s stack frame, used to store a copy of the lock object’s current Mark Word, officially called the Displaced Mark Word. The state of the thread stack and object header is shown below.
  • (2) Copy the Mark Word in the object header into the Lock Record.
  • (3) After the copy succeeds, the VM uses CAS to try to update the object’s Mark Word to a pointer to the Lock Record, and points the owner pointer in the Lock Record at the object’s Mark Word. If the update succeeds, go to step (4); otherwise, go to step (5).
  • (4) If the update succeeds, the thread owns the lock on this object, and the lock flag bits of the object’s Mark Word are set to “00”, meaning the object is in the lightweight-lock state. The state of the thread stack and object header at this point is shown in the figure below.





  • (5) If the update fails, the virtual machine first checks whether the object’s Mark Word points to the current thread’s stack frame. If it does, the current thread already owns the lock on this object and can enter the synchronization block directly. Otherwise, multiple threads are competing for the lock. If there is only one waiting thread, it spins for a while — the other thread may release the lock soon. But when the spin exceeds a certain count, or when one thread holds the lock, a second is spinning, and a third arrives, the lightweight lock inflates into a heavyweight lock: all threads except the owner are blocked (preventing the CPU from idling), the lock flag bits change to “10”, and the Mark Word stores a pointer to the heavyweight lock (the mutex), with the threads waiting for the lock blocked.

Lightweight lock unlocking process:

  • (1) Use a CAS operation to try to replace the object’s current Mark Word with the Displaced Mark Word copied into the thread’s Lock Record.
  • (2) If the replacement succeeds, the whole synchronization process is complete.
  • (3) If it fails, another thread has tried to acquire the lock (the lock has inflated by now), and the suspended threads must be woken up when the lock is released.
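The CAS dance in the locking and unlocking steps can be sketched with a toy model, using an `AtomicReference` as a stand-in for the Mark Word and a plain object as the stack-frame Lock Record. Spinning and inflation are omitted; everything here is illustrative, not HotSpot’s actual encoding.

```java
import java.util.concurrent.atomic.AtomicReference;

public class LightweightLockSketch {
    /** Stand-in for the lock-free Mark Word value. */
    static final Object UNLOCKED = new Object();
    /** Stand-in for the object's Mark Word: UNLOCKED, or a pointer to a LockRecord. */
    public final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    /** Stand-in for the Lock Record in the owning thread's stack frame. */
    public static class LockRecord { Object displacedMarkWord; }

    public boolean tryLock(LockRecord record) {
        record.displacedMarkWord = markWord.get();              // step (2): copy the Mark Word
        if (record.displacedMarkWord != UNLOCKED) return false; // already locked: would spin/inflate
        return markWord.compareAndSet(UNLOCKED, record);        // step (3): CAS in the record pointer
    }

    public boolean unlock(LockRecord record) {
        // Unlock step (1): CAS the displaced copy back; failure would mean the lock inflated
        return markWord.compareAndSet(record, record.displacedMarkWord);
    }

    public static void main(String[] args) {
        LightweightLockSketch obj = new LightweightLockSketch();
        LockRecord r1 = new LockRecord(), r2 = new LockRecord();
        System.out.println(obj.tryLock(r1)); // true: record pointer installed
        System.out.println(obj.tryLock(r2)); // false: contention detected
        System.out.println(obj.unlock(r1));  // true: Mark Word restored to lock-free
    }
}
```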
5.3 Heavyweight Locks

As mentioned in step (5) of the lightweight-lock locking process, lightweight locks are adapted to the situation where threads execute synchronized blocks almost alternately. If multiple threads access the same lock at the same time, the lightweight lock inflates into a heavyweight lock: the lock flag bits in the Mark Word are updated to “10”, and the Mark Word points to the mutex (the heavyweight lock).

The synchronized heavyweight lock is implemented through an internal object called a monitor lock, which in essence relies on the underlying operating system’s Mutex Lock. However, switching between threads in the operating system requires switching from user mode to kernel mode, which is expensive and relatively slow; this is why synchronized is inefficient in this state.

(See Mutex Lock above)

5.4 Transitions between biased locks, lightweight locks, and heavyweight locks









Biased locks and lightweight locks are optimistic locks; heavyweight locks are pessimistic locks.

  • When an object is first instantiated and no thread has accessed it yet, it is biasable: the JVM assumes only one thread will access it, so when the first thread does, the object holds a biased lock, biased toward that thread. The thread uses one CAS to change the object header to the biased state and set the ThreadID in the object header to its own ID; on subsequent accesses it only needs to compare the IDs, with no further CAS operations.
  • Once a second thread accesses the object (the biased lock is never released voluntarily), it sees the object in the biased state, indicating that there is already contention on the object. The JVM checks whether the thread that originally held the lock is still alive. If it has exited, the object can be made lock-free and then re-biased to the new thread. If the original thread is still alive, its stack is walked to check the object’s usage: if it still needs to hold the lock, the biased lock is upgraded to a lightweight lock (this is the moment the upgrade happens), the lightweight lock is held by the thread that originally held the bias, which continues executing its synchronized code, and the competing thread spins, waiting for the lightweight lock; if it no longer needs the lock, the object reverts to the unlocked state and can then be re-biased.
  • A lightweight lock assumes contention exists but is minimal: usually two threads stagger their operations on the same lock, or one waits a little (spins) for the other to release it. But when the spin exceeds a certain count, or when one thread holds the lock, a second is spinning, and a third arrives, the lightweight lock inflates into a heavyweight lock, which blocks all threads except the owner, preventing the CPU from idling.

6. Other lock optimizations

6.1 Lock elimination

Lock elimination means removing unnecessary lock operations: at runtime, the just-in-time (JIT) compiler removes locks from code that requests synchronization but where it detects that contention on the shared data is impossible.

Based on escape analysis, a piece of code is considered thread-safe and does not need locking if it can be determined that none of the heap data it touches escapes the current thread.

Take a look at this program:

public class SynchronizedTest {

    public static void main(String[] args) {
        SynchronizedTest test = new SynchronizedTest();

        for (int i = 0; i < 100000000; i++) {
            test.append("abc", "def");
        }
    }

    public void append(String str1, String str2) {
        StringBuffer sb = new StringBuffer();
        sb.append(str1).append(str2);
    }
}

Although StringBuffer.append is a synchronized method, the StringBuffer in this program is a local variable that does not escape the method, so this code is actually thread-safe and the lock can be eliminated.

6.2 Lock coarsening

If a series of consecutive operations repeatedly locks and unlocks the same object, or the locking appears in the body of a loop, the frequent mutex synchronization causes unnecessary performance loss even when there is no thread contention.

If the virtual machine detects a string of fragmented operations that all lock the same object, it extends (coarsens) the scope of the lock synchronization to cover the entire operation sequence.

Here’s an example:

public class StringBufferTest {
    StringBuffer stringBuffer = new StringBuffer();

    public void append(){
        stringBuffer.append("a");
        stringBuffer.append("b");
        stringBuffer.append("c");
    }
}

Each call to stringBuffer.append() locks and unlocks. If the virtual machine detects a series of consecutive lock and unlock operations on the same object like this, it merges them into one larger pair: lock before the first append and unlock after the last.
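Conceptually, the coarsened code the JIT ends up executing behaves like the hand-written equivalent below (the class name is invented; `synchronized (stringBuffer)` is valid here because `StringBuffer.append` synchronizes on that same object, and synchronized is reentrant):

```java
public class StringBufferTestCoarsened {
    StringBuffer stringBuffer = new StringBuffer();

    public void append() {
        synchronized (stringBuffer) {   // one lock/unlock pair instead of three
            stringBuffer.append("a");
            stringBuffer.append("b");   // inner appends re-enter the held monitor cheaply
            stringBuffer.append("c");
        }
    }

    public static void main(String[] args) {
        StringBufferTestCoarsened t = new StringBufferTestCoarsened();
        t.append();
        System.out.println(t.stringBuffer); // abc
    }
}
```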

6.3 Spin locks and adaptive spin locks

  • The reason for introducing spin locks: The biggest performance impact of mutex synchronization is the implementation of blocking, because suspended threads and resumed threads need to be carried out in kernel mode, which puts a lot of pressure on the system’s concurrency performance. At the same time, the virtual machine development team has noticed that in many applications, shared data is locked for only a short period of time, and it is not worth it to frequently block and wake up threads for this short period of time.
  • Spin lock: let the thread execute a meaningless busy loop (spin) and wait for a while instead of being suspended immediately (spinning does not give up processor time), to see whether the thread holding the lock will release it soon. Spin locking was introduced in JDK 1.4.2 and was disabled by default, but could be turned on with -XX:+UseSpinning; it is enabled by default in JDK 1.6.
  • Disadvantages of spin locking: spin waiting is not a substitute for blocking. While it avoids the overhead of thread switching, it occupies processor time. If the thread holding the lock releases it quickly, spinning is very efficient; conversely, a spinning thread consumes processor resources without doing any meaningful work, wasting performance. There must therefore be a limit on the spin wait time (spin count), such as 10 iterations, and if the lock still has not been acquired when the limit is exceeded, the thread should be suspended (blocked). The spin count can be adjusted with the -XX:PreBlockSpin parameter; the default is 10.
  • Adaptive spin locking: JDK1.6 introduces adaptive spin locking, which means that the number of spins is no longer fixed, but is determined by the previous spin time on the same lock and the state of the lock owner: If the spin wait has just successfully acquired the lock on the same object, and the thread holding the lock is running, the virtual machine will assume that the spin is likely to succeed again, and it will allow the spin wait to last a relatively long time. If the spin is rarely successfully acquired for a lock, it is possible to omit the spin process in future attempts to acquire the lock to avoid wasting processor resources. In simple terms, if a thread spins successfully, it will spin more next time, and if it fails, it will spin less.
  • Spin lock usage scenario: from the lightweight-lock acquisition process we know that when a thread’s CAS operation fails while acquiring a lightweight lock, it spins to wait for the lock before it inflates to a heavyweight lock. (See “Lightweight Lock” above.)
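The adaptive-spin heuristic above can be sketched as a simple feedback rule: spin longer next time after a spin that succeeded, spin less (possibly not at all) after one that failed. The doubling/halving rule, bounds, and starting value here are invented for illustration — HotSpot’s actual policy is internal.

```java
public class AdaptiveSpin {
    private static final int MIN_SPINS = 0, MAX_SPINS = 1 << 10; // illustrative bounds
    private int allowedSpins = 10;  // starting point, echoing the default PreBlockSpin of 10

    /** Feedback recorded after each acquisition attempt on this particular lock. */
    public void feedback(boolean spinSucceeded) {
        if (spinSucceeded) {
            allowedSpins = Math.min(MAX_SPINS, allowedSpins * 2); // success: allow longer spins
        } else {
            allowedSpins = Math.max(MIN_SPINS, allowedSpins / 2); // failure: spin less, maybe skip
        }
    }

    public int allowedSpins() { return allowedSpins; }

    public static void main(String[] args) {
        AdaptiveSpin s = new AdaptiveSpin();
        s.feedback(true);
        System.out.println(s.allowedSpins());  // 20: a success doubled the budget
        s.feedback(false);
        s.feedback(false);
        System.out.println(s.allowedSpins());  // 5: two failures halved it twice
    }
}
```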

7. To summarize

  • Synchronized features: ensure memory visibility and atomicity of operation

  • Reasons that synchronized affects performance:

    • 1. Additional operations are required for locking and unlocking; 2. The biggest impact of mutex synchronization on performance is the implementation of blocking, because the operations of suspending and resuming threads involved in blocking need to be completed in kernel mode (the performance cost of switching between user mode and kernel mode is relatively high).
  • Synchronized lock: The Mark Word in the object header is reused based on the lock flag bit

    • Biased locking: improves performance when only one thread executes a synchronized block. The Mark Word stores the ThreadID the lock is biased to; afterwards, that thread does not need CAS to lock and unlock when entering and exiting the synchronized block, it simply compares the ThreadID. Characteristics: a biased lock is released only when thread contention appears; the thread holding it never releases it actively. A later competing thread first checks whether the thread holding the biased lock is alive. If it is not, the object becomes lock-free and can be biased again. If it is alive, the biased lock is upgraded to a lightweight lock, which is then held by the thread that originally held the bias; that thread continues executing its synchronized code, while the competing thread spins, waiting to acquire the lightweight lock
    • Lightweight lock: create a space called the Lock Record in the current thread’s stack frame and try to copy the lock object’s current Mark Word into it. If the copy succeeds, the virtual machine uses a CAS operation to try to update the object’s Mark Word to a pointer to the Lock Record, and points the owner pointer in the Lock Record at the object’s Mark Word. If the CAS fails and there is currently only one waiting thread, it spins for a while, since the thread holding the lightweight lock may release it soon. But when the spin exceeds a certain count, or when one thread holds the lock, a second is spinning, and a third arrives, the lightweight lock inflates into a heavyweight lock
    • Heavyweight lock: the Mark Word points to the mutex, implemented on top of the operating system’s mutex lock; threads waiting for the lock are blocked. On Linux, Java threads are mapped one-to-one onto operating-system kernel threads, so this involves switching between user mode and kernel mode, and blocking and resuming kernel-level threads.

About me

More Android Advanced Interview collections are available on Github

If you need it, you can contact me via “About me” to obtain it.

I hope to communicate with you and make progress together

At present I am a programmer. I share not only Android development knowledge but also the growth history of technical people, including personal summaries, workplace experience, interview experience, and more, hoping to help you avoid some detours.