In Java programs, we can use the synchronized keyword for locking. It can be used either to declare a synchronized block or to mark static or instance methods directly.
When a synchronized block is declared, the compiled bytecode contains the monitorenter and monitorexit instructions. Both instructions consume a reference-type element on the operand stack (that is, the reference in the parentheses of the synchronized keyword) as the lock object to be locked or unlocked.
```
public void foo(Object lock) {
    synchronized (lock) {
        lock.hashCode();
    }
}
// The above Java code will be compiled into the following bytecode
public void foo(java.lang.Object);
  Code:
     0: aload_1
     1: dup
     2: astore_2
     3: monitorenter
     4: aload_1
     5: invokevirtual java/lang/Object.hashCode:()I
     8: pop
     9: aload_2
    10: monitorexit
    11: goto          19
    14: astore_3
    15: aload_2
    16: monitorexit
    17: aload_3
    18: athrow
    19: return
  Exception table:
     from    to  target type
         4    11    14   any
        14    17    14   any
```
Above I have posted a piece of Java code containing a synchronized block, together with the bytecode it compiles to. You may notice that the bytecode contains one monitorenter instruction but multiple monitorexit instructions. This is because the Java virtual machine needs to ensure that the acquired lock is released on both the normal and the exceptional execution paths.
Based on what I described in the article on exception handling, you can enumerate all possible execution paths from the bytecode and the exception table, and verify that on every path a monitorexit is executed after the monitorenter.
When synchronized is used to mark a method, you will see that the method's access flags in the bytecode include ACC_SYNCHRONIZED. This flag indicates that the Java virtual machine needs to perform a monitorenter operation when entering the method, and a monitorexit operation when exiting it, whether the method returns normally or throws an exception to the caller.
```
public synchronized void foo(Object lock) {
    lock.hashCode();
}
// The above Java code will compile to the following bytecode
public synchronized void foo(java.lang.Object);
  descriptor: (Ljava/lang/Object;)V
  flags: (0x0021) ACC_PUBLIC, ACC_SYNCHRONIZED
  Code:
    stack=1, locals=2, args_size=2
       0: aload_1
       1: invokevirtual java/lang/Object.hashCode:()I
       4: pop
       5: return
```
Here the lock objects corresponding to the monitorenter and monitorexit operations are implicit. For instance methods, the lock object is this; for static methods, the lock object is the Class instance of the declaring class.
The effect of monitorenter and monitorexit can be understood abstractly as each lock object having a lock counter and a pointer to the thread that holds the lock.
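To make the implicit lock objects concrete, here is a small sketch (the class and method names are made up for illustration) showing which object each kind of synchronized method locks on:

```java
// Illustrative only: each synchronized method locks the same object as its explicit counterpart.
public class Counter {
    private int value;
    private static int total;

    public synchronized void increment() {               // locks on `this`
        value++;
    }

    public void incrementExplicit() {                     // equivalent explicit form
        synchronized (this) {
            value++;
        }
    }

    public static synchronized void incrementTotal() {    // locks on Counter.class
        total++;
    }

    public static void incrementTotalExplicit() {         // equivalent explicit form
        synchronized (Counter.class) {
            total++;
        }
    }
}
```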
If the target lock object has a counter of 0 when monitorenter is executed, it is not held by another thread. In this case, the Java virtual machine sets the thread holding the lock object to the current thread and increments its counter by one.
If the target lock object's counter is not zero and the holding thread is the current thread, the Java virtual machine increments the counter by one; otherwise the current thread waits until the holding thread releases the lock.
When monitorexit is executed, the Java virtual machine decrements the lock object's counter by one. When the counter drops to 0, the lock is released.
This counter allows the same thread to acquire the same lock repeatedly. For example, if a Java class contains several synchronized methods, calls between these methods, whether direct or indirect, will repeatedly lock the same lock object. Reentrancy is therefore needed so that such code does not have to work around implicit constraints.
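Here is a minimal sketch of this abstract counter-based algorithm, written as an ordinary Java class purely for illustration (it uses wait/notify internally and is of course not how the virtual machine itself implements monitorenter and monitorexit):

```java
// Illustrative sketch of the abstract lock-counter algorithm described above.
final class AbstractMonitor {
    private Thread owner;   // the thread holding the lock, or null if the counter is 0
    private int count;      // the lock counter

    synchronized void enter() throws InterruptedException {
        Thread current = Thread.currentThread();
        while (owner != null && owner != current) {
            wait();                         // held by another thread: wait for release
        }
        owner = current;                    // counter was 0, or this is a reentrant acquisition
        count++;                            // increment the counter
    }

    synchronized void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException();
        }
        if (--count == 0) {                 // counter drops to 0: the lock is released
            owner = null;
            notifyAll();                    // wake up waiting threads
        }
    }
}
```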
With the abstract locking algorithm out of the way, let's take a look at the specific locking implementations in the HotSpot virtual machine.
Heavyweight lock
Heavyweight locking is the most basic implementation of locking in the Java virtual machine. In this state, the Java virtual machine blocks the threads that fail to acquire the lock and wakes them up when the lock is released.
Java threads are blocked and woken up by the operating system. On POSIX-compliant systems (such as macOS and most Linux distributions), for example, this is done through pthreads mutexes. These operations involve system calls and require switching from user mode to kernel mode, which is expensive.
To minimize these costly blocking and wake-up operations, the Java virtual machine has a thread spin before it blocks, and again after it is woken up but fails to win the lock: the thread keeps running on the processor and repeatedly polls whether the lock has been released. If the lock is released during the spin, the current thread can acquire it without blocking at all.
Compared with blocking, spinning can waste a lot of processor resources, because the current thread is still running but executing useless instructions, hoping the lock will be released while it does so.
We can use waiting at a traffic light as an analogy. Blocking a Java thread is like stopping with the engine off, while spinning is like stopping with the engine idling. If the wait at the red light is long, turning the engine off saves more fuel; if the wait is very short, for example when the synchronized block only performs a single integer addition, the lock is bound to be released quickly, and idling is more appropriate.
However, the Java virtual machine cannot see how much time is left on the red light, so it cannot choose between spinning and blocking based on the length of the wait. Instead it uses adaptive spinning, dynamically adjusting the spin duration (the number of loop iterations) based on whether the lock was acquired during the previous spin.
In our analogy, if you caught the green light while idling last time, you will idle a little longer this time; if you had to turn the engine off before the light turned green last time, you will idle a little less this time.
Another side effect of spinning is that the lock becomes unfair: a blocked thread has no way to compete for the lock the moment it is released, whereas a spinning thread is likely to grab it first.
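To make the "spin first, then block" idea above concrete, here is a minimal sketch using a fixed spin limit instead of HotSpot's adaptive one; the class, constant, and method names are made up for illustration, and a real virtual machine blocks on an OS primitive rather than parking with a timeout:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Illustrative spin-then-block lock; HotSpot adjusts the spin limit adaptively.
final class SpinThenBlockLock {
    private static final int SPIN_LIMIT = 1000;         // fixed here; adaptive in HotSpot
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (held.compareAndSet(false, true)) {
                return;                                  // lock released during the spin: no blocking
            }
            Thread.onSpinWait();                         // hint that we are busy-waiting (JDK 9+)
        }
        while (!held.compareAndSet(false, true)) {
            LockSupport.parkNanos(1_000_000L);           // give up the processor; a real VM blocks
        }                                                // and is woken by the releasing thread
    }

    void unlock() {
        held.set(false);
    }
}
```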
Lightweight lock
You have probably seen an intersection at night with yellow lights flashing in all four directions. Since there is little traffic late at night, running a normal red-green cycle would likely leave just a single car waiting at a red light while nobody crosses from the other directions.
Therefore, the light is set to flash yellow, meaning vehicles may pass freely but drivers need to watch carefully (that is my personal understanding; please consult the traffic authorities for the actual meaning).
A similar situation exists for the Java virtual machine: multiple threads request the same lock, but at different times, so there is never any actual contention. In this case, the Java virtual machine uses lightweight locks to avoid the blocking and wake-up costs of heavyweight locks. Before introducing how lightweight locks work, let's look at how the Java virtual machine distinguishes lightweight locks from heavyweight locks.
In the article on object memory layout, I introduced the mark word in the object header. Its last two bits indicate the lock state of the object: 00 represents a lightweight lock, 01 represents no lock (or a biased lock), 10 represents a heavyweight lock, and 11 is related to the garbage collector's marking phase.
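As a memo, that encoding can be sketched as follows; the constant names are my own, not HotSpot's:

```java
// Illustrative constants for the last two bits of the mark word, as listed above.
final class MarkWordLockBits {
    static final int LOCK_BITS_MASK     = 0b11;
    static final int LIGHTWEIGHT_LOCKED = 0b00;  // lightweight lock
    static final int UNLOCKED_OR_BIASED = 0b01;  // no lock, or biased lock
    static final int HEAVYWEIGHT_LOCKED = 0b10;  // heavyweight (inflated) lock
    static final int GC_MARKED          = 0b11;  // used by the garbage collector's marking

    static int lockState(long markWord) {
        return (int) (markWord & LOCK_BITS_MASK);
    }
}
```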
When locking, the Java virtual machine first determines whether the lock is already a heavyweight lock. If not, it reserves space in the current thread's current stack frame as the lock record for that lock, and copies the lock object's mark word into this lock record.
The Java virtual machine then attempts to replace the lock object's mark word using a CAS (compare-and-swap) operation. As a reminder, CAS is an atomic operation that compares the value at the target address with an expected value and, if they are equal, replaces it with a new value.
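If CAS is unfamiliar, the JDK exposes the same primitive through its atomic classes. The following small example (with made-up values standing in for a mark word and a lock-record address) shows its compare-then-replace semantics:

```java
import java.util.concurrent.atomic.AtomicLong;

// compareAndSet atomically replaces the value only if it still equals the expected value.
public class CasDemo {
    public static void main(String[] args) {
        AtomicLong markWord = new AtomicLong(0b001);           // pretend "unlocked" mark word (ends in 01)
        long expected = 0b001;
        long lockRecordAddress = 0b1000;                       // pretend aligned address (ends in 00)

        System.out.println(markWord.compareAndSet(expected, lockRecordAddress)); // true: replaced
        System.out.println(markWord.compareAndSet(expected, lockRecordAddress)); // false: value has changed
    }
}
```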
Suppose the lock object's current mark word is X...XYZ. The Java virtual machine compares whether this mark word equals X...X01; if so, it replaces it with the address of the lock record just allocated. Because of memory alignment, the last two bits of that address are 00. At this point, the thread has successfully acquired the lock and can continue.
If the mark word is not X...X01, there are two possibilities. First, the current thread is acquiring the same lock again; in this case, the Java virtual machine stores 0 in the lock record to indicate a reentrant acquisition. Second, another thread holds the lock; in this case, the Java virtual machine inflates the lock into a heavyweight lock and blocks the current thread.
When unlocking, if the value of the current lock record is 0 (you can picture all of a thread's lock records as a stack: each lock operation pushes a lock record, each unlock pops one, and the current lock record is the one on top), this represents a reentrant entry into the same lock, and the virtual machine simply returns.
Otherwise, the Java virtual machine uses CAS to check whether the lock object's mark word equals the address of the current lock record. If it does, it is replaced with the value stored in the lock record, which is the lock object's original mark word. At this point, the thread has successfully released the lock.
If not, the lock has been inflated into a heavyweight lock. In that case, the Java virtual machine enters the heavyweight-lock release process and wakes up the threads that were blocked competing for the lock.
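To tie the locking and unlocking steps together, here is a deliberately simplified, single-threaded simulation of the fast path. The mark word is modeled as an AtomicLong, a lock record as a one-element array on a per-thread-style stack, and "addresses" are faked with identityHashCode; contention and inflation to a heavyweight lock are not modeled, and none of this matches HotSpot's real data layout:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicLong;

// Single-threaded sketch of the lightweight-lock fast path described above (illustrative only).
public class ThinLockSketch {
    static final long UNLOCKED = 0b01;                          // last two bits 01: no lock

    final AtomicLong markWord = new AtomicLong(UNLOCKED);       // the object's mark word
    final Deque<long[]> lockRecords = new ArrayDeque<>();       // this thread's lock-record "stack"

    static long fakeAddress(long[] record) {
        return ((long) System.identityHashCode(record)) << 2;   // aligned: last two bits are 00
    }

    void lock() {
        long[] record = new long[1];
        record[0] = markWord.get();                             // copy (displace) the current mark word
        if (markWord.compareAndSet(UNLOCKED, fakeAddress(record))) {
            lockRecords.push(record);                           // CAS succeeded: lightweight lock held
            return;
        }
        // Mark word is not X...X01: in this single-threaded sketch it must be a reentrant acquisition.
        record[0] = 0;                                          // a zeroed record marks reentrancy
        lockRecords.push(record);
    }

    void unlock() {
        long[] record = lockRecords.pop();
        if (record[0] == 0) {
            return;                                             // reentrant exit: nothing else to do
        }
        // Restore the displaced mark word; a real VM would inflate the lock if this CAS failed.
        markWord.compareAndSet(fakeAddress(record), record[0]);
    }

    public static void main(String[] args) {
        ThinLockSketch o = new ThinLockSketch();
        o.lock();
        o.lock();                                               // reentrant acquisition
        o.unlock();
        o.unlock();
        System.out.println("unlocked: " + (o.markWord.get() == UNLOCKED)); // prints "unlocked: true"
    }
}
```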
Biased locking
If lightweight locks are optimistic, then biased locks are even more optimistic: they assume that, from start to finish, only one thread will ever request the lock.
It is as if you had a traffic light on your private estate and you were the only driver. Biased locking is like having the light recognize the license plate of the approaching car: if it matches yours, it simply turns green.
Specifically, when a thread locks an object and the lock object supports biased locking, the Java virtual machine uses CAS to record the address of the current thread in the lock object's mark word and sets the last three bits of the mark word to 101.
For the rest of the run, whenever a thread requests the lock, the Java virtual machine only needs to check that the last three bits of the lock object's mark word are 101, that the mark word contains the current thread's address, and that its epoch value matches the epoch value of the lock object's class. If all three hold, the current thread holds the biased lock and can return immediately.
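A small sketch of that three-part check follows; the bit layout (where the thread address and epoch sit inside the mark word) is an assumption made purely for illustration and does not match HotSpot's actual layout. The epoch check itself is explained right below.

```java
// Illustrative biased-lock fast-path check; the bit layout here is assumed, not HotSpot's.
public class BiasedFastPath {
    static final long TAG_MASK   = 0b111;   // last three bits
    static final long BIASED_TAG = 0b101;   // "biased" pattern described above

    // Assume bits [3..18] hold the epoch and the higher bits hold the thread address.
    static long makeBiasedMarkWord(long threadAddress, int epoch) {
        return (threadAddress << 19) | ((long) (epoch & 0xFFFF) << 3) | BIASED_TAG;
    }

    static boolean fastPathOwns(long markWord, long threadAddress, int classEpoch) {
        boolean biased     = (markWord & TAG_MASK) == BIASED_TAG;
        boolean sameThread = (markWord >>> 19) == threadAddress;
        boolean sameEpoch  = ((markWord >>> 3) & 0xFFFF) == (classEpoch & 0xFFFF);
        return biased && sameThread && sameEpoch;        // all three hold: return immediately
    }

    public static void main(String[] args) {
        long mark = makeBiasedMarkWord(0x42, 7);
        System.out.println(fastPathOwns(mark, 0x42, 7)); // true: current thread holds the bias
        System.out.println(fastPathOwns(mark, 0x43, 7)); // false: biased toward another thread
        System.out.println(fastPathOwns(mark, 0x42, 8)); // false: epoch is stale
    }
}
```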
What is the epoch value here?
Let's start with revoking a biased lock. When the requesting thread does not match the thread address stored in the lock object's mark word (and the epoch values are equal; if they are not, the current thread can simply re-bias the lock toward itself), the Java virtual machine needs to revoke the biased lock. This revocation process is rather cumbersome: it requires the thread holding the biased lock to reach a safepoint, and then replaces the biased lock with a lightweight lock.
If the total number of revocations for lock objects of a certain class exceeds a threshold (controlled by the Java virtual machine parameter -XX:BiasedLockingBulkRebiasThreshold, default 20), the Java virtual machine declares the biased locks of that class invalid.
The idea is to maintain an epoch value in each class, which you can think of as the generation of its biased locks. When a biased lock is set, the Java virtual machine copies this epoch value into the lock object's mark word.
When declaring a class's biased locks invalid, the Java virtual machine increments the class's epoch value, indicating that the previous generation of biased locks is no longer valid. Newly created biased locks copy the new epoch value.
To ensure that threads currently holding biased locks do not lose them, the Java virtual machine traverses the Java stacks of all threads, finds the locked instances of that class, and increments the epoch value in their mark words by one. This operation requires all threads to be at a safepoint.
If the total number of revocations exceeds another threshold (the parameter -XX:BiasedLockingBulkRevokeThreshold, default 40), the Java virtual machine concludes that the class is no longer suitable for biased locking. It then revokes the biased locks of that class's instances, and subsequent lock operations on those instances use lightweight locks directly.
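The per-class bookkeeping can be sketched roughly as follows; the thresholds mirror the default values of the two parameters above, while the rest (field names and the exact triggering logic, which in HotSpot also involves timing heuristics) is simplified for illustration:

```java
// Rough sketch of the per-class bulk rebias / bulk revoke decision described above.
final class BiasedLockingClassState {
    static final int BULK_REBIAS_THRESHOLD = 20;   // -XX:BiasedLockingBulkRebiasThreshold default
    static final int BULK_REVOKE_THRESHOLD = 40;   // -XX:BiasedLockingBulkRevokeThreshold default

    int revocationCount;     // how many biased locks of this class have been revoked
    int epoch;               // the class's current biased-lock "generation"
    boolean biasable = true; // whether instances of this class may still be biased

    void onRevocation() {
        revocationCount++;
        if (revocationCount == BULK_REVOKE_THRESHOLD) {
            biasable = false;        // bulk revoke: stop biasing instances of this class;
                                     // future locking goes straight to lightweight locks
        } else if (revocationCount == BULK_REBIAS_THRESHOLD) {
            epoch++;                 // bulk rebias: invalidate the previous generation of biases
        }
    }
}
```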
Conclusion
This article introduced the Java virtual machine's implementation of the synchronized keyword. From highest cost to lowest, there are three kinds of locks: heavyweight locks, lightweight locks, and biased locks.
Heavyweight locks block and wake up the threads requesting the lock; they are suited to the case where multiple threads genuinely compete for the same lock. For very short synchronized blocks, the Java virtual machine uses adaptive spinning to avoid blocking and waking up threads.
Lightweight locks use a CAS operation to replace the lock object's mark word with a pointer to a space on the current thread's stack where the original mark word is stored. They are suited to the case where multiple threads request the same lock at different times.
Biased locking performs a CAS operation only on the first request, recording the address of the current thread in the lock object's mark word. During the rest of the run, lock operations by the thread holding the biased lock return immediately. It is suited to the case where the lock is only ever held by the same thread.
As the hands-on part of this article, let's verify an anecdotal claim: calling Object.hashCode() on an object disables biased locking for that object.
You can use the parameter -XX:+PrintBiasedLockingStatistics to print counts of the various kinds of locks. Because the C2 compiler uses a different parameter, -XX:+PrintPreciseBiasedLockingStatistics, you can restrict the Java virtual machine to just-in-time compile with C1 only (the parameter -XX:TieredStopAtLevel=1).
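As a starting point for that experiment, here is a hypothetical test program; the class name and the initial sleep are my own additions, the sleep being there to wait out the biased-locking startup delay (-XX:BiasedLockingStartupDelay, 4000 ms by default on JDK 8). Run it with the flags above, once as-is and once with the hashCode() call commented out, and compare the printed statistics.

```java
// Hypothetical test program for the experiment above: compare the printed biased-locking
// statistics with and without the hashCode() call.
public class HashCodeBiasTest {
    public static void main(String[] args) throws InterruptedException {
        Thread.sleep(5000);             // wait out the biased-locking startup delay
        Object lock = new Object();
        synchronized (lock) { }         // first lock: the lock may be biased toward this thread
        lock.hashCode();                // the claim: computing the identity hash code revokes the bias
        synchronized (lock) { }         // second lock: expected to fall back to a lightweight lock
    }
}
```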