This article introduces the concepts and classifications involved in thread safety, how synchronization is implemented, and the underlying operating principles of virtual machines, as well as a series of lock optimization measures taken by virtual machines to achieve efficient concurrency.
- Overview
- Thread safety
- Lock optimization
1. Overview
The previous article in this series, on understanding the JVM memory model & threads, introduced how the virtual machine implements concurrency; the focus now is how the virtual machine makes that concurrency “efficient”.
2. Thread safety
Before achieving efficiency, concurrency needs to be correct, so this section introduces thread safety first.
A. Definition: An object is thread-safe if multiple threads can access it and obtain correct results without the callers having to consider how those threads are scheduled or interleaved at runtime, and without any additional synchronization or other coordination on the caller’s side.
Thread-safe code has one defining feature: it encapsulates all the necessary correctness guarantees itself (such as mutual-exclusion synchronization), so that callers need not care about multithreading, let alone take measures of their own to ensure correct invocation.
B. Classification: by degree of thread safety, from strong to weak, five categories
- Immutable: the externally visible state never changes and is always observed as a consistent state by every thread.
  - Immutable objects must be thread-safe.
  - How to implement:
    - If the shared data is a basic data type, declaring it with the `final` keyword is enough;
    - If the shared data is an object, the simplest method is to declare all the variables holding its state as `final`.
- Absolute thread safety: fully satisfies the definition given above, i.e. “no additional synchronization measures are required by the caller regardless of the runtime environment.”
- Relative thread safety: each individual operation on the object is thread-safe and needs no extra safeguard when invoked. However, for certain sequences of consecutive calls, additional synchronization on the calling side may be required to guarantee correctness.
  - This is thread safety in the usual sense.
  - Most thread-safe classes fall into this category, such as `Vector`, `HashTable`, and the collections wrapped by `Collections#synchronizedCollection()`.
  - More on the implementation in the next section.
- Thread compatible: the object itself is not thread-safe, but it can be used safely in a concurrent environment if the calling side applies synchronization correctly.
  - This is “not thread-safe” in the usual sense.
  - Most classes in the Java API are thread-compatible, for example `ArrayList` and `HashMap`.
- Thread hostile: code that cannot be used concurrently in a multithreaded environment, regardless of whether the calling side applies synchronization.
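As a minimal sketch of the “immutable” category above: a class whose state is all `final` and set once in the constructor is thread-safe with no synchronization at all. The class and method names here are illustrative, not from the original article.

```java
// Minimal sketch of an immutable, and therefore thread-safe, class:
// all state is final and assigned exactly once in the constructor, so
// every thread observes the same consistent values without locking.
public final class Point {
    private final int x;   // basic data type: final is enough
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // "Mutation" returns a new object instead of changing existing state
    public Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}
```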
C. Thread-safe implementation
This article focuses on what the virtual machine itself does:
- Thread safety can also be achieved purely through how the code is written;
- Synchronization and locking are implemented by the VM.
① Mutual Exclusion & Synchronization
- Meaning:
  - Synchronization: when multiple threads concurrently access shared data, ensuring that the shared data is used by only one thread at a time.
  - Mutual exclusion: a means of achieving synchronization. Critical sections, mutexes, and semaphores are its main implementations.
  Mutual exclusion is the cause, synchronization is the effect; mutual exclusion is the means, synchronization is the end.
- This is a pessimistic concurrency strategy: it assumes problems will occur unless proper synchronization is done, and therefore locks regardless of whether the shared data is actually contended.
- Its biggest cost is the performance of blocking and waking threads, hence it is also called Blocking Synchronization.
- Methods:
  - Use the `synchronized` keyword:
    - Principle: after compilation, the `monitorenter` and `monitorexit` bytecode instructions are emitted before and after the synchronized block, respectively. Both instructions take a `reference`-type operand specifying the object to lock or unlock. If an object argument is explicitly given, that object’s `reference` is used; otherwise, depending on whether `synchronized` modifies an instance method or a class method, the corresponding object instance or Class object becomes the lock object.
    - Process: executing `monitorenter` attempts to acquire the object’s lock. If the object is unlocked, or is already locked by the current thread, the lock counter is incremented by 1; executing `monitorexit` decrements the counter by 1; when the counter reaches 0, the lock is released. If acquiring the lock fails, the current thread blocks until the lock is released by another thread.
    - Note in particular: `synchronized` blocks are reentrant for the same thread, so a thread cannot deadlock on itself; also, a synchronized block blocks all subsequent threads until the thread inside it finishes executing.
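The reentrancy and lock-counter behavior of `synchronized` can be sketched as follows; the class and method names are illustrative, not from the original article.

```java
// Minimal sketch: the same thread re-enters a synchronized block it
// already holds without deadlocking, because the monitor's counter is
// incremented for the owning thread instead of blocking it.
public class ReentrancyDemo {
    private final Object lock = new Object();
    private int depth = 0;

    public int outer() {
        synchronized (lock) {          // monitorenter: counter 0 -> 1
            depth++;
            inner();                   // re-enter the same monitor
            return depth;
        }                              // monitorexit: counter back to 0
    }

    private void inner() {
        synchronized (lock) {          // same thread: counter 1 -> 2, no block
            depth++;
        }
    }

    public static void main(String[] args) {
        System.out.println(new ReentrancyDemo().outer()); // prints 2
    }
}
```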
  - Use the reentrant lock `ReentrantLock`:
    - Same usage: very similar to `synchronized`, and both are reentrant.
    - Differences from `synchronized`:
      - Interruptible waiting: when the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting and handle something else instead.
      - Fair locking: multiple threads waiting for the same lock acquire it in the order in which they requested it. `synchronized` is unfair: any thread waiting for the lock has a chance to acquire it when the lock is released. `ReentrantLock` is also unfair by default, but a constructor taking a boolean can switch it to a fair lock.
      - Binding multiple conditions to one lock: a `ReentrantLock` object can call `newCondition()` multiple times to bind several `Condition` objects at once. With `synchronized`, the lock object’s `wait()`, `notify()`, and `notifyAll()` can implement only one implicit condition; associating the lock with more than one condition requires an additional lock.
    - Choosing between them: prefer `synchronized` wherever it meets the requirements. The next two graphs compare their throughput on different processors.
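The three `ReentrantLock` features above — interruptible waiting, the fair-lock constructor, and multiple conditions on one lock — come together in the classic bounded-buffer pattern (adapted from the example in the `Condition` javadoc; the class name is illustrative):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer {
    private final ReentrantLock lock = new ReentrantLock(true); // fair lock
    private final Condition notFull  = lock.newCondition();     // condition 1
    private final Condition notEmpty = lock.newCondition();     // condition 2
    private final Object[] items = new Object[16];
    private int putIdx, takeIdx, count;

    public void put(Object x) throws InterruptedException {
        lock.lockInterruptibly();          // waiting thread can be interrupted
        try {
            while (count == items.length)
                notFull.await();           // wait only on the "not full" condition
            items[putIdx] = x;
            putIdx = (putIdx + 1) % items.length;
            count++;
            notEmpty.signal();             // wake a consumer, not a producer
        } finally {
            lock.unlock();
        }
    }

    public Object take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0)
                notEmpty.await();
            Object x = items[takeIdx];
            takeIdx = (takeIdx + 1) % items.length;
            count--;
            notFull.signal();              // wake a producer, not a consumer
            return x;
        } finally {
            lock.unlock();
        }
    }
}
```

With `synchronized` and a single implicit condition, `notifyAll()` would have to wake producers and consumers alike; the two `Condition` objects let each `signal()` target exactly the threads that can make progress.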
② Non-blocking Synchronization:
- An optimistic concurrency strategy based on conflict detection: perform the operation first; if no other thread contends for the shared data, the operation succeeds; if there is a conflict, take other compensating measures.
- To make the two steps — the operation and the collision detection — atomic, hardware instruction-set support is required, for example:
- Test and Set (test-and-set)
- Fetch and Increment
- Swap
- Compare and Swap (CAS)
- Load-Linked/Store-Conditional (LL/SC)
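In Java, the CAS instruction is exposed through the `java.util.concurrent.atomic` classes. A minimal sketch of the optimistic operate-then-detect loop (the class name is illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Optimistic increment: read, compute, then attempt one atomic
    // compare-and-swap; if another thread changed the value first,
    // the CAS fails and we retry (the "compensating measure").
    public int increment() {
        for (;;) {
            int current = value.get();       // the operation: read current value
            int next = current + 1;
            // collision detection + write as one atomic hardware instruction
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // another thread won the race; loop and try again
        }
    }

    public int get() { return value.get(); }
}
```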
③ No synchronization scheme
- Definition: thread safety guaranteed without any synchronization, because some code is inherently thread-safe. Two examples:
- 1. Reentrant Code / Pure Code
  - Meaning: the code can be interrupted at any point of its execution to run another piece of code, and after control returns, the original program continues without any errors.
  - Common features: it does not depend on data stored on the heap or on shared system resources; all state it uses is passed in as parameters; it calls no non-reentrant methods...
  - Criterion: a method is reentrant if its return value is predictable, i.e. it returns the same result whenever it is given the same input data.
  Reentrant code is always thread-safe, but thread-safe code is not necessarily reentrant.
- 2. Thread Local Storage
  - Meaning: the visibility of shared data is restricted to a single thread, so the absence of data contention between threads is guaranteed without synchronization.
  - The `ThreadLocal` class implements thread-local storage: each `Thread` object holds a `ThreadLocalMap` that stores key-value pairs whose keys are `ThreadLocal.threadLocalHashCode` values and whose values are the thread-local variables. The `ThreadLocal` object is the access entry to the current thread’s `ThreadLocalMap`; it carries a unique `threadLocalHashCode` value used to look up the corresponding thread-local variable in that map.
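A small sketch of the per-thread isolation that `ThreadLocal` provides — each thread reads and writes only the entry in its own `ThreadLocalMap` (the class and method names here are illustrative):

```java
public class ThreadLocalDemo {
    // Each thread sees its own copy; withInitial supplies the default
    // returned before a thread has called set().
    private static final ThreadLocal<Integer> perThread =
            ThreadLocal.withInitial(() -> 0);

    // Returns {value seen by a fresh thread, value seen by the caller}.
    static int[] demo() throws InterruptedException {
        perThread.set(1);                       // caller's private copy
        final int[] other = new int[1];
        Thread t = new Thread(() -> {
            other[0] = perThread.get();         // fresh thread -> initial value 0
            perThread.set(42);                  // invisible to the caller's thread
        });
        t.start();
        t.join();
        return new int[] { other[0], perThread.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1]);  // the caller's 1 is untouched
    }
}
```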
3. Lock optimizations
With the correctness of concurrency resolved, here are five lock optimization techniques for sharing data more “efficiently” between threads, resolving contention issues, and improving program execution efficiency.
A. Adaptive Spinning
- Background: When implementing mutex synchronization, the operations of suspending and resuming threads need to be completed in kernel mode, which affects the concurrency performance of the system. At the same time, the locking state of shared data on many applications is only temporary and there is no need to suspend or resume threads.
- Spin locks: on a physical machine with multiple processors, where several threads can execute in parallel, the thread requesting the lock is made to wait without giving up its processor time slice, to see whether the thread holding the lock will release it soon. To wait this way, the thread simply executes a busy loop — that is, it spins.
- Note: Spin wait is not a substitute for blocking. It avoids the overhead of thread switching but consumes processor time. Therefore, the spin wait must be limited to a certain amount of time.
- Adaptive spinning: the spin time is no longer fixed, but is determined by the previous spin time on the same lock and the state of the lock’s owner. Concretely:
- If the spin wait has just been successfully acquired for a lock and the thread holding the lock is running, the virtual machine is likely to allow the spin wait longer.
- If spin is rarely successfully acquired for a lock, it is likely that the spin wait will be omitted in the future to avoid wasting processor resources.
B. Lock Elimination
- Lock elimination: the virtual machine’s just-in-time compiler removes locks from code that requires synchronization as written, but where contention on the shared data is detected to be impossible.
- Criteria: If all data on the heap in a piece of code does not escape to be accessed by other threads, it can be treated as data on the stack, that is, thread private and does not need to be locked synchronously.
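A commonly cited shape of lock-eliminable code: a `StringBuffer` that never escapes the method. Whether the JIT actually elides the locks depends on the VM and its escape-analysis settings, so treat this as a sketch of the pattern rather than a guarantee:

```java
public class LockElision {
    // sb is a local object that never escapes this method, so escape
    // analysis can prove it is thread-private; the synchronization
    // inside StringBuffer.append()/toString() can then be removed.
    public static String concat(String a, String b, String c) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);   // each append() is synchronized on sb...
        sb.append(b);   // ...but no other thread can ever see sb,
        sb.append(c);   // so the lock is provably uncontended
        return sb.toString();
    }
}
```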
C. Lock Coarsening
In general, the scope of the synchronized block is limited to the actual scope of the shared data. This minimizes the number of operations that need to be synchronized and ensures that the thread waiting for the lock can acquire the lock as soon as possible even if there is a lock contention.
However, if the same object is repeatedly locked and unlocked, the frequent mutex synchronization causes unnecessary performance loss even when there is no thread contention. In this case, the scope of the lock is coarsened to cover the entire sequence of operations, so that the lock needs to be acquired only once.
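A typical candidate for coarsening is a sequence of back-to-back locked operations on the same object, such as repeated `StringBuffer.append()` calls in a loop. As with lock elimination, whether the JIT coarsens here is VM-dependent; this is a sketch of the pattern:

```java
public class LockCoarsening {
    public static String repeatedAppend(String s) {
        StringBuffer sb = new StringBuffer();
        // Each append() locks and unlocks sb. With the lock/unlock pairs
        // immediately adjacent, the JIT may coarsen them into a single
        // lock held across the whole loop: one monitor entry instead of
        // one per iteration.
        for (int i = 0; i < 3; i++) {
            sb.append(s);
        }
        return sb.toString();
    }
}
```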
D. Lightweight Locking
- Objective: to reduce the performance cost of traditional heavyweight locks (which use OS mutexes) in the absence of multithreaded contention. Note that lightweight locks are not meant to replace heavyweight locks.
Let’s start by understanding the memory layout of the HotSpot virtual machine’s object header, which has two parts:
- The first part, called the Mark Word, stores the object’s own runtime data, such as its hash code and GC generational age; it is the key to implementing lightweight and biased locking.
- The other part stores a pointer to the object’s type data in the method area; if the object is an array, an additional section stores the array length.
- Locking process: when code enters the synchronized block, if the synchronization object is not locked (lock flag bits are `01`), the virtual machine creates a space called a Lock Record in the current thread’s stack frame, used to store a copy of the lock object’s Mark Word. See the diagram below.
The virtual machine then attempts to use a CAS operation to update the object’s Mark Word to a pointer to the Lock Record. If the update succeeds, the current thread owns the object’s lock and the lock flag bits of the Mark Word change to `00`, i.e. the lightweight locked state. If it fails, the virtual machine first checks whether the object’s Mark Word already points to the current thread’s stack frame: if so, the current thread already owns the lock and can enter the synchronized block directly; otherwise, the lock has been preempted by another thread. See the diagram below.
In addition, if two or more threads contend for the same lock, the lightweight lock is no longer effective and must inflate to a heavyweight lock: the lock flag bits change to `10`, the Mark Word stores a pointer to the heavyweight lock, and the threads waiting for the lock are blocked.
- Unlocking process: if the object’s Mark Word still points to the thread’s Lock Record, a CAS operation replaces the object’s current Mark Word with the copy stored in the thread. If the replacement succeeds, the whole synchronization process is complete; if it fails, another thread has attempted to acquire the lock, so the lock must be released and the suspended threads woken at the same time.
- Advantages: since for most locks there is no contention during the entire synchronization cycle, lightweight locks avoid the overhead of mutex synchronization by using CAS operations instead.
E. Biased Locking
- Objective: To eliminate synchronization primitives in the case of uncontested data and further improve the performance of the program.
- Meaning: A bias lock is biased in favor of the first thread that acquired it. If the lock is not acquired by another thread in subsequent executions, the thread that holds the bias lock will never need to synchronize again.
- Locking process: when the lock object is acquired by a thread for the first time, the Mark Word lock flag bits are set to `01`, i.e. biased mode. Meanwhile, a CAS operation records the ID of the acquiring thread in the object’s Mark Word. If the operation succeeds, the thread holding the biased lock performs no further synchronization operations each time it enters a synchronized block associated with this lock.
- Unlocking process: when another thread attempts to acquire the lock, the bias mode ends. Depending on whether the lock object is currently locked, it reverts to the unlocked (`01`) or lightweight locked (`00`) state, and subsequent synchronization proceeds as in the lightweight lock process described above. See the diagram below.
- Advantages: improves the performance of programs that have synchronization but no contention. If most locks in a program are always accessed by multiple different threads, however, the biased mode is unnecessary overhead.