The critical section:



A critical section is a segment of a program that accesses a shared resource (such as a shared device or shared storage) that must not be accessed by more than one thread at a time. When one thread enters the critical section, other threads or processes must wait (for example, by blocking). Some synchronization mechanism, such as a semaphore, must be applied at the entry and exit points of the critical section to ensure mutually exclusive use of the shared resource. A printer is a typical example of a device that only a single thread may access at a time.



The mutex:



A mutex is a variable that can be in one of two states: unlocked or locked. Only one bit is needed to represent it, although in practice an integer is often used, with 0 meaning unlocked and any other value meaning locked. A mutex comes with two procedures. When a thread (or process) needs to enter a critical section, it calls mutex_lock. If the mutex is currently unlocked (i.e. the critical section is available), the call succeeds and the calling thread is free to enter the critical section.

On the other hand, if the mutex is locked, the calling thread is blocked until the thread inside the critical section finishes and calls mutex_unlock. If multiple threads are blocked on the mutex, one of them is selected at random and allowed to acquire the lock.
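
The mutex_lock/mutex_unlock protocol described above maps naturally onto Java's java.util.concurrent.locks.ReentrantLock, whose default (non-fair) mode likewise hands the lock to a non-deterministically chosen waiter. A minimal sketch; the class, method, and counter names are illustrative:

```java
import java.util.concurrent.locks.ReentrantLock;

public class MutexExample {
    private static final ReentrantLock mutex = new ReentrantLock();
    private static int sharedCounter = 0;

    static void increment() {
        mutex.lock();          // mutex_lock: blocks if another thread holds the lock
        try {
            sharedCounter++;   // critical section: only one thread at a time
        } finally {
            mutex.unlock();    // mutex_unlock: wakes one blocked waiter, if any
        }
    }
}
```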



Monitor:



A monitor is a program structure whose internal subroutines (objects or modules) give worker threads mutually exclusive access to shared resources. These shared resources are typically hardware devices or sets of variables.

A monitor guarantees that at most one thread is executing one of its subroutines at any given time. Compared with concurrent programs that achieve mutually exclusive access by modifying data structures directly, monitors greatly simplify programming.

All kinds of hardware and software resources in the system can be described abstractly by data structures: a resource is represented by a small amount of information together with the operations performed on it, while its internal structure and implementation details are ignored.

A shared data structure is used to represent the shared resources in the system abstractly, and the operations performed on the shared data structure are defined as a set of procedures.



Semaphore:



A semaphore, sometimes called a signal light, is a facility used in multithreaded environments to ensure that two or more critical sections of code are not executed concurrently. Before entering a critical section, a thread must acquire the semaphore; once the critical section is complete, the thread must release the semaphore. Other threads that want to enter the critical section must wait until the first thread releases the semaphore. In LabVIEW terms, you create a Semaphore VI and place an Acquire Semaphore VI and a Release Semaphore VI at the beginning and end of each critical snippet, verifying that these semaphore VIs all refer to the semaphore originally created.
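
Since this document is about Java, here is the same acquire/release pattern as a minimal sketch using java.util.concurrent.Semaphore; the method name and the single-permit choice are illustrative:

```java
import java.util.concurrent.Semaphore;

public class SemaphoreExample {
    // A binary semaphore (1 permit) makes the guarded code mutually exclusive.
    private static final Semaphore semaphore = new Semaphore(1);

    static void criticalSection() throws InterruptedException {
        semaphore.acquire();     // acquire before entering; blocks if no permit
        try {
            // ... access the shared resource ...
        } finally {
            semaphore.release(); // release on exit so a waiting thread can proceed
        }
    }
}
```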



CAS operation (compare-and-swap):



CAS takes three operands: the memory value V, the expected old value A, and the new value B to write. The memory value V is changed to B if and only if the expected value A and the memory value V are the same; otherwise nothing is done. More details: Zl198751.iteye.com/blog/184857…
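
A minimal sketch of the classic CAS retry loop described above, using AtomicInteger.compareAndSet; the class name, method name, and the increment of 10 are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasExample {
    private static final AtomicInteger value = new AtomicInteger(0);

    // Read the expected value A, compute the new value B, and write only
    // if memory still holds A; otherwise another thread won, so retry.
    static int addTen() {
        for (;;) {
            int a = value.get();               // expected value A (memory value V read)
            int b = a + 10;                    // new value B
            if (value.compareAndSet(a, b)) {
                return b;                      // V == A, swap succeeded
            }
            // V != A: another thread changed it first; loop and retry
        }
    }
}
```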



Reordering:



Reordering is the rearranging of a program's operations by the compiler and the processor at execution time to improve performance. It increases instruction-level parallelism and thus performance, but in multithreaded programs reordering may cause execution results that are not what we want. Reordering happens at the compiler level and at the processor level; processor-level reordering includes instruction-level reordering and memory reordering.
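
A small sketch of how reordering can surface in a multithreaded Java program. The class and field names are illustrative, and the surprising outcome is rare: a single run usually shows nothing, and you would normally repeat the experiment many times to observe it.

```java
public class ReorderDemo {
    static int x = 0, y = 0;
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { x = 1; r1 = y; });
        Thread t2 = new Thread(() -> { y = 1; r2 = x; });
        t1.start(); t2.start();
        t1.join();  t2.join();
        // Intuitively at least one of r1/r2 should be 1, but because the
        // compiler and CPU may reorder each thread's independent write and
        // read, a run can (rarely) end with r1 == 0 && r2 == 0.
        System.out.println("r1=" + r1 + ", r2=" + r2);
    }
}
```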



One, thread and memory interaction operations



(figure: interaction between thread working memory and main memory)



All variables (instance fields, static fields, and the elements that make up array objects, but not local variables or method parameters) are stored in main memory. Each thread has its own working memory, which holds copies of the main-memory variables used by the thread. All operations by a thread on a variable must be done in working memory rather than by reading or writing main memory directly. Different threads cannot directly access variables in each other's working memory; variable values are passed between threads through main memory.



The Java memory model defines eight operations:

 

 

  • Lock: acts on a variable in main memory; marks the variable as exclusively owned by one thread.
  • Unlock: acts on a variable in main memory; releases a locked variable so that it can be locked by another thread.
  • Read: acts on a variable in main memory; transfers the value of the variable from main memory to the thread's working memory for the subsequent load action.
  • Load: acts on a variable in working memory; puts the value obtained by read from main memory into the working-memory copy of the variable.
  • Use: acts on a variable in working memory; passes the value of the variable in working memory to the execution engine.
  • Assign: acts on a variable in working memory; assigns a value received from the execution engine to the variable in working memory.
  • Store: acts on a variable in working memory; transfers the value of the variable in working memory to main memory for the subsequent write operation.
  • Write: acts on a variable in main memory; writes the value obtained by store from working memory into the main-memory variable.

Volatile guarantees two things: 1) new values are immediately synchronized back to main memory, and the value is re-read from main memory immediately before each use; 2) instruction reordering optimization is forbidden. Note: the volatile keyword does not by itself guarantee correct operations on shared data in a multithreaded environment. It is suitable when all threads must be notified immediately after a state change.



Two, the three characteristics of concurrency



Atomicity:

Atomicity refers to the smallest indivisible operation instructions, i.e. a single machine instruction; an atomic operation can be executed by only one thread at any moment, so it is thread-safe. The Java memory model guarantees atomicity of variable access through the read, load, assign, use, store, and write operations. For the two 64-bit data types, long and double, the model does not require read, load, store, and write to be atomic (the so-called nonatomic treatment), but today's commercial Java virtual machines implement these four operations on long and double as atomic anyway. So reads of and writes to primitive data types in Java can be regarded as atomic operations. Wider-ranging atomicity is guaranteed through lock, unlock, and synchronized blocks.

Visibility:

Visibility means that when one thread changes the value of a shared variable, other threads learn of the change immediately. The Java memory model implements visibility by synchronizing the new value back to main memory after a variable is modified and by re-reading the value from main memory before the variable is read. Visibility is guaranteed in Java with the keywords volatile, final, and synchronized:

 

 

 

  • Volatile: ensures visibility by flushing new values to main memory and re-reading them before each use.
  • Synchronized: performing lock on a variable clears its value from working memory, so it must be re-read from main memory before use; before unlock is performed, the variable must be synchronized back to main memory. This guarantees visibility.
  • Final: once a final field has been initialized in the constructor and the constructor has not let the this reference escape, the value of the final field is visible to other threads and can be accessed correctly without synchronization.
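
As a sketch of volatile visibility (class and field names are illustrative): the write to the flag below is seen promptly by the worker thread. Note, as stated above, that volatile alone does not make compound operations such as increments safe.

```java
public class VolatileFlag {
    // Writes to 'running' are flushed to main memory immediately and reads
    // always fetch the latest value, so the worker notices the change promptly.
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // do work; without volatile this loop might never observe the
                // write below, since the thread could keep reading a stale
                // copy from its own working memory
            }
        });
        worker.start();
        Thread.sleep(100);
        running = false;   // visible to the worker on its next read
        worker.join();
    }
}
```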

Orderliness means that within a thread all operations appear to execute in order, while observed from another thread operations may appear out of order, because of the delay in synchronization between working memory and main memory. Java ensures ordering between threads through the volatile and synchronized keywords.

 

 

  • Volatile disallows instruction reordering optimization to achieve order.
  • Synchronized ensures order by allowing only one thread to lock a variable at a time.
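
A common example that combines both keywords is double-checked locking, sketched below: volatile prevents the store to the reference from being reordered before the constructor's writes, and synchronized ensures only one thread performs the initialization.

```java
public class Singleton {
    // 'volatile' forbids reordering the constructor's stores with the store
    // to 'instance'; without it another thread could observe a non-null but
    // not-yet-initialized object.
    private static volatile Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {                  // first check, no lock
            synchronized (Singleton.class) {     // only one thread initializes
                if (instance == null) {          // second check, under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```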


Three, the implementation of Java threads



Threads are implemented in three ways:



Kernel threads (Kernel-Level Threads, KLT)



Kernel-level threads (KLT) are threads supported directly by the operating system kernel. The kernel performs thread switching and schedules threads via the scheduler, mapping their work onto the processors. Programs generally do not use kernel threads directly, but rather a high-level interface to them, the light-weight process (LWP). Lightweight processes are what we usually call threads. Since each lightweight process is backed by one kernel thread, kernel threads must exist before there can be lightweight processes. This 1:1 relationship between lightweight processes and kernel threads is called the one-to-one threading model. (Windows and Linux use this approach.) A lightweight process consumes a certain amount of kernel resources (such as kernel thread stack space), and its system-call cost is relatively high, so there is a limit to how many lightweight processes a system can support.



User threads



Broadly speaking, any thread that is not a kernel thread can be considered a user thread (UT). In the narrow sense, a user thread is one built entirely in a user-space thread library, whose existence the kernel cannot perceive: the creation, synchronization, destruction, and scheduling of user threads are done completely in user mode, with no kernel help. If implemented properly, such threads need no switch to kernel mode, so operations can be very fast and cheap, and a much larger number of threads can be supported; some high-performance databases use user-thread implementations. This 1:N relationship between a process and its user threads is called the one-to-many threading model.



The advantage of user threads is that they need no kernel support; the disadvantage is that there is no kernel support either: all thread operations must be handled by the user program itself, so programs that use user threads are generally more complex, and fewer and fewer programs use them nowadays.



Mixed implementation (user threads plus lightweight processes)



Here there are both user threads and lightweight processes: user threads are still built entirely in user space, while the lightweight processes supported by the operating system act as a bridge between user threads and kernel threads. In this hybrid model, the ratio of user threads to lightweight processes is variable, an M:N relationship. Many Unix systems provide an implementation of the M:N threading model.



Four, Java thread scheduling



Before JDK 1.2, Java threads were implemented with user threads called "green threads"; in JDK 1.2 the threading model was replaced with one based on the operating system's native threads. Therefore, in current JDK versions, the threading model supported by the operating system largely determines how Java virtual machine threads are mapped. There is no agreement on this across platforms, and the virtual machine specification does not mandate which threading model Java threads must use.



There are two ways to schedule threads 



Cooperative: a thread's execution time is controlled by the thread itself; after finishing its task, the thread notifies the system to switch to another thread. (Not recommended.)

Advantages: simple to implement; thread switches happen at points known to the thread itself, so there are no thread synchronization problems.

Disadvantages: thread execution time is uncontrollable; if a thread does not yield the CPU for a long time, the system may even crash.



Preemptive: the execution time of each thread is allocated by the operating system, which gives each thread a time slice; the thread that obtains a time slice runs, and when the slice is used up the thread is preempted, so switching is not decided by the thread itself. (Java uses preemptive thread scheduling.)

Advantages: thread execution time is controllable, and the system will not crash because a single thread blocks.



The scheduling relationship of thread states in Java



(figure: Java thread states and their transitions)



Five, thread safety levels in Java



Immutable:



A final field of a primitive type is immutable; a final object can also be immutable, provided its behavior never affects its state. For example, String.substring() returns a new String rather than modifying the original. String objects are immutable, as are Number types such as BigInteger and BigDecimal. But AtomicInteger and AtomicLong, although subtypes of Number, are not immutable: their state is mutable, and all operations on them are CAS operations, which guarantees atomicity.



Absolute thread safety:



Regardless of the runtime environment, the caller does not need any additional synchronization measures.



Relative thread safety:



This is thread safety in the usual sense: individual operations on the object are guaranteed to be thread-safe, but callers may still need extra synchronization for compound operations. Examples include Vector, Hashtable, and the collections wrapped by Collections.synchronizedCollection().
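
A sketch of why relative thread safety still requires caller-side synchronization for compound operations; the class and method names are illustrative:

```java
import java.util.Vector;

public class VectorCompound {
    private static final Vector<Integer> vector = new Vector<>();

    // size() and get() are each thread-safe on their own, but the
    // check-then-act sequence is not: another thread may shrink the vector
    // between the two calls, causing ArrayIndexOutOfBoundsException.
    static Integer lastUnsafe() {
        int last = vector.size() - 1;
        return vector.get(last);
    }

    // The caller must add its own lock around the compound operation.
    static Integer lastSafe() {
        synchronized (vector) {
            int last = vector.size() - 1;
            return vector.get(last);
        }
    }
}
```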



Thread compatibility:



The object itself is not thread-safe, but it can be used safely on the calling side by means of synchronization. When we say a class is "not thread-safe", this is what we usually mean most of the time. Examples include ArrayList and HashMap.



Thread opposition:



Code that can never be used safely by multiple threads concurrently, regardless of whether the calling side adopts synchronization measures.



Six, the implementation of thread safety 



Mutual-exclusion synchronization



When multiple threads access shared data, ensure that only one thread is using it at any one time.

Critical sections, mutexes, and semaphores are all means of mutual exclusion.

The most basic mutual-exclusion primitive in Java is synchronized, which compiles into the monitorenter and monitorexit bytecode instructions. Both instructions take a reference parameter specifying which object to lock or unlock, and the lock carries a counter recording how many times it has been acquired: a lock acquired several times must be released the same number of times before it returns to the unlocked state.
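
A sketch of the reentrancy implied by that lock counter (class and method names are illustrative): the same thread can pass monitorenter on a lock it already holds, and the lock is released only when the counter returns to zero.

```java
public class ReentrancyExample {
    private final Object lock = new Object();

    void outer() {
        synchronized (lock) {   // monitorenter: counter 0 -> 1
            inner();            // same thread may re-enter the same lock
        }                       // monitorexit: counter 1 -> 0, lock released
    }

    void inner() {
        synchronized (lock) {   // monitorenter: counter 1 -> 2, no blocking
            // ...
        }                       // monitorexit: counter 2 -> 1, still held
    }
}
```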



But as we saw in the section on Java threads, Java threads map onto the operating system's native threads, and blocking and waking them both require the operating system's help. The switch from user mode to kernel mode is time-consuming. The virtual machine applies some optimizations of its own, such as a spin wait before asking the operating system to block the thread, to avoid switching into kernel mode too frequently.



Advantages of ReentrantLock over synchronized:

 

 

  • Interruptible waiting: if the thread holding the lock does not release it for a long time, a waiting thread can choose to give up waiting.
  • Fair locking: new ReentrantLock(boolean fair) can create a fair lock, which is granted to threads in the order in which they requested it (synchronized is non-fair).
  • Multiple conditions: several Condition objects can be obtained via newCondition(), which makes more complex thread synchronization easy to express through await() and signal(); see the sketch below.
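
The sketch below exercises all three advantages in a classic bounded buffer, using nothing beyond java.util.concurrent.locks; the class itself and its field names are illustrative:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock(true); // fair lock
    private final Condition notFull  = lock.newCondition();     // condition 1
    private final Condition notEmpty = lock.newCondition();     // condition 2
    private final Object[] items = new Object[16];
    private int putIdx, takeIdx, count;

    public void put(T x) throws InterruptedException {
        lock.lockInterruptibly();                // interruptible wait for the lock
        try {
            while (count == items.length) notFull.await();
            items[putIdx] = x;
            putIdx = (putIdx + 1) % items.length;
            count++;
            notEmpty.signal();                   // wake a waiting taker
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0) notEmpty.await();
            T x = (T) items[takeIdx];
            items[takeIdx] = null;
            takeIdx = (takeIdx + 1) % items.length;
            count--;
            notFull.signal();                    // wake a waiting putter
            return x;
        } finally {
            lock.unlock();
        }
    }
}
```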


Nonblocking synchronization 



The main problem with mutual-exclusion synchronization is the performance cost of blocking and waking threads, so it is often called blocking synchronization (a pessimistic concurrency strategy). With the development of hardware instruction sets, we have another option: an optimistic concurrency strategy based on conflict detection. Put simply, the operation is performed first; if no other thread contends for the shared data, it succeeds, and if there is contention, some compensating measure is taken (most commonly, retrying until success). Because this optimistic strategy does not need to suspend threads, this kind of synchronization is called nonblocking synchronization.



Such instructions include:

1) Test-and-set

2) Fetch-and-add

3) Swap

4) Compare-and-swap (CAS)

5) Load-linked/store-conditional (LL/SC)



The last two are instructions added by modern processors. CAS operations became available to Java programs only in JDK 1.5, wrapped by several methods in the sun.misc.Unsafe class such as compareAndSwapInt() and compareAndSwapLong(). The virtual machine treats these methods specially: the just-in-time compiled result is a single platform-specific processor CAS instruction with no method-call overhead, so they can be considered unconditionally inlined.



Operations like i++ used to require synchronization, but now CAS can guarantee their atomicity, as in AtomicInteger. However, CAS suffers from the ABA problem, which can be solved with AtomicStampedReference.
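
A sketch of both points; the class name AbaExample, the method names, and the initial value 100 are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaExample {
    // i++ on a plain int is not atomic; AtomicInteger does it with CAS.
    private static final AtomicInteger counter = new AtomicInteger();

    static int increment() {
        return counter.incrementAndGet();   // atomic CAS-based i++
    }

    // AtomicStampedReference pairs the value with a version stamp, so a
    // change from A -> B -> A is still detected: the ABA problem is solved.
    private static final AtomicStampedReference<Integer> ref =
            new AtomicStampedReference<>(100, 0);

    static boolean updateTo(Integer newValue) {
        int[] stampHolder = new int[1];
        Integer current = ref.get(stampHolder);          // value + current stamp
        return ref.compareAndSet(current, newValue,
                stampHolder[0], stampHolder[0] + 1);     // fails if stamp changed
    }
}
```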



No synchronization 



Some code is inherently thread-safe and does not require synchronization. There are two categories:



Reentrant code: pure code that does not rely on data stored on the heap or on common system resources, uses only state passed in through its parameters, calls no non-reentrant methods, and returns predictable results.



Thread-local storage: limits the visibility of shared data to a single thread. This eliminates synchronization and guarantees that the data is never contended between threads. Thread-local storage can be implemented through the java.lang.ThreadLocal class.
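
A minimal sketch of thread-local storage with java.lang.ThreadLocal; the class and method names are illustrative:

```java
public class ThreadLocalExample {
    // Each thread gets its own StringBuilder; no synchronization is needed
    // because no instance is ever shared between threads.
    private static final ThreadLocal<StringBuilder> buffer =
            ThreadLocal.withInitial(StringBuilder::new);

    static void appendLog(String message) {
        buffer.get().append(message).append('\n'); // touches only this thread's copy
    }

    public static void main(String[] args) {
        Runnable task = () -> {
            appendLog("hello from " + Thread.currentThread().getName());
            System.out.println(buffer.get());
            buffer.remove(); // avoid leaks when threads are pooled and reused
        };
        new Thread(task).start();
        new Thread(task).start();
    }
}
```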



Locking mechanism in Java 



Pessimistic locking 



Assumes that concurrency conflicts will occur and blocks any operation that might violate data integrity. Pessimistic locking assumes a high probability that other threads will try to access or change the object you are working on, so the object is locked before you start changing it, and the lock is not released until you commit your changes.



Optimistic locking 



Assumes that concurrency conflicts will not occur and therefore boldly proceeds without locking; conflicts are detected and compensated for when the change is committed (as in CAS, typically by retrying).



Spin-locking and adaptive spin 



Suspending and resuming a thread must be done in kernel mode, which puts great pressure on the system's concurrent performance. In many applications, a lock on shared data is held only for a very short time, and suspending and resuming threads for such a short period is not worth it. Instead, a thread that requests the lock can be made to wait a while without giving up its processor time, by executing a busy loop (a spin).



The default spin count is 10, which can be changed with the -XX:PreBlockSpin parameter.



Adaptive spin means that the spin time is no longer fixed, but is determined by the previous spin time on the same lock and the state of the lock owner.



Lock elimination:



At runtime, the virtual machine's just-in-time compiler removes locks around code that formally requires synchronization but where contention on the shared data is detected to be impossible. The main criterion for lock elimination comes from escape analysis data.



Lock coarsening:



If the virtual machine detects a series of consecutive operations that repeatedly lock and unlock the same object, it extends (coarsens) the scope of lock synchronization to cover the entire operation sequence.



Lock escalation 



Java SE 1.6 introduced "biased locks" and "lightweight locks" to reduce the performance cost of acquiring and releasing locks. There are therefore four lock states in Java SE 1.6: unlocked, biased, lightweight, and heavyweight, and a lock is gradually upgraded through them as contention increases. Locks can be upgraded but not downgraded: a biased lock that has been upgraded to a lightweight lock cannot revert. The purpose of this policy is to improve the efficiency of acquiring and releasing locks.



(figure: the four lock states and the upgrade path between them)



Biased locking 



The HotSpot authors found through research that in most cases locks are not only never contended by multiple threads, but are also always acquired repeatedly by the same thread; biased locks were introduced to make lock acquisition cheaper in that case. When a thread accesses a synchronized block and acquires the lock, it stores the lock-owning thread ID in the object header and in the lock record in its stack frame. Afterwards, the thread needs no CAS operation to lock or unlock when entering and exiting the synchronized block; it simply tests whether the object header's Mark Word stores a bias pointing to the current thread. If the test succeeds, the thread has acquired the lock. If it fails, the thread then tests whether the biased-lock flag in the Mark Word is set to 1 (indicating the lock is currently biased): if not, it competes for the lock with CAS; if so, it uses CAS to try to point the object header's bias at the current thread.



Biased-lock revocation: biased locking uses a mechanism that gives up the lock only when contention appears, so the thread holding a biased lock releases it only when another thread tries to contend for it. Revoking a biased lock requires waiting for a global safepoint (a point in time at which no bytecode is executing). The thread holding the biased lock is first suspended, and the VM then checks whether that thread is still alive. If it is not active, the object header is set to the unlocked state. If the thread is still alive, the stack holding the biased lock is walked; the lock records in the stack and the object header's Mark Word are then either rebiased toward another thread, or reverted to the unlocked state, or marked to indicate that the object is unsuitable for biased locking. Finally, the suspended thread is woken up. In the figure below, thread 1 illustrates biased-lock initialization and thread 2 illustrates biased-lock revocation.



(figure: biased-lock initialization by thread 1 and revocation by thread 2)



Turning off biased locking: biased locking is enabled by default in Java 6 and Java 7, but it is activated only a few seconds after application startup; if necessary, the delay can be removed with the JVM parameter -XX:BiasedLockingStartupDelay=0. If you are sure that all the locks in your application are normally contended, you can turn biased locking off with the JVM parameter -XX:-UseBiasedLocking, and the program will enter the lightweight-lock state by default.



Lightweight lock:



Lightweight lock acquisition: before executing a synchronized block, the JVM creates space for a lock record in the current thread's stack frame and copies the Mark Word from the object header into it (officially called the Displaced Mark Word). The thread then tries to use CAS to replace the Mark Word in the object header with a pointer to the lock record. If this succeeds, the current thread acquires the lock; if it fails, other threads are competing for the lock, and the current thread tries to acquire it by spinning.



Lightweight lock release: unlocking uses an atomic CAS operation to copy the Displaced Mark Word back into the object header. If it succeeds, no contention occurred. If it fails, the lock is contended, and it inflates into a heavyweight lock. The figure below shows two threads competing for the lock at the same time, leading to lock inflation.



(figure: two threads contending for a lightweight lock at the same time, causing inflation to a heavyweight lock)



Because spinning consumes CPU, once a lock has been upgraded to a heavyweight lock it never reverts to the lightweight state, in order to avoid useless spinning (for example, while the thread holding the lock is blocked). While the lock is in this state, other threads trying to acquire it are blocked; when the holder releases the lock, they are woken up, and the awakened threads start a new round of contention for the lock.



Heavyweight lock:



A heavyweight lock, also known as the object monitor in the JVM, contains at least a queue for lock contention and a wait queue: one provides mutual exclusion, the other thread synchronization.



The biased lock and lightweight lock concepts above reference the article: www.infoq.com/cn/articles…