In a shared-memory multiprocessor architecture, each processor has its own cache, which is periodically reconciled with main memory.

What is a memory model

Suppose a thread assigns the variable aVariable: aVariable=3;

The memory model must answer the question: “Under what conditions will a thread reading aVariable see the value 3?” This may sound like a silly question, but without synchronization there are many factors that can prevent a thread from seeing the result of another thread’s operation immediately, or indeed ever.

The order of instructions generated by the compiler can differ from the order in the source code, and the compiler may keep variables in registers instead of memory. The processor can execute instructions out of order or in parallel. The cache may change the order in which written variables are committed to main memory. Furthermore, values stored in a processor’s local cache are not visible to other processors. These factors can make it impossible for one thread to see the latest value of a variable, and can cause memory operations in other threads to appear to execute out of order when proper synchronization is not used.
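Given these hazards, one sufficient condition for a reader of aVariable to see 3 is to guard both the write and the read with the same lock. A minimal sketch in the spirit of Java Concurrency in Practice (the class name SynchronizedInteger is illustrative):

```java
// A thread-safe holder: synchronizing both get and set on the same
// intrinsic lock guarantees a reader sees the most recent write.
class SynchronizedInteger {
    private int value;

    public synchronized int get() {
        return value;
    }

    public synchronized void set(int value) {
        this.value = value;
    }
}
```

If one thread calls set(3), any thread that subsequently acquires the lock through get() is guaranteed to observe 3; without the synchronization (or volatile), no such guarantee exists.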

The memory model of the platform

Modern computer systems add one or more layers of cache, whose read/write speed is as close as possible to the processor’s, as a buffer between memory and the processor: data needed for an operation is copied into the cache so the operation can run quickly, and when the operation ends the result is synchronized from the cache back to main memory. This way the processor does not have to wait for slow memory reads and writes, but it introduces a new problem: cache coherence.

Different processor architectures provide different levels of cache coherence, allowing different processors to see different values at the same storage location at any one time. It would be very expensive to ensure that each processor knew what every other processor was doing at every moment. Most of the time this information is unnecessary, so processors relax their storage-consistency guarantees in exchange for improved performance.

The memory model defined by the architecture tells the application what guarantees it can obtain from the memory system, and it also defines special instructions (called memory barriers or fences) that provide additional storage-coordination guarantees when data needs to be shared.

Physical machines with different architectures can have different memory models, and the Java virtual machine has its own memory model; the JVM masks the differences between the JMM and the underlying platform’s memory model by inserting memory fences in the appropriate places.

The interaction between processor, cache, and main memory

Reordering

In order to maximize the use of the processor’s internal execution units, the processor may execute input code out of order. After the computation, the processor reorganizes the out-of-order results to make them consistent with sequential execution, but it does not guarantee that each statement is computed in the same order as in the input code. Therefore, if one computation depends on the intermediate result of another, that ordering cannot be guaranteed by code order alone.

There are three types of reordering.

  1. Compiler optimized reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
  2. Instruction-level parallel reordering. Modern processors use Instruction-Level Parallelism (ILP) to execute multiple instructions in an overlapped fashion. If there is no data dependency, the processor can change the order in which the machine instructions corresponding to statements execute.
  3. Memory system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to be performed out of order.

Similar to the out-of-order execution optimizations in processors, the just-in-time compiler of the Java virtual machine performs instruction reordering optimizations of its own. From Java source code to the instruction sequence that actually executes, three kinds of reordering occur:

1 above is compiler reordering; 2 and 3 are processor reordering. Any of these can cause memory-visibility problems in multithreaded programs. For the compiler, the JMM’s compiler reordering rules disallow certain types of compiler reordering (not all of them). For processor reordering, the JMM requires the Java compiler, when generating the instruction sequence, to insert Memory Barriers (Intel calls them memory fences) of specific types to forbid particular kinds of processor reordering.

Reordering types permitted by common processors:

Java memory model

The main purpose of the Java memory model is to define the access rules for various variables in the program, namely the low-level details of storing variable values into and out of memory in the virtual machine. For better performance, the Java memory model does not limit the execution engine to use specific registers or caches of the processor to interact with main memory, nor does it restrict the just-in-time compiler to perform optimizations such as adjusting the order in which code is executed.

Main memory vs. working memory

The Java memory model specifies that all variables are stored in Main Memory (which shares its name with the main memory discussed in the physical-hardware introduction, but is physically only a part of the virtual machine’s memory). Each thread also has its own Working Memory (analogous to the processor cache discussed earlier), which holds copies of the main-memory variables used by that thread. All of a thread’s operations on variables (reads, assignments, and so on) must be performed in working memory; a thread cannot read or write main-memory variables directly. Different threads cannot access variables in each other’s working memory; variable values are transferred between threads through main memory.

Interaction between thread, main memory, and working memory

Intermemory operation

The Java memory model defines the following eight operations as the concrete protocol for interaction between main memory and working memory, that is, how a variable is copied from main memory into working memory and synchronized from working memory back to main memory.

Java virtual machine implementations must ensure that each of the operations described below is atomic and indivisible:

  • Lock: acts on a main-memory variable; marks the variable as exclusively owned by one thread.
  • Unlock: acts on a main-memory variable; releases a locked variable so that it can be locked by another thread.
  • Read: acts on a main-memory variable; transfers the variable’s value from main memory to the thread’s working memory for the subsequent load action.
  • Load: acts on a working-memory variable; puts the value obtained from main memory by the read operation into the variable’s copy in working memory.
  • Use: acts on a working-memory variable; passes the variable’s value in working memory to the execution engine. This operation is performed whenever the virtual machine reaches a bytecode instruction that needs the variable’s value.
  • Assign: acts on a working-memory variable; assigns a value received from the execution engine to the working-memory variable. This operation is performed whenever the virtual machine reaches a bytecode instruction that assigns to the variable.
  • Store: acts on a working-memory variable; transfers the variable’s value in working memory to main memory for the subsequent write operation.
  • Write: acts on a main-memory variable; puts the value obtained from the store operation into the variable in main memory.

If you want to copy a variable from main memory to working memory, read and load are performed sequentially. If you want to synchronize variables from working memory back to main memory, store and write are performed sequentially. Note that the Java memory model only requires that these two operations be performed sequentially, but not consecutively. This means that other instructions can be inserted between read and load, and between store and write. For example, when accessing variables A and B in main memory, one possible order is read A, read B, load B, load A.

In addition, the Java memory model specifies that the following rules must be met when performing the eight basic operations described above:

  • Read and load, and store and write, must occur in pairs. It is not allowed for a variable to be read from main memory but not accepted by working memory, or for working memory to initiate a write-back that main memory does not accept.
  • A thread is not allowed to discard its most recent assign operation; after a variable has changed in working memory, the change must be synchronized back to main memory.
  • A thread is not allowed to synchronize data from its working memory back to main memory without a preceding assign operation.
  • A new variable can only be created in main memory; working memory may not use an uninitialized variable. In other words, a load or assign operation must be performed on a variable before any use or store operation is performed on it.
  • A variable may be locked by only one thread at a time, but the same thread may lock it multiple times. After locking a variable multiple times, the thread must perform the same number of unlock operations before the variable is unlocked.
  • Performing a lock operation on a variable clears its value from working memory. Before the execution engine can use the variable, a load or assign operation must be performed again to initialize its value.
  • It is not allowed to unlock a variable that has not previously been locked by a lock operation, nor to unlock a variable that is locked by another thread.
  • Before performing an unlock operation on a variable, the variable must be synchronized back to main memory (store, write).

This definition is very strict, but it is also extremely cumbersome and troublesome to apply in practice. The Java design team later recognized this and simplified the operations of the Java memory model to four: read, write, lock, and unlock. This was only an equivalent simplification of the language, however; the basic design of the Java memory model did not change. Probably nobody outside the virtual machine development team reasons about concurrency this way; we just need to understand the definitions.

Special rules for volatile variables

The Java memory model defines special access rules for volatile. When a variable is declared volatile, it has two properties:

The first is to guarantee the variable’s visibility to all threads, where “visibility” means that when one thread changes the variable’s value, the new value is immediately visible to other threads. This is not the case with ordinary variables, whose values are passed between threads through main memory. For example, if thread A modifies an ordinary variable and writes it back to main memory, thread B sees the new value only if its read from main memory occurs after thread A’s write-back has completed.

Volatile variables do not suffer consistency problems across each thread’s working memory. (From the perspective of physical storage, the copies of a volatile variable in different working memories may still be inconsistent, but because the copy must be refreshed before every use, the execution engine never observes an inconsistent value, so we may treat the variable as consistent.) However, compound operations in Java are not atomic, so operations such as increment on volatile variables are still unsafe under concurrency.

    public static volatile int race = 0;

    public static void increase() {
        race++;
    }

    private static final int THREADS_COUNT = 5;

    public static void main(String[] args) {
        Thread[] threads = new Thread[THREADS_COUNT];
        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < 1000; i++) {
                        increase();
                    }
                }
            });
            threads[i].start();
        }
        // Wait until only the main thread remains
        while (Thread.activeCount() > 1) {
            Thread.yield();
        }
        System.out.println(race);
    }

The output should be 5000, but the actual result differs between runs and is almost always a number less than 5000.

The problem is in the increment operation race++, whose bytecode can be inspected with the IDEA plugin jclasslib Bytecode Viewer.
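The bytecode of increase() looks roughly like this (a sketch of typical javap output; the constant-pool index #13 is illustrative):

```
public static void increase();
  Code:
    0: getstatic     #13  // Field race:I
    3: iconst_1
    4: iadd
    5: putstatic     #13  // Field race:I
    8: return
```

getstatic pushes race onto the operand stack, iconst_1 pushes the constant 1, iadd adds them, and putstatic writes the sum back; a thread switch between any of these steps can lose an update.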

The increase() method, only one line of Java, consists of four bytecode instructions in the Class file. At the bytecode level, the cause of the concurrency failure is easy to analyze: volatile guarantees that race is up to date when getstatic pushes it onto the operand stack, but by the time iconst_1 and iadd execute, other threads may already have changed race, making the value at the top of the stack stale. The putstatic instruction may therefore write a smaller race value back to main memory.

Volatile variables should be used if and only if all of the following conditions are met:

  • Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value.

  • The variable does not participate in invariants involving other state variables.

  • Locking is not required for any other reason while the variable is being accessed.

    /**
     * doWork(), executed in any number of threads, is guaranteed
     * to stop promptly after shutdown() is called.
     */
    volatile boolean shutdownRequested;

    public void shutdown() {
        shutdownRequested = true;
    }

    public void doWork() {
        while (!shutdownRequested) {
            // business logic
        }
    }

The second is to prohibit instruction reordering optimization. Ordinary variables only guarantee that correct results are obtained at the points where the method’s execution depends on assignment results; they do not guarantee that variable assignments occur in the same order as in the program code. Within a single thread this cannot be perceived, which is what the Java memory model calls “within-thread as-if-serial semantics”.

    static int x = 0, y = 0;
    static int a = 0, b = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread one = new Thread(new Runnable() {
            @Override
            public void run() {
                a = 1;
                x = b;
            }
        });
        Thread other = new Thread(new Runnable() {
            @Override
            public void run() {
                b = 1;
                y = a;
            }
        });
        one.start();
        other.start();
        one.join();
        other.join();
        System.out.println("x = " + x + " y = " + y);
    }

As the above program shows, it is difficult to infer the behavior of even the simplest concurrent program without proper synchronization. It is easy to imagine outputs (1,0), (0,1), or (1,1): one thread can finish before the other starts, or their operations can interleave. But, oddly enough, the program can also print (0,0). Because there is no data-flow dependency between the operations within each thread, those operations can be performed out of order. (Even if they execute in program order, the times at which caches are flushed to main memory can vary, so the assignments in one thread may appear reversed from the other thread’s perspective.)

Declaring the variables volatile prevents this from happening. Instruction reordering is one of the most confusing aspects of concurrent programming for developers.

How the volatile keyword forbids instruction reordering optimization:

At the hardware level, instruction reordering means the processor may issue multiple instructions to their corresponding execution units in an order different from program order. This does not mean instructions can be rearranged arbitrarily: the processor must handle instruction dependencies correctly to ensure the program produces correct results. For example, suppose instruction 1 adds 10 to the value at address A, instruction 2 multiplies the value at address A by 2, and instruction 3 subtracts 3 from the value at address B. Instructions 1 and 2 are dependent, so their order cannot be swapped, since (A+10)*2 is clearly not equal to A*2+10; but instruction 3 can be moved before, between, or after them, as long as any later operation that depends on A and B sees their correct values. Within a single processor, therefore, reordered code still appears to execute in order. When assigning to a volatile variable, the JIT compiler on x86 emits an instruction such as `lock addl $0x0,(%esp)` (adding zero to a stack location, a no-op apart from its lock prefix), which acts as a Memory Barrier (or Memory Fence): when this instruction synchronizes changes to memory, all previous operations have been completed, creating the effect that instruction reordering cannot cross the barrier. Memory barriers are unnecessary when memory is accessed by only one processor; but if two or more processors access the same block of memory, and one of them observes the other, a memory barrier is required to ensure consistency.

Volatile reads cost about the same as reads of ordinary variables, but writes can be slower because memory-barrier instructions must be inserted into the native code to keep the processor from reordering them. Even so, the overall cost of volatile is lower than that of locking in most scenarios.

Atomicity, visibility and order

The Java memory model is built around how atomicity, visibility, and orderliness are handled during concurrency.

1. Atomicity

Atomic variable operations guaranteed directly by the Java memory model include read, load, assign, use, store, and write. We can therefore roughly assume that access, reads, and writes of basic data types are atomic (the model does permit non-atomic treatment of 64-bit long and double values, though in practice virtually all commercial virtual machines treat them atomically).

The Java memory model also provides the lock and unlock operations for broader atomicity needs. Although the virtual machine does not expose lock and unlock directly to the user, it provides the higher-level bytecode instructions monitorenter and monitorexit to use them implicitly. These two instructions surface in Java code as synchronized blocks (the synchronized keyword), so operations inside synchronized blocks are atomic with respect to one another.
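Applied to the earlier counter, wrapping the increment in synchronized makes the read-modify-write sequence indivisible. A minimal sketch (the class name SafeCounter and the run() helper are illustrative):

```java
class SafeCounter {
    static int race = 0;

    // synchronized makes getstatic/iadd/putstatic one indivisible unit:
    // the method body is bracketed by monitorenter/monitorexit semantics
    static synchronized void increase() {
        race++;
    }

    static void run(int threads, int perThread) throws InterruptedException {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    increase();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            t.join();  // wait for all workers instead of polling activeCount()
        }
    }
}
```

With 5 threads of 1000 increments each, the result is now always 5000. The lock-free java.util.concurrent.atomic.AtomicInteger achieves the same effect with lower overhead.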

2. Visibility

Visibility means that when one thread modifies the value of a shared variable, other threads can immediately see the change. The Java memory model achieves visibility, for both ordinary and volatile variables, by using main memory as the transfer medium: new values are synchronized back to main memory after modification, and values are refreshed from main memory before a read. The difference is that volatile’s special rules guarantee the new value is synchronized to main memory immediately, and that the working copy is refreshed from main memory immediately before each use. Thus we can say that volatile guarantees the visibility of variables across threads.

In addition to volatile, Java has two other keywords that provide visibility: synchronized and final. The visibility of synchronized blocks comes from the rule that a variable must be synchronized back to main memory (store, write) before unlock is performed. The visibility of final means that once a final field has been initialized in the constructor, and the constructor has not let the “this” reference escape (this-escape is dangerous because other threads might access the half-initialized object through it), the value of the final field is visible to other threads.

3. Orderliness

The natural orderliness of Java programs can be summed up in one sentence: observed from within a thread, all operations are ordered; observed from one thread looking at another, all operations are out of order. The first half refers to within-thread as-if-serial semantics; the second half refers to instruction reordering and the delay in synchronizing working memory with main memory.

The Java language provides the volatile and synchronized keywords to guarantee ordering between threads. Volatile carries the semantics of forbidding instruction reordering. Synchronized derives its ordering from the rule that a variable may be locked by only one thread at a time, which means two synchronized blocks holding the same lock can only be entered serially.

Happens-before rules

If all ordering in the Java memory model were achieved by volatile and synchronized alone, many operations would become very verbose. The JMM therefore defines the happens-before principle, which covers all program actions.

A happens-before relationship between A and B is required to guarantee that the thread performing operation B sees the result of operation A (whether or not A and B execute in the same thread). If two operations lack a happens-before relationship, the JVM may reorder them arbitrarily. When a variable is read by multiple threads and written by at least one, data races can occur if there is no happens-before ordering between the reads and writes.

The happens-before rules:

  • Program Order Rule: Within a thread, an action that comes earlier in control-flow order happens-before an action that comes later. Note that this is control-flow order, not source-code order, because branches, loops, and so on must be considered.
  • Monitor Lock Rule: An unlock operation happens-before a subsequent lock operation on the same lock. “The same lock” must be emphasized, and “subsequent” refers to time order.
  • Volatile Variable Rule: A write to a volatile variable happens-before subsequent reads of that variable.
  • Thread Start Rule: A call to Thread.start() happens-before every action of the started thread.
  • Thread Termination Rule: Every action in a thread happens-before another thread detects that it has terminated, whether by returning successfully from Thread.join() or by Thread.isAlive() returning false.
  • Interruption Rule: A call to Thread.interrupt() happens-before the interrupted thread’s code detects the interruption (for example via Thread::interrupted or isInterrupted()).
  • Finalizer Rule: The completion of an object’s initialization (the end of constructor execution) happens-before the start of its finalize() method.
  • Transitivity: If operation A happens-before operation B, and B happens-before C, then A happens-before C.
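The volatile variable rule combined with program order and transitivity is what makes the common “publish via a flag” idiom safe. A minimal sketch (the names Publication and readInOtherThread are illustrative):

```java
class Publication {
    static int data;                    // plain field
    static volatile boolean ready;      // volatile publication flag

    static int readInOtherThread() throws InterruptedException {
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) { }          // spin until the volatile write is visible
            seen[0] = data;             // guaranteed to observe 42
        });
        reader.start();

        data = 42;                      // (1) plain write
        ready = true;                   // (2) volatile write: (1) happens-before (2)

        reader.join();
        return seen[0];
    }
}
```

The write data = 42 happens-before the volatile write ready = true (program order), which happens-before the read of ready that observes true (volatile variable rule), which happens-before seen[0] = data (program order); by transitivity the reader must see 42.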

The Java memory model describes the situations in which memory operations on one thread are visible to other threads. This includes ensuring that the operations are sorted according to a happens-before partial ordering relationship that is defined at levels such as memory operations and synchronous operations.

reference

Understanding the Java Virtual Machine: Advanced JVM Features and Best Practices (3rd Edition); Java Concurrency in Practice