Art is long, life is short
Reordering
The Java compiler and processor may reorder instructions at compile time and at runtime. What is the effect of this reordering in the single-threaded versus the multi-threaded case?
Data dependency
If two operations access the same variable, and one of them is a write operation, there is a data dependency between the two operations. There are three types of data dependencies:
| Name | Code example | Description |
| ---- | ------------ | ----------- |
| Write then read | a = 1; b = a; | After writing a variable, read that variable. |
| Write then write | a = 1; a = 2; | After writing a variable, write that variable again. |
| Read then write | a = b; b = 1; | After reading a variable, write that variable. |
In the above three cases, the result of the program’s execution will be changed by reordering the execution order of the two operations.
The compiler and processor may reorder operations. The compiler and processor adhere to data dependencies when reordering, and do not change the order in which two operations with data dependencies are executed.
Note that data dependencies are only for sequences of instructions executed in a single processor and operations performed in a single thread. Data dependencies between different processors and between different threads are not considered by compilers and processors.
The as-if-serial semantics
The as-if-serial semantics mean that the execution result of a (single-threaded) program cannot be changed, no matter how much reordering is done (by the compiler and processor to improve parallelism). The compiler, runtime, and processor must all comply with the as-if-serial semantics.
To comply with the as-if-serial semantics, the compiler and processor do not reorder operations that have data dependencies because such reordering changes the execution result. However, if there are no data dependencies between the operations, they may be reordered by the compiler and processor. To illustrate, look at the following code example for calculating the area of a circle:
```java
double pi = 3.14;         // A
double r = 1.0;           // B
double area = pi * r * r; // C
```
The data dependencies of the above three operations are shown below:
As shown in the figure above, there is a data dependency between A and C, and between B and C. Therefore, C cannot be reordered before A or B in the final instruction sequence. But there is no data dependency between A and B, so the compiler and processor may reorder the execution order of A and B. Here are the two possible execution sequences of the program:
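The two legal orders can be sketched in code. This is a minimal illustration of my own (the method names `areaInProgramOrder` and `areaReordered` are not from the original text): whichever of A and B runs first, C runs last and produces the same result, which is exactly what the as-if-serial semantics require.

```java
public class AsIfSerialDemo {
    // Program order: A, then B, then C
    static double areaInProgramOrder() {
        double pi = 3.14;         // A
        double r = 1.0;           // B
        return pi * r * r;        // C: depends on both A and B, so it must run last
    }

    // A legal reordering: B executes before A; C still runs last
    static double areaReordered() {
        double r = 1.0;           // B
        double pi = 3.14;         // A
        return pi * r * r;        // C
    }

    public static void main(String[] args) {
        // Both orders yield the same result, as as-if-serial requires
        System.out.println(areaInProgramOrder() == areaReordered());
    }
}
```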
The as-if-serial semantics protect single-threaded programs. Compilers, runtimes, and processors that adhere to the as-if-serial semantics together create the illusion, for programmers writing single-threaded code, that the program executes in program order. The as-if-serial semantics free single-threaded programmers from having to worry about reordering or memory visibility issues.
Program order rule
According to the happens-before program order rule, the sample code above that calculates the area of a circle has three happens-before relationships:
- A happens-before B;
- B happens-before C;
- A happens-before C;
The third happens-before relationship here is derived from the transitivity of happens-before.
Here, A happens-before B, yet B may actually execute before A (see the reordered order above). This is not a contradiction: if A happens-before B, the JMM does not require that A execute before B. The JMM only requires that the result of the earlier operation be visible to the later operation, and that the earlier operation be ordered before the later one in the happens-before order. In this example, the result of operation A does not need to be visible to operation B, and the result of executing A and B reordered is identical to executing them in happens-before order. In such cases, the JMM considers the reordering legal and allows it.
In computers, software technology and hardware technology have a common goal: to develop as much parallelism as possible without changing the results of program execution. Compilers and processors follow this goal, and as you can see from the definition of happens-before, the JMM follows this goal as well.
The impact of reordering on multithreading
Now let’s see whether reordering changes the execution results of multithreaded programs. Take a look at the following sample code:
```java
int a = 0;
boolean flag = false;

public void writer() {
    a = 1;          // 1
    flag = true;    // 2
}

public void reader() {
    if (flag) {            // 3
        int i = a * a;     // 4
        // ……
    }
}
```
The flag variable is a flag that indicates whether variable a has been written. Here we assume that there are two threads A and B, with thread A first executing writer() and thread B then executing reader(). When thread B performs operation 4, can it see thread A’s write to the shared variable a in operation 1?
The answer is: not necessarily.
Since operations 1 and 2 have no data dependency, the compiler and processor can reorder them; likewise, operations 3 and 4 have no data dependency, so the compiler and processor can reorder them as well. Let’s first look at what might happen when operations 1 and 2 are reordered. Take a look at the sequence diagram below:
As shown in the figure above, operations 1 and 2 have been reordered. When the program executes, thread A first writes the flag variable, which thread B then reads. Since the condition evaluates to true, thread B reads variable a. At this point, variable a has not yet been written by thread A at all: the semantics of the multithreaded program have been broken by reordering!
※ Note: In this article, red dashed arrows represent incorrect read operations, and green dashed arrows represent correct read operations.
Now let’s see what happens when operations 3 and 4 are reordered (with this reordering, we can incidentally illustrate control dependencies). The following is a sequence diagram of the program after reordering operations 3 and 4:
In this program, operations 3 and 4 have a control dependency. When code contains a control dependency, it limits the parallelism of instruction execution. To overcome the effect of control dependence on parallelism, compilers and processors employ speculative execution. In the case of the processor’s speculative execution, the processor executing thread B can read a and compute a * a ahead of time, temporarily saving the result in a hardware cache called the reorder buffer (ROB). When the condition of operation 3 is later judged to be true, the result is written to variable i.
As we can see from the figure, speculative execution essentially reorders operations 3 and 4, and this reordering breaks the semantics of the multithreaded program here!
In a single-threaded program, reordering operations with control-dependencies does not change the execution result (which is why the as-if-serial semantics allow reordering operations with control-dependencies). But in multithreaded programs, reordering operations that have control dependencies may change the program’s execution results.
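One of the synchronization primitives discussed later (volatile) rules out exactly this failure: a volatile write cannot be reordered with the ordinary writes before it, and a volatile read that sees the write also sees everything written before it. A hedged sketch of my own (the class name and the choice of volatile here are not part of the original example):

```java
public class VolatileFlagExample {
    int a = 0;
    volatile boolean flag = false;  // volatile: forbids reordering 1 with 2,
                                    // and makes the write of a visible to the reader

    public void writer() {
        a = 1;        // 1: ordered before the volatile write
        flag = true;  // 2: volatile write
    }

    public int reader() {
        if (flag) {           // 3: volatile read
            return a * a;     // 4: guaranteed to see a == 1 if flag is true
        }
        return -1;            // flag not yet set
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileFlagExample ex = new VolatileFlagExample();
        Thread writer = new Thread(ex::writer);
        writer.start();
        writer.join();                   // writer has certainly finished here
        System.out.println(ex.reader()); // prints 1
    }
}
```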
How can this problem be solved? This brings us to the sequentially consistent memory model and the Java memory model.
The sequentially consistent memory model and the JMM
When it comes to sequential consistency, we have to talk about data races.
Data races and sequential consistency guarantees
What is a data race? The Java memory model specification defines it as:
- a write to a variable in one thread,
- a read of the same variable in another thread,
- where the write and the read are not ordered by synchronization.
When code contains data races, the results of program execution are often surprising. If a multithreaded program is correctly synchronized, it will be free of data races.
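The definition above can be made concrete with a small sketch of my own (the class and field names are illustrative): two threads increment one counter with no synchronization (a data race: unsynchronized write and read of the same variable) and another counter under a lock, which orders every access and removes the race.

```java
public class RaceDemo {
    static int unsafeCount = 0;     // accessed without synchronization: a data race
    static int safeCount = 0;
    static final Object lock = new Object();

    static void run(int perThread) {
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                unsafeCount++;          // racy read-modify-write: increments can be lost
                synchronized (lock) {
                    safeCount++;        // the lock orders all accesses: no data race
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        run(100_000);
        // safeCount is always 200000; unsafeCount may be anything up to 200000
        System.out.println("safe=" + safeCount + " unsafe=" + unsafeCount);
    }
}
```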
The JMM guarantees memory consistency for properly synchronized multithreaded programs as follows:
- If the program is correctly synchronized, the execution of the program will be sequentially consistent; that is, the execution result will be the same as if the program had been executed in the sequentially consistent memory model (as we’ll see in a moment, this is a strong guarantee for programmers). Synchronization here refers to synchronization in a broad sense, including the proper use of common synchronization primitives (lock, volatile, and final).
The sequentially consistent memory model
The sequentially consistent memory model is an idealized theoretical reference model devised by computer scientists; it provides programmers with a strong guarantee of memory visibility. The sequentially consistent memory model has two characteristics:
- All operations in a thread must be executed in program order.
- All threads (whether or not the program is synchronized) see a single total order of operations. In the sequentially consistent memory model, every operation must execute atomically and be immediately visible to all threads.
For better understanding, let’s use two diagrams to further illustrate the characteristics of the sequential consistency model.
Suppose two threads A and B execute concurrently. Thread A has three operations, in program order A1->A2->A3; thread B also has three operations, in program order B1->B2->B3.
Suppose the two threads use the monitor to synchronize properly: thread A releases the monitor after three operations, and thread B acquires the same monitor. Then the execution effect of the program in the sequential consistency model will be as follows:
Now let’s assume that the two threads are not synchronized. Here is a schematic of the execution of the unsynchronized program in the sequential consistency model:
In the sequential consistency model of unsynchronized programs, although the overall execution order is unordered, all threads can only see a consistent overall execution order. For example, thread A and thread B see the order of execution: B1->A1->A2->B2->A3->B3. This guarantee is made possible because each operation in the sequential consistent memory model must be immediately visible to any thread.
However, there is no such guarantee in the JMM. In the JMM, not only is the overall execution order of an unsynchronized program out of order, but the execution orders seen by different threads may also be inconsistent. For example, while the current thread has data cached in local memory and has not yet flushed it to main memory, the write is visible only to the current thread; from the perspective of other threads, it appears that the write has not been performed at all. Only after the current thread flushes the data from local memory to main memory does the write become visible to other threads. In this case, the current thread and other threads will see different execution orders.
The sequential consistency effect of a synchronous program
Take a look at the following sample code:
```java
int a = 0;
boolean flag = false;

public synchronized void writer() {
    a = 1;
    flag = true;
}

public synchronized void reader() {
    if (flag) {
        int i = a;
        // ……
    }
}
```
In the example code above, we assume that thread A executes writer() and then thread B executes reader(). This is a correctly synchronized multithreaded program. According to the JMM specification, the execution result of this program will be the same as its execution result in the sequential consistency model. The following is a comparison of the execution sequences of the program in the two memory models:
In the sequential consistency model, all operations are executed sequentially, exactly in program order. In the JMM, code within a critical section can be reordered (but the JMM does not allow code within a critical section to “escape” outside of it, which would break the semantics of the monitor). The JMM does some special processing at the points where a monitor is exited and entered so that, at those points, the thread has the same memory view as in the sequential consistency model (more on that later). Although thread A reorders within the critical section, thread B cannot “observe” thread A’s reordering there, thanks to the mutual exclusion of the monitor. This kind of reordering improves execution efficiency without changing the execution result of the program.
Here we can see the basic approach of the JMM implementation: leave the door open for compiler and processor optimization as much as possible without changing the results of (properly synchronized) program execution.