The volatile keyword is often used in concurrent programming to ensure visibility and order, but it should be used with caution. This article focuses on the nature and implementation of the volatile keyword.





I. The Java memory model

To understand why volatile ensures visibility, you need to understand what the memory model looks like in Java.

The Java memory model specifies that all variables are stored in main memory. Each thread also has its own working memory, which holds variables used by the thread (these variables are copied from main memory). All operations (reads, assignments) by a thread to a variable must be done in working memory. Different threads cannot directly access variables in each other’s working memory, and the transfer of variable values between threads needs to be completed through the main memory.





Based on this memory model, the problem of data “dirty read” in multithreaded programming arises.

To take a simple example: In Java, execute the following statement:

i = 10;

The executing thread must first assign the value to the copy of `i` in its working memory, and then write it back to main memory; it does not write the number 10 directly to main memory.

For example, suppose two threads each execute an increment such as `i = i + 1` at the same time, and the initial value of `i` is 10. We would like the value of `i` to be 12 after both threads finish. But will that be the case?

It is possible that both threads first read `i` and store it in their respective working memory. Thread 1 then adds 1 and writes the latest value of `i`, 11, back to main memory. At this point `i` is still 10 in thread 2's working memory; after adding 1 it becomes 11, and thread 2 then writes its value of `i` back to main memory.

The final value of `i` is 11, not 12. This is the classic cache-consistency problem. Variables that are accessed by multiple threads in this way are commonly referred to as shared variables.
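The lost-update scenario above can be reproduced with a short sketch (the class name and iteration counts are mine; the result varies from run to run):

```java
public class LostUpdateDemo {
    static int i = 0;  // shared variable: no synchronization, not volatile

    static int run() throws InterruptedException {
        i = 0;
        Runnable add = () -> {
            for (int k = 0; k < 100_000; k++) {
                i = i + 1;  // read-modify-write: three sub-steps, not atomic
            }
        };
        Thread t1 = new Thread(add), t2 = new Thread(add);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return i;
    }

    public static void main(String[] args) throws InterruptedException {
        // frequently prints a value below 200000: some increments were lost
        System.out.println(run());
    }
}
```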

So how do you ensure that shared variables output correctly when accessed by multiple threads?

Before we solve this problem, we need to understand the three concepts of concurrent programming: atomicity, orderliness, and visibility.

II. Atomicity

1. Definition

Atomicity: An operation or operations are either all performed without interruption by any factor, or none at all.

2. Example

A classic example is the bank account transfer problem:

For example, transferring 1000 yuan from account A to account B must involve two operations: subtract 1000 yuan from account A and add 1000 yuan to account B.

Imagine what would happen if these two operations were not atomic. Suppose that after 1000 yuan is subtracted from account A, the operation suddenly stops. Then 1000 yuan has been deducted from account A, but account B never receives the 1000 yuan transferred to it.

So these two operations must be atomic so that there are no unexpected problems.

What happens when you do the same with concurrent programming?

As a simple example, think about what would happen if assignment to a 32-bit variable were not atomic.

i = 9;

If a thread executes this statement, assume for the moment that assigning to a 32-bit variable involves two steps: writing the lower 16 bits and writing the upper 16 bits.

Then the thread could write the lower 16 bits and suddenly stop, while another thread reads the value of `i` and gets corrupted data.

3. Atomicity in Java

In Java, reads and assignments to variables of primitive data types are atomic operations, that is, they are uninterruptible and either performed or not performed.

Although the above sentence seems simple, it is not so easy to understand. Look at the following example:

Please analyze which of the following operations are atomic operations:

x = 10;        // statement 1
y = x;         // statement 2
x++;           // statement 3
x = x + 1;     // statement 4

At first glance, you might say that the operations in the above four statements are atomic operations. In fact, only statement 1 is atomic; the other three statements are not atomic.

Statement 1 assigns the value 10 directly to x, meaning that the thread executing this statement writes the value 10 directly to the working memory.

Statement 2 actually contains two operations: it reads the value of x, then writes that value to y in the working memory. Although each of the two operations is atomic on its own, their combination is not atomic.

Similarly, x++ and x = x+1 involve three operations: reading the value of x, incrementing by 1, and writing the new value.

Therefore, only the operation of statement 1 has atomicity in the above four statements.

That is, only simple reads and assignments (and only the assignment of a literal number to a variable; assignment between variables is not atomic) are atomic operations.

As you can see from the above, the Java memory model only guarantees that basic reads and assignments are atomic operations; atomicity for broader operations can be achieved through synchronized and Lock. Since synchronized and Lock guarantee that only one thread executes the protected block at any one time, there is no atomicity problem inside the block, and atomicity is thus guaranteed.
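As a sketch of how synchronized restores atomicity for a compound operation (class and method names are illustrative), every run below ends at exactly 10000:

```java
public class SyncCounter {
    private int count = 0;

    // synchronized makes read-increment-write one indivisible step:
    // only one thread can be inside this method at a time
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    static int run() throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Thread[] ts = new Thread[10];
        for (int i = 0; i < 10; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    c.increment();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return c.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());  // always 10000
    }
}
```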

For details about the use of synchronized and ReentrantLock, see: Multithreaded synchronization and ReentrantLock

III. Visibility

1. Definition

Visibility means that when multiple threads access the same variable and one thread changes the value of the variable, other threads can immediately see the changed value.

2. Example

For a simple example, look at this code:

// code executed by thread 1
int i = 0;
i = 10;

// code executed by thread 2
j = i;

If thread 1 executes the statement i = 10, it will first load the initial value of i into its working memory and then assign 10 to it.

Thread 2 then executes j = i: it goes to main memory to read the value of i and loads it into thread 2's working memory. But note that the value of i in main memory is still 0, so j gets the value 0 instead of 10.

This is the visibility problem. After thread 1 modifies the variable i, thread 2 does not immediately see the value changed by thread 1.

3. Visibility in Java

For visibility, Java provides the volatile keyword to ensure visibility.

When a shared variable is declared volatile, it is guaranteed that a modified value is immediately written back to main memory, and that when another thread needs to read the variable, it reads the new value from main memory.

Common shared variables do not guarantee visibility, because it is uncertain when a common shared variable will be written to main memory after modification. When another thread reads a common shared variable, it may still have the old value in memory, so visibility cannot be guaranteed.

Visibility is also guaranteed by synchronized and Lock, which ensure that only one thread at a time acquires the lock and executes the synchronized code, and that changes to variables are flushed to main memory before the lock is released. So visibility is guaranteed.
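A minimal runnable sketch of the volatile visibility guarantee (the class name, sleep, and join timeout are illustrative; the timeout is only a safety net for the demo):

```java
public class VisibilityDemo {
    // volatile: a write by one thread is promptly visible to reads by others
    static volatile boolean ready = false;

    static boolean run() throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the writer's update becomes visible
            }
        });
        reader.start();

        Thread.sleep(100);   // let the reader start spinning
        ready = true;        // volatile write is flushed to main memory

        reader.join(5000);   // without volatile, this join could time out
        return !reader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader saw the update: " + run());
    }
}
```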

IV. Ordering

1. Definition

Ordering: the order in which the program executes is the order in which the code is written.

2. Example

For a simple example, look at this code:

int i = 0;
boolean flag = false;

i = 1;        // statement 1
flag = true;  // statement 2

The above code defines an int variable and a boolean variable, and then assigns to both. Statement 1 precedes statement 2 in code order, but does the JVM guarantee that statement 1 actually executes before statement 2? Not necessarily. Why? Because instruction reordering may occur here.

To explain instruction reordering briefly: to improve execution efficiency, the processor may optimize the input code. It does not guarantee that every statement executes in the order written in the code, but it does guarantee that the final result of execution is the same as the result of executing the code in order.

For example, in the code above, it makes no difference which statement 1 or statement 2 executes first, so it is possible that statement 2 executes first and statement 1 follows during execution.

However, it is important to note that although the processor will reorder the instructions, it will guarantee that the final result of the program will be the same as the result of the sequential execution of the code, so how does this guarantee? Here’s another example:

int a = 10;    // statement 1
int r = 2;     // statement 2
a = a + 3;     // statement 3
r = a * a;     // statement 4

This code has four statements, so one possible order of execution is:

statement 2 → statement 1 → statement 3 → statement 4

But could the order be statement 2 → statement 1 → statement 4 → statement 3? No, because the processor considers data dependencies between instructions when reordering. If instruction 2 must use the result of instruction 1, the processor guarantees that instruction 1 will be executed before instruction 2. Here statement 4 depends on the result of statement 3, so statement 3 must come first.

Although reordering does not affect the results of program execution within a single thread, what about multithreading? Here’s an example:

// thread 1:
context = loadContext();   // statement 1
inited = true;             // statement 2

// thread 2:
while (!inited) {
    sleep();
}
doSomethingwithconfig(context);

In the above code, statements 1 and 2 have no data dependency, so they may be reordered. If a reorder occurs and thread 1 executes statement 2 first, thread 2 will think initialization is done, break out of the while loop, and execute the doSomethingwithconfig(context) method while the context has not yet been initialized, causing the program to fail.

As can be seen from the above, instruction reordering does not affect the execution of a single thread, but does affect the correctness of concurrent execution of the thread.

That is, in order for concurrent programs to execute correctly, they must be atomic, visible, and ordered. As long as one of them is not guaranteed, it may cause the program to run incorrectly.

3. Ordering in Java

In the Java memory model, the compiler and processor are allowed to reorder instructions; reordering does not affect the result of single-threaded execution, but it can affect the correctness of multithreaded concurrent execution.

In Java, you can use the volatile keyword to ensure a certain degree of "ordering." In addition, synchronized and Lock can be used to ensure ordering. Obviously, synchronized and Lock guarantee that only one thread executes the synchronized code at each moment, which is equivalent to letting threads execute the synchronized code sequentially, and that naturally ensures ordering.

In addition, the Java memory model has some innate “orderliness”, that is, orderliness that can be guaranteed without any means, which is often referred to as the happens-before principle. If the order of two operations cannot be deduced from the happens-before principle, they are not guaranteed to be ordered, and the virtual machine can reorder them at will.

Here’s a look at the happens-before principle:

① Program order rule: within a thread, operations written earlier in the code happen-before operations written later.

② Lock rule: an unlock operation happens-before a subsequent lock operation on the same lock.

③ Volatile variable rule: a write to a volatile variable happens-before subsequent reads of that variable.

④ Transitivity rule: if operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.

⑤ Thread start rule: the start() method of a Thread object happens-before every action of that thread.

⑥ Thread interrupt rule: a call to Thread.interrupt() happens-before the interrupted thread's code detects the interrupt.

⑦ Thread termination rule: all operations in a thread happen-before the termination of that thread. We can detect that a thread has terminated through Thread.join() returning or Thread.isAlive() returning false.

⑧ Object finalization rule: the completion of an object's initialization happens-before the start of its finalize() method.

The first four rules are the most important; the last four are fairly self-evident.

Let’s explain the first four rules:

For the program order rule, the execution of a piece of code appears ordered within a single thread. Note that although this rule says "operations written earlier happen-before operations written later," the virtual machine may still reorder the code; what it guarantees is that the program appears to execute in code order, because the final result is the same as the result of sequential execution and only instructions with no data dependence are reordered. So keep in mind that program execution only appears ordered within a single thread: this rule guarantees the correctness of programs executed in a single thread, but not in multiple threads.

The second rule is also easy to understand: a held lock must be released before it can be acquired again, whether in a single thread or across multiple threads.

The third rule is a more important one. Intuitively, if a thread writes a variable first and then reads it, the write must precede the read.

The fourth rule actually reflects the transitivity of the happens-before principle.

V. In-depth understanding of the volatile keyword

1. Volatile ensures visibility

Once a shared variable (an instance field or a static field of a class) is declared volatile, it has two levels of semantics:

1) It ensures visibility when different threads operate on the variable, i.e. one thread changes the value of a variable and the new value is immediately visible to other threads.

2) Forbid instruction reordering.

Let’s take a look at some code. If thread 1 executes first and thread 2 executes later:

// thread 1
boolean stop = false;
while (!stop) {
    doSomething();
}

// thread 2
stop = true;

This is a typical piece of code that many people might use to stop a thread. But will this code actually work correctly, and definitely stop the thread? Not necessarily. Most of the time it will, but there is a chance it will not (although the probability is very small, once it happens the result is an endless loop).

Here’s how this code might make it impossible to interrupt the thread. As explained earlier, each thread has its own working memory during execution, so thread 1 will make a copy of the value of the stop variable into its working memory when it runs.

So when thread 2 changes the value of the stop variable but has not yet written it back to main memory, and then moves on to do other things, thread 1 does not know about thread 2's change to the stop variable and keeps looping.

But using volatile is different:

First, using the volatile keyword forces the modified value to be written to main memory immediately.

Second: using volatile means that when thread 2 modifies stop, the cache line holding stop in thread 1's working memory (that is, in the CPU's L1 or L2 cache) is invalidated.

Third: thread 1 reads stop again from main memory because the cache line of stop is invalid in thread 1’s working memory.

So when thread 2 modifies the value of stop (this involves two steps: modifying the value in thread 2's working memory, then flushing the modified value back to main memory), it also invalidates the cache line for stop in thread 1's working memory. When thread 1 then reads stop and finds its cache line invalid, it waits for the main-memory address corresponding to that cache line to be updated, and then reads the latest value from main memory.

Thread 1 reads the latest correct value.
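Putting the pieces together, a complete runnable version of the stop-flag pattern might look like this (names and timings are illustrative):

```java
public class StopDemo {
    volatile boolean stop = false;

    void doSomething() {
        // placeholder for real work
    }

    // returns true if the worker thread observed stop and terminated
    static boolean run() throws InterruptedException {
        StopDemo demo = new StopDemo();

        // thread 1: loops until it sees stop == true
        Thread worker = new Thread(() -> {
            while (!demo.stop) {
                demo.doSomething();
            }
        });
        worker.start();

        Thread.sleep(100);
        // thread 2 (here, main): the volatile write is immediately visible
        demo.stop = true;

        worker.join(5000);
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + run());
    }
}
```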

2. Volatile does not ensure atomicity

Here’s an example:

public class Test {
    public volatile int inc = 0;

    public void increase() {
        inc++;
    }

    public static void main(String[] args) {
        final Test test = new Test();
        for (int i = 0; i < 10; i++) {
            new Thread() {
                public void run() {
                    for (int j = 0; j < 1000; j++)
                        test.increase();
                }
            }.start();
        }

        while (Thread.activeCount() > 1)  // ensure all the threads above have finished
            Thread.yield();
        System.out.println(test.inc);
    }
}

What is the output of this program? Many people might say 10000. But in fact, running it shows that the result differs from run to run, and is always a number less than 10000.

Some may reason as follows: inc is declared volatile, so every change to it is visible to all other threads; ten threads each increment inc 1000 times; therefore the final value of inc should be 1000 × 10 = 10000.

The error is in that reasoning: the volatile keyword guarantees visibility, but this program fails to guarantee atomicity. Visibility only ensures that the latest value is read each time; volatile cannot make operations on the variable atomic.

As mentioned earlier, increment is non-atomic and involves reading the original value of a variable, incrementing by one, and writing to working memory. That is, the three suboperations of the increment operation may be executed separately, which may result in the following situation:

If at some point the value of inc is 10,

Thread 1 begins to increment the variable: it reads the original value of inc, and is then blocked.

Then thread 2 increments the variable. Thread 2 also reads the original value of inc. Because thread 1 has only read inc and has not modified it, the cache line for inc in thread 2's working memory is not invalidated and main memory is not refreshed, so thread 2 goes directly to main memory and reads the value 10. It then adds 1, writes 11 to its working memory, and finally writes 11 back to main memory.

Since thread 1 has read the value of inc, notice that the value of inc in thread 1’s working memory is still 10, so thread 1 increments the value of inc to 11, then writes 11 to the working memory, and finally to main memory.

So after the two threads have each performed one increment operation, inc has only increased by 1.

At the root of this is that the increment operation is not atomic, and volatile does not make compound operations on a variable atomic.

Solution: use synchronized or Lock to ensure atomicity. AtomicInteger is also available.

Since Java 1.5, the java.util.concurrent.atomic package provides atomic operation classes, which wrap increment (add 1), decrement (subtract 1), addition (plus a number), and subtraction (minus a number) operations on primitive types, guaranteeing that these operations are atomic. The atomic classes use CAS (Compare And Swap) to perform atomic operations; CAS is in turn implemented with the processor's CMPXCHG instruction, which is itself atomic.
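As a sketch, the earlier counter rewritten with AtomicInteger (the class name is mine); the compound increment becomes one atomic operation, so the result is always 10000:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicDemo {
    // incrementAndGet is a single atomic read-modify-write, backed by CAS
    private static final AtomicInteger inc = new AtomicInteger(0);

    static int run() throws InterruptedException {
        inc.set(0);
        Thread[] ts = new Thread[10];
        for (int i = 0; i < 10; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    inc.incrementAndGet();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return inc.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());  // always 10000
    }
}
```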

3. Volatile ensures ordering

The volatile keyword mentioned earlier prevents instruction reordering, so volatile ensures some degree of order.

The volatile keyword disallows instruction reordering in two ways:

1) When the program reads or writes a volatile variable, all operations before it must already have taken place, and their results must be visible to the operations that come after it; the operations after it have definitely not yet taken place.

2) During instruction optimization, statements before an access to a volatile variable must not be moved after it, and statements after it must not be moved before it.

Here’s a simple example:

// x and y are non-volatile variables
// flag is a volatile variable

x = 2;        // statement 1
y = 0;        // statement 2
flag = true;  // statement 3
x = 4;        // statement 4
y = -1;       // statement 5

Since flag is volatile, instruction reordering does not place statement 3 before statement 1 or 2, nor does it place statement 3 after statement 4 or 5. Note, however, that the order of statements 1 and 2 or 4 and 5 is not guaranteed.

And the volatile keyword guarantees that statements 1 and 2 must have completed by the time statement 3 is executed, and that the results of statements 1 and 2 are visible to statements 3, 4, and 5.

So let’s go back to the example we gave earlier:

// thread 1:
context = loadContext();   // statement 1
inited = true;             // statement 2

// thread 2:
while (!inited) {
    sleep();
}
doSomethingwithconfig(context);

In the previous example, statement 2 might execute before statement 1, so the context would not yet be initialized, and thread 2 would use an uninitialized context, causing the program to fail.

This problem does not occur if the inited variable is modified with the volatile keyword, because the context must be initialized by the time statement 2 is executed.
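A runnable sketch of the fixed version, with loadContext and the context type reduced to placeholders:

```java
public class InitDemo {
    static Object context = null;
    // volatile: the write to inited cannot be reordered before the write
    // to context, so once a reader sees inited == true, context is ready
    static volatile boolean inited = false;

    static Object loadContext() {
        return new Object();  // stand-in for the real initialization
    }

    static boolean run() throws InterruptedException {
        final boolean[] sawContext = new boolean[1];
        // thread 2: waits for initialization, then uses the context
        Thread user = new Thread(() -> {
            while (!inited) {
                Thread.yield();
            }
            sawContext[0] = (context != null);  // guaranteed non-null here
        });
        user.start();

        // thread 1 (here, main): initialize, then publish via the flag
        context = loadContext();   // statement 1
        inited = true;             // statement 2
        user.join(5000);
        return sawContext[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("context ready when flag seen: " + run());
    }
}
```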

VI. Implementation principle of volatile

1. Visibility

To improve processing speed, the processor does not communicate with memory directly. Instead, it first caches data from system memory internally and operates on the cache, and it is uncertain when the data will be written back to memory.

When a volatile variable is written, the JVM sends the processor an instruction with a Lock prefix, which writes the cache line containing the variable back to system memory. This step ensures that when a thread modifies a variable declared volatile, main memory is updated immediately.

But at this point, other processors' caches may still hold the old value. So, in a multiprocessor environment, to keep every processor's cache consistent, each processor sniffs the data propagated on the bus to check whether its own cache has gone stale. When a processor finds that the memory address corresponding to one of its cache lines has been modified, it sets that cache line to the invalid state; the next time it operates on that data, it is forced to re-read it from system memory into its cache. This step ensures that when other threads read a variable declared volatile, they get the latest value from main memory.

2. Ordering

The Lock-prefixed instruction acts as a memory barrier (also called a memory fence). The barrier ensures that instructions before it are not reordered to positions after it, and vice versa; by the time execution reaches the memory barrier, everything before it has already completed.

VII. Application scenarios for volatile

Synchronized prevents multiple threads from executing a piece of code at the same time, which can hurt execution efficiency; volatile performs better than synchronized in some cases. However, note that volatile is no substitute for synchronized, because the volatile keyword does not guarantee atomicity. In general, two conditions must both hold for volatile to be sufficient:

1) Write operations to variables do not depend on the current value

2) This variable is not included in invariants with other variables

Here are a few scenarios where volatile is used in Java.

① Status flag

volatile boolean flag = false;

// thread 1
while (!flag) {
    doSomething();
}

// thread 2
public void setFlag() {
    flag = true;
}

Terminates the thread based on the status marker.

② Double-checked locking in the singleton pattern

class Singleton {
    private volatile static Singleton instance = null;

    private Singleton() {
    }

    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null)
                    instance = new Singleton();
            }
        }
        return instance;
    }
}
Why use volatile to modify instance?

instance = new Singleton() is not an atomic operation; it actually does three things in the JVM:

1. Allocate memory for the instance

2. Call the Singleton constructor to initialize the member variables

3. Point the instance reference to the allocated memory (after this step, instance is non-null)

But the JVM's just-in-time compiler may reorder instructions. That is, the order of steps 2 and 3 above is not guaranteed: the actual execution order may be 1-2-3 or 1-3-2. If it is the latter, and thread 2 preempts after step 3 but before step 2 has run, then instance is already non-null (but not yet initialized), so thread 2 returns instance directly and uses it, which naturally leads to an error. Declaring instance volatile forbids this reordering.
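As a quick concurrent check of the double-checked pattern (the wrapper class and thread count are illustrative), every thread below ends up seeing the same instance:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DclCheck {
    static class Singleton {
        private static volatile Singleton instance = null;

        private Singleton() {
        }

        static Singleton getInstance() {
            if (instance == null) {
                synchronized (Singleton.class) {
                    if (instance == null) {
                        instance = new Singleton();
                    }
                }
            }
            return instance;
        }
    }

    // start many threads and collect every instance they observe
    static int run() throws InterruptedException {
        Set<Singleton> seen = Collections.newSetFromMap(new ConcurrentHashMap<>());
        Thread[] ts = new Thread[20];
        for (int i = 0; i < 20; i++) {
            ts[i] = new Thread(() -> seen.add(Singleton.getInstance()));
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());  // always 1: a single shared instance
    }
}
```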
