In Java, the volatile keyword has special memory semantics. Volatile has two main functions:

  • Ensure memory visibility of variables
  • Forbid reordering between operations on volatile variables and ordinary variables (JSR-133 enhanced these volatile memory semantics; the enhancement took effect in Java 5)

1. Memory visibility

Start with some sample code:

public class VolatileExample {
    int a = 0;
    volatile boolean flag = false;

    public void writer() {
        a = 1; // step 1
        flag = true; // step 2
    }

    public void reader() {
        if (flag) { // step 3
            System.out.println(a); // step 4
        }
    }
}

In this code, the boolean variable flag is declared with the volatile keyword.

Memory visibility means that when a thread writes to a volatile variable (as in step 2), the JMM immediately flushes the value of the shared variable from the thread’s local memory to main memory. When a thread reads a volatile variable (as in step 3), the JMM immediately invalidates the thread’s local memory and reads the value of the shared variable from main memory.

In this respect, volatile has the same memory effect as a lock: writing a volatile variable has the same memory semantics as releasing a lock, and reading a volatile variable has the same memory semantics as acquiring a lock.

Suppose that, on the timeline, thread A executes the writer method first and thread B executes the reader method later. Then the following must hold: when thread B sees flag == true at step 3, it is guaranteed to read a == 1 at step 4.

If the flag variable were not volatile, thread A’s local memory would not be flushed to main memory immediately after step 2, and thread B would not fetch the latest values from main memory; thread B’s local memory might still hold a = 0, flag = false.
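The visibility guarantee can be observed with a minimal runnable sketch (the class name VisibilityDemo is made up for this example): the reader thread spins until the volatile write becomes visible, at which point it is also guaranteed to see a = 1.

```java
public class VisibilityDemo {
    static int a = 0;
    static volatile boolean flag = false;

    public static void main(String[] args) {
        Thread reader = new Thread(() -> {
            while (!flag) { } // spin until the volatile write becomes visible
            // The volatile read of flag guarantees the earlier write a = 1 is visible
            System.out.println("a = " + a); // prints a = 1
        });
        reader.start();

        a = 1;       // step 1: ordinary write
        flag = true; // step 2: volatile write publishes a
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

If flag were not volatile, the spin loop might never terminate, because the JIT compiler would be free to hoist the read of flag out of the loop.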

2. Disable reordering

In the old Java memory model prior to JSR-133, operations on volatile variables were allowed to be reordered with operations on ordinary variables. In the example above, the operations might be reordered into the following sequence:

  1. Thread A writes the volatile variable (step 2), setting flag to true.
  2. Thread B reads the same volatile variable (step 3), and sees flag == true.
  3. Thread B reads the ordinary variable (step 4), and reads a = 0.
  4. Thread A writes the ordinary variable (step 1), setting a = 1.

As you can see, if operations on a volatile variable are reordered with operations on an ordinary variable, then even though the volatile variable guarantees memory visibility, the ordinary variable can still be read incorrectly.

Therefore, in the old memory model, a volatile write-read did not have the same memory semantics as a lock release-acquire. To provide a thread-to-thread communication mechanism lighter than locking, the JSR-133 expert group decided to enhance the memory semantics of volatile by strictly restricting how the compiler and processor may reorder volatile and ordinary variables.

How does the JVM restrict processor reordering? Through memory barriers.

What is a memory barrier? At the hardware level, there are two types of memory barriers: Load barriers and Store barriers. Memory barriers serve two purposes:

  1. Prevents reordering of instructions on both sides of the barrier
  2. Force dirty data in the write buffer/cache to be written back to main memory, or invalidate the corresponding data in the cache.

Note that cache here mainly refers to the CPU caches, such as L1 and L2.

When generating code, the compiler inserts memory barriers into the instruction sequence to prevent particular types of processor reordering. The compiler chose a conservative JMM barrier-insertion strategy, ensuring that volatile memory semantics are correct in any program on any processor platform. The strategy is:

  • Insert a StoreStore barrier before each volatile write;
  • Insert a StoreLoad barrier after each volatile write;
  • Insert a LoadLoad barrier after each volatile read;
  • Insert a LoadStore barrier after each volatile read.

The schematic diagram looks something like this:

Let’s explain each barrier in turn. Note: Load denotes a read operation, Store denotes a write operation.

  • LoadLoad barrier: for a sequence Load1; LoadLoad; Load2 — ensures that the data read by Load1 is loaded before Load2 and subsequent read operations access their data.
  • StoreStore barrier: for Store1; StoreStore; Store2 — forces Store1 to be flushed to memory before Store2 and subsequent writes execute, making Store1’s write visible to other processors.
  • LoadStore barrier: for Load1; LoadStore; Store2 — ensures that the data read by Load1 is loaded before Store2 and subsequent write operations are flushed out.
  • StoreLoad barrier: for Store1; StoreLoad; Load2 — ensures that Store1’s write is visible to all processors before Load2 and subsequent reads execute. It has the largest overhead of the four barriers (it flushes the write buffer and empties the invalidation queue). On most processor implementations, this barrier is a universal barrier that also serves as the other three.
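Applying the conservative strategy to VolatileExample, the barriers would sit roughly as shown below. The class name is made up for this sketch, and the barrier comments are illustrative: the JIT compiler emits barriers in machine code, they are never written in Java source.

```java
public class BarrierPlacement {
    int a = 0;
    volatile boolean flag = false;

    public void writer() {
        a = 1;       // ordinary write
        // StoreStore barrier: the ordinary write above must reach memory
        // before the volatile write below
        flag = true; // volatile write
        // StoreLoad barrier: the volatile write must be visible
        // before any subsequent read
    }

    public void reader() {
        boolean f = flag; // volatile read
        // LoadLoad barrier: keeps the volatile read before later ordinary reads
        // LoadStore barrier: keeps the volatile read before later ordinary writes
        int i = a;        // ordinary read
        System.out.println(f + " " + i);
    }
}
```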

For consecutive volatile reads or writes, the compiler performs certain optimizations, omitting redundant barriers to improve performance. For example, two back-to-back volatile reads may end up as:

  1. The first volatile read;
  2. LoadLoad barrier;
  3. The second volatile read;
  4. LoadStore barrier.

Reordering rules for volatile and normal variables:

  1. If the first operation is a volatile read, then no matter what the second operation is, the two cannot be reordered.

  2. If the second operation is a volatile write, then no matter what the first operation is, the two cannot be reordered.

  3. If the first operation is a volatile write and the second is a volatile read, the two cannot be reordered.

For example, in our case, step 1 writes an ordinary variable and step 2 writes a volatile variable, which matches rule 2, so the two steps cannot be reordered. Step 3 reads a volatile variable and step 4 reads an ordinary variable, which matches rule 1, so they cannot be reordered either.

However, if the first operation is a normal variable read and the second is a volatile variable read, then reordering is possible:

int a = 0;                     // declare the ordinary variable
volatile boolean flag = false; // declare the volatile variable

int i = a;        // ordinary read
boolean j = flag; // volatile read; the two reads may be reordered

3. The uses of volatile

In terms of the memory semantics of volatile, volatile guarantees memory visibility and prohibits reordering.

Volatile has the same memory semantics as a lock in terms of memory visibility, so it can serve as a “lightweight” lock. But volatile guarantees atomicity only for the read or write of a single volatile variable, whereas a lock guarantees atomicity for an entire critical section. So locks are functionally more powerful than volatile, while volatile has the performance advantage.
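A quick sketch of this difference (the class and counter names are made up for this example): volatile does not make a compound operation like count++ atomic, so increments of the volatile counter can be lost, while the synchronized increment never loses updates.

```java
public class AtomicityDemo {
    static volatile int volatileCount = 0;
    static int lockedCount = 0;

    static synchronized void incrementLocked() { lockedCount++; }

    public static void main(String[] args) {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                volatileCount++;   // read-modify-write: increments can be lost
                incrementLocked(); // the whole increment runs inside the lock
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println("lockedCount   = " + lockedCount);   // always 20000
        System.out.println("volatileCount = " + volatileCount); // often less than 20000
    }
}
```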

Volatile is also useful for disallowing reordering. For example, in the familiar singleton pattern, one implementation uses “double-checked locking”, as in this code:

public class Singleton {
    private static volatile Singleton instance;

    public static Singleton getInstance() {
        if (instance == null) { // line 7
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton(); // line 10
                }
            }
        }
        return instance;
    }
}

If the variable declaration did not use the volatile keyword, an error could occur, because the object construction may be reordered:

instance = new Singleton(); // this single statement decomposes into three steps:

1 memory = allocate();    // allocate memory (roughly C's malloc)
2 ctorInstance(memory);   // initialize the object
3 instance = memory;      // point instance to the allocated address

// Steps 2 and 3 may be reordered into 1-3-2, i.e.:
1 memory = allocate();    // allocate memory
3 instance = memory;      // point instance to the allocated address
2 ctorInstance(memory);   // initialize the object

Once such a reordering occurs, suppose thread A has performed steps 1 and 3 of line 10, but step 2 has not yet completed. At line 7, another thread B sees that instance is not null and returns an uninitialized instance!

So, with the JSR-133 enhancements, volatile’s prohibition of reordering remains very useful.