In Java, the volatile keyword was introduced to address data inconsistencies in concurrent programming.

Overview

When there is no synchronization between threads, the compiler, runtime, and processor are free to apply various optimizations. Caching and reordering are two such optimizations that become observable in concurrent code, and Java and the JVM provide several ways to control memory ordering, including the volatile keyword.

What happens without volatile?

Consider the following example:

public class TaskRunner {

    private static int number;
    private static boolean ready;

    private static class Reader extends Thread {

        @Override
        public void run() {
            while (!ready) {
                Thread.yield();
            }

            System.out.println(number);
        }
    }

    public static void main(String[] args) {
        new Reader().start();
        number = 42;
        ready = true;
    }
}

The TaskRunner class maintains two simple variables. In its main method, it creates another thread that spins on the ready variable as long as it is false. When the variable becomes true, the thread prints the number variable.

We expect the program to print 42 after a short delay. In practice, however, the delay could be much longer; the program might even hang forever, or print 0.

These anomalies are caused by a lack of proper memory visibility and by reordering, which, for the purposes of this article, we address by marking variables with the volatile keyword.

Memory visibility

The previous article on the Java memory model briefly mentioned memory visibility. Put simply, multiple threads run on multiple CPUs, and each CPU core has its own cache. There is therefore no guarantee about when data written by one core reaches main memory, which means threads on different CPUs may read inconsistent values for the same variable.

Applied to the program above, the main thread and the Reader thread each keep copies of ready and number in their core-local caches; the main thread then updates its cached values. On most modern processors, write requests are not applied immediately after they are issued. Instead, the processor tends to queue these writes in a special write buffer and flush them to main memory some time later.

So when the main thread updates the number and ready variables, there is no guarantee of what the Reader thread will see. In other words, the Reader thread might see the updated value immediately, with some delay, or not at all.

Reordering

As mentioned above, besides spinning in an infinite loop, the program also has a small probability of printing 0, and reordering is the cause. When the CPU executes the instructions, it may apply the write to the ready variable before the write to number, so the Reader thread can observe ready as true while number is still 0.

Reordering is an optimization technique used to improve performance that may be applied to different components:

  • The processor can flush its write buffer in an order different from program order
  • The processor may apply out-of-order execution techniques
  • The JIT compiler can optimize by reordering
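To make this concrete, here is a hypothetical probe (class and field names are my own) in the spirit of the classic reordering litmus test. With plain, non-volatile fields, the two writes in each thread may be reordered, so it is possible for both threads to read 0, an outcome that no sequential interleaving can produce:

```java
// Sketch: a reordering probe. Each thread writes one field, then reads
// the other. Without volatile, r1 == 0 && r2 == 0 is a legal outcome.
public class ReorderProbe {

    static int x, y;    // plain fields: no ordering guarantees
    static int r1, r2;  // values observed by each thread

    static void runOnce() throws InterruptedException {
        x = 0; y = 0;
        Thread a = new Thread(() -> { x = 1; r1 = y; });
        Thread b = new Thread(() -> { y = 1; r2 = x; });
        a.start(); b.start();
        a.join(); b.join();  // join() gives main a consistent view of r1, r2
    }

    public static void main(String[] args) throws InterruptedException {
        int bothZero = 0;
        for (int i = 0; i < 10_000; i++) {
            runOnce();
            if (r1 == 0 && r2 == 0) bothZero++;  // only possible via reordering
        }
        // The count is timing- and hardware-dependent; it may well be 0 on a
        // given run, but any non-zero value demonstrates reordering.
        System.out.println("r1 == 0 && r2 == 0 observed " + bothZero + " times");
    }
}
```

Marking x and y volatile would forbid the (0, 0) outcome, because the volatile write/read pairs impose an ordering between the two threads.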

The volatile keyword

So what does the volatile keyword do?

At the assembly level, the volatile keyword causes writes to the variable to be emitted with a Lock-prefixed instruction (on x86), and visibility between threads is ensured through the MESI cache coherence protocol. A change made by any thread to the variable is propagated to all threads that subsequently read it. Simply put, one thread's write is guaranteed to be seen by all readers.

Looking first at the Lock instruction at the assembly level: early CPUs implemented it by locking the bus, with the arbiter granting one CPU exclusive access so that no other CPU could communicate with memory, which made the operation atomic. This approach is inefficient, however, so modern CPUs generally use cache locking instead, and data consistency in that scheme is achieved through the MESI cache coherence protocol.

The cache coherence protocol is not detailed here, but the idea is that each CPU constantly sniffs the data exchanged on the bus; when one cache reads from or writes to memory on behalf of its CPU, the other CPUs are notified so they can synchronize their own caches.

The Java memory model defines a set of atomic operations that are critical to controlling concurrency:

  • Read: Reads data from main memory
  • Load: Writes data read from main memory to working memory, that is, the cache
  • Use: Reads a value from working memory for use in computation
  • Assign: Writes a computed value back to working memory
  • Store: Transfers a value from working memory to main memory
  • Write: Sets the variable in main memory to the value transferred by store
  • Lock: Locks the main memory variable, marking it as thread-exclusive
  • Unlock: Unlocks a main memory variable that can be locked by other threads

For a volatile variable, the store and write operations must be contiguous: they are grouped into an atomic pair, so every assignment is immediately synchronized to main memory, and every read is refreshed from main memory. This is what gives volatile its visibility guarantee.
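Note that while store and write are paired atomically for each individual volatile write, a compound action such as count++ still decomposes into separate read/load/use/assign/store/write steps, so it is not atomic. A small hypothetical sketch (class name my own) illustrates the distinction:

```java
// Sketch: volatile provides visibility, not atomicity. count++ is a
// read-modify-write, so concurrent increments can still be lost.
public class VolatileCounter {

    static volatile int count = 0;  // every write is visible, but ++ is not atomic

    static void increment(int times) {
        for (int i = 0; i < times; i++) {
            count++;  // use + assign + store/write: separable steps
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> increment(100_000));
        Thread t2 = new Thread(() -> increment(100_000));
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Usually prints less than 200000: increments from the two threads
        // interleave between the read and the write, and updates are lost.
        System.out.println(count);
    }
}
```

For atomic increments one would reach for java.util.concurrent.atomic.AtomicInteger, whose incrementAndGet() performs the read-modify-write as a single hardware operation.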

The volatile keyword also uses memory barriers to prevent instruction reordering. The memory visibility impact of volatile variables extends beyond the volatile variables themselves.

More specifically, suppose thread A writes a volatile variable, and thread B then reads the same volatile variable. In that case, all values that were visible to A before the volatile write become visible to B after the volatile read.

Technically, any write to a volatile field happens-before every subsequent read of the same field. This is the volatile variable rule of the Java memory model.

Because of this memory-ordering guarantee, we can sometimes piggyback other variables on the visibility of a volatile variable. In our example, we simply mark the ready variable as volatile:

public class TaskRunner {

    private static int number; // not volatile
    private volatile static boolean ready;

    // same as before
}

Everything that was visible to the main thread before it wrote true to ready is visible to the Reader thread after it reads ready. The number variable therefore carries the memory visibility enforced by the ready variable: even though it is not volatile itself, it behaves as if it were.

By leveraging these semantics, we can mark only a few of the variables in a class as volatile and still get the visibility we need.
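Putting it all together, here is a complete, runnable version of the fixed example. It is restructured slightly (with an added observed field of my own) so the result can be checked rather than only printed:

```java
// The fixed TaskRunner: only ready is volatile, yet the reader is
// guaranteed to see number == 42 once it observes ready == true.
public class VolatileTaskRunner {

    private static int number;              // plain field: piggybacks on ready
    private static volatile boolean ready;  // the only volatile field
    static volatile int observed = -1;      // what the reader thread saw

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {       // spin until the volatile flag flips
                Thread.yield();
            }
            observed = number;     // happens-before guarantees this sees 42
        });
        reader.start();

        number = 42;               // plain write, published by the next line
        ready = true;              // volatile write: publishes everything above

        reader.join();
        System.out.println(observed);  // prints 42, never 0
    }
}
```

The volatile write to ready on the main thread happens-before the volatile read that ends the reader's spin loop, so the earlier plain write to number is guaranteed to be visible.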