Android engineers learn the JVM (IX) - Java memory model and the volatile keyword

Preface

In the Learning JVM series, we have covered the JVM specification, the Class file format and how to read bytecode, ASM bytecode processing, the class life cycle and custom class loaders, memory allocation, the bytecode execution engine, garbage collection, and the garbage collectors in HotSpot. This article introduces the Java memory model and the volatile keyword.

If you are interested in the JVM specification, bytecode, the Class file format, ASM bytecode processing, class loading and custom class loaders, memory allocation, the bytecode execution engine, garbage collection, or garbage collectors, see the previous articles:

Android engineers learn the JVM (8) - Garbage collectors in HotSpot

Android engineers learn the JVM (7) - Garbage collection

Android engineers learn the JVM (vi) - Bytecode execution engine

Android engineers learn the JVM (5) - Memory allocation

Android engineers learn the JVM (4) - Class loading, linking, initialization, and unloading

Android engineers learn the JVM (iii) - The bytecode framework ASM

Android engineers learn the JVM (II) - How to read Java bytecode

Android engineers learn the JVM (I) - Overview of the JVM

1. Java memory model and memory interaction

1.1. Java Memory Model

The Java Memory Model is defined by the Java Virtual Machine specification. It masks the differences in memory access across hardware and operating systems, so that a Java program sees consistent memory-access behavior on every platform. This avoids the situation found in C/C++, where a program might run correctly on Windows but incorrectly on Linux.

The main goal of the Java memory model is to define the access rules for variables in a program, that is, the low-level details of how the virtual machine stores variables to main memory and reads them back. The schematic diagram is as follows:

All shared variables are stored in main memory, and each thread has its own working memory. Working memory holds copies of the variables used by that thread.

Note that "variables" here are not exactly the same as variables in Java source code. They are instance fields, static fields, and the elements of array objects, but not local variables or method parameters, which are thread-private.
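The distinction above can be sketched in a few lines. This is only an illustration: the class SharedVsPrivate and all of its member names are made up for this example. The fields show the kinds of variables the memory model covers, while the parameter and local variable in sumTo belong to the calling thread alone, so two threads running it at once cannot disturb each other.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative, not from the article.
class SharedVsPrivate {
    static int staticField;             // shared: static field
    int instanceField;                  // shared: instance field of a reachable object
    final int[] elements = new int[2];  // shared: elements of an array object

    // 'limit' (a parameter) and 'sum' (a local variable) live in the calling
    // thread's own stack frame, so no other thread can observe or change them.
    static int sumTo(int limit) {
        int sum = 0;
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs sumTo on two threads at the same time and returns both results.
    static int[] runOnTwoThreads() {
        int[] results = new int[2];
        Thread t0 = new Thread(() -> results[0] = sumTo(100));
        Thread t1 = new Thread(() -> results[1] = sumTo(100));
        t0.start();
        t1.start();
        try {
            t0.join(); // after join(), the finished thread's writes are visible here
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runOnTwoThreads())); // [5050, 5050]
    }
}
```

Each thread gets its own sum, so both results are always 5050 regardless of scheduling.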

You can roughly think of main memory as the heap in the Java virtual machine's memory areas. Local variables and method parameters live in the virtual machine stack, are thread-private, and can be thought of as part of working memory. If a variable in the heap is used by multiple threads, each thread works on its own copy in working memory, so the value in the heap and the copies held by different threads can become inconsistent.

On the VM, a thread must perform its operations on a variable in its own working memory rather than directly on main memory. Threads cannot access each other's working memory. If a variable's value needs to be passed between threads, it must go through main memory.

Note: main memory and working memory are not the same level of memory partitioning as the Java heap, virtual machine stack, and method area. The two divisions are basically unrelated; the analogy above is only an aid to understanding.

1.2. Interaction between memory

Interaction between a thread's working memory and main memory is carried out through eight operations defined by the Java virtual machine, each of which must be atomic.

This interaction covers how a variable is transferred from main memory into working memory and how a modified variable is synchronized from working memory back to main memory. Take a look at the diagram below:

Lock: acts on a main memory variable. A variable can be locked by only one thread at a time; this operation marks the variable as exclusively owned by that thread.

Unlock: acts on a main memory variable. It releases a locked variable so that it can be locked by other threads.

Read: acts on a main memory variable. It transfers the value of the variable from main memory to the thread's working memory for the subsequent load operation.

Load: acts on a working memory variable. It puts the value obtained from main memory by the read operation into the working memory copy of the variable.

Use: acts on a working memory variable. It passes the value of the variable in working memory to the execution engine. This operation is performed whenever the virtual machine reaches a bytecode instruction that needs the value of the variable.

Assign: acts on a working memory variable. It assigns a value received from the execution engine to the variable in working memory. This operation is performed whenever the virtual machine reaches a bytecode instruction that assigns a value to the variable.

Store: acts on a working memory variable. It transfers the value of the variable from working memory to main memory for the subsequent write operation.

Write: acts on a main memory variable. It puts the value obtained from working memory by the store operation into the main memory variable.
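To make the data flow concrete, here is a toy model of six of the eight operations (lock and unlock are left out). It is only a sketch for tracing the steps by hand: the class ToyMemoryModel, its maps, and the inTransit slot are inventions of this example, and a real JVM does not implement working memory this way.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: real JVMs do not implement working memory as a map.
class ToyMemoryModel {
    static final Map<String, Integer> mainMemory = new HashMap<>();

    // One thread's private view: copies of variables plus a transfer slot.
    static class WorkingMemory {
        final Map<String, Integer> copies = new HashMap<>();
        Integer inTransit; // value carried by read (inbound) or store (outbound)

        // main memory -> transfer slot
        void read(String name) { inTransit = mainMemory.get(name); }
        // transfer slot -> working copy
        void load(String name) { copies.put(name, inTransit); }
        // working copy -> execution engine
        int use(String name) { return copies.get(name); }
        // execution engine -> working copy
        void assign(String name, int value) { copies.put(name, value); }
        // working copy -> transfer slot
        void store(String name) { inTransit = copies.get(name); }
        // transfer slot -> main memory
        void write(String name) { mainMemory.put(name, inTransit); }
    }

    // Traces "x = x + 1" for a single thread and returns the value left in main memory.
    static int incrementOnce(int initial) {
        mainMemory.put("x", initial);
        WorkingMemory wm = new WorkingMemory();
        wm.read("x");
        wm.load("x");
        int value = wm.use("x");
        wm.assign("x", value + 1);
        wm.store("x");
        wm.write("x");
        return mainMemory.get("x");
    }

    public static void main(String[] args) {
        System.out.println(incrementOnce(41)); // 42
    }
}
```

The point of the sketch is the order read, load, use, assign, store, write: main memory changes only at the final write step.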

1.3 Rules for memory interaction

To transfer a variable from main memory to working memory, read and load operations are performed sequentially.

To write a variable back from working memory to main memory, store and write operations are performed sequentially.

For ordinary variables, the virtual machine only requires that these operations be executed in order, not that they be executed consecutively. Other operations may be inserted between them. For example, when reading variables a and b from main memory, the order does not have to be read a; load a; read b; load b; it can also be read a; read b; load b; load a. (There are additional rules for volatile variables, as detailed below.)

The virtual machine also requires that the following rules be observed when performing these eight operations:

Neither read and load nor store and write may occur alone. That is, a value may not be read from main memory without being accepted by working memory, and a value may not be written back from working memory without being accepted by main memory.

A thread is not allowed to discard its most recent assign operation. That is, after a variable has been changed in working memory, the change must be synchronized back to main memory.

A thread is not allowed to write a variable back to main memory without cause. That is, if no assign has occurred on the variable in the thread's working memory, its value may not be written back to main memory.

New variables can only be created in main memory. A thread may not use a variable in working memory that has not been initialized by a load or assign. That is, a load or assign must have been performed on a variable before use or store is applied to it.

Only one thread may lock a variable at a time, but the same thread may lock it repeatedly. After locking a variable several times, the thread must perform the same number of unlock operations before the variable is released.

Performing a lock operation on a variable clears its value from working memory. Before the execution engine can use the variable, its value must be initialized again by a load or assign operation.

A variable that has not been locked cannot be unlocked, and a thread cannot unlock a variable that is locked by another thread.

Before unlocking a variable, the thread must first synchronize it back to main memory by performing store and write operations.
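The lock and unlock operations are not called from Java code directly. A minimal sketch, assuming that a synchronized block is the source-level construct that triggers them, shows the rule about repeated locking: the same thread can enter the monitor twice, and the monitor is released only after the matching number of exits. The class ReentrantMonitorDemo and its member names are hypothetical; Thread.holdsLock is the standard library method used to observe the state.

```java
// Hypothetical demo; class and field names are illustrative.
class ReentrantMonitorDemo {
    private static final Object MONITOR = new Object();

    // Reports whether the current thread holds MONITOR at three points:
    // inside the nested block, after leaving only the inner block,
    // and after leaving the outer block as well.
    static boolean[] probe() {
        boolean[] held = new boolean[3];
        synchronized (MONITOR) {         // first lock
            synchronized (MONITOR) {     // same thread locks again: allowed
                held[0] = Thread.holdsLock(MONITOR);
            }                            // one unlock: still locked once
            held[1] = Thread.holdsLock(MONITOR);
        }                                // second unlock: now released
        held[2] = Thread.holdsLock(MONITOR);
        return held;
    }

    public static void main(String[] args) {
        boolean[] held = probe();
        System.out.println(held[0] + " " + held[1] + " " + held[2]); // true true false
    }
}
```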

2. Volatile

The volatile keyword is the lightest-weight synchronization mechanism that the virtual machine provides. A volatile variable is visible to all threads, meaning that a write to a volatile variable is immediately reflected in other threads.

2.1. Role of the volatile keyword

The Java memory model specifically defines some special access rules for volatile:

Given that T represents a thread and V and W represent volatile variables, the read, load, use, assign, store, and write operations must comply with the following rules:

1. Thread T can perform use on variable V only if its previous action on V was load, and can perform load on V only if its next action on V is use. The use action is therefore tied to the read and load actions on V, and the three must occur consecutively. In other words, every time thread T uses V, it must first refresh the value from main memory, which ensures that T sees the latest value of V written by other threads.

2. Thread T can perform store on variable V only if its previous action on V was assign, and can perform assign on V only if its next action on V is store. The assign action is therefore tied to the store and write actions on V, and the three must occur consecutively. In other words, every modification of V in T's working memory must be synchronized back to main memory immediately, which ensures that other threads see T's modification of V.

3. Let A be a use or assign action by thread T on variable V, F the load or store action associated with A, and P the read or write action on V that corresponds to F. Similarly, let B be a use or assign action by thread T on variable W, G the load or store action associated with B, and Q the read or write action on W that corresponds to G. If A precedes B, then P precedes Q. That is, accesses to volatile variables are not reordered within a thread, so they take effect in program order.

The first two rules mean that volatile variables guarantee visibility to all threads. The third rule means that volatile variables prohibit instruction-reordering optimizations.

2.2. Visibility of volatile variables and thread-safety

Visibility means that when one thread changes the value of a variable, the new value is immediately available to other threads. As the first two rules above state, a volatile variable is synchronized to main memory immediately each time it is modified and is re-read from main memory each time it is used.
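A common use of this visibility guarantee is a stop flag. The following is a minimal sketch; the class StopFlagDemo, the field stopRequested, and the 50 ms and 5 s timings are assumptions of this example. One thread spins on a volatile boolean and another thread sets it, and because the field is volatile the spinning thread is guaranteed to see the write and exit. Without volatile, the loop might never observe the change.

```java
// Hypothetical demo; names and timings are illustrative.
class StopFlagDemo {
    // volatile: every check in the worker loop sees the latest written value
    private static volatile boolean stopRequested;

    // Returns true if the worker noticed the flag and exited within the timeout.
    static boolean runAndStop() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait; each iteration re-reads the volatile field
            }
        });
        worker.setDaemon(true);   // never keeps the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(50);     // let the worker spin for a moment
            stopRequested = true; // volatile write from the calling thread
            worker.join(5_000);   // wait up to 5 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runAndStop()); // expected: true
    }
}
```

A single writer setting a flag is the kind of case volatile handles well, because the write does not depend on the variable's previous value.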

Since changes to a volatile variable are immediately visible to all threads, does that mean operations on volatile variables are thread-safe under concurrency?

No. The volatile rules guarantee that read, load, and use occur in order and consecutively, and likewise assign, store, and write. Each individual action is atomic, but a compound operation on the variable is not. If a modification takes several steps, multiple threads can each read the latest value from main memory at the same moment, and after their steps complete, one thread's write can overwrite another's in main memory.

Here’s an example:

public class VolatileDemo {
    public static volatile int num = 0;

    public static void increase() {
        num++;
    }

    private static final int THREADS_COUNT = 20;

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[THREADS_COUNT];
        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < 10000; j++) {
                        increase();
                    }
                }
            });
            threads[i].start();
        }
        // Wait for all worker threads to finish
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println(num);
    }
}

The code starts 20 threads, and each thread increments the volatile variable num 10,000 times. If operations on volatile variables were thread-safe, the result would be 200,000. In practice the program typically prints a smaller number, because num++ is not atomic: it is carried out in several steps.

Suppose threads A and B both read the value 0 from main memory at the same time. So far there is no problem. But suppose thread A is halfway through the ++ operation when thread B runs its increment and immediately synchronizes the result to main memory, which now holds 1. Thread A then finishes its own increment and also writes 1 to main memory, overwriting the result of thread B.
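The interleaving described above can be replayed by hand on a single thread, which makes the outcome deterministic. This is only a sketch: the class LostUpdateReplay and the local variables a and b, which stand in for the copies held by threads A and B, are assumptions of this example. It splits num++ into its read, add, and write-back steps.

```java
// Hypothetical replay of the interleaving; no real threads are involved.
class LostUpdateReplay {
    static int replay() {
        int num = 0;   // stands in for the volatile field in main memory

        int a = num;   // thread A reads 0
        int b = num;   // thread B reads 0 before A has written back
        b = b + 1;     // B increments its copy
        num = b;       // B writes 1 to main memory
        a = a + 1;     // A increments its stale copy
        num = a;       // A writes 1, overwriting B's update

        return num;    // 1, although two increments ran
    }

    public static void main(String[] args) {
        System.out.println(replay()); // 1
    }
}
```

Two increments ran, but the final value is 1 because the second write was based on a value read before the first write happened.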

Thus, volatile guarantees only visibility, not atomicity, so on its own it does not make compound operations such as num++ thread-safe.
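One way to make the counter correct is to replace the volatile int with java.util.concurrent.atomic.AtomicInteger, whose incrementAndGet performs the read, add, and write as a single atomic step. The article itself does not prescribe this fix; the sketch below, with the hypothetical class AtomicCounterDemo, is one possible approach (a synchronized increase() method would be another).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the counter example using an atomic counter.
class AtomicCounterDemo {
    private static final int THREADS_COUNT = 20;
    private static final int INCREMENTS_PER_THREAD = 10_000;

    // Runs all increments and returns the final counter value.
    static int run() {
        AtomicInteger num = new AtomicInteger(0);
        Thread[] threads = new Thread[THREADS_COUNT];
        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < INCREMENTS_PER_THREAD; j++) {
                    num.incrementAndGet(); // atomic read-modify-write
                }
            });
            threads[i].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return num.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 200000
    }
}
```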

2.3. Instruction reordering

Instruction reordering: for optimization purposes, the JVM may rearrange instructions so that an instruction that can execute immediately runs first, rather than waiting for the data that the next instruction in program order needs.

However, instruction reordering only preserves serial semantics within a thread, not semantics across threads.

Within a single thread, the virtual machine guarantees that every value is correct at the point where it is needed, even if the instructions run in a different order. The code therefore appears to execute in the order it was written, although it may not.

In a multithreaded program, however, problems can occur. Here is an example:

Map configOptions;
char[] configText;
// This variable must be declared volatile
volatile boolean initialized = false;

// Suppose the following code is executed in thread A.
// It simulates reading configuration information; once reading is done,
// initialized is set to true to signal that the configuration is ready.
configOptions = new HashMap();
configText = readConfigFile(fileName);
processConfigOptions(configText, configOptions);
initialized = true;

// Suppose the following code is executed in thread B.
// It waits until initialized is true, then uses the configuration
// information prepared by thread A.
while (!initialized) {
    sleep();
}
// Operate on the configuration that was read
doSomethingWithConfig();

Suppose initialized were declared as an ordinary variable, without volatile. Then initialized = true in thread A could be executed before the configuration file has been read: reordering these instructions does not change the result as seen by thread A, so the JVM is allowed to do it. If initialized is set to true early, thread B may call doSomethingWithConfig() before thread A has finished reading the configuration, and an error will occur.

In this case, the variable must be declared volatile to prohibit the instruction reordering.
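A runnable version of the same pattern might look like the sketch below. The class ConfigPublishDemo, the "mode" key, and its "demo" value are placeholders invented for this example, and filling a map stands in for reading a configuration file. Thread A fills the map and then sets the volatile flag; thread B waits for the flag and then reads the map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo; the key, value, and class name are placeholders.
class ConfigPublishDemo {
    private static Map<String, String> configOptions; // ordinary field
    private static volatile boolean initialized;      // publication flag

    // Returns the configuration value that thread B observed.
    static String run() {
        configOptions = null;
        initialized = false;
        String[] seenByB = new String[1];

        Thread threadB = new Thread(() -> {
            while (!initialized) {
                Thread.yield(); // wait until thread A signals completion
            }
            seenByB[0] = configOptions.get("mode");
        });
        Thread threadA = new Thread(() -> {
            Map<String, String> options = new HashMap<>();
            options.put("mode", "demo"); // stands in for reading a config file
            configOptions = options;
            initialized = true;          // volatile write: must not move above the lines before it
        });
        threadB.start();
        threadA.start();
        try {
            threadA.join();
            threadB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByB[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // demo
    }
}
```

Because the writes to configOptions come before the volatile write in program order, a thread that reads initialized as true also sees the fully populated map.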

3. Summary

1. Introduced the Java memory model: main memory, working memory, and the rules for memory interaction

2. Introduced the principle behind volatile, in terms of the special rules it adds to the memory model operations

3. Introduced the visibility guarantee of volatile and how it prohibits instruction reordering