The volatile keyword is one that many of you have probably heard of and probably used. Before Java 5 it was controversial, because its semantics were weak enough that using it often led to unexpected results. With the revised memory model introduced in Java 5, volatile got a new lease on life.

About Storage Media

As Java developers we all know about the Java Memory Model (JMM), which introduces the concepts of working memory and main memory to improve execution performance. Before we continue, four kinds of storage media need to be clarified: registers, the CPU cache, RAM, and ROM. RAM and ROM are the ones we are most familiar with; roughly speaking, they correspond to what we usually call memory and disk. Registers are part of the processor, and the CPU cache is a high-speed cache that processor designers added to improve performance; it also sits inside the processor.

Why we need them

Every CPU operation involves reading operands. If the CPU read them directly from ROM, the access latency would be unbearable, so RAM was introduced as memory. This sped things up considerably, but because CPUs kept getting faster while RAM improved slowly due to technical and cost constraints, a gap emerged that was hard to reconcile: the CPU can process data orders of magnitude faster than it can fetch it from RAM. We are all familiar with the bucket principle, where how much a bucket holds is determined by its shortest stave. Because this gap inevitably limits processor efficiency, the CPU cache was introduced. Several levels of cache are built directly into the CPU; although they are not as fast as registers, they are much faster than RAM and can almost keep up with the CPU's computing speed.

In a word: to resolve the mismatch between CPU speed and read speed, several layers of storage were introduced. Read speed is ordered as follows: register > cache > RAM > ROM. Here is an intuition that is easy to grasp, even if it is not strictly accurate: the register is closest to the CPU, so it is the fastest to read; the cache is next, RAM is third, and ROM is the farthest away and the slowest. Of course, distance alone does not fully explain the differences; hardware design and the way each medium works also matter, but distance is an easy way to picture it.

How the media work together

The four storage media work together. When a typical program runs, the program data on ROM (disk) is read into RAM, and the data needed during computation is loaded into the cache or registers. If all the data and instructions required for an operation are already in registers and caches, the operation runs smoothly: there is no performance bottleneck, because the speed of computation matches the speed of reading.

The order in which the CPU reads data is: try the registers first; if the value is not there, read the cache; if it is still not found, read RAM; and read ROM last. A typical CPU has three levels of cache and searches them level by level until it finds the desired operand. On a good CPU the three-level cache can achieve a hit ratio of more than 95%.

Java memory model

With this background it is easy to go further. If we compare the Java memory model with the multi-level storage hierarchy, we can see that Java introduced working memory for the same performance reasons. Roughly speaking, main memory in the JMM maps to RAM, while each thread's working memory maps to the cache or registers: the data a thread needs is copied ahead of time into its working memory (although there is no guarantee that every working-memory copy of a variable actually lives in a cache or register; it may also stay in RAM, depending on how the JVM is implemented). This improves the speed of reading data while a thread executes and keeps performance acceptable when many threads run concurrently. Of course, registers and caches have limited capacity because of their cost, which is also a challenge for JVM implementations.

It is important to note that the JMM is an abstract memory model, so working memory and main memory are abstract concepts that do not necessarily correspond to the actual CPU cache and physical memory.

Data Synchronization Problems

As usual, when we introduce a mechanism that solves one problem, it brings another problem with it. The new problem here is data synchronization: is the variable value used in the current operation always the most recent value? If it is not, the operation works on a stale value (a dirty read), which can produce very different results. One might recall that the volatile keyword in Java guarantees visibility, so every thread reads the latest variable value from main memory, but is that enough to ensure data synchronization?
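The visibility guarantee itself is easy to demonstrate. The sketch below (class and field names are my own, for illustration only) uses a volatile flag that one thread writes and another thread polls; because the flag is volatile, the writer's update is guaranteed to become visible to the reader. Without volatile, the reader is allowed to keep using the copy in its working memory and may spin forever.

    import java.util.concurrent.TimeUnit;

    public class VisibilityDemo {
        // volatile guarantees that writes to this flag are published to main
        // memory and that every read fetches the latest value from main memory.
        private static volatile boolean running = true;

        public static void main(String[] args) throws InterruptedException {
            Thread worker = new Thread(() -> {
                while (running) {
                    // busy-wait; with volatile this loop is guaranteed to
                    // observe the update below and terminate
                }
                System.out.println("worker observed running = false");
            });
            worker.start();

            TimeUnit.SECONDS.sleep(1);
            running = false;   // without volatile, this write might never be seen by the worker
        }
    }

So visibility works. But visibility alone is not the same thing as synchronization, as the next example shows.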

Let's look at a typical example; a sketch of it follows below. Thirty threads each increment a shared counter 10,000 times, so after all the thread tasks finish we expect 30 * 10000. In practice the result is usually less than 30 * 10000. That may feel strange at first, but it becomes clear once you think about it: count++ is not an atomic operation, it is a combination of several instructions. In the Java memory model it breaks down into a read, an increment, and a write back, and that sequence as a whole is not atomic. If another thread reads or writes count in main memory while the sequence is in flight, an increment is lost, which is effectively a dirty read.
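The original pseudocode was not preserved, so the following is a minimal reconstruction of the example described above (class and variable names are my own): 30 threads each perform count++ 10,000 times on a volatile counter, yet the final value routinely comes out below 300,000.

    import java.util.concurrent.CountDownLatch;

    public class VolatileCounterDemo {
        // volatile gives visibility, but it does not make count++ atomic
        private static volatile int count = 0;

        public static void main(String[] args) throws InterruptedException {
            int threads = 30;
            CountDownLatch done = new CountDownLatch(threads);

            for (int i = 0; i < threads; i++) {
                new Thread(() -> {
                    for (int j = 0; j < 10_000; j++) {
                        count++;          // read, add 1, write back: non-atomic steps
                    }
                    done.countDown();
                }).start();
            }

            done.await();
            // Expected 300000, but increments lost to interleaving make it smaller
            System.out.println("count = " + count);
        }
    }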

Volatile does not lock

The root cause of the problem is that volatile involves no locking, so it is not hard to fix: make the read-modify-write sequence atomic, that is, ensure that no other thread can read or write the count variable until the current thread has finished its update. To achieve this, just protect count with a mutex. If the lock is acquired before the update, other threads cannot access count; the lock is released as soon as the update completes, and only from that point on can other threads get at the variable.
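Here is a minimal sketch of that fix, assuming the same counter as above: a synchronized block plays the role of the mutex around the increment (an AtomicInteger would also work, but the text above describes locking, so that is what is shown).

    public class SynchronizedCounterDemo {
        private static final Object lock = new Object();
        private static int count = 0;   // the lock now provides both visibility and atomicity

        static void increment() {
            synchronized (lock) {       // mutual exclusion: only one thread updates count at a time
                count++;
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Thread[] workers = new Thread[30];
            for (int i = 0; i < workers.length; i++) {
                workers[i] = new Thread(() -> {
                    for (int j = 0; j < 10_000; j++) {
                        increment();
                    }
                });
                workers[i].start();
            }
            for (Thread t : workers) {
                t.join();
            }
            System.out.println("count = " + count);   // now reliably 300000
        }
    }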

Conclusion

Volatile is a confusing keyword, and even experienced developers often fail to use it correctly. This article started from the machine's storage hierarchy, introduced the corresponding Java memory model, and then moved on to the data-synchronization problem between main memory and working memory. That, in turn, explains the exact meaning of volatile: it only guarantees visibility, not atomicity or synchronization.
