A few days ago, I published an article explaining the differences between the JVM memory structure, the Java memory model, and the Java object model. Many readers responded asking for each topic to be explained in more depth. Of the three, the Java memory model is the most obscure, and it involves a lot of background and related knowledge.

There are many articles about the Java memory model on the web, as well as books such as Understanding the Java Virtual Machine and The Art of Java Concurrent Programming. However, many people are still confused after reading them, and some say they are even more confused than before. This article gives an overall introduction to the Java memory model, with a simple goal: after reading it, you will know what the Java memory model is, why it exists, and what problems it solves.

Many of the definitions and phrasings in this article are the author's own. Hopefully, this will give the reader a clearer picture of the Java memory model. Of course, corrections are welcome if anything is inaccurate.


Before I introduce the Java memory model, let's look at what a computer memory model is, and then at what the Java memory model builds on top of it. To talk about the computer memory model, we need to go back a bit in history and see why a memory model is needed at all.

The memory model is a real old-timer: a concept that goes back a long way and is closely tied to computer hardware. So let me explain what it has to do with hardware.

CPU and cache consistency

We all know that when a computer runs a program, every instruction is executed in the CPU, and executing instructions means working with data. That data lives in main memory, the computer's physical memory.

At first this worked fine, but as CPU technology developed, CPUs became faster and faster. Memory technology did not improve at nearly the same pace, so the gap between memory access speed and CPU execution speed grew wider and wider, and the CPU ended up spending a lot of time waiting on every memory operation.

It's like a startup. At the beginning, the founder and his employees worked well together, but as the founder grew more capable and ambitious, a gap opened up between him and the staff, and the average employee couldn't keep up with the CEO. Every order from the boss took a long time to reach and be carried out by the rank-and-file staff, which quietly dragged down the efficiency of the whole company.

However, we cannot stop advancing CPU technology just because memory reads and writes are slow; memory must not be allowed to become the bottleneck of the whole computer.

So people came up with a good idea: add a cache between the CPU and memory. The concept of a cache is well known: it keeps a copy of data, and it is fast, small in capacity, and expensive.

The execution of the program then becomes:

When a program runs, it copies the data needed for an operation from main memory into the CPU's cache. The CPU can then read and write that data directly in its cache while computing, and when the operation is complete, the result is flushed back from the cache to main memory.

After that, the company began appointing middle managers who reported directly to the CEO. Whenever the CEO had instructions, he simply told the managers and could then get on with his own work, while the managers coordinated the work of the lower-level employees, since they knew their own people and what each was responsible for. So most of the time, the CEO only needed to communicate decisions, announcements, and so on with the management.

As CPUs kept getting faster, a single layer of cache gradually became insufficient, and multi-level caches evolved.

By the order in which data is accessed and how tightly it is integrated with the CPU, the cache is divided into a level-1 cache (L1) and a level-2 cache (L2), and some high-end CPUs also have a level-3 cache (L3). All the data stored in each level of cache is a subset of the data in the next level.

Going from L1 to L3, the technical difficulty and manufacturing cost decrease, so the capacity increases correspondingly.

So, with multi-level caching, the program’s execution becomes:

When the CPU wants to read a piece of data, it first looks in the L1 cache; if the data is not there, it looks in the L2 cache; and if it is still not found, it looks in the L3 cache and finally in main memory.
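To make this lookup order concrete, here is a minimal Java sketch. It only illustrates the search order described above, not real hardware behavior; the class, field, and method names are all invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the multi-level lookup order only -- real hardware
// works on cache lines and is vastly more sophisticated.
class TieredCacheLookup {
    private final Map<Long, Long> l1 = new HashMap<>(); // smallest, fastest
    private final Map<Long, Long> l2 = new HashMap<>();
    private final Map<Long, Long> l3 = new HashMap<>();
    private final Map<Long, Long> mainMemory = new HashMap<>(); // largest, slowest

    long read(long address) {
        Long value = l1.get(address);                                     // 1. try L1
        if (value == null) value = l2.get(address);                       // 2. then L2
        if (value == null) value = l3.get(address);                       // 3. then L3
        if (value == null) value = mainMemory.getOrDefault(address, 0L);  // 4. main memory
        l1.put(address, value); // keep a copy close to the "CPU" for next time
        return value;
    }

    public static void main(String[] args) {
        TieredCacheLookup cpu = new TieredCacheLookup();
        cpu.mainMemory.put(0x42L, 7L);
        System.out.println(cpu.read(0x42L)); // misses L1/L2/L3, falls through to main memory
        System.out.println(cpu.read(0x42L)); // now served from L1
    }
}
```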

As the company grew bigger and bigger, the boss had more and more to manage, so the company's management structure was reformed into top, middle, and lower-level managers, each level managing the one below it.

A single-core CPU has only one set of L1, L2, and L3 caches; if the CPU has multiple cores, that is, a multi-core CPU, then each core has its own L1 (and often L2) cache, while the L3 (or L2) cache is shared among the cores.

There are many different kinds of companies. Some have only one big boss, who makes all the decisions. But some have arrangements such as co-managing directors or partners.

A single-core CPU is like a company with only one boss: all the orders come from him, so only one management team is needed.

A multi-core CPU is like a company founded by multiple partners. Each partner needs his own team of senior managers under his direct leadership, while the lower-level employees of the company are shared by all the partners.

Other companies, as they grew, spun off subsidiaries. Each subsidiary is like a separate CPU, sharing no resources with the others.

The following figure shows a single-CPU dual-core cache structure.


As computing power kept improving, computers began to support multithreading, and that is where the trouble starts. Let's analyze the effect of single-threaded and multithreaded execution on single-core and multi-core CPUs respectively.

Single thread. The core's cache is accessed by only one thread, so the cache is exclusive to it and there are no access conflicts or related problems.

Single-core CPU, multiple threads. Multiple threads in a process can access the process's shared data. After the CPU loads a block of memory into its cache, different threads accessing the same physical address map to the same cache location, so the cache stays valid even across thread switches. And since only one thread can be executing at any moment, cache access conflicts never occur.

Multi-core CPU, multiple threads. Each core has at least an L1 cache. If multiple threads in a process access some shared memory and run on different cores, each core keeps its own copy of that shared memory in its cache. Since the cores run in parallel, multiple threads may write to their own caches at the same time, and the copies in different caches may then disagree.

Adding caches between the CPU and main memory therefore leads to cache consistency problems in multithreaded scenarios: on a multi-core CPU, each core's cache may hold different contents for the same data.
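This is easy to observe from Java. The sketch below is a minimal illustration, assuming a multi-core machine: a plain (non-volatile) boolean flag is shared between two threads, and nothing forces the reader thread to re-read main memory, so it may spin forever on a stale value. Whether it actually hangs depends on the JVM and hardware.

```java
public class StaleFlagDemo {
    static boolean flag = false; // deliberately NOT volatile

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!flag) {
                // Busy-wait. The JIT may hoist the read of `flag` out of the
                // loop, and the core may keep serving a stale cached value,
                // so this loop may never exit.
            }
            System.out.println("reader finally saw flag = true");
        });
        reader.start();

        Thread.sleep(100);
        flag = true; // written by the main thread; may never become visible
        System.out.println("writer set flag = true");
    }
}
```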

There would be no problem if the company’s orders were issued sequentially.

If the company's orders are issued in parallel but all come from the same CEO, this mechanism still works, because his orders pass through a single management chain.

But if the company's orders are issued in parallel by multiple partners, there is a problem, because each partner gives orders only to his own direct managers, while the lower-level staff those managers oversee may be shared.

For example, partner 1 wants to fire employee A, while partner 2 wants to promote him, and once promoted he could only be fired by a meeting of all the partners. The two partners each sent their order to their own managers. After partner 1's order went out, his manager dismissed the employee, so he knew the employee was gone. But partner 2's manager had not yet heard the news, still believed employee A was on the job, and happily accepted partner 2's order to promote him.


Processor optimization and instruction reordering

As mentioned above, adding a cache between the CPU and main memory causes cache consistency problems in multithreaded scenarios. Besides this, there is another important hardware issue: to make full use of its internal execution units, the processor may execute the input code out of order. This is called processor optimization.

Besides out-of-order execution in many popular processors, the compilers of many programming languages perform similar optimizations; for example, the just-in-time (JIT) compiler of the Java Virtual Machine (JVM) also reorders instructions.

As you can imagine, all sorts of problems can result if processor optimization and compiler reordering of instructions are left unchecked.
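The classic litmus test for reordering looks like this. With no synchronization, the result r1 == 0 and r2 == 0 is a legal outcome, because each thread's write may be reordered after its read by the compiler or the processor. This sketch only shows the shape of the test; actually catching the (0, 0) outcome usually requires running it many times or using a tool such as jcstress.

```java
public class ReorderingLitmus {
    static int x = 0, y = 0;
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { x = 1; r1 = y; });
        Thread t2 = new Thread(() -> { y = 1; r2 = x; });
        t1.start(); t2.start();
        t1.join();  t2.join();
        // Intuition says at least one of r1, r2 must be 1,
        // but with reordering (0, 0) is also possible.
        System.out.println("r1 = " + r1 + ", r2 = " + r2);
    }
}
```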

In the company analogy: if the personnel department were free to split up or reorder the orders it received, it would have a serious impact on both the employees involved and the company.


Problems with concurrent programming

You may be a little confused by the hardware concepts above and wonder what they have to do with software. But you probably know something about concurrency problems such as atomicity, visibility, and ordering.

In fact, atomicity, visibility, and ordering are abstract definitions that people have given, and the problems underlying these abstractions are precisely the cache consistency, processor optimization, and instruction reordering problems described above.

Here is a quick review of the three problems, without going into depth; interested readers can explore further on their own. We say that, to keep data safe, concurrent programming must satisfy the following three properties:

Atomicity means that the CPU cannot pause an operation partway through and reschedule it: the operation is never interrupted; it either runs to completion or does not run at all. (A short demonstration follows these definitions.)

Visibility means that when multiple threads access the same variable and one thread changes the value of the variable, other threads can immediately see the changed value.

Ordering means that the program executes in the order in which the code was written.
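As a quick demonstration of the atomicity problem, the minimal sketch below increments a shared counter from two threads. count++ is really three steps (read, add, write back), so two threads can read the same old value and one increment gets lost; on a multi-core machine the final value is typically less than 200000.

```java
public class LostUpdateDemo {
    static int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                count++; // not atomic: read count, add 1, write back
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println("count = " + count); // usually < 200000
    }
}
```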

See: the cache consistency problem is really a visibility problem, processor optimization can cause atomicity problems, and instruction reordering leads to ordering problems. So from here on, this article sets the hardware-level terms aside and speaks directly of the familiar atomicity, visibility, and ordering.


What is the memory model

As mentioned earlier, cache consistency problems and instruction reordering by processors and compilers are consequences of hardware getting ever faster. So is there a good mechanism for solving these problems?

The simplest and most direct approach would be to abandon processor optimization techniques and CPU caches altogether and let the CPU interact with main memory directly. That would indeed avoid the concurrency problems, but it would be throwing the baby out with the bathwater.

Therefore, to ensure that concurrent programs satisfy atomicity, visibility, and ordering, an important concept was introduced: the memory model.

To guarantee the correctness (visibility, ordering, atomicity) of shared memory, the memory model defines a specification for the read and write behavior of multithreaded programs in a shared-memory system. These rules regulate reads and writes of memory so that instructions execute correctly. The memory model concerns the processor, the cache, concurrency, and the compiler all at once. It solves the memory access problems caused by multi-level CPU caches, processor optimization, and instruction reordering, and guarantees visibility, atomicity, and ordering in concurrent scenarios.

The memory model solves concurrency problems mainly in two ways: restricting processor optimizations and using memory barriers. This article will not go into these underlying mechanisms; interested readers can study them on their own.


What is the Java memory model

Earlier, I introduced the computer memory model, which is an important specification for solving concurrency problems in multithreaded scenarios. So how is it implemented? That can vary from language to language.

As we know, Java programs run on the Java Virtual Machine. The Java Memory Model (JMM) is a memory-model specification that shields programmers from the memory access differences of particular hardware and operating systems: a mechanism and specification that guarantees Java programs access memory consistently on every platform.

When I refer to the Java memory model, I generally mean the revised memory model introduced in JDK 5, described mainly by JSR-133: Java™ Memory Model and Thread Specification. Interested readers can consult the PDF (http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf).

The Java memory model stipulates that all variables are stored in main memory and that each thread has its own working memory. A thread's working memory holds copies of the variables from main memory that the thread uses, and all of a thread's operations on variables must take place in its working memory rather than directly in main memory. Threads cannot access each other's working memory directly; passing the value of a variable between threads requires synchronization between their working memory and main memory.

The JMM governs this synchronization between working memory and main memory: it specifies how and when the data is synchronized.


Main memory and working memory here can be loosely likened to main memory and the cache in the computer memory model. Note in particular that they do not map directly onto the heap, stack, method area, and other parts of the JVM memory structure. As Understanding the Java Virtual Machine puts it, if one must force a correspondence, main memory corresponds roughly to the object instance data in the Java heap, and working memory to parts of the virtual machine stack.

So, to summarize, the JMM is a specification that addresses the problems of local-memory inconsistency, compiler reordering of code instructions, and out-of-order execution by processors when multiple threads communicate through shared memory. Its goal is to guarantee atomicity, visibility, and ordering in concurrent programming scenarios.


Implementation of the Java memory model

Those of you familiar with Java multithreading know that Java provides a series of keywords and tools for concurrency, such as volatile, synchronized, final, and the classes in the java.util.concurrent package. These are the concurrency primitives that the Java memory model provides to programmers by encapsulating the underlying implementation.

When developing multithreaded code, we can simply use keywords like synchronized to control concurrency without ever worrying about compiler optimizations, cache consistency, and other low-level issues. So, beyond defining a specification, the Java memory model also provides a set of primitives, wrapping the underlying implementation, for developers to use directly.

This article will not cover the use of every keyword one by one, as there is plenty of material on the web about each of them; readers can study them on their own. What matters more here is that, as mentioned earlier, concurrent programming has to solve the problems of atomicity, visibility, and ordering. Let's look at the mechanisms Java uses to guarantee each of them.

Atomicity

In Java, the two bytecode instructions monitorenter and monitorexit are provided to ensure atomicity; the synchronized keyword is the Java-language construct that corresponds to these two instructions.

Therefore, synchronized can be used in Java to ensure that operations within methods and code blocks are atomic.
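For example, here is a sketch of the counter from earlier made safe with synchronized. Compiling this and running javap -c on it shows the monitorenter/monitorexit pair for the block form; a synchronized method is marked with the ACC_SYNCHRONIZED flag instead.

```java
public class SafeCounter {
    private final Object lock = new Object();
    private int count = 0;

    public void increment() {
        synchronized (lock) { // compiles to monitorenter
            count++;          // only one thread at a time can be here
        }                     // compiles to monitorexit
    }

    public synchronized int get() { // method form: ACC_SYNCHRONIZED flag
        return count;
    }
}
```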

Visibility

The Java memory model achieves visibility by using main memory as the transfer medium: after a variable is modified, the new value is synchronized back to main memory, and before a variable is read, its value is refreshed from main memory.

The volatile keyword guarantees that a modified variable is synchronized to main memory immediately after it is written, and that it is refreshed from main memory before each use. Therefore, volatile can be used to guarantee the visibility of variables across threads.
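For example, marking the flag from the earlier stale-flag sketch as volatile is enough to make the write visible, so the reader loop terminates promptly:

```java
public class VolatileFlagDemo {
    static volatile boolean flag = false; // the only change from StaleFlagDemo

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!flag) {
                // each iteration now re-reads flag from main memory
            }
            System.out.println("reader saw flag = true");
        });
        reader.start();

        Thread.sleep(100);
        flag = true; // immediately synchronized back to main memory
    }
}
```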

Besides volatile, the Java keywords synchronized and final can also provide visibility; they just achieve it in different ways, which this article will not expand on.

Ordering

In Java, the keywords synchronized and volatile can both be used to guarantee ordering between threads, but they work differently:

the volatile keyword forbids instruction reordering, while synchronized ensures that only one thread executes the protected code at a time.
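A classic example where both guarantees matter is double-checked locking. Inside new Singleton() there are roughly three steps: allocate memory, initialize the object, and assign the reference. Without volatile, the last two steps may be reordered, so another thread could observe a non-null reference to a not-yet-initialized object:

```java
public class Singleton {
    // volatile forbids the harmful reordering of "initialize" and "assign"
    private static volatile Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        if (instance == null) {                  // first check, lock-free fast path
            synchronized (Singleton.class) {     // only one thread may initialize
                if (instance == null) {          // second check under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```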

OK, so that is a brief introduction to the keywords that solve atomicity, visibility, and ordering in Java concurrent programming. As readers may have noticed, synchronized seems to be all-purpose, satisfying all three properties at once, which is why so many people abuse it.

However, synchronized hurts performance, and although the JVM provides many lock optimization techniques, overusing it is not recommended.


Summary

By the end of this article, you should know what the Java memory model is, why it exists, and how Java solves the problems it addresses.

I hope you will go on to learn more about the memory-model-related keywords in Java and write a few examples to experience them for yourself. For further reading, see the books “Understanding the Java Virtual Machine in Depth” and “The Art of Java Concurrent Programming.”
