There are many articles on the web about the Java memory model, yet many people are still confused after reading them, and some say they come away even more confused.
After reading this article, you'll know what the Java memory model is, why it exists, and what problems it solves.
Many of the explanations in this article reflect the author's own understanding; hopefully they will give the reader a clearer picture of the Java memory model.
Why do we have a memory model
Before we introduce the Java memory model, let's look at what a computer memory model is, and then at what the Java memory model adds on top of it.
To understand the computer memory model, we need to look at its history and see why a memory model exists at all.
The memory model is an old concept, closely tied to computer hardware. So let me explain what it has to do with hardware.
CPU and cache consistency
When a computer executes a program, every instruction runs in the CPU, and executing instructions means reading and writing data.
That data is stored in main memory, the computer's physical memory.
At first this caused no trouble, but as CPU technology developed, CPUs executed faster and faster.
Memory technology, meanwhile, changed little, so the gap between memory read/write speed and CPU execution speed grew ever wider, and the CPU spent a great deal of time waiting for each memory operation.
It's like a startup. In the beginning, the founder and the employees work well together, but as the founder's ability and ambition grow, a gap opens up, and the ordinary employees increasingly fail to keep up with the CEO.
Every order from the boss takes a long time to reach the front-line staff because of their limited understanding and execution ability, which quietly drags down the efficiency of the entire company.
But slow memory reads and writes cannot be allowed to halt the development of CPU technology; you can't let memory become the processing bottleneck, can you?
So people came up with a good idea: add a cache between the CPU and main memory.
The concept of a cache is well known: it keeps a copy of data, and it is characterized by high speed, small capacity, and high cost.
As the program runs, it copies the data it needs from main memory to the CPU’s cache.
The CPU can then perform calculations by reading and writing data directly from its cache and, when the computation is complete, flushing the data from the cache to main memory.
Back at the company, the CEO started appointing middle managers who reported directly to him. Whenever he had instructions, he told the managers directly and could then get on with his own work; the managers coordinated the work of the lower-level employees.
Because the managers know their people and what they're responsible for, most of the time the CEO only needs to communicate decisions, announcements, and so on with the management.
As CPU capability kept improving, one layer of cache gradually became insufficient, and multi-level caches evolved.
The CPU cache can be divided into level-1 (L1), level-2 (L2), and level-3 (L3) caches, ordered by how early data is looked up there and how tightly each is coupled to the CPU. All data stored in one level of cache is also part of the next level.
From L1 to L3, the technical difficulty and manufacturing cost decrease, so the capacity correspondingly increases.
With multi-level caching, program execution becomes: when the CPU reads a piece of data, it looks first in the L1 cache; if it doesn't find it, it looks in the L2 cache, then in the L3 cache, and finally in main memory.
As the company got bigger, the boss had more and more to manage, so the company's management was reformed into top, middle, and bottom layers of managers, level by level.
A single-core CPU has just one set of L1, L2, and L3 caches; in a multi-core CPU, each core has its own L1 (and often L2) cache, while the L3 (or L2) cache is shared among the cores.
Companies come in many forms. Some have only one big boss who makes all the decisions, while others have mechanisms such as co-managing directors or partners.
A single-core CPU is like a company with only one boss: all orders come from him, so only one management team is needed.
A multi-core CPU is like a company founded by multiple partners: each partner needs his own team of senior managers under his direct leadership, while the lower-level employees are shared by all the partners.
Other companies, as they grow, spin off subsidiaries. Each subsidiary is like a separate CPU, sharing no resources with the others.
The following figure shows a single-CPU dual-core cache structure:
As computing power grew, computers began to support multithreading. So the question arises: what happens with a single thread, with multiple threads on a single-core CPU, and with multiple threads on a multi-core CPU? Let's analyze each case.
Single thread: the CPU core's cache is accessed by only one thread. The cache is exclusive to that thread, so there are no access conflicts.
Single-core CPU, multiple threads: multiple threads in a process can access the process's shared data. After the CPU loads a block of memory into its cache, different threads accessing the same physical address map to the same cache location, so the cache stays valid even across thread switches.
And because only one thread can actually be executing at any moment, cache access conflicts do not occur.
Multi-core CPU, multiple threads: each core has at least an L1 cache. If multiple threads access shared memory in a process, and each thread executes on a different core, each core keeps a copy of the shared data in its own cache.
Since the cores can run in parallel, multiple threads may write to their own caches at the same time, and the data in those caches may then differ.
So adding caches between the CPU and main memory creates the cache consistency problem in multithreaded scenarios: on a multi-core CPU, each core's own cache may hold different contents for the same data.
There would be no problem if the company's orders were issued sequentially.
If the company's orders are issued in parallel but all come from the same CEO, this mechanism also works fine, because his orders pass through a single management chain.
But if the company's orders are issued in parallel by multiple partners, there is a problem.
Each partner gives orders only to his own immediate managers, yet the low-level staff under those managers may be shared.
For example, partner 1 wants to fire employee A, while partner 2 wants to promote him, and if A has already been promoted, firing him would require a joint decision by the partners at a meeting. The two partners each send their orders to their own managers.
After partner 1 gives the order and his manager fires employee A, that manager knows the employee is gone.
But partner 2's manager, who has not yet heard the news, still believes employee A is on the job, and so he happily accepts partner 2's order to promote A.
Processor optimization and instruction reordering
As mentioned above, adding a cache between the CPU and main memory can cause cache consistency issues in multithreaded scenarios.
Besides this, there is another important hardware issue: to make full use of its internal arithmetic units, the processor may execute the input code out of order. This is processor optimization.
And besides out-of-order execution in many popular processors, the compilers of many programming languages perform similar optimizations; for example, the just-in-time (JIT) compiler of the Java Virtual Machine (JVM) reorders instructions.
As you can imagine, all sorts of problems can result if processor optimization and compiler instruction reordering are left unchecked.
It's like staff reorganizations: if the personnel department, after receiving multiple orders, were allowed to split and rearrange them at will, it would seriously affect both the employees and the company.
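To make reordering concrete, here is a minimal sketch in Java (the class and field names are illustrative, not from any particular library). Without synchronization, the JIT compiler or the processor is allowed to reorder the two stores in writer(), so reader() may observe ready == true while data is still 0:

```java
public class ReorderingExample {
    static int data = 0;
    static boolean ready = false; // plain fields: stores may be reordered

    static void writer() {
        data = 42;    // store 1
        ready = true; // store 2: may become visible before store 1
    }

    static void reader() {
        if (ready) {
            // With no happens-before ordering, this may print 0 on some
            // JVMs and hardware: the reader saw store 2 but not store 1.
            System.out.println(data);
        }
    }

    public static void main(String[] args) {
        new Thread(ReorderingExample::writer).start();
        new Thread(ReorderingExample::reader).start();
    }
}
```

In practice the anomaly is rare and timing-dependent, which is exactly what makes such bugs hard to reproduce.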
Concurrent programming problems
You may be a little confused by the hardware concepts above and unsure what they have to do with software.
But you should be familiar with the classic problems of concurrent programming: atomicity, visibility, and ordering.
In fact, atomicity, visibility, and ordering are abstract definitions, and the underlying problems behind these abstractions are the cache consistency, processor optimization, and instruction reordering problems mentioned above.
To briefly review these three properties: in concurrent programming, to keep data safe, the following three characteristics must be satisfied:
- Atomicity: within a single operation, the CPU may not pause midway and be rescheduled; the operation must not be interrupted, and it either completes in full or does not execute at all.
- Visibility: when multiple threads access the same variable and one thread changes its value, the other threads can immediately see the changed value.
- Ordering: the program executes in the order in which the code is written.
As you can see, the cache consistency problem is really a visibility problem, processor optimization can cause atomicity problems, and instruction reordering leads to ordering problems.
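Here is a minimal sketch (illustrative names, not a definitive implementation) showing an atomicity violation in Java: count++ is actually three steps (read, add, write back), so two threads can interleave and lose updates:

```java
public class AtomicityExample {
    static int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                count++; // not atomic: read, add, write back
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Expected 20000, but interleaved updates are usually lost,
        // so the printed result is often smaller.
        System.out.println(count);
    }
}
```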
With that mapping in mind, this article leaves the hardware concepts behind and uses the familiar terms atomicity, visibility, and ordering directly.
What is the memory model
As mentioned earlier, cache consistency issues and processor instruction reordering are the products of hardware upgrades. So, is there a good mechanism to solve these problems?
The simplest and most direct way would be to abandon processor optimization techniques and the CPU cache, and let the CPU interact with main memory directly.
Doing so would indeed avoid the concurrency problems of multithreading, but it would be throwing the baby out with the bath water.
Therefore, to ensure that concurrent programs can satisfy atomicity, visibility, and ordering, an important concept emerged: the memory model.
To guarantee the correctness (visibility, ordering, atomicity) of shared memory, the memory model defines a specification for the read and write operations of multithreaded programs in a shared-memory system.
These rules regulate reads and writes of memory so as to guarantee the correctness of instruction execution. They involve the processor, the cache, concurrency, and the compiler.
The memory model solves the memory access problems caused by CPU multi-level caches, processor optimization, and instruction reordering, guaranteeing visibility, atomicity, and ordering in concurrent scenarios.
The memory model solves the concurrency problem in two main ways:
- Constraining processor optimizations
- Using memory barriers
This article will not go into these underlying principles; interested readers can explore them on their own.
What is the Java memory model
Above, we introduced the computer memory model, an important specification for solving concurrency problems in multithreaded scenarios.
What about the implementation? Different programming languages may have different implementations.
As we know, Java programs run on the Java Virtual Machine. The Java Memory Model (JMM) is a memory-model specification that shields programs from the differences in memory access across hardware and operating systems: a mechanism and specification that guarantees Java programs access memory consistently on every platform.
When I refer to the Java Memory Model, I generally mean the revised memory model introduced in JDK 5, described mainly by JSR-133: JavaTM Memory Model and Thread Specification.
For those interested, check out this PDF:
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf
The Java memory model specifies that all variables are stored in main memory and that each thread has its own working memory.
A thread's working memory holds copies of the main-memory variables used by that thread. All of a thread's operations on variables must be performed in its working memory rather than by reading and writing main memory directly.
Different threads cannot directly access the variables in each other's working memory; passing variable values between threads requires synchronizing them through each thread's working memory and main memory.
The JMM governs this synchronization between working memory and main memory: it specifies how and when data is synchronized.
Main memory and working memory here can be loosely likened to main memory and cache in the computer memory model.
Note in particular that main memory and working memory do not map directly onto the Java heap, stack, method area, and so on in the JVM's memory structure.
In Understanding the Java Virtual Machine, the author notes that, if the definitions of variables, main memory, and working memory must be matched up at all, main memory corresponds primarily to the object instance data in the Java heap, while working memory corresponds to portions of the virtual machine stack.
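The classic symptom of the working-memory model is a stale read. The sketch below (illustrative names; the exact behavior varies by JVM and JIT settings) has a worker thread spin on a plain boolean; because nothing forces it to re-read main memory, it may never see the main thread's write:

```java
public class StaleReadExample {
    static boolean stop = false; // plain field: no visibility guarantee

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // Busy-wait. The JIT may hoist the read of `stop` out of
                // the loop, so the worker keeps using its stale copy.
            }
            System.out.println("worker stopped");
        });
        worker.start();
        Thread.sleep(1000);
        stop = true; // the worker's working memory may never be refreshed
    }
}
```

On a typical HotSpot JVM this program often hangs; we'll see the volatile fix in the visibility section below.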
So, to summarize: the JMM is a specification that, when multiple threads communicate through shared memory, solves the problems caused by inconsistent local working memories, compiler reordering of code instructions, processor out-of-order execution, and so on.
Its goal is to guarantee atomicity, visibility, and ordering in concurrent programming scenarios.
Implementation of the Java memory model
Those of you familiar with Java multithreading know that Java provides a series of keywords and tools related to concurrency, such as volatile, synchronized, final, and the java.util.concurrent package.
These are the primitives that the Java memory model provides to programmers by encapsulating the underlying implementation.
When developing multithreaded code, we can directly use keywords like synchronized to control concurrency, without worrying about underlying compiler optimizations, cache consistency, and so on.
Therefore, in addition to defining a set of specifications, the Java memory model provides a set of primitives that encapsulate the underlying implementation for developers to use directly.
As mentioned earlier, concurrent programming must address atomicity, visibility, and ordering. Let's take a look at how each is guaranteed in Java.
Atomicity
In Java, two bytecode instructions, monitorenter and monitorexit, are provided to guarantee atomicity.
The keyword corresponding to these two bytecodes in Java is synchronized.
Therefore, synchronized can be used in Java to ensure that operations within methods and code blocks are atomic.
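A minimal sketch of a thread-safe counter (illustrative class name; this fixes the lost-update example shown earlier):

```java
public class SynchronizedCounter {
    private int count = 0;

    // javac compiles the synchronized block into monitorenter/monitorexit
    // around the increment, so only one thread at a time can run it.
    public void increment() {
        synchronized (this) {
            count++; // the read-modify-write now appears indivisible
        }
    }

    // A synchronized instance method acquires the same monitor on `this`.
    public synchronized int get() {
        return count;
    }
}
```

Note that a synchronized method is marked with a flag in the class file rather than with explicit monitorenter/monitorexit instructions, but the monitor semantics are the same.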
Visibility
For visibility, the Java memory model relies on main memory as the transfer medium: the new value of a variable is synchronized back to main memory after it is modified, and the value is refreshed from main memory before the variable is read.
The volatile keyword in Java guarantees that modifications to the variables it marks are synchronized to main memory immediately after each write.
A volatile variable is also refreshed from main memory before each use. Therefore, volatile can be used to guarantee the visibility of variables in multithreaded operations.
Besides volatile, the Java keywords synchronized and final can also provide visibility; they just implement it in different ways, which we won't expand on here.
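Continuing the stop-flag sketch from earlier (illustrative names), marking the flag volatile makes the write reliably visible to the worker:

```java
public class VolatileStopFlag {
    static volatile boolean stop = false; // volatile: writes become visible

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            // Each iteration re-reads `stop`; the JIT may not cache it.
            while (!stop) { }
            System.out.println("worker stopped");
        });
        worker.start();
        Thread.sleep(1000);
        stop = true; // the worker is guaranteed to eventually observe this
    }
}
```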
Ordering
In Java, synchronized and volatile can both be used to guarantee ordering between threads.
Their implementations differ: the volatile keyword forbids instruction reordering, while the synchronized keyword guarantees that only one thread operates at a time.
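A well-known place where both matter is double-checked locking (sketch below; Singleton is an illustrative class name). Without volatile, the store of the reference could be reordered before the constructor finishes, letting another thread observe a half-built object:

```java
public class Singleton {
    // volatile forbids reordering the reference store with the
    // constructor's writes, making publication safe.
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                 // first check, without a lock
            synchronized (Singleton.class) {    // one thread initializes at a time
                if (instance == null) {         // second check, under the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```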
OK, that was a brief introduction to the keywords that address atomicity, visibility, and ordering in Java concurrent programming.
As you may have noticed, synchronized seems to be all-purpose, satisfying all three properties at once, which is why so many people abuse it.
However, synchronized hurts performance, and although the compiler provides many lock optimization techniques, its overuse is still not recommended.
Conclusion
Having reached the end of this article, you should now know what the Java memory model is, why it exists, and how it is expressed in Java.