1: What is the JVM?

So what is the JVM, and what is it for? I've listed three concepts here: the first is the JVM, the second is the JDK, and the third is the JRE. I'm sure you're familiar with all three and have used them all, but do you have a clear picture of what each one actually is? Let me walk through how I understand them.

(1): JVM

The JVM, short for Java Virtual Machine, is a specification for a computing device. The JVM is an abstract, imaginary computer, implemented by emulating various computer functions on a real machine.

With the introduction of the Java virtual machine, Java programs no longer need to be recompiled to run on different platforms. The Java language uses the JVM to mask platform-specific details, so the Java compiler only needs to generate object code (bytecode) that targets the JVM, and that bytecode can then run unmodified on many platforms. When the JVM executes the bytecode, it translates it into machine instructions for the specific platform it is running on.

This is why Java has the ability to "compile once, run anywhere".

(2): JDK

The Java Development Kit (JDK) is a software development kit (SDK) for the Java language.

The basic components of the JDK include:

  1. javac – compiler that converts source programs into bytecode

  2. jar – packaging tool that bundles related class files into a single archive

  3. javadoc – documentation generator that extracts documentation from source comments

  4. jdb – debugger and error-checking tool

  5. java – launcher that runs compiled Java programs (.class files)

  6. appletviewer – applet browser that executes Java applets embedded in HTML files

  7. javah – generates C header and stub files so that Java programs and native C code can call each other

  8. javap – Java disassembler that displays the accessible members of compiled class files and the meaning of the bytecode

  9. jconsole – Java tool for monitoring and debugging running JVMs

(3): JRE

The Java Runtime Environment (JRE) is the collection of components required to run Java programs, including a standard implementation of the JVM and the Java core class libraries.

It consists of two parts: the Java Runtime Environment itself and the Java Plug-in.

Java Runtime Environment:

  • It is the Java platform on which applications can be run, tested, and distributed.

  • It includes the Java Virtual Machine (JVM), the Java core class libraries, and support files.

  • It does not include the development tools found in the JDK: compilers, debuggers, and so on.

  • The JRE requires auxiliary software, the Java Plug-in, to run applets in a browser.

The Java Plug-in:

Allows Java applets and JavaBeans components to run in browsers using Sun's Java Runtime Environment (JRE), instead of the browser's default Java runtime environment. The Java Plug-in is available for Netscape Navigator and Microsoft Internet Explorer.

2: JVM runtime data area

While a Java program executes, the Java virtual machine divides the memory it manages into several data areas. Each area has its own purpose and its own lifetime: some are created when the virtual machine process starts, while others are created and destroyed along with user threads. According to the Java Virtual Machine Specification (Java SE 7), the memory managed by the JVM includes the following runtime data areas, as shown in the figure below:

2.1. Program counter

The Program Counter Register is a small memory area that can be thought of as a line-number indicator for the bytecode the current thread is executing. In the conceptual model of the virtual machine (it is only a conceptual model; actual virtual machines may implement it in more efficient ways), the bytecode interpreter works by changing the counter's value to select the next bytecode instruction to execute; basic control-flow features such as branches, loops, jumps, exception handling, and thread restoration all rely on this counter.

Multithreading in the Java virtual machine is implemented by switching between threads and allocating processor time slices. At any given moment, one processor core executes instructions from only one thread. Therefore, to restore the correct execution position after a thread switch, each thread needs its own independent program counter; the counters of different threads do not affect each other and are stored independently.

If the thread is executing a Java method, the counter records the address of the bytecode instruction being executed; if it is executing a native method, the counter value is undefined (empty).

This is the only memory region for which the Java Virtual Machine specification does not define any OutOfMemoryError condition.

The program counter is thread-private and has the same life cycle as its thread (it lives and dies with the thread).

2.2. Java Virtual Machine Stack

The Java virtual machine stack describes the memory model for executing Java methods: each method, when executed, creates a stack frame to store information such as local variables, the operand stack, dynamic links, and the method exit. The lifetime of each method, from invocation to completion, corresponds to its stack frame being pushed onto and then popped off the virtual machine stack.

In the Java Virtual Machine specification, two exceptions are specified for this area:

A StackOverflowError is thrown if a thread requests a stack depth greater than the virtual machine allows.

If the virtual machine stack can be dynamically extended (as most current Java virtual machines allow), an OutOfMemoryError is thrown when enough memory cannot be allocated during extension.
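The first of these exceptions can be triggered deliberately with unbounded recursion. Here is a minimal sketch (the class name `StackDepthDemo` is mine, not from the text); the exact depth reached depends on the stack size, which can be tuned with -Xss:

```java
// Hypothetical demo class: recursion with no base case keeps pushing
// stack frames until the thread exceeds its allowed stack depth.
public class StackDepthDemo {
    private static int depth = 0;

    private static void recurse() {
        depth++;   // one more stack frame
        recurse(); // never returns normally
    }

    /** Returns the recursion depth reached before the stack overflowed. */
    public static int overflowDepth() {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the thread requested more stack depth than the VM allows
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Stack overflowed at depth " + overflowDepth());
    }
}
```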

Like the program counter, the Java virtual machine stack is thread-private and has the same lifetime as its thread.

2.3. Native method stack

The native method stack plays a role very similar to the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods, while the native method stack serves the native methods used by the virtual machine. The virtual machine specification does not mandate the language, usage, or data structures of methods in the native method stack, so individual virtual machines are free to implement it as they wish.

Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError exceptions.

Like the virtual machine stack, the native method stack is thread-private.

2.4. Java Heap

For most applications, the Java Heap is the largest chunk of memory managed by the Java virtual machine. The Java heap is an area of memory that is shared by all threads and is created at virtual machine startup. The sole purpose of this memory area is to hold object instances, and almost all object instances and arrays are allocated memory here.

The Java heap is the primary area managed by the garbage collector and is often referred to as the "Garbage Collected Heap". From the perspective of memory reclamation, the Java heap can be subdivided into the young generation and the old generation; the young generation can be further divided into Eden space, From Survivor space, and To Survivor space.

According to the Java Virtual Machine specification, the Java heap may occupy physically discontinuous memory, as long as it is logically contiguous, much like our disk space. It can be implemented as either fixed-size or extensible, though most current virtual machines are extensible (controlled by -Xms and -Xmx). An OutOfMemoryError is thrown if the heap has no memory left to allocate an instance and can no longer expand.
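The heap limit can be demonstrated with a single oversized allocation. A hedged sketch (the class name is mine; on HotSpot with default settings a request of this size fails immediately, though the exact error message varies by JVM implementation):

```java
// Hypothetical demo class: a single allocation far larger than the heap
// cannot be satisfied even after the heap expands, so the JVM throws
// OutOfMemoryError (message text differs between JVM implementations).
public class HeapLimitDemo {
    public static boolean allocationFails() {
        try {
            // Integer.MAX_VALUE longs would need roughly 16 GB in one block.
            long[] huge = new long[Integer.MAX_VALUE];
            return huge == null; // unreachable in practice
        } catch (OutOfMemoryError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("OutOfMemoryError thrown: " + allocationFails());
    }
}
```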

2.5. Method Area

The method area, like the Java heap, is an area of memory shared by all threads; it holds information about classes loaded by the virtual machine, constants, static variables, JIT-compiled code, and more. The method area is created when the virtual machine starts.

The Java Virtual Machine specification places very loose requirements on the method area: like the heap, it does not require contiguous memory, can be fixed-size or extensible, and may even choose not to implement garbage collection.

According to the Java Virtual Machine specification, OutOfMemoryError is thrown if the memory space of the method area does not meet the memory allocation requirements.

2.6. Runtime constant pool

The Runtime Constant Pool is part of the method area. The constant pool table in a class file stores the various literals and symbolic references generated at compile time; this table is placed in the runtime constant pool after the class is loaded into the method area.
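String literals are the most familiar residents of the runtime constant pool. A small sketch of the difference between a pooled literal and a heap-allocated String, using `String.intern()`, which returns the pooled instance (the class name is mine):

```java
// Hypothetical demo class: a string literal lives in the runtime constant
// pool, while `new String(...)` creates a separate object on the heap.
public class ConstantPoolDemo {
    public static void main(String[] args) {
        String a = "jvm";             // pooled literal
        String b = new String("jvm"); // distinct heap object, same characters

        System.out.println(a == b);          // false: different references
        System.out.println(a == b.intern()); // true: intern() returns the pooled copy
        System.out.println(a.equals(b));     // true: equal contents
    }
}
```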

2.7. Direct Memory

Direct Memory is not part of the runtime data areas of the virtual machine, nor is it defined in the Java Virtual Machine specification. However, this portion of memory is used frequently and can also cause OutOfMemoryError exceptions.

JDK 1.4 introduced the NIO (New Input/Output) classes, with a channel- and buffer-based I/O approach that can allocate off-heap memory directly using native libraries. The program then operates on that memory through a DirectByteBuffer object stored in the Java heap, which acts as a reference to it. This can significantly improve performance in some scenarios because it avoids copying data back and forth between the Java heap and the native heap.
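A minimal sketch of direct memory in action via `ByteBuffer.allocateDirect` (the class name is mine; the API shown is the standard java.nio one):

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: allocateDirect reserves memory outside the Java
// heap; the DirectByteBuffer object on the heap merely references it.
public class DirectBufferDemo {
    /** Writes and reads back an int through off-heap (direct) memory. */
    public static int roundTrip() {
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        direct.putInt(42); // write into native memory
        direct.flip();     // switch from writing to reading
        return direct.getInt();
    }

    /** Reports whether a direct buffer really is off-heap. */
    public static boolean isDirectBuffer() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }

    public static void main(String[] args) {
        System.out.println(isDirectBuffer());                   // true
        System.out.println(ByteBuffer.allocate(16).isDirect()); // false: wraps an ordinary byte[]
        System.out.println(roundTrip());                        // 42
    }
}
```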

This has been an overview of the runtime data areas of the Java virtual machine; more about the JVM will follow.


3: JVM memory model

The Java Memory Model, or JMM, defines how the Java Virtual Machine (JVM) works with computer memory (RAM). The JVM models an entire computer, so the JMM is part of the JVM.

If we want to understand concurrent programming in Java, we need to understand the Java memory model well. The Java memory model defines the visibility of shared variables between multiple threads, and how shared variables can be synchronized when needed. The original Java memory model was not very effective, so it was reworked in Java 1.5; that revised model is still in use as of Java 8.

About concurrent programming

In the world of concurrent programming, there are two key issues: communication and synchronization between threads.

Communication between threads

Thread communication refers to the mechanism by which threads exchange information. In imperative programming, there are two communication mechanisms between threads: shared memory and message passing.

In the shared-memory concurrency model, threads share the program's common state and communicate implicitly by writing and reading that state in memory. The typical shared-memory communication style is communication through shared objects.

In the message-passing concurrency model, threads have no common state and must communicate explicitly by sending messages. Typical message-passing mechanisms in Java are wait() and notify().

For information about communication between Java threads, see Thread Signal.

Synchronization between threads

Synchronization is the mechanism that a program uses to control the relative order in which operations occur between different threads.

In the shared memory concurrency model, synchronization is done explicitly. Programmers must explicitly specify that a method or piece of code needs to be executed mutually exclusive between threads.

In the message-passing concurrency model, synchronization is implicit, because a message must be sent before it can be received.

Concurrency in Java uses the shared-memory model

Communication between Java threads is always implicit and completely transparent to the programmer. Java programmers writing multithreaded programs are likely to encounter all sorts of strange memory visibility problems if they don’t understand how implicit communication between threads works.

Java memory model

The Java shared memory model (JMM) determines when a thread’s write to a shared variable is visible to another thread. From an abstract point of view, JMM defines an abstract relationship between threads and main memory: Shared variables between threads are stored in main memory, and each thread has a private local memory where it stores copies of shared variables to read/write. Local memory is an abstraction of the JMM and does not really exist. It covers caches, write buffers, registers, and other hardware and compiler optimizations.

From the above figure, thread A and thread B must go through the following two steps in order to communicate:

  1. First, thread A flushes the updated shared variables from its local memory A to main memory.

  2. Then, thread B reads from main memory the shared variables that thread A updated.

The following is a schematic illustration of these two steps:

As shown in the figure above, local memories A and B each hold a copy of the shared variable x from main memory. Suppose that initially all three copies of x have the value 0. When thread A executes, it temporarily stores the updated value of x (say, 1) in its local memory A. When thread A and thread B need to communicate, thread A first flushes the modified x from its local memory to main memory, so the value of x in main memory becomes 1. Thread B then reads thread A's updated x from main memory, and the copy of x in thread B's local memory also becomes 1.

Taken as a whole, these two steps are essentially thread A sending a message to thread B, and this communication must go through main memory. The JMM provides Java programmers with memory-visibility guarantees by controlling the interaction between main memory and each thread's local memory.
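The two steps above can be sketched with two threads and one shared variable. This is only an illustration (the class and field names are mine); `volatile` and `join()` are used so the write is guaranteed to be flushed and visible, mirroring the flush-then-read sequence described in the text:

```java
// Hypothetical demo class: thread A writes x, thread B reads it, and the
// exchange goes through main memory. `volatile` plus join() makes the
// flush-then-read sequence reliable for this illustration.
public class MainMemoryDemo {
    private static volatile int x = 0;        // shared variable in main memory
    private static volatile int seenByB = -1; // what thread B observed

    /** Runs the two-step communication and returns the value B read. */
    public static int communicate() throws InterruptedException {
        Thread a = new Thread(() -> x = 1);   // step 1: A's update is flushed to main memory
        a.start();
        a.join();

        Thread b = new Thread(() -> seenByB = x); // step 2: B reads the updated value
        b.start();
        b.join();
        return seenByB;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("thread B sees x = " + communicate()); // 1
    }
}
```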

As mentioned above, the Java memory model is an abstraction, so how does it work in practice? To understand this better, the following sections describe the JVM's implementation of the Java memory model, the hardware memory architecture, and the bridge between them.

JVM implementation of the Java memory model

Within the JVM, the Java memory model divides memory into two parts: the thread stack area and the heap area. The following diagram shows the logical view of the Java memory model within the JVM:

Each thread running in the JVM has its own thread stack, which contains information about the method call performed by the current thread, also known as the call stack. The call stack changes over time as the code executes.

The thread stack also contains all local variable information for the current method. A thread can only read its own thread stack, that is, local variables in a thread are not visible to other threads. Even if two threads execute the same piece of code, they each create local variables in their own thread stack, so local variables in each thread will have their own version.

All local variables of primitive types (boolean, byte, short, char, int, long, float, double) are stored directly on the thread stack, and their values are independent between threads. A thread can pass a copy of a primitive local variable to another thread, but the variables themselves cannot be shared.

The heap contains all objects created by the Java application, regardless of which thread created them, including instances of the wrapper classes for primitive types (such as Byte, Integer, Long, and so on). Whether an object is referenced by a member variable or by a local variable in a method, the object itself is stored on the heap.

The following figure shows how the call stack and local variables are stored on the stack and objects are stored on the heap:

A local variable of primitive type is stored entirely on the stack.

A local variable may also be a reference to an object, in which case the local reference is stored on the stack, but the object itself is still stored in the heap.

Local variables inside an object's member methods are still stored on the stack, even though the object they belong to lives on the heap.

The member variables of an object, whether primitive or wrapped, are stored in the heap.

Static variables and information about the class are stored in the heap along with the class itself.

Objects in the heap can be shared by multiple threads. If a thread holds a reference to an object, it can access the object's member variables. If two threads call the same method on the same object at the same time, both can access the object's member variables simultaneously, but each thread gets its own copy of the local variables on its own thread stack.
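A small sketch of this split (the class names are mine): each thread gets its own copy of the local variable on its own stack, while both threads update the same heap object. The threads are run one after the other here, so no synchronization is needed yet:

```java
// Hypothetical demo class: the Counter object lives on the heap and is
// shared; the `local` variable exists once per thread, on that thread's stack.
public class SharedObjectDemo {
    static class Counter {
        int hits; // member variable: stored on the heap with the object
    }

    public static int run() throws InterruptedException {
        Counter shared = new Counter(); // one object, referenced by both threads

        Runnable task = () -> {
            int local = 0;              // each thread gets its own copy
            for (int i = 0; i < 1000; i++) {
                local++;
            }
            shared.hits += local;       // both threads update the same heap object
        };

        Thread t1 = new Thread(task);
        t1.start();
        t1.join();  // run sequentially, so the update is safe without locks
        Thread t2 = new Thread(task);
        t2.start();
        t2.join();
        return shared.hits;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // 2000: both threads' local totals reached the shared object
    }
}
```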

The following figure shows the process described above:

Hardware memory architecture

No matter what the memory model is, it ultimately runs on computer hardware, so it is necessary to understand the computer hardware memory architecture. The following diagram briefly describes the contemporary computer hardware memory architecture:

Modern computers typically have two or more CPUs, and each CPU may contain multiple cores. Therefore, if our application is multithreaded, these threads may run in parallel across CPU cores.

Inside the CPU there is a set of CPU registers, which are the CPU's own memory. The CPU can manipulate registers much faster than it can access the computer's main memory. Between main memory and the CPU registers there is also a CPU cache, which the CPU can access faster than main memory but slower than its registers. Some CPUs have multiple cache layers (level 1, level 2, and so on). A computer's main memory, also known as RAM, is accessible to all CPUs and is much larger than the caches and registers mentioned above.

When a CPU needs to access main memory, it first reads part of the main memory data into the CPU cache, and then from the cache into its registers. When the CPU writes data back to main memory, it likewise flushes its registers into the CPU cache and then, at some point, flushes the cached data out to main memory.

Bridges between Java memory models and hardware architectures

As mentioned above, the Java memory model does not map directly onto the hardware memory architecture, which makes no distinction between stack and heap. From the hardware's point of view, most data, whether on the stack or on the heap, ends up in main memory, while some stack and heap data may be held in CPU caches and registers. As the figure below shows, the Java memory model and the computer hardware memory architecture overlap:

When objects and variables are stored in various memory areas of the computer, there are bound to be some problems, of which the two main problems are:

  1. Visibility of shared objects to threads

  2. Contention over shared objects

Visibility of shared objects

When multiple threads operate on the same shared object simultaneously, if the volatile and synchronized keywords are not used properly, updates made by one thread to the shared object may be invisible to other threads.

Imagine that our shared object is stored in main memory. A thread running on one CPU reads the main memory data into that CPU's cache and then modifies the shared object, but the modified object in the CPU cache has not yet been flushed back to main memory, so the change is not visible to threads running on other CPUs. The end result is that each thread effectively has its own copy of the shared object, with each copy sitting in a different CPU cache.

The following figure shows the process described above. The thread running on the left CPU copies the shared object obj from main memory into its CPU cache, changing the count variable of object obj to 2. But this change is not visible to the thread running on the right CPU because the change has not been flushed into main memory:

To solve the shared-object visibility problem, we can use Java's volatile keyword. volatile ensures that a variable is read directly from main memory, and that updates to the variable are written directly back to main memory. volatile is implemented using CPU memory-barrier instructions, as discussed later.

Competitive phenomenon

If multiple threads share an object and modify it at the same time, contention arises.

As shown in the figure below, thread A and thread B share an object, obj. Suppose thread A reads the obj.count variable from main memory into its CPU cache, and thread B does the same; now both threads increment obj.count. The obj.count + 1 operation has been executed twice, but in different CPU caches.

If the increments had run sequentially, obj.count would have been incremented twice from its original value, and the value in main memory would end up as 3. In the figure below, however, the two increments run in parallel: whether thread A or thread B flushes its result to main memory first, the other overwrites it, so obj.count in main memory is incremented only once, to 2, even though two increment operations took place.

To solve this problem, we can use Java synchronized blocks. A synchronized block guarantees that only one thread can enter the critical section at a time. It also guarantees that all variables accessed inside the block are read from main memory, and that all updates to them are flushed back to main memory when the thread exits the block, regardless of whether those variables are declared volatile.
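A minimal sketch of a synchronized block eliminating the lost-update problem described above (the class name is mine):

```java
// Hypothetical demo class: with the increment inside a synchronized block,
// only one thread at a time enters the critical section, and the variable
// is re-read from and flushed back to main memory at the block boundaries.
public class SyncCounterDemo {
    private static int count = 0;
    private static final Object lock = new Object();

    public static int run() throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                synchronized (lock) {
                    count++; // read-modify-write is now atomic with respect to the lock
                }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // 200000: no lost updates
    }
}
```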

The difference between volatile and synchronized

  1. volatile essentially tells the JVM that the value of the variable in working memory (registers) may be stale and must be re-read from main memory; synchronized locks the current variable so that only the holding thread can access it while other threads are blocked.

  2. volatile can only be used at the variable level; synchronized can be used at the variable, method, and class levels.

  3. volatile only provides visibility of changes to a variable, not atomicity; synchronized guarantees both visibility and atomicity.

  4. volatile does not block threads; synchronized can cause threads to block.

  5. volatile variables are not optimized by the compiler; variables accessed under synchronized can be optimized by the compiler.

The underlying principles that support the Java memory model

Instruction reordering

When executing a program, the compiler and the processor reorder instructions to improve performance. However, the JMM ensures consistent memory visibility for the upper layers across different compilers and processor platforms by inserting specific types of memory barriers, which prohibit particular kinds of compiler and processor reordering.

  1. Compiler optimization reordering: The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.

  2. Instruction level parallel reordering: If there is no data dependence, the processor can change the execution order of the corresponding machine instructions.

  3. Reordering of memory systems: Processors use caches and read/write buffers, which make loading and storing operations appear to be out of order.

Data dependency

If two operations access the same variable and one of them is a write operation, there is a data dependency between the two operations.

The compiler and processor do not change the execution order of two operations that have a data dependency relationship, that is, they are not reordered.

as-if-serial

No matter how reordered, the result of execution in a single thread cannot be changed. The compiler, runtime, and processor must obey the as-if-serial semantics.

Memory barriers

As mentioned above, memory barriers can prevent certain kinds of processor reordering, allowing programs to run as expected. A memory barrier (also called a memory fence) is a CPU instruction that essentially does two things:

  1. Ensure the order in which certain operations are performed.

  2. Affect the memory visibility of certain data, or the execution result of an instruction.

The compiler and CPU may reorder instructions in an attempt to optimize performance, as long as the end result stays the same. Inserting a memory barrier tells the compiler and CPU that no instruction may be reordered across that barrier.

Another thing memory barriers do is force the CPU caches to flush. A write barrier, for example, flushes out all data written to the cache before the barrier, so that a thread running on any other CPU can read the latest version of the data.

What does this have to do with Java? Volatile, as described in the Java Memory model above, is implemented based on Memory barriers.

If a variable is declared volatile, the JMM inserts a write-barrier instruction after a write to the field and a read-barrier instruction before a read of it. This means that writing a volatile variable guarantees:

  1. When a thread writes variable A, any thread that subsequently reads the variable sees its latest value.

  2. All writes performed before the write to variable A are also visible to other threads, because the memory barrier flushes all previous writes out of the cache.
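These two guarantees are the basis of the common "safe publication" idiom: write ordinary data first, then set a volatile flag. A hedged sketch (the class and field names are mine):

```java
// Hypothetical demo class: the ordinary write to `payload` is made before
// the volatile write to `ready`, so the write barrier publishes both.
public class SafePublicationDemo {
    private static int payload = 0;                // plain (non-volatile) field
    private static volatile boolean ready = false; // volatile flag
    private static int seen = -1;                  // what the reader observed

    public static int run() throws InterruptedException {
        Thread writer = new Thread(() -> {
            payload = 42; // (1) ordinary write
            ready = true; // (2) volatile write: the barrier flushes (1) as well
        });
        Thread reader = new Thread(() -> {
            while (!ready) { } // spin until the volatile read observes true
            seen = payload;    // guaranteed to see 42, not a stale 0
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // 42
    }
}
```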

happens-before

Starting with JDK 5, Java has used the new JSR-133 memory model, which addresses memory visibility between operations through the concept of happens-before.

In the JMM, if the result of an operation needs to be visible to another operation, there must be a happens-before relationship between the two operations, which can either be in the same thread or in two different threads.

The happens-before rules most relevant to programmers are as follows:

  1. Program order rule: every action in a thread happens-before any subsequent action in that same thread.

  2. Monitor lock rule: an unlock of a monitor happens-before every subsequent lock of that same monitor.

  3. Volatile field rule: a write to a volatile field happens-before every subsequent read of that field by any thread.

  4. Transitivity rule: if A happens-before B, and B happens-before C, then A happens-before C.

Note: a happens-before relationship between two operations does not mean the former must execute before the latter! It only requires that the execution result of the former operation be visible to the latter, and that the former be ordered before the latter.
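The monitor lock rule and transitivity can be seen in a small sketch (the class and field names are mine): writes made inside a synchronized block are visible to any thread that later acquires the same lock:

```java
// Hypothetical demo class: writes made before an unlock are visible to any
// thread that subsequently locks the same monitor (monitor lock rule), and
// the program order rule plus transitivity extend that to `data`.
public class MonitorRuleDemo {
    private static final Object lock = new Object();
    private static boolean flag = false;
    private static int data = 0;

    public static int run() throws InterruptedException {
        Thread writer = new Thread(() -> {
            synchronized (lock) {
                data = 7;    // program order: happens-before the unlock below
                flag = true;
            }                // unlock happens-before every later lock of `lock`
        });
        writer.start();

        int seen;
        while (true) {
            synchronized (lock) { // later lock of the same monitor
                if (flag) {
                    seen = data;  // transitivity: the write of 7 is visible here
                    break;
                }
            }
        }
        writer.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // 7
    }
}
```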
