This article will focus on the JVM, covering the JVM memory model, class loaders, GC collection algorithms, and GC collectors.

This article is aimed at developers with more than three years of experience rather than beginners. Discussion and feedback are welcome; if you find mistakes or gaps in the article, please point them out. Thank you in advance.

One. The relationship between the JDK, JRE, and JVM

The JDK, JRE, and JVM architecture diagram is shown below. It is easy to see the relationship between them.

(1) The JDK contains the JRE, which in turn contains the JVM

(2) The JDK is mainly used in development environments and the JRE in release environments. Running a release environment on the JDK also works; the JDK simply ships extra development tools that a release environment does not need. The relationship is loosely similar to that between a debug build and a release build

(3) In terms of file size, the JDK is larger than the JRE. As the figure shows, the JDK adds a layer of development tools on top of the JRE, such as the javac compiler

 

Two. Class loaders

The JVM class loader can be summarized as follows:

 

1. Why class loaders?

(1) To load bytecode files into the runtime data area. Java source code is compiled by the javac command into bytecode files (.class), which class loaders then load into the JVM.

(2) To determine the uniqueness of a class in the runtime data area. The same bytecode file loaded by different class loaders produces different classes, so a class's identity at runtime is determined jointly by the bytecode file (its fully qualified class name) and the class loader that loaded it
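The point about uniqueness can be checked with a small experiment. The sketch below is only an illustration: Payload and IsolatingLoader are hypothetical names, and it assumes the compiled .class file of Payload can be read back as a resource (true when the classes are run from a directory or a jar).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class LoaderIdentityDemo {

    /** Hypothetical class that gets loaded a second time by another loader. */
    static class Payload { }

    /** Hypothetical loader that defines classes itself instead of asking the application loader. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() {
            super(null); // parent is the bootstrap loader only, which cannot find Payload
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = "/" + name.replace('.', '/') + ".class";
            try (InputStream in = LoaderIdentityDemo.class.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Payload again through a fresh IsolatingLoader. */
    static Class<?> loadIsolated() {
        try {
            return new IsolatingLoader().loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Payload.class is not readable as a resource", e);
        }
    }

    public static void main(String[] args) {
        Class<?> viaApplicationLoader = Payload.class;
        Class<?> viaIsolatingLoader = loadIsolated();
        // Same bytecode and same name, but two distinct classes in the JVM.
        System.out.println("same name:  "
                + viaApplicationLoader.getName().equals(viaIsolatingLoader.getName()));
        System.out.println("same class: " + (viaApplicationLoader == viaIsolatingLoader));
    }
}
```

Because the two Class objects differ, an instance created from one cannot be cast to the other, even though both come from the same .class file.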

2. Class loader types

In terms of categories, class loaders are mainly divided into four categories

(1) Bootstrap ClassLoader (the root class loader): sits at the top of the class loader hierarchy and mainly loads the JRE core jar packages, such as /jre/lib/rt.jar

(2) Extension ClassLoader: sits at the second layer of the hierarchy and mainly loads the JRE extension jar packages, such as /jre/lib/ext/*.jar

(3) Application ClassLoader: sits at the third layer of the hierarchy and mainly loads the jar packages and classes on the classpath

(4) User-defined ClassLoader: a class loader written by the user that loads jar packages or classes from a path the user specifies
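The hierarchy can be inspected at runtime by walking getParent() from the application class loader. This is a minimal sketch (LoaderChainDemo is a hypothetical name). The bootstrap loader is represented by null in the Java API, and on JDK 9 and later the second layer shows up as a platform class loader rather than the extension class loader described here.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    /** Returns the implementation class names of the loader and all of its ancestors. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            chain.add(current.getClass().getName());
        }
        chain.add("bootstrap"); // the root loader has no Java object; the API reports it as null
        return chain;
    }

    public static void main(String[] args) {
        // The loader that loads classes from the classpath, followed by its parents.
        System.out.println(chainOf(ClassLoader.getSystemClassLoader()));
        // Core classes such as String come from the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
    }
}
```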

3. Class loading mechanism (parent delegation)

For bytecode loading, the class loading mechanism is parent delegation. What is parent delegation?

When a class loader receives a request to load a class, it does not load the class itself right away. It first passes the request to its immediate parent loader, which passes it on to its own parent, and so on up to the root (bootstrap) loader. If the root loader can load the class, it does. Otherwise the request falls back down one level at a time: each child loader tries in turn, and the user-defined class loader at the bottom tries last.
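The order of delegation can be made visible with two small loaders that record every call. This is an illustrative sketch, not JDK code: TracingLoader and its labels are hypothetical, and its findClass always fails so that the fallback path shows up in the trace.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationTraceDemo {

    /** Hypothetical loader that logs each step and never finds anything itself. */
    static class TracingLoader extends ClassLoader {
        private final String label;
        private final List<String> trace;

        TracingLoader(String label, ClassLoader parent, List<String> trace) {
            super(parent); // a null parent means "delegate to the bootstrap loader"
            this.label = label;
            this.trace = trace;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            trace.add(label + ".loadClass");
            return super.loadClass(name, resolve); // standard parent-first logic
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            trace.add(label + ".findClass");
            throw new ClassNotFoundException(name);
        }
    }

    /** Asks the child loader for a class and returns the recorded call order. */
    static List<String> trace(String className) {
        List<String> trace = new ArrayList<>();
        ClassLoader parent = new TracingLoader("parent", null, trace);
        ClassLoader child = new TracingLoader("child", parent, trace);
        try {
            child.loadClass(className);
        } catch (ClassNotFoundException e) {
            trace.add("ClassNotFoundException");
        }
        return trace;
    }

    public static void main(String[] args) {
        // Found by the bootstrap loader, so no findClass call is ever made.
        System.out.println(trace("java.lang.String"));
        // Found by nobody: the request goes up first, then each loader tries on the way down.
        System.out.println(trace("no.such.Type"));
    }
}
```

For the missing class, the parent's findClass runs before the child's, which is exactly the top-down fallback described above.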

4. How is class loading implemented in JDK 1.8?

The following is the JDK 1.8 ClassLoader.loadClass implementation, which delegates recursively to the parent loader

```java
protected Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException
{
    synchronized (getClassLoadingLock(name)) {
        // First, check if the class has already been loaded
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            long t0 = System.nanoTime();
            try {
                if (parent != null) {
                    c = parent.loadClass(name, false);
                } else {
                    c = findBootstrapClassOrNull(name);
                }
            } catch (ClassNotFoundException e) {
                // ClassNotFoundException thrown if class not found
                // from the non-null parent class loader
            }

            if (c == null) {
                // If still not found, then invoke findClass in order
                // to find the class.
                long t1 = System.nanoTime();
                c = findClass(name);

                // this is the defining class loader; record the stats
                sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                sun.misc.PerfCounter.getFindClasses().increment();
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}
```

5. Breaking the parent delegation model

In some cases a parent class loader cannot load the classes it needs because of the limits on its loading scope, so the loading has to be handed down to a child class loader.

For example, the database Driver interface is defined in the JDK, but its implementations are provided by the individual database vendors. The Bootstrap ClassLoader, which loads the JDK's own classes, can only load files under jre/lib, so it cannot see the vendors' implementation classes on the classpath. To keep driver management unified, the JDK code instead asks a child loader (in practice the thread context class loader, which is normally the Application ClassLoader) to load each vendor's Driver implementation class. The request therefore flows from parent to child, which breaks the parent delegation model.
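A rough way to see this from application code is to look up Driver implementations through ServiceLoader with the thread context class loader. This is a sketch of the idea rather than the JDK's internal code, and DriverLookupDemo is a hypothetical name. The result list is empty unless a JDBC driver jar is on the classpath.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class DriverLookupDemo {

    /** Lists the Driver implementations visible to the thread context class loader. */
    static List<String> discoverDrivers() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        List<String> found = new ArrayList<>();
        for (Driver driver : ServiceLoader.load(Driver.class, contextLoader)) {
            found.add(driver.getClass().getName()
                    + " loaded by " + driver.getClass().getClassLoader());
        }
        return found;
    }

    public static void main(String[] args) {
        // The interface is loaded by a JDK loader (null means the bootstrap loader).
        System.out.println("Driver interface loader: " + Driver.class.getClassLoader());
        // The implementations, if any, are loaded by the loader that can see the classpath.
        System.out.println("Implementations: " + discoverDrivers());
    }
}
```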

Three. Class life cycle

The life cycle of a Java class in the JVM is roughly divided into five phases:

1. Loading phase: obtain the binary stream of the bytecode, convert its static storage structure into the runtime data structure of the method area, and generate the corresponding java.lang.Class object in memory as the access entry to the class's data.

2. Linking phase: this phase consists of three smaller phases, namely verification, preparation, and resolution

(1) Verification: ensure that the bytecode file meets the requirements of the VM specification. This includes file format verification, metadata verification, bytecode verification, and symbolic reference verification

(2) Preparation: allocate memory for static variables and set them to the JVM default values (0, false, null, and so on). Non-static variables are not allocated at this stage; they are allocated together with the object when it is instantiated.

(3) Resolution: convert the symbolic references in the constant pool into direct references

3. Initialization phase: perform the initialization a class needs before it is used

The following is quoted from another blogger's explanation, which I think puts it well.

In Java code, if we want to initialize a static field, we can assign it directly at declaration time, or we can assign it in a static code block.

Except for constants declared static final, the direct assignments and the code in all static code blocks are placed by the Java compiler into a single method named <clinit>. Class initialization is the process of assigning values to the fields marked as constant values and executing the <clinit> method. The Java virtual machine uses locking to ensure that a class's <clinit> method is executed only once.

Under what conditions does class initialization occur?

(1) When the VM starts, it initializes the main class specified by the user (the class that contains the main method);

(2) When a new instruction that creates an instance is encountered, the target class of the new instruction is initialized;

(3) When an instruction that calls a static method is encountered, the class that declares the static method is initialized;

(4) Initializing a subclass triggers initialization of its parent class;

(5) If an interface defines a default method, initializing a class that directly or indirectly implements the interface triggers initialization of the interface;

(6) When a class is invoked reflectively through the reflection API, that class is initialized;

(7) When a MethodHandle instance is invoked for the first time, the class that declares the method it points to is initialized.
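Several of these rules can be observed with static code blocks that write to a log. The sketch below uses hypothetical classes (Parent, Child) and shows three things: reading a static final constant does not initialize the class, the first static method call does, and the parent class is initialized before the subclass.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    // Computed once, when InitOrderDemo itself is initialized.
    static final List<String> EVENTS = run();

    static class Parent {
        static { LOG.add("Parent.<clinit>"); }
    }

    static class Child extends Parent {
        static final int CONSTANT = 42; // compile-time constant, inlined by javac
        static { LOG.add("Child.<clinit>"); }
        static void touch() { }
    }

    private static List<String> run() {
        int value = Child.CONSTANT;      // no initialization: the constant was inlined
        LOG.add("read CONSTANT=" + value);
        Child.touch();                   // first static call: Parent first, then Child
        LOG.add("called touch()");
        new Child();                     // already initialized, <clinit> does not run again
        LOG.add("created instance");
        return Collections.unmodifiableList(new ArrayList<>(LOG));
    }

    public static void main(String[] args) {
        EVENTS.forEach(System.out::println);
    }
}
```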

4. Usage phase: the class and its objects are used in the JVM

5. Unloading phase: the class is unloaded from the JVM. A class can be unloaded only when all of the following conditions hold:

(1) The class loader that loaded the class has been reclaimed

(2) All instances of the class have been reclaimed

(3) The java.lang.Class object for the class is no longer referenced anywhere

 

Four. JVM memory model

1. What is the JVM memory model?

The following is a diagram of the JVM memory model architecture. I covered it in previous articles, so I will not go through it again here and will focus on the heap.

 

Prior to JDK 1.8, the heap area was divided into the new generation, the old generation, and the permanent generation. In JDK 1.8 the permanent generation was removed and replaced by Metaspace. This article discusses JDK 1.8.

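In JDK 1.8, the memory discussed here is logically divided into three parts:

(1) New generation: the Eden area, the S0 area (also called the From area), and the S1 area (also called the To area)

(2) Old generation

(3) Metaspace, which replaces the permanent generation and is allocated from native memory rather than from the Java heap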
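The areas a running JVM actually exposes can be listed through the standard java.lang.management API. This is a minimal sketch (MemoryPoolsDemo is a hypothetical name). The pool names depend on the JVM and on the collector in use, so no particular output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {

    /** Returns one line per memory pool: its type (heap or non-heap) and its name. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // On HotSpot, Eden, the survivor space, and the old generation are reported as heap pools,
        // while Metaspace is reported as a non-heap pool.
        describePools().forEach(System.out::println);
    }
}
```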

2. How is memory divided between the new generation and the old generation?

According to the official defaults (-XX:NewRatio=2 and -XX:SurvivorRatio=8), the new generation takes one third of the heap (Eden:S0:S1 = 8:1:1) and the old generation takes two thirds, so the memory allocation diagram is as follows:
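As a worked example of these ratios, the sketch below computes the size of each area for a given heap size. HeapLayoutDemo and its layout method are hypothetical helpers for the arithmetic only. A real JVM aligns the sizes and may resize the areas adaptively, so actual numbers will differ somewhat.

```java
import java.util.Arrays;

class HeapLayoutDemo {

    /**
     * Returns {young, old, eden, s0, s1} in MB.
     * newRatio is old:young (2 means the old generation is twice the young generation);
     * survivorRatio is eden:survivor (8 means Eden:S0:S1 = 8:1:1).
     */
    static long[] layout(long heapMb, int newRatio, int survivorRatio) {
        long young = heapMb / (newRatio + 1);
        long old = heapMb - young;
        long survivor = young / (survivorRatio + 2);
        long eden = young - 2 * survivor;
        return new long[] {young, old, eden, survivor, survivor};
    }

    public static void main(String[] args) {
        // A 3000 MB heap with the ratios 1:2 and 8:1:1.
        System.out.println(Arrays.toString(layout(3000, 2, 8)));
    }
}
```

For a 3000 MB heap this gives 1000 MB for the new generation (800 MB Eden plus two 100 MB survivor areas) and 2000 MB for the old generation.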

 

3. How does GC collection work?

New objects are allocated in Eden first. When Eden fills up, a Minor GC is triggered and does two things: it reclaims the objects that are no longer referenced, and it copies the surviving objects from Eden and from the survivor area in use (From) into the empty survivor area (To). Eden and From are then emptied, and S0 and S1 swap names, that is, S0 -> S1 and S1 -> S0. When Eden fills up again, the same steps are repeated, and the cycle continues. Objects that survive enough Minor GCs (the limit is set by -XX:MaxTenuringThreshold), or that do not fit into the survivor area, are moved to the old generation. If a Minor GC needs to promote more objects than the old generation has room for, the old generation triggers a Major GC. Note that a Major GC is almost always accompanied by a Full GC. Full GC is very expensive, so keep it in mind when tuning the JVM.
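The Minor GC cycle can be modeled with a few lists. This is a toy simulation, not how HotSpot is implemented: ToyYoungGen, Obj, the live flag, and the promotion rule (promote once age reaches the threshold) are all simplifying assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ToyYoungGen {

    static class Obj {
        final String name;
        boolean live = true; // stands in for "still reachable"
        int age = 0;         // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    ToyYoungGen(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        eden.add(obj);
        return obj;
    }

    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();                 // dead objects are reclaimed simply by being dropped
        from.clear();
        List<Obj> swap = from;        // S0 and S1 exchange roles
        from = to;
        to = swap;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj obj : space) {
            if (!obj.live) continue;
            obj.age++;
            if (obj.age >= tenuringThreshold) old.add(obj); else to.add(obj);
        }
    }

    static List<String> names(List<Obj> space) {
        List<String> names = new ArrayList<>();
        for (Obj obj : space) names.add(obj.name);
        return names;
    }

    public static void main(String[] args) {
        ToyYoungGen gen = new ToyYoungGen(2);
        gen.allocate("a");
        gen.allocate("b").live = false;
        gen.minorGc();                // a survives into the survivor area, b is reclaimed
        gen.allocate("c");
        gen.minorGc();                // a reaches the threshold and moves to the old generation
        System.out.println("from=" + names(gen.from) + " old=" + names(gen.old));
    }
}
```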

Below is a GC image that I captured in a production environment with the monitoring tool VisualVM

 

4. What are the garbage collection algorithms?

(1) Mark-sweep algorithm

The algorithm has two phases, marking and sweeping. First, all objects to be reclaimed are marked; then the marked objects are reclaimed. The algorithm is inefficient and prone to memory fragmentation.

A. Low efficiency: memory has to be traversed twice, once to mark and once to reclaim the marked objects

B. Fragmentation: reclaiming leaves discontinuous free fragments, so a large object may fail to find a contiguous block, which can trigger a Full GC
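The two phases can be sketched on a small object graph. Note that this toy version uses the common formulation in which the reachable objects are marked and everything unmarked is swept, which is the mirror image of marking the objects to be reclaimed. MarkSweepDemo and Node are hypothetical names, and the heap is just a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class MarkSweepDemo {

    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Phase 1: mark everything reachable from the given node. */
    static void mark(Node node) {
        if (node.marked) return;
        node.marked = true;
        for (Node ref : node.refs) mark(ref);
    }

    /** Phase 2: walk the whole heap, free unmarked nodes, and reset the marks of the rest. */
    static List<String> collect(List<Node> heap, List<Node> roots) {
        for (Node root : roots) mark(root);
        List<String> freed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node node = it.next();
            if (node.marked) {
                node.marked = false;
            } else {
                freed.add(node.name);
                it.remove();
            }
        }
        return freed;
    }

    /** Builds a -> b (reachable) and c <-> d (an unreachable cycle), then collects. */
    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(heap, Collections.singletonList(a));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo());
    }
}
```

The heap is traversed once through references while marking and once linearly while sweeping, which is the double traversal mentioned in point A.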

The following figure shows the comparison diagram of mark-clear algorithm before and after recycling

 

(2) Mark-copy algorithm

This algorithm addresses the low efficiency and the memory fragmentation of the mark-sweep algorithm. It divides memory into two blocks of equal size and uses only one of them at a time. When the block in use needs to be collected, the objects that are still alive are copied to the other block, the used block is then cleared in one pass, and the two blocks alternate in this way.

The following figure shows a brief diagram of mark-copy algorithm before and after collection

 

However, most objects in the young generation live for a very short time, and about 98% of them are reclaimed quickly, so very few objects survive. Memory therefore does not need to be divided 1:1; it is divided 8:1:1 instead.

The roughly 2% of objects that survive are placed in a survivor area (S0).

Eden: S0: S1 =8:1:1 is shown below

 

(3) Mark-compact algorithm

The algorithm has two phases, marking and compacting. The marking phase is the same as in the mark-sweep algorithm, but instead of reclaiming the dead objects in place, the next step moves all live objects toward one end and then clears the memory beyond the end boundary in one pass. Because objects in the old generation tend to live for a long time, this algorithm suits the old generation.
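The compacting step can be sketched on an array that stands in for the heap. This is a simplified model under stated assumptions: CompactDemo is a hypothetical name, every object occupies exactly one cell, and the live array is taken as the result of a marking phase that has already run.

```java
import java.util.Arrays;

class CompactDemo {

    /**
     * Slides the live cells toward index 0 and clears everything after them.
     * Returns the end boundary, that is, the index of the first free cell.
     */
    static int compact(String[] heap, boolean[] live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[boundary++] = heap[i]; // boundary <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, boundary, heap.length, null); // clear beyond the boundary in one pass
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

After compaction the free space is one contiguous block, which is why this approach avoids the fragmentation left behind by mark-sweep.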

The following is a schematic diagram of the mark-compact algorithm before and after collection

 

(4) Generational collection algorithm

This is the approach the JVM currently uses. It adopts the generational idea: the heap is split by object age, and each generation is collected with the algorithm that suits it. The model is as follows:

 

5. What are the common GC collectors?

(1) SerialGC

SerialGC, also known as the serial collector, is the most basic GC collector and is mainly suited to single-core CPUs. The new generation uses the copy algorithm, and the old generation uses the mark-compact algorithm.

It collects on a single thread and causes stop-the-world (STW) pauses while it runs. The JVM flag is -XX:+UseSerialGC.

(2) ParallelGC

ParallelGC builds on SerialGC and addresses its single-threaded collection by running the collection on multiple threads. The relevant flags are:

A. -XX:+UseParNewGC: parallel collection in the new generation (copy algorithm), serial collection in the old generation (mark-compact)

B. -XX:+UseParallelOldGC: parallel collection in the old generation as well

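(3) CMS GC

CMS GC is an old-generation collector that uses the mark-sweep algorithm. It does most of its marking and sweeping concurrently with the application threads, which shortens STW pauses but does not remove them entirely.

The JVM flag is -XX:+UseConcMarkSweepGC, which selects the CMS collector for the old generation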

(4) G1 GC (Garbage First)

Garbage First (G1) is a garbage collector for server-side JVMs that aims for high throughput with short pauses. It suits servers with multi-core CPUs and large memory, and it is the default garbage collector as of JDK 9.
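To check which collectors a JVM is actually running with, the standard management API can be queried. This is a minimal sketch (ActiveCollectorsDemo is a hypothetical name). The bean names vary by collector and JDK version, so the output is not fixed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectorsDemo {

    /** Returns the names of the collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since the JVM started.
            System.out.println(bean.getName()
                    + " count=" + bean.getCollectionCount()
                    + " timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC is a quick way to confirm that a flag took effect, since each collector registers differently named beans for the young and old generations.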

Five. Summary

This article analyzed the JVM memory model, focusing on the relationship between the JDK, JRE, and JVM, the JVM class loaders, the partitioning of JVM heap memory, and the GC collection algorithms and GC collectors. The treatment is largely theoretical. Because of limited space, this article does not cover how these techniques apply to real JVM tuning; that will be shared in the next article.