Introduction to JVM

1. Common problems with the JVM

What is your understanding of the JVM? What is new in the Java 8 virtual machine?

What is OOM? What is a StackOverflowError? How do you analyze them?

What do you know about common JVM parameter tuning?

What do you know about classloaders in the JVM?

2. The role of the JVM

To run Java code, you need a JRE. The JRE includes the Java virtual machine and the Java core class library. Java programmers typically install the JDK, which already includes the JRE along with common development and diagnostic tools.

The Java virtual machine loads bytecode, that is, class files, which the JVM then interprets and executes.

The JVM runs on top of the operating system and has no direct interaction with the hardware.

3. JVM architecture

The Java virtual machine divides the runtime memory area into five parts: method area, heap, PC register, Java method stack, and native method stack.

Executing Java code begins by loading its compiled class file into the Java virtual machine using the class loader. The loaded Java classes are stored in the Method Area. At actual runtime, the virtual machine executes the code in the method area.

The method area and the heap are shared by all threads and are the main focus of garbage collection. The stack space is private to each thread and is not garbage collected.

The Java virtual machine subdivides the stack space into the Java method stack for Java methods, the native method stack for native methods (for example, methods written in C++), and a PC register (program counter) that records each thread's current execution position.

At run time, each time a call enters a Java method, the Java virtual machine creates a stack frame (an area of the stack) in the Java method stack of the current thread. The frame holds the local variables and the operands of the bytecode. The stack frame size is calculated in advance, and the Java virtual machine does not require stack frames to be contiguous in memory. When the currently executing method exits, either normally or through an exception, the Java virtual machine pops the current stack frame of the current thread and discards it.

Class loaders

1. Introduction

A class loader (ClassLoader) is responsible for loading class files. A class file carries a specific marker at the beginning of the file (the magic number 0xCAFEBABE). The ClassLoader is only responsible for loading the class file; whether it can actually run is determined by the Execution Engine.

2. Class loader classification

Class loaders provided by the virtual machine:

  • Bootstrap class loader: loads the most basic and important classes in the JRE, such as those in $JAVA_HOME/jre/lib/rt.jar and those specified by the VM parameter -Xbootclasspath. Because it is implemented in C++, there is no corresponding Java object, so Java code that asks for this loader gets null.
  • Extension class loader: implemented in Java; loads relatively minor but general-purpose classes, such as those in JAR files stored in the JRE's lib/ext directory ($JAVA_HOME/jre/lib/ext/*.jar) and those specified by the system property java.ext.dirs.
  • Application class loader (AppClassLoader): implemented in Java; loads classes on the application path, that is, the path specified by the VM parameter -cp/-classpath, the system property java.class.path, or the environment variable CLASSPATH. By default, the classes of an application, including user-defined classes, are loaded by the application class loader.
  • User-defined class loader: a subclass of java.lang.ClassLoader that lets users customize how classes are loaded. For example, class files can be stored encrypted and decrypted by a custom class loader at load time.
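The hierarchy above can be observed from Java code. The following is a minimal sketch; the class name LoaderChainDemo and the method loaderChain are made up for illustration. It assumes nothing beyond the standard library, but the loader names it prints depend on the JDK version: on JDK 9 and later the extension class loader has been replaced by a platform class loader, so the output will not match the Java 8 layout described here exactly.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Walks from a class's loader up through its parents; "bootstrap" stands for null. */
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader has no Java object, so it is only ever seen as null.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("String:          " + loaderChain(String.class));
        System.out.println("LoaderChainDemo: " + loaderChain(LoaderChainDemo.class));
    }
}
```

For a core class such as String the chain contains only the bootstrap entry, because getClassLoader() returns null for it.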

3. Parent delegation mechanism

Parent delegation model: whenever a class loader receives a load request, it first forwards the request to its parent class loader. The class loader tries to load the requested class itself only if the parent class loader cannot find it.

Advantages:

  • Avoid class reloading. There is no need for the child ClassLoader to load the class again when the parent has already loaded the class.
  • Prevents types defined in the Java core API from being maliciously replaced and tampered with by users, resulting in errors.
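The delegation order can be made visible with a small custom loader. This is a sketch under stated assumptions: RecordingLoader is a hypothetical class written for this example, and it relies on the default ClassLoader.loadClass behavior of asking the parent before calling findClass.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    /** Hypothetical loader: records the names that reach findClass, then gives up. */
    static class RecordingLoader extends ClassLoader {
        final List<String> askedToFind = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Only reached after the whole parent chain has failed to load the class.
            askedToFind.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            // Served by the parent chain, so findClass is never consulted.
            Class<?> stringClass = loader.loadClass("java.lang.String");
            System.out.println("same String class: " + (stringClass == String.class));
            // Unknown to every parent, so the request finally reaches findClass.
            loader.loadClass("no.such.Type");
        } catch (ClassNotFoundException e) {
            System.out.println("findClass was asked for: " + loader.askedToFind);
        }
    }
}
```

Because the request for java.lang.String is answered by a parent, a user-supplied class with the same name never gets a chance to replace the core one.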

The memory model of the JVM

1. Execution Engine

The execution engine interprets instructions and submits them to the operating system for execution. In HotSpot, translating bytecode into machine code takes two forms. The first is interpreted execution, which translates bytecode into machine code one instruction at a time and executes it. The second is just-in-time (JIT) compilation, which compiles all the bytecode contained in a method into machine code before executing it. The former has the advantage of not having to wait for compilation, while the latter runs faster once compiled. HotSpot defaults to mixed mode, which combines the benefits of both: it first interprets the bytecode, then compiles the hot code in it at run time, one method at a time.

2. Native Method Stack

Native methods are methods that call into the local operating system; the mechanism used to call them is the native method interface.

The native method stack serves methods that the operating system executes:

Native method -> pushed onto the native method stack -> the execution engine interprets the call -> the native method interface is invoked -> the native method library is used

The native method stack registers native methods, and the execution engine loads the native method libraries when it executes them.

3. PC register (program counter)

Each thread has its own program counter, which is private to the thread. It is a pointer to the method bytecode in the method area: it stores the address of the next instruction, that is, the instruction about to be executed, which the execution engine reads. It occupies a very small memory space that is almost negligible.

The PC register is mainly responsible for counting and scheduling. It can be thought of as a line number indicator for the bytecode being executed by the current thread. Because the Java virtual machine implements multithreading by switching threads and allocating processor time among them, at any given moment a processor executes instructions from only one thread. Therefore, to restore the correct execution position after a thread switch, each thread has an independent program counter; the counters of different threads are stored separately and do not affect each other. The program counter is the only memory area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.

4. The method area

The method area is shared by all threads. It holds all fields and method bytecode, including special methods such as constructors, and interface code is defined here as well. In short, the information for every defined method is stored in this area, which is a shared area.

Static variables, constants, class information (constructors/interface definitions), and the runtime constant pool live in the method area. Instance variables are stored in heap memory, independent of the method area.
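A small sketch of the distinction between per-class and per-instance state described above. The class name Counter and its fields are invented for this example; the comments restate the text's model rather than describing where a particular JVM version physically places the data.

```java
class Counter {
    // Static variable: a single copy that belongs to the class and is shared by all instances.
    static int created = 0;

    // Instance variable: every object on the heap carries its own copy.
    int id;

    Counter() {
        created++;
        id = created;
    }

    public static void main(String[] args) {
        Counter first = new Counter();
        Counter second = new Counter();
        System.out.println("first.id=" + first.id + ", second.id=" + second.id
                + ", Counter.created=" + Counter.created);
    }
}
```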

5. The stack
5.1 Introduction to the stack

The stack, also called stack memory, is in charge of running Java programs. It is created when a thread is created and lives exactly as long as that thread; when the thread ends, the stack memory is released. There is no garbage collection for the stack, and each stack is private to its thread.

Local variables of the eight primitive types, reference variables that point to objects, and the frames of instance method calls are allocated in stack memory. The eight primitive types are:

byte short int long float double char boolean

The stack area defines two error conditions:

StackOverflowError: thrown if a thread requests a stack depth greater than the virtual machine allows.

OutOfMemoryError: thrown if the virtual machine stack can be dynamically expanded but sufficient memory cannot be allocated during the expansion.

5.2 the stack frame

Each time a thread calls a method, an area is set aside on the stack to store the variables and other information the method needs. This area is called a stack frame. A stack consists of multiple stack frames.

5.3 Stack Operation Principle

All data in the stack is carried by stack frames. Frames are pushed when a method is called and popped when it returns, in last-in, first-out (LIFO) order.

The stack frame mainly stores three types of data:

  • Local Variables: input parameters, output parameters, and variables declared within the method.

  • Operand Stack: records the push and pop operations performed while instructions execute.

  • Frame Data: includes the class file, the method, and so on.
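To make the local variable table and the operand stack more concrete, the sketch below imitates in plain Java what a frame does for `static int add(int a, int b) { return a + b; }`. javac compiles that body to iload_0, iload_1, iadd, ireturn; the class FrameSketch and its explicit array and deque are only an illustration, not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FrameSketch {
    /** Mimics the frame of `static int add(int a, int b) { return a + b; }`. */
    static int add(int a, int b) {
        int[] locals = {a, b};                         // local variable table: slots 0 and 1
        Deque<Integer> operands = new ArrayDeque<>();  // operand stack

        operands.push(locals[0]);                      // iload_0
        operands.push(locals[1]);                      // iload_1

        int right = operands.pop();                    // iadd pops two operands...
        int left = operands.pop();
        operands.push(left + right);                   // ...and pushes their sum

        return operands.pop();                         // ireturn
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));
    }
}
```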

5.4 Stack method area interaction

6. The heap
6.1 Logical Design

The heap is the largest chunk of memory managed by the Java virtual machine. It is shared by all threads and created when the virtual machine starts. The size of the heap is adjustable: by default, the initial heap size is 1/64 of physical memory and the maximum heap size is 1/4 of physical memory.
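The heap sizes the running JVM actually chose can be inspected with java.lang.Runtime. This is a minimal sketch; the class HeapInfo and the helper toMiB are made up for the example, and maxMemory() only roughly corresponds to the configured maximum heap.

```java
class HeapInfo {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap the JVM will try to use: " + toMiB(rt.maxMemory()) + " MiB");
        System.out.println("heap currently reserved:          " + toMiB(rt.totalMemory()) + " MiB");
        System.out.println("free within the reserved heap:    " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Running it with and without explicit heap options is a quick way to compare the defaults with configured values.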

All object instances and arrays are allocated on the heap

OutOfMemoryError is thrown if there is no memory in the heap to complete the instance allocation and the heap can no longer be extended

The Java Heap is the primary area managed by the Garbage collector, and as such is also known as the “Garbage Collected Heap”

Heap memory is logically divided into three parts:

Young generation space (Young/New)

Tenured generation space (Old/Tenured)

Permanent space (Perm), replaced by Metaspace in JDK 1.8 and later

Garbage collection mechanism:

Minor GC: reclaims the Eden region and the Survivor From region

Full GC: global GC

When the Eden region runs out of memory, a lightweight garbage collection called Minor GC is triggered

Objects in Eden that survive the collection are moved to the Survivor To region

Objects in the Survivor From region that have survived fewer than 15 collections are moved to the To region; those that have survived 15 collections are promoted to the old generation

Whichever Survivor region is empty is the To region

Heavyweight garbage collection (Full GC): mainly collects the old generation

An OOM error occurs if the collection cannot free enough space in the old generation

If a new object is too large for Eden but fits in the old generation, it is placed directly in the old generation; if it fits in neither, an OOM error is thrown directly

6.2 Physical Design

In Java, the heap is physically divided into two distinct regions: the young generation (Young) and the old generation (Old).

The young generation is divided into three regions: Eden, From Survivor, and To Survivor

The From and To regions are two spaces of equal size that can swap roles. In most cases, objects are first allocated in Eden. After a young-generation collection, an object that is still alive moves to S0 or S1, and each young-generation collection it survives increases its age by 1.

6.3 The permanent area

The permanent area is a resident area of memory used to store metadata about the classes and interfaces carried by the JDK itself. That is, it stores the class information that the runtime environment needs. Data loaded into this area is not easily collected by the garbage collector.

Since JDK 1.8 there is no permanent generation; it has been replaced by Metaspace. Both are implementations of the method area: the permanent generation in version 1.7 and earlier, and Metaspace in 1.8 and later.

JVM parameters

Parameter             Meaning
-XX:+PrintGC          Prints a log line each time GC is triggered
-XX:+PrintGCDetails   Prints more detailed GC logs
-Xms                  Initial heap size (default: 1/64 of physical memory)
-Xmx                  Maximum heap size (default: 1/4 of physical memory)
-Xmn                  Size of the young generation
-XX:SurvivorRatio     Ratio of Eden space to one From/To space in the young generation (default: 8)
-XX:NewRatio          Ratio of the young generation to the old generation (default: 1:2)
-Xss                  Stack size of each thread (typically 1 MB by default). Do not set this value too large, otherwise it reduces the number of threads that can run concurrently.

By default, when less than 40% of the heap is free, the JVM grows the heap, up to the limit set by -Xmx.
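To check which of these parameters a running JVM was actually started with, the standard RuntimeMXBean exposes the launch arguments. The sketch below filters them for the memory-related options from the table; the class JvmFlags and the method memoryFlags are hypothetical names chosen for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class JvmFlags {
    /** Keeps only the heap, stack, and -XX options from a list of JVM arguments. */
    static List<String> memoryFlags(List<String> jvmArgs) {
        List<String> selected = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xms") || arg.startsWith("-Xmx") || arg.startsWith("-Xmn")
                    || arg.startsWith("-Xss") || arg.startsWith("-XX:")) {
                selected.add(arg);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("memory-related flags: " + memoryFlags(jvmArgs));
    }
}
```

Started as, for example, `java -Xms256m -Xmx512m -Xss1m JvmFlags`, it should list those three options; started without options, the list is usually empty.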

1. OutOfMemoryError

Cause: java.lang.OutOfMemoryError: Java heap space means that the heap has run out of memory.

Solution: adjust the heap memory size.

2. StackOverflowError

Cause: java.lang.StackOverflowError means that the stack has overflowed; it usually occurs in recursive calls.

Solution: increase the thread stack size, which bounds the maximum call depth. The default is 1 MB.

-Xss5m sets the thread stack size to 5 MB
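The link between stack size and call depth can be tried without restarting the JVM, because a Thread constructor accepts a requested stack size. The following is a sketch, not a benchmark: StackSizeDemo and depthWithStack are names invented here, and the requested size is only a hint that a platform may round or ignore.

```java
class StackSizeDemo {
    private static void recurse(int[] counter) {
        counter[0]++;
        recurse(counter); // no base case: every call pushes another frame
    }

    /** Recurses on a thread created with the requested stack size and returns the depth reached. */
    static int depthWithStack(long stackBytes) throws InterruptedException {
        int[] counter = {0};
        Thread worker = new Thread(null, () -> {
            try {
                recurse(counter);
            } catch (StackOverflowError e) {
                // Expected: this thread's stack is exhausted. The frames are already unwound here.
            }
        }, "deep-recursion", stackBytes);
        worker.start();
        worker.join(); // join() makes the counter written by the worker visible here
        return counter[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("512 KiB stack: depth " + depthWithStack(512L * 1024));
        System.out.println("  8 MiB stack: depth " + depthWithStack(8L * 1024 * 1024));
    }
}
```

On typical HotSpot builds the larger stack reaches a noticeably greater depth, which is the same effect that raising -Xss has on all threads.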

GC

1. GC definition

Garbage Collection, or GC, in the JVM periodically cleans unreachable objects from the heap.

2. The classification of GC

During GC, the JVM does not always collect the three memory regions above together; most of the time it collects only the young generation. GC is therefore classified by the region collected: Minor GC and global GC (Major GC or Full GC).

Minor GC: GC for the young generation only.

Major GC or Full GC: GC for the old generation, occasionally accompanied by GC for the young generation and the permanent generation.

Minor GC trigger: a Minor GC is triggered when the young generation is full, where "full" means that the Eden region is full. A full Survivor region does not trigger a GC.

Full GC trigger: when the old generation is full, a Full GC is triggered and collects both the young and the old generation.

3. Working characteristics of GC

In GC work, an algorithm is used to detect memory areas in the JVM and garbage collection is performed on unreachable objects detected.

In theory, the Young region is collected frequently, the Old region is collected rarely, and the Perm region (Metaspace/method area) is basically left untouched.

4. Mark unreachable objects
4.1 Reference counting method

Reference counting adds a reference counter to each object: the counter is incremented by 1 each time a reference to the object is created and decremented by 1 when a reference becomes invalid. An object whose counter is 0 is no longer usable and is considered garbage.

Disadvantage: it is difficult to handle objects that reference each other in a cycle, because their counters never drop to 0.

public class MyObject {
	public Object ref;
	public String name;

	public static void main(String[] args) {
		MyObject myObject1 = new MyObject();
		MyObject myObject2 = new MyObject();
		// Make the two objects reference each other.
		myObject1.ref = myObject2;
		myObject2.ref = myObject1;
		// Drop the only external references; the cycle between the objects remains.
		myObject1 = null;
		myObject2 = null;
	}
}

After myObject1 and myObject2 are set to null, a collector based purely on reference counting could not reclaim the two objects, because they still reference each other and their counters never drop to 0.
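The counter arithmetic behind this problem can be traced with a toy model. The sketch below keeps the counts by hand; Node, pointTo, and countsAfterDroppingLocals are hypothetical names, and this is not how HotSpot manages memory (it traces reachability instead of counting references).

```java
class RefCountDemo {
    /** Toy object with a manually maintained reference count. */
    static class Node {
        int refCount;
        Node ref;

        void pointTo(Node target) {
            if (ref != null) {
                ref.refCount--;
            }
            ref = target;
            if (target != null) {
                target.refCount++;
            }
        }
    }

    /** Replays the MyObject scenario and returns the two final counters. */
    static int[] countsAfterDroppingLocals() {
        Node first = new Node();
        Node second = new Node();
        first.refCount++;   // reference held by the local variable myObject1
        second.refCount++;  // reference held by the local variable myObject2

        first.pointTo(second);
        second.pointTo(first);

        first.refCount--;   // simulates myObject1 = null
        second.refCount--;  // simulates myObject2 = null
        return new int[] {first.refCount, second.refCount};
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 1"
    }
}
```

Both counters end at 1 rather than 0, so a pure reference-counting collector would keep the pair alive forever.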

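4.2 The GC Roots algorithm

Reachability analysis (GC Roots algorithm)

Starting from a set of GC roots, that is, references held outside the heap (for example, local variables on thread stacks and static fields) that point to objects in the heap, the collector follows references from object to object. Any object that cannot be reached this way is considered garbage.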
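Reachability analysis is essentially a graph traversal. The sketch below models objects as string ids and references as an adjacency map, both of which are simplifications invented for this example; a real collector walks actual object fields.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilityDemo {
    /** Returns every object id reachable from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: mark it and queue what it references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Collections.<String>emptyList());
        refs.put("C", Arrays.asList("D")); // C and D only reference each other
        refs.put("D", Arrays.asList("C"));

        Set<String> live = reachable(refs, Arrays.asList("A"));
        System.out.println("live: " + new TreeSet<>(live)); // prints "live: [A, B]"
    }
}
```

The cycle between C and D is not reachable from the root A, so both count as garbage, which is exactly the case reference counting cannot handle.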

5. Three ways to recycle garbage
5.1 Sweep

The first is sweep, which marks the memory occupied by dead objects as free and records it in a free list. When a new object needs to be created, the memory management module looks for free memory in the free list and allocates it to the new object.

The principle behind sweeping is extremely simple, but it has two drawbacks.

First, it causes memory fragmentation. Because each object in the Java virtual machine heap must occupy contiguous memory, there can be extreme cases where the total free memory is sufficient but no single block is large enough for the allocation.

Second, allocation is less efficient. With one contiguous block of free memory, allocation can be done by pointer addition (bumping a pointer). With a free list, the Java virtual machine has to visit the list entries one by one to find a free block that can hold the new object.
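Both drawbacks show up in even a tiny free-list allocator. The sketch below uses a first-fit search over made-up Block entries; FreeListDemo and its methods are illustrative names, not JVM internals.

```java
import java.util.ArrayList;
import java.util.List;

class FreeListDemo {
    /** One free block covering the addresses [start, start + length). */
    static final class Block {
        final int start;
        final int length;

        Block(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /** First-fit search: visits the entries one by one; returns the start address or -1. */
    static int allocate(List<Block> freeList, int size) {
        for (int i = 0; i < freeList.size(); i++) {
            Block block = freeList.get(i);
            if (block.length >= size) {
                freeList.remove(i);
                if (block.length > size) {
                    // Put the unused tail of the block back on the list.
                    freeList.add(i, new Block(block.start + size, block.length - size));
                }
                return block.start;
            }
        }
        return -1;
    }

    static int totalFree(List<Block> freeList) {
        int total = 0;
        for (Block block : freeList) {
            total += block.length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Block> freeList = new ArrayList<>();
        freeList.add(new Block(0, 2));
        freeList.add(new Block(4, 2));
        freeList.add(new Block(8, 2));
        System.out.println("total free: " + totalFree(freeList));          // 6 units in total
        System.out.println("allocate 4 -> " + allocate(freeList, 4));      // -1: no block is big enough
        System.out.println("allocate 2 -> " + allocate(freeList, 2));      // 0: the first block fits
    }
}
```

Six units are free in total, yet a request for four contiguous units fails, and every request has to scan the list.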

5.2 Compact

The second is compact, which moves the living objects together at the beginning of the memory area, leaving one contiguous block of free memory behind them. This approach solves the memory fragmentation problem at the cost of the performance overhead of the compaction algorithm.

After compaction, the addresses of the moved objects change and every reference to them must be updated, which is where the high performance overhead comes from.
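A sliding compaction can be sketched over an array in which null stands for a free slot. CompactDemo and the returned forwarding map are inventions for this example; the map shows the old-to-new address information a collector needs in order to update references.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class CompactDemo {
    /** Slides live entries to the front of the heap; returns old index -> new index. */
    static Map<Integer, Integer> compact(String[] heap) {
        Map<Integer, Integer> forwarding = new HashMap<>();
        int next = 0; // next free slot at the compacted front
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String obj = heap[i];
                heap[i] = null;
                heap[next] = obj;
                forwarding.put(i, next);
                next++;
            }
        }
        return forwarding;
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, null, "C"};
        Map<Integer, Integer> forwarding = compact(heap);
        System.out.println(Arrays.toString(heap)); // prints "[A, B, C, null, null, null]"
        System.out.println("C moved from 5 to " + forwarding.get(5));
    }
}
```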

5.3 Copy

The third is copy, which divides the memory area into two equal halves, maintains two pointers named from and to, and allocates memory only in the half the from pointer refers to. When garbage collection occurs, surviving objects are copied to the half the to pointer refers to, and the from and to pointers are then swapped. Copying also solves the memory fragmentation problem, but its disadvantage is just as obvious: it uses heap space very inefficiently, since half of it is always idle.
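The from/to mechanics can be sketched with two arrays and a bump pointer. SemiSpaceDemo is a made-up class, and the set of live objects is passed in directly instead of being computed by reachability analysis, to keep the example short.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemiSpaceDemo {
    private String[] from;
    private String[] to;
    private int top = 0; // bump pointer into the from half

    SemiSpaceDemo(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    /** Bump-pointer allocation; returns false when the from half is full. */
    boolean allocate(String obj) {
        if (top == from.length) {
            return false;
        }
        from[top++] = obj;
        return true;
    }

    /** Copies the live objects into the to half, then swaps the roles of the halves. */
    void collect(Set<String> live) {
        int copied = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) {
                to[copied++] = from[i];
            }
        }
        Arrays.fill(from, null); // everything left behind is garbage
        String[] previousFrom = from;
        from = to;
        to = previousFrom;
        top = copied;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        SemiSpaceDemo heap = new SemiSpaceDemo(4);
        for (String name : new String[] {"a", "b", "c", "d"}) {
            heap.allocate(name);
        }
        System.out.println("room for e before GC: " + heap.allocate("e"));
        heap.collect(new HashSet<>(Arrays.asList("b", "d")));
        System.out.println("used after GC: " + heap.used());
        System.out.println("room for e after GC:  " + heap.allocate("e"));
    }
}
```

Allocation is a single pointer increment and the survivors end up contiguous, but only half of the total space is ever available for new objects.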

5.4 Summary

There are three ways to reclaim the memory of dead objects: sweeping, which causes memory fragmentation; compaction, which has high performance overhead; and copying, which uses the heap inefficiently. Modern garbage collectors tend to combine these methods, taking the advantages of each while avoiding their disadvantages.

Garbage collection algorithm
6.1 Mark-copying algorithm

When a new instruction is executed, the Java virtual machine sets aside a block of memory in Eden to store the object. When Eden runs out of space, the Java virtual machine triggers a Minor GC to collect the young generation's garbage. The objects that survive are moved to a Survivor zone.

There are two Survivor zones in the young generation, referred to as from and to. The Survivor zone that to points to is always empty. When a Minor GC occurs, the surviving objects in Eden and in the Survivor zone that from points to are copied to the Survivor zone that to points to, and the from and to pointers are then swapped, so that at the next Minor GC the Survivor zone that to points to is empty again.

The Java virtual machine records how many times each object in the Survivor zones has been copied back and forth. If an object has been copied 15 times (the value of the VM parameter -XX:MaxTenuringThreshold), it is promoted to the old generation. In addition, if a single Survivor zone is already 50% occupied (the value of the VM parameter -XX:TargetSurvivorRatio), the objects with the higher copy counts are also promoted to the old generation. If there are so many live objects that the to zone cannot hold them all, space in the old generation is used instead.

Therefore, Minor GC uses a mark-copy algorithm: objects in the Survivor zone that are old enough are promoted to the old generation, and the remaining survivors, together with the survivors from Eden, are copied to the other Survivor zone. Ideally, almost all objects in Eden are already dead, so very little data needs to be copied and the mark-copy algorithm works extremely well.

After each copy, from and to must swap roles: whichever Survivor zone is empty is to.
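The aging and promotion rules can be traced with a deliberately simplified model. In the sketch below, MinorGcDemo and its fields are invented names, the live set is supplied by the caller, and the TargetSurvivorRatio rule and the overflow into the old generation are left out; the threshold is a constructor argument so the example can use a small value instead of 15.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MinorGcDemo {
    final int tenuringThreshold;              // plays the role of -XX:MaxTenuringThreshold
    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();
    List<String> to = new ArrayList<>();      // empty between collections
    final List<String> old = new ArrayList<>();
    private final Map<String, Integer> copies = new HashMap<>();

    MinorGcDemo(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** One simplified Minor GC: copy survivors, promote the old ones, swap from and to. */
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (String obj : candidates) {
            if (!live.contains(obj)) {
                copies.remove(obj);           // dead objects are simply not copied
                continue;
            }
            int survived = copies.merge(obj, 1, Integer::sum);
            if (survived >= tenuringThreshold) {
                old.add(obj);                 // promoted to the old generation
            } else {
                to.add(obj);                  // copied to the empty Survivor zone
            }
        }
        eden.clear();
        from.clear();
        List<String> previousFrom = from;     // swap roles: the emptied zone becomes to
        from = to;
        to = previousFrom;
    }

    public static void main(String[] args) {
        MinorGcDemo heap = new MinorGcDemo(3);
        heap.eden.add("x");
        Set<String> live = Collections.singleton("x");
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc(live);
            System.out.println("after GC " + gc + ": survivor=" + heap.from + " old=" + heap.old);
        }
    }
}
```

With a threshold of 3, the object x sits in a Survivor zone after the first two collections and is in the old generation after the third.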

6.2 Mark-sweep algorithm

The old generation is typically collected with mark-sweep, or with a mixture of mark-sweep and mark-compact.

The mark-sweep algorithm is generally applied to the old generation because objects there have long life cycles. The algorithm first marks all reachable objects and then walks through the heap, reclaiming the unmarked objects (marked objects are alive).

Disadvantages:

① During collection the application has to be suspended, that is, stop the world, which results in a very poor user experience. Without the pause, an object created after marking has finished could still be referenced and yet, being unmarked, would be reclaimed during the sweep, causing errors.

② Because all objects in the heap must be traversed, efficiency is low (recursion plus a full-heap traversal).

③ Memory fragmentation occurs

6.3 Mark–Compact algorithm

The mark-compact algorithm is very similar to the mark-sweep algorithm, but it additionally resolves the memory fragmentation that mark-sweep leaves behind.

Advantages: it resolves the memory fragmentation problem. It also avoids the copying algorithm's high cost of halving the usable memory.

Disadvantages: low efficiency; in the compaction phase, references must be updated because live objects are moved.

6.4 Mark-sweep-Compact algorithm

The mark-sweep-compact algorithm is a combination of the mark-sweep algorithm and the mark-compact algorithm. The principle is the same as mark-sweep, except that a compact operation is performed after multiple GCs.