It contains 1,890 words and takes about six minutes to read.

Last article we talked about some implementation details and classical algorithms of garbage marking, but this article will systematically explain the classical algorithm of garbage collection, and the Hotspot VIRTUAL machine to perform garbage collection some implementation details, such as safe points and safe zones.

Because each platform of virtual machine memory operation method is different, and involves a large number of program implementation details, so this paper will not discuss the specific implementation of the algorithm too much, will only introduce several algorithm ideas and development process.

Garbage collection algorithm

1. Mark-clear algorithm

The mark-clear algorithm is the most basic algorithm. As its name suggests, the algorithm is divided into two stages: “mark” and “clear”. The first stage is to mark the objects to be recycled, and the marked objects are collected uniformly after the marking is completed.

Advantages: Simple implementation.

Disadvantages: generate discontinuous memory fragmentation; Both “mark” and “clear” are not executed efficiently.

Execution process diagram of mark-clear algorithm:

(The image is from Understanding the Java Virtual Machine.)

2. Replication algorithm

The copy algorithm divides the memory into two pieces of the same size. When this piece is used up, the current living objects are copied to the other piece, and the current block is emptied at once.

Advantages: High execution efficiency.

Disadvantages: Low space utilization, because the replication algorithm only uses half of the memory at a time.

3. Mark-tidy algorithm

Also known as the mark-compression algorithm, the mark-collation algorithm uses the same object “marks” as the mark-clearing algorithm, but does not subsequently clean up the recyclable objects. Instead, it moves the surviving objects to one end of the free space and then cleans up the memory space beyond the boundary.

Advantages: It solves the memory fragmentation problem and has higher space utilization than the replication algorithm.

Disadvantages: Low relative efficiency because of local object movement.

Execution process diagram of mark-finishing algorithm:

4. Generational collection algorithm

Currently, all commercial VMS use the generation collection algorithm, which divides the memory into several blocks according to the object life cycle. In Java, the memory is divided into the new generation and the old generation. We divide the objects with low survival rate into the new generation and use the replication algorithm to improve the performance of garbage collection. In the old generation, we store the objects with high survival rate and use the mark-clean and mark-tidy algorithms to improve the utilization rate of memory space.

The specific introduction and parameter configuration of the new generation and old generation will be explained in detail in subsequent articles.

Details of garbage collection execution

This section covers the details of how the HotSpot VIRTUAL machine performs garbage collection in order to give the reader a better understanding of the Java Virtual machine.

HotSpot VIRTUAL machine: It is the Sun JDK and OpenJDK custom virtual machine, is the most widely used virtual machine.

Garbage collection process: Before Java virtual machine memory collection, in order to ensure The consistency of memory, it must first suspend The program execution, namely The legendary Stop The World (STW), enumerate GC Roots using The reachable analysis algorithm, mark The dead objects, and then garbage collection.

Garbage collection problem: If you want to stop the program, it must be short and manageable, otherwise the disaster will be devastating.

Solution: HotSpot is clearly designed with this in mind, so JIT compilation uses OopMap data structures to record references to stacks and registers so that the virtual machine knows directly where the object references are stored, as shown below. Compile some of the native code for the string.hashCode () method for me:

As you can see, pointer references to ordinary objects are stored using the OopMap data structure.

Check assembly method, the start command line: Java – XX: XX: + UnlockDiagnosticVMOptions – + PrintAssembly YouTestClass

Could not load hsdis-amd64. DLL; library not loadable; PrintAssembly is disabled

Error solution: Use the compiled HSdis. DLL to: JRE installation directory \bin\server directory, hsdis. DLL address: pan.baidu.com/s/1-D6u0gnU…

Safepoint

With the help of OopMap, HotSpot can do GC Roots enumeration quickly, but there are a lot of instructions that cause the OopMap content to change, and generating an OopMap for each object would cause a lot of extra space, which would make GC expensive. So HotSpot only generates oOPMaps at “specific locations” that are “safe spots”.

HotSpot doesn’t always pause for GC, but only after the program has reached a safe point, so don’t set too few safe points, make GC wait too long, or add too much run-time cost.

Two safe ways to interrupt threads

Interrupts: No thread execution code is required to actively cooperate. When GC occurs, all threads are forced to interrupt, and if any thread is found not in the safe point, the program is resumed until it reaches the safe point.

Active interrupts: Do not force the thread to interrupt, but simply set an interrupt flag. Each thread polls for this flag during execution, and if the flag is changed (the interrupt flag appears), it will interrupt and suspend itself after running to a safe point. Currently, all commercial VMS use active interrupt mode.

Safety area (Saferegion)

The safe point mechanism simply ensures that it doesn’t take too long for a program to get to a safe point for GC, but in special cases such as thread sleep, thread blocking, etc., HotSpot cannot wait for a blocked or hibernated thread to wake up and execute properly. This is where the concept of a safe zone is introduced.

Saferegion: A safe zone is a zone in which it is safe to start GC anywhere without changing object references, etc. When a thread runs, it marks itself as being in a safe zone, and then if a thread is blocked, hibernated, etc., then HotSpot initiates a GC ignoring those threads that are in a safe zone. When the thread is awakened again, it first checks to see if the GC Roots enumeration (or the GC process) is complete, and if so continues, otherwise it will wait until it receives a signal from the Safe Region that it is Safe to leave.

reference

In-depth Understanding of the Java Virtual Machine

Algorithms and Implementation of Garbage Recycling

The last

Pay attention to the public account, send the keyword “GC”, receive “Garbage Recycling algorithm and Implementation” learning materials.