If you repost this article, please retain the original link. Some pictures come from Baidu; if there is any infringement, please contact me to have them removed.

Contents

  • What is GC
  • A brief introduction to the JVM memory structure
  • Reachability analysis and GC Roots
  • Common garbage collection algorithms

1. What is GC

Garbage collection (GC), in computing, means that dynamically allocated memory is released when it is no longer needed so that it can be reused for other purposes. This form of memory management is called garbage collection.

Some languages, such as C and C++, have no garbage collection mechanism. To free the memory occupied by variables you no longer need, you have to release it yourself.

Other languages, such as Java and C#, support garbage collection. When the Java virtual machine (JVM) or the .NET CLR finds that memory is tight, it automatically reclaims the memory occupied by useless objects (objects that are no longer referenced).

Today we are going to focus on GC in Java.


The JVM automatically cleans up unwanted objects, which raises three questions:

  • Which memory area does the JVM clean up?
  • Which objects are cleaned up, and why is A cleaned up and not B?
  • How does the JVM clean them up?

These three questions will be answered in the following three sections respectively.

2. A brief introduction to the JVM memory structure

As we all know, Java code runs on a virtual machine. While the virtual machine executes a Java program, it divides the memory it manages into several data areas, each with its own purpose. The Java Virtual Machine Specification (Java SE 8) describes the runtime data areas as follows:

The above is what the Java Virtual Machine Specification defines. Virtual machine implementations may differ in detail, but they generally follow the specification.

  • Method area: stores class information, constants, and static variables of classes that the virtual machine has loaded.
  • Heap: the largest chunk of memory managed by the Java virtual machine. Its sole purpose is to hold object instances. The virtual machine stack stores only references, and those references point to objects in the heap. The heap is the main area that GC works on.
  • Virtual machine stack: describes the memory model of Java method execution. Each time a method is invoked, a stack frame is created to store the local variable table, the operand stack, dynamic linking information, the method return address, and so on.
  • Native method stack: similar to the virtual machine stack, but it serves native methods.
  • Program counter: records how far the method executed by the current thread has progressed. If the thread is executing a Java method, the counter holds the address of the bytecode instruction being executed. If a native method is being executed, the counter value is undefined.
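
To make the "one stack frame per method invocation" idea concrete, here is a minimal sketch. The class and method names are made up for illustration. It recurses until the thread's virtual machine stack is exhausted, at which point the JVM throws StackOverflowError. The depth it reaches is not a fixed number; it depends on the JVM, the frame size, and settings such as -Xss.

```java
// Hypothetical demo class; the names are illustrative only.
final class StackDepthDemo {
    private static int depth;

    // Each call pushes one more stack frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's stack is exhausted and returns the depth reached.
    static int depthUntilOverflow() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The JVM could not create another frame; the stack unwinds to this point.
        }
        return depth;
    }

    public static void main(String[] args) {
        // The exact number varies between runs and JVM configurations.
        System.out.println("Frames before overflow: " + depthUntilOverflow());
    }
}
```

The frames live on the virtual machine stack, not on the heap, which is why this failure is a StackOverflowError and not an OutOfMemoryError for heap space.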

3. Reachability analysis and GC Roots

3.1 Reachability analysis

In Java, reachability analysis is used to determine whether an object is “garbage.”

The basic idea of this method is to start from a set of objects called "GC Roots" and search along the references they hold. If no path of references leads from any GC Root to an object, the object is said to be unreachable. Note, however, that an unreachable object is not immediately reclaimable. It must go through at least two marking passes, and it becomes truly reclaimable only if it has not escaped between them (for example, by becoming reachable again from its finalize() method).

Note that, in essence, the collector finds all the live objects and treats the rest of the space as "useless"; it does not find all the dead objects and reclaim the space they occupy.

As shown in the figure below, when no chain of references leads from the GC Roots to object5, object6, and object7, they cannot be reached from the GC Roots and are judged to be unreachable.
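
The search described above can be sketched as a plain graph traversal. This is a toy model, not how a real JVM represents objects: object names are strings, references are entries in a map, and the particular reference graph (which object points to which) is an assumption made up for this example. It requires Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of reachability analysis.
final class ReachabilityDemo {

    // Returns every object that can be reached from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            // add() returns false if the object was already marked, so cycles terminate.
            if (marked.add(obj)) {
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // Assumed graph: object5 still references object6 and object7,
        // but nothing reachable from the root references object5.
        Map<String, List<String>> refs = Map.of(
            "gcRoot", List.of("object1"),
            "object1", List.of("object2", "object3"),
            "object3", List.of("object4"),
            "object5", List.of("object6", "object7"));

        Set<String> live = reachable(refs, List.of("gcRoot"));
        System.out.println(live.contains("object4")); // true
        System.out.println(live.contains("object5")); // false
    }
}
```

Note that object5 holding references to object6 and object7 does not keep any of them alive. Only a path that starts at a GC Root counts.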

3.2 GC Roots

Opinions differ on which objects can serve as GC Roots; some of them are not authoritative and some are incomplete. Eventually I found an official Eclipse document, "Garbage Collection Roots", which says:

A garbage collection root is an object that is accessible from outside the heap. The following reasons make an object a GC root:

1. System Class
    Class loaded by the bootstrap/system class loader. For example, everything from the rt.jar like java.util.*.

2. JNI Local
    Local variable in native code, such as user defined JNI code or JVM internal code.

3. JNI Global
    Global variable in native code, such as user defined JNI code or JVM internal code.

4. Thread Block
    Object referred to from a currently active thread block.

5. Thread
    A started, but not stopped, thread.

6. Busy Monitor
    Everything that has called wait() or notify() or that is synchronized. For example, by calling synchronized(Object) or by entering a synchronized method. Static method means class, non-static method means object.

7. Java Local
    Local variable. For example, input parameters or locally created objects of methods that are still in the stack of a thread.

8. Native Stack
    In or out parameters in native code, such as user defined JNI code or JVM internal code.
    This is often the case as many methods have native parts and the objects handled as method parameters become GC roots.
    For example, parameters used for file/network I/O methods or reflection.

9. Finalizable
    An object which is in a queue awaiting its finalizer to be run.

10. Unfinalized
    An object which has a finalize method, but has not been finalized and is not yet on the finalizer queue.

11. Unreachable
    An object which is unreachable from any other root,
    but has been marked as a root by MAT to retain objects which otherwise would not be included in the analysis.

12. Java Stack Frame
    A Java stack frame, holding local variables.
    Only generated when the dump is parsed with the preference set to treat Java stack frames as objects.

13. Unknown
    An object of unknown root type. Some dumps, such as IBM Portable Heap Dump files, do not have root information.
    For these dumps the MAT parser marks objects
    which have no inbound references or are unreachable from any other root as roots of this type.
    This ensures that MAT retains all the objects in the dump.

A brief summary is as follows:

  • Classes loaded by the system class loader
  • Live threads, including threads that are waiting or blocked
  • Parameters and local variables of methods that are currently being invoked (Java methods and native methods)
  • Objects referenced by static variables and constants in the method area
  • Held by JVM: objects that the JVM keeps away from the GC for special purposes. Which objects these are depends on the JVM implementation. Known examples include system class loaders, some important exception classes that the JVM knows about, some pre-allocated objects for handling exceptions, and some custom class loaders.

4. Common garbage collection algorithms

4.1 Mark-sweep algorithm

This is the most basic garbage collection algorithm: it is the simplest in concept and the easiest to implement. The mark-sweep algorithm has two phases, a mark phase and a sweep phase. The mark phase marks all the objects that need to be reclaimed, and the sweep phase reclaims the space occupied by the marked objects. The process is shown in the figure below:

  • As the figure shows, the mark-sweep algorithm is easy to implement.
  • However, it has a serious problem: it easily produces memory fragmentation. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous space and trigger another garbage collection early.
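
The fragmentation problem can be seen in a small sketch. This is a toy model with assumed names: the heap is an array of slots, and the marks flag the live objects (as in section 3.1, where the collector finds the survivors and treats the rest as garbage). The sweep frees every unmarked slot but leaves the survivors where they are.

```java
import java.util.Arrays;

// Hypothetical toy model of mark-sweep; a real collector works on raw memory, not arrays.
final class MarkSweepDemo {

    // Sweep phase: frees every slot that was not marked live. Survivors do not move.
    static String[] sweep(String[] heap, boolean[] live) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!live[i]) {
                result[i] = null;
            }
        }
        return result;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeBlock(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] live = {true, false, true, false, true, false};

        String[] swept = sweep(heap, live);
        System.out.println(Arrays.toString(swept));   // [A, null, C, null, E, null]
        System.out.println(largestFreeBlock(swept));  // 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object that needs two contiguous slots cannot be allocated.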

4.2 Copying algorithm

The Copying algorithm was proposed to overcome the shortcomings of the Mark-Sweep algorithm. It divides the available memory into two halves of equal size and uses only one of them at a time. When that half is used up, the surviving objects are copied to the other half, and the used half is then cleaned up all at once, so memory fragmentation is unlikely to occur. The process is shown in the figure below:

  • This algorithm is simple to implement, efficient to run, and not prone to memory fragmentation.
  • But it comes at a high cost in memory usage, because the available memory is cut in half.
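
A minimal sketch of the copy step, under the same toy assumptions as before (array slots stand in for memory, and the names are invented for this example): survivors are copied into the other half and packed from the start, after which the whole from-space can be discarded in one go.

```java
import java.util.Arrays;

// Hypothetical toy model of a copying collector with two equally sized halves.
final class CopyingDemo {

    // Copies the live objects of fromSpace into a fresh toSpace, packed from index 0.
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"A", "B", "C", "D", "E", "F"};
        boolean[] live = {true, false, true, false, true, false};

        String[] toSpace = copyLive(fromSpace, live);
        System.out.println(Arrays.toString(toSpace)); // [A, C, E, null, null, null]
    }
}
```

The free slots in the to-space are contiguous, so there is no fragmentation. The price is that two arrays of the same size exist while only one of them is ever in use for allocation.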

4.3 Mark-Compact algorithm

The mark phase of this algorithm is the same as in Mark-Sweep. After marking, however, it does not reclaim the dead objects in place. Instead, it moves all the live objects toward one end and then cleans up the memory beyond the end boundary. The process is shown in the figure below:

  • The mark-compact algorithm builds on the mark-sweep algorithm but also moves objects, so it costs more.
  • But it does solve the memory fragmentation problem.
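
The compaction step can be sketched as an in-place slide, again as a toy model with invented names. Unlike the copying approach it needs no second half of memory, but every survivor may have to be moved.

```java
import java.util.Arrays;

// Hypothetical toy model of the compact phase of mark-compact.
final class MarkCompactDemo {

    // Slides live objects toward index 0 in place, clears everything past them,
    // and returns the boundary (the index of the first free slot).
    static int compact(String[] heap, boolean[] live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[next++] = heap[i];
            }
        }
        Arrays.fill(heap, next, heap.length, null);
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] live = {true, false, true, false, true, false};

        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap)); // [A, C, E, null, null, null]
        System.out.println(boundary);              // 3
    }
}
```

In a real collector, moving an object also means updating every reference that points to it, which is part of the extra cost the text mentions. This sketch leaves that out.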

4.4 Generational Collection algorithm

The generational collection algorithm is used by most JVM garbage collectors today. Its core idea is to divide memory into several regions according to the lifetimes of objects. Normally the heap is divided into the old (tenured) generation and the young generation. In HotSpot, the designers also brought the method area into generational collection and gave it a name: PermGen space (the permanent generation).

  • Old generation: typically only a few objects need to be reclaimed in each collection, so the Mark-Compact or Mark-Sweep algorithm is generally used.
  • Young generation: a large number of objects are reclaimed in each collection, so the Copying algorithm is used.

In the young generation, most objects are reclaimed in each collection, which means only a few survivors need to be copied. In practice, therefore, the young generation is not split 1:1. It is generally divided into one larger Eden space and two smaller Survivor spaces. Eden and one of the Survivor spaces are in use at any time. When a collection takes place, the surviving objects in Eden and that Survivor space are copied to the other Survivor space, and then Eden and the Survivor space that was just in use are cleaned up. Generally, the allocation ratio is 80% for Eden, 10% for Survivor1, and 10% for Survivor2.
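
The arithmetic behind this layout is easy to check with a small sketch. The class and method names are hypothetical; the 8:1:1 ratio is the one given above (in HotSpot the ratio can be tuned with -XX:SurvivorRatio).

```java
// Hypothetical helper that splits a young generation 8:1:1 as described in the text.
final class YoungGenLayout {

    // Returns {eden, survivor1, survivor2} sizes for a young generation of the given size.
    static long[] split(long youngSize) {
        long survivor = youngSize / 10;
        long eden = youngSize - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    // Space available for allocation at any time: Eden plus the one Survivor space in use.
    static long usable(long youngSize) {
        long[] parts = split(youngSize);
        return parts[0] + parts[1];
    }

    public static void main(String[] args) {
        long youngMb = 100;
        long[] parts = split(youngMb);
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]); // 80 / 10 / 10
        System.out.println(usable(youngMb));                                // 90
    }
}
```

With a plain 1:1 copying split, only 50 of the 100 MB would be usable. With 8:1:1, 90 MB is usable and only 10 MB is held in reserve, which works as long as the survivors of a collection fit into one Survivor space.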

  • Permanent generation: the relationship between the method area and the permanent generation is much like the relationship between an interface in Java and a class that implements it. The permanent generation is the HotSpot virtual machine's implementation of the method area defined in the virtual machine specification.

Garbage collection in the method area is generally less cost-effective. It mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants is similar to reclaiming objects in the other generations, but a class must meet all of the following conditions before it is considered useless:

  1. All instances of the class have been reclaimed; that is, no instance of the class exists in the Java heap;
  2. The Class object corresponding to the class is not referenced anywhere, so the methods of the class cannot be accessed through reflection anywhere;
  3. The ClassLoader that loaded the class has been reclaimed.

However, even if all of the above conditions are met, the class is not necessarily collected. The HotSpot VM also provides the -Xnoclassgc option, which turns off class garbage collection.

Reference: cloud.tencent.com/developer/a…

In newer versions of the HotSpot virtual machine (JDK 8 and later), the permanent generation has been replaced by Metaspace, where class metadata is stored.

5. Conclusion

This article has covered the basic concepts of Java GC, the memory structure of the JVM, and the basic mechanisms of garbage collection. If you have any questions or find any errors in the article, please leave a comment below. Topics such as the Java garbage collectors and Metaspace are not covered in detail here; they will be discussed separately later if time permits.