Although garbage collection predates Java and is now highly automated, a deeper understanding of its techniques helps when diagnosing out-of-memory errors and memory leaks, and when tuning a collector that has become a performance bottleneck under high concurrency.

Determining object state

The first step in garbage collection is to determine which objects are no longer in use and can be recycled.

Reference counting method

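Each object carries a reference counter: the counter is incremented when a new reference to the object is created and decremented when a reference is dropped, and an object whose counter reaches zero can be reclaimed. This method is easy to implement and efficient. However, the VM does not use it, because it cannot handle circular references: objects that refer to each other keep their counters above zero even when nothing else can reach them.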
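The circular-reference problem can be shown with a toy counter. This is a minimal sketch, not how any JVM manages memory; the class names RefCounted and RefCountDemo and the manual counter updates are assumptions made for illustration.

```java
// Toy reference counting, written by hand for illustration only.
class RefCounted {
    final String name;
    int refCount = 0;   // number of references currently pointing at this object
    RefCounted field;   // one outgoing reference, enough to build a cycle

    RefCounted(String name) { this.name = name; }

    // Point this.field at target and adjust both counters.
    void setField(RefCounted target) {
        if (target != null) target.refCount++;
        if (field != null) field.refCount--;
        field = target;
    }
}

class RefCountDemo {
    // Builds a two-object cycle, drops the external references,
    // and returns the counters that remain.
    static int[] cycleCounts() {
        RefCounted a = new RefCounted("a");
        RefCounted b = new RefCounted("b");
        a.refCount++;   // the local variable a refers to the object
        b.refCount++;   // the local variable b refers to the object
        a.setField(b);  // a -> b
        b.setField(a);  // b -> a
        a.refCount--;   // simulate "a = null"
        b.refCount--;   // simulate "b = null"
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        System.out.println(counts[0] + " " + counts[1]); // prints "1 1"
    }
}
```

Both counters stay at 1 although the program can no longer reach either object, so a pure reference-counting collector would never reclaim them.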

Reachability analysis

Starting from a special set of objects called GC Roots, the collector follows reference chains downward; any object that cannot be reached from a GC Root can be reclaimed. Objects that can serve as GC Roots are:

  • Objects referenced in the virtual machine stack local variable table
  • Objects referenced by static fields of classes in the method area
  • Objects referenced by constants in the method area
  • Objects referenced by JNI in the native method stack
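The traversal from GC Roots can be sketched as a graph search. The example below is an illustration under stated assumptions: GraphNode and ReachabilityDemo are hypothetical names, and a real collector walks actual heap objects rather than an explicit list of edges.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object and its outgoing references.
class GraphNode {
    final String name;
    final List<GraphNode> refs = new ArrayList<>();
    GraphNode(String name) { this.name = name; }
}

class ReachabilityDemo {
    // Returns every node reachable from the given roots by following refs.
    static Set<GraphNode> reachable(List<GraphNode> roots) {
        // Identity-based set: nodes are compared by reference, not by equals().
        Set<GraphNode> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<GraphNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            GraphNode node = pending.pop();
            if (marked.add(node)) {        // first visit: follow its references
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        GraphNode root = new GraphNode("root");
        GraphNode live = new GraphNode("live");
        GraphNode x = new GraphNode("x");
        GraphNode y = new GraphNode("y");
        root.refs.add(live);
        x.refs.add(y);
        y.refs.add(x);                     // a cycle that no root points into

        Set<GraphNode> marked = reachable(Collections.singletonList(root));
        System.out.println(marked.contains(live)); // prints "true"
        System.out.println(marked.contains(x));    // prints "false"
    }
}
```

Unlike reference counting, the cycle between x and y does not keep either node alive, because neither is reachable from a root.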

References

Prior to JDK 1.2, a reference was defined narrowly: if the value stored in a reference-type variable was the starting address of another block of memory, that variable was a reference to it. An object could only be referenced or unreferenced.

Since JDK 1.2, to describe objects that should be kept while memory is plentiful but may be discarded when it runs short, four kinds of reference exist: strong, soft, weak, and phantom (sometimes translated as "virtual"), in decreasing order of strength.

Strong references: ordinary references such as Object obj = new Object(). As long as a strong reference exists, the object is not reclaimed.

Soft references: for objects that are useful but not required. They are reclaimed only when the VM is about to run out of memory, before an OutOfMemoryError is thrown. Implemented with the SoftReference class.

Weak references: for objects that are not required. A weakly referenced object survives only until the next garbage collection and is collected whether or not there is enough memory. Implemented with the WeakReference class.

Phantom (virtual) references: a phantom reference does not affect an object's lifetime and cannot be used to obtain the object; its only purpose is to let the program be notified when the object is reclaimed. Implemented with the PhantomReference class.
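The reference classes can be tried out with a short program. This is a sketch only: the class name ReferenceKindsDemo and its helper are assumptions, and because System.gc() is merely a hint, the last line may print either value depending on the VM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Observes each reference kind while a strong reference still exists.
    static boolean[] whileStronglyHeld() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return new boolean[] {
            soft.get() == strong,   // still reachable, so not cleared
            weak.get() == strong,   // still reachable, so not cleared
            phantom.get() == null   // a phantom reference never hands out its referent
        };
    }

    public static void main(String[] args) {
        boolean[] held = whileStronglyHeld();
        System.out.println(held[0] + " " + held[1] + " " + held[2]); // prints "true true true"

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // only a request; collection is not guaranteed to run
        System.out.println("weak cleared: " + (weak.get() == null));
    }
}
```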

The role of finalize

During reachability analysis, an object marked as unreachable is first screened once. If the object does not override the finalize method, or its finalize method has already been called, it is classified as not needing finalization. Objects that do not need finalization are collected; objects that do are placed in a queue, and a Finalizer thread created automatically by the virtual machine runs their finalize methods in turn. However, the virtual machine does not promise to wait for each finalize method to finish; otherwise a finalize method that runs slowly or enters an infinite loop could stall the queue and bring down the whole collection mechanism.

If, while its finalize method runs, an object re-attaches itself to any object on a reference chain, it is not reclaimed. However, an object's finalize method is called at most once, so an object that has “saved itself” this way is reclaimed the next time it is found unreachable.

In practice, the finalize method is expensive to run and the order in which objects are finalized cannot be controlled, so its job is better done with try-finally.
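Cleanup with try-finally might look like the sketch below. Resource and TryFinallyDemo are hypothetical names; the point is only that the finally block runs deterministically, whether or not the work throws.

```java
class TryFinallyDemo {
    // Hypothetical resource that must be released explicitly.
    static class Resource {
        boolean closed = false;

        void use(boolean fail) {
            if (fail) throw new IllegalStateException("work failed");
        }

        void close() { closed = true; }
    }

    // Uses the resource and reports whether it was closed afterwards.
    static boolean closedAfterUse(boolean fail) {
        Resource resource = new Resource();
        try {
            resource.use(fail);
        } catch (IllegalStateException e) {
            // Swallowed here only to keep the demonstration short.
        } finally {
            resource.close(); // runs on both the normal and the exceptional path
        }
        return resource.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterUse(false)); // prints "true"
        System.out.println(closedAfterUse(true));  // prints "true"
    }
}
```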

Method area recovery

Although collecting the method area yields far less than collecting the heap, it is still necessary. It mainly reclaims two things: unused constants and unused classes. Reclaiming constants works much like reclaiming objects. A class must meet three criteria to be considered reclaimable:

  • All instances of the class have been reclaimed.
  • The ClassLoader that loaded the class has been reclaimed.
  • The java.lang.Class object of the class is not referenced anywhere, that is, the class cannot be accessed anywhere through reflection.

Garbage collection algorithm

Mark-sweep

There are two stages, mark and sweep: first mark all the objects that need to be reclaimed, then clear them in one pass. This is the most basic algorithm, but it is inefficient and leaves memory fragmented. Once fragmentation reaches a certain level, a large object may find no contiguous block big enough, which forces another garbage collection earlier than would otherwise be needed.
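The fragmentation effect can be seen on a toy heap of fixed-size slots. This sketch follows the common formulation in which surviving objects carry the mark bit and everything unmarked is cleared. The slot model and the class name MarkSweepDemo are assumptions for illustration; a real heap holds variable-sized objects.

```java
class MarkSweepDemo {
    // Sweep stage: a slot stays occupied only if it was occupied and marked live.
    static boolean[] sweep(boolean[] occupied, boolean[] marked) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && marked[i];
        }
        return after;
    }

    // Total number of free slots.
    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = { true, true, true, true, true, true, true, true };
        boolean[] marked   = { true, false, true, false, true, false, true, false };
        boolean[] after = sweep(occupied, marked);
        // prints "free=4 largestRun=1"
        System.out.println("free=" + freeSlots(after) + " largestRun=" + largestFreeRun(after));
    }
}
```

Half of the heap is free after the sweep, yet an object that needs two adjacent slots still cannot be allocated.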

Copying

Divide the memory into two parts and use only one at a time. When the part in use fills up, the objects that are still alive are copied to the other part, and the first part is then cleared as a whole. If the memory is divided equally, utilization is low, since only half of the memory is available at any time. If it is divided unequally, the volume of surviving objects varies from one collection to the next, so additional memory is needed as a guarantee for the case where the smaller part cannot hold them. In addition, when the object survival rate is high, the frequent copying itself becomes an efficiency bottleneck.

The HotSpot virtual machine divides the new generation into one Eden space and two Survivor spaces, with an Eden-to-Survivor size ratio of 8:1. Eden and one Survivor space are in use at any time; during collection, surviving objects are copied to the other Survivor space, and the old generation serves as the memory guarantee when that Survivor space is too small.
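The two-part scheme can be sketched with two arrays standing in for the two halves of memory. The names CopyingDemo and copyLive, and the use of strings as objects, are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copies live objects from the used half into a fresh half,
    // then clears the used half as a whole.
    static String[] copyLive(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int next = 0;                  // next free position in the new half
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[next++] = obj;      // survivors end up packed together
            }
        }
        Arrays.fill(from, null);       // the old half is now entirely free
        return to;
    }

    public static void main(String[] args) {
        String[] from = { "a", "b", "c", "d" };
        Set<String> live = new HashSet<>(Arrays.asList("b", "d"));
        String[] to = copyLive(from, live);
        System.out.println(Arrays.toString(to));   // prints "[b, d, null, null]"
        System.out.println(Arrays.toString(from)); // prints "[null, null, null, null]"
    }
}
```

The cost is proportional to the number of survivors, which is why the approach pays off only when few objects survive.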
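The effect of the 8:1 ratio on memory utilization can be worked out with simple arithmetic. The sketch assumes one Eden space plus two equally sized Survivor spaces; YoungGenLayout and the 100 MB figure are hypothetical.

```java
class YoungGenLayout {
    // Returns { eden, survivor, usable } in the same unit as youngGen,
    // assuming one Eden space and two Survivor spaces of equal size.
    static long[] layout(long youngGen, int edenToSurvivorRatio) {
        long survivor = youngGen / (edenToSurvivorRatio + 2);
        long eden = youngGen - 2 * survivor;
        long usable = eden + survivor;   // one Survivor space is always kept empty
        return new long[] { eden, survivor, usable };
    }

    public static void main(String[] args) {
        long[] sizes = layout(100, 8);   // a hypothetical 100 MB new generation
        // prints "eden=80 survivor=10 usable=90"
        System.out.println("eden=" + sizes[0] + " survivor=" + sizes[1] + " usable=" + sizes[2]);
    }
}
```

Under these assumptions 90% of the new generation is usable at any time, compared with 50% for an equal split.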

Mark-compact

Mark all surviving objects, move them together into one contiguous region of memory, and then clear the memory outside that region. This suits the old generation, where the object survival rate is high.
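The gathering step can be sketched as an in-place slide toward the start of the heap. As before, the array-of-slots model and the name MarkCompactDemo are assumptions for illustration.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slides marked objects to the front of the heap, clears the rest,
    // and returns the index of the first free slot.
    static int compact(String[] heap, boolean[] marked) {
        int next = 0;                        // destination for the next survivor
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[next++] = heap[i];      // next <= i, so nothing live is overwritten
            }
        }
        for (int i = next; i < heap.length; i++) {
            heap[i] = null;                  // everything past the survivors is free
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e" };
        boolean[] marked = { true, false, true, false, true };
        int firstFree = compact(heap, marked);
        System.out.println(Arrays.toString(heap)); // prints "[a, c, e, null, null]"
        System.out.println(firstFree);             // prints "3"
    }
}
```

All free memory ends up in one contiguous run, and no second half of the heap has to be kept in reserve.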

Generational collection

VM memory is divided into regions according to the lifetime of objects. Generally, the Java heap is divided into the new generation and the old generation, and each uses the collection algorithm suited to its characteristics. In the new generation few objects survive each collection, so most of the memory can be cleared and only a small number of survivors needs to be copied. Old-generation objects have a high survival rate and no extra space to act as a guarantee, so copying is unsuitable and mark-sweep or mark-compact is used instead.