Article from: JVM Garbage Collector and Memory Allocation Strategy

1. Summary of GC

Garbage collection (GC) is central to memory management in Python, Go and Java alike. Have you ever asked yourself the following questions?

  1. Which memory needs to be reclaimed?
  2. When should it be reclaimed?
  3. How is it reclaimed?

You might say: With dynamic memory allocation and memory reclamation technology now quite mature, everything seems to be “automated”, so why do we need to know about GC and memory allocation?

The answer is that when you need to troubleshoot memory overflows and memory leaks, or when garbage collection becomes the bottleneck that keeps a system from reaching higher concurrency, you need to monitor and tune these “automated” technologies. In the following, we will explore the answers to the above questions in turn, in combination with the following memory distribution diagram:

1.1 Which Memory Needs to be reclaimed

  • Among the various parts of the Java runtime memory area, the program counter, the virtual machine stack and the native method stack are born with the thread and die with the thread.
  • Stack frames are pushed onto and popped off the stack in an orderly way as methods are entered and exited. How much memory each stack frame needs is basically known as soon as the class structure is determined, so these areas are reclaimed automatically when the method or thread ends.
  • The heap and the method area are different. Different implementation classes of an interface may require different amounts of memory, and different branches of a method may require different amounts of memory. Only at runtime do we know which objects will be created, so memory allocation and reclamation focus on the heap and the method area.

1.2 When to recycle

The heap stores almost all object instances in Java. Before collecting the heap, the first thing the garbage collector needs to do is determine which objects are “alive” and which are “dead” (objects that can no longer be used in any way).

  • For the method area (the permanent generation), garbage collection consists of two parts: discarded constants and useless classes.
  • For the heap, which stores object instances, we first need to determine which objects are “alive”. The “dead” objects are the ones we want to reclaim.

There are two methods to determine whether an object is alive:

  1. Reference counting algorithm
  2. Reachability analysis algorithm

1.3 Reachability analysis algorithm

Both Java and C# use the reachability analysis algorithm to determine whether an object is alive. Starting from a set of objects called “GC Roots”, the collector searches downward from these nodes. The path taken by the search is called a reference chain. When an object is not connected to the GC Roots by any reference chain (that is, the object is unreachable from the GC Roots), this proves that the object is no longer usable.
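The search described above can be illustrated with a toy model. This is only a sketch under stated assumptions: Node, refs and reachableFrom are hypothetical names invented for this example, and a real JVM traverses actual heap objects rather than a hand-built graph. It shows that two objects referencing each other are still considered dead when no reference chain leads to them from a root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of reachability analysis; not how a real JVM is implemented.
class ReachabilityDemo {

    // Hypothetical stand-in for a heap object that holds references to other objects.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns every node reachable from the given roots by following reference chains.
    static Set<Node> reachableFrom(List<Node> gcRoots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {        // first visit: mark it and follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.refs.add(a);   // root -> a, so a is alive
        b.refs.add(c);      // b and c reference each other,
        c.refs.add(b);      // but no chain leads to them from the root

        Set<Node> alive = reachableFrom(Collections.singletonList(root));
        System.out.println("a alive: " + alive.contains(a)); // a alive: true
        System.out.println("b alive: " + alive.contains(b)); // b alive: false
        System.out.println("c alive: " + alive.contains(c)); // c alive: false
    }
}
```

A pure reference counting scheme would see a count of one on both b and c and keep them, which is why the cycle is a useful test case for reachability.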

In the Java language, the objects that can serve as GC Roots for reachability analysis include:

  1. Objects referenced in the virtual machine stack (the local variable table in a stack frame).
  2. Objects referenced by class static properties in the method area.
  3. Objects referenced by constants in the method area.
  4. Objects referenced by JNI (native methods) in the native method stack.
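As a rough illustration of the first three kinds of root, the sketch below marks in comments where each reference lives. GcRootsDemo and its members are hypothetical names chosen for this example, and the JNI case is omitted because it requires native code.

```java
// Illustrative only: the comments mark which references would act as GC Roots.
class GcRootsDemo {

    // Static property of a loaded class: the object it refers to is reachable from a GC Root.
    static Object staticHolder = new Object();

    // Constant: the string it refers to is likewise reachable from a GC Root.
    static final String CONSTANT = "kept alive by a constant reference";

    static int work() {
        // Local variable in the current stack frame: a GC Root while the frame is active.
        int[] local = new int[16];
        return local.length;
        // Once work() returns, the frame is popped and the array has no root left.
    }

    public static void main(String[] args) {
        System.out.println(work());
        staticHolder = null; // dropping the static reference leaves the old object unreachable
    }
}
```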

Whether reference counting is used to count an object's references, or reachability analysis is used to check whether its reference chain is reachable, the survival of an object is determined by references.

2. Reference mechanism in Java

After an object is created on the heap, the reference that holds it is really just a variable, and reference chains are formed by assigning references to one another. Starting from the GC Roots, we can determine whether a reference is reachable. Reachability is the basic condition for deciding whether an object can be garbage collected, and the semantic strength of the reference type determines at which stage of garbage collection the object is reclaimed.

Prior to JDK 1.2, the definition of a reference in Java was very traditional: if the value stored in reference-type data was the starting address of another piece of memory, that data was said to represent a reference. This definition is pure but too narrow. Under it, an object has only two states, referenced or not referenced, which is inadequate for describing objects that are of little use yet a pity to discard. We want to describe objects that stay in memory while there is enough space, but can be discarded if memory is still tight after garbage collection. The caching features of many systems fit this scenario.

After JDK 1.2, Java expanded the concept of Reference, and divided it into four types: Strong Reference, Soft Reference, Weak Reference, and Phantom Reference. These four reference strengths gradually decrease in turn.

2.1 Strong Reference

A variable declaration and definition such as Object obj = new Object() produces a strong reference to that object. As long as the object is strongly referenced and reachable from a GC Root, the JVM does not reclaim it, even when memory is close to running out.

2.2 Soft Reference

A soft reference is weaker than a strong reference and is used for objects that are useful but not required. Just before an OOM would occur, the garbage collector adds softly referenced objects to the collection scope to free more memory so that the program can keep running normally. Soft references are mainly used to cache intermediate calculation results on the server and user behavior data that does not need to be saved in real time.
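The caching use case can be sketched with java.lang.ref.SoftReference. This is a minimal sketch, not a production cache: SoftCache and its loader parameter are hypothetical names, the class is not thread-safe, and cleared entries are simply recomputed on the next lookup.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal memory-sensitive cache sketch; SoftCache is a hypothetical name.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // null if the key was never cached or the GC has already cleared the referent
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);                     // recompute the result
            entries.put(key, new SoftReference<>(value));  // cache it softly again
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<Integer, String> cache = new SoftCache<>(n -> "result-" + n);
        System.out.println(cache.get(42)); // result-42
    }
}
```

Callers never see a cleared entry directly: a lookup either returns the cached value or transparently recomputes it, which is what makes soft references a reasonable fit for data that can be rebuilt.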

2.3 Weak Reference

A weak reference is weaker than the previous two and is also used to describe non-required objects. If an object is reachable only through weak references, it will be collected at the next GC that examines it, for example the next young generation GC (YGC). Because the timing of the YGC is uncertain, there is also uncertainty about when a weakly referenced object will be reclaimed. WeakReference is mainly used to point to objects that are expected to disappear easily. After the strong reference is broken, the weak reference does not keep the object alive, and once the object has been collected a call to WeakReference.get() returns null.
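The behavior of WeakReference.get() can be observed with a short experiment. The sketch below assumes a typical JVM; System.gc() is only a request, so the second result is not guaranteed and no fixed output is claimed for it.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // While the strong reference exists, the referent cannot be collected.
        System.out.println(weak.get() == strong); // true

        strong = null;   // break the strong reference; only the weak one remains
        System.gc();     // a request only: the JVM may or may not collect right away

        // Often null at this point, but that is not guaranteed.
        System.out.println("after GC: " + weak.get());
    }
}
```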

2.4 Phantom Reference

Once a phantom reference has been defined, the object it refers to cannot be retrieved through it. The only purpose of setting a phantom reference for an object is to receive a system notification when the object is reclaimed.

Note

One difference between phantom references and soft or weak references is that a phantom reference must be used together with a reference queue (ReferenceQueue). When the garbage collector is ready to reclaim an object and finds that it has a phantom reference, it adds the phantom reference to the associated reference queue before reclaiming the object’s memory.
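The pairing with a reference queue can be sketched as follows. This is a minimal sketch: the one-second timeout is an arbitrary choice for the example, and because System.gc() is only a request, the notification may not have arrived when the wait ends.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class PhantomReferenceDemo {
    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> phantom = new PhantomReference<>(resource, queue);

        // get() on a phantom reference always returns null, even while the referent is alive.
        System.out.println(phantom.get()); // null

        resource = null;   // drop the only strong reference
        System.gc();       // a request only; collection timing is up to the JVM

        try {
            // Wait up to one second for the notification; null means it has not arrived.
            Reference<?> notified = queue.remove(1000);
            System.out.println(notified == phantom ? "notified" : "not notified yet");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status and stop waiting
        }
    }
}
```

In practice the queue would typically be polled by a dedicated thread that releases whatever external resource was tied to the collected object.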