IT brother
A big factory to do advanced Java development program ape
Follow wechat public account: IT elder brother
Java foundation, Java Web, JavaEE all tutorials, including Spring Boot, etc
Reply: Resume template, you can get 100 beautiful resumes
Reply: Java learning route, you can get the latest and most complete a learning roadmap
Re: Java ebooks, get 13 must-read books for top programmers
Foreword (thumbs up)
Today we are going to talk about garbage collection algorithms for the JVM. We are going to learn what garbage collection algorithms are, and we are going to learn what garbage collectors are, and then we are going to learn how to tune JVM parameters.
Garbage Collection (GC), as its name implies, frees up space occupied by Garbage and prevents memory leaks. Efficient use of usable memory, cleaning and recycling dead or unused objects in the memory heap.
How to find the garbage in the program
Reference counting method
Add a counter RC to each object that increments by one when the object is referenced somewhere and decays by one when the reference fails. The object counter is 0 to determine whether the object can be reclaimed. Disadvantages: Cannot solve the problem of circular references.
I’m going to create a String, String m = new String(” Jack “); “Jack” has a reference to m. Then set m to null, and the number of references to “jack” equals zero, which in the reference-counting algorithm means that this piece of content needs to be reclaimed.
The reference counting algorithm spreads garbage collection over the entire application, rather than suspending the entire application until all objects in the heap have been processed. Therefore, garbage collection with reference counting is not strictly stop-the-world garbage collection.
It seems nice, but we know that JVM garbage collection is stop-the-world, so what led us to finally abandon The reference counting algorithm? Look at the following example.
public class ReferenceCountingGC {
public Object instance;
public ReferenceCountingGC(String name) {}public static void testGC(a){
ReferenceCountingGC a = new ReferenceCountingGC("objA");
ReferenceCountingGC b = new ReferenceCountingGC("objB");
// A and B refer to each other
a.instance = b;
b.instance = a;
a = null;
b = null; }}Copy the code
As you can see, the last two objects are no longer accessible, but because they refer to each other, their reference counts will never be zero, and the reference counting algorithm will never tell the GC collector to reclaim them.
Accessibility analysis algorithm
The path through GC ROOT’s object as the starting point for the search and down through the reference is called the reference chain. Determine whether an object can be reclaimed by whether it has a path to the reference chain (objects that can be GC ROOT: objects referenced in the virtual machine stack, objects referenced by class static attributes in the method area, objects referenced by constants in the method area, objects referenced by JNI in the local method stack)
The reachable algorithm successfully solves the problem of circular dependencies that reference counting cannot solve, as long as you cannot establish a direct or indirect connection to the GC Root, you are considered recyclable. This raises the question of which ones belong to GC Root.
Objects in the Java memory region that can be used as GC ROOT:
Object referenced in the virtual machine stack
public class StackLocalParameter {
public StackLocalParameter(String name) {}
public static void testGC(a) {
StackLocalParameter s = new StackLocalParameter("localParameter");
s = null; }}Copy the code
When s is null, the localParameter object breaks the reference chain with GC Root and will be reclaimed.
The object referenced by the class static property in the method area
public class MethodAreaStaicProperties {
public static MethodAreaStaicProperties m;
public MethodAreaStaicProperties(String name) {}
public static void testGC(a){
MethodAreaStaicProperties s = new MethodAreaStaicProperties("properties");
s.m = new MethodAreaStaicProperties("parameter");
s = null; }}Copy the code
In this case, s is GC Root, and s is set to null. After GC, the properties object pointed to by S will be reclaimed because it cannot establish a relationship with GC Root. M, as a static property of the class, also belongs to GC Root. The parameter object is still connected to GC Root, so the parameter object will not be reclaimed.
The object referenced by the constant in the method area
public class MethodAreaStaicProperties {
public static final MethodAreaStaicProperties m = MethodAreaStaicProperties("final");
public MethodAreaStaicProperties(String name) {}
public static void testGC(a) {
MethodAreaStaicProperties s = new MethodAreaStaicProperties("staticProperties");
s = null; }}Copy the code
M is either a constant reference in the method area or GC Root. If S is set to null, final objects will not be reclaimed because there is no contact with GC Root.
Objects referenced in the local method stack
Any native interface will use some kind of local method stack. If the implemented local method interface uses the C connection model, then its local method stack is C stack. When a thread calls a Java method, the virtual machine creates a new stack frame and pushes it onto the Java stack. However, when it calls a local method, the virtual machine keeps the Java stack unchanged and no longer pushes new frames into the thread’s Java stack. The virtual machine simply connects dynamically and calls the specified local method directly.
Garbage collection algorithm
After determining what garbage can be collected, all the garbage collector has to do is start garbage collection, but there is a problem: how to do garbage collection efficiently. Here we discuss the core ideas of several common garbage collection algorithms.
Mark-clear algorithm
Mark-sweep is one of the most basic garbage collection algorithms. It is divided into two parts. First, it marks the objects in the memory area, marks which objects are recyclable, and then picks up the garbage and cleans it up. As shown above, the garbage that is cleared becomes unused memory area, waiting to be used again. But it has a big problem, and that is memory fragmentation.
The medium squares above are assumed to be 2M, the smaller ones 1M, and the larger ones 4M. By the time we’re done recycling, the memory will be cut into multiple segments. We know that to create memory, we need contiguous memory areas, so we need a 2M area of memory, and two 1M areas are unusable. As a result, we actually have so much memory, but we can’t use it.
Replication algorithm
Copying algorithms are evolved on the basis of marker removal algorithms to solve the memory fragmentation problem of marker removal algorithms. It divides the available memory by capacity into two equally sized pieces, one of which is used at a time. When this area of memory is used up, the surviving objects are copied to the other area, and the used memory space is cleaned up again. The continuous availability of memory is guaranteed, and there is no need to consider the complexity of memory fragmentation when allocating memory. The copying algorithm exposes another problem, such as the high cost of using only 200 gigabytes of a 500 GB hard drive.
Mark-compression algorithm
Mark-compression algorithm the marking process is still the same as the mark-clean algorithm, but instead of cleaning up the recyclable objects directly, the next step is to move all surviving objects to one end and clean up the memory area beyond the end boundary.
The mark compression algorithm solves the problem of memory fragmentation and avoids the disadvantage that the copy algorithm can only use half of the memory region. The mark compression algorithm changes the memory more frequently and needs to sort out the reference addresses of all living objects, which is much less efficient than the replication algorithm. Typically, the Java heap is divided into the new generation and the old generation, so that the most appropriate collection algorithm can be used for each generation.
Generational collection algorithm
Generation collection algorithm The generation collection algorithm is not strictly an idea or theory, but a combination of the above three basic algorithm ideas and different algorithms for different situations. According to the different life cycle of objects, the memory is divided into several blocks.
In the new generation, a large number of objects are found dead and only a few survive in garbage collection, so the replication algorithm is selected, and only a small amount of the replication cost of the surviving objects can be collected.
In the old days, because the object had a high survival rate and there was no extra space to allocate it, it had to be recycled using mark-clean or mark-clean algorithms.