Preface
For a Java programmer, memory management is an unavoidable topic on the way to a more advanced level. Garbage collection (GC) is one of its most important parts and has to be mastered. In this article I share my understanding of the garbage collection mechanism and the generational collection strategy.
Contents
1. Background
2. Two collection mechanisms
- 2.1. Reference counting
- 2.2. Reachability analysis
3. Collection algorithms
- 3.1. Mark-sweep algorithm
- 3.2. Copying algorithm
- 3.3. Mark-compact algorithm
4. Generational collection strategy
- 4.1. The young generation
- 4.2. The old generation
5. Four reference types
1. Background
Generally speaking, a running program is constantly writing data into memory, and data that is no longer needed has to be cleared out in time; otherwise the program eventually fails with an OutOfMemoryError. Every programmer has to follow this principle. I have heard (I don't know C myself) that in C the programmer has to reclaim garbage manually. We Java developers are much better off, because the JVM has a GC mechanism that cleans up garbage for us automatically. That convenience has a price, though: some objects that are no longer needed always manage to escape the GC algorithm. This phenomenon is known as a memory leak, and only by understanding the GC mechanism can you avoid writing programs that leak memory.
2. Two collection mechanisms
2.1 Reference Counting
What is reference counting? Take A a = new A() as an example: the new A object is held by the reference a, so its reference count is incremented by 1. If a is then set to null, the object's reference count drops to 0. The GC algorithm detects that the object has a reference count of zero and reclaims it. This is simple, but reference counting has a drawback.
The scenario is as follows:
A a = new A();
B b = new B();
a.next = b;
b.next = a;
a = null;
b = null;
Will the A and B objects be reclaimed after the code above runs? It looks as if both references have been set to null, but the next fields of the two objects still hold references to each other, so each object keeps a reference count of 1. Both objects are garbage, yet a reference-counting collector can never reclaim them. Some of you might say that a leak this obvious is easy to fix: just set both next fields to null. That is true here, but in a large body of real business logic such a leak is hard to spot at a glance. This is why mainstream JVMs use reachability analysis rather than reference counting.
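To make the counting concrete, here is a minimal sketch of the scenario above using a hand-written counter. It is only an illustration: Node, setNext and runScenario are hypothetical names, and no real JVM tracks objects this way.

```java
/** Toy reference counter for the a/b scenario; purely illustrative. */
public class RefCountDemo {
    static final class Node {
        final String name;
        int refCount;   // how many references currently point at this node
        Node next;      // the field that forms the cycle

        Node(String name) {
            this.name = name;
        }
    }

    /** Points holder.next at target and keeps both counters up to date. */
    static void setNext(Node holder, Node target) {
        if (holder.next != null) {
            holder.next.refCount--;
        }
        holder.next = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Replays the scenario and returns the final counts of the two objects. */
    static int[] runScenario() {
        Node a = new Node("A");
        a.refCount++;        // A a = new A();
        Node b = new Node("B");
        b.refCount++;        // B b = new B();
        setNext(a, b);       // a.next = b;
        setNext(b, a);       // b.next = a;
        a.refCount--;        // a = null; (the local reference is dropped)
        b.refCount--;        // b = null;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = runScenario();
        // prints "A count = 1, B count = 1"
        System.out.println("A count = " + counts[0] + ", B count = " + counts[1]);
    }
}
```

Neither count ever reaches zero, so a collector that relies on the counter alone would keep both objects forever.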
2.2 Reachability analysis
Reachability is a concept borrowed from graph theory. In the JVM, certain special references are treated as GC roots, and any object reachable from a GC root is not treated as garbage. To put it another way, an object that is held directly or indirectly by a GC root is not a garbage object. The figure looks something like this:
In the figure, A, B, C, and D can be reached from a GC root, so they are not reclaimed. E and F cannot be reached from any GC root, so they are marked as garbage. The most typical example is G and H: they reference each other but cannot be reached from a GC root, so they are marked as garbage as well. To sum up: reachability analysis solves the problem that objects referencing each other cannot be reclaimed under reference counting.
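The idea can be sketched as a plain graph traversal. The edges below are an assumption, since the figure itself is not reproduced here; they are chosen only so that the outcome matches the description (A to D reachable, E and F unreachable, G and H referencing each other). All class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReachabilityDemo {
    // Assumed reference edges (object -> objects it references); not taken from the original figure
    static final Map<String, List<String>> REFS = Map.of(
            "A", List.of("B", "C"),
            "C", List.of("D"),
            "E", List.of("F"),
            "G", List.of("H"),
            "H", List.of("G"));
    // Objects referenced directly by a GC root (assumed)
    static final List<String> ROOTS = List.of("A");
    static final Set<String> HEAP = Set.of("A", "B", "C", "D", "E", "F", "G", "H");

    /** Breadth-first traversal: everything that can be reached by following references from the roots. */
    static Set<String> reachable(List<String> roots, Map<String, List<String>> refs) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String target : refs.getOrDefault(current, List.of())) {
                if (seen.add(target)) {   // add() returns false if already visited
                    queue.add(target);
                }
            }
        }
        return seen;
    }

    /** Objects on the heap that no root can reach, in sorted order. */
    static Set<String> sampleGarbage() {
        Set<String> garbage = new TreeSet<>(HEAP);
        garbage.removeAll(reachable(ROOTS, REFS));
        return garbage;
    }

    public static void main(String[] args) {
        // prints "Garbage: [E, F, G, H]"
        System.out.println("Garbage: " + sampleGarbage());
    }
}
```

G and H end up in the garbage set even though they reference each other, because the traversal never starts from them.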
Which kinds of references can act as GC roots? There are roughly four:
- Local variables on the stack
- Static variables in the method area
- Constants in the method area
- References held by JNI (native code) in the native method stack
Note
Never confuse a reference with an object. An object lives in memory, whereas a reference is a variable or constant that holds the object's address in memory.
Let me verify several of these GC roots with some code.
Local variables
I use Android code for the experiments. If you are not familiar with Android, just think of onCreate as the main method.
import android.app.Application;
import android.util.Log;

public class MyApp extends Application {

    @Override
    public void onCreate() {
        super.onCreate();
        method();
    }

    private void method() {
        Log.i("test", "method start");
        A a = new A(); // the object is held only by the local variable a
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        Log.i("test", "method end");
    }

    class A {
        // Logs when the collector finalizes this object
        @Override
        protected void finalize() throws Throwable {
            Log.i("test", "finalize A");
        }
    }
}
Tips
- In Java, an object's finalize method is called before the object is reclaimed.
- Garbage collection runs on a separate thread in the JVM. To make the effect easier to observe, a 2000 ms delay is added here.
The print result is as follows:
17:58:57.526 method start
17:58:59.526 method end
17:58:59.591 finalize A
method takes 2000 milliseconds to run, and in this run the A object is reclaimed shortly after method returns. So we can conclude that local variables on the stack can act as GC roots.
Static variables in the method area
public class MyApp extends Application {

    private static A a;

    @Override
    public void onCreate() {
        super.onCreate();
        Log.i("test", "onCreate");
        a = new A(); // A is assumed to be the class from the previous example
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        a = null; // drop the only reference to the object
        Log.i("test", "a = null");
    }
}
The print result is as follows:
18:12:35.988 a = new A()
18:12:38.028 a = null
18:12:38.096 finalize A
An A object is created and assigned to the static variable a. After 2000 milliseconds, the static variable a is set to null. The log shows that the A object is reclaimed right after a becomes null. So we can conclude that static variables can act as GC roots.
Verifying method-area constants works exactly like the static-variable case. Verifying native (JNI) references is more involved; interested readers can try it themselves.
Verify whether a member variable can act as a GC root
public class MyApp extends Application {

    @Override
    public void onCreate() {
        super.onCreate();
        A a = new A();
        B b = new B();
        a.b = b;  // the B object is now also held by a member variable of the A object
        a = null; // the A object is no longer reachable
    }

    class A {
        B b;

        @Override
        protected void finalize() throws Throwable {
            Log.i("test", "finalize A");
        }
    }

    class B {
        @Override
        protected void finalize() throws Throwable {
            Log.i("test", "finalize B");
        }
    }
}
The print result is as follows:
13:14:58.999 finalize A
13:14:58.999 finalize B
According to the log, both the A object and the B object are reclaimed. Although the B object is held by the field b of the A object, a member variable cannot act as a GC root. Once the A object becomes unreachable, the B object is unreachable too, so it is treated as garbage.
3. Collection algorithms
The previous section described how garbage is identified, but actually reclaiming it depends on an algorithm. Let me briefly describe several common GC algorithms.
3.1. Mark-sweep algorithm
Starting from all the GC roots, traverse the objects in memory and mark every object that is reachable from a GC root. All remaining unmarked objects are garbage and are cleared.
- Advantages: Simple to implement and efficient to execute, since no objects are moved
- Disadvantages: Causes memory fragmentation (free memory is scattered in small pieces), so GC may be triggered frequently when a large contiguous chunk of memory has to be allocated
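The marking and clearing steps above can be sketched on a toy heap. This is a simplified model, not how a real collector is implemented: the heap is an array of slots, and Obj, mark, sweep and layout are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        boolean marked;

        Obj(String name) {
            this.name = name;
        }
    }

    /** Mark phase: flag the object and everything reachable from it. */
    static void mark(Obj obj) {
        if (obj == null || obj.marked) {
            return;
        }
        obj.marked = true;
        for (Obj ref : obj.refs) {
            mark(ref);
        }
    }

    /** Sweep phase: free unmarked slots in place and reset marks for the next cycle. */
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false;
            } else {
                heap[i] = null;   // reclaimed; survivors stay where they are
            }
        }
    }

    /** Renders the heap as text, using '_' for a free slot. */
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) {
            sb.append(o == null ? "_" : o.name);
        }
        return sb.toString();
    }

    static String runSample() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        Obj d = new Obj("D"), e = new Obj("E"), f = new Obj("F");
        a.refs.add(c);
        c.refs.add(e);
        b.refs.add(d);
        d.refs.add(b);                    // B and D only reference each other
        Obj[] heap = { a, b, c, d, e, f };
        List<Obj> roots = List.of(a);     // assume only A is held by a GC root
        for (Obj root : roots) {
            mark(root);
        }
        sweep(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        // prints "A_C_E_"
        System.out.println(runSample());
    }
}
```

Three slots are free afterwards, but no two of them are adjacent, which is exactly the fragmentation problem listed above.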
3.2. Copying algorithm
Divide memory into two halves and use only one of them at a time. During collection, the live objects are found and copied to the other half. Everything left in the first half can then be regarded as garbage and is cleared in one go, and the other half becomes the one currently in use. The two halves keep swapping roles in this way.
- Advantages: Solves the problem of memory fragmentation
- Disadvantages: Live objects have to be moved, and usable memory is reduced to half of what it was.
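Here is a rough sketch of the two-halves scheme on a toy heap. It only moves Java references between two arrays, whereas a real collector copies the object's bytes and updates every pointer to it; the names Obj, copyLive and layout are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class CopyingDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean copied;   // stands in for a forwarding pointer

        Obj(String name) {
            this.name = name;
        }
    }

    /** Places every object reachable from the roots into toSpace, packed from slot 0. */
    static int copyLive(List<Obj> roots, Obj[] toSpace) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        int next = 0;
        while (!pending.isEmpty()) {
            Obj obj = pending.poll();
            if (obj.copied) {
                continue;             // already moved
            }
            obj.copied = true;
            toSpace[next++] = obj;    // next free slot in the other half
            pending.addAll(obj.refs);
        }
        for (int i = 0; i < next; i++) {
            toSpace[i].copied = false;   // reset for the next cycle
        }
        return next;
    }

    static String layout(Obj[] space) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : space) {
            sb.append(o == null ? "_" : o.name);
        }
        return sb.toString();
    }

    static String runSample() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        Obj d = new Obj("D"), e = new Obj("E"), f = new Obj("F");
        a.refs.add(c);
        c.refs.add(e);
        Obj[] fromSpace = { a, b, c, d, e, f };
        Obj[] toSpace = new Obj[fromSpace.length];
        copyLive(List.of(a), toSpace);    // assume only A is held by a GC root
        Arrays.fill(fromSpace, null);     // the old half is cleared in one go
        return layout(toSpace) + "|" + layout(fromSpace);
    }

    public static void main(String[] args) {
        // prints "ACE___|______"
        System.out.println(runSample());
    }
}
```

The survivors end up packed together with one contiguous free region behind them, at the cost of keeping a second half of the same size in reserve.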
3.3. Mark-compact algorithm
Starting from all the GC roots, traverse the objects in memory, move the live objects toward one end of the memory, and then clear the garbage objects. In effect, it trades time for space.
- Advantages: Avoids the memory fragmentation of mark-sweep and does not need to give up half of the memory as the copying algorithm does
- Disadvantages: Objects still have to be moved, so execution efficiency is slightly lower.
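A minimal sketch of the sliding step, again on a toy heap with hypothetical names (Obj, mark, compact). It assumes a single array standing in for memory and ignores the pointer updates a real collector has to perform after moving objects.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkCompactDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) {
            this.name = name;
        }
    }

    /** Mark phase: flag the object and everything reachable from it. */
    static void mark(Obj obj) {
        if (obj == null || obj.marked) {
            return;
        }
        obj.marked = true;
        for (Obj ref : obj.refs) {
            mark(ref);
        }
    }

    /** Compact phase: slide marked objects toward slot 0 and free everything else. */
    static void compact(Obj[] heap) {
        int free = 0;   // next slot a survivor will be moved to
        for (int i = 0; i < heap.length; i++) {
            Obj obj = heap[i];
            heap[i] = null;
            if (obj != null && obj.marked) {
                obj.marked = false;
                heap[free++] = obj;
            }
        }
    }

    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) {
            sb.append(o == null ? "_" : o.name);
        }
        return sb.toString();
    }

    static String runSample() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        Obj d = new Obj("D"), e = new Obj("E"), f = new Obj("F");
        a.refs.add(c);
        c.refs.add(e);
        Obj[] heap = { a, b, c, d, e, f };
        mark(a);          // assume only A is held by a GC root
        compact(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        // prints "ACE___"
        System.out.println(runSample());
    }
}
```

The result is as compact as with copying, but it is achieved inside a single region, so no memory is held in reserve.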
4. Generational collection strategy
The garbage collector is very busy in the JVM. If an object lives for a long time, can we spend some memory to keep it aside, so that the collector is not burdened by examining it over and over again? The answer is yes. The JVM uses a generational collection strategy: objects are classified by how long they have lived, and the heap is divided into different areas that store the objects of each generation. Typically, the generations are the young generation, the old generation, and the permanent generation (removed in Java 8 and replaced by Metaspace).
4.1. The young generation
First, let's look at the memory layout of the young generation:
The young generation is divided into Eden, SurvivorA, and SurvivorB in a ratio of 8:1:1.
Young generation workflow:
- A newly created object is placed in Eden. When Eden is about to be full, a garbage collection is performed: the surviving objects are copied to SurvivorA and Eden is emptied.
- The next time Eden fills up, another garbage collection is performed: the surviving objects in Eden and SurvivorA are copied to SurvivorB, and then Eden and SurvivorA are emptied.
- When Eden fills up again, another garbage collection is performed: the surviving objects are copied to SurvivorA, and Eden and SurvivorB are emptied. After an object has survived about 15 such collections, it is moved to the old generation.
Copying is at the core of this workflow, and since most new objects die quickly, only a few objects have to be copied each time. That makes the young generation a natural fit for the copying algorithm, which is why that algorithm is used here.
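The workflow above can be simulated in a few lines. This is a sketch under simplifying assumptions: liveness is supplied by the caller instead of being computed from GC roots, space sizes are ignored, and the promotion threshold is a parameter (the sample uses 3 instead of 15 to keep the run short). YoungGenDemo and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class YoungGenDemo {
    static final class Obj {
        final String name;
        int age;   // number of young-generation collections survived

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivorFrom = new ArrayList<>();   // survivor space currently holding objects
    List<Obj> survivorTo = new ArrayList<>();     // empty survivor space that receives copies
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    YoungGenDemo(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in Eden. */
    Obj allocate(String name) {
        Obj obj = new Obj(name);
        eden.add(obj);
        return obj;
    }

    /** One young-generation collection; isLive stands in for reachability analysis. */
    void minorGc(Predicate<Obj> isLive) {
        copySurvivors(eden, isLive);
        copySurvivors(survivorFrom, isLive);
        eden.clear();
        survivorFrom.clear();
        // The two survivor spaces swap roles for the next collection
        List<Obj> tmp = survivorFrom;
        survivorFrom = survivorTo;
        survivorTo = tmp;
    }

    private void copySurvivors(List<Obj> space, Predicate<Obj> isLive) {
        for (Obj obj : space) {
            if (!isLive.test(obj)) {
                continue;             // dead objects are simply not copied
            }
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                old.add(obj);         // promoted to the old generation
            } else {
                survivorTo.add(obj);
            }
        }
    }

    static String runSample() {
        YoungGenDemo heap = new YoungGenDemo(3);
        heap.allocate("longLived");
        heap.allocate("temp");
        for (int i = 0; i < 3; i++) {
            heap.minorGc(obj -> obj.name.equals("longLived"));
        }
        Obj promoted = heap.old.get(0);
        return "eden=" + heap.eden.size()
                + " survivor=" + heap.survivorFrom.size()
                + " old=" + promoted.name + "(age " + promoted.age + ")";
    }

    public static void main(String[] args) {
        // prints "eden=0 survivor=0 old=longLived(age 3)"
        System.out.println(runSample());
    }
}
```

The short-lived object disappears at the first collection without ever being copied, while the long-lived one bounces between the survivor spaces until its age reaches the threshold.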
4.2. The old generation
From the previous section we know that an object that survives long enough is moved to the old generation. When the old generation is about to be full, a garbage collection is performed there as well. The old generation is characterized by many live objects and few garbage objects, so the mark-compact algorithm has to move relatively few objects and leaves no memory fragmentation. The mark-compact algorithm is therefore a good fit for the old generation.
5. Four reference types
During development we inevitably create some relatively large objects, such as the Bitmap that carries pixel data in Android. Used carelessly, such an object can cause a memory leak, and a large number of them puts heavy pressure on memory. To help avoid this, Java provides four kinds of object reference: strong, soft, weak, and phantom. The table below compares them.
- Assume that the objects below are reachable from a GC root through the given type of reference
Reference type | When it is reclaimed |
---|---|
Strong reference | Never reclaimed while the reference exists (default) |
Soft reference | Reclaimed when memory is insufficient |
Weak reference | Reclaimed the next time GC runs |
Phantom reference | May be reclaimed at any time, as if the reference did not exist; get() always returns null |
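For orientation, this is roughly how the four kinds of reference look in code, using the classes in java.lang.ref. It is a minimal sketch: the helper method names are made up for illustration, and because GC timing is not deterministic, the example avoids claiming when a referent is actually cleared.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceTypesDemo {

    /** A weak reference still returns its referent while a strong reference keeps it alive. */
    static boolean weakSeesStronglyHeldObject() {
        Object strong = new Object();                          // ordinary (strong) reference
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    /** A phantom reference never hands out its referent. */
    static boolean phantomGetReturnsNull() {
        Object strong = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>(); // receives the reference after collection
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        // Soft reference used as a simple cache for a large buffer
        SoftReference<byte[]> cache = new SoftReference<>(new byte[1024 * 1024]);
        byte[] data = cache.get();        // may be null if memory ran low in the meantime
        if (data == null) {
            data = new byte[1024 * 1024]; // rebuild the value and cache it again
            cache = new SoftReference<>(data);
        }
        System.out.println("cached bytes: " + data.length);
        System.out.println("weak reference sees live object: " + weakSeesStronglyHeldObject());
        System.out.println("phantom get() is null: " + phantomGetReturnsNull());
    }
}
```

The pattern in main, checking get() for null and rebuilding the value, is the usual way to work with soft and weak references, since the referent can disappear between any two calls.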
Reference: Lecture 2 of Advanced 34 Android Engineers
Conclusion
This article described the GC mechanism from five aspects.
- GC exists to improve developer productivity
- Reachability analysis solves the circular-reference problem of reference counting
- Using different GC algorithms in different scenarios improves efficiency
- The generational collection strategy further improves GC efficiency
- Used well, the four reference types can prevent memory leaks to some extent