preface
The most notable feature of Java compared to C/C++ is automatic garbage collection (GC). It removes one of the most frustrating memory management problems in C/C++ and lets programmers focus on the program itself instead of worrying about reclaiming memory. This is one of the main reasons Java became popular. GC greatly improves programmer productivity, yet programmers hardly notice that it exists. It is like leaving your plates on the table after a meal in a restaurant: a waiter will clear them away, and you do not care when the waiter comes or how the plates are collected.
Some people say that since GC is automatic and does the cleaning for us, there is no harm in not understanding it. In most cases that is true, but for performance tuning and troubleshooting, a deeper knowledge of GC is essential. In one past case, adjusting JVM GC parameters reduced a service's TP90 and TP99 response times by more than 10 ms, which greatly improved service availability. So understanding GC is a must for becoming a good Java programmer!
Garbage collection is covered in two parts. The first part covers garbage collection theory, mainly including:
- The main GC algorithms: the principles, characteristics, advantages, and disadvantages of mark-sweep, mark-compact, and copying
- What are the advantages and disadvantages of Serial, CMS, G1, and other garbage collectors? Why is there no universal garbage collector?
- Why is the young generation divided into Eden, S0, and S1? What considerations lie behind this?
- Off-heap memory is not controlled by GC, so how is it freed?
- Must an object be collected once it is eligible for collection?
- What is a SafePoint, and what is Stop The World?
The second part covers garbage collection in practice, mainly including:
- What does the GC log format look like?
- What are the main OOM scenarios?
- If an OOM occurs, how do you locate it? What are the common memory debugging tools?
This article discusses garbage collection from the following aspects:
- JVM memory areas
- How to Identify Garbage
- Reference counting
- Reachability algorithm
- Mark-sweep
- Copying
- Mark-compact
- Generational collection algorithm
The article is long, but it also includes many animations to help readers understand GC. I believe you will gain a lot from reading it.
JVM memory region
To understand how garbage collection works, we first need to know what data it collects and in which area that data lives. So let's take a look at the memory areas of the JVM.
- Virtual machine stack: Describes the memory model of method execution. It is thread-private, and its life cycle is the same as the thread's. Each method invocation creates a stack frame (see below) that stores the method's local variable table, operand stack, dynamic linking information, and return address. The frame is pushed when the method is invoked and popped when it returns, and popping the frame is equivalent to clearing its data. Because the timing of push and pop is well defined, this area does not need GC.
- Native method stack: Very similar to the virtual machine stack, except that the virtual machine stack serves the Java methods executed by the virtual machine, while the native method stack serves native methods. This area does not need GC either.
- Program counter: Thread-private. It can be thought of as a line-number indicator for the bytecode being executed by the current thread. In a bytecode listing such as the one below, each instruction is preceded by a number (the line number), and we can think of the program counter as storing these numbers. Recording these numbers (instruction addresses) has a purpose. Multithreading in the Java virtual machine is implemented by threads taking turns on allotted processor time, and at any moment a processor executes only one thread. When a thread's time slice is used up, the thread is suspended and the processor switches to another thread. When the suspended thread is later resumed, how does it know where it left off? The line number recorded in the program counter tells it. So the main purpose of the program counter is to record the thread's execution position so that a resumed thread can continue from where it was suspended. Note that the program counter is the only area for which the Java Virtual Machine specification does not define any OOM condition, so this area does not need GC either.
- Native memory: A thread-shared area. In Java 8, native memory, also known as off-heap memory, contains both metaspace and direct memory. Notice the difference in the figure above between the JVM memory areas in Java 8 and before Java 8. Before Java 8 there was the concept of a permanent generation. It was the HotSpot virtual machine's implementation of the method area defined by the JVM specification, and it mainly stored class information, constants, static variables, code compiled by the just-in-time compiler, and so on. It was implemented as part of the heap and managed by GC. If you dynamically generated many classes or called String.intern heavily, it could easily cause an OOM. Its size had to be limited, but the right size was hard to determine because it depends on how many classes and constants the program has. So in Java 8 the implementation of the method area was moved to metaspace in native memory. The method area is then no longer under the control of the JVM heap and is not collected with it, which improves performance, and there is no longer an OOM caused by the permanent generation size limit. (If the total memory is 2 GB and the JVM is allocated 100 MB, metaspace can in theory use 2 GB - 100 MB = 1.9 GB, which is ample room.) In summary, GC is not required in this area after Java 8.
Aside: Consider this question: off-heap memory is not controlled by GC, so how is it freed?
- Heap: None of the areas above need GC. That leaves the heap, and yes, this is where GC happens! Object instances and arrays are allocated on the heap, and GC mainly collects these two kinds of data. This is the area we will focus on below.
How to Identify Garbage
In the previous section we looked at the JVM memory areas in detail and saw that GC mainly happens in the heap. So how does GC decide whether an object instance or piece of data in the heap is garbage?
Reference counting method
The simplest approach is reference counting: each time an object is referenced, a counter in its object header is incremented. When the object is no longer referenced (the count is zero), it can be collected.
String ref = new String("Java");
In the code above, ref refers to the object created on the right-hand side, so the object's reference count is 1.
If ref = null is added after the code above, the object is no longer referenced and its reference count drops to 0. Since no variable references the object, it will be reclaimed. The GIF is as follows
Reference counting seems fine, but it cannot solve one major problem: circular references! What is a circular reference?
public class TestRC {
    TestRC instance;

    public TestRC(String name) {}

    public static void main(String[] args) {
        // step 1
        TestRC a = new TestRC("a");
        TestRC b = new TestRC("b");
        // step 2
        a.instance = b;
        b.instance = a;
        // step 3
        a = null;
        b = null;
    }
}
Let's draw the picture step by step.
At step 3, both a and b have been set to null, but the two objects still point to each other (each has a reference count of 1), so under reference counting they can never be collected. Because it cannot handle circular references, modern virtual machines do not use reference counting to decide whether an object can be collected.
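The three steps can be replayed in a toy model that keeps the counts by hand. This is only a sketch to show why the counts never reach zero. The `Node` class, its `refCount` field, and the `setField` helper are hypothetical and are not how any real JVM tracks objects.

```java
// Toy model of reference counting; NOT how HotSpot manages memory.
class RefCountDemo {
    // Hypothetical heap object that tracks how many references point to it.
    static final class Node {
        final String name;
        int refCount = 0;
        Node instance; // field that may point to another Node
        Node(String name) { this.name = name; }
    }

    // Point holder.instance at target and keep both counts up to date.
    static void setField(Node holder, Node target) {
        if (holder.instance != null) holder.instance.refCount--;
        holder.instance = target;
        if (target != null) target.refCount++;
    }

    // Replays steps 1 to 3 and returns the final counts of a and b.
    static int[] run() {
        Node a = new Node("a"); a.refCount++; // step 1: local variable a
        Node b = new Node("b"); b.refCount++; // step 1: local variable b
        setField(a, b);                        // step 2: a.instance = b
        setField(b, a);                        // step 2: b.instance = a
        a.refCount--;                          // step 3: a = null
        b.refCount--;                          // step 3: b = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // a=1, b=1
    }
}
```

Both counts stay at 1 even though no variable can reach either object any more, which is exactly the leak described above.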
Reachability algorithm
The reachability algorithm starts from a set of objects called GC Roots and follows references to the nodes they point to, then to the nodes those point to, and so on until every node has been traversed. Objects that are not on any reference chain starting from a GC Root are considered "garbage" and will be collected by GC.
As shown in the figure, the reachability algorithm solves the circular reference problem: a and b can be collected because no GC Root reaches them.
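A minimal sketch of the same idea on a hand-built object graph follows. The `Obj` class and the single `root` object are illustrative assumptions. A real collector walks actual heap references rather than a list of names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a hand-built object graph.
class ReachabilityDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Mark every object reachable from the roots using a worklist.
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {       // first visit only
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Builds root -> live plus an unreachable cycle a <-> b, returns the garbage names.
    static List<String> garbageNames() {
        Obj root = new Obj("root");
        Obj live = new Obj("live");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        root.refs.add(live);
        a.refs.add(b); // a and b reference each other...
        b.refs.add(a); // ...but no root reaches them
        Set<Obj> marked = mark(List.of(root));
        List<String> garbage = new ArrayList<>();
        for (Obj o : List.of(root, live, a, b)) {
            if (!marked.contains(o)) garbage.add(o.name);
        }
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println(garbageNames()); // [a, b]
    }
}
```

The cycle between a and b does not matter here: neither object is ever visited, so both are reported as garbage.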
Must objects a and b be collected just because they are eligible? Not necessarily. When an object is unreachable and a GC happens, the collector first checks whether the object's finalize method has already been executed. If it has not, finalize is executed first. Inside finalize we can re-associate the current object with GC Roots. After finalize has run, GC checks again whether the object is reachable. If it is still unreachable it is collected; if it is reachable it is not collected!
Note: finalize is executed only once per object. If finalize runs the first time and makes the object reachable again, the object is not collected that time. But when the object is examined by GC again later, finalize is skipped and the object is collected. Remember this!
So what are GC Roots, and which objects can serve as GC Roots?
- The object referenced in the virtual machine stack (the local variable table in the stack frame)
- The object referenced by the class static property in the method area
- The object referenced by the constant in the method area
- Objects referenced by JNI (commonly referred to as Native methods) in the Native method stack
Object referenced in the virtual machine stack
As shown in the following code, a is a local variable in the stack frame and acts as a GC Root. When a = null, the link between this GC Root and the new Test() instance it originally pointed to is broken, so the object will be collected.
public class Test {
    public static void main(String[] args) {
        Test a = new Test();
        a = null;
    }
}
The object referenced by the class static property in the method area
As shown in the following code, when the local variable a in the stack frame is set to null, the object that a originally pointed to loses its connection to a GC Root (the variable a), so it will be collected. The object we assigned to s, however, is referenced by a static class field. That field acts as a GC Root, so the object it points to stays alive!
public class Test {
    public static Test s;

    public static void main(String[] args) {
        Test a = new Test();
        a.s = new Test();
        a = null;
    }
}
The object referenced by the constant in the method area
As shown in the following code, the object referred to by the constant s is not collected just because the object referred to by a is collected.
public class Test {
    public static final Test s = new Test();

    public static void main(String[] args) {
        Test a = new Test();
        a = null;
    }
}
Objects referenced by JNI in the native method stack
For readers who don't know what a native method is, here is a simple explanation: a native method is an interface through which Java calls non-Java code. The method is not implemented in Java but in another language such as C or C++. Java calls native methods through JNI. Native methods are stored in library files (DLL files on Windows, SO files on UNIX machines). By calling the functions inside a native library, Java can connect closely with the local machine and call system-level interfaces. Still not clear? See the references at the end of this article for details on the definition and use of native methods.
When calling a Java method, the virtual machine creates a stack frame and pushes it onto the Java stack. When calling a native method, the virtual machine leaves the Java stack unchanged and does not push a new frame onto it. It simply links dynamically and calls the specified native method directly.
JNIEXPORT JNICALL Java_com_pecuyu_jnirefdemo_MainActivity_newStringNative(JNIEnv *env, jobject instance, jstring jmsg) {
    ...
    jclass jc = (*env)->FindClass(env, STRING_PATH);
}
As shown in the code above, when Java calls this native method, jc is pushed onto the native method stack. jc is what we call an object referenced by JNI in the native method stack, so it is released only after the native method finishes executing.
Main methods of garbage collection
In the last section we learned that the reachability algorithm identifies which data is garbage. How is that garbage then collected? There are mainly the following approaches:
- Mark-sweep algorithm
- Copying algorithm
- Mark-compact algorithm
Mark-sweep algorithm
The procedure is very simple:
- First, mark the collectable objects using the reachability algorithm (the yellow part in the figure)
- Then reclaim the collectable objects
It is really simple to carry out and no data has to be moved, so what is the problem? Take a closer look at the figure above. Yes, memory fragmentation! Suppose we want to allocate a block that needs more contiguous memory than any single free block in the heap shown above. The allocation fails even though enough memory is free in total. If the unused 2 MB, 2 MB, and 1 MB blocks could be joined into one region with 5 MB of free space, the problem would be solved. How can that be done?
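The fragmentation problem can be made concrete with a toy heap of 1 MB cells. The layout below (free runs of 2, 2, and 1 cells separated by live objects) is an assumption chosen to match the sizes mentioned in the text, not a dump of a real heap.

```java
// Toy heap after a mark-sweep pass: true = live object, false = free cell (1 MB each).
class FragmentationDemo {
    static int totalFree(boolean[] heap) {
        int free = 0;
        for (boolean used : heap) {
            if (!used) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free cells.
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean used : heap) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // A contiguous allocation succeeds only if one free run is large enough.
    static boolean canAllocate(boolean[] heap, int cells) {
        return largestFreeRun(heap) >= cells;
    }

    static boolean[] sampleHeap() {
        // free, free, LIVE, free, free, LIVE, free, LIVE
        return new boolean[] { false, false, true, false, false, true, false, true };
    }

    public static void main(String[] args) {
        boolean[] heap = sampleHeap();
        System.out.println("free in total: " + totalFree(heap) + " MB");
        System.out.println("largest contiguous: " + largestFreeRun(heap) + " MB");
        System.out.println("can allocate 5 MB: " + canAllocate(heap, 5));
    }
}
```

In this layout 5 MB is free in total, yet the largest contiguous run is only 2 MB, so a 5 MB request cannot be satisfied.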
Copying algorithm
Divide the heap into two regions, A and B. Region A is used for allocating objects and region B is left unused. Use the marking method above to mark the surviving objects in region A, then copy all surviving objects from region A to region B (placing them right next to each other), and finally clear all objects in region A to free its space. This eliminates the memory fragmentation problem.
However, the drawbacks of the copying algorithm are obvious. For example, if the heap is given 500 MB of memory, only 250 MB is usable, so the space is halved for no reason! This is surely unacceptable! In addition, every collection has to move the surviving objects to the other half, which is inefficient. (Think of deleting elements from an array and then shifting the remaining elements to one end: it is obviously inefficient.)
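Here is a small sketch of the copy step, using two arrays as regions A and B. Treating a null slot as "dead or free" is a simplification made for this example. A real collector decides liveness through reachability analysis.

```java
import java.util.Arrays;

// Toy semispace copy: region A ("from") is evacuated into region B ("to").
class CopyingDemo {
    // Copies live entries (non-null) into "to", packed from index 0,
    // then clears "from". Returns the number of objects copied.
    static int copyLive(String[] from, String[] to) {
        int next = 0;
        for (String obj : from) {
            if (obj != null) {
                to[next++] = obj; // survivors end up right next to each other
            }
        }
        Arrays.fill(from, null);  // region A is now completely free
        return next;
    }

    public static void main(String[] args) {
        String[] regionA = { "x", null, "y", null, null, "z" };
        String[] regionB = new String[regionA.length];
        int copied = copyLive(regionA, regionB);
        System.out.println(copied + " live objects: " + Arrays.toString(regionB));
        System.out.println("region A: " + Arrays.toString(regionA));
    }
}
```

Note that region B has to be as large as region A even though only half of its slots end up in use here, which is the space cost described above.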
Mark-compact
The first two steps are the same as in mark-sweep. The difference is that mark-compact adds a compaction step: all surviving objects are moved to one end and arranged next to each other (as shown in the figure), and then everything beyond them is cleared. In this way the memory fragmentation problem is solved.
However, the drawback is also obvious: moving live objects on every collection is inefficient.
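The compaction step can be sketched in place on a single array. As in the other toy examples, a null slot stands for a dead or free cell, which is an assumption of this sketch rather than how a real heap is represented.

```java
import java.util.Arrays;

// Toy sliding compaction on one region.
class CompactDemo {
    // Slides live entries (non-null) to the start of the heap, preserving order,
    // clears the rest, and returns the index where free space now begins.
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                heap[free++] = heap[i]; // free <= i, so no live entry is overwritten
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { null, "a", null, "b", "c", null };
        int free = compact(heap);
        System.out.println(Arrays.toString(heap) + ", free space starts at " + free);
    }
}
```

Unlike the copying approach, no second region is needed, but every surviving object may have to be moved on each collection.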
Generational collection algorithm
The generational collection algorithm combines the algorithms above, keeping their advantages while avoiding their disadvantages as far as possible, so it is the first choice of modern virtual machines. It is less an algorithm than a strategy, because it integrates the algorithms above. Why do we need generational collection? Let's look at how object lifetimes are distributed, as shown in the figure: the vertical axis represents allocated bytes and the horizontal axis represents program run time.
As you can see from the figure, most objects are short-lived and are collected within a short time (research by IBM shows that in general about 98% of objects die young and are collected by a single Minor GC). So the generational collection algorithm divides the heap into a young generation and an old generation based on object lifetime (before Java 8 there was also a permanent generation). The default ratio of young to old is 1:2. The young generation is further divided into the Eden space, the from Survivor space (S0), and the to Survivor space (S1), with a ratio of 8:1:1. The most suitable garbage collection algorithm can then be chosen for the characteristics of each generation. We call a GC in the young generation a Young GC (also known as Minor GC) and a GC in the old generation an Old GC (also known as Full GC).
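As a quick arithmetic check of these ratios, the sketch below splits a heap of a given size using 1:2 for young to old and 8:1:1 for Eden to S0 to S1. The 600 MB heap is an arbitrary example value, and a real JVM aligns and may adapt these sizes, so treat the numbers as an idealized illustration.

```java
// Idealized split of a heap by the default ratios quoted in the text.
class GenSizingDemo {
    // Returns {young, old, eden, s0, s1} in MB.
    static long[] split(long heapMb) {
        long young = heapMb / 3;          // young : old = 1 : 2
        long old = heapMb - young;
        long survivor = young / 10;       // Eden : S0 : S1 = 8 : 1 : 1
        long eden = young - 2 * survivor;
        return new long[] { young, old, eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long[] s = split(600);
        System.out.println("young=" + s[0] + " old=" + s[1]
                + " eden=" + s[2] + " s0=" + s[3] + " s1=" + s[4]);
        // young=200 old=400 eden=160 s0=20 s1=20
    }
}
```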
Aside: Think about it, why is the young generation divided into so many spaces?
How does generational garbage collection work
Working principle of generational collection
1. Allocation and collection of objects in the young generation
As analyzed above, most objects are collected within a very short time. Objects are generally allocated in the Eden space.
A Minor GC is triggered when the Eden space is about to become full.
What did we say before? Most objects are collected within a short time, so only a few objects survive a Minor GC, and they are moved to S0. (This is why Eden:S0:S1 = 8:1:1 and Eden is much larger than S0 and S1: a Minor GC in Eden collects most objects, close to 98%, leaving only a few survivors.) Each surviving object's age is incremented by one (an object's age is the number of Minor GCs it has survived), and finally all objects in Eden are cleared to free the space, as shown in the GIF below.
When the next Minor GC is triggered, the live objects in Eden and S0 are moved to S1 (the age of each live object in Eden and S0 is incremented by one), and Eden and S0 are then cleared.
When the Minor GC after that is triggered, the previous step is repeated, except that this time live objects are copied from Eden and S1 to S0. On every collection, live objects move from Eden and S0 (or S1) to S1 (or S0), so the roles of S0 and S1 are swapped each time. In other words, the young generation is collected with the copying algorithm. Because most objects allocated in Eden die in a Minor GC and only a few survive (which is why Eden:S0:S1 defaults to 8:1:1), and because S0 and S1 are small, the cost of frequently copying objects is kept to a minimum.
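The cycle described above can be simulated with three lists. This is a sketch under simplifying assumptions: liveness is supplied by the caller as a predicate, spaces have no size limit, and nothing is promoted. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of Eden plus two survivor spaces.
class YoungGenDemo {
    static final class Obj {
        final String name;
        int age = 0; // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space that currently holds objects
    List<Obj> to = new ArrayList<>();   // survivor space that is currently empty

    void allocate(String name) { eden.add(new Obj(name)); }

    // One Minor GC: copy survivors of Eden and "from" into "to", age them,
    // clear Eden and "from", then swap the roles of the two survivor spaces.
    void minorGc(Predicate<Obj> isLive) {
        copySurvivors(eden, isLive);
        copySurvivors(from, isLive);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space, Predicate<Obj> isLive) {
        for (Obj o : space) {
            if (isLive.test(o)) {
                o.age++;
                to.add(o);
            }
        }
    }

    // Two collections: "b" dies in the first one, "a" and "c" survive.
    static String run() {
        YoungGenDemo gen = new YoungGenDemo();
        gen.allocate("a");
        gen.allocate("b");
        gen.minorGc(o -> o.name.equals("a")); // only "a" survives the first GC
        gen.allocate("c");
        gen.minorGc(o -> true);               // everything survives the second GC
        StringBuilder sb = new StringBuilder();
        for (Obj o : gen.from) {
            sb.append(o.name).append(':').append(o.age).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run()); // c:1 a:2
    }
}
```

After the second collection "a" has age 2 and "c" has age 1, and the survivor space that received them becomes the "from" space for the next round.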
2. When objects are promoted to the old generation
- When an object's age reaches the threshold we set, it is promoted from S0 (or S1) to the old generation. As shown in the figure, the age threshold is set to 15. When the next Minor GC occurs, one of the objects in S0 reaches age 15, which meets the threshold, so it is promoted to the old generation!
- When a large object needs a large amount of contiguous memory, it is not allocated in Eden but directly in the old generation. If a large object were allocated in Eden and then moved to S0 or S1 after a Minor GC, the overhead would be high (copying it is slow and it takes up a lot of space), and it would quickly fill S0 and S1. So it is simply placed in the old generation from the start.
- There is one more case in which objects are promoted: if the total size of all objects of the same age in S0 (or S1) is greater than half of the S0 (or S1) space, objects of that age or older are also promoted to the old generation.
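The first and third rules can be combined into one small function that returns the age at which promotion starts. This sketch follows the rule exactly as worded above (comparing the size of each single age group with half of the survivor space). The way an actual JVM computes this threshold may differ in detail, and the array layout and parameter names are assumptions made for the example.

```java
// Toy calculation of the effective promotion age, following the rules as stated in the text.
class TenuringDemo {
    // sizeByAge[i] = total size of survivor objects whose age is i (index 0 unused).
    // Returns the age at or above which objects are promoted.
    static int promotionAge(long[] sizeByAge, long survivorCapacity, int maxThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxThreshold; age++) {
            if (sizeByAge[age] > survivorCapacity / 2) {
                return age; // one age group alone fills more than half of the survivor space
            }
        }
        return maxThreshold; // otherwise only the configured age threshold applies
    }

    public static void main(String[] args) {
        // Survivor space of 100 units; objects of age 2 occupy 60 units.
        System.out.println(promotionAge(new long[] { 0, 10, 60, 5 }, 100, 15)); // 2
        // No age group exceeds 50 units, so the threshold of 15 applies.
        System.out.println(promotionAge(new long[] { 0, 10, 20, 5 }, 100, 15)); // 15
    }
}
```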
3. Guarantee of space allocation
Stop The World
If the old generation is full, a Full GC is triggered. A Full GC collects both the young generation and the old generation (that is, the whole heap), which causes Stop The World (STW for short) and incurs considerable performance overhead.
What is STW? During a GC (Minor GC or Full GC), only the garbage collector threads are working, while all other worker threads are suspended.
Aside: Why are other worker threads suspended during garbage collection? Imagine that you are picking up rubbish while another group of people keeps throwing it. Could you ever finish?
Full GC usually pauses worker threads for too long (because it cleans up unusable objects across the whole heap, which usually takes a long time). If the server receives many requests during that time, they will be refused! So we try to minimize Full GC. (Minor GC also causes STW, but only a short one, because most objects in Eden are collected and only a few live objects are moved to S0 or S1 by the copying algorithm, so it is relatively acceptable.)
Now we should understand that dividing the young generation into Eden, S0, and S1, setting an age threshold for objects, and sizing the young and old generations at 1:2 by default all serve to keep objects from entering the old generation too early and to delay Full GC as long as possible. Think about what would happen if the young generation consisted only of Eden: after every Minor GC the surviving objects would enter the old generation prematurely, the old generation would fill up quickly, and Full GC would be triggered quickly, even though most of those objects would have died after two or three more Minor GCs. With S0 and S1 as a buffer, only a few objects reach the old generation, its usage grows more slowly, and Full GC is not triggered too early.
Because Full GC (or Minor GC) affects performance, GC should start at an appropriate point, known as a Safe Point. Safe points must not be so few that GC has to wait too long before it can start, which would stall the program, nor so frequent that they overload the runtime. In general, a safe point is a point at which the thread's state is well defined, for example where GC Root information can be determined, so that the JVM can begin GC safely. Safe points are mainly at the following specific locations:
- The end of a loop
- Just before a method returns
- After a method call
Because of the nature of the young generation (most objects die in a Minor GC), Minor GC uses the copying algorithm. The old generation, in contrast, holds more surviving objects that take up more space. Using the copying algorithm there would be costly (with a high survival rate, the copying algorithm performs many copy operations and also wastes half of the space). Therefore GC in the old generation generally uses the mark-compact algorithm.
Garbage collector types
If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation. The Java Virtual Machine specification does not prescribe how garbage collectors should be implemented, so the collectors provided by different vendors and different versions of the virtual machine can differ considerably. Virtual machines generally provide parameters that let users combine the collectors for each generation according to the characteristics of their application. The main garbage collectors are the following:
- Collectors that work in the young generation: Serial, ParNew, Parallel Scavenge
- Collectors that work in the old generation: CMS, Serial Old, Parallel Old
- Collector that works in both the young and old generations: G1
If two collectors in the picture are connected by a line, they can be used together. Let's look at what each garbage collector does.
Young generation collectors
Serial collector
Serial is a single-threaded garbage collector for the young generation. Single-threaded means that it uses only one CPU or one collection thread to collect garbage. Moreover, remember STW: while it collects garbage, all other user threads are paused until the collection finishes. This means the application is unavailable during GC.
A single-threaded garbage collector may not seem very practical, but whether a technology is useful always depends on the context. In Client mode it is simple and efficient (compared with other collectors running on a single thread). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so by concentrating on GC alone it gets the most out of single-threaded collection. In desktop application scenarios, the memory allocated to the virtual machine is usually not large. Collecting a few dozen or even one or two hundred megabytes (this is only the young generation's share, and desktop applications rarely grow much bigger) keeps STW within a little over a hundred milliseconds. As long as collections are not frequent, such pauses are acceptable. So for virtual machines running in Client mode, the Serial collector is the default young generation collector.
ParNew collector
The ParNew collector is a multithreaded version of the Serial collector. Apart from using multiple threads, everything else, such as the collection algorithm, STW, object allocation rules, and collection policies, is the same as in the Serial collector. Under the hood, the two collectors also share a considerable amount of code.
ParNew mainly works in Server mode. When a server receives many requests, response time matters a great deal, and multithreading makes garbage collection faster, which shortens STW and improves response time. So ParNew is the preferred young generation collector for many virtual machines running in Server mode. There is another reason unrelated to performance: apart from the Serial collector, ParNew is the only collector that can work with the CMS collector. CMS is an epoch-making garbage collector, the first that is truly concurrent in the sense that garbage collection threads work (basically) at the same time as user threads. CMS uses the traditional GC collector code framework, which it shares with Serial and ParNew, so it can work with both. The Parallel Scavenge and G1 collectors described later do not use the traditional GC collector code framework (some other collectors share only parts of it), and therefore cannot work with the CMS collector.
With multiple CPUs, ParNew's multithreaded collection undoubtedly makes garbage collection faster, which effectively reduces STW time and improves application responsiveness.
Parallel Scavenge collector
The Parallel Scavenge collector looks much the same as the ParNew collector: it is also a multithreaded collector that works in the young generation. So what makes it different?
The focus is different. Collectors such as CMS focus on minimizing how long user threads are paused during garbage collection, whereas the goal of Parallel Scavenge is a controllable throughput (throughput = time running user code / (time running user code + time spent in garbage collection)). This means collectors like CMS are better suited to programs that interact with users, because shorter pauses give a better user experience, while Parallel Scavenge, which focuses on throughput, is better suited to tasks that need little user interaction, such as background computation.
The Parallel Scavenge collector provides two parameters for precise throughput control: -XX:MaxGCPauseMillis, which controls the maximum garbage collection pause time, and -XX:GCTimeRatio, which directly sets the throughput (the default value of 99 corresponds to a throughput of 99%).
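The throughput formula is easy to check numerically. In the sketch below, `throughput` is the formula from the text, and `maxGcFraction` assumes the usual reading of -XX:GCTimeRatio=N as "GC may take at most 1 / (1 + N) of total time". The timing values are made-up example inputs, not measurements.

```java
// Worked example of the throughput formula and the GCTimeRatio setting.
class ThroughputDemo {
    // throughput = user time / (user time + GC time)
    static double throughput(double userMs, double gcMs) {
        return userMs / (userMs + gcMs);
    }

    // Assumed interpretation: with GCTimeRatio = N, GC time is capped at 1 / (1 + N) of total time.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC out of every 100 ms.
        System.out.printf("throughput: %.2f%n", throughput(99, 1));
        // With the default GCTimeRatio of 99, GC may take up to 1% of the time.
        System.out.printf("max GC fraction: %.2f%n", maxGcFraction(99));
    }
}
```

Under this reading, the default value of 99 and the "99% throughput" quoted above describe the same target.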
Besides these two parameters, Parallel Scavenge also offers -XX:+UseAdaptiveSizePolicy. When it is enabled, there is no need to manually specify details such as the size of the young generation or the ratio of Eden to Survivor. You only set the basic heap size (-Xmx sets the maximum heap) together with the maximum garbage collection pause time or the throughput target, and the virtual machine collects monitoring information about the running system and adjusts these parameters dynamically to meet the pause time or throughput we set. This adaptive strategy is also an important difference between Parallel Scavenge and ParNew!
Old generation collectors
Serial Old collector
As we saw above, the Serial collector is a single-threaded collector that works in the young generation. Correspondingly, Serial Old is a single-threaded collector that works in the old generation. It is mainly intended for virtual machines in Client mode. In Server mode it has two further uses: one is to work with Parallel Scavenge in JDK 1.5 and earlier, and the other is to serve as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection (described below). Its use together with the Serial collector is illustrated below.
Parallel Old collector
The Parallel Old collector is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm. The combination of the two is illustrated below. Since both are multithreaded collectors, together they truly achieve the goal of “throughput first”.
CMS collector
The CMS collector is a collector whose goal is the shortest possible STW time. If the application cares a great deal about service response speed and wants to give users the best experience, the CMS collector is a great choice!
We said earlier that the old generation mainly uses the mark-compact algorithm. CMS works in the old generation but uses the mark-sweep algorithm, which consists of the following four steps:
- Initial mark
- Concurrent mark
- Remark
- Concurrent sweep
As the figure shows, STW occurs in the initial mark and remark phases, which pauses user threads. However, initial mark only marks the objects directly reachable from GC Roots, which is very fast. Concurrent mark is the process of GC Roots tracing. Remark corrects the marks of objects whose state changed during concurrent mark because user threads kept running. The pause in this phase is generally slightly longer than in the initial mark phase, but much shorter than the concurrent mark phase.
Concurrent mark and concurrent sweep are the most time-consuming phases of the whole process, but user threads keep working during both, so normal use of the application is not affected. Overall, then, the CMS collector's memory reclamation can be regarded as running concurrently with user threads.
But the CMS collector is far from perfect and has three major shortcomings:
- The CMS collector is very sensitive to CPU resources, for understandable reasons. For example, if 10 user threads could have been processing requests but 3 of them are now used for collection, throughput drops by 30%. By default CMS starts (number of CPUs + 3) / 4 collection threads. With only one or two CPUs, throughput drops by 50%, which is obviously unacceptable.
- CMS cannot handle floating garbage, which may cause a "Concurrent Mode Failure" and lead to another Full GC. Because user threads are still running during the concurrent sweep phase, new garbage keeps being produced while the sweep is in progress. This garbage (floating garbage) can only be cleaned up in the next collection. Since user threads must keep running during collection, enough space has to be reserved for them, which means the CMS collector cannot wait until the old generation is nearly full before collecting, as other collectors do. In JDK 1.5, CMS is activated by default once 68% of the old generation is in use. This percentage can be set with -XX:CMSInitiatingOccupancyFraction, but if it is set too high, the memory reserved during CMS operation may not meet the program's needs, resulting in a Concurrent Mode Failure. The Serial Old collector is then used to collect the old generation again, and since it is a single-threaded collector, the STW pause becomes longer.
- CMS uses the mark-sweep algorithm, which, as mentioned above, produces a lot of memory fragmentation and makes large allocations difficult. If no contiguous space large enough for an object can be found, a Full GC is triggered, which hurts application performance. We can of course turn on -XX:+UseCMSCompactAtFullCollection (on by default), which makes the CMS collector compact memory fragments when it has to fall back to a Full GC. Compaction causes STW, so the pause gets longer. Another parameter, -XX:CMSFullGCsBeforeCompaction, sets how many uncompacted Full GCs are performed before one with compaction.
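The thread formula from the first shortcoming can be evaluated for a few CPU counts to see why small machines suffer most. The formula (number of CPUs + 3) / 4 is taken from the text and integer division is assumed. The share computed here is simply threads divided by CPUs, which is a rough illustration rather than a measured throughput loss.

```java
// Evaluates the default CMS collection thread count quoted in the text.
class CmsThreadsDemo {
    // (number of CPUs + 3) / 4, using integer division.
    static int cmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Rough share of CPUs occupied by collection threads.
    static double cpuShare(int cpus) {
        return (double) cmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] { 2, 4, 8, 16 }) {
            System.out.printf("%d CPUs -> %d GC thread(s), %.0f%% of CPUs%n",
                    cpus, cmsThreads(cpus), cpuShare(cpus) * 100);
        }
    }
}
```

With 2 CPUs one collection thread occupies half of the machine, while with 4 or more CPUs the share is about a quarter.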
G1 (Garbage First) collector
The G1 collector is a garbage collector aimed at server-side applications, known as the All-conquering garbage collector, with the following features:
- Like the CMS collector, it can run concurrently with application threads.
- It compacts free space faster.
- GC pause times are more predictable.
- It does not sacrifice a lot of throughput the way CMS does.
- It does not require a larger Java heap.
- It produces no memory fragmentation while running. G1 uses mark-compact overall and copying locally (between two Regions). Neither algorithm produces memory fragmentation, so contiguous free memory is available after collection, which helps programs that run for a long time.
- It builds a predictable pause time model on top of STW. Users can specify the expected pause time, and G1 tries to keep pauses within the time the user set.
The main reason G1 can build a predictable pause model is that it lays out the heap differently from traditional garbage collectors. Traditionally, each generation occupies one contiguous block of memory: as described above, the heap is divided into a new generation and an old generation, and the new generation is further divided into Eden, S0, and S1, as shown in the figure.
In G1, however, the storage addresses of a generation are not contiguous. Each generation is made up of n non-contiguous Regions of the same size, and each Region occupies a contiguous range of virtual memory addresses, as shown in the figure.
Besides the familiar new generation and old generation Regions, there is an extra kind of Region marked H, which stands for Humongous. These Regions store humongous objects (H-Obj), that is, objects whose size is greater than or equal to half of a Region. Such very large objects are allocated directly in the old generation, which avoids copying and moving them repeatedly. So what is the advantage of laying out the heap like this?
With a traditional collector, a Full GC collects the entire heap. Once the heap is split into Regions, G1 can track the value of the garbage accumulated in each Region, where value means the amount of space a collection would reclaim weighed against the estimated time the collection would take, and maintain a priority list based on that value. Within the allowed collection time, G1 collects the Regions with the greatest value first, which avoids collecting the whole old generation and reduces the pause caused by STW. And because only some of the Regions are collected each time, the STW time can be kept under control.
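The idea of collecting the most valuable Regions first within a time budget can be sketched as a greedy selection. This is a deliberately simplified model and not the algorithm HotSpot actually uses; the Region class, its fields, the value formula (reclaimable space divided by estimated cost), and the sample numbers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified model of "garbage first" selection; not HotSpot's real algorithm.
final class CollectionSetSketch {

    // Hypothetical per-Region statistics.
    static final class Region {
        final String name;
        final long reclaimableMb; // space a collection is expected to free
        final double costMs;      // estimated time to collect; must be > 0

        Region(String name, long reclaimableMb, double costMs) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.costMs = costMs;
        }

        // Collection value: space reclaimed per millisecond of pause.
        double value() {
            return reclaimableMb / costMs;
        }
    }

    // Takes Regions in descending order of value and keeps each one that
    // still fits into the remaining pause budget.
    static List<Region> choose(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.value()).reversed());

        List<Region> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.costMs <= pauseBudgetMs) {
                chosen.add(r);
                spentMs += r.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 8, 4.0)); // value 2.0
        regions.add(new Region("B", 6, 2.0)); // value 3.0
        regions.add(new Region("C", 1, 5.0)); // value 0.2
        regions.add(new Region("D", 4, 4.0)); // value 1.0

        // With an 8 ms budget, B and A fit (6 ms in total); D and C do not.
        for (Region r : choose(regions, 8.0)) {
            System.out.println(r.name + " value=" + r.value());
        }
    }
}
```

Regions that are expensive to collect and free little space, such as C here, are simply left for a later cycle, which is how the pause stays within the budget.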
The G1 collector works in the following phases:
- Initial mark
- Concurrent mark
- Final mark
- Filtering and collection
As you can see, the overall process is very similar to that of the CMS collector. The difference is the filtering phase, which sorts the Regions by collection value and cost and then makes a collection plan based on the GC pause time the user expects.
conclusion
This article briefly described the principles of garbage collection and the types of garbage collectors. I hope you now have a deeper understanding of the questions raised at the beginning. In a production environment, we should choose a combination of garbage collectors according to the scenario. For a desktop application running in Client mode, Serial + Serial Old is more than enough. If you want fast response times and a good user experience, you can use ParNew + CMS. Even G1, which claims to "conquer everything", still needs its JVM parameters adjusted according to throughput and other requirements. There is no best technology, only the most appropriate use scenario. Remember that!
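Before tuning anything, it helps to confirm which collector combination a JVM is actually running. A minimal sketch using the standard java.lang.management API follows; the class name ActiveCollectors is made up, and the collector names printed depend on the JVM version and on the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors registered in the running JVM.
final class ActiveCollectors {

    // Names of the collectors as reported by the JVM itself.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time (ms) of collections since JVM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector flags is a quick way to see that the reported names change with the chosen combination.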
In the next part, we will get hands-on. We will run some demos together and do some experiments to verify the behavior described here. For example, objects are normally allocated in the new generation, so under what circumstances do they go directly to the old generation? What tools should be used to debug an OOM? Stay tuned!