Hello! Today I’m going to walk you through JVM garbage collection, so take out your little notebook and write it down.
1. JVM garbage collection
The Java virtual machine is divided into five main modules: the class loader subsystem, the runtime data areas, the execution engine, the native method interface, and the garbage collection module. The Java Virtual Machine specification does not mandate any particular garbage collection technique, but until unlimited memory is invented, practically every JVM implementation provides one.
The Java heap is the largest region of memory the JVM manages. It is shared by all threads and is the main area the garbage collector works on.
The virtual machine’s garbage collection mechanism is mature, and so is its dynamic memory allocation and reclamation. Most of the time we do not need to think about reclaiming memory at all; only the Java heap and the method area require us to deal with memory problems. The first step in reclaiming memory is to decide whether an object is alive or dead, mainly through the following two algorithms:
The first is the reference counting algorithm. It is simple to implement and cheap to evaluate, and a lot of software uses it, but mainstream Java virtual machines did not choose it. The core problem is that it struggles with objects that reference each other in a cycle: their counters never drop to zero even when nothing else can reach them.
The second is the reachability analysis algorithm. Its core idea is to decide whether an object is alive by checking whether it can be reached: starting from a set of objects called GC Roots, the collector traverses the references with a search algorithm, and any node the search never reaches is considered unreachable and can be reclaimed. This is the algorithm Java generally uses.
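To make the difference between the two algorithms concrete, here is a minimal sketch on a toy object graph. The ReachabilityDemo and Node names are made up for illustration and have nothing to do with HotSpot internals; the point is only that two objects referencing each other keep a non-zero reference count while a traversal from the GC Roots never reaches them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of an object graph; Node is a hypothetical stand-in for a heap object.
class ReachabilityDemo {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        int refCount = 0;                           // what a reference counter would track

        Node(String name) { this.name = name; }

        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    // Breadth-first traversal from the GC roots; anything not visited is garbage.
    static Set<Node> reachableFrom(List<Node> gcRoots) {
        Set<Node> visited = new HashSet<>(gcRoots);
        Deque<Node> queue = new ArrayDeque<>(gcRoots);
        while (!queue.isEmpty()) {
            Node current = queue.poll();
            for (Node next : current.refs) {
                if (visited.add(next)) { // add() returns true only for unseen nodes
                    queue.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node live = new Node("live");
        Node a = new Node("a");
        Node b = new Node("b");

        root.pointTo(live);
        a.pointTo(b); // a and b reference each other,
        b.pointTo(a); // but nothing reachable from root points at them

        Set<Node> reachable = reachableFrom(Arrays.asList(root));
        for (Node n : Arrays.asList(live, a, b)) {
            System.out.println(n.name + ": refCount=" + n.refCount
                    + ", reachable=" + reachable.contains(n));
        }
    }
}
```

In this sketch a and b both end up with refCount=1 yet are not reachable, so a pure reference counter would never free them, while reachability analysis would.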
2. Scope
- The JVM heap
- The young generation
- The old generation
- Metaspace
3. Classification
Current virtual machines use a generational collection algorithm. It is not a new idea in itself; it simply divides memory into several blocks according to how long objects live. The Java heap is generally divided into the young generation and the old generation, so that an appropriate garbage collection algorithm can be chosen for the characteristics of each generation.
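If you want to see how your own JVM splits up its memory, the standard java.lang.management API can list the memory pools. This is only a small sketch (the MemoryPoolsDemo class name is made up), and the pool names it prints depend on the JDK version and on which collector is active.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MemoryPoolsDemo {
    // Formats one memory pool as "name [type] used=N KB".
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
        long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
        return pool.getName() + " [" + pool.getType() + "] used=" + usedKb + " KB";
    }

    public static void main(String[] args) {
        // Heap pools typically correspond to the young and old generation regions;
        // non-heap pools include areas such as Metaspace.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```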
4. Garbage collection algorithms
4.1 Mark-copy algorithm
To solve the efficiency problem, the “copying” collection algorithm emerged. It divides memory into two equally sized halves and uses only one of them at a time. When that half is used up, the surviving objects are copied to the other half, and the used half is then cleaned up in one go. In this way, each collection reclaims one half of the memory range.
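The copying step can be sketched with a toy heap in which every slot holds an object id. Everything here (the CopyingDemo class, the EMPTY marker, the liveIds set standing in for the result of marking) is an assumption made for illustration, not how a real collector lays out memory.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    static final int EMPTY = 0; // hypothetical marker for a free slot

    // Copies every live object id from fromSpace into a fresh to-space of the
    // same size, packing the survivors at the start. After the copy, the whole
    // from-space can be treated as free.
    static int[] copySurvivors(int[] fromSpace, Set<Integer> liveIds) {
        int[] toSpace = new int[fromSpace.length];
        int next = 0; // next free slot in to-space
        for (int id : fromSpace) {
            if (id != EMPTY && liveIds.contains(id)) {
                toSpace[next++] = id;
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        int[] fromSpace = {1, 2, 3, 4, 5, 6};
        Set<Integer> liveIds = new HashSet<>(Arrays.asList(2, 5));
        int[] toSpace = copySurvivors(fromSpace, liveIds);
        System.out.println(Arrays.toString(toSpace)); // prints [2, 5, 0, 0, 0, 0]
    }
}
```

Notice that the survivors end up next to each other, so there is no fragmentation; the price is that only half of the memory is usable at any time.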
4.2 Mark-sweep algorithm
The algorithm is divided into a “mark” stage and a “sweep” stage: mark the surviving objects and then reclaim all unmarked objects in one pass (this is the usual choice). Alternatively, mark all the objects that need to be reclaimed and reclaim all marked objects once marking is complete. It is the most basic collection algorithm and is relatively simple, but it has two obvious problems:
- Efficiency (marking is slow when there are too many objects to mark)
- Space (sweeping leaves a large number of discontinuous fragments)
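The space problem is easy to see on a toy heap. The sketch below assumes a made-up MarkSweepDemo class in which each slot holds an object id and the marked set is the output of the mark stage; sweeping frees the unmarked slots in place, so the survivors do not move.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    static final int FREE = 0; // hypothetical marker for a free slot

    // Sweep stage: free every slot whose object was not marked. Survivors stay put.
    static void sweep(int[] heap, Set<Integer> marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != FREE && !marked.contains(heap[i])) {
                heap[i] = FREE;
            }
        }
    }

    // Length of the longest run of adjacent free slots, i.e. the largest
    // object that could still be allocated without moving anything.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        Set<Integer> marked = new HashSet<>(Arrays.asList(2, 4, 6));
        sweep(heap, marked);
        System.out.println(Arrays.toString(heap));   // prints [0, 2, 0, 4, 0, 6]
        System.out.println(largestFreeRun(heap));    // prints 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object needing two slots could not be placed.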
4.3 Mark-compact algorithm
The marking process is the same as in the mark-sweep algorithm, but instead of reclaiming the dead objects in place, the next step moves all surviving objects towards one end of the memory and then frees everything beyond the end boundary.
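Here is a minimal sketch of the compaction step on a toy heap of object ids. The MarkCompactDemo class, the FREE marker and the marked set are all assumptions for illustration; a real collector also has to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    static final int FREE = 0; // hypothetical marker for a free slot

    // Slides every marked object towards index 0, then frees everything beyond
    // the boundary. Returns the boundary (index of the first free slot).
    static int compact(int[] heap, Set<Integer> marked) {
        int next = 0; // next slot a survivor will be moved into
        for (int i = 0; i < heap.length; i++) {
            int id = heap[i];
            if (id != FREE && marked.contains(id)) {
                heap[next++] = id; // next <= i, so nothing unread is overwritten
            }
        }
        Arrays.fill(heap, next, heap.length, FREE);
        return next;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        Set<Integer> marked = new HashSet<>(Arrays.asList(2, 4, 6));
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap)); // prints [2, 4, 6, 0, 0, 0]
        System.out.println(boundary);              // prints 3
    }
}
```

All free memory now sits in one contiguous block after the boundary, at the cost of moving the surviving objects.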
5. Garbage collectors
5.1 Serial Collector (-XX:+UseSerialGC -XX:+UseSerialOldGC)
The Serial collector is the most basic and the oldest garbage collector. As the name suggests, it is a single-threaded collector. “Single-threaded” means not only that it uses a single garbage collection thread to do the work, but also that it must pause all other worker threads (“Stop The World”) until the collection is complete.
The young generation uses the copying algorithm and the old generation uses the mark-compact algorithm.
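A handy way to check which collectors a running JVM is actually using, whatever flags it was started with, is the standard GarbageCollectorMXBean API. The sketch below (the CollectorInfoDemo class name is made up) prints one line per collector; the names it reports vary with the collector and the JDK version, so treat the output as informational.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class CollectorInfoDemo {
    // Formats one collector as "name: collections=N, timeMs=T".
    // Both counters may be -1 if the JVM does not track them.
    static String describe(GarbageCollectorMXBean gc) {
        return gc.getName() + ": collections=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime();
    }

    public static void main(String[] args) {
        // Usually one bean for the young generation and one for the old generation.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(describe(gc));
        }
    }
}
```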
5.2 Parallel Scavenge Collector (-XX:+UseParallelOldGC)
The Parallel collector is essentially a multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior (control parameters, collection algorithms, collection strategies, and so on) is similar to that of the Serial collector. By default the number of collection threads equals the number of CPU cores. You can set it with the -XX:ParallelGCThreads parameter, but changing it is generally not recommended.
The Parallel Scavenge collector focuses on throughput (efficient use of the CPU), whereas collectors such as CMS focus on the pause times seen by user threads (a better user experience). Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed. The Parallel Scavenge collector provides a number of parameters for finding the most suitable pause time or the maximum throughput, and if you are not familiar with how the collector operates, leaving memory management tuning to the virtual machine is a good option. The young generation uses the copying algorithm and the old generation uses the mark-compact algorithm.
The Parallel Old collector is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm. Where throughput and CPU resources matter most, Parallel Scavenge and Parallel Old (the default young- and old-generation collectors in JDK 8) are worth considering first.
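The throughput definition above is just a ratio, and a tiny worked example may help. The ThroughputDemo class and the numbers are made up for illustration: 99 seconds of user code and 1 second of garbage collection.

```java
class ThroughputDemo {
    // throughput = time running user code / (time running user code + GC time)
    static double throughput(long userMillis, long gcMillis) {
        return (double) userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // 99 s of user code and 1 s of GC out of 100 s in total
        System.out.println(throughput(99_000, 1_000)); // prints 0.99
    }
}
```

A collector tuned for throughput tries to push this ratio up, even if individual pauses get longer; a collector tuned for low pauses accepts a lower ratio in exchange for shorter pauses.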
5.3 ParNew Collector (-XX:+UseParNewGC)
The ParNew collector is similar to the Parallel collector; the main difference is that it can be used together with the CMS collector.
The young generation uses the copying algorithm and the old generation uses the mark-compact algorithm.
It is the first choice for many virtual machines running in Server mode, and apart from the Serial collector it is the only young-generation collector that works with the CMS collector (a truly concurrent collector, described below).
5.4 CMS Collector (-XX:+UseConcMarkSweepGC (old generation))
The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause. It is well suited to applications where user experience matters. It is the HotSpot virtual machine’s first truly concurrent collector, letting the garbage collection threads work (basically) at the same time as the user threads.
As the words Mark Sweep in its name imply, the CMS collector is built on the mark-sweep algorithm, and its operation is somewhat more complex than that of the previous collectors. The whole process is divided into the following steps:
- Initial mark: Pause all other threads (STW) and record the objects that GC Roots reference directly. This is fast.
- Concurrent marking: Traverse the entire object graph starting from the objects directly referenced by GC Roots. This phase takes a long time but does not pause the user threads; they run concurrently with the garbage collection threads. Because the user program keeps running, the state of already marked objects may change.
- Remark: Correct the mark records of the objects whose marks changed because the user program kept running during concurrent marking. The pause in this phase is usually slightly longer than in the initial mark and much shorter than the concurrent marking phase. Re-marking mainly relies on the incremental update algorithm from tri-color marking (see the details below).
- Concurrent sweep: The user threads keep running while the GC threads sweep the unmarked regions. Any object newly allocated in this phase is marked black and left untouched (see below for a detailed explanation of the tri-color marking algorithm).
- Concurrent reset: Reset the marking data used during this GC.
As its name suggests, it is an excellent garbage collector whose main advantages are concurrent collection and low pauses. But it has the following obvious disadvantages:
- It is sensitive to CPU resources (it competes with the application for them).
- It cannot handle floating garbage (garbage produced during the concurrent marking and concurrent sweep phases, which can only be cleaned up by the next GC).
- The mark-sweep algorithm it uses leaves a lot of memory fragmentation once collection finishes. The parameter -XX:+UseCMSCompactAtFullCollection lets the JVM compact the heap after the mark-sweep.
- Its execution is not fully predictable: if a new collection is triggered before the previous one has finished, especially during the concurrent marking and concurrent sweep phases, a “concurrent mode failure” occurs. The JVM then stops the world and falls back to the Serial Old collector.
5.5 CMS Core Parameters
-XX:+UseConcMarkSweepGC: enables CMS
-XX:ConcGCThreads: the number of concurrent GC threads
-XX:+UseCMSCompactAtFullCollection: compact the heap after a Full GC (to reduce fragmentation)
-XX:CMSFullGCsBeforeCompaction: how many Full GCs to run before compacting; the default is 0, meaning the heap is compacted after every Full GC
-XX:CMSInitiatingOccupancyFraction: the old-generation occupancy, as a percentage, at which a CMS collection of the old generation is triggered (the default is 92)
-XX:+UseCMSInitiatingOccupancyOnly: use only the configured threshold (the value of -XX:CMSInitiatingOccupancyFraction); if it is not specified, the JVM uses the configured value only for the first collection and adjusts it automatically afterwards
-XX:+CMSScavengeBeforeRemark: run a Minor GC before the CMS remark phase. CMS marking also has to cover the young generation, so if the Minor GC removes many garbage objects first, the marking phase, which typically takes up to 80% of the CMS GC time, becomes cheaper
-XX:+CMSParallelInitialMarkEnabled: run the initial mark with multiple threads to shorten the STW pause
-XX:+CMSParallelRemarkEnabled: run the remark with multiple threads to shorten the STW pause
6. Implementation of the underlying algorithm of garbage collection
- Tri-color marking
During concurrent marking the application threads keep running, so references between objects may change, and objects may end up over-marked or missed.
This is where “tri-color marking” comes in. The objects encountered while traversing the graph from GC Roots during reachability analysis are given one of the following three colors, depending on whether they have been visited:
- Black: The object has been visited by the garbage collector and all references it holds have been scanned. A black object is fully scanned and is known to be alive. If another reference points to a black object, it does not need to be scanned again. A black object cannot point directly to a white object (without going through a gray object).
- Gray: The object has been visited by the garbage collector, but at least one reference it holds has not been scanned yet.
- White: The object has not been visited by the garbage collector. At the beginning of the reachability analysis all objects are white. Any object that is still white at the end of the analysis is unreachable.
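The three colors, and the way a missed mark can happen, can be sketched in a few lines. Everything below (the TriColorDemo class, the write method acting as a write barrier, re-graying the holder) is a simplified assumption for illustration: it mimics the idea of incremental update, where a black object that gains a reference to a white object is recorded for re-scanning, and is not how CMS is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TriColorDemo {
    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE; // every object starts white
        Obj(String name) { this.name = name; }
    }

    private final Deque<Obj> grayStack = new ArrayDeque<>();
    private final boolean incrementalUpdate;

    TriColorDemo(boolean incrementalUpdate) { this.incrementalUpdate = incrementalUpdate; }

    // white -> gray: the object is known to be alive but not scanned yet
    void shade(Obj o) {
        if (o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayStack.push(o);
        }
    }

    // Scans one gray object: shades its children, then turns it black.
    boolean markStep() {
        Obj o = grayStack.poll();
        if (o == null) return false;
        for (Obj child : o.refs) shade(child);
        o.color = Color.BLACK;
        return true;
    }

    void drain() { while (markStep()) { } }

    // Application write "holder.field = target" with a simplified barrier:
    // a black holder that gains a white target is turned gray again for re-scanning.
    void write(Obj holder, Obj target) {
        holder.refs.add(target);
        if (incrementalUpdate && holder.color == Color.BLACK && target.color == Color.WHITE) {
            holder.color = Color.GRAY;
            grayStack.push(holder);
        }
    }

    // Returns the final color of C, which stays reachable through A the whole time.
    static Color simulate(boolean incrementalUpdate) {
        TriColorDemo gc = new TriColorDemo(incrementalUpdate);
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        b.refs.add(c);     // before marking: B -> C
        gc.shade(b);       // initial mark: roots A and B turn gray
        gc.shade(a);       // A is now on top of the stack
        gc.markStep();     // concurrent marking scans A first: A becomes black
        gc.write(a, c);    // meanwhile the application moves the reference to C
        b.refs.remove(c);  // from B to A
        gc.drain();        // rest of marking, including any re-scanned objects
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println("without barrier: C is " + simulate(false)); // WHITE (missed mark)
        System.out.println("with barrier:    C is " + simulate(true));  // BLACK
    }
}
```

Without the barrier, C ends up white even though A still references it, which is exactly the missed mark that would free a live object; with the barrier, A is scanned again and C is marked.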
Summary
JVM tuning is mainly about preventing Full GC, shortening STW pauses, and avoiding OOM. All of this depends on the garbage collection mechanism, and specifically on the garbage collector. Collectors come in many kinds (single-threaded, parallel, concurrent) and use different collection algorithms, so tuning differs from one to the next; to tune in depth you must understand the underlying mechanisms.
Well, that is the end of today’s article. I hope it helps those of you puzzling over this in front of your screens.