No matter how unhappy you are now, believe that tomorrow will be better than today.
An overview of the
Garbage Collection, often referred to as “GC,” was created in 1960 in MIT’s Lisp language and has matured over half a century. In the JVM, program counters, virtual machine stacks, and local method stacks are created and removed with the thread. Stack frames are loaded and unloaded as methods enter and exit, realizing automatic memory cleaning. Therefore, our memory garbage collection is mainly concentrated in the Java heap and method area. This portion of memory is allocated and used dynamically.
Object survival judgment
Reference count: Each object has a reference count property. When a new reference is added, the count is increased by 1. When a reference is released, the count is decreased by 1. This method is simple and does not solve the problem of objects referring to each other circularly.
Reachability Analysis: Start with GC Roots and search down the path called the reference chain. When an object is not connected to GC Roots by any reference chain, the object is proved to be unavailable. Unreachable objects.
In the Java language, GC Roots includes:
Garbage collection algorithm
Mark-clear algorithm
The Mark-sweep algorithm is divided into two phases: Mark and Sweep:
-
First, mark all objects that need to be reclaimed.
-
All marked objects are uniformly recycled after marking is complete.
It has two major disadvantages:
-
Efficiency problem, marking and cleaning process is not efficient;
-
Space problems. A large number of discrete memory fragments are generated after the token is cleared. Too much space fragmentation may cause the program to be unable to find enough contiguous memory when it needs to allocate large objects later in the run and have to trigger another garbage collection action prematurely.
Replication algorithm
Algorithms for Copying collection:
-
It divides the available memory by capacity into two equally sized pieces, one of which is used at a time. When this area of memory is used up, the surviving objects are copied to the other area, and the used memory space is cleaned up again.
-
In this way, each time a piece of memory is reclaimed, memory allocation does not have to consider the complexity of memory fragmentation, as long as the heap top pointer is moved, in order to allocate memory, simple implementation, efficient operation.
Main disadvantages:
- The cost of this algorithm is to reduce memory by half, and the efficiency of continuously copying long-lived objects is reduced.
Mark-collation algorithm
The copy-collection algorithm will perform more replication operations when the object survival rate is high, and the efficiency will be low. More importantly, if you do not want to waste 50% of the space, you need to have extra space for allocation guarantee, in the extreme case that all objects in the memory used are 100% alive, so in the old days, this algorithm generally cannot be used directly.
According to the characteristics of the old s, someone put forward another “tag – finishing” (Mark – Compact) algorithm, the labeling still like “tag – clear” algorithm, but the subsequent steps are not directly to recyclable objects to clean up, but let all live objects move to the end, and then clear directly outside the boundary of the memory
Generational collection algorithm
The basic assumption of GC generation is that most objects have very short lifetimes.
A “Generational Collection” algorithm divides the Java heap into Generational and older generations so that the most appropriate Collection algorithm can be used for each generation. In the new generation, a large number of objects are found dead and only a few survive in garbage collection, so the replication algorithm is selected, and only a small amount of the replication cost of the surviving objects can be collected. In the old days, because the object has a high survival rate and there is no extra space to allocate it, it has to use the “mark-clean” or “mark-tidy” algorithm for recycling.
Garbage collector
Serial collector
Serial collectors are the oldest, most stable, and efficient collectors, and may produce long pauses and only use one thread to collect. The new generation and the old use serial recycling; New generation replication algorithm, old age marker – compression; – Garbage collection stops The World
Parameter control: -xx :+UseSerialGC serial collector
ParNew collector
The ParNew collector is essentially a multithreaded version of the Serial collector. New generation parallel, old age serial; New generation replication algorithm, old age marking – compression
Parameter control:
-xx :+UseParNewGC ParNew collector -xx :ParallelGCThreads limit the number of threads
The Parallel collector
The Parallel Collector is similar to the ParNew collector in that the Parallel collector focuses on the throughput of the system. You can set parameters to enable the adaptive adjustment policy. The vM collects performance monitoring information based on the current system running status and dynamically adjusts these parameters to provide the most appropriate pause time or maximum throughput. You can also use parameters to control how many milliseconds or percentages GC takes; New generation replication algorithm, old age marking – compression
Parameter control: -xx :+UseParallelGC using Parallel collector + old serial
CMS collector
The CMS (Concurrent Mark Sweep) collector is a collector whose goal is to obtain the shortest collection pause time. At present, a large part of Java applications are concentrated on the server side of Internet sites or B/S systems. These applications pay special attention to the response speed of services and hope that the system pause time is the shortest, so as to bring users a better experience.
As the name (which includes “Mark Sweep”) suggests, the CMS collector is based on a “mark-sweep” algorithm, which is more complex than the previous collectors. The process is divided into four steps, including:
- CMS Initial Mark
- CMS Concurrent Mark
- Re-marking (CMS Remark)
- CMS Concurrent sweep
The initial marking and re-marking steps still need to “Stop The World”. Initial marking only marks the objects directly associated with GC Roots, which is very fast. Concurrent marking is the process of GC Roots Tracing, while re-marking is to revise the marking records of those objects whose marks are changed due to the continuous operation of user programs during concurrent marking. The pause time in this phase is generally slightly longer than in the initial tagging phase, but much shorter than in concurrent tagging.
Since the collector thread can work with the user thread during the longest concurrent markup and concurrent cleanup, the CMS collector’s memory reclamation process is generally executed concurrently with the user thread. Old collector (New generation using ParNew)
Advantages: Concurrent collection, low pauses Disadvantages: large amount of space debris, concurrent phase can reduce throughput
Parameter control:
– XX – XX: + UseConcMarkSweepGC use CMS collector: + UseCMSCompactAtFullCollection Full GC, after a defragmentation; Finishing process is exclusive, causes the pause time – XX: + CMSFullGCsBeforeCompaction set for a few times after Full GC, -XX:ParallelCMSThreads sets the number of threads in your CMS (usually approximately equal to the number of available cpus)
G1 collector
G1 is one of the latest developments in technology and the HotSpot development team has assigned it the mission of replacing the CMS collector released in JDK1.5 in the future. Compared to the CMS collector, the G1 collector has the following features:
- Space consolidation, G1 collector uses the tag collation algorithm, does not generate memory space fragmentation. Allocating large objects does not trigger the next GC prematurely because contiguous space cannot be found.
- Predictable pauses, this is another big advantage of G1, reduce the pause time is the common concern of G1 and CMS, but G1 in addition to the pursuit of low pause, also can establish predictable pauses model, can let the user specify in a length of N segment within milliseconds, time on garbage collection may not consume more than N milliseconds, This is almost already a feature of the real-time Java (RTSJ) garbage collector.
The garbage collector mentioned above collects the entire Cenozoic or old generation, which is no longer the case with G1. When using the G1 collector, the memory layout of the Java heap is very different from that of the other collectors. It divides the entire Java heap into independent regions of equal size. While the concept of new generation and old generation is retained, the new generation and old generation are no longer physically separated. They are all collections of partial (possibly discontinuous) regions.
G1’s Cenozoic collection is similar to ParNew’s. When the Cenozoic occupation reaches a certain proportion, the collection starts. Like CMS, the G1 collector pauses briefly to collect old objects.
Collection steps:
-
The marking phase, initial-mark, stops the World Event and fires a normal Mintor GC. Corresponding GC log:GC pause (young) (Inital -mark)
-
Root Region Scanning. During program running, survivor Region (stored in a lifetime) will be reclaimed, which must be completed before the Young GC.
-
Concurrent Marking, the Concurrent Marking (and application execution) of the entire heap, may be interrupted by the Young GC. During the concurrent marking phase, if all objects in a region object are found to be garbage, the region is immediately reclaimed (X in figure). At the same time, the object activity (the percentage of living objects in the region) of each region is calculated during concurrent tagging.
-
Remark, then mark, there will be a short pause (STW). The re-mark phase is used to collect the concurrent mark phase and generate new garbage (the concurrent phase runs with the application); G1 uses a faster initial snapshot algorithm than CMS :snapshot-at-the-beginning (SATB).
-
Copy/Clean up, multithreading Clean up deactivated objects, STW. G1 copies the live objects of the reclaimed region to the new region, clears the Remember Sets, and simultaneously clears the reclaimed region and returns it to the free region linked list.
-
After the copy/clean process. The active objects in the recovery area have been concentrated in the dark blue and dark green areas.