Overview
- Garbage collection (Garbage Collection), often abbreviated "GC", was born in 1960 with the Lisp language at MIT and, after more than half a century, is now very mature.
- In the JVM, the program counter, the virtual machine stack, and the native method stack are created and destroyed together with their thread. Stack frames are pushed and popped as methods are entered and exited, so this memory is cleaned up automatically. Garbage collection therefore focuses mainly on the Java heap and the method area, where memory is allocated and used dynamically while the program runs.
Determination of object survival
There are two ways to determine whether an object is alive or not:
- Reference counting: each object has a reference count property. When a new reference is added, the count is increased by 1; when a reference is released, the count is decreased by 1. An object whose count reaches 0 can be reclaimed. This method is simple, but it cannot handle objects that refer to each other in a cycle.
- Reachability analysis (Reachability Analysis): the search starts from the GC Roots and works downward, and the path the search takes is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable and is proved to be unavailable.
In the Java language, the GC Roots include:
- Objects referenced from the virtual machine stack.
- Objects referenced by static class fields in the method area.
- Objects referenced by constants in the method area.
- Objects referenced by JNI in the native method stack.
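The reachability analysis described above can be sketched as a graph walk. This is only a toy model, not how a real JVM represents objects: the `Node` class, the `reachableFrom` method, and the object names are all hypothetical. It also shows why this approach handles the circular references that defeat reference counting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    /** Hypothetical stand-in for a heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node that can be reached from the roots by following references. */
    static Set<Node> reachableFrom(List<Node> roots) {
        // Identity-based set: two distinct objects are never treated as the same node.
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {        // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.refs.add(b);   // a -> b, and a is a root
        c.refs.add(d);   // c and d reference each other,
        d.refs.add(c);   // but no root leads to them

        Set<Node> live = reachableFrom(Arrays.asList(a));
        for (Node n : Arrays.asList(a, b, c, d)) {
            System.out.println(n.name + " reachable: " + live.contains(n));
        }
    }
}
```

In this sketch `c` and `d` each still hold a reference to the other, so a reference count would never drop to zero, yet neither appears in the reachable set.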
Garbage collection algorithm
Mark-sweep
The mark-sweep algorithm, as its name suggests, is divided into two phases: "mark" and "sweep". It first marks all objects that need to be reclaimed and, once marking is complete, sweeps them all. It is the most basic collection algorithm because the later collection algorithms build on this idea and address its shortcomings.
It has two main disadvantages. The first is efficiency: neither the marking nor the sweeping process is efficient. The second is space: sweeping leaves behind a large number of discontiguous memory fragments. With too much fragmentation, the program may be unable to find enough contiguous memory when it later needs to allocate a large object, and it has to trigger another garbage collection early.
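The fragmentation problem can be illustrated with a toy heap of fixed-size slots. This is a minimal sketch under simplifying assumptions (one object per slot, a `boolean[]` standing in for memory); the class and method names are hypothetical.

```java
class MarkSweepDemo {
    /** Sweep phase: frees every slot that was not marked. Survivors stay where they are. */
    static void sweep(boolean[] occupied, boolean[] marked) {
        for (int i = 0; i < occupied.length; i++) {
            if (!marked[i]) {
                occupied[i] = false;
            }
        }
    }

    /** Total number of free slots in the heap. */
    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = {true, true, true, true, true, true, true, true};
        // Assume the mark phase found every other object alive.
        boolean[] marked = {true, false, true, false, true, false, true, false};
        sweep(occupied, marked);
        System.out.println("free slots: " + freeSlots(occupied));
        System.out.println("largest contiguous run: " + largestFreeRun(occupied));
    }
}
```

After the sweep, half of the toy heap is free, but no two free slots are adjacent, so an object that needs two contiguous slots cannot be allocated without another collection.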
Copying algorithm
The copying (Copying) algorithm divides the available memory into two equally sized halves and uses only one half at a time. When that half is used up, the surviving objects are copied to the other half, and the used half is then cleaned up in one pass.
Because a whole half is reclaimed each time, memory allocation does not have to deal with fragmentation: it only needs to move the heap-top pointer and allocate memory in order, which is simple to implement and efficient to run. The cost of this algorithm is that usable memory is cut in half, and repeatedly copying long-lived objects lowers efficiency.
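A minimal sketch of the two-halves scheme, assuming objects are plain integer IDs and the set of live IDs is supplied by the caller (a real collector would find it by tracing from the GC Roots). The class `SemispaceDemo` and its methods are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.Set;

class SemispaceDemo {
    private int[] from;    // active half: new objects are allocated here
    private int[] to;      // idle half: only used while collecting
    private int top = 0;   // heap-top pointer: next free slot in the active half

    SemispaceDemo(int halfSize) {
        from = new int[halfSize];
        to = new int[halfSize];
    }

    /** Bump-pointer allocation: returns the slot index, or -1 if the active half is full. */
    int allocate(int objectId) {
        if (top == from.length) {
            return -1;
        }
        from[top] = objectId;
        return top++;
    }

    /** Copies live objects into the idle half, then swaps the roles of the two halves. */
    void collect(Set<Integer> liveIds) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (liveIds.contains(from[i])) {
                to[newTop++] = from[i];   // survivors end up packed together
            }
        }
        int[] swap = from;
        from = to;
        to = swap;
        top = newTop;                      // everything above newTop is free again
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        SemispaceDemo heap = new SemispaceDemo(2);
        heap.allocate(1);
        heap.allocate(2);
        System.out.println("third allocation: " + heap.allocate(3));
        heap.collect(Collections.singleton(2));   // only object 2 survives
        System.out.println("used after collect: " + heap.used());
        System.out.println("retry allocation: " + heap.allocate(3));
    }
}
```

Allocation is a single bounds check and pointer increment, and the survivors are contiguous after each collection. The price is visible in the constructor: two arrays are reserved, but only one is ever available for allocation.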
Mark-compact algorithm
The copying algorithm performs more copy operations when the object survival rate is high, and its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need extra space as an allocation guarantee to cope with the extreme case in which all objects in the used memory are 100% alive. For this reason, the copying algorithm generally cannot be used directly in the old generation.
Based on the characteristics of the old generation, the "mark-compact" (Mark-Compact) algorithm was proposed. Its marking phase is the same as in the "mark-sweep" algorithm, but the next step does not clean up the reclaimable objects directly. Instead, all live objects are moved toward one end, and the memory beyond the boundary is then cleared directly.
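The compaction step can be sketched on the same kind of toy heap. This assumes an `int[]` heap in which 0 means "empty slot" and a `boolean[]` of mark bits produced by an earlier mark phase; `MarkCompactDemo` and `compact` are hypothetical names.

```java
import java.util.Arrays;

class MarkCompactDemo {
    /**
     * Slides every marked object toward index 0, clears everything beyond
     * the resulting boundary, and returns that boundary.
     * A value of 0 in the heap array denotes an empty slot.
     */
    static int compact(int[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && marked[i]) {
                heap[boundary++] = heap[i];   // move the survivor down
            }
        }
        Arrays.fill(heap, boundary, heap.length, 0);   // clear memory past the boundary
        return boundary;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] marked = {true, false, true, false, false, true};
        int boundary = compact(heap, marked);
        System.out.println("heap: " + Arrays.toString(heap));
        System.out.println("boundary: " + boundary);
    }
}
```

Unlike the copying sketch, no second half is reserved, and unlike mark-sweep, the free space ends up as one contiguous block after the boundary. A real collector also has to update every reference to a moved object, which this sketch leaves out.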
Generational collection
The basic assumption of GC generation is that most objects have very short lifetimes.
The "generational collection" (Generational Collection) algorithm divides the Java heap into a young generation and an old generation so that the most appropriate collection algorithm can be used for each. In the young generation, most objects are found dead at each collection and only a few survive, so the copying algorithm is chosen: a collection costs only the copying of a small number of survivors. In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so the "mark-sweep" or "mark-compact" algorithm must be used.
Garbage collector
If the collection algorithms are the methodology of memory reclamation, the garbage collectors are its concrete implementation.
Serial collector
- The Serial collector is the oldest and most stable collector and is efficient, but it uses only one thread to collect and may produce long pauses. Both the young and the old generation are collected serially; the young generation uses the copying algorithm and the old generation uses mark-compact. The application is paused (Stop The World) during garbage collection.
- Parameter control: -XX:+UseSerialGC selects the Serial collector.
ParNew collector
- The ParNew collector is simply the multithreaded version of the Serial collector. The young generation is collected in parallel and the old generation serially; the young generation uses the copying algorithm and the old generation uses mark-compact.
- Parameter control: -XX:+UseParNewGC selects the ParNew collector; -XX:ParallelGCThreads limits the number of threads.
Parallel collector
- The Parallel Scavenge collector is similar to the ParNew collector, but it is more concerned with the throughput of the system. You can set a parameter to enable the adaptive adjustment policy: the VM collects performance monitoring information based on the current running state of the system and dynamically adjusts its parameters to provide the most appropriate pause time or the maximum throughput. You can also use parameters to bound GC pause time in milliseconds or the percentage of time spent in GC. The young generation uses the copying algorithm and the old generation uses mark-compact.
- Parameter control: -XX:+UseParallelGC uses the Parallel collector with a serial old generation.
Parallel Old collector
- Parallel Old is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm. This collector only became available in JDK 1.6.
- Parameter control: -XX:+UseParallelOldGC uses the Parallel collector with a parallel old generation.
CMS collector
CMS (Concurrent Mark Sweep) is a collector whose goal is the shortest possible collection pause time. A large share of Java applications run on the server side of Internet sites or B/S systems. These applications pay special attention to service response time and want system pauses to be as short as possible, to give users a better experience.
As the name (which contains "Mark Sweep") shows, the CMS collector is based on the "mark-sweep" algorithm. Its operation is more complex than that of the previous collectors, and the whole process is divided into four steps:
- Initial mark (CMS initial mark)
- Concurrent mark (CMS concurrent mark)
- Remark (CMS remark)
- Concurrent sweep (CMS concurrent sweep)
The initial mark and remark steps still need to "Stop The World". The initial mark only marks the objects directly associated with the GC Roots, which is very fast. The concurrent mark is the GC Roots tracing process. The remark corrects the marking records of objects whose marks changed because the user program kept running during the concurrent mark. The pause in this phase is generally slightly longer than in the initial mark phase, but much shorter than the duration of the concurrent mark.
Since the collector threads can work alongside the user threads during concurrent marking and concurrent sweeping, the two longest steps, the CMS collector's memory reclamation is, on the whole, executed concurrently with the user threads. CMS is an old-generation collector (the young generation uses ParNew).
Advantages: concurrent collection, low pauses. Disadvantages: heavy space fragmentation; the concurrent phases can reduce throughput.
Parameter control:
- -XX:+UseConcMarkSweepGC uses the CMS collector.
- -XX:+UseCMSCompactAtFullCollection defragments after a Full GC. Compaction is exclusive, so it lengthens the pause.
- -XX:CMSFullGCsBeforeCompaction sets how many Full GCs run before a compaction is performed.
- -XX:ParallelCMSThreads sets the number of CMS threads (usually approximately equal to the number of available CPUs).
G1 collector
G1 is one of the most advanced results of current collector technology, and the HotSpot development team gave it the mission of eventually replacing the CMS collector released in JDK 1.5. Compared to the CMS collector, the G1 collector has the following features:
- Space compaction: the G1 collector uses the mark-compact algorithm and does not produce memory fragmentation. Allocating a large object will not trigger the next GC early because contiguous space cannot be found.
- Predictable pauses: this is another big advantage of G1. Reducing pause time is a concern shared by G1 and CMS, but in addition to pursuing low pauses, G1 can build a predictable pause-time model. It lets the user specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection. This is almost a feature of a real-time Java (RTSJ) garbage collector.
The collectors described above collect the entire young generation or the entire old generation; this is no longer the case with G1. When the G1 collector is used, the memory layout of the Java heap is very different from that of the other collectors. It divides the entire Java heap into multiple independent regions of equal size. The concepts of young generation and old generation are retained, but the two are no longer physically separated: each is a collection of regions that need not be contiguous.
Collection steps:
- Initial mark (Initial-Mark): the marking phase starts with the initial mark. This stage is a pause (Stop the World Event) and triggers a normal Minor GC. The corresponding GC log entry: GC pause (young) (initial-mark)
- Root Region Scanning: while the program is running, the survivor regions are scanned. This must be completed before the next Young GC.
- Concurrent Marking: marking runs across the whole heap concurrently with the application, and may be interrupted by a Young GC. During the concurrent marking phase, if all objects in a region are found to be garbage, the region is reclaimed immediately (marked X in the figure). At the same time, the liveness of each region (the percentage of live objects in the region) is calculated.
- Remark: there is a short pause (STW). The remark phase handles the garbage newly produced during the concurrent marking phase (which runs alongside the application). G1 uses the snapshot-at-the-beginning (SATB) algorithm, which is faster than the one used in CMS.
- Copy/Clean up: multiple threads remove the dead objects, with an STW pause. G1 copies the live objects of the reclaimed regions to new regions, clears the Remembered Sets, and at the same time empties the reclaimed regions and returns them to the free region list.
After the copy/clean up process, the live objects of the reclaimed regions have been concentrated in the dark blue and dark green regions of the figure.
Common collector combinations
Combination | Young generation GC policy | Old generation GC policy | Description |
---|---|---|---|
Combination 1 | Serial | Serial Old | Serial and Serial Old are both single-threaded GCs and are characterized by suspending all application threads during GC. |
Combination 2 | Serial | CMS+Serial Old | CMS (Concurrent Mark Sweep) is a concurrent GC that lets GC threads and application threads work concurrently without suspending all application threads. In addition, when CMS fails to perform a GC, the Serial Old policy is automatically used for GC. |
Combination 3 | ParNew | CMS | Use the -XX:+UseParNewGC option to enable it. ParNew is the parallel version of Serial. You can specify the number of GC threads, which by default is the number of CPUs, with the -XX:ParallelGCThreads option. If the -XX:+UseConcMarkSweepGC option is specified, the young generation uses the ParNew GC policy by default. |
Combination 4 | ParNew | Serial Old | Use the -XX:+UseParNewGC option to enable it. The young generation uses the ParNew GC policy, and the old generation uses the Serial Old GC policy by default. |
Combination 5 | Parallel Scavenge | Serial Old | The Parallel Scavenge policy focuses on achieving a controllable throughput: application run time / (application run time + GC time). It keeps CPU utilization as high as possible, and suits long-running background applications rather than applications with a lot of interaction. |
Combination 6 | Parallel Scavenge | Parallel Old | Parallel Old is a parallel version of Serial Old. |
Combination 7 | G1GC | G1GC | -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC (enable G1); -XX:MaxGCPauseMillis=50 (pause time target); -XX:GCPauseIntervalMillis=200 (pause interval target); -XX:+G1YoungGenSize=512m (young generation size); -XX:SurvivorRatio=6 (survivor ratio) |
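To check which combination a running JVM actually ended up with, one option is the standard `java.lang.management` API. The sketch below only lists whatever collector beans the JVM reports; the class name `ActiveCollectors` is hypothetical, and the bean names are JVM-specific labels (on HotSpot they are typically one name for the young generation collector and one for the old generation collector), so treat them as informational.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Names of the garbage collectors registered in the current JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "undefined".
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", time=" + bean.getCollectionTime() + " ms");
        }
    }
}
```

Running the same class with different options from the table, for example `java -XX:+UseSerialGC ActiveCollectors` and then `java -XX:+UseParallelGC ActiveCollectors`, should print different collector names, provided the JDK in use still supports the chosen option.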
JDK8 version
In JDK 8, the permanent generation that held class metadata was removed and replaced by Metaspace, which lives in native memory and therefore does not occupy the heap. It can grow automatically, which avoids the permanent generation error common in JDK 7 and earlier versions (java.lang.OutOfMemoryError: PermGen space).
JDK 8 also provides a new parameter to limit the size of Metaspace: -XX:MaxMetaspaceSize=128m
Note: if this parameter is not set, the JVM grows the Metaspace automatically according to certain policies. If you set the Metaspace too small, your application may get the following error:
java.lang.OutOfMemoryError: Metaspace
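One way to observe this non-heap memory at run time is to list the memory pools the JVM exposes through `java.lang.management`. This is a hedged sketch: the class `NonHeapPools` and its `describe` helper are hypothetical, and the pool names are JVM-specific (HotSpot on JDK 8 and later usually reports a pool named "Metaspace", but other JVMs may not).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class NonHeapPools {
    /** One-line summary of a memory pool's current usage. */
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage();
        if (usage == null) {                 // the pool is no longer valid
            return pool.getName() + ": unavailable";
        }
        long max = usage.getMax();           // -1 means no maximum is defined
        return pool.getName() + ": used=" + usage.getUsed() + " bytes, max="
                + (max < 0 ? "undefined" : max + " bytes");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                System.out.println(describe(pool));
            }
        }
    }
}
```

If a maximum has been set with -XX:MaxMetaspaceSize, it should show up as the `max` value of the corresponding pool; without it, the maximum is typically reported as undefined.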
Syntax rules for unstable (-XX) parameters:
- Boolean parameters: -XX:+<option> ('+' enables the option) and -XX:-<option> ('-' disables the option).
- Numeric parameters: -XX:<option>=<number> sets the option to a numeric value. The number can be followed by a unit: 'm' or 'M' for megabytes; 'k' or 'K' for kilobytes; 'g' or 'G' for gigabytes. 32k is the same size as 32768.
- String parameters: -XX:<option>=<string> sets the option to a string value, usually used to specify a file, a path, or a list of commands. For example: -XX:HeapDumpPath=./dump.core
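The unit rule for numeric values can be made concrete with a small parser. This is only an illustration of the arithmetic behind "32k is the same size as 32768", not the JVM's own option parser; `SizeOption` and `parseSize` are hypothetical names, and the input is assumed to be a non-empty, well-formed value.

```java
class SizeOption {
    /** Converts a value such as "32k", "128M" or "2g" into a number of bytes. */
    static long parseSize(String text) {
        char last = Character.toLowerCase(text.charAt(text.length() - 1));
        long multiplier;
        switch (last) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default:  return Long.parseLong(text);   // no unit: plain number
        }
        String digits = text.substring(0, text.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    public static void main(String[] args) {
        System.out.println(parseSize("32k"));    // 32 * 1024
        System.out.println(parseSize("128M"));   // 128 * 1024 * 1024
    }
}
```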