1 GC classification and performance metrics
- The JVM specification does not prescribe how garbage collectors are implemented, so different vendors and different JVM versions can implement them differently.
- Because JDK versions iterate quickly, Java has accumulated many GC implementations to date.
- Looking at garbage collectors from different angles yields different classifications.
1.1 By number of threads: serial and parallel garbage collectors
- Serial collection means that only one CPU performs garbage collection at any given time, and the application's worker threads are suspended until the collection is complete
- On modest hardware, such as a single-CPU machine or an application with little memory, a serial collector can outperform parallel and concurrent collectors. Serial collection is therefore the default for a JVM running in client mode
- On machines with more CPUs, parallel collectors produce shorter pause times than serial collectors
- In contrast to serial collection, parallel collection uses multiple CPUs to collect garbage at the same time, which improves application throughput. Like serial collection, however, parallel collection is exclusive and relies on the Stop-the-World (STW) mechanism
1.2 By working mode: concurrent and exclusive garbage collectors
- A concurrent garbage collector works alternately with application threads to minimize application pause time.
- An exclusive (Stop-the-World) garbage collector, once running, stops all user threads in the application until the garbage collection process is complete.
1.3 By fragmentation handling: compacting and non-compacting garbage collectors
- A compacting garbage collector compacts the surviving objects after a collection to eliminate the fragments left by reclaimed objects.
- Memory for new objects is then allocated by bumping a pointer (bump-the-pointer)
- A non-compacting garbage collector does not perform this step.
- Memory for new objects is then allocated from a free list
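The two allocation schemes can be contrasted with a toy model. This is only a sketch and not HotSpot code: the names `AllocationSketch`, `BumpAllocator`, and `FreeListAllocator` are hypothetical, addresses are plain integers, and the free list uses a simple first-fit search.

```java
import java.util.ArrayList;
import java.util.List;

class AllocationSketch {
    /** Bump-the-pointer: usable only when the free space is one contiguous block. */
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0; // next free address

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        /** Returns the start address of the new object, or -1 if it does not fit. */
        int allocate(int size) {
            if (top + size > capacity) {
                return -1;
            }
            int address = top;
            top += size; // "bump" the pointer past the new object
            return address;
        }
    }

    /** Free list: first-fit search over scattered free chunks. */
    static final class FreeListAllocator {
        // each entry is {start, length}
        private final List<int[]> freeChunks = new ArrayList<>();

        void addFreeChunk(int start, int length) {
            freeChunks.add(new int[] {start, length});
        }

        /** Returns the start address of the new object, or -1 if no chunk is large enough. */
        int allocate(int size) {
            for (int i = 0; i < freeChunks.size(); i++) {
                int[] chunk = freeChunks.get(i);
                if (chunk[1] >= size) {
                    int address = chunk[0];
                    chunk[0] += size;
                    chunk[1] -= size;
                    if (chunk[1] == 0) {
                        freeChunks.remove(i); // chunk fully used
                    }
                    return address;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(100);
        System.out.println("bump: " + bump.allocate(30) + ", " + bump.allocate(30));

        FreeListAllocator list = new FreeListAllocator();
        list.addFreeChunk(0, 10);
        list.addFreeChunk(50, 40);
        System.out.println("free list: " + list.allocate(20));
    }
}
```

Bump allocation is a single addition, which is why compaction pays off at allocation time; the free list has to search, but it works on fragmented memory.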
1.4 By memory area: young-generation and old-generation garbage collectors
1.5 Metrics for evaluating GC performance
Throughput
- The percentage of total elapsed time spent running user code
- Total elapsed time = program running time + memory reclamation time
- Throughput first means that the total STW time per unit of time is as short as possible
Garbage collection overhead
- The complement of throughput: the ratio of garbage collection time to total elapsed time
Pause time
- The time the program's worker threads are suspended while garbage collection is performed
Collection frequency
- How often collection operations occur relative to the execution of the application
Memory footprint
- The size of the memory occupied by the Java heap
Promptness
- The time from an object's creation until it is reclaimed
1.6 The impossible triangle
- In short, focus on two things: throughput and pause time
- High throughput and low pause time compete with each other. If high throughput takes priority, memory collection must run less often, which makes each collection pause longer.
- If low latency takes priority, memory collection has to run more often so that each pause stays short, which lowers program throughput
Breakdown
- Throughput: the ratio of the CPU time spent running user code to the total CPU time consumed, i.e. throughput = user code run time / (user code run time + garbage collection time)
- For example, if the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute, the throughput is 99%
- In this case the application can tolerate longer pauses: high-throughput applications are measured over a longer time baseline, and fast response is not a concern
- Throughput first means that the total STW time per unit of time is as short as possible, for example two pauses of 0.2 each: 0.2 + 0.2 = 0.4
- Pause time: the period during which application threads are paused so that the GC threads can run
- For example, a 100-millisecond pause during GC means that no application thread is active during those 100 milliseconds
- Pause time first means keeping each single STW pause as short as possible, for example five pauses of 0.1 each: 0.1 + 0.1 + 0.1 + 0.1 + 0.1 = 0.5
- When designing (or using) a GC algorithm, we must decide on our goal: a GC algorithm can target only one of the two (maximum throughput or minimum pause time), or try to find a compromise between them.
- The current standard: with maximum throughput as the priority, reduce pause time
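The arithmetic in this breakdown can be checked with a few lines of Java. This is a minimal sketch: the class and method names are hypothetical, and the pause values are simply the numbers from the examples above.

```java
class ThroughputSketch {
    /** Throughput = user code time / (user code time + GC time). */
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    /** Total stop-the-world time: what a throughput-first collector tries to minimize. */
    static double totalPause(double[] pauses) {
        double total = 0;
        for (double p : pauses) {
            total += p;
        }
        return total;
    }

    /** Longest single pause: what a pause-time-first collector tries to minimize. */
    static double longestPause(double[] pauses) {
        double longest = 0;
        for (double p : pauses) {
            longest = Math.max(longest, p);
        }
        return longest;
    }

    public static void main(String[] args) {
        // 100 minutes of total run time, 1 of which is garbage collection
        System.out.println("throughput = " + throughput(99, 1));

        double[] throughputFirst = {0.2, 0.2};
        double[] pauseFirst = {0.1, 0.1, 0.1, 0.1, 0.1};
        System.out.println("throughput first: total " + totalPause(throughputFirst)
                + ", longest " + longestPause(throughputFirst));
        System.out.println("pause first: total " + totalPause(pauseFirst)
                + ", longest " + longestPause(pauseFirst));
    }
}
```

The throughput-first schedule has the smaller total pause, while the pause-first schedule has the smaller single pause, which is exactly the trade-off described above.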
2. History of garbage collector development
- Serial GC:
- 1999, JDK 1.3.1
- The first GC
- ParNew: a multithreaded version of the Serial GC collector
- Parallel GC and Concurrent Mark Sweep GC
- February 26, 2002
- Parallel GC became the HotSpot default GC after JDK 1.6
- G1
- JDK 1.7u4
- 2012
- In 2017, G1 became the default garbage collector in JDK 9, replacing CMS
- In March 2018, JDK 10 introduced parallel full garbage collection for the G1 collector, improving its worst-case latency
- In September 2018, JDK 11 was released. It introduced the Epsilon garbage collector, also known as the "No-Op" collector, as well as ZGC: a scalable low-latency garbage collector (experimental)
- In March 2019, JDK 12 was released. It enhanced G1 to automatically return unused heap memory to the operating system and introduced Shenandoah GC: a low-pause-time GC (experimental)
- In September 2019, JDK 13 was released. It enhanced ZGC to automatically return unused heap memory to the operating system
- In March 2020, JDK 14 was released. It removed the CMS garbage collector and extended ZGC to macOS and Windows
2.1 The 7 classic garbage collectors
- Serial collectors: Serial, Serial Old
- Parallel collectors: ParNew, Parallel Scavenge, Parallel Old
- Concurrent collectors: CMS, G1
The relationship between the seven classic garbage collectors and the generations
- Young-generation collectors: Serial, ParNew, Parallel Scavenge;
- Old-generation collectors: Serial Old, Parallel Old, CMS;
- Whole-heap collector: G1;
Combinations of garbage collectors
- Serial Old serves as the fallback for CMS when a Concurrent Mode Failure occurs.
- (red dotted line) Because of maintenance and compatibility-testing costs, the combinations Serial + CMS and ParNew + Serial Old were declared deprecated (JEP 173) and support for them was dropped completely in JDK 9 (JEP 214), i.e. they were removed.
- JDK 14: the Parallel Scavenge + Serial Old GC combination was deprecated (JEP 366)
- JDK 14: the CMS garbage collector was removed (JEP 363)
2.2 Viewing the default garbage collector
- -XX:+PrintCommandLineFlags: prints the command-line flags (including the garbage collector in use)
- From the command line: jinfo -flag <GC-related flag> <process ID>
```java
import java.util.ArrayList;

/**
 * -XX:+PrintCommandLineFlags
 *
 * -XX:+UseSerialGC: the young generation uses Serial GC and the old generation uses Serial Old GC
 * -XX:+UseParNewGC: the young generation uses ParNew GC
 * -XX:+UseParallelGC: the young generation uses Parallel GC
 * -XX:+UseParallelOldGC: the old generation uses Parallel Old GC
 * -XX:+UseConcMarkSweepGC: the old generation uses CMS GC; ParNew is enabled for the young generation at the same time
 */
public class GCUseTest {
    public static void main(String[] args) {
        ArrayList<byte[]> list = new ArrayList<>();
        while (true) {
            // Keep allocating small arrays so that the collector has work to do
            byte[] arr = new byte[100];
            list.add(arr);
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
```
- JDK 8 uses Parallel by default
- JDK 9 uses G1 by default
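Besides the command-line options, the collectors in use can also be queried from inside a running program through the standard `java.lang.management` API. The sketch below is one possible way to do it; the class name `ActiveCollectors` is hypothetical, and the names it prints depend on the JVM and on the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Returns the names of the collectors that the running JVM reports. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since JVM start
            System.out.println(bean.getName()
                    + ": count=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC shows how the reported young and old collectors change with the flag.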
3 Serial collector: serial collection
3.1 Overview
- The Serial collector performs memory collection using the copying algorithm, serial collection, and the STW mechanism
- Besides the young generation, there is the Serial Old collector for the old generation, which also collects serially with STW but uses the mark-compact algorithm
- Garbage collection is done with one CPU or one collection thread, and all other worker threads must be suspended while the collection is in progress
3.2 Advantages
- Simple and efficient (compared with the single-threaded performance of other collectors): in an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so it naturally achieves the highest single-threaded collection efficiency
- Using a serial collector is acceptable as long as collections happen infrequently
- In the HotSpot virtual machine, the -XX:+UseSerialGC parameter specifies that both the young and old generations use serial collectors. This is equivalent to using Serial GC for the young generation and Serial Old GC for the old generation
3.3 Summary
- It is enough to be aware of this collector; serial collection is rarely used today. It only makes sense on a single-core CPU, and machines are no longer single-core
- For highly interactive applications, this garbage collector is unacceptable
4 ParNew collector: parallel collection
- If Serial GC is the single-threaded garbage collector of the young generation, the ParNew collector is the multithreaded version of the Serial collector. "Par" is short for Parallel, and "New" means it handles only the young (new) generation
- Apart from collecting in parallel, there is almost no difference between ParNew and Serial
- ParNew runs in multi-CPU environments. Because it can take full advantage of hardware resources such as multiple CPUs and cores, it completes garbage collection faster and improves program throughput
- In a single-CPU environment, ParNew is not more efficient than Serial, because of the cost of switching between threads
- Besides Serial, only ParNew GC can currently work together with the CMS collector
- The developer can manually specify the ParNew collector for memory reclamation with the option "-XX:+UseParNewGC". It means the young generation uses the parallel collector, without affecting the old generation
- -XX:ParallelGCThreads limits the number of threads; by default it matches the number of CPUs
5 Parallel collector: throughput first
5.1 Overview
- The Parallel Scavenge collector also uses the copying algorithm, parallel collection, and the "Stop the World" mechanism
- Unlike the ParNew collector, the Parallel Scavenge collector aims to achieve a controllable throughput, so it is also called the throughput-first garbage collector
- The adaptive tuning strategy is another important difference between Parallel Scavenge and ParNew
- It suits background computation that needs little interaction, such as batch processing, order processing, payroll, and scientific computing applications
- Parallel Old uses the mark-compact algorithm, also based on parallel collection and the STW mechanism
5.2 Parameter configuration
- -XX:+UseParallelGC: manually specifies that the young generation uses the Parallel collector for memory reclamation
- -XX:+UseParallelOldGC: manually specifies that the old generation uses the parallel collector. The two parameters apply to the young and old generations respectively and are linked: enabling one enables the other by default. Both are on by default in JDK 8
- -XX:ParallelGCThreads: sets the number of threads of the young-generation parallel collector. By default it equals the number of CPUs; if the number of CPUs N is greater than 8, the value is 3 + (5*N/8)
- -XX:MaxGCPauseMillis: sets the maximum pause time of the collector, in milliseconds. Use this parameter with caution
- -XX:GCTimeRatio: the ratio of garbage collection time to total time (= 1/(N+1)), used to measure throughput. The value ranges from 0 to 100. The default is 99, meaning that garbage collection time does not exceed 1%. It is in tension with the previous parameter, -XX:MaxGCPauseMillis: the longer the pauses, the more easily the GC time exceeds the configured ratio
- -XX:+UseAdaptiveSizePolicy: enables the adaptive tuning policy
- In this mode, parameters such as the size of the young generation, the ratio of Eden to Survivor, and the age at which objects are promoted to the old generation are adjusted automatically to reach a balance between heap size, throughput, and pause time
- Where manual tuning is difficult, you can use adaptive mode, specify only the maximum heap, the target throughput, and the pause time, and let the virtual machine do the tuning itself
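The two formulas mentioned for -XX:ParallelGCThreads and -XX:GCTimeRatio can be written out as a small calculation. This sketch only restates the rules given in the text; the class and method names are hypothetical and nothing here reads the real JVM settings.

```java
class ParallelDefaultsSketch {
    /** Default thread count as described above: N for up to 8 CPUs, otherwise 3 + 5*N/8. */
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 3 + (5 * cpus) / 8;
    }

    /** Upper bound on the share of time spent in GC for -XX:GCTimeRatio=n, i.e. 1 / (1 + n). */
    static double maxGcTimeShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("CPUs: " + cpus
                + ", ParallelGCThreads by the rule above: " + defaultParallelGcThreads(cpus));
        System.out.println("GC time share for GCTimeRatio=99: " + maxGcTimeShare(99));
    }
}
```

For example, the rule gives 13 threads on a 16-CPU machine, and GCTimeRatio=99 corresponds to a GC time share of 1%.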
6 CMS collector: low latency
- JDK 1.5 introduced Concurrent Mark Sweep (CMS), the first truly concurrent collector in the HotSpot virtual machine: for the first time, garbage collection threads could work at the same time as user threads
- The focus of the CMS collector is to minimize the pause time of user threads during garbage collection
- Unfortunately, as an old-generation collector, CMS cannot work with Parallel Scavenge, the young-generation collector that already existed in JDK 1.4.0. So when CMS is used to collect the old generation in JDK 1.5, the young generation can only use ParNew or Serial
6.1 Four stages of the CMS collector
The CMS process is more complex than that of the previous collectors. It is divided into four main stages: initial mark, concurrent mark, remark, and concurrent sweep
- 1. Initial mark: STW. This stage only marks the objects that GC Roots can reach directly. Once marking is complete, all application threads that were suspended are resumed
- 2. Concurrent mark: traverses the entire object graph starting from the objects directly associated with GC Roots. This is time-consuming but does not require pausing user threads, which run concurrently with the garbage collection threads
- 3. Remark: STW. This stage corrects the marking records of objects whose marks changed because the user program kept running during concurrent marking
- 4. Concurrent sweep: clears the objects judged dead during the marking stages and frees their memory. Since live objects do not need to be moved, this stage can also run concurrently with user threads
6.2 Details
- The STW mechanism is still required for the initial mark and remark stages
- Because user threads are not interrupted during the concurrent stages, the application must have enough memory available while CMS is collecting. So instead of waiting until the old generation is nearly full, as other collectors do, CMS starts collecting when heap memory usage reaches a certain threshold.
- If the memory reserved while CMS is running cannot meet the program's needs, a Concurrent Mode Failure occurs, and the virtual machine falls back to its backup plan: it temporarily enables the Serial Old collector to redo the old-generation collection, which results in a long pause.
- CMS uses the mark-sweep algorithm, which produces memory fragmentation, so memory can only be allocated from a free list
The CMS collector uses the mark-sweep algorithm, which means that some memory fragmentation is inevitable after each collection, because the space freed from dead objects is most likely a set of discontiguous chunks. CMS therefore cannot use the bump-the-pointer technique to allocate memory for new objects and has to allocate from a free list
One might ask: since mark-sweep causes memory fragmentation, why not switch the algorithm to mark-compact?
The answer is simple: if memory were compacted during the concurrent sweep, what would happen to the memory that user threads are still using? For user threads to keep executing, the resources they run on must not be affected. Mark-compact is better suited to "Stop the World" scenarios
6.3 Advantages of CMS
- Concurrent collection
- Low latency
6.4 Disadvantages of CMS
- Memory fragmentation is generated
- It is very sensitive to CPU resources. During the concurrent stages it does not pause user threads, but it occupies some threads, which slows the application down
- It cannot handle floating garbage. The concurrent mark stage runs at the same time as the worker threads. Garbage objects created during the concurrent stages cannot be marked by CMS, so they are not reclaimed in time and can only be freed by the next GC
6.5 Parameter settings
- -XX:+UseConcMarkSweepGC: manually specifies the CMS collector for memory collection
- Once this is enabled, -XX:+UseParNewGC is enabled automatically, i.e. the combination is ParNew (young generation) + CMS (old generation) + Serial Old
- -XX:CMSInitiatingOccupancyFraction: sets the heap memory usage threshold; once the threshold is reached, collection begins
- The default in JDK 5 and earlier is 68, meaning a CMS collection starts when old-generation space usage reaches 68%
- The default in JDK 6 and later is 92%
- If memory grows slowly, you can set a larger value to trigger CMS less often and reduce the number of old-generation collections
- If the application's memory usage grows quickly, lower this threshold to avoid triggering the serial old-generation collector too often.
- -XX:+UseCMSCompactAtFullCollection: compacts the memory space after a Full GC. Compaction cannot run concurrently, so it leads to longer pauses
- -XX:CMSFullGCsBeforeCompaction: sets the number of Full GCs after which the memory space is compacted
- -XX:ParallelCMSThreads: sets the number of CMS threads
- The default number of threads is (ParallelGCThreads + 3) / 4
- ParallelGCThreads is the number of threads of the young-generation parallel collector
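Two of these settings are simple arithmetic rules, illustrated below. This is a sketch of the rules as stated in the list, with hypothetical class and method names; it does not query or configure a real JVM.

```java
class CmsSettingsSketch {
    /** Default number of CMS threads as stated above: (ParallelGCThreads + 3) / 4, integer division. */
    static int defaultCmsThreads(int parallelGcThreads) {
        return (parallelGcThreads + 3) / 4;
    }

    /** A CMS cycle starts once old-generation usage reaches the initiating occupancy fraction. */
    static boolean shouldStartCollection(double oldGenUsedPercent, int initiatingOccupancyFraction) {
        return oldGenUsedPercent >= initiatingOccupancyFraction;
    }

    public static void main(String[] args) {
        System.out.println("CMS threads for 8 parallel GC threads: " + defaultCmsThreads(8));
        // The same 70% usage starts a cycle with threshold 68 but not with threshold 92
        System.out.println("start at 70% with threshold 68: " + shouldStartCollection(70, 68));
        System.out.println("start at 70% with threshold 92: " + shouldStartCollection(70, 92));
    }
}
```

A lower threshold starts CMS earlier and leaves more headroom for allocations during the concurrent stages, at the price of more frequent cycles.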
6.6 Summary
- To minimize memory usage and parallel overhead, choose Serial GC
- To maximize application throughput, choose Parallel GC
- To minimize GC interruptions or pause time, choose CMS GC
- In JDK 9, CMS is marked as deprecated: enabling it with -XX:+UseConcMarkSweepGC prints a warning that CMS will be removed in the future.
- In JDK 14, CMS has been removed. If -XX:+UseConcMarkSweepGC is specified, the JVM does not report an error; it prints a warning, does not exit, and starts with the default GC
7 G1 collector: region-based and generational
The official goal set for G1 is to achieve the highest possible throughput while keeping latency controllable, which is why it carries the burden and expectations of a "fully featured collector".
7.1 Why is it named Garbage First (G1)?
- G1 is a parallel collector that divides heap memory into many unrelated regions (physically discontiguous). Different regions represent Eden, Survivor 0, Survivor 1, the old generation, and so on
- G1 GC deliberately avoids collecting the entire Java heap at once. It tracks the value of the garbage accumulated in each region (the amount of space a collection would free and the empirical time the collection would take), maintains a priority list in the background, and each time collects the regions with the highest value first, within the allowed collection time
- Because this approach focuses on the regions that yield the most garbage, G1 was given the name Garbage First.
- G1 was officially enabled in JDK 1.7, when the Experimental label was removed. It is the default garbage collector from JDK 9 onward, replacing the CMS collector and the Parallel + Parallel Old combination, and Oracle officially calls it a "fully featured garbage collector"
- CMS, meanwhile, is marked deprecated in JDK 9. G1 is not the default garbage collector in JDK 8, where it has to be enabled with -XX:+UseG1GC
7.2 Advantages
- Parallelism and concurrency
- Parallelism: during a collection, G1 can run multiple GC threads at the same time, making effective use of multi-core computing power. User threads are stopped (STW) at this point
- Concurrency: G1 can run alternately with the application, so part of its work is done while the application is running. Generally speaking, the application is not blocked for the whole collection
- Generational collection
- G1 is still a generational garbage collector. It distinguishes the young generation from the old generation, and the young generation still has Eden and Survivor areas. In terms of heap structure, however, it does not require Eden, the young generation, or the old generation to be contiguous, nor does it insist on a fixed size or a fixed number of regions.
- The heap space is divided into regions, which contain the logical young and old generations.
- Unlike the earlier collectors, which work either in the young generation or in the old generation, G1 takes care of both;
- Space compaction
- CMS: the mark-sweep algorithm, memory fragmentation, and one defragmentation after several GCs
- G1 divides memory into regions, and memory is reclaimed region by region. Between regions it uses the copying algorithm, but as a whole it can be regarded as a mark-compact algorithm; both algorithms avoid memory fragmentation. This helps programs run for a long time: allocating a large object will not trigger the next GC prematurely because no contiguous memory can be found. The advantage is most visible when the Java heap is very large.
- Predictable pause time model
- Thanks to partitioning, G1 can select only some of the regions for reclamation, which narrows the scope of each collection and keeps whole-heap pauses well under control
- Users can explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds should be spent on garbage collection
- G1 tracks the value of the garbage accumulated in each region, maintains a priority list in the background, and each time collects the regions with the highest value first, within the allowed collection time. This ensures that G1 achieves the highest possible collection efficiency in a limited time
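The "highest value first, within the allowed time" idea can be sketched as a greedy selection. This is a simplified illustration, not G1's actual policy code: the `Region` class, its fields, the value formula (bytes freed per millisecond), and the `chooseRegions` method are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PauseBudgetSketch {
    static final class Region {
        final String name;
        final long reclaimableBytes;   // space a collection of this region would free
        final double estimatedCostMs;  // time the collection is expected to take

        Region(String name, long reclaimableBytes, double estimatedCostMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedCostMs = estimatedCostMs;
        }

        /** Value of collecting this region: bytes freed per millisecond spent. */
        double value() {
            return reclaimableBytes / estimatedCostMs;
        }
    }

    /** Picks regions in descending order of value while they still fit in the pause budget. */
    static List<Region> chooseRegions(List<Region> candidates, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(candidates);
        byValue.sort(Comparator.comparingDouble(Region::value).reversed());

        List<Region> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region region : byValue) {
            if (spentMs + region.estimatedCostMs <= pauseBudgetMs) {
                chosen.add(region);
                spentMs += region.estimatedCostMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 100, 10));
        regions.add(new Region("B", 300, 20));
        regions.add(new Region("C", 50, 50));
        for (Region region : chooseRegions(regions, 35)) {
            System.out.println(region.name);
        }
    }
}
```

With a 35 ms budget the sketch picks B and then A and skips C, whose cost would exceed the budget for little gain.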
7.3 Disadvantages
- Compared with CMS, G1 does not have a comprehensive, overwhelming advantage. For example, both the memory footprint of garbage collection and the extra execution load at run time are higher for G1 than for CMS
- Empirically, CMS is likely to outperform G1 in small-memory applications, while G1 shows its advantages in large-memory applications; the balance point is around 6-8 GB
7.4 Parameter settings
- -XX:+UseG1GC: manually specifies the G1 collector for memory reclamation
- -XX:G1HeapRegionSize: sets the size of each region. The value is a power of 2 between 1 MB and 32 MB. The goal is to divide the heap into about 2048 regions based on the minimum Java heap size
- -XX:MaxGCPauseMillis: sets the maximum GC pause time that the JVM should try to achieve (not guaranteed). The default is 200 ms
- -XX:ParallelGCThreads: sets the number of STW worker threads, at most 8
- -XX:ConcGCThreads: sets the number of concurrent marking threads. Set it to about 1/4 of the number of parallel garbage collection threads (ParallelGCThreads)
- -XX:InitiatingHeapOccupancyPercent: sets the Java heap occupancy threshold that triggers a concurrent GC cycle. Exceeding this value triggers the GC. The default is 45
7.5 Common steps for using the G1 collector
G1 is designed to simplify JVM performance tuning; developers need only three simple steps:
- Step 1: Enable the G1 garbage collector
- Step 2: Set the maximum heap memory
- Step 3: Set a maximum pause time
G1 provides three garbage collection modes: Young GC, Mixed GC, and Full GC, which are triggered under different conditions
7.6 Application scenarios
- Server-side applications on machines with large memory and multiple processors
- Its main use is for applications that need low GC latency and have a large heap
- For example, with a heap of about 6 GB or larger, predictable pause times can stay below 0.5 seconds; G1 cleans up only some of the regions at a time, so no single GC pause lasts too long
- It is used to replace the CMS collector introduced in JDK 1.5. G1 may be better than CMS when:
- ① More than 50% of the Java heap is occupied by live data;
- ② The object allocation rate or the promotion rate varies greatly;
- ③ GC pauses are too long (longer than 0.5 to 1 second)
7.7 Partitioning into regions
- With the G1 collector, the entire Java heap is divided into about 2048 independent regions of equal size. The region size depends on the actual heap size; it is kept between 1 MB and 32 MB and is always a power of 2, i.e. 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, or 32 MB. It can be set with -XX:G1HeapRegionSize. All regions are the same size, and the size does not change during the lifetime of the JVM
- Although the concepts of young generation and old generation are retained, the two are no longer physically separated; each is a collection of regions (which need not be contiguous). Dynamic region allocation provides logical continuity
- A region may belong to the Eden, Survivor, or Old/Tenured memory area, but it can play only one role at a time. In the figure, E marks a region that belongs to Eden, S a region that belongs to Survivor, and O a region that belongs to Old. Blank cells represent unused memory
- G1 also adds a new kind of memory area called Humongous, shown as block H in the figure. It is used to store large objects: an object larger than half a region (0.5 regions) is placed in H
- Why H exists: large objects in the heap are by default allocated directly in the old generation, but if a large object is short-lived, this hurts the garbage collector. To solve this problem, G1 has the Humongous area, dedicated to large objects. If a large object does not fit in one H region, G1 looks for contiguous H regions to store it. Sometimes a Full GC has to be started in order to find contiguous H regions. Most of G1's behavior treats H regions as part of the old generation
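The sizing rules in this section can be approximated in a few lines. This is a sketch under stated assumptions, not the HotSpot implementation: it targets about 2048 regions, clamps the result to 1 MB to 32 MB, rounds down to a power of two, and assumes that an object larger than half a region is humongous. The class and method names are hypothetical.

```java
class RegionSizingSketch {
    static final long MB = 1024L * 1024;

    /** Region size for a given heap: heap / 2048, clamped to [1 MB, 32 MB], rounded down to a power of 2. */
    static long regionSize(long heapBytes) {
        long target = heapBytes / 2048;
        long clamped = Math.max(MB, Math.min(32 * MB, target));
        return Long.highestOneBit(clamped);
    }

    /** Assumption for this sketch: an object larger than half a region goes to a Humongous region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long heap = 4096 * MB; // a 4 GB heap
        long region = regionSize(heap);
        System.out.println("region size in MB: " + region / MB);
        System.out.println("1.5 MB object humongous? " + isHumongous(3 * MB / 2, region));
        System.out.println("512 KB object humongous? " + isHumongous(MB / 2, region));
    }
}
```

For a 4 GB heap the sketch yields 2 MB regions, so a 1.5 MB array would be allocated as a humongous object while a 512 KB array would not.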
7.8 Garbage collection process of the G1 collector
- The garbage collection process of G1 GC mainly consists of the following three phases:
- Young GC
- Old-generation concurrent marking
- Mixed GC
- (If needed, the single-threaded, exclusive, high-intensity Full GC still exists. It provides a fail-safe mechanism against GC evaluation failures, i.e. a forced collection.)
- The application allocates memory, and a young-generation collection starts when the Eden area of the young generation is exhausted. G1's young-generation collection is parallel and exclusive: G1 GC suspends all application threads and starts multiple threads to perform the collection. It then moves the surviving objects from the young generation to Survivor regions or old regions, or possibly both.
- When heap memory usage reaches a certain value (45% by default), the old-generation concurrent marking process begins.
- As soon as marking finishes, the mixed collection process begins. In a mixed collection, G1 GC moves live objects from old regions to free regions, which then become part of the old generation. Unlike the young generation, the old generation does not need to be collected as a whole: G1 scans and reclaims only a small number of old regions at a time. These old regions are reclaimed together with the young generation
- For example, a web server runs a Java process with a maximum heap of 4 GB, responds to 1500 requests per minute, and allocates about 2 GB of new memory every 45 seconds. G1 performs a young-generation collection every 45 seconds; every 31 hours the usage of the whole heap reaches 45%, which starts the old-generation concurrent marking process, followed by four or five mixed collections once marking is complete
7.9 Remembered Set and write barrier
- The problem: an object may be referenced from different regions (the cross-generation reference problem)
- A region cannot be isolated: objects in a region can be referenced by objects in any other region. Does the entire Java heap have to be scanned to determine whether an object is alive?
- This problem also exists in other generational collectors (it is more pronounced in G1)
- A full scan would reduce the efficiency of Minor GC;
- Solutions:
- Whether in G1 or any other generational collector, the JVM uses a Remembered Set to avoid global scans
- Each region has a corresponding Remembered Set
- Every write of reference-type data triggers a write barrier that briefly interrupts the operation
- The barrier then checks whether the reference being written points to an object in a different region from the reference-type data itself.
- If the regions differ, the reference information is recorded, via the CardTable, in the Remembered Set of the region that contains the referenced object
- When garbage collection is performed, the Remembered Set is added to the enumeration scope of the GC roots. This guarantees that no global scan is needed and that nothing is missed
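The interplay of write barrier and Remembered Set can be modeled in miniature. The sketch below is a deliberate simplification with hypothetical names: every reference store goes through `writeField`, which plays the role of the write barrier, and cross-region stores are recorded directly in a per-region set. The card table, the dirty card queue, and the removal of stale entries are left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RememberedSetSketch {
    static final class Obj {
        final String name;
        final int region; // id of the region the object lives in
        Obj field;        // a single reference field, enough for the illustration

        Obj(String name, int region) {
            this.name = name;
            this.region = region;
        }
    }

    // region id -> objects in OTHER regions that hold a reference into that region
    private final Map<Integer, Set<Obj>> rememberedSets = new HashMap<>();

    /** Stands in for the write barrier: performs the store, then records cross-region references. */
    void writeField(Obj holder, Obj target) {
        holder.field = target;
        if (target != null && holder.region != target.region) {
            rememberedSets.computeIfAbsent(target.region, id -> new HashSet<>()).add(holder);
        }
    }

    /** Extra roots for collecting one region, so the rest of the heap need not be scanned. */
    Set<Obj> incomingReferences(int region) {
        return rememberedSets.getOrDefault(region, Collections.emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        Obj oldObject = new Obj("old", 5);
        Obj youngObject = new Obj("young", 1);
        heap.writeField(oldObject, youngObject); // cross-region store is recorded
        for (Obj holder : heap.incomingReferences(1)) {
            System.out.println("region 1 is referenced from: " + holder.name);
        }
    }
}
```

When region 1 is collected, only its own roots plus the holders in its set have to be examined, which is the point of the mechanism.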
7.10 Details of the G1 collection process
7.10.1 Young GC
When the JVM starts, G1 first prepares the Eden area, and the running program keeps creating objects in Eden. When Eden space is exhausted, G1 starts a young-generation garbage collection, which reclaims only the Eden and Survivor areas. First, G1 stops the execution of the application (Stop the World) and creates a Collection Set, the set of memory segments that need to be reclaimed. In a young-generation collection, the Collection Set contains all memory segments of the young generation's Eden and Survivor areas.
- 1. Scan roots
- Roots are the objects that static variables point to, the local variables in the chain of method calls currently executing, and so on. The root references, together with the external references recorded in the RSet, serve as the entry points for scanning live objects
- 2. Update RSet
- Process the cards in the dirty card queue and update the RSet. After this stage, the RSet accurately reflects the references from the old generation to objects in the memory segment it belongs to
- Dirty card queue: for a reference assignment statement in the application such as object.field = object, the JVM performs special operations before and after it to enqueue a card that holds the object reference information into the dirty card queue. During a young-generation collection, G1 processes all cards in the dirty card queue to update the RSet, ensuring that the RSet reflects the reference relationships accurately and in real time. Why not update the RSet directly at the reference assignment? For performance: updating the RSet requires thread synchronization, which is expensive, and using a queue performs much better
- 3. Process RSet
- Identify the objects in Eden that are pointed to by old-generation objects; these Eden objects are considered alive
- 4. Copy objects
- The object tree is traversed, and the live objects in Eden memory segments are copied to empty Survivor memory segments. For live objects in Survivor memory segments, the age is incremented by 1 if it has not reached the threshold; once the age reaches the threshold, the object is copied to an empty old-generation memory segment. If Survivor space is insufficient, some data in Eden is promoted directly to the old generation
- 5. Process references
- Finally, Eden is empty, GC stops working, and the objects in the target memory are stored contiguously without fragmentation. The copying process therefore also consolidates memory and reduces fragmentation.
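The aging rule of the copy step can be illustrated with a toy model. This sketch uses hypothetical names and tracks only which space a live object ends up in; the actual copying, the direct promotion when Survivor space runs out, and the way the JVM chooses the threshold are not modeled.

```java
class TenuringSketch {
    enum Space { EDEN, SURVIVOR, OLD }

    static final class Obj {
        Space space;
        int age;

        Obj(Space space, int age) {
            this.space = space;
            this.age = age;
        }
    }

    /** Applies the copy step of one young collection to a single live object. */
    static void evacuate(Obj obj, int tenuringThreshold) {
        if (obj.space == Space.EDEN) {
            obj.space = Space.SURVIVOR;   // live Eden objects move to an empty Survivor segment
        } else if (obj.space == Space.SURVIVOR) {
            if (obj.age < tenuringThreshold) {
                obj.age++;                // stays in Survivor, one collection older
            } else {
                obj.space = Space.OLD;    // threshold reached: copied to an old-generation segment
            }
        }
        // Objects already in OLD are not touched by a young collection
    }

    public static void main(String[] args) {
        Obj obj = new Obj(Space.EDEN, 0);
        int threshold = 2; // hypothetical value, chosen small to keep the output short
        for (int gc = 1; gc <= 4; gc++) {
            evacuate(obj, threshold);
            System.out.println("after GC " + gc + ": " + obj.space + ", age " + obj.age);
        }
    }
}
```

An object has to survive several young collections in Survivor before it is promoted, which keeps short-lived objects out of the old generation.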
7.10.2 Concurrent marking process
- 1. Initial mark: marks the objects directly reachable from the roots. This phase is STW and triggers a young GC
- 2. Root region scanning: G1 scans the Survivor regions for references into the old generation and marks the referenced objects. This phase must complete before the next young GC can start
- 3. Concurrent marking: marks live objects across the whole heap concurrently with the application; it may be interrupted by a young GC. If all objects in a region are found to be garbage during concurrent marking, the region is reclaimed immediately. The liveness of each region (the proportion of live objects in the region) is also calculated during this phase
- 4. Remark: because the application keeps running, the result of the previous marking has to be corrected. This phase is STW. G1 uses snapshot-at-the-beginning (SATB), which is faster than the algorithm used by CMS
- 5. Cleanup (STW): calculates the proportion of live objects in each region, sorts the regions, and identifies the regions eligible for mixed collection, setting the stage for the next phase
- 6. Concurrent cleanup: identifies and reclaims completely free regions
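The snapshot-at-the-beginning idea behind the remark phase can be sketched with a toy marker. This is an illustration under simplifying assumptions, not G1's implementation: SatbSketch and its methods are hypothetical, each object has a single outgoing reference, and the logged values are drained only at remark, whereas a real collector also processes them concurrently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy snapshot-at-the-beginning (SATB) marking; all names are hypothetical. */
final class SatbSketch {
    static final class Node {
        final String name;
        Node ref;        // a single outgoing reference keeps the example small
        boolean marked;
        Node(String name) { this.name = name; }
    }

    private final Deque<Node> satbBuffer = new ArrayDeque<>();
    private boolean marking;

    void startMarking() { marking = true; }

    /** Store with a pre-write barrier: while marking, remember the value being overwritten. */
    void store(Node holder, Node newValue) {
        if (marking && holder.ref != null) {
            satbBuffer.push(holder.ref);
        }
        holder.ref = newValue;
    }

    /** Marks n and everything reachable from it. */
    static void markFrom(Node n) {
        while (n != null && !n.marked) {
            n.marked = true;
            n = n.ref;
        }
    }

    /** Remark: mark from every logged old value, then end the marking cycle. */
    void remark() {
        while (!satbBuffer.isEmpty()) {
            markFrom(satbBuffer.pop());
        }
        marking = false;
    }

    public static void main(String[] args) {
        Node root = new Node("root"), a = new Node("a"), b = new Node("b");
        root.ref = a;
        a.ref = b;

        SatbSketch gc = new SatbSketch();
        gc.startMarking();
        root.marked = true;     // initial mark: only the root is marked so far
        gc.store(root, null);   // the application unlinks a before the marker reaches it
        markFrom(root.ref);     // concurrent marking from the root now finds nothing
        gc.remark();            // a and b were live in the snapshot, so they are marked here
        System.out.println("a=" + a.marked + " b=" + b.marked);
    }
}
```

Objects that were reachable when marking started are kept alive for this cycle even if the application drops them in the meantime; they become floating garbage that a later cycle reclaims.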
7.10.3 Mixed Recovery
- As more and more objects are promoted to old regions, the virtual machine triggers a mixed collection to avoid running out of heap memory. A mixed collection reclaims the whole young generation and also part of the old generation. Note that a mixed GC is not a Full GC
- When concurrent marking ends, old regions that are 100% garbage have already been reclaimed, and the amount of garbage in partially filled regions has been calculated. By default, these old regions are collected over 8 mixed collections (configurable with -XX:G1MixedGCCountTarget)
- The collection set of a mixed collection consists of one eighth of the candidate old regions, plus the Eden regions and the Survivor regions
- Because old regions are collected over 8 rounds by default, G1 gives priority to the regions that contain the most garbage. A threshold decides whether a region is eligible at all: -XX:G1MixedGCLiveThresholdPercent, whose default is given here as 65% (the default has differed between JDK releases). A region is included only if its live objects occupy less than this percentage of the region. A low garbage ratio means a high proportion of live objects, and copying them takes more time.
- Mixed collection does not have to run all 8 times. Another threshold applies: -XX:G1HeapWastePercent. The default given here is 10% (this default has also differed between JDK releases), which means that 10% of the heap is allowed to be wasted: if the reclaimable garbage is found to be less than 10% of the heap, no further mixed collection is performed, because the GC would spend a lot of time to reclaim very little memory.
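How the three settings interact can be sketched as follows. This is a simplified model, not G1's actual policy code: MixedGcPlanSketch, Region, and the method names are hypothetical, sizes are arbitrary units, and the thresholds are passed in as parameters using the values quoted above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy planner for mixed collections; all names are hypothetical. */
final class MixedGcPlanSketch {
    static final class Region {
        final String name;
        final long capacity;
        final long liveBytes;
        Region(String name, long capacity, long liveBytes) {
            this.name = name;
            this.capacity = capacity;
            this.liveBytes = liveBytes;
        }
        long garbage() { return capacity - liveBytes; }
        double livePercent() { return 100.0 * liveBytes / capacity; }
    }

    /** Old regions below the live threshold (G1MixedGCLiveThresholdPercent), most garbage first. */
    static List<Region> candidates(List<Region> oldRegions, double liveThresholdPercent) {
        List<Region> result = new ArrayList<>();
        for (Region r : oldRegions) {
            if (r.livePercent() < liveThresholdPercent) {
                result.add(r);
            }
        }
        result.sort((x, y) -> Long.compare(y.garbage(), x.garbage()));
        return result;
    }

    /** Old regions per mixed collection so that all candidates fit in countTarget rounds (G1MixedGCCountTarget). */
    static int regionsPerMixedGc(int candidateCount, int countTarget) {
        return (candidateCount + countTarget - 1) / countTarget; // ceiling division
    }

    /** Continue only while reclaimable garbage is at least wastePercent of the heap (G1HeapWastePercent). */
    static boolean shouldContinue(List<Region> remaining, long heapSize, double wastePercent) {
        long reclaimable = 0;
        for (Region r : remaining) {
            reclaimable += r.garbage();
        }
        return 100.0 * reclaimable / heapSize >= wastePercent;
    }

    public static void main(String[] args) {
        List<Region> old = new ArrayList<>();
        old.add(new Region("A", 100, 10));
        old.add(new Region("B", 100, 50));
        old.add(new Region("C", 100, 70)); // 70% live: above the 65% threshold, skipped
        List<Region> c = candidates(old, 65.0);
        System.out.println("candidates=" + c.size()
                + " perRound=" + regionsPerMixedGc(c.size(), 8)
                + " continue=" + shouldContinue(c, 1000, 10.0));
    }
}
```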
7.10.4 Full GC
- G1 was designed to avoid Full GC. If that fails, G1 stops the application (stop-the-world) and collects garbage with a single-threaded algorithm, which performs very poorly and causes long application pauses
- For example, if the heap is too small and G1 has no empty region available when it copies live objects, it falls back to a Full GC
- A Full GC in G1 can have two causes:
- 1. During evacuation there is not enough to-space to hold promoted objects
- 2. Space runs out before concurrent marking completes
Summary
- Young generation size
- Avoid setting the size of the young generation explicitly with options such as -Xmn or -XX:NewRatio; a fixed young generation size overrides the pause-time goal
- Do not set the pause-time goal too strictly
- The throughput goal of the G1 GC is 90% application time and 10% garbage collection time
- When evaluating G1 GC throughput, do not make the pause-time goal too aggressive. An overly strict goal means accepting more garbage collection overhead, which directly reduces throughput.
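One way to check a running JVM against the 90%/10% goal is to compare accumulated collection time with uptime through the standard java.lang.management beans. This is a rough sketch: the class GcOverheadSketch and its methods are hypothetical names, and getCollectionTime() is documented as an approximate value, so the result is an estimate rather than an exact measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Estimates the share of JVM uptime spent in garbage collection; names are hypothetical. */
final class GcOverheadSketch {
    /** Percentage of uptime spent in GC, given both values in milliseconds. */
    static double gcPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    /** Sum of the approximate accumulated collection time over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double pct = gcPercent(totalGcMillis(), uptime);
        System.out.printf("GC time: %.2f%% of uptime (goal: at most 10%%)%n", pct);
    }
}
```

If the reported share stays well above 10%, relaxing the pause-time goal or enlarging the heap is usually a better first step than fixing the young generation size.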
Summary of garbage collectors
- The choice and configuration of the garbage collector is an important part of JVM tuning; picking the right collector can make a big difference to JVM performance.
- How to choose a garbage collector:
- 1. First adjust the heap size and let the JVM adapt on its own.
- 2. If the memory is less than 100 MB, use the serial collector
- 3. If the program runs on a single core and has no pause-time requirements, use the serial collector
- 4. If the machine has multiple CPUs, high throughput is required, and pauses of more than 1 second are acceptable, use the parallel collector or let the JVM choose
- 5. If the machine has multiple CPUs and low pause times and fast responses are required (for example, delays of no more than 1 second, as in Internet applications), use a concurrent collector
- G1 is the officially recommended collector and performs well. Most current Internet projects use G1.
- Finally, one point needs to be made clear:
- 1. There is no best collector, and no universal one
- 2. Tuning always targets a specific scenario and specific requirements; there is no one-size-fits-all collector
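The selection rules above can be written down as a small decision function. This is only a sketch of the rules as listed: CollectorChoiceSketch and suggest are hypothetical names, mapping "concurrent collector" to -XX:+UseG1GC is an assumption based on the G1 recommendation above, and real tuning should be confirmed by measurement.

```java
/** Encodes the rule-of-thumb collector choice as a function; names are hypothetical. */
final class CollectorChoiceSketch {
    /**
     * @param heapMb        maximum heap size in megabytes
     * @param cpus          number of available processors
     * @param needsLowPause true if pauses of about 1 second or more are unacceptable
     * @return a HotSpot flag, or an empty string to leave the choice to the JVM
     */
    static String suggest(long heapMb, int cpus, boolean needsLowPause) {
        if (heapMb < 100) {
            return "-XX:+UseSerialGC";       // rule 2: small heap
        }
        if (cpus == 1 && !needsLowPause) {
            return "-XX:+UseSerialGC";       // rule 3: single core, no pause requirement
        }
        if (cpus > 1 && !needsLowPause) {
            return "-XX:+UseParallelGC";     // rule 4: throughput first
        }
        if (cpus > 1) {
            return "-XX:+UseG1GC";           // rule 5: low pause (assumption: G1 as the concurrent collector)
        }
        return "";                           // no rule applies: let the JVM choose
    }

    public static void main(String[] args) {
        long heapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("suggested flag: " + suggest(heapMb, cpus, true));
    }
}
```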
11 GC log analysis
- -XX:+PrintGC Prints GC logs; equivalent to -verbose:gc
- -XX:+PrintGCDetails Prints detailed GC logs
- -XX:+PrintGCTimeStamps Prints GC timestamps (as seconds elapsed since JVM start)
- -XX:+PrintGCDateStamps Prints GC timestamps as dates (for example 2013-05-04T21:53:59.234+0800)
- -XX:+PrintHeapAtGC Prints heap information before and after each GC
- -Xloggc:../logs/gc.log Output path of the log file
11.1 -XX:+PrintGC
Enable GC logging with -verbose:gc. This shows only the overall heap change for each collection, as follows:
[GC (Allocation Failure) 80832K->19298K(227840K), 0.0084018 secs]
[GC (Metadata GC Threshold) 109499K->21465K(228352K), 0.0184066 secs]
[Full GC (Metadata GC Threshold) 21465K->16716K(201728K), 0.0619261 secs]
Analysis:
- GC, Full GC: the type of collection. GC collects only the young generation; Full GC covers the permanent generation, the young generation, and the old generation.
- Allocation Failure: the reason the GC occurred.
- 80832K->19298K: heap usage before and after the GC.
- 227840K: the current heap size.
- 0.0084018 secs: the duration of the GC.
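The fields described in the analysis can be extracted with a regular expression. The sketch below assumes the simple -verbose:gc line format shown in this section; SimpleGcLineParser and Event are hypothetical names, and other collectors or JDK versions print different formats that this pattern will not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses simple -verbose:gc lines such as "[GC (cause) 80832K->19298K(227840K), 0.0084018 secs]". */
final class SimpleGcLineParser {
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) \\(([^)]+)\\)\\s*(\\d+)K->(\\d+)K\\s*\\((\\d+)K\\),\\s*([0-9.]+) secs\\]");

    static final class Event {
        final String type;     // "GC" or "Full GC"
        final String cause;    // for example "Allocation Failure"
        final long beforeK;    // heap usage before the collection
        final long afterK;     // heap usage after the collection
        final long heapK;      // current heap size
        final double seconds;  // duration of the collection
        Event(String type, String cause, long beforeK, long afterK, long heapK, double seconds) {
            this.type = type;
            this.cause = cause;
            this.beforeK = beforeK;
            this.afterK = afterK;
            this.heapK = heapK;
            this.seconds = seconds;
        }
        long freedK() { return beforeK - afterK; }
    }

    /** Returns the parsed event, or null if the line is not in the expected format. */
    static Event parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new Event(m.group(1), m.group(2),
                Long.parseLong(m.group(3)), Long.parseLong(m.group(4)),
                Long.parseLong(m.group(5)), Double.parseDouble(m.group(6)));
    }

    public static void main(String[] args) {
        Event e = parse("[GC (Allocation Failure) 80832K->19298K(227840K), 0.0084018 secs]");
        System.out.println(e.type + " / " + e.cause + " freed " + e.freedK() + "K in " + e.seconds + " s");
    }
}
```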
11.2 -XX:+PrintGCDetails
[GC (Allocation Failure) [PSYoungGen: 70640K->10116K(141312K)] 80541K->20017K(227328K), 0.0172573 secs] [Times: user=0.03 sys=0.00, real=0.02 secs]
[GC (Metadata GC Threshold) [PSYoungGen: 98859K->8154K(142336K)] 108760K->21261K(228352K), 0.0151573 secs] [Times: user=0.00 sys=0.01, real=0.02 secs]
[Full GC (Metadata GC Threshold) [PSYoungGen: 8154K->0K(142336K)] [ParOldGen: 13107K->16809K(62464K)] 21261K->16809K(204800K), [Metaspace: 20599K->20599K(1067008K)], 0.0639732 secs] [Times: user=0.14 sys=0.00, real=0.06 secs]
Analysis:
- PSYoungGen: change in the size of the young generation before and after GC, using the Parallel Scavenge collector.
- ParOldGen: change in the size of the old generation before and after GC, using the Parallel Old collector.
- Metaspace: change in the size of the metaspace before and after GC. Metaspace was introduced in JDK 1.8 to replace the permanent generation.
- xxx secs: the time the GC took.
- Times: user is the CPU time the garbage collector spent in user mode; sys is the time spent in system calls or waiting for system events; real is the elapsed time of the GC from start to finish, including time slices used by other processes.
11.3 -XX:+PrintGCTimeStamps
With the date stamp and timestamp enabled, each entry is prefixed with the date and the seconds elapsed since JVM start:
2019-09-24T22:15:24.518+0800: 3.287: [GC (Allocation Failure) [PSYoungGen: 13612K->13613K(136192K)] 141425K->17632K(222208K), 0.0248249 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2019-09-24T22:15:25.559+0800: 4.329: [GC (Metadata GC Threshold) [PSYoungGen: ...] ..., ... secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2019-09-24T22:15:25.569+0800: 4.338: [Full GC (Metadata GC Threshold) [PSYoungGen: 10068K->0K(274944K)] [ParOldGen: 12590K->13564K(...)] 22658K->13564K(...), [Metaspace: ...], ... secs] [Times: user=0.17 sys=0.02, real=0.05 secs]
11.4 Supplementary Notes
- “[GC” and “[Full GC” indicate the pause type of the collection; “Full” means that the collection was stop-the-world
- With the Serial collector, the young generation is named Default New Generation, so the log shows “[DefNew”
- With the ParNew collector, the young generation is shown as “[ParNew”, meaning “Parallel New Generation”
- With the Parallel Scavenge collector, the young generation is shown as “[PSYoungGen”
- The name shown for the old generation is likewise determined by the collector in use
- With the G1 collector, the log shows “garbage-first heap”
- Allocation Failure indicates that the GC was triggered because the young generation did not have enough space to store new data
- [PSYoungGen: 5986K->696K(8704K)] 5986K->704K(9216K): inside the brackets, young generation usage before and after the GC (total young generation size); outside the brackets, combined usage of the young and old generations before and after the GC (total size of the young and old generations)
- user is the collection time spent in user mode, sys the time spent in kernel mode, and real the elapsed wall-clock time. On multi-core machines, user plus sys may exceed real
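Because the log prints the young generation and the whole heap but not the old generation on its own, the old-generation figures have to be derived by subtraction. The sketch below does that for Parallel Scavenge lines and also computes (user + sys) / real; GcDetailsMath and its methods are hypothetical names, and the patterns assume the PSYoungGen format shown in this section.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Derives old-generation figures from a -XX:+PrintGCDetails line; names are hypothetical. */
final class GcDetailsMath {
    private static final Pattern SIZES = Pattern.compile(
        "\\[PSYoungGen: (\\d+)K->(\\d+)K\\((\\d+)K\\)\\] (\\d+)K->(\\d+)K\\((\\d+)K\\)");
    private static final Pattern TIMES = Pattern.compile(
        "user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs");

    /** Returns {oldBeforeK, oldAfterK, oldCapacityK}, or null if the line does not match. */
    static long[] oldGen(String line) {
        Matcher m = SIZES.matcher(line);
        if (!m.find()) {
            return null;
        }
        long youngBefore = Long.parseLong(m.group(1));
        long youngAfter = Long.parseLong(m.group(2));
        long youngCapacity = Long.parseLong(m.group(3));
        long heapBefore = Long.parseLong(m.group(4));
        long heapAfter = Long.parseLong(m.group(5));
        long heapCapacity = Long.parseLong(m.group(6));
        // Old generation = whole heap minus young generation, for each of the three figures.
        return new long[] {
            heapBefore - youngBefore, heapAfter - youngAfter, heapCapacity - youngCapacity
        };
    }

    /** (user + sys) / real; a value above 1 suggests that several GC threads ran in parallel. */
    static double cpuToWallRatio(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            return Double.NaN;
        }
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return (user + sys) / real;
    }

    public static void main(String[] args) {
        String line = "[GC (Allocation Failure) [PSYoungGen: 70640K->10116K(141312K)] "
                + "80541K->20017K(227328K), 0.0172573 secs] "
                + "[Times: user=0.03 sys=0.00, real=0.02 secs]";
        long[] old = oldGen(line);
        System.out.println("old gen: " + old[0] + "K -> " + old[1] + "K of " + old[2] + "K");
        System.out.println("cpu/wall ratio: " + cpuToWallRatio(line));
    }
}
```

The difference between the old-generation figures after and before a young collection is the amount promoted in that collection; for the PSYoungGen example in the notes above it is 8K.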
[Figures: Minor GC, Full GC]
12 New developments in garbage collectors
GC is still developing rapidly. G1, the current default, keeps improving, and many of the shortcomings once attributed to it, such as the serial Full GC and inefficient card table scanning, have been greatly reduced. For example, since JDK 10 the G1 Full GC runs in parallel, and in many scenarios it performs slightly better than the parallel Full GC of Parallel GC. Even Serial GC, although relatively old, is not necessarily obsolete: its design and implementation are simple, and its overhead, both in GC-related data structures and in threads, is very small. With the rise of cloud computing, Serial GC has therefore found a new role in scenarios such as Serverless. Unfortunately, CMS GC, although it still has a very large user base, was deprecated in JDK 9 and removed in JDK 14 because of theoretical flaws in its algorithm and other reasons
12.1 New features in JDK 11
- JEP 318: Epsilon: A No-Op Garbage Collector (the Epsilon collector, which performs no operations), http://openjdk.java.net/jeps/318
- JEP 333: ZGC: A Scalable Low-Latency Garbage Collector (Experimental)
12.2 Shenandoah GC in OpenJDK 12
- The G1 collector has been the default collector for several years now.
- Two new collectors have also been introduced: ZGC (in JDK 11) and Shenandoah (in OpenJDK 12). Their main feature is low pause times
- Shenandoah is undoubtedly the loneliest of the GCs. It is the first HotSpot garbage collector not developed by the Oracle team, and it has inevitably been cold-shouldered by the vendor. For example, Oracle, which claims that there is no difference between OpenJDK and Oracle JDK, still declined to support Shenandoah in Oracle JDK 12.
- The Shenandoah garbage collector began as Pauseless GC, a garbage collector research project undertaken by Red Hat to address the need for low-pause memory reclamation on the JVM. It was contributed to OpenJDK in 2014
- Red Hat's Shenandoah team claims that the collector's pause times are independent of heap size: whether the heap is set to 200 MB or 200 GB, 99.9% of garbage collection pauses can be kept under 10 milliseconds. Actual performance, however, depends on the real working set size and the workload
- Weakness of Shenandoah GC: throughput degrades under high load
- Strength of Shenandoah GC: low latency
12.3 Revolutionary ZGC
- ZGC's goals are very similar to Shenandoah's: to keep garbage collection pauses under 10 milliseconds at any heap size, with as little impact on throughput as possible.
- "Understanding the Java Virtual Machine" defines ZGC as follows: the ZGC collector is a garbage collector based on a Region memory layout, without generations (for now), that uses techniques such as read barriers, colored pointers, and memory multi-mapping to implement a concurrent mark-compact algorithm, with low latency as its primary goal.
- The working process of ZGC can be divided into four stages: concurrent marking, concurrent preparation for relocation, concurrent relocation, and concurrent remapping.
- ZGC runs concurrently almost everywhere; only the initial mark is STW. Pause time is therefore spent almost entirely on the initial mark, and in practice this part is very short
- New features in JDK 14:
- JEP 364: ZGC on macOS
- JEP 365: ZGC on Windows. Before JDK 14, ZGC was supported only on Linux
- Other garbage collectors: AliGC
- AliGC was developed by Alibaba's JVM team on the basis of the G1 algorithm and targets large-heap (LargeHeap) application scenarios. Comparison in the specified scenario: