Garbage collection
The generational nature of the Java HotSpot VM memory system provides the flexibility to use specific garbage collection algorithms tailored to the needs of a variety of applications. Java HotSpot VM supports a number of different garbage collection algorithms designed to meet different pause times and throughput requirements.
Background
One of the main attractions of the Java programming language for programmers was that it was the first mainstream programming language to offer built-in automatic memory management, or garbage collection (GC). In traditional languages, dynamic memory is managed with an explicit allocate/free model. In practice, this model is not only a major source of memory leaks, program errors, and crashes, but also a performance bottleneck and a barrier to modular, reusable code. (Determining free points across module boundaries is nearly impossible without explicit and hard-to-understand cooperation between modules.) In the Java programming language, garbage collection is also an important part of the "safe" execution semantics needed to support the security model.
The garbage collector automatically handles the release of unused object memory in the background by reclaiming an object only if it can be proved that the object is no longer accessible by a running program. Automation of this process not only completely eliminates memory leaks due to too few releases, but also program crashes and hard-to-find reference errors due to too many releases.
Traditionally, garbage collection has been considered an inefficient process that hinders performance relative to an explicit-free model. In fact, with modern garbage collection techniques, performance has improved so much that overall performance is actually significantly better than that obtained by explicitly freeing objects.
Java HotSpot garbage collector
In addition to including the advanced features described below, the memory system is designed as a clean, object-oriented framework that can easily be instrumented, experimented with, or extended to use new garbage collection algorithms.
The main features of the Java HotSpot garbage collector are described below. In general, these features are well suited for applications that require the highest performance, as well as long-running applications where memory leaks and memory inaccessibility due to fragmentation are undesirable.
Accuracy
The Java HotSpot garbage collector is a completely accurate collector. By contrast, many other garbage collectors are conservative or partially accurate. While conservative garbage collection is attractive because it is easy to add to systems without garbage collection support, it has some disadvantages. In general, conservative garbage collectors are prone to memory leaks, do not allow object migration, and can lead to heap fragmentation.
A conservative collector does not know for sure where all object references are located. As a result, it must be conservative and assume that any memory word that appears to refer to an object is in fact an object reference. This means it can make certain kinds of mistakes, such as confusing an integer with an object pointer. Memory cells that look like pointers are treated as pointers, and the GC becomes inaccurate. This has several downsides. First, when such mistakes occur (which in practice is rare), memory leaks can appear unpredictably, in ways that application programmers can hardly reproduce or debug. Second, because it may have made a mistake, a conservative collector must either refer to objects indirectly through handles, which reduces performance, or avoid relocating objects, since relocating a handleless object requires updating all references to it. The collector cannot do this if it is not certain that an apparent reference is a real reference. The inability to relocate objects causes object memory fragmentation and, more importantly, prevents the use of the advanced generational copying collection algorithms described below.
Because the Java HotSpot collector is completely accurate, it can provide some strong design guarantees that conservative collectors cannot:
- All unreachable object storage can be reliably reclaimed.
- All objects can be relocated, allowing object memory compaction, which eliminates object memory fragmentation and increases memory locality.
An accurate garbage collection mechanism avoids unexpected memory leaks, enables object migration, and provides full heap compaction. The GC mechanism in the Java HotSpot VM scales well to very large heaps.
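The difference between a conservative and an accurate root scan can be illustrated with a toy model. The sketch below is not how HotSpot is implemented: the address range, the `isReference` map, and all names are hypothetical and exist only to show why an integer that merely looks like a pointer fools a conservative scan.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: the "heap" spans addresses [HEAP_START, HEAP_END).
// A conservative scanner sees only raw words; an accurate scanner also has
// a (hypothetical) map saying which stack slots really hold references.
class RootScanDemo {
    static final long HEAP_START = 0x1000;
    static final long HEAP_END = 0x2000;

    // Conservative: any word whose value falls inside the heap range is
    // treated as a pointer, whether or not it really is one.
    static List<Long> conservativeRoots(long[] stackWords) {
        List<Long> roots = new ArrayList<>();
        for (long w : stackWords) {
            if (w >= HEAP_START && w < HEAP_END) {
                roots.add(w);
            }
        }
        return roots;
    }

    // Accurate: only slots flagged as references are treated as pointers.
    static List<Long> preciseRoots(long[] stackWords, boolean[] isReference) {
        List<Long> roots = new ArrayList<>();
        for (int i = 0; i < stackWords.length; i++) {
            if (isReference[i]) {
                roots.add(stackWords[i]);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        // Slot 0 is a real reference. Slot 1 is a plain integer whose value
        // happens to lie inside the heap range. Slot 2 is a small integer.
        long[] stack = {0x1040, 0x1800, 42};
        boolean[] map = {true, false, false};
        System.out.println("conservative roots: " + conservativeRoots(stack));
        System.out.println("accurate roots:     " + preciseRoots(stack, map));
    }
}
```

In this model the conservative scan reports two roots and the accurate scan reports one. The extra "root" would keep a dead object alive, and the collector could not safely move the object it seems to point to, because rewriting slot 1 would corrupt an integer.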
Generational copying collection
The Java HotSpot VM uses a state-of-the-art generational copying collector that provides two major benefits:
- Increased allocation speed and overall garbage collection efficiency for most programs, compared to non-generational collectors
- A corresponding decrease in the frequency and duration of user-perceivable garbage collection pauses
Generational collectors take advantage of the fact that in most programs, the vast majority of objects (typically more than 95%) are very short-lived (for example, they are used as temporary data structures). By segregating newly created objects into an object nursery, a generational collector accomplishes several things. First, because new objects are allocated contiguously in the nursery in a stack-like fashion, allocation becomes very fast: it involves only updating a single pointer and performing a single check for nursery overflow. Second, by the time the nursery overflows, most objects in it are already dead, so the garbage collector can simply move the few surviving objects elsewhere and avoid doing any reclamation work for the dead objects in the nursery.
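The "single pointer update plus single overflow check" allocation path can be sketched in a few lines. This is a simplified illustration, not HotSpot code: the `Nursery` class, its byte-offset "addresses", and the `-1` return value are assumptions made for the example.

```java
// Toy nursery: allocation is one bounds check plus one pointer bump.
class Nursery {
    private final int capacity;
    private int top = 0; // offset of the next free byte

    Nursery(int capacity) {
        this.capacity = capacity;
    }

    // Returns the offset of the new object, or -1 when the nursery is full
    // (a real VM would trigger a young collection at this point).
    int allocate(int size) {
        if (size > capacity - top) {
            return -1;
        }
        int address = top;
        top += size;
        return address;
    }

    // After the survivors have been copied elsewhere, the whole nursery
    // is reused at once; dead objects need no individual work.
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        Nursery n = new Nursery(64);
        System.out.println(n.allocate(16)); // prints 0
        System.out.println(n.allocate(32)); // prints 16
        System.out.println(n.allocate(32)); // prints -1, does not fit
        n.reset();
        System.out.println(n.allocate(32)); // prints 0
    }
}
```

Note that `reset` costs the same no matter how many dead objects the nursery held, which is why a high death rate makes this scheme efficient.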
Parallel young generation collector
The single-threaded copying collector described above, while suitable for many deployments, can become a scaling bottleneck in an application that is otherwise parallelized to take advantage of multiple processors. To take full advantage of all available CPUs on multiprocessor machines, the Java HotSpot VM offers an optional multithreaded collector for the young generation, in which the tracing and copying of live objects is done by multiple threads working in parallel. The implementation has been carefully tuned to balance the collection work across all available processors, allowing the collector to scale to a large number of processors. This reduces the pause times for collecting the young space and maximizes garbage collection throughput. The parallel collector has been tested on systems containing more than 100 CPUs and 0.5 terabytes of heap. The parallel young generation collector is the default garbage collection algorithm used with the Server VM.
When moving objects, the parallel collector tries to keep related objects together to improve memory locality and cache utilization, which in turn improves mutator performance. This is done by copying objects depth-first.
The parallel collector can also make better use of available memory. It does not need to keep part of the old object space in reserve to guarantee room for copying all live objects. Instead, it uses a novel technique to copy objects speculatively. If old object space is scarce, this technique allows the collector to switch smoothly to compacting the heap without holding any space in reserve. This results in better utilization of the available heap space.
Finally, the parallel collector is able to dynamically adjust its tunable parameters in response to the heap allocation behavior of the application, thereby improving garbage collection performance in a variety of applications and environments. This means less manual adjustment for the customer. This feature was originally introduced with the parallel collector and is now available for many other garbage collection algorithms.
Compared to the default single-threaded collector, the break-even point for the parallel collector appears to be somewhere between two and four CPUs, depending on the platform and the application. Further improvements are expected in future releases.
Mark-compact old object collector
Although the generational copying collector collects most dead objects efficiently, longer-lived objects still accumulate in the old object memory area. Occasionally, because of low-memory conditions or programmatic requests, an old object garbage collection must be performed. By default, the Java HotSpot VM uses a standard mark-compact collection algorithm, which traverses the entire graph of live objects from its roots, then sweeps through memory, compacting away the gaps left by dead objects. By compacting gaps in the heap rather than collecting them into a free list, memory fragmentation is eliminated, and old object allocation is streamlined because free-list searches are no longer needed.
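The two phases described above (mark from the roots, then slide live objects together) can be sketched on a toy heap. This is an illustrative model only: `Obj`, the array standing in for the heap, and the helper names are hypothetical. Because the toy uses ordinary Java references, it does not need the reference-updating step that a real compacting collector must perform after moving objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class MarkCompactDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) {
            this.name = name;
        }
    }

    // Mark phase: traverse the graph of live objects starting at the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            for (Obj r : o.refs) {
                pending.push(r);
            }
        }
    }

    // Compact phase: slide live objects toward the start of the heap array so
    // that all free space forms one contiguous block at the end.
    // Returns the number of live objects, which is also the next free slot.
    static int compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            heap[i] = null;
            if (o != null && o.marked) {
                o.marked = false; // clear the mark for the next cycle
                heap[free++] = o;
            }
        }
        return free;
    }

    static int collect(Obj[] heap, List<Obj> roots) {
        mark(roots);
        return compact(heap);
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(c); // only a and c are reachable from the root
        Obj[] heap = {a, b, c, d, null};
        int live = collect(heap, Arrays.asList(a));
        System.out.println("live objects: " + live);
        for (Obj o : heap) {
            System.out.print(o == null ? "[free] " : "[" + o.name + "] ");
        }
        System.out.println();
    }
}
```

After the collection, the next allocation slot is simply the index returned by `compact`, so no free-list search is needed.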
Mostly concurrent mark-sweep collector
For applications that require large heaps, the collection pauses caused by the default mark-compact old generation collector can be disruptive, because application threads are suspended for a time proportional to the size of the heap. The Java HotSpot VM implements an optional concurrent collector for the old object space, which can use spare processor cycles (or spare processors) to collect large heaps while suspending application threads only for very short periods. This is accomplished by doing most of the tracing and sweeping work while the application threads are executing. In some cases, peak application throughput may be slightly lower because some processor cycles are devoted to concurrent collection activity.
Parallel old collector
The current version of the Java HotSpot VM introduces a parallel mark-compact collector for the old generation, designed to improve scalability for applications with large heaps. Whereas the concurrent mark-sweep collector focuses on reducing pause times, the parallel old collector focuses on increasing throughput by using multiple threads simultaneously to collect the old generation during a stop-the-world pause. The parallel old collector internally uses many novel techniques and data structures to achieve high scalability while retaining the benefits of precise garbage collection, with minimal bookkeeping overhead during collection cycles.
For more information
For more information about the garbage collection algorithms supported in Java HotSpot VM, see the Memory Management Whitepaper.
Collector details
Serial collector
The serial collector uses a single thread to perform all garbage collection work, which makes it relatively efficient because there is no communication overhead between threads.
It is best suited to single-processor machines because it cannot take advantage of multiprocessor hardware, although it can also be useful on multiprocessors for applications with small data sets (up to about 100 MB). The serial collector is selected by default on certain hardware and operating system configurations, or it can be explicitly enabled with the option -XX:+UseSerialGC.
Parallel collector
The parallel collector, also known as the throughput collector, is a generational collector similar to the serial collector. The main difference between the serial and parallel collectors is that the parallel collector uses multiple threads to speed up garbage collection.
The parallel collector is intended for applications with medium to large data sets that run on multiprocessor or multithreaded hardware. You can enable it with the -XX:+UseParallelGC option.
Parallel compaction is a feature that enables the parallel collector to perform major collections in parallel. Without parallel compaction, major collections are performed using a single thread, which can significantly limit scalability. Parallel compaction is enabled by default if -XX:+UseParallelGC is specified. You can disable it with the -XX:-UseParallelOldGC option.
The mostly concurrent collectors
The Concurrent Mark Sweep (CMS) collector and the Garbage-First (G1) garbage collector are the two mostly concurrent collectors. Mostly concurrent collectors perform some of their expensive work concurrently with the application.
- G1 garbage collector: This server-style collector is intended for multiprocessor machines with a large amount of memory. It meets garbage collection pause-time goals with high probability while achieving high throughput. G1 is selected by default on certain hardware and operating system configurations, or it can be explicitly enabled with -XX:+UseG1GC.
- CMS collector: This collector is intended for applications that prefer shorter garbage collection pauses and can afford to share processor resources with the garbage collector. The CMS collector can be enabled with the option -XX:+UseConcMarkSweepGC. The CMS collector is deprecated as of JDK 9.
Z Garbage Collector
The Z Garbage Collector (ZGC) is a scalable low-latency garbage collector. ZGC performs all expensive work concurrently, without stopping the execution of application threads.
ZGC is intended for applications that require low latency (pauses of less than 10 ms) and/or use a very large heap (multiple terabytes). You can enable it with the -XX:+UseZGC option.
Since JDK 11, ZGC has been offered as an experimental feature.
Selecting a collector
Unless your application has fairly strict pause time requirements, run your application first and allow the VM to select the collector.
If necessary, adjust the heap size to improve performance. If performance still fails to meet your goals, use the following guidelines as a starting point for selecting a collector:
- If your application has a small data set (up to about 100 MB), select the serial collector with the option -XX:+UseSerialGC.
- If the application will run on a single processor and there are no pause-time requirements, select the serial collector with the option -XX:+UseSerialGC.
- If (a) peak application performance is the first priority and (b) there are no pause-time requirements or pauses of one second or longer are acceptable, let the VM select the collector or select the parallel collector with -XX:+UseParallelGC.
- If response time is more important than overall throughput and garbage collection pauses must be kept shorter than about one second, select a mostly concurrent collector with -XX:+UseG1GC or -XX:+UseConcMarkSweepGC.
- If response time is a high priority and/or you are using a very large heap, select a fully concurrent collector with -XX:+UseZGC.
These guidelines provide only a starting point for selecting a collector, because performance depends on the size of the heap, the amount of live data maintained by the application, and the number and speed of available processors.
If the recommended collector does not achieve the desired performance, first try adjusting the heap and generation sizes to meet the desired goals. If performance is still inadequate, try a different collector: use a concurrent collector to reduce pause times, and use the parallel collector to increase overall throughput on multiprocessor hardware.
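When experimenting with the options above, it helps to confirm which collector the VM actually selected. One way, sketched below, is to query the standard `java.lang.management` API. The class name `ShowCollectors` is made up for this example, and the exact bean names printed depend on the JVM vendor, version, and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ShowCollectors {
    // Names of the garbage collector beans registered by the running VM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time since VM start.
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running the same class with, for example, `java -XX:+UseSerialGC ShowCollectors` and `java -XX:+UseParallelGC ShowCollectors` should show different bean names, which makes it easy to verify that a flag took effect before comparing pause times or throughput.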
Professional GC Settings:
Parallel collector
- The parallel collector (also known here as the throughput collector) is a generational collector similar to the serial collector. The main difference between the serial and parallel collectors is that the parallel collector uses multiple threads to speed up garbage collection.
- Enable the parallel collector with the command-line option -XX:+UseParallelGC. By default, with this option, both minor and major collections are run in parallel to further reduce garbage collection overhead.
Concurrent Mark Sweep (CMS) collector
- The Concurrent Mark Sweep (CMS) collector is designed for applications that prefer shorter garbage collection pauses and can afford to share processor resources with the garbage collector while the application is running.
- Typically, applications that have a relatively large set of long-lived data (a large old generation) and run on machines with two or more processors tend to benefit from this collector. Enable the CMS collector with the command-line option -XX:+UseConcMarkSweepGC.
- The CMS collector is deprecated. Strongly consider using the Garbage-First collector instead.
G1 collector
Introduction to the Garbage-First garbage collector
The Garbage-First (G1) garbage collector is targeted at multiprocessor machines with a large amount of memory. It attempts to meet garbage collection pause-time goals with high probability while achieving high throughput with little need for configuration. G1 aims to provide the best balance between latency and throughput for current target applications and environments whose features include:
- Heap sizes of 10 GB or more, with more than 50% of the Java heap occupied by live data.
- Rates of object allocation and promotion that can vary significantly over time.
- A significant amount of fragmentation in the heap.
- Predictable pause-time goals of no more than a few hundred milliseconds, avoiding long garbage collection pauses.
G1 replaces the Concurrent Mark Sweep (CMS) collector. It is also the default collector.
The G1 collector achieves high performance and attempts to achieve pause time goals in several ways described in the following sections.
Enabling G1
- The Garbage-First garbage collector is the default collector, so you generally do not have to do anything. You can explicitly enable it by providing -XX:+UseG1GC on the command line.
Basic concepts
- G1 is a generational, incremental, parallel, mostly concurrent, stop-the-world, evacuating garbage collector that monitors pause-time goals in each of its stop-the-world pauses. Like other collectors, G1 splits the heap into (virtual) young and old generations. Space-reclamation efforts concentrate on the young generation, where reclamation is most efficient, with occasional space reclamation in the old generation.
- Some operations are always performed in stop-the-world pauses to improve throughput. Other operations that would take more time with the application stopped, such as whole-heap operations like global marking, are performed in parallel and concurrently with the application. To keep stop-the-world pauses for space reclamation short, G1 performs space reclamation incrementally, in steps, and in parallel. G1 achieves predictability by tracking information about previous application behavior and garbage collection pauses to build a model of the associated costs. It uses this information to size the work done in each pause. For example, G1 reclaims space in the most efficient areas first (that is, the areas that are mostly filled with garbage, hence the name).
- G1 reclaims space mostly by evacuation: live objects found within the memory areas selected for collection are copied into new memory areas, compacting them in the process. After an evacuation has completed, the space previously occupied by live objects is reused for allocation by the application.
- The Garbage-First collector is not a real-time collector. It tries to meet the set pause-time goals with high probability over a longer period of time, but not always with absolute certainty for a given pause.
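The "most garbage first, within a pause budget" idea can be sketched as a greedy selection. This is a deliberately simplified model and not G1's actual algorithm: the `Region` class, the linear `costPerLiveByteMs` cost estimate, and the stop-at-first-overrun rule are assumptions made for illustration, standing in for G1's much richer prediction model.

```java
import java.util.ArrayList;
import java.util.List;

class CollectionSetDemo {
    static final class Region {
        final int id;
        final int liveBytes;    // bytes that would have to be evacuated
        final int garbageBytes; // bytes that would be reclaimed

        Region(int id, int liveBytes, int garbageBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.garbageBytes = garbageBytes;
        }

        // Bytes reclaimed per byte copied: higher means cheaper to collect.
        double efficiency() {
            return (double) garbageBytes / Math.max(1, liveBytes);
        }
    }

    // Greedily pick the most efficient regions until the predicted copying
    // cost of the next region would exceed the pause budget.
    static List<Region> choose(List<Region> candidates,
                               double pauseBudgetMs,
                               double costPerLiveByteMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<Region> chosen = new ArrayList<>();
        double predictedMs = 0;
        for (Region r : sorted) {
            double cost = r.liveBytes * costPerLiveByteMs;
            if (predictedMs + cost > pauseBudgetMs) {
                break;
            }
            predictedMs += cost;
            chosen.add(r);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 900, 100)); // mostly live: expensive, little gain
        regions.add(new Region(1, 100, 900)); // mostly garbage: cheap, large gain
        regions.add(new Region(2, 300, 700));
        for (Region r : choose(regions, 5.0, 0.01)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

With the sample data, regions 1 and 2 fit in the 5 ms budget and the mostly live region 0 is left for later. Raising the budget (the role played by -XX:MaxGCPauseMillis in G1) lets more regions into a single pause.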
Z garbage collector
The Z Garbage Collector (ZGC) is a scalable low-latency garbage collector. ZGC performs all expensive work concurrently, without stopping the execution of application threads for more than 10 milliseconds, which makes it suitable for applications that require low latency and/or use a very large heap (multiple terabytes).
The Z Garbage Collector is available as an experimental feature and is enabled with the command-line options -XX:+UnlockExperimentalVMOptions -XX:+UseZGC.
Set heap size
- The most important tuning option for ZGC is the maximum heap size (-Xmx). Because ZGC is a concurrent collector, the maximum heap size must be chosen so that 1) the heap can accommodate the live set of your application, and 2) there is enough headroom in the heap for allocations to be serviced while the GC is running. How much headroom is needed depends on the allocation rate and the live-set size of your application. In general, the more memory you give ZGC, the better. At the same time, wasting memory is undesirable, so it is a matter of finding a balance between memory usage and how often the GC needs to run.
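To judge how much headroom a given -Xmx setting leaves, it can help to print the heap figures the VM reports at run time. The sketch below uses only the standard `Runtime` API; the class name `HeapInfo` and its helper methods are made up for this example, and the "used" figure is only a rough snapshot because it includes garbage that has not been collected yet.

```java
class HeapInfo {
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Approximate bytes currently occupied (live data plus uncollected garbage).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the effective maximum heap size (-Xmx).
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("used heap (MiB):      " + toMiB(usedBytes()));
    }
}
```

Sampling these values periodically under realistic load gives a first impression of whether the gap between used and maximum heap is comfortably larger than what the application allocates during one GC cycle.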
Set the number of concurrent GC threads
- The second tuning option you might want to look at is the number of concurrent GC threads (-XX:ConcGCThreads). ZGC has heuristics for selecting this number automatically. The heuristics usually work well, but depending on the characteristics of your application the number might need to be adjusted. This option essentially dictates how much CPU time the GC should be given. Give it too much, and the GC will steal too much CPU time from the application. Give it too little, and the application might allocate garbage faster than the GC can collect it.
Ergonomic defaults for G1 GC
| Option and default value | Description |
|---|---|
| -XX:MaxGCPauseMillis=200 | The goal for the maximum pause time. |
| -XX:GCPauseTimeInterval=<ergo> | The goal for the maximum pause-time interval. By default, G1 does not set any goal, which allows G1 to perform garbage collections back-to-back in extreme cases. |
| -XX:ParallelGCThreads=<ergo> | The maximum number of threads used for parallel work during garbage collection pauses. It is derived from the number of threads available on the computer that the VM runs on: if the number of CPU threads available to the process is 8 or fewer, that number is used. Otherwise, five eighths of the threads beyond 8 are added to the final thread count. At the start of every pause, the maximum number of threads used is further constrained by the maximum total heap size: G1 does not use more than one thread per -XX:HeapSizePerGCThread amount of Java heap capacity. |
| -XX:ConcGCThreads=<ergo> | The maximum number of threads used for concurrent work. By default, this value is -XX:ParallelGCThreads divided by 4. |
| -XX:+G1UseAdaptiveIHOP, -XX:InitiatingHeapOccupancyPercent=45 | Defaults for controlling the initiating heap occupancy. They indicate that adaptive determination of this value is turned on, and that for the first few collection cycles G1 uses an occupancy of 45% of the old generation as the mark start threshold. |
| -XX:G1HeapRegionSize=<ergo> | The heap region size, set based on the initial and maximum heap size so that the heap contains roughly 2048 heap regions. The size of a heap region can vary from 1 to 32 MB and must be a power of 2. |
| -XX:G1NewSizePercent=5, -XX:G1MaxNewSizePercent=60 | The total size of the young generation, which varies between these two values, expressed as percentages of the Java heap currently in use. |
| -XX:G1HeapWastePercent=5 | The allowed unreclaimed space in the collection set candidates, as a percentage. G1 stops the space-reclamation phase if the free space in the collection set candidates is lower than this value. |
| -XX:G1MixedGCCountTarget=8 | The expected length of the space-reclamation phase, as a number of collections. |
| -XX:G1MixedGCLiveThresholdPercent=85 | Old generation regions with a live object occupancy higher than this percentage are not collected in this space-reclamation phase. |
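The arithmetic behind three of the ergonomic defaults in the table can be written out as a short sketch. It follows the descriptions in the table rather than the HotSpot source, so treat it as an approximation: the class and method names are hypothetical, the concurrent thread count is clamped to at least 1 as an assumption, and the region size is computed from a single heap size instead of both the initial and maximum sizes.

```java
class G1ErgoSketch {
    static final long MB = 1024L * 1024L;

    // Up to 8 CPU threads: use them all. Beyond 8: add five eighths of the rest.
    static int parallelGcThreads(int cpuThreads) {
        return cpuThreads <= 8 ? cpuThreads : 8 + (cpuThreads - 8) * 5 / 8;
    }

    // A quarter of the parallel threads (assumed here to be at least 1).
    static int concGcThreads(int parallelGcThreads) {
        return Math.max(1, parallelGcThreads / 4);
    }

    // Aim for roughly 2048 regions, round down to a power of 2,
    // and keep the result between 1 MB and 32 MB.
    static long regionSizeBytes(long heapBytes) {
        long target = Math.max(heapBytes / 2048, MB);
        long powerOfTwo = Long.highestOneBit(target);
        return Math.min(powerOfTwo, 32 * MB);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int parallel = parallelGcThreads(cpus);
        System.out.println("CPU threads:        " + cpus);
        System.out.println("ParallelGCThreads:  " + parallel);
        System.out.println("ConcGCThreads:      " + concGcThreads(parallel));
        System.out.println("region size (4 GB): " + regionSizeBytes(4096 * MB) / MB + " MB");
    }
}
```

For example, under these rules a machine with 16 CPU threads gets 8 + 8 × 5/8 = 13 parallel GC threads and 13 / 4 = 3 concurrent GC threads, and a 4 GB heap gets 2 MB regions.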
Ultra-fast thread synchronization
The Java programming language allows the use of multiple concurrent paths of program execution, called threads. It provides language-level thread synchronization, which makes it easy to express multithreaded programs with fine-grained locking. Previous synchronization implementations, such as the one in the Classic VM, were very inefficient compared to other micro-operations in the Java programming language, making fine-grained synchronization a major performance bottleneck.
The Java HotSpot VM uses state-of-the-art techniques for both uncontended and contended synchronization operations, which greatly improves synchronization performance. Uncontended synchronization operations, which make up the majority of synchronizations, are implemented with ultra-fast, constant-time techniques. With the latest optimizations, in the best case these operations are essentially free, even on multiprocessor computers. Contended synchronization operations use advanced adaptive spinning techniques to improve throughput even for applications with heavy lock contention. As a result, synchronization performance becomes so fast that it is not a significant performance issue for the vast majority of real-world programs.
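The language-level synchronization discussed above looks like this in practice. The sketch is only an illustration of uncontended versus contended use of the same lock: the `SyncCounter` class and its `run` helper are made up for this example, and it makes no claim about which locking technique a particular VM version applies internally.

```java
class SyncCounter {
    private long value;

    // Each call acquires and releases this object's monitor.
    synchronized void increment() {
        value++;
    }

    synchronized long get() {
        return value;
    }

    // Runs the given number of threads, each incrementing a shared counter.
    // With one thread the lock is never contended; with several threads the
    // same code exercises the contended path.
    static long run(int threads, int incrementsPerThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("uncontended: " + run(1, 1_000_000));
        System.out.println("contended:   " + run(4, 250_000));
    }
}
```

Both runs perform one million increments and produce the same total. Timing them on a given machine is a simple way to see how much contention, rather than locking itself, costs.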
64-bit architecture
Earlier versions of the Java HotSpot VM were limited to addressing 4 GB of memory, even on 64-bit operating systems such as the Solaris OE. While 4 GB is a lot for a desktop system, modern servers can contain far more memory. For example, the Sun Fire E25K server supports up to 1.15 TB of memory per domain. With a 64-bit JVM, Java technology-based applications can now use the full memory of such systems.
Several classes of applications benefit from 64-bit addressing, for example those that keep very large data sets in memory. Such applications can now avoid the overhead of paging data in from disk or extracting it from an RDBMS, which can lead to dramatic performance improvements.
The Java HotSpot VM is now 64-bit safe, and the Server VM includes support for both 32-bit and 64-bit operation. Users can select 32-bit or 64-bit operation with the command-line flags -d32 or -d64, respectively. Users of the Java Native Interface need to recompile their code to run it on the 64-bit VM.
Object packing
Object packing was added to minimize the space wasted between data types of different sizes. This is primarily an advantage in 64-bit environments, but it offers a small advantage even in 32-bit VMs.
For example:

```java
public class Button {
    char shape;
    String label;
    int xposition;
    int yposition;
    char color;
    int joe;
    Object mike;
    char armed;
}
```

This layout wastes space between color and joe (three bytes to pad to an int boundary) and between joe and mike (four bytes on a 64-bit VM to pad to a pointer boundary). The fields are now reordered as follows:

```java
...
Object mike;
int joe;
char color;
char armed;
...
```

In this example, no memory space is wasted.

Sources:
<https://www.oracle.com/technetwork/java/whitepaper-135217.html#memory>
<https://docs.oracle.com/en/java/javase/11/gctuning/available-collectors.html#GUID-F215A508-9E58-40B4-90A5-74E29BF3BD3C>
<https://docs.oracle.com/en/java/javase/11/gctuning/garbage-first-garbage-collector.html#GUID-082C967F-2DAC-4B59-8A81-0CEC6EEB9016>