Optimization objectives and Strategies (Ergonomics)

By default, the JVM selects the garbage collector, heap size, and runtime compiler as follows:

  • The G1 (Garbage-First) collector
  • A maximum number of GC threads limited by the heap size and the available CPU resources
  • An initial heap size (-Xms) of 1/64 of physical memory
  • A maximum heap size (-Xmx) of 1/4 of physical memory
  • The tiered compiler, using both C1 and C2
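
The defaults chosen on a given machine can be inspected from inside a running program. The following is a minimal sketch using the standard `Runtime` and `java.lang.management` APIs; the class name `ErgonomicsReport` and the helper `toMiB` are made up for this illustration, and the values printed depend entirely on the machine and the JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class ErgonomicsReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Available processors: " + rt.availableProcessors());
        // maxMemory() is roughly the effective maximum heap; totalMemory() is what is committed now.
        System.out.println("Max heap (MiB):       " + toMiB(rt.maxMemory()));
        System.out.println("Committed heap (MiB): " + toMiB(rt.totalMemory()));
        // The bean names hint at which collector was selected.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
    }
}
```

Running it once without options and once with an explicit -Xmx value is a quick way to see the difference between the ergonomic default and a manual setting.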

The Java HotSpot VM garbage collectors can be configured to preferentially meet one of two goals: maximum pause time or application throughput. If the preferred goal is met, the collector tries to maximize the other.

Maximum pause-time Goal

The pause time is the length of time during which the garbage collector stops the application and reclaims space that is no longer in use. The intent of the maximum pause-time goal is to limit the longest of these pauses.

  • Use the command-line option -XX:MaxGCPauseMillis=<nnn> to specify the maximum pause-time goal. This is interpreted as a hint to the garbage collector that pauses of <nnn> milliseconds or less are desired.

  • The garbage collector adjusts the Java heap size and other garbage collection-related parameters in an attempt to keep garbage collection pauses shorter than <nnn> milliseconds.

  • The default value of the maximum pause time target varies from collector to collector.

  • These adjustments can cause garbage collection to occur more frequently, reducing the overall throughput of the application. In some cases, the desired pause-time goal still cannot be met.

Throughput Goal

The throughput goal is measured in terms of the time spent collecting garbage; the time spent outside of garbage collection is application time.

The goal is specified by the command-line option -XX:GCTimeRatio=<nnn>. The ratio of garbage collection time to application time is 1/(1+nnn). For example, -XX:GCTimeRatio=19 sets a goal of 1/20, or 5%, of the total time for garbage collection.

The time spent on garbage collection is the total time for pauses caused by all garbage collections.

If the throughput goal is not met, one action the garbage collector might take is to increase the size of the heap so that the application can spend more time between collection pauses.
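
The 1/(1+nnn) rule is easy to check with a few lines of arithmetic. The sketch below only evaluates the formula quoted above; the class and method names (`GcTimeRatio`, `gcTimeFraction`) are invented for the example and are not part of any JVM API.

```java
class GcTimeRatio {
    /** Target fraction of total time spent in GC for -XX:GCTimeRatio=n, i.e. 1 / (1 + n). */
    static double gcTimeFraction(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("ratio must be non-negative");
        }
        return 1.0 / (1.0 + n);
    }

    public static void main(String[] args) {
        // A larger ratio asks the collector to spend a smaller share of time on GC.
        for (int n : new int[] {9, 19, 99}) {
            System.out.printf("GCTimeRatio=%d -> %.1f%% of total time in GC%n",
                    n, 100.0 * gcTimeFraction(n));
        }
    }
}
```

With n = 19 the function returns 0.05, which matches the 5% figure in the text.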

Footprint

  • If the throughput and maximum pause-time goals have both been met, the garbage collector reduces the size of the heap until one of the goals (invariably the throughput goal) can no longer be met.

  • The minimum and maximum heap sizes available to the garbage collector can be set using -Xms<size> and -Xmx<size>, respectively.

Garbage Collector Implementation

Generational Garbage Collection

An object is considered garbage, and its memory can be reused by the JVM, when it can no longer be reached from any reference held by any other live object in the running program.

Reachability analysis

  • The simplest garbage collection algorithm iterates over every reachable object each time it runs. Anything left over is considered garbage.

  • This approach takes time proportional to the number of live objects, which is prohibitive for large applications that maintain large amounts of live data.

The Java HotSpot VM incorporates a number of different garbage collection algorithms, all of which, except ZGC, use a technique called generational collection.

While naive garbage collection examines every live object in the heap each time, generational collection exploits several empirically observed properties of most applications to minimize the work required to reclaim unused (garbage) objects.

  • The most important of these observed properties is the weak generational hypothesis, which states that most objects survive for only a short time.
  • In addition, some objects allocated at initialization live until the JVM exits.

Efficient collection is made possible by focusing on the fact that most objects "die young."

Generations

  • To optimize this scenario, memory is managed by generation (memory pools that hold objects of different ages). Garbage collection occurs as each generation fills up.

  • The vast majority of objects are allocated in a pool dedicated to young objects (young generation), where most objects die.

  • When the young generation fills up, a Minor GC is triggered in which only the young generation is collected; garbage in other generations is not reclaimed. The cost of this collection is, to a first approximation, proportional to the number of live objects being collected, so a young generation full of dead objects is collected very quickly.

  • Typically, during each Minor GC, some of the surviving objects in the young generation (those in the survivor spaces) are moved to the old generation.

  • Eventually, the old generation fills up and must be collected, resulting in a Major GC (Full GC) in which the entire heap (including the method area) is collected.

Major GC typically lasts much longer than Minor GC because the number of objects involved is much larger.

  • At startup, the Java HotSpot VM reserves the entire Java heap in the address space, but does not allocate any physical memory for it unless needed.

  • The entire address space reserved for object storage is logically divided into the young generation and the old generation.

The young generation

The Young generation consists of Eden and two Survivor Spaces.

Introduction to the Minor GC
  1. Most objects are initially allocated in Eden. One survivor space is empty at any given time; we call it the To survivor space. During garbage collection it serves as the destination for live objects from Eden and from the other survivor space (From). After garbage collection, both Eden and the From survivor space are empty.

  2. In the next garbage collection, the roles of the two survivor spaces are swapped. The space that was most recently filled becomes the From space, and its live objects are copied into the To space. Objects are copied between the two survivor spaces in this way until they have been copied a certain number of times or there is not enough space left, at which point they are copied into the old generation. This process is also called "aging" (much like life!).
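
The copy-and-swap cycle described in the two steps above can be modelled with a toy simulation. This is only a sketch under loud assumptions: objects are represented by their age alone, the survivor capacity and the tenuring threshold are arbitrary constants chosen for the illustration, every object already in From is treated as still live, and none of the names (`MinorGcSketch`, `allocate`, `minorGc`) correspond to anything in HotSpot.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcSketch {
    static final int TENURING_THRESHOLD = 3; // assumed: copies survived before promotion
    static final int SURVIVOR_CAPACITY = 4;  // assumed: capacity counted in objects

    // Each list entry is the age of one object (number of collections survived).
    List<Integer> eden = new ArrayList<>();
    List<Integer> from = new ArrayList<>();
    List<Integer> to = new ArrayList<>();    // empty between collections
    List<Integer> old = new ArrayList<>();

    /** Allocates new objects (age 0) in Eden. */
    void allocate(int count) {
        for (int i = 0; i < count; i++) {
            eden.add(0);
        }
    }

    /** One minor collection: the first liveInEden Eden objects and all From objects survive. */
    void minorGc(int liveInEden) {
        List<Integer> live = new ArrayList<>(eden.subList(0, Math.min(liveInEden, eden.size())));
        live.addAll(from);
        for (int age : live) {
            int newAge = age + 1;
            if (newAge >= TENURING_THRESHOLD || to.size() >= SURVIVOR_CAPACITY) {
                old.add(newAge);   // aged out, or To is full: promote to the old generation
            } else {
                to.add(newAge);    // copy into the empty survivor space
            }
        }
        eden.clear();              // Eden and From are empty after the collection
        from.clear();
        List<Integer> tmp = from;  // swap roles: the filled space becomes From
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch();
        for (int cycle = 1; cycle <= 4; cycle++) {
            heap.allocate(10);
            heap.minorGc(2);
            System.out.println("After GC " + cycle + ": from=" + heap.from + " old=" + heap.old);
        }
    }
}
```

In this model an object reaches the old generation either because it has survived enough copies or because the To space had no room left, which are the two promotion paths named in step 2.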

Performance considerations

The primary metrics for garbage collection are throughput and latency.

  • Throughput is the percentage of total time not spent in garbage collection over a long period of time. Throughput includes the time spent on allocation (although tuning allocation speed is not usually required).

  • Latency is the responsiveness of the application. Garbage collection pauses affect the responsiveness of applications.
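
As a rough illustration of the throughput definition, a program can estimate its own figure from the standard management beans. This is a sketch rather than a measurement tool: `GcThroughput` and its methods are names invented here, and the time reported by `getCollectionTime()` is only an approximation of pause time (for concurrent collectors it may include work done while the application was running).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    /** Throughput as the fraction of elapsed time not spent in garbage collection. */
    static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0;
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time reported by every collector bean. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("Approximate throughput: %.2f%%%n",
                100.0 * throughput(totalGcMillis(), uptime));
    }
}
```

For example, 50 ms of collection time over 1000 ms of uptime corresponds to a throughput of 95%.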

Users have different requirements for garbage collection.

Measurement of Throughput and Footprint

Throughput and footprint are best measured using application-specific metrics.

For example, the throughput of a web server can be tested using a client load generator. Pauses caused by garbage collection, however, are easy to estimate by examining the diagnostic output of the virtual machine itself. The command-line option -verbose:gc prints information about the heap and garbage collection at each collection.

Here’s an example:

[15,651s][info][gc] GC(36) Pause Young (G1 Evacuation Pause) 239M->57M(307M) (15,646s, ...
... GC(37) Pause Young (G1 Evacuation Pause) 238M->57M(307M) (16,146s, ...
... [info][gc] GC(38) Pause Full (System.gc()) 69M->31M(104M) (16,202s, 16,367s) 164,581ms

The output shows two young-generation collections, followed by a full collection that the application initiated by calling System.gc().

  • Each line starts with a timestamp indicating the time since the application started.

  • Next comes the log level (info) and the tag (gc) for the line.

    • Then comes the GC identification number. In this case there are three GCs, numbered 36, 37, and 38.

    • Then the type of GC and the cause for starting it are logged.

    • After that, some information about memory consumption is logged. The format is "heap used before GC" -> "heap used after GC" (heap size).

      • In the first line of the example this is 239M->57M(307M), which means that 239 MB were in use before the GC, that the GC reclaimed most of that memory but 57 MB survived, and that the heap size is 307 MB.
    • Notice that in this example the Full GC shrinks the heap from 307 MB to 104 MB. After the memory usage information, the start and end times of the GC are logged, followed by its duration (end minus start).
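
The memory triple in such a line can be extracted mechanically. The following sketch parses only the "before->after(heap)" part with a regular expression; the class `GcLogLine`, the method `parseHeap`, and the assumption that sizes are always printed with an `M` suffix are simplifications made for this example, since real logs may use other units.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Matches the "used before -> used after (heap size)" triple, e.g. 239M->57M(307M).
    private static final Pattern HEAP = Pattern.compile("(\\d+)M->(\\d+)M\\((\\d+)M\\)");

    /** Returns {usedBefore, usedAfter, heapSize} in MB, or null if the line has no such triple. */
    static long[] parseHeap(String line) {
        Matcher m = HEAP.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        long[] h = parseHeap("GC(36) Pause Young (G1 Evacuation Pause) 239M->57M(307M)");
        // Prints: Reclaimed 182 MB; heap size 307 MB
        System.out.println("Reclaimed " + (h[0] - h[1]) + " MB; heap size " + h[2] + " MB");
    }
}
```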


  • -verbose:gc is an alias for -Xlog:gc.
  • -Xlog is the general logging configuration option of the HotSpot JVM.
    • It is a tag-based system in which gc is one of the tags. To get more information about what the GC is doing, you can configure logging to print any message that has the gc tag together with any other tag.

    The command-line option for this is -Xlog:gc*.

Here is an example of a G1 young-generation collection logged with -Xlog:gc*:

[10.178s][info][gc,start] GC(36) Pause Young (G1 Evacuation Pause)
[10.178s][info][gc,task] GC(36) Using 28 workers of ... for evacuation
[10.191s][info][gc,phases] GC(36) Pre Evacuate Collection Set: 37 ms
[10.191s][info][gc,phases] GC(36) Evacuate Collection Set: 37 ms
[10.191s][info][gc,phases] GC(36) Post Evacuate Collection Set: 37 ms
[10.191s][info][gc,phases] GC(36) Other: 0.2ms
[10.191s][info][gc,heap] GC(36) Eden regions: 286->0(276)
[10.191s][info][gc,heap] GC(36) Survivor regions: 15->26(38)
[10.191s][info][gc,heap] GC(36) Old regions: 88->88
[10.191s][info][gc,heap] GC(36) Humongous regions: 3->1
[10.191s][info][gc,metaspace] GC(36) Metaspace: 8152K->8152K(1056768K)
[10.191s][info][gc] GC(36) Pause Young (G1 Evacuation Pause) 391M->114M(508M) 13.075ms
[10.191s][info][gc,cpu] GC(36) User=0.20s Sys=0.00s Real=0.01s

Factors that affect garbage collection performance

The two most important factors that affect garbage collection performance are the total available memory and the proportion of the heap dedicated to the young generation.

Total Heap

The most important factor affecting garbage collection performance is the total available memory. Because collections occur when generations fill up, throughput is inversely proportional to the amount of memory available.

Heap options that affect generation size

A number of options affect generation size. The points below describe the difference between committed space and virtual space in the heap.

  • The entire heap is reserved when the virtual machine is initialized. You can specify the size of the reserved space using the -Xmx option. If the value of the -Xms parameter is smaller than the value of the -Xmx parameter, not all of the reserved space is committed to the virtual machine immediately.

  • The uncommitted space is referred to as "virtual". The different parts of the heap, that is, the old generation and the young generation, can grow to the limit of the virtual space as needed.

Some of the parameters are ratios of one part of the heap to another. For example, the -XX:NewRatio parameter denotes the relative size of the old generation to the young generation.

Default option values for heap size
  • By default, the virtual machine grows or shrinks the heap at each collection to try to keep the proportion of free space to live objects within a specific range.

  • This target range is set as a percentage by the options -XX:MinHeapFreeRatio=<minimum> and -XX:MaxHeapFreeRatio=<maximum>, and the total size is bounded below by -Xms<min> and above by -Xmx<max>.

The command-line options -XX:MaxHeapFreeRatio (default: 70%) and -XX:MinHeapFreeRatio (default: 40%) can be lowered to help minimize the Java heap size.

  • With these default values:
    • If the percentage of free space in a generation falls below 40%, the generation is expanded to maintain 40% free space, up to the maximum allowed size of the generation.
    • If the free space exceeds 70%, the generation is shrunk so that only 70% of the space is free, subject to the minimum size of the generation.
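
The resizing rule can be written down as a small calculation. The sketch below is a simplified model of the policy as described in the text, not the JVM's actual sizing code: sizes are whole megabytes, and the names `HeapSizing` and `resize` as well as the rounding choices are assumptions of this example.

```java
class HeapSizing {
    /**
     * Returns the capacity (MB) a generation would be resized to so that its free
     * percentage stays between minFreePct and maxFreePct, clamped to [minMb, maxMb].
     */
    static long resize(long usedMb, long capacityMb, int minFreePct, int maxFreePct,
                       long minMb, long maxMb) {
        if (minFreePct < 0 || maxFreePct >= 100 || minFreePct > maxFreePct) {
            throw new IllegalArgumentException("need 0 <= min <= max < 100");
        }
        long freeMb = capacityMb - usedMb;
        long target = capacityMb;
        if (freeMb * 100 < (long) minFreePct * capacityMb) {
            // Too little free space: grow until exactly minFreePct is free (rounded up).
            long liveShare = 100 - minFreePct;
            target = (usedMb * 100 + liveShare - 1) / liveShare;
        } else if (freeMb * 100 > (long) maxFreePct * capacityMb) {
            // Too much free space: shrink until at most maxFreePct is free (rounded down).
            target = usedMb * 100 / (100 - maxFreePct);
        }
        return Math.max(minMb, Math.min(maxMb, target));
    }

    public static void main(String[] args) {
        // 90 MB live in a 100 MB generation leaves 10% free, below the 40% minimum.
        System.out.println(resize(90, 100, 40, 70, 64, 1024)); // prints 150
        // 20 MB live in a 100 MB generation leaves 80% free, above the 70% maximum.
        System.out.println(resize(20, 100, 40, 70, 64, 1024)); // prints 66
    }
}
```

The clamping in the last line of `resize` corresponds to the "up to the maximum" and "subject to the minimum" qualifications in the two rules.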

The calculations used for the parallel collector in Java SE are now used for all garbage collectors. Part of the calculation is the upper limit on the maximum heap size for 64-bit platforms. A similar calculation is made for the client JVM, which results in a smaller maximum heap size than the server JVM.

Here are some general guidelines for server application heap size:

  • Unless you have pause problems, try to grant the virtual machine as much memory as possible. The default size is usually too small.

  • Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine.

  • If you need to minimize your application's dynamic memory footprint (the maximum amount of RAM consumed during execution), you can do this by minimizing the Java heap size.

The young generation

In addition to the total available memory, the second most important factor affecting garbage collection performance is the proportion of the heap dedicated to the young generation.

Young generation scale options

By default, the size of the young generation is controlled by the option -XX:NewRatio.

For example, setting -XX:NewRatio=3 means that the ratio between the young and old generations is 1:3. In other words, the combined size of Eden and the survivor spaces is one quarter of the total heap size.

  • The -XX:NewSize and -XX:MaxNewSize options set the lower and upper bounds of the young generation size.
    • This is useful for tuning the young generation at a finer granularity than the integer multiples allowed by -XX:NewRatio.
  • Setting both to the same value fixes the size of the young generation, just as setting -Xms and -Xmx to the same value fixes the total heap size.
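
The relationship between these three options can be illustrated with a short calculation. This is a sketch of the arithmetic only: `YoungGenSizing` and `youngSizeMb` are names invented for the example, sizes are in whole megabytes, and `Long.MAX_VALUE` stands in for an unset -XX:MaxNewSize.

```java
class YoungGenSizing {
    /**
     * Young generation size (MB) for a given heap size and -XX:NewRatio value,
     * bounded below by newSizeMb and above by maxNewSizeMb.
     */
    static long youngSizeMb(long heapMb, int newRatio, long newSizeMb, long maxNewSizeMb) {
        // young : old = 1 : newRatio, so the young generation is 1 / (newRatio + 1) of the heap.
        long byRatio = heapMb / (newRatio + 1);
        return Math.max(newSizeMb, Math.min(maxNewSizeMb, byRatio));
    }

    public static void main(String[] args) {
        // NewRatio=3 on a 1024 MB heap gives a quarter of the heap to the young generation.
        System.out.println(youngSizeMb(1024, 3, 0, Long.MAX_VALUE)); // prints 256
    }
}
```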

Survivor space adjustment

You can adjust the size of the survivor spaces using the -XX:SurvivorRatio option, but this is usually not important for performance.

  • For example, -XX:SurvivorRatio=6 sets the ratio between each survivor space and Eden to 1:6. In other words, each survivor space is 1/6 the size of Eden and therefore 1/8 the size of the young generation (not 1/7, because there are two survivor spaces).

  • If the survivor spaces are too small, the copying collection overflows directly into the old generation.

  • If the survivor spaces are too large, they sit uselessly empty.

At each garbage collection, the virtual machine chooses a threshold number, which is the number of times an object can be copied before it is promoted to the old generation. The threshold is chosen to keep the survivor spaces half full.

You can use the logging configuration -Xlog:gc,age to display this threshold as well as the ages of the objects in the young generation. This is also useful for observing the lifetime distribution of an application.
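
The 1/8 figure follows directly from the layout of the young generation (one Eden plus two survivor spaces). A minimal sketch of that arithmetic, with the hypothetical names `SurvivorSizing`, `survivorMb`, and `edenMb` and sizes in whole megabytes:

```java
class SurvivorSizing {
    /** Size of one survivor space: young = eden + 2 * survivor and eden = ratio * survivor. */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Eden gets whatever the two survivor spaces do not use. */
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    public static void main(String[] args) {
        // SurvivorRatio=6 on an 800 MB young generation: each survivor space is 1/8 of it.
        System.out.println(survivorMb(800, 6)); // prints 100
        System.out.println(edenMb(800, 6));     // prints 600
    }
}
```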
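
One way to picture how a "half full" threshold could be chosen is to add up the survivor bytes age by age and stop at the first age where the running total exceeds half of a survivor space. The sketch below is a simplified illustration of that idea and is not taken from the JVM source; the class `TenuringThreshold`, the `maxThreshold` cap, and the input layout (`bytesByAge[a]` holds the bytes of objects of age a, index 0 unused) are all assumptions of this example.

```java
class TenuringThreshold {
    /**
     * Picks the smallest age at which the cumulative survivor bytes exceed half of
     * the survivor capacity, capped at maxThreshold.
     */
    static int compute(long[] bytesByAge, long survivorCapacity, int maxThreshold) {
        long desired = survivorCapacity / 2; // the "half full" target
        long total = 0;
        int age = 1;
        while (age < bytesByAge.length) {
            total += bytesByAge[age];
            if (total > desired) {
                break; // objects of this age and older would overfill the target
            }
            age++;
        }
        return Math.min(age, maxThreshold);
    }

    public static void main(String[] args) {
        // 30 bytes each at ages 1..3 in a 100-byte survivor space: ages 1 and 2 already exceed 50.
        System.out.println(compute(new long[] {0, 30, 30, 30}, 100, 15)); // prints 2
    }
}
```

When survivors are sparse the loop runs to the end of the table and the cap decides the result, so long-lived objects get more chances to die before being promoted.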

Option	Default value
-XX:NewRatio	2
-XX:NewSize	1310 MB
-XX:MaxNewSize	not limited
-XX:SurvivorRatio	8

The maximum size of the young generation is calculated from the maximum size of the total heap and the value of the -XX:NewRatio parameter. The default value "not limited" for -XX:MaxNewSize means that the calculated value is not limited by -XX:MaxNewSize unless a value for -XX:MaxNewSize is specified on the command line.

Available Collectors

The Java HotSpot VM includes the following collectors, each with different performance characteristics.

  • Serial Collector
  • Parallel Collector
  • Garbage-First (G1) Garbage Collector
  • ZGC Collector (Z Garbage Collector)

Serial Collector

The serial collector uses a single thread to perform all garbage collection, which makes it relatively efficient because there is no communication overhead between threads.

  • It is best suited for single-processor machines because it cannot take advantage of multi-processor hardware, although it can be used on multiple processors for applications with small data sets (about 100MB).
  • On some hardware and operating system configurations, the serial collector is selected by default. It can also be explicitly enabled with the option -XX:+UseSerialGC.

Parallel Collector

  • The parallel collector, also known as the throughput collector, is a generational collector similar to the serial collector. The main difference between the two is that the parallel collector uses multiple threads to speed up garbage collection.

  • The parallel collector is intended for applications with medium-sized to large data sets that run on multiprocessor or multithreaded hardware. You can enable it with the -XX:+UseParallelGC option.

Parallel compaction is a feature that enables the parallel collector to perform Major GCs in parallel.

Without parallel compaction, Major GCs are performed using a single thread, which can significantly limit scalability. If the -XX:+UseParallelGC option is specified, parallel compaction is enabled by default.

You can disable it with the -XX:-UseParallelOldGC option.

With the parallel collector enabled, both minor and major collections run in parallel by default to further reduce garbage collection overhead.

Number of parallel garbage collector threads

You can control the number of garbage collector threads using the command-line option -XX:ParallelGCThreads=<N>.

Garbage-First (G1) Garbage Collector

G1 is a mostly concurrent collector. Mostly concurrent collectors perform some of their expensive work concurrently with the application. This collector is designed to scale from small machines to large multiprocessor machines with a large amount of memory. It provides the ability to meet a pause-time goal with high probability while achieving high throughput.

On most hardware and operating system configurations, G1 is selected by default. It can also be explicitly enabled with -XX:+UseG1GC.

The Z Garbage Collector

  • The Z Garbage Collector (ZGC) is a scalable low-latency garbage collector. ZGC performs all expensive work concurrently, without stopping the execution of application threads.

  • ZGC is intended for applications that require low latency (pauses of less than 10 milliseconds) or that use a very large heap (multiple terabytes). You can enable it with the -XX:+UseZGC option. ZGC was introduced as an experimental feature in JDK 11; while it is experimental, -XX:+UnlockExperimentalVMOptions must also be specified.

Selecting a Collector

  • If necessary, first adjust the heap size to improve performance. If performance still does not meet your goals, use the following guidelines as a starting point for selecting a collector:

  • If the application has a small data set (up to approximately 100 MB), or if it will run on a single processor and there are no pause-time requirements, select the serial collector with the option -XX:+UseSerialGC.

  • If (a) peak application performance is the first priority, and (b) there are no pause-time requirements or pauses of one second or longer are acceptable, then let the virtual machine select the collector or select the parallel collector with -XX:+UseParallelGC.

  • If response time is more important than overall throughput, and garbage collection pauses must be kept shorter, select the mostly concurrent collector with -XX:+UseG1GC.

  • If response time is a high priority, or if you are using a very large heap, select the fully concurrent collector with -XX:+UseZGC.

  • These guidelines are only a starting point for selecting a collector, because performance depends on the size of the heap, the amount of live data maintained by the application, and the number and speed of available processors.

If the recommended collector is not performing as expected, try first to adjust the heap and generation sizes to meet the desired goals. If performance is still inadequate, try another collector: use a concurrent collector to reduce pause times, and use a parallel collector to increase total throughput on multiprocessor hardware.

Summary:

  • If your application is running on a small data set or a single processor, select the serial collector.
  • If throughput is the first priority and there is no pause time requirement, choose the parallel collector.
  • If response time is more important than throughput, choose the G1 collector.
  • If response time is the most important concern, or if the heap is very large (terabytes), use ZGC.
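
The summary can be encoded as a small decision helper. This is a sketch of the rules of thumb only, under stated assumptions: the class `CollectorChooser`, the `Priority` enum, the 100 MB cut-off taken from the "about 100 MB" guideline, and the order in which the rules are checked are all choices made for this example. It returns a flag to try first, which still needs to be validated by measurement.

```java
class CollectorChooser {
    enum Priority { THROUGHPUT, RESPONSE_TIME, LOWEST_LATENCY }

    /** Suggests a starting-point collector flag following the summary rules in order. */
    static String suggestFlag(long dataSetMb, int processors, Priority priority,
                              boolean veryLargeHeap) {
        // Small data set: the serial collector is enough.
        if (dataSetMb <= 100) {
            return "-XX:+UseSerialGC";
        }
        // Single processor with no pause-time requirement: serial as well.
        if (processors == 1 && priority == Priority.THROUGHPUT) {
            return "-XX:+UseSerialGC";
        }
        // Response time above everything else, or a terabyte-scale heap: ZGC.
        if (veryLargeHeap || priority == Priority.LOWEST_LATENCY) {
            return "-XX:+UseZGC";
        }
        // Response time more important than throughput: G1.
        if (priority == Priority.RESPONSE_TIME) {
            return "-XX:+UseG1GC";
        }
        // Throughput first, no pause-time requirement: parallel collector.
        return "-XX:+UseParallelGC";
    }

    public static void main(String[] args) {
        System.out.println(suggestFlag(4096, 8, Priority.RESPONSE_TIME, false)); // prints -XX:+UseG1GC
    }
}
```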