JVM garbage collector and memory allocation strategy

JVM garbage collection frees developers from allocating and releasing memory by hand, so they can focus on writing code rather than on how memory is managed underneath. Still, a curious developer benefits from understanding how the JVM reclaims objects and how objects are allocated in memory, because that knowledge helps when operating the JVM and when writing code.

How to determine whether an object can be reclaimed

Since the JVM performs garbage collection automatically, the first question is how it decides which objects can be collected.

An object can be reclaimed once it is no longer in use. So how does the JVM determine that an object is no longer in use?

  • Reference counting method

    A reference counter is attached to each object. The counter is incremented each time a new reference to the object is created and decremented each time a reference becomes invalid. When an object’s counter reaches zero, the object can be reclaimed.

    Reference counting is a very simple algorithm that is efficient both to implement and to evaluate, but it cannot handle circular references: two objects that refer only to each other never reach a count of zero.

    Mainstream JVMs do not use reference counting as the criterion for whether an object can be reclaimed.

  • Reachability analysis algorithm

    Mainstream JVMs use reachability analysis to determine whether an object is alive.

    • A set of objects called “GC Roots” is taken as the starting point, and the search proceeds downward from them along references. The path traversed during the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable: it is no longer in use and can be reclaimed.

    • GC Roots objects

      The JVM defines certain objects that can serve as starting points for reachability analysis; these are the GC Roots. They are characterized by long life cycles and are not reclaimed quickly.

      • Objects referenced from the virtual machine stack
      • Objects referenced by static variables in the method area
      • Objects referenced by constants in the method area
      • Objects referenced from the native method stack
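
The difference between the two approaches can be sketched in a few lines of Java. This is a toy model, not how a JVM is implemented: the Node class, its hand-maintained refCount field, and the reachableFrom method are hypothetical names introduced only for illustration. Two nodes refer to each other, so their counters never drop to zero, yet a search from the GC Roots never reaches them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilitySketch {
    // Hypothetical toy object: outgoing references plus a hand-maintained reference count.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        int refCount = 0;

        Node(String name) { this.name = name; }

        // Make this node reference target and bump the target's counter.
        void pointTo(Node target) {
            refs.add(target);
            target.refCount++;
        }
    }

    // Follow reference chains from the GC Roots; objects never visited are unreachable.
    static Set<Node> reachableFrom(List<Node> gcRoots) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (visited.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Node root = new Node("root"); // stands in for a stack or static reference
        Node live = new Node("live");
        Node a = new Node("a");
        Node b = new Node("b");

        root.pointTo(live);
        a.pointTo(b); // a and b only reference each other
        b.pointTo(a);

        Set<Node> reachable = reachableFrom(Collections.singletonList(root));
        System.out.println("live reachable: " + reachable.contains(live)); // true
        System.out.println("a reachable: " + reachable.contains(a));       // false
        System.out.println("a.refCount: " + a.refCount);                   // 1
    }
}
```

Under reference counting, a and b would be kept forever because each still has a count of 1; under reachability analysis both are garbage.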

References in Java

Both reference counting and reachability analysis rely on references to determine whether an object can be reclaimed. So what is a reference in Java?

  • What is a reference

    If a value of a reference type stores the address of another block of memory rather than the data itself, that value is a reference to that block of memory.

  • Classification of references

    There are four types of references in Java: strong, soft, weak, and phantom references. Each type has a different life cycle and different characteristics.

    • Strong reference

      The most common kind of reference in development. Objects created directly in code are generally held by strong references, for example: Object obj = new Object()

      Features: An object that is still strongly referenced is never reclaimed, even if the JVM runs out of memory and throws an OutOfMemoryError (OOM). Once the strong reference is no longer in use, the object can be reclaimed. For example, a strong reference held in a local variable disappears together with the method’s stack frame when the method finishes executing, and the object can then be collected.

    • Soft references

      An object that is referenced only through the SoftReference class is softly referenced. It is not reclaimed while the JVM has enough memory, but it is reclaimed when the JVM is about to run out of memory.

      Features: Because the data is reclaimed only when memory runs low, soft references suit features such as a local cache: data stays in the cache for a while, and the cached data is reclaimed when the JVM runs short of memory.
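
The local-cache idea can be sketched with java.lang.ref.SoftReference. This is a minimal sketch, not a production cache: the SoftCache class and its method names are hypothetical, and a real cache would usually also bound its size and clean up stale entries more aggressively.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache helper; class and method names are illustrative only.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null if it was never cached or has been reclaimed.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key, ref); // the referent was collected; drop the stale entry
        }
        return value;
    }

    // Reloads the value when the soft reference has been cleared.
    V getOrLoad(K key, Function<K, V> loader) {
        V value = get(key);
        if (value == null) {
            value = loader.apply(key);
            put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<>();
        byte[] strong = cache.getOrLoad("report", k -> new byte[1024]);
        // While "strong" is still in use the array cannot be reclaimed, so the lookup hits.
        System.out.println(cache.get("report") == strong); // true
    }
}
```

Callers must always be prepared for get to return null, because the JVM may clear a soft reference whenever memory runs low.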

Garbage collection algorithm

Different garbage collectors in the JVM use different garbage collection algorithms, and each algorithm adopts a different collection strategy. Collectors choose among these algorithms according to the situation so that garbage collection stays efficient.

Mark-sweep algorithm
  • What is the mark-sweep algorithm

    The mark-sweep algorithm is the most basic garbage collection algorithm. The process has two phases. In the mark phase, the objects to be reclaimed are marked; in the sweep phase, all marked objects are reclaimed together.

  • Advantages:

    • It is simple to implement and can accurately recycle objects
  • Disadvantages:

    • Low efficiency: neither the mark phase nor the sweep phase is efficient
    • Memory fragmentation: sweeping leaves free memory scattered in non-contiguous pieces, so a large object may fail to find a big enough contiguous block, which triggers another garbage collection ahead of time
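
The fragmentation problem can be illustrated with a toy heap. This is a minimal sketch under simplifying assumptions: each array slot holds one object id, 0 means free, and the set of live ids is given rather than computed. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepSketch {
    static final int FREE = 0;

    // Phase 1 (mark): flag every occupied slot whose object is not live.
    static boolean[] markGarbage(int[] heap, Set<Integer> liveIds) {
        boolean[] garbage = new boolean[heap.length];
        for (int i = 0; i < heap.length; i++) {
            garbage[i] = heap[i] != FREE && !liveIds.contains(heap[i]);
        }
        return garbage;
    }

    // Phase 2 (sweep): free the flagged slots in place; survivors do not move.
    static void sweep(int[] heap, boolean[] garbage) {
        for (int i = 0; i < heap.length; i++) {
            if (garbage[i]) {
                heap[i] = FREE;
            }
        }
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int slot : heap) {
            if (slot == FREE) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5, 7));

        sweep(heap, markGarbage(heap, live));

        System.out.println(Arrays.toString(heap)); // [1, 0, 3, 0, 5, 0, 7, 0]
        System.out.println(totalFree(heap));       // 4
        System.out.println(largestFreeRun(heap));  // 1
    }
}
```

Four slots are free after the sweep, yet an object that needs two adjacent slots still cannot be allocated.
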
Copying algorithm
  • What is the copying algorithm

    Memory is divided into two equal halves and only one half is used at a time. When that half is used up, the surviving objects in it are copied to the other, unused half, and the half that was in use is then cleared in one pass.

  • Advantages:

    • Simple to implement, and high recovery efficiency
    • No memory fragmentation is generated
  • Disadvantages:

    • Splitting memory in half means only half of it is usable at any time, so memory utilization is low and garbage collection is triggered more often
    • If the object survival rate is high, the copying algorithm has to move many objects in each collection, which makes it inefficient

The copying algorithm is generally used for the new generation of the JVM heap (it is what the Survivor areas rely on). Because the survival rate of new generation objects is low, few objects need copying and the algorithm is efficient there.
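
A toy version of the two-halves scheme might look as follows. It is a sketch only, assuming each slot holds one object id, 0 means free, the target half starts empty, and the live ids are already known. The names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingSketch {
    static final int FREE = 0;

    // Copies live objects from fromSpace into toSpace back to back, then wipes fromSpace.
    // Returns the number of objects copied, which is also the next free index in toSpace.
    static int collect(int[] fromSpace, int[] toSpace, Set<Integer> liveIds) {
        int next = 0;
        for (int id : fromSpace) {
            if (id != FREE && liveIds.contains(id)) {
                toSpace[next++] = id;
            }
        }
        Arrays.fill(fromSpace, FREE);
        return next;
    }

    public static void main(String[] args) {
        int[] from = {1, 2, 3, 4};
        int[] to = new int[4]; // the unused half, initially all FREE
        Set<Integer> live = new HashSet<>(Arrays.asList(2, 4));

        int copied = collect(from, to, live);

        System.out.println(Arrays.toString(to));   // [2, 4, 0, 0]
        System.out.println(Arrays.toString(from)); // [0, 0, 0, 0]
        System.out.println(copied);                // 2
    }
}
```

The cost is proportional to the number of survivors, which is why the approach pays off when few objects survive.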

Mark-compact algorithm
  • What is the mark-compact algorithm

    The mark-compact algorithm is designed mainly around the characteristics of the old generation. It has two phases. In the mark phase, the objects to be reclaimed are marked, so the unmarked objects are the survivors. In the compact phase, the surviving objects are moved toward one end of memory, and all memory beyond the boundary of the last surviving object is then cleared.

  • Advantages:

    • No memory fragmentation is produced, and clearing memory is efficient
  • Disadvantages:

    • Objects have to be moved, which requires STW (stop the world)
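
The compaction step can be sketched on a toy heap. As before this is only an illustration with hypothetical names, assuming one object id per slot, 0 for free, and a known set of live ids. A real collector also has to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactSketch {
    static final int FREE = 0;

    // Slides live objects toward index 0, then clears everything past the last survivor.
    // Returns the boundary index: slots before it are live, slots from it onward are free.
    static int compact(int[] heap, Set<Integer> liveIds) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            int id = heap[i];
            if (id != FREE && liveIds.contains(id)) {
                heap[boundary++] = id; // boundary never overtakes i, so nothing is lost
            }
        }
        Arrays.fill(heap, boundary, heap.length, FREE);
        return boundary;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3, 5, 7));

        int boundary = compact(heap, live);

        System.out.println(Arrays.toString(heap)); // [1, 3, 5, 7, 0, 0, 0, 0]
        System.out.println(boundary);              // 4
    }
}
```

All free space ends up in one contiguous block after the boundary, so no fragmentation remains.
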
Generational collection algorithm
  • What is the generational collection algorithm

    Memory is divided into a new generation and an old generation, and each generation uses the algorithm that suits the life cycle of its objects. In the new generation most objects are reclaimed in each collection and only a few survive, so the copying algorithm works well because little has to be copied. Objects in the old generation live long and there is no extra space to back a copy, so mark-sweep or mark-compact is used there.

Garbage collector

Garbage collectors carry out the actual work of garbage collection, and each collector uses a different collection algorithm to reclaim memory. Collectors fall mainly into new generation collectors and old generation collectors, and a new generation collector has to be paired with an old generation collector. The collectors to use are specified through JVM parameters.
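
One way to check which collectors a running JVM actually uses is the standard java.lang.management API. The sketch below only lists what the JVM reports; the class name ActiveCollectors is hypothetical, and the collector names printed depend on the JVM version and on the options it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors the current JVM exposes through its management beans.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running the same program with different collector options usually shows a different pair of names, typically one bean for the new generation and one for the old generation.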

Serial collector
  • Characteristics

    A single-threaded new generation collector that uses the copying algorithm. While it collects, all other worker threads must be suspended until the collection finishes (STW). If a collection takes long or GC is triggered too often, the application sees long pauses or slow processing.

ParNew collector
  • Characteristics

    ParNew is a multi-threaded new generation collector. It is similar to the Serial collector: it also uses the copying algorithm and also needs STW during collection, but because several threads collect in parallel, its collection efficiency is relatively higher than Serial’s.

The ParNew collector is the new generation companion of the CMS collector. If the CMS collector is enabled, the new generation uses ParNew by default.

To force the use of ParNew, pass the JVM option -XX:+UseParNewGC

Parallel Scavenge collector
  • Characteristics

    A multi-threaded new generation collector that uses the copying algorithm; its basic strategy is similar to ParNew’s. The biggest difference is that Parallel Scavenge aims primarily at throughput, that is, the share of CPU time spent running application code rather than collecting garbage, so it suits CPU-intensive tasks.

Serial Old collector
  • Characteristics

    Serial Old is the old generation version of the Serial collector. It uses the mark-compact algorithm, collects with a single thread, and requires all other worker threads to be paused while it collects (STW).

Serial Old is not only the old generation partner of Serial; it also serves as the fallback collector for the CMS collector.

Parallel Old collector
  • Characteristics

    Parallel Old is the old generation version of the Parallel Scavenge collector: a multi-threaded collector that works on the old generation with the mark-compact algorithm. Parallel Old is used together with Parallel Scavenge.

Both Parallel Scavenge and Parallel Old are throughput-oriented collectors that suit CPU-intensive tasks.

CMS collector
  • Characteristics

    CMS is a collector whose goal is the shortest possible collection pause. It works on the old generation, collects concurrently with multiple threads, and uses the mark-sweep algorithm. Because its main goal is to reduce STW time, CMS mainly improves the response time of the program.

  • Collection process

    • Initial mark

      Requires STW. Only the objects directly associated with GC Roots are marked, so this phase is fast.

    • Concurrent mark

      The GC Roots tracing phase. The GC threads mark concurrently with the user threads, without STW.

    • Remark

      Requires STW. The purpose of remarking is to determine exactly which objects need to be reclaimed: during concurrent marking the GC threads and the user threads run at the same time, so the user threads may have produced new garbage.

    • Concurrent sweep

      The marked objects are cleaned up. The user threads and the GC threads run concurrently, without STW, so floating garbage may be produced during this phase. It cannot be collected in the current GC and is left for the next one.

    Across the whole process only the initial mark and the remark need STW; the other phases run concurrently with the user threads, so the overall STW time is relatively short.

Because CMS uses the mark-sweep algorithm, it produces memory fragmentation, which can trigger Full GC repeatedly. CMS provides the -XX:+UseCMSCompactAtFullCollection option, which compacts memory when a Full GC runs and so reduces the Full GCs caused by fragmentation, at the cost of a longer STW.

-XX:+UseConcMarkSweepGC enables the CMS collector. The new generation then uses the ParNew collector by default, and Serial Old serves as the fallback collector for CMS.

Because CMS runs concurrently with the user threads, it has to leave memory for them while it collects. When the remaining memory is too low for that, CMS cannot be used and a single-threaded collector such as Serial Old takes over the collection.

G1 collector
  • Characteristics

    The G1 collector, the newest garbage collector as of JDK 1.7, aims to let the user control the pause time spent reclaiming memory and to reclaim as much as possible within that limited pause. G1 covers the whole heap. It has no physically separate new generation and old generation, although the two still exist logically. Instead, the heap is divided into many small Regions that are collected individually. Between Regions G1 behaves like the copying algorithm, while viewed as a whole it behaves like the mark-compact algorithm.

    • Parallel and concurrent: G1 takes full advantage of multi-core CPUs and collects in parallel to shorten STW time

    • Generational collection: Regions are assigned to logical generations, with the number of Regions per generation adjusted to the objects each generation holds, and each generation is collected in the way that suits it to improve efficiency

    • Space compaction: G1 copies between Regions locally and behaves like mark-compact overall, so it produces no memory fragmentation and avoids triggering Full GC early

    • Predictable pause times: One of G1’s most important features is that users can specify a target pause time and G1 tries to finish its collection work within it, which keeps application response times short

      G1 maintains a priority list of Regions. In each collection it reclaims the highest-priority Regions that fit within the allowed pause time, which gives G1 both collection efficiency and bounded STW time

  • Collection process

    • Initial mark

      Requires STW. Marks the objects directly associated with GC Roots

    • Concurrent mark

      The GC threads and the user threads run concurrently while the objects reachable from GC Roots are marked

    • Final mark

      Requires STW. Re-marks the objects whose references changed during concurrent marking so that the result is accurate

    • Selective collection

      The Regions are first sorted by priority, and a collection plan is drawn up according to the pause time set by the user. The chosen Regions are then reclaimed by multiple GC threads in parallel while the user threads are paused.
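
The idea of ranking Regions and collecting only as many as fit into the pause target can be sketched as a simple greedy selection. This is purely illustrative and not G1’s actual heuristic: the Region class, its two estimate fields, and the ranking by reclaimed space per millisecond are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RegionSelectionSketch {
    // Hypothetical per-region estimates: space that would be freed and the pause it would cost.
    static final class Region {
        final String name;
        final long reclaimableMb;
        final double estimatedPauseMs;

        Region(String name, long reclaimableMb, double estimatedPauseMs) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.estimatedPauseMs = estimatedPauseMs;
        }

        // Priority: megabytes reclaimed per millisecond of pause.
        double efficiency() {
            return reclaimableMb / estimatedPauseMs;
        }
    }

    // Greedy choice: walk regions from highest to lowest priority and take each one
    // that still fits into the remaining pause budget.
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> Double.compare(y.efficiency(), x.efficiency()));

        List<Region> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region region : sorted) {
            if (spentMs + region.estimatedPauseMs <= pauseBudgetMs) {
                chosen.add(region);
                spentMs += region.estimatedPauseMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("A", 8, 4.0),
                new Region("B", 6, 2.0),
                new Region("C", 1, 5.0),
                new Region("D", 3, 3.0));

        for (Region region : chooseCollectionSet(regions, 7.0)) {
            System.out.println(region.name); // prints B, then A
        }
    }
}
```

With a 7 ms budget, B and A use 6 ms together; D and C no longer fit and are left for a later collection.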

Memory allocation policy

Allocation rules

Creating an object goes together with allocating memory for it. Objects are normally allocated on the heap (leaving escape analysis aside). Under generational collection the heap is divided into a new generation and an old generation: the new generation generally holds newly created objects, and the old generation generally holds objects with longer life cycles. The new generation is further divided into Eden and Survivor areas. Eden is where memory for newly created objects is allocated. Survivor holds objects that are still alive after a Minor GC; objects that have become unreachable are reclaimed in that Minor GC.

The allocation rules are not fixed, however. In special cases the JVM applies special rules to keep allocation sound and to make better use of memory.

  • Objects are allocated in Eden first

    In general, objects are allocated directly in the Eden area. If Eden does not have enough memory for an allocation, a Minor GC is performed. If there is still not enough space after the Minor GC, the new generation is out of memory for that allocation.

    Minor GC: a GC of the new generation. Minor GCs are triggered frequently and complete quickly

    Major GC: a GC of the old generation. A Major GC is slower than a Minor GC

  • Large objects go straight to the old generation

    When a newly created object is large enough, it is allocated directly in the old generation. If a large object were allocated in the new generation, it would be copied repeatedly between Eden and the two Survivor areas, which is slow, and it would occupy a large amount of contiguous memory. The new generation is also smaller than the old generation. So the JVM allocates large objects directly in the old generation to avoid frequent Minor GCs and the cost of copying.

    • What is a large object

      An object that needs a large amount of contiguous memory, such as a long string or a large array

    • How large an object must be to count as large

      The JVM provides the -XX:PretenureSizeThreshold parameter. An object larger than the configured value is treated as a large object

  • Long-lived objects enter the old generation

    The JVM manages heap memory with generational collection, so objects start out in the new generation. If an object in Eden survives a Minor GC, it is moved to Survivor and its GC age becomes 1. Each further Minor GC it survives in Survivor increases its age by 1. When the GC age reaches 15 (the default), the object is considered long-lived and is moved to the old generation.

    The JVM provides the -XX:MaxTenuringThreshold parameter to set the GC age at which objects move to the old generation. The default value is 15

  • Dynamic age determination

    The JVM also has a special rule here: an object does not necessarily have to survive 15 GCs before entering the old generation. If the total size of all objects of the same age in Survivor exceeds half of the Survivor space, all objects of that age or older move directly to the old generation.

    This resembles large objects going straight to the old generation, except that the space is taken up by a batch of objects of the same age rather than by one large object

  • Space allocation guarantee

    Before performing a Minor GC, the JVM checks whether the old generation has enough contiguous space to hold all the objects in the new generation. If it does, the Minor GC can go ahead. If not, the JVM looks at the -XX:HandlePromotionFailure parameter. If the value is false, a Full GC is performed directly. If it is true, the JVM is allowed to risk a guarantee failure: it checks whether the contiguous space in the old generation is larger than the average size of the objects promoted in previous collections. If so, a Minor GC is attempted; if not, a Full GC is performed, because the old generation does not have enough memory. If sufficient memory still cannot be reclaimed after the Full GC, an OutOfMemoryError is thrown
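
The promotion rules above can be condensed into two small decision functions. This is a sketch of the rules as described in this article, not of HotSpot’s implementation, whose real computation differs in detail. All names are hypothetical, and the comparison against the average promoted size in beforeMinorGc is an assumption about what the JVM checks when guarantee failure is allowed.

```java
class PromotionPolicySketch {
    enum Decision { MINOR_GC, RISKY_MINOR_GC, FULL_GC }

    // bytesByAge[a] is the total size of Survivor objects whose GC age is a (index 0 unused).
    // Returns the age at or above which objects are promoted to the old generation.
    static int tenuringAge(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            if (bytesByAge[age] > survivorCapacity / 2) {
                return age; // dynamic age rule: one age group fills more than half of Survivor
            }
        }
        return maxTenuringThreshold; // otherwise only the configured threshold applies
    }

    // Space allocation guarantee check made before a Minor GC.
    static Decision beforeMinorGc(long oldGenContiguousFree, long newGenUsed,
                                  boolean handlePromotionFailure, long avgPromotedBytes) {
        if (oldGenContiguousFree > newGenUsed) {
            return Decision.MINOR_GC; // even if everything survives, it fits
        }
        if (handlePromotionFailure && oldGenContiguousFree > avgPromotedBytes) {
            return Decision.RISKY_MINOR_GC; // assumed check: likely to fit, but may fail
        }
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 10, 60, 5};
        System.out.println(tenuringAge(bytesByAge, 100, 15)); // 2
        System.out.println(beforeMinorGc(40, 50, true, 30));  // RISKY_MINOR_GC
        System.out.println(beforeMinorGc(40, 50, false, 30)); // FULL_GC
    }
}
```

In the first call, objects of age 2 occupy 60 of the 100 units of Survivor space, so objects aged 2 or more are promoted long before they reach the threshold of 15.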