Introduction to the
This article introduces three common garbage collection algorithms (Mark-sweep, Mark-Compact, and Mark-copy), which are the basis of various Java VIRTUAL machine garbage collector algorithms, and will lay a foundation for learning hotspot VIRTUAL machine garbage collector later
Problems solved by the garbage collector
Developers may prematurely reclaim objects that are still being referenced, which can cause dangling pointer problems.
2. Developers may not release objects after the application has finished using them, resulting in memory leaks;
Benefits of garbage collection
[Fixed] 1 will not have secondary release issues
2 reduces the coupling degree of the program, the developer only needs to pay attention to their own module, or only pay attention to a small amount of code of other related modules, the displayed memory management can not meet the principle of low coupling of software engineering, need to introduce some additional interfaces.
3 completeness and timeliness
Ideally, the garbage collection process should be complete, namely in the heap all garbage should be recycled, but this is often not realistic, from the performance into consideration, once again, only deal with the heap object in the process of recycling may be more reasonable, such as the generational collector will be in accordance with the age in the heap can be divided into two or more generation, and put the focus on young generations, This not only improves the recovery efficiency, but also reduces the pause time of a single recovery. In concurrent garbage collector, assignment and collector to work at the same time, its aim is to avoid or minimize the user program pause, such floating garbage collector encountered problems, namely the object in the recovery process starts to become garbage, so the object can only be in the next recovery cycle is recycled, so in the concurrent collector, A better measure of integrity is to count the final collection of all garbage, rather than the collection of a single collection cycle.
4 Pause Time
Many in garbage collector for terminal assignment thread, thus causes in the process of thread execution pauses, the collector should try to reduce the program execution process, so the pause time as short as possible, for example a generational type collector by frequent and rapid recovery of smaller, younger objects to shorten the pause time, and the larger, Older objects are recycled only sporadically (what a philanderer who likes younger objects).
5 Space Overhead
The purpose of memory management is to use memory space safely and efficiently. Regardless of whether it is displayed or automatic, different management strategies have different levels of space overhead. Some garbage collectors need to occupy a certain amount of space within each object (such as storing reference counts). Some collectors take fields that already exist on the existing layout of objects (forwarding Pointers are recorded on user data). Collectors may also introduce heap-level memory overhead. For example, one similar collector needs to break up pairs into two halves, one half being used by assigners at any one time, and the other half being retained by the collector to which live objects will be copied during recycling. The collector also sometimes needs some auxiliary data structures, such as the trace collector to navigate the graph of Pointers in the heap through the tag stack, commonly known as the root node enumeration, through the whole heap according to the GC root set, and the collector can also use additional bitmaps to mark objects. For collectors that need to divide the heap into separate regions, additional memory sets are required to hold the positions of Pointers modified by the assignment and Pointers across regions.
Four garbage collection strategies
Mark-sweep, Mark-copy, and Mark-compact are the four most basic garbage collection strategies for reference counting, and most collectors combine these basic garbage collection strategies in different ways
1 mark – Sweep
The mark-sweep algorithm’s interface to the assignment is simple: if the thread is unable to allocate an object, it invokes the collector and then tries to allocate again
Collector must first before traverse object graph structure tag process need to use the starting work list (gc root set), namely to tag each root object and add it to the work list, the collector can tag object head a bit (or byte) on the tag, the bit (bytes) can also be located in an extra table
Note that the mark-sweep collector requires the heap layout to meet certain conditions:
- The Mark-sweep collector does not move objects, so the memory manager must be able to control heap fragmentation, as excessive fragmentation can cause the allocator to be unable to meet newly allocated requests, increasing the frequency of garbage collection
- The sweeper must be able to traverse every object in the heap;
Optimization of the Sweep collector
1 Bitmap tag
Collector object labeling can be stored in the device of the head in a word, also can use to maintain an independent bitmap tag, bitmap can have one, can also be multiple, such as in the pile of block structure, the collector can maintain for each memory block an independent bitmap, this approach can avoid the waste caused by the pile of discontinuous memory.
Using bitmap markers can reduce the number of page breaks in the recycle process. Any page breaks caused by the collector are generally unacceptable. Objects are born in clusters and die in batches, and many allocators distribute objects in adjacent Spaces. Using bitmap to guide cleaning can read/empty the mark bits of a batch of objects in batches, and through bitmap marking can easily determine whether all objects in a memory block are garbage, and then can recycle the whole memory block at one time
For the strategy of using tag bits in the head of an object, bitmaps can make the tag bits more dense; For a collector that uses Sweep, the marking procedure simply reads the pointer field of the living object without modifying any objects; The sweeper does not do any reading or writing to live objects and only overwrites fields in the process of discarding garbage objects, so the bitmap reduces the number of bytes in memory that need to be modified and also reduces writing to the cache
2 Lazy cleaning
The time complexity of the marking process is O(L), where L is the number of viable objects in the heap, and the time complexity of the cleaning process is O(H), where H is the heap size. Although H>L, the memory access of the mark process is unpredictable, while the predictability of the cleaning process is much higher, and the cost of the cleaning object is much lower than that of the tracking object. In the Mark-sweep algorithm, objects of the same size are usually distributed in a continuous space. In this case, the collector can sweep objects of the same size in a fixed stride.
This scheme uses alters to play the role of cleaners, that is, to shift the task of finding available space to the allocate process, thus eliminating the need for separate sweep phases. The simplest sweep strategy is to simply move the allocate pointer forward, Until a large enough space is found in the contiguous unmarked object
Distributor is usually only a memory block allocated objects of the same size, each size grading will correspond to one or more used to assign the block of memory, and a memory block for recycling chain table, in the process of recycling, recycling machine still need all live objects in the heap tag, but mark or collector in no hurry to clean the whole heap, Instead, it simply returns the completely empty memory block to the block allocator and adds other memory blocks to the queue of their corresponding space size. Once stop the world is over, the assigner immediately starts working. The allocate method first attempt from the right space size grading allocated in a free slot, if it fails, call sweeper executed sloth, cleaning, that is, from the size grading of recycling the queue cleaned out one or more blocks of memory, until meet the requirements of distribution, if there is no memory blocks are to be clean, or cleaning the memory block does not contain the jehol free slots, The allocator tries to fetch a new block of memory from a lower level block allocator
The mark-sweep algorithm needs to be considered
1 the throughput
Mark-sweep collectors scrape up all of the assignment threads during execution, and the recovery pause time depends on how the application is running and its inputs
2 Space Utilization Rate
Compared with Mark-Copy, Mark-Sweep has a higher space utilization, but sweep recycling generates some space debris. CMS can use the Compact algorithm to clean space debris once in a while
3 Determine whether to move the object
The biggest advantage of sweep is that it doesn’t move objects,
In generational recycling, object survival is likely to be high when aging generations are being GC, and assuming that there isn’t much free space available for copying algorithms, it’s more likely to use one of the other two algorithms, especially the Mark-sweep algorithm that doesn’t move objects.
However, all collectors in HotSpot VM other than CMS are moving objects, either copying or Mark-Compact variants.
Mark-compact
To solve the memory fragmentation problem in Sweep, which requires defragmenting live objects in the heap to reduce the recovery strategy for memory fragmentation, Compact allows extremely fast sequential allocation
- Arbitrary order: The way objects are moved is independent of their original order and drinking relationship
- Linear order: Objects that have related relationships, such as objects with reference relationships, or adjacent objects in a unified data structure, are arranged together
- Sliding order: The process of sliding objects onto the heap to squeeze out garbage, thereby preserving the order in which objects are allocated in the heap
Most of the collation collectors we know follow an arbitrary order and slide order
Arbitrary ordering is simple and fast to implement, especially if all objects are of equal size. However, arbitrary ordering may scatter previously adjacent objects into different cache rows or virtual memory pages, reducing the spatial locality of the valuer. So modern compact recyclers use a sliding collation order
Several different types of collation algorithms are described below
1. Double pointer collation algorithm
In the initial phase, the pointer free points to the beginning of the region, and the pointer scan points to the end of the region. During the first walk, the collector moves the pointer free back and forth until it finds gaps in the heap. If the two Pointers are interleaved, the phase ends. Otherwise, the object to which the pointer to scan points is moved to the position of pointer free, and a field in the original object is changed to a forwarding address, and then the pointer continues to move. Double pointer have something simple, fast and traversal of the process operation is less, but sure is double pointer is any type of heavy objects sorted order, thus undermining the locality of the assignment, then because of related objects are always into a bunch of birth and death in bulk, we can move the continuous live objects as a whole to the large gap, rather than moving them one by one
Lisp2 algorithm
The algorithm goes through three traversals, but each traversal does not do much work;
On the first pass at the end of the tag, the collector calculates the final address of each surviving object and stores it in the object’s forwardingAddres field,
During the second traversal, the collector updates the assignment thread root and references to the tagged object using the record forwarding address in the object header field, which ensures that they point to the new location of the object
During the third walk, the relocate finally moves each surviving object to the new destination location
Questions to consider
throughput
Compared with mark-Sweep, Mark-Compact algorithm requires more traversal times, resulting in poor throughput and high overhead for each traversal. A general solution is to use Mark-Sweep algorithm for a long time as possible. After fragmentation reaches a certain degree, Only once is reclaimed using the Mark-Compact algorithm
Longevity data
The Mark-Compact algorithm can choose not to defragment objects in the “dezone” at the expense of a small amount of memory fragmentation
local
Different compact algorithms can obtain different locality. Although the random algorithm is simple and efficient, it will destroy the locality of the valuer, so the sliding collation algorithm is more friendly to locality
Mark-copy
Sweep the recycling cost is lower, but its memory fragments of problems, in a good system, recycling usually accounts for only a small part of the overall execution time, assignment of execution cost will determine the overall program performance, so should try to reduce the overhead of assignment, especially as far as possible to gain its allocation speed, compact can eliminate the memory fragments, However, multiple traversals are required. The copy algorithm collector performs heap collation during the copy process, increasing the allocation speed of the assignment, but reducing the available space of the heap by half
distribution
The cleaned heap allocation memory is fast and the allocation process is simple.
Space and locality
The determination of copy algorithm is to maintain the second half region. In a given memory size, the available space of the half region copy algorithm is about the same as that of the whole heap, which leads to the recycling times required by the copy collector is more than other collectors.
Local first traversal affects the breadth-first traversal order, which tends to separate the parent nodes, while depth-first traversal tends to be closer to the expected assigned nodes of the child nodes
Moving objects
Whether to use copy depends on the overhead of moving objects. In a generational collector, the old generation has a large number of living objects, so it is not suitable to use copy algorithm, while the single young generation has a small number of living objects, so it is suitable to use Mark-copy
Compare the three algorithms
In generational GC, the old generation usually uses Mark-sweep; Or a mark-sweep/ Mark-compact hybrid, usually mark-sweep, and mark-Compact when the amount of debris reaches a certain level. This is because traditionally, it was thought that objects from the old generation might survive for a long time and have a high survival rate, or that they might be large, so it wasn’t economical to copy them, so it was better to collect them in situ. Of the three basic algorithms mark-sweep, Mark-Compact, and Copying, only Mark-sweep does not move objects (that is, does not copy), so Mark-sweep (CMS) is used.
Three basic algorithms are briefly compared:
mark-sweep | mark-compact | copying | |
---|---|---|---|
speed | medium | The slowest | The fastest |
The space overhead | Less (but will pile up debris) | Less (do not accumulate debris) | Usually twice the size of a live object (no shards) |
Moving objects? | no | is | is |
About time overhead: Mark-sweep: the Mark phase is proportional to the number of live objects, and the sweep phase is proportional to the size of the whole heap
If you put together the time spent on marking, sweeping, compact, and copying, it looks something like this: (While compactiont and copying both involve moving objects, depending on the algorithm, Compact might evaluate the target address of the object once, then correct the pointer, and then move the object; One can imitate these things in one piece, so it will be faster. It is also important to be aware of the overhead incurred by the GC, not just by the collector’s time, but also by the Allocator side. If we can ensure that memory is not fragmented, we can use pointer bumping to distribute data around us, which is very fast; If memory is fragmented, it has to be managed like a freelist, which is usually slower.)
In the generational assumption, objects in young generations should have very low survival rates during minor GC. This is the most cost-effective method to use copying algorithms because the time cost is proportional to the size of live objects. If there are few live objects, it is very fast. And young Gen itself should be small, even if it needs twice as much space, it will only waste not too much space. Aging generations are likely to have a high rate of object survival when GC is performed, and assuming that there isn’t much free space available for copying algorithms, it’s more likely to use one of the other two algorithms, especially the Mark-sweep algorithm that doesn’t move objects.
However, all collectors in HotSpot VM other than CMS are moving objects, either copying or Mark-Compact variants.
This article has been written for a long time, referring to some garbage algorithm data and literature, next I want to write about the details of garbage collection in hotspot, and the advantages and disadvantages of the garbage data collector in hotspot, as well as the attention points in the code when checking gc logs and optimizing garbage collection.