Serial GC
This article is published by the Java Architects Association, which updates technical articles daily
Weak generation hypothesis
SerialGC is the most classic and oldest garbage collector, started with -xx :+UseSerialGC. Its Java heap conforms to the Weak generation hypothesis. The weak generation hypothesis, which implies that most objects die young, has been proven in various programming paradigms or languages. At the other end of the spectrum is the strong generational hypothesis, which suggests that older subjects are less likely to die, but there is less evidence to support this hypothesis. It is cognitive to note that most objects have very short lifetimes, because allocating local variables in a method is almost the most common operation, and many of these objects are disposable.
Based on the weak Generation hypothesis, the virtual machine implements the Generation heap model, which divides the Java heap into Old Generation with large space and YoungGeneration with small space. Among them, the Cenozoic era accommodates the new objects that live and die, and garbage recycling occurs more frequently in this region, while the old age accommodates objects with a longer life cycle. It can be simply considered that the objects still alive after multiple garbage recycling have a longer life cycle. Older generations grow slowly, so garbage collection occurs less frequently, and such a heap is called a generational heap. In the generation model, GC is no longer oriented to the whole heap, but “special generation and special collection”. Young GC (hereinafter referred to as YGC) only recovers the new generation, while Full GC (hereinafter referred to as FGC) recovers the whole heap. The advent of YGC eliminates the need for GC to traverse the heap looking for live objects and reduces the frequency of old age collection.
Generational heaps benefit from differentiation of object lifecycles, but are also controlled by them. Before, you only need to traverse the whole heap to find all the surviving objects, but after generation, you cannot simply traverse a single generation, because there may be references from the old era to the new generation, that is, cross-generation references. If only the new generation is traversed, it may mismark some objects that originally have references and then kill them, whereas the principle of garbage collection is “better to miss than kill by mistake”, and it is absolutely not acceptable to incorrectly clean up living objects. The problem is that after generation generation, the objects of the new generation will not only be referenced by GC Root, but also be referenced by the old generation. If you have to traverse the old generation and GC Root with large space to find the surviving objects of the new generation, then you lose the advantage of generation generation, and the gain is not worth the loss.
Cross-generation references are a problem that all generational garbage collectors must face. To deal with cross-generation references, a data structure called Remember Set (RSet) is needed to record references from the old to the new. Another problem is that many objects might have actually die, old s object if the old s death is not clear in time, the new generation of recycling will be GCRoot and dead object as the root of these old s to find live objects, killed was supposed to be the new generation of objects are marked as live objects, the resulting floating garbage, In extreme cases floating garbage can cancel out the benefits of heap generation.
The memory set implies that THE GC has the ability to discover every write to an object, and every time an object write occurs, the GC checks whether the object being written is in a different generation and decides whether to place it in the memory set. The component that gives the GC this “discover all write operations” capability is the GC barrier, and in this context, the write barrier. Writing objects is part of the code execution system and is done by the GC barrier in conjunction with the JIT compiler and template interpreter.
In Serial GC, FGC traverses the entire heap without considering cross-generation references, whereas YGC only happens in the new generation and needs to deal with cross-generation references. The Serial GC uses a coarse-grained set of memories called card tables, which are described below.
Card table
A Card Table is a coarse-grained memory set that can store cross-generation references. Instead of accurately recording objects and references that point to the new generation in the old age, it divides the old age into two power pages and records which pages they are in. Using card tables to map these pages reduces the memory overhead of the memory set itself, while also minimizing the need to traverse the entire age. The standard card table implementation is usually a bitmap with each bit corresponding to a page of memory. See Figure 10-2.
Figure 10-2 Card Table
When a Mutator threads to execute a class member variable assignment operation, virtual opportunity to check whether an old s object or reference assigned to members of the new generation, if it is, is the member variables in memory pages corresponding to the bit in the card table, the subsequent need to traverse the card table only the winning demerit bit corresponding memory pages, without having to traverse the entire old age.
However, using bitmaps can be quite slow, since one bit of a bitmap requires reading the entire machine word, updating it, and then writing it back, as well as several instructions to perform bit operations on RISC processors. An effective performance improvement is to replace bitmaps with byte arrays, which use 8 times as much memory as Bitmaps, but still account for less than 1% of the heap. The HotSpot VM CardTable is implemented by CardTable, which uses byte arrays instead of bitmaps. The CardTable::byte_for function is responsible for mapping memory addresses to byte arrays of the CardTable, as shown in listing 10-7:
Listing 10-7 CardTable::byte_for
jbyte* CardTable::byte_for(const void* p) const {
jbyte* result = &_byte_map_base[uintptr_t(p) >> card_shift];
return result;
}
Copy the code
Where card_shift is 9. From the implementation, it is easy to see that the virtual machine defines a memory page as 512 bytes, and marks the items in the byte_map_base array as dirty whenever a memory page has cross-generation references.
Young GC
The Serial GC names the new generation as DefNewGeneration and the old generation as TenuredGeneration. DefNewGeneration divides the new generation into Eden space and Survivor space, and Survivor space can be further divided into From and To space, as shown in Figure 10-3.
Figure 10-3 Details of the Serial GC generation
YGC uses replication algorithms to clean up the space of the new generation. A common scenario for YGC is To allocate small objects in Eden space at first. When Eden space is insufficient, YGC occurs, and the live objects in Eden space and From space are marked. Then the virtual machine moves the live objects in both Spaces To To space. If the To space can hold objects, the Eden space and the From space are cleared, and the From space and To space swap roles, then there is an empty Eden space, the From space with some living objects, and the empty To space. When the next YGC occurs, repeat the above steps.
When some objects are still alive after multiple YGC, it can be considered that the object has a long life cycle and is not a dying object, so GC will promote the object from the new generation to the old generation. In addition to the common situations mentioned above, there are also some special situations that need to be considered. For example, Eden space cannot accommodate large objects at the beginning, and promotion objects cannot be accommodated in the old days. The full implementation of the YGC logic is shown in Listing 10-8, which also includes special cleanup handling:
Listing 10-8 DefNewGeneration: : collect
void DefNewGeneration::collect(...) {... if (! Collection_attempt_is_safe ()) {// Check whether the old age can accommodate the promoted object heap->set_incremental_collection_failed(); return; } FastScanClosure fsc_with_no_gc_barrier(...) ; FastScanClosure fsc_with_gc_barrier(...) ; CLDScanClosure cld_scan_closure(...) ; FastEvacuateFollowersClosure evacuate_followers(...) ; StrongRootsScope SRS (0); heap->young_process_roots(&srs, &fsc_with_no_gc_barrier, &fsc_with_gc_barrier, &cld_scan_closure); } evacuate_followers.do_void(); // Handle objects that are not GC Root direct and whose member fields are reachable... // Soft references, weak references, virtual references, final references, etc. Swap From, To space; Adjusted old age promotion threshold if (! _promotion_failed) { eden()->clear(SpaceDecorator::Mangle); from()->clear(SpaceDecorator::Mangle); swap_spaces(); } else {// Otherwise inform old age promotion failed, still swap From and To space swap_spaces(); _old_gen->promotion_failure_occurred(); }... }Copy the code
Check whether the garbage collection is safe (collection_attempt_is_safe) before doing YGC. Safety is to determine whether the old generation can safely accommodate the new generation in the worst case, when the new generation is full of viable candidates for promotion. If you can do YGC again.
Young_process_roots () scans all types of GC roots and scans the card table memory set for references from the old To the new generation, then copies them into the To space using the fast scan closure. The FastScanClosure, FastScanClosure, abstracts operations on one object (thread, object, klass, and so on) into closure operations, which are then passed into logical code that processes contiguous objects. Since the C++ 98 language standard used by HotSpot VM does not have lambda expressions, closures can only be simulated using classes [1]. The FastScanClosure closure is shown in Listing 10-9:
Listing 10-9 FastScanClosure closure
Template <class T> inline void FastScanClosure::do_oop_work(T* p) {heap_oop = RawAccess<>::oop_load(p); if (! CompressedOops::is_null(heap_oop)) { oop obj = CompressedOops::decode_not_null(heap_oop); If ((HeapWord*)obj < _boundary) {// If (HeapWord*)obj < _boundary) { New_obj = obj-> is_Forwarded()? obj->forwardee(): _g->copy_to_survivor_space(obj); RawAccess<IS_NOT_NULL>::oop_store(p, new_obj); If (is_scanning_a_cld()) {// Set gc_barrier do_cld_barrier() as required; } else if (_gc_barrier) { do_barrier(p); }}}}Copy the code
Starting from GC Root and the old age, all reachable objects are live objects, and FastScanClosure is applied to each live object. If it encounters an object that has a forward pointer set, that is, copied, it returns the copied object directly, otherwise copy it using copy_to_SURVIVor_space as shown in Listing 10-10:
Listing 10-10 COPY_to_SURVIVor_space
oop DefNewGeneration::copy_to_survivor_space(oop old) { size_t s = old->size(); oop obj = NULL; If (old->age() < tenuring_threshold()) {obj = (oop) To ()->allocate_aligned(s); If (obj == NULL) {obj = _old_gen->promote(old, s); if (obj == NULL) { handle_promotion_failure(old); return old; }} else {/ / To the space distribution of successful const intx interval = PrefetchCopyIntervalInBytes; Prefetch::write(obj, interval); Copy:: aligned_disjoinT_words ((HeapWord*)old,(HeapWord*)obj,s); Obj ->incr_age(); age_table()->add(obj, s); } old->forward_to(obj); old->forward_to(obj); return obj; }Copy the code
Copy_to_survivor_space () copies an object To the To space or promotes it To an older age, as appropriate, and then sets a new object address for the older object To forward a pointer. Set forward the meaning of the pointer is GC Root could point to the same object slot, if the simple moving objects, and amend the slot for the new object address, the second GC Root slot will visit the old object to the wrong address, and set the forwarding pointer, follow-up visit to the old object will be forwarded to the correct new objects.
The above procedure touches the GC Root and objects directly reachable from the old age and moves them To the To space (or To the old age), which may contain reference fields that may be indirectly reachable To other objects. The Serial GC maintains a save_mark pointer and a pointer to the top of the allocated space (to()->top()). Objects in the region from the bottom of the TO space to save_mark represent objects that have been scanned for both themselves and their fields. Save_mark objects in the region at the top of the space represent objects whose own scan is complete but whose own fields are incomplete.
FastEvacuateFollowersClosure task is scanning save_mark into space at the top of the object, and through their fields, and can achieve the object of moving to the bottom of the space to save_mark area, and then move forward save_mark, Until save_mark equals the top of the space, the scan is complete.
The same logic applies To the older generation as the new generation objects may move To the To space or be promoted To the older generation.
Full GC
For historical reasons, the implementation of FGC is located in Serial /genMarkSweep. Although SerialGC’s FGC implementation appears by name to be based on a tag clearing algorithm, it is actually based on a tag compression algorithm, as shown in Figure 10-4.
Figure 10-4 Serial GC
The markup algorithm used by FGC is based on the Lisp2 algorithm proposed by Donald E. Knuth: Mark the living objects first, then move all the living objects Compact to one end of the space. FGC began TenuredGeneration: : collect, it will record some log before and after the GC, you can use – Xlog: GC * output these logs, as shown in listing 10 to 11:
Listing 10-11 FGC log
GC(1) Phase 1: Mark live objects
GC(1) Phase 2: Compute new object addresses
GC(1) Phase 3: Adjust pointers
GC(1) Phase 4: Move objects
Copy the code
The FGC process is divided into four phases, as shown in Figure 10-5.
Figure 10-5 FGC phases of the Serial GC
1. Mark Live Objects
In the first stage, the virtual machine iterates through all types of GC Root and then marks all viable objects from that GC Root using XX::oops_do(root_closure). XX represents GC Root type, and root_closure represents a closure that marks a viable object. Root_closure MarkSweep: namely: FollowRootClosure closure, give it an object, you can mark the objects, the iterative marking members of the object, and mark all the objects in the stack object and its members, as shown in listing 10 to 12:
Listing 10-12 marks the survivable object
Template <class T> inline void MarkSweep::follow_root(T* p) {// If the object to which the reference points is not empty and unmarked T heap_oop = RawAccess<>: oop_load(p); if (! CompressedOops::is_null(heap_oop)) { oop obj = CompressedOops::decode_not_null(heap_oop); if (! obj->mark_raw()->is_marked()) { mark_object(obj); // mark the object follow_object(obj); }} follow_stack(); } // If the object is an array object, mark the array, Otherwise mark the member of the object inline void MarkSweep:: FOLLOW_object (oop obj) {if (obj->is_objArray()) { MarkSweep::follow_array((objArrayOop)obj); } else { obj->oop_iterate(&mark_and_push_closure); }} void MarkSweep::follow_stack() {// Mark the entire stack of references to do {// mark while (! _marking_stack.is_empty()) { oop obj = _marking_stack.pop(); follow_object(obj); } // If the stack of objects is not empty, mark one by one if (! _objarray_stack.is_empty()) { ObjArrayTask task = _objarray_stack.pop(); follow_array_chunk(objArrayOop(task.obj()), task.index()); } }while(! _marking_stack.is_empty()||! _objarray_stack.is_empty()); } // mark the Class and number group members of the array type, such as String[] p = new String[2] // mark java.lang. p[1],p[2] inline void MarkSweep::follow_array(objArrayOop array) { MarkSweep::follow_klass(array->klass()); if (array->length() > 0) { MarkSweep::push_objarray(array, 0); }}Copy the code
2. Compute New Object Address
After all the live objects are marked, the Serial GC calculates a new address for the live objects and stores it in the object header in preparation for the subsequent Compact. The idea is to set cur_obj and compact_top to point to the bottom of the space, and then scan from the bottom of the space. If cur_obj scans a viable object, set the new address of that object to compact_top, and then scan again. Until cur_obj reaches the top of space.
3. Adjust the object Pointer
Although new address to calculate the object, the GC Root point is still the old object, at the same time, members of the object reference is also the old object address, this time by adjusting the object Pointers can modify these relations, let the GC Root address new objects, then members of the object reference will be adjusted for the new object reference address.
4. Move objects
When everything is ready, memory is allocated for the object at the new address, and the reference relationship has been modified, but the object at the new address does not contain valid data, so the object data is copied one by one from the old object address to the new object address, and the FGC is done. The Serial GC resets GC related data structures and logs GC information.
The pause
As mentioned in Section 10.1.2, Stop The World (STW) is when all Mutator threads are paused. Serial GC YGC and FGC are single-threaded, so all Mutator threads must be suspended while GC is working. The larger the Java heap, the more obvious the STW is, and the longer the STW is, the less acceptable it is for GUI programs or other programs that require pseudo-real-time and fast response. So STW is one of the most frustrating aspects of garbage collection technology: it takes time for all Mutator threads to reach a safe point, and the garbage collection itself takes a lot of time after STW. So, can we use modern processor multi-core to parallelize part of the garbage collection after STW? Parallel GC gives a satisfactory answer to this point.