First, take a look at the memory layout of the JVM.
Next, how does the JVM identify garbage?
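1. Reference counting: an object's counter is incremented (+1) each time it is referenced and decremented (-1) each time a reference to it is released. Defect: it cannot handle circular references.
Objects that reference each other can never be reclaimed, because their counters never drop to zero.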
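A minimal sketch of why counting fails on a cycle, assuming a toy object model with an explicit counter. `RefCountDemo`, `Node` and `setField` are hypothetical names for illustration; HotSpot does not manage memory this way.

```java
// Toy model of reference counting; not how HotSpot works.
class RefCountDemo {
    // Hypothetical object that carries its own reference counter.
    static final class Node {
        final String name;
        int refCount = 0;
        Node field; // the single outgoing reference of this node

        Node(String name) {
            this.name = name;
        }
    }

    // Point holder.field at target and keep both counters up to date.
    static void setField(Node holder, Node target) {
        if (target != null) {
            target.refCount++;
        }
        if (holder.field != null) {
            holder.field.refCount--;
        }
        holder.field = target;
    }

    // Build A <-> B, then drop the two external (local variable) references.
    static int[] countsAfterDroppingRoots() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;   // referenced by local variable a
        b.refCount++;   // referenced by local variable b
        setField(a, b); // A -> B
        setField(b, a); // B -> A
        a.refCount--;   // local variable a is no longer used
        b.refCount--;   // local variable b is no longer used
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingRoots();
        System.out.println("A=" + counts[0] + ", B=" + counts[1]);
    }
}
```

Both counters stay at 1 even though nothing outside the pair can reach A or B, so a pure reference-counting collector would never free them.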
Reachability analysis: the GC Roots algorithm starts from a set of objects called GC Roots and follows the references they hold to the next nodes, then treats each of those nodes as a new starting point and follows its references in turn, and so on, forming reference chains. Any object that does not appear on a chain starting from a GC Root is recyclable.
In the figure, object C is not on any GC Root chain, so it is considered recyclable.
A GC Root can be: 1. an object referenced from the virtual machine stack (a local variable); 2. an object referenced by a static member variable of a class; 3. an object referenced by a constant in the method area; 4. an object referenced by JNI (native methods) in the native method stack.
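The traversal can be sketched as a plain graph walk. This is an illustration only: the object graph is modeled as a map from object name to the names it references, and `ReachabilityDemo`, `reachable` and `garbage` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Every object that can be reached from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    // Objects that appear in the graph but are on no chain from a root.
    static Set<String> garbage(Map<String, List<String>> refs, List<String> roots) {
        Set<String> all = new HashSet<>(refs.keySet());
        refs.values().forEach(all::addAll);
        all.removeAll(reachable(refs, roots));
        return all;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", List.of("A"));
        refs.put("A", List.of("B"));
        refs.put("C", List.of("D")); // C and D only reference each other
        refs.put("D", List.of("C"));
        System.out.println(garbage(refs, List.of("root")));
    }
}
```

Unlike reference counting, the walk never visits C or D, so the cycle is reported as garbage.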
How does the JVM start a GC?
1. OopMap: the JVM (HotSpot, for example) records some class information in a data structure called OopMap, which describes what type of data sits at which offset. This information is computed when class loading completes, so the collector does not have to scan every class each time to find where the references are.
2. Safe point: with the help of OopMap, HotSpot can enumerate GC Roots quickly and accurately. The JVM stops the world during GC Root enumeration and garbage collection. Safe points are the scheme designed to keep user threads from changing references to already-marked nodes during GC Root enumeration or garbage collection: once enumeration is about to start, each user thread pauses when it reaches a safe point. Safe point locations are chosen mainly by "whether the code has characteristics that let the program run for a long time", for example method calls, loop jumps, and exception jumps.
2.1 So how do you make a thread (one that is not executing a JNI call) stop at a safe point?
2.1.1 Preemptive suspension: when garbage collection occurs, the system first interrupts all user threads. If a thread turns out not to have stopped at a safe point, the system resumes it and lets it run until it reaches one (almost no virtual machine uses this approach today).
2.1.2 Voluntary suspension: the system does not operate on the threads directly but only sets a flag. Each thread polls this flag as it runs (HotSpot implements the poll with a memory-protection trap) and suspends itself at the nearest safe point once it finds the flag is true.
3. Safe region: safe points alone are not enough. A thread that is sleeping or blocked cannot run to a safe point and respond to the flag, and scattering ever more safe points through the code is not a good fix either (see above for what can serve as a safe point). So the concept of a safe region was introduced. When a user thread enters safe-region code, it first marks itself as being in a safe region, so if the virtual machine starts GC Root enumeration or garbage collection during that time, it does not need to wait for that thread. When the thread is about to leave the safe region, it checks whether the virtual machine has finished GC Root enumeration (or any other phase of garbage collection that requires user threads to be suspended). If so, the thread carries on as if nothing happened; otherwise it waits until it is told it may leave.
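The voluntary suspension in 2.1.2 can be imitated in plain Java. This is only a sketch under loud assumptions: the flag is an ordinary volatile field rather than HotSpot's memory-protection trap, the "safe point" is the end of each loop iteration, and `SafepointPollDemo` and its members are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of voluntary suspension at a safe point.
class SafepointPollDemo {
    private static volatile boolean safepointRequested;
    private static volatile boolean stop;

    // Returns true if the worker made no progress while parked at its safe point.
    static boolean workerIsQuietAtSafepoint() {
        safepointRequested = false;
        stop = false;
        AtomicLong work = new AtomicLong();
        CountDownLatch parked = new CountDownLatch(1);
        CountDownLatch resume = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            while (!stop) {
                work.incrementAndGet();       // stand-in for user code
                if (safepointRequested) {     // poll the flag once per iteration
                    parked.countDown();       // report that this thread is parked
                    try {
                        resume.await();       // suspend until the collector finishes
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        });
        worker.start();

        try {
            safepointRequested = true;        // the collector asks threads to stop
            parked.await();                   // wait until the worker has parked
            long before = work.get();
            Thread.sleep(20);                 // stand-in for GC Root enumeration
            long after = work.get();

            safepointRequested = false;       // collection is finished
            stop = true;
            resume.countDown();               // let the worker continue and exit
            worker.join();
            return before == after;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("quiet at safe point: " + workerIsQuietAtSafepoint());
    }
}
```

The collector side never touches the worker thread directly; it only flips the flag and waits, which is the point of the voluntary scheme.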
JVM garbage collection algorithms
1. Mark-sweep: mark the objects to be cleared, then clear them. Disadvantages: low efficiency, and it produces a large amount of memory fragmentation.
2. Mark-copy: this algorithm divides memory into two areas and uses only one of them at a time. When GC occurs, the surviving objects are copied into the other area and the used area is then cleaned up in one pass. Advantage: the algorithm is simple and clear. Disadvantages: it wastes space, since only half of the memory can be used at a time, and it is not cost-effective when the object survival rate is high.
3. Mark-compact: move all living objects toward one end of memory, then clean up everything beyond the boundary. This solves the fragmentation problem and also avoids the memory waste of the copying algorithm when many objects survive. Disadvantages: it is complex and moving objects consumes resources. Suitable for the old generation.
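To make the fragmentation difference between mark-sweep and mark-compact concrete, here is a small sketch that treats the heap as an array of slots, where `null` means free. The class and method names are hypothetical, and the mark phase is assumed to have already produced the `marked` array.

```java
import java.util.Arrays;

class SweepVsCompactDemo {
    // Mark-sweep: free dead slots in place; survivors do not move.
    static String[] sweep(String[] heap, boolean[] marked) {
        String[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (!marked[i]) {
                out[i] = null;
            }
        }
        return out;
    }

    // Mark-compact: slide survivors toward the start, then free everything past the boundary.
    static String[] compact(String[] heap, boolean[] marked) {
        String[] out = heap.clone();
        int boundary = 0;
        for (int i = 0; i < out.length; i++) {
            if (marked[i] && out[i] != null) {
                out[boundary++] = out[i]; // boundary <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(out, boundary, out.length, null);
        return out;
    }

    // Length of the longest run of consecutive free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", "f" };
        boolean[] marked = { true, false, true, false, true, false };
        System.out.println(Arrays.toString(sweep(heap, marked)));
        System.out.println(Arrays.toString(compact(heap, marked)));
    }
}
```

With every second object dead, sweeping leaves three separate one-slot holes, while compacting leaves a single free run of three slots at the cost of moving the survivors.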
Now back to cross-generation references in the JVM
1. Review: the JVM uses generational collection, which is the approach virtual machines currently use. It solves the problem that mark-copy is not suitable for the old generation: memory is divided into generations, and each generation uses the algorithm that suits it best. The young generation has a low survival rate, so it can use the copying algorithm. Objects in the old generation have a high survival rate and there is no extra space to guarantee their allocation, so the mark-compact algorithm is used.
2. The cross-generation reference problem: if an object in the old generation references an object in the young generation, what happens to the referenced young object when the young generation is collected? Will the old generation also be scanned during GC Root scanning? Definitely not, because that would mean every Minor GC has to scan the whole old generation, which would be almost the same as a Full GC. So how does the JVM solve this problem? The answer is the card table.
A general introduction to the remembered set (the basic idea): a remembered set is a data structure that records the set of pointers from a non-collection region into the collection region. The JVM keeps all cross-generation references together in it. A young-generation object that is referenced from the old generation is not collected; it ages until it is promoted to the old generation, at which point the cross-generation reference disappears.
Card table (an implementation of the remembered set): in the HotSpot virtual machine, the card table is an array of bytes. Each element of the array corresponds to a contiguous block of memory (a card page). If any object in that block holds a reference into the region being collected, the card table element is set to 1 (dirty); if none does, the element is 0.
Figure: shifting a memory address right by 9 bits gives its position in the card table, so each card covers 2^9 = 512 bytes of consecutive addresses.
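The address-to-card mapping can be written out as a short sketch. Addresses are modeled as plain `long` values relative to a heap base, and `CardTableDemo` with its fields and methods is a hypothetical name, not HotSpot source.

```java
// Byte-array card table with 512-byte card pages (address >> 9).
class CardTableDemo {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card page
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    final long heapBase;
    final byte[] cards;

    CardTableDemo(long heapBase, long heapBytes) {
        this.heapBase = heapBase;
        // Round up so that a partial last page still gets a card.
        long cardCount = (heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        this.cards = new byte[(int) cardCount];
    }

    // Index of the card that covers the given address.
    int cardIndex(long address) {
        return (int) ((address - heapBase) >>> CARD_SHIFT);
    }

    void markDirty(long address) {
        cards[cardIndex(address)] = DIRTY;
    }

    boolean isDirty(long address) {
        return cards[cardIndex(address)] == DIRTY;
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(0L, 4096L);
        table.markDirty(600L);
        System.out.println("cards: " + table.cards.length);
        System.out.println("address 1023 dirty: " + table.isDirty(1023L));
        System.out.println("address 1024 dirty: " + table.isDirty(1024L));
    }
}
```

During a Minor GC only the dirty cards need to be scanned for pointers into the young generation, instead of the whole old generation.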
Write barriers: how does a card table element become dirty? When it should happen is clear: the element must become dirty when an object in another generation's region comes to reference an object in this region, that is, at the moment of the reference assignment. HotSpot maintains the card table state with write barriers, which act like an AOP aspect around the reference assignment.
Card table side effects: 1. the performance overhead of write barriers; 2. false sharing ("pseudo-sharing"): when multiple threads modify independent variables that happen to sit in the same cache line, they affect each other (write-back, invalidation, or synchronization), resulting in performance degradation.
Since JDK 7, the HotSpot virtual machine has a parameter, -XX:+UseCondCardMark, that decides whether to enable a conditional check on card table updates: the card is written only if it is not already dirty.
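The difference between an unconditional barrier and the conditional check can be sketched as follows. This is an assumption-laden illustration: real barriers are emitted by the JIT as machine code, whereas here `WriteBarrierDemo` is a hypothetical class that just counts how many stores reach the card table.

```java
// Compares unconditional and conditional card marking.
class WriteBarrierDemo {
    static final int CARD_SHIFT = 9;
    static final byte DIRTY = 1;

    final byte[] cards = new byte[8];
    int cardWrites = 0; // number of stores actually made into the card table

    // Unconditional barrier: every reference assignment stores into the card table.
    void barrier(long fieldAddress) {
        cards[(int) (fieldAddress >>> CARD_SHIFT)] = DIRTY;
        cardWrites++;
    }

    // Conditional barrier: read the card first and store only if it is not dirty yet.
    void conditionalBarrier(long fieldAddress) {
        int index = (int) (fieldAddress >>> CARD_SHIFT);
        if (cards[index] != DIRTY) {
            cards[index] = DIRTY;
            cardWrites++;
        }
    }

    public static void main(String[] args) {
        WriteBarrierDemo plain = new WriteBarrierDemo();
        WriteBarrierDemo conditional = new WriteBarrierDemo();
        for (int i = 0; i < 3; i++) {
            plain.barrier(100L);
            conditional.conditionalBarrier(100L);
        }
        System.out.println("unconditional stores: " + plain.cardWrites);
        System.out.println("conditional stores: " + conditional.cardWrites);
    }
}
```

The conditional version adds a read and a branch to every reference assignment, but repeated writes to an already dirty card no longer store to the shared cache line, which is what reduces false sharing.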
Reference: In-depth Understanding of the Java Virtual Machine, 3rd edition, by Zhiming Zhou