This is the 16th day of my participation in Gwen Challenge
JVM memory partition
The MEMORY structure of the JVM consists of five regions: program counters, virtual machine stack, local method stack, heap area, and method area.
Among them, the program counter, virtual machine stack, local method stack three areas are born and die with the thread, so the memory allocation and reclamation of these areas are deterministic, there is no need to think too much about reclamation, because when the method ends or the thread ends, the memory will naturally follow the reclamation. Unlike the Java heap and method areas, where the allocation and collection of memory is dynamic, it is the part of the garbage collector that needs to focus on. Before the garbage collector can collect the heap and method areas, it must first determine which objects in these areas can be collected and which cannot be collected for the time being, using algorithms to determine whether objects are alive or not.
The JVM’s algorithm for determining whether an object is alive or not
Reference counting algorithm
Algorithm analysis
Reference counting was an early strategy in the garbage collector. In this approach, there is a reference count for each object instance in the heap. When an object is created, the instance of the object is assigned to a variable whose count is set to 1. When any other variable is assigned a reference to this object, the count increases by 1 (a = b, then the counter of the object instance referenced by B +1), but when a reference to an object instance expires or is set to a new value, the object instance’s reference counter decreases by 1. Any object instance that references a counter of 0 can be garbage collected. When an object instance is garbage collected, the reference counter of any object instances it references decreases by one.
Advantages:
The reference counting collector can be executed quickly, interwoven with the execution of the program. This is good for real-time environments where programs need to be free from prolonged interruptions.
Disadvantages:
A circular reference could not be detected. If the parent object has a reference to the child object, the child object in turn references the parent object. Thus, their reference count can never be zero.
Code sample
Feeling boring? Let’s write some code to calm things down
public class ReferenceFindTest { public static void main(String[] args) { MyObject object1 = new MyObject(); MyObject object2 = new MyObject(); object1.object = object2; object2.object = object1; object1 = null; object2 = null; }}Copy the code
This code is used to verify that the reference counting algorithm cannot detect circular references. Object1 and object2 are null, meaning that object1 and object2 point to objects that cannot be accessed, but because they refer to each other, their reference counter is not zero, and garbage collector will never reclaim them.
Accessibility analysis algorithm
Accessibility analysis algorithm from the graph theory is introduced into discrete mathematics, program all the reference relationship as a diagram, starting from a node GC ROOT, find the corresponding to the reference node, continue to look for reference from the node, when all the reference node to find after, the rest of the nodes is considered not by reference to the node, That is, useless nodes. Useless nodes are considered recyclable objects.
In the Java language, objects that can be used as GC Roots include the following:
- Objects referenced in the virtual machine stack (local variables in the stack frame);
- The object referenced by the class static attribute in the method area; The object referenced by the constant in the method area;
- Objects referenced by JNI (Native methods) in the Native method stack.
References in Java
Whether the number of references is determined by reference counting algorithm or whether the reference chain of an object is reachable by reachability analysis algorithm, determining whether an object is alive or not is related to “reference”. In The Java language, references are divided into four types: strong reference, soft reference, weak reference, and virtual reference. The strength of these four kinds of references gradually decreases. Strong references are common in program code, such as Object obj = new Object(). As long as strong references exist, the garbage collector will never reclaim the referenced Object. Soft references are used to describe objects that are useful but not necessary. Objects associated with soft references are listed in the collection scope for a second collection before the system is about to run out of memory. An out-of-memory exception is thrown if there is not enough memory after this collection. Weak references are also used to describe non-essential objects, but they are weaker than soft references, and objects associated with weak references only survive until the next garbage collection occurs. When the garbage collector works, objects associated only with weak references are reclaimed regardless of whether there is currently enough memory. A phantom reference, also called a phantom reference or phantom reference, is the weakest type of reference. The existence of a virtual reference does not affect the lifetime of an object, nor can an object instance be obtained through a virtual reference. Its purpose is to receive a system notification when the object is reclaimed by the collector. Don’t be intimidated by the concept, and don’t worry about it. The purpose of this list is to show that both reference-counting algorithms and reachability analysis algorithms are based on strong references.
The subject’s last struggle before death (retrieval)
Even in the reachability analysis algorithm, unreachable objects are not necessarily dead. At this time, they are temporarily in the stage of “probation”. To truly declare an object dead, at least two marking processes are needed. First mark: If an object does not have a chain of references connected to GC Roots after reachability analysis, it will be marked first; Second tag: The first tag is followed by a filter to determine whether the object needs to execute the Finalize () method. If the reference chain is not re-established in the Finalize () method, it will be marked a second time. The object marked successfully the second time will actually be recycled, and if the object reassociates the reference chain in the Finalize () method, it will escape the collection.
How does the method area determine whether it needs to recycle
Whether the method area storage content needs to be recycled is not the same. The main contents of the method area are: discarded constants and useless classes. Deprecated constants can also be determined by reference reachability, but for useless classes, the following three conditions must be met: all instances of the class have been reclaimed, that is, there are no instances of the class in the Java heap; The ClassLoader that loaded the class has been reclaimed; The java.lang.Class object corresponding to this Class is not referenced anywhere, and the methods of this Class cannot be accessed anywhere through reflection.
Common garbage collection algorithms
Mark-clear algorithm
The mark-clearance algorithm adopts GC Roots to scan and mark the surviving objects, and then scans the unmarked objects in the whole space for recycling, as shown in the figure below. Mark-clear algorithm does not need to move objects, only to the non-viable object processing, in the case of a large number of viable objects is extremely efficient, but because of the mark-clear algorithm directly reclaim non-viable objects, so it will cause memory fragmentation!! []
Replication algorithm
Replication algorithm is proposed to overcome the overhead of handle and solve the problem of memory fragmentation. It starts off by dividing the heap into one surface of objects and several free surfaces. The program allocates space for objects from the object surface. When objects are full, garbage collection based on algorithms scans live objects from GC Roots. Each live object is copied to the free surface (such that there are no free holes between the memory occupied by the live object) so that the free surface becomes the object surface, the original object surface becomes the free surface, and the program allocates memory in the new object surface.
Mark-collation algorithm
The mark-collation algorithm marks objects in the same way as the mark-clear algorithm, but the clearing is different. After the space occupied by the non-viable objects is reclaimed, all the viable objects are moved to the left free space and the corresponding Pointers are updated. Mark-tidy algorithm is based on mark-clear algorithm, but also carries on the object movement, so the cost is higher, but it solves the problem of memory fragmentation. See the following figure for the specific process:
Generational collection algorithm
The generational collection algorithm is currently used by most JVM garbage collectors. Its core idea is to divide memory into several different regions according to the life cycle of the object. Under normal conditions, the reactor is divided into the Tenured Generation and the Young Generation, and beyond the reactor is the Permanet Generation. The characteristics of the old generation are that only a small number of objects need to be recycled in each garbage collection, while the characteristics of the new generation are that a large number of objects need to be recycled in each garbage collection. Therefore, the most suitable collection algorithm can be adopted according to the characteristics of different generations.
Young Generation recycling algorithm
- All newly generated objects are first placed in the young generation. The goal of the younger generation is to collect objects with short life spans as quickly as possible.
- The new generation memory is divided into one Eden zone and two survivor zones (survivor0,survivor1) in a 8:1:1 ratio. Most objects are generated in Eden zone. When recycling, the Eden zone surviving objects are first copied to a survivor0 zone and then Eden zone is emptied. When this survivor0 is also full, copy Eden and survivable objects in survivor0 to another survivor1 and empty Eden and survivable objects in survivor0. At this point, survivor0 is empty and then swap survivor0 with Survivor1, leaving Survivor1 empty and so on.
- When survivable objects of Eden and SURVIVable objects of SURVIVable 0 are insufficient in survivor1, survivable objects are stored directly to the old age. If the old generation is also Full, a Full GC will be triggered, that is, the new generation and the old generation will be recycled.
- Cenozoic GCS are also called Minor GCS, and minorgcs occur frequently (not necessarily when Eden is full).
Old Generation recycling algorithm
- Objects that survive N garbage collections in the young generation are put into the old generation. Therefore, you can think of the tenured generation as holding objects with long life cycles.
- The memory is also much larger than that of the new generation (approximately 1:2). When the memory of the old generation is Full, Major GC is triggered, namely Full GC. Full GC occurs at a low frequency, and the old generation objects have a longer survival time and a high survival rate.
Permanent Generation recycling algorithm
Used to store static files, such as Java classes, methods, etc. Persistent generation has no significant impact on garbage collection, but some applications may generate or call classes dynamically, such as Hibernate, etc. In such cases, a large persistent generation space needs to be set up to store the classes added during the run. Persistent generation is also called method region.
4. Common garbage collectors
The following diagram shows all the collectors that the HotSpot VIRTUAL machine contains.
Serial collector (Replication algorithm)
The new generation of single-threaded collectors, marking and cleaning are single-threaded, the advantage is simple and efficient. This is the default GC mode at the client level. You can use -xx :+UseSerialGC to specify the GC mode forcibly.
Serial Old collector (mark-collation algorithm)
Older single-threaded collector, older version of Serial collector.
ParNew collector (stop-copy algorithm)
The new generation of collectors, which can be considered multi-threaded versions of Serial collectors, perform better than Serial on multi-core cpus.
Parallel Scavenge Collector (Stop-copy algorithm)
Parallel collector, the pursuit of high throughput, efficient use of CPU. Throughput is typically 99% and throughput = user thread time /(user thread time +GC thread time). Suitable for scenarios with low requirements on interaction, such as background applications. -xx :+UseParallelGC specifies the number of threads to be collected. -xx :ParallelGCThreads=4
Parallel Old collector (stop-copy algorithm)
An older version of the Parallel Collector, the Parallel collector, throughput first. CMS(Concurrent Mark Sweep) collector (mark-clean algorithm) is a high concurrency, low pause, pursuit of the shortest GC recovery pause time, high CPU usage, fast response time, short pause time, multi-core CPU pursuit of high response time choice.
5, When does GC trigger (interview 5-star questions)
Because objects are processed in generations, garbage collection regions and times are different. There are two types of GC: Scavenge GC and Full GC.
Scavenge GC
Normally, when a new object is created and Eden fails to apply for space, the Scavenge GC is triggered to clean up the Eden exploiture, and the surviving objects are moved to the Survivor zone. Then the two zones of Survivor are collated. In this way, GC is performed on the Eden area of the young generation without affecting the old generation. Since most objects start from Eden, and Eden is not allocated very large, GC in Eden will occur frequently. Therefore, a fast and efficient algorithm is generally needed to make Eden free as soon as possible.
Full GC
Clean up the entire heap, including Young, Tenured, and Perm. Full GC is slower than Scavenge due to the need to recycle the entire heap, so minimize the amount of Full GC you do. A large part of the process of tuning the JVM is tuning the Full GC. Full GC can be caused by one of the following reasons: old years are Full; Persistent generation (Perm) is full; System.gc() is called on display; The allocation strategy of each Heap field changed dynamically after the last GC;